What’s Trending on Open Library?
A major update to the Open Library search engine now makes it easy for patrons to find books that are receiving spikes of interest.
You may be familiar with the trending books featured on Open Library’s home page. Actually, you might be very familiar with them, because many seldom change! Our previous trending algorithm approximated popularity by tracking how often patrons clicked that they wanted to read a book. While this approach is great for showcasing popular books, the results often remain the same for weeks at a time.
The new trending algorithm, developed by Benjamin Deitch (Open Library volunteer and Engineering Fellow) and Drini Cami (Open Library staff and senior software engineer), uses hour-by-hour statistics to give patrons fresh, timely, high-interest books that are gaining traction now on Open Library.
This improved algorithm now powers much of the Open Library homepage and is fully integrated into Open Library’s search engine, meaning:
A patron can sort any search on Open Library by trending scores. Check out what’s trending today in Sci-fi, Romance, and Short-reads in French.
A more diverse selection of books should be displayed within the carousels on the homepage, the library explorer, and on subject pages.
Librarians can use sort-by-trending to discover which high-traffic book records are most worth prioritizing.
Sorting by Trending
From the search results page, a patron may change the “Relevance” sort dropdown to “Trending” to sort results by the new z-score trending algorithm.
The Algorithm
Open Library’s trending algorithm computes a z-score for each book, which compares the book’s (a) “activity score” over the last 24 hours with (b) its total “activity score” over the last 7 days.
Activity scores are computed for a given time interval by adding the book’s total human page views (how often the book page is visited) to an amplified count of its reading log events (e.g. when a patron marks a book as “Want to Read”). Here, amplified means that a single reading log event has a higher impact on the activity score than a single page view.
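To make the two steps above concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not Open Library's actual implementation: the amplification weight of 10, the function names, and the use of the mean and standard deviation of seven daily scores as the baseline are all hypothetical choices, since this post does not give the exact formula.

```python
from statistics import mean, pstdev

# Hypothetical amplification factor: the post only says a reading log
# event counts for more than a page view, not by how much.
READING_LOG_WEIGHT = 10


def activity_score(page_views: int, reading_log_events: int,
                   weight: int = READING_LOG_WEIGHT) -> int:
    """Human page views plus an amplified count of reading log events."""
    return page_views + weight * reading_log_events


def trending_z_score(last_24h: float, daily_scores: list[float]) -> float:
    """Standard z-score of the last 24 hours against a weekly baseline.

    Assumption: the baseline is the mean and population standard
    deviation of the daily activity scores for the last 7 days.
    """
    mu = mean(daily_scores)
    sigma = pstdev(daily_scores)
    if sigma == 0:
        # A perfectly flat week has no spread to measure a spike against.
        return 0.0
    return (last_24h - mu) / sigma


# A book with a quiet week and a busy last day.
week = [activity_score(40, 1)] * 6 + [activity_score(120, 8)]
today = week[-1]
print(round(trending_z_score(today, week), 2))
```

A book whose last 24 hours look like the rest of its week scores near zero, while a sudden spike scores well above zero, which is why this kind of measure surfaces fresh titles rather than perennially popular ones.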
All of the intermediary data used to compose the z-score is stored and accessible from our search engine in the following ways:
For Developers
While the trending_z_score is the ultimate value used to determine a book’s trending score on Open Library, developers may also query the search engine directly to access many of the intermediary, raw values used to compute this score.
For instance, we’ve been experimenting with the trending_score_daily_[0-6] fields and the trending_score_hourly_sum field to create useful ways of visualizing trending data as a chart over time.
The search engine may be queried and filtered by:
trending_score_hourly_sum – Find books with the highest cumulative hourly score for today, as opposed to the computed weekly trending score.
trending_score_daily_0 through trending_score_daily_6 – Find books…
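As a rough illustration of querying these fields, the Python sketch below builds a search URL without sending it. The `https://openlibrary.org/search.json` endpoint and its `q`, `sort`, `fields`, and `limit` parameters are part of Open Library's public search API. The rest is assumption: the post does not confirm that `trending` is the exact sort value, or that the trending fields can be requested through `fields`. The helper name `build_trending_query` is hypothetical.

```python
from urllib.parse import urlencode

SEARCH_ENDPOINT = "https://openlibrary.org/search.json"

# Assumption: these field names, taken from the post, are returnable
# through the `fields` parameter.
DEFAULT_FIELDS = (
    "key",
    "title",
    "trending_z_score",
    "trending_score_hourly_sum",
)


def build_trending_query(q: str, sort: str = "trending",
                         fields: tuple[str, ...] = DEFAULT_FIELDS,
                         limit: int = 10) -> str:
    """Return a search URL sorted by trending (hypothetical helper).

    Assumption: "trending" is the sort value behind the "Trending"
    option in the search results dropdown.
    """
    params = {
        "q": q,
        "sort": sort,
        "fields": ",".join(fields),
        "limit": limit,
    }
    return f"{SEARCH_ENDPOINT}?{urlencode(params)}"


url = build_trending_query("subject:romance", limit=5)
print(url)
```

Building the URL separately from fetching it makes it easy to inspect the request in a browser first, then pass it to an HTTP client of your choice once the parameters behave as expected.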