Simple popularity algorithm with decay

Most social web sites have user created objects such as pictures, posts, links, etc.. that need to be shown in some dynamic popularity or ranking order on website. The measurement of popularity is typically some user activity such as votes, downvotes, number of clicks or shares per object.

It would be trivial just to calculate total sum of the actions, and sort objects by the sum but this doesn’t take into account the bandwagon effect where people start voting the already top items so the top list freezes quickly and no new items have change to bubble on top. Another problem of this approach is that once object reaches high score it stays high even when no actions would anymore happen. Good popularity measurement needs to decay somehow.

First page of top list (typically 10-20 items)  should always show fresh “hot” items together with long time favorites. We want that older objects fade and need more and more actions to stay on the top list as time passes by.

Simple way to do this is with relative scoring, where base score  grows constantly so that score for younger object is always naturally higher than for old objects with same number of actions. Object creation timestamp can be used trivially as base score. For example:

object_score = created_timestamp + number_of_actions

created_timestamp is object creation time as number of hours or days from some arbitrary fixed reference time . The popularity list is simply all objects sorted by the score. New objects have naturally higher score and old ones need more and more actions to stay on top.

Example in Javascript

Fixed reference time 1st Jan 2000.  Timestamp as hours

Object A created on midnight 1st May 2012, 30 votes

(new Date('2012-5-1') - new Date('2000-1-1'))/1000/60/60 + 30 = 108126

Object B created on midnight 2nd May 2012 2 votes

(new Date('2012-5-2') - new Date('2000-1-1'))/1000/60/60 + 2 = 108122

So, on 2nd May the object A score would still be ahead B but the B needs only 5 more votes to pass A. The A will fall behind quickly unless it gets constantly more votes. In fact with this trivial formula object has to get 24 actions per day just to keep up with the increasing base score.

Major benefit of growing base score is that its one pass, i.e. app  updates score for every new action and thats it. There is no need to do computation on query and no periodic maintenance is required. It’s also compatible with Map-Reduce based views (e.g. CouchDB) as its not dependent of query time.

I assume many high traffic web sites (Digg for sure) use  some derivative of this basic method, but each site needs to tune it based on their actions, traffic and user behavior.

See also Simple Cooldown and Rate Control Algorithm for another perspective to time based scoring.

One Response to Simple popularity algorithm with decay

  1. Pingback: Simple Cooldown and Rate Control Algorithm | Brave New Method

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

%d bloggers like this: