Mathematicians/statisticians - I need your help!

Morrus

Well, that was fun
Staff member
I'm putting together the annual Most Popular News Items of the Year. This is based on page views of each news item.

Usually, I simply order them in descending order by view count. However, this favours older news items (those from the first part of the year) over newer ones which haven't had the same time to rack up the views.

The obvious - and easy - solution is just to count page views in the first month since the item was posted. Unfortunately, I don't have that data. All I have is post date and total view count.

So, is there any valid way I can mathematically adjust the numbers to account for age? I considered simply dividing by the number of months the item has been available, but that has the opposite effect and favours newer items over older ones.
 

EdAbbey

Explorer
I would assume that most items would show a pretty typical curve of diminishing views per month. There is likely some calculus that can be used to plot a fancy graph. But I can't recall any of my calculus so I wonder if you can simplify matters somewhat. Maybe this?:

1. Group your page view data according to the number of months the item has been available
2. Calculate the average number of views for each of these 12 groups
3. For each item, compare the total number of views to the average of its particular group and express it as a percentage
4. Order the items according to the degree (percentage) they deviate from average rather than view count
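The four steps above can be sketched in Python. This is only an illustration under assumptions: the item titles, ages, and view counts are made up, and the function name `rank_by_group_average` is hypothetical rather than anything the forum software provides.

```python
from collections import defaultdict
from statistics import mean


def rank_by_group_average(items):
    """Rank items by their view count as a percentage of their age group's average.

    items: list of (title, months_available, total_views) tuples.
    Returns a list of (title, percent_of_group_average), highest first.
    """
    # Step 1: group view counts by the number of months the item has been available.
    groups = defaultdict(list)
    for _title, months, views in items:
        groups[months].append(views)

    # Step 2: average views for each age group.
    averages = {months: mean(views) for months, views in groups.items()}

    # Step 3: express each item's views as a percentage of its group's average.
    scored = [
        (title, 100.0 * views / averages[months])
        for title, months, views in items
    ]

    # Step 4: order by that percentage instead of by raw view count.
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Made-up sample data: two items from January (12 months old), two from December.
items = [
    ("Item A", 12, 30000),
    ("Item B", 12, 10000),
    ("Item C", 1, 9600),
    ("Item D", 1, 2400),
]
ranking = rank_by_group_average(items)
for title, percent in ranking:
    print(f"{title}: {percent:.0f}% of its group's average")
```

One caveat with this sketch: a group containing only a few items has a noisy average, so a single unusually popular item can drag down the scores of everything else posted that month.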

Does this make any sense? I'm sure there are serious statistical issues with this approach, but it's the best I can come up with.

Cheers
 

aramis erak

Legend
Makes sense, but from an admin standpoint, it's a real pain to do (I run a different vBulletin site). There are ways to capture the data Morrus wants using SQL queries... but I don't speak SQL.
 

Nagol

Unimportant
The simplest way would be to pick a curve of weights and divide the counts by the sum of the weights the post has experienced.

For example, say you decide that a post's first week is worth 1 and each successive week is worth half as much as the previous.

Post A has 8,000 views and is 2 weeks old. Its weights sum to 1 + 0.5 = 1.5, so it has a score of 8,000 / 1.5 ≈ 5,333.
Post B has 10,000 views and is 8 weeks old. Its weights sum to 1 + 0.5 + 0.25 + 0.125 + 0.0625 + ... + 0.0078125 = 1.9921875, so it has a score of 10,000 / 1.9921875 ≈ 5,020.
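The same arithmetic as a short Python sketch, assuming the halving curve from the example. The function names `weight_sum` and `age_adjusted_score` and the `decay` parameter are hypothetical, and the curve itself is a guess rather than something measured from real traffic.

```python
def weight_sum(periods, decay=0.5):
    """Sum of the weights an item has experienced.

    The first period is worth 1 and each later period is worth
    `decay` times the one before it (0.5 means it halves each time).
    """
    return sum(decay ** k for k in range(periods))


def age_adjusted_score(views, periods, decay=0.5):
    """Total views divided by the summed weights for the item's age."""
    return views / weight_sum(periods, decay)


# The two posts from the example above, with age counted in weeks.
score_a = age_adjusted_score(8000, 2)   # 8,000 / 1.5
score_b = age_adjusted_score(10000, 8)  # 10,000 / 1.9921875

print(round(score_a), round(score_b))
```

A period can be a week or a month, as long as every item is measured in the same unit. With a halving curve the summed weight approaches 2 very quickly, so older items are treated almost identically; a gentler `decay` such as 0.8 spreads the adjustment over more periods.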

For threads, I'd probably stop the weighting one week after the last post (assuming it wasn't necroed).
 

Morrus

Well, that was fun
Staff member
Hmm. That may be a good approach, @Nagol. Make it months rather than weeks, and it might work. I’ll give it a try!
 
