When an edit happens:
- if the edit is by a bot, it is ignored https://gerrit.wikimedia.org/r/322787
- if the edit is in a namespace other than 0, it is ignored https://gerrit.wikimedia.org/r/322784
- otherwise, the edit is recorded in a store https://gerrit.wikimedia.org/r/320628
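The filtering rules above could be sketched roughly as follows. This is only an illustration: the field names `bot` and `namespace` are assumptions about the shape of the incoming edit event, not something taken from the linked patches.

```python
from typing import Any, Dict


def should_record(edit: Dict[str, Any]) -> bool:
    """Return True if the edit should be recorded in the store.

    The keys "bot" and "namespace" are hypothetical names for the
    corresponding properties of an edit event.
    """
    if edit.get("bot", False):
        return False  # bot edits are ignored
    if edit.get("namespace") != 0:
        return False  # only namespace 0 is tracked
    return True


print(should_record({"bot": False, "namespace": 0}))  # True
print(should_record({"bot": True, "namespace": 0}))   # False
print(should_record({"bot": False, "namespace": 1}))  # False
```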
When I query the store for a page I should be able to see:
- How many edits it has had https://gerrit.wikimedia.org/r/320628
- Whether the page is new https://gerrit.wikimedia.org/r/323897
- When the page was added to the store https://gerrit.wikimedia.org/r/320628
- When the page was last updated https://gerrit.wikimedia.org/r/320628
- The total amount of bytes changed since it was added to the store https://gerrit.wikimedia.org/r/323324
- The number of anonymous edits https://gerrit.wikimedia.org/r/322975
- The names of the editors https://gerrit.wikimedia.org/r/323072
- How biased the article is https://gerrit.wikimedia.org/r/323072. The bias is calculated as follows:
If user A edited the page once and user B edited it 9 times, the bias is (9/10 - 1/10) = 0.8
If user A edited the page 5 times and user B edited it 5 times, the bias is (5/10 - 5/10) = 0
If the page was only edited by user A, the bias is 1
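One way to reproduce the three bias examples above is sketched below. The examples only cover one or two editors, so the handling of three or more editors here (difference between the two largest edit shares) is an assumption, and the function name `bias` is made up for illustration.

```python
from collections import Counter
from typing import Iterable


def bias(editors: Iterable[str]) -> float:
    """Compute the bias from a list with one editor name per edit.

    Assumption: the bias is the share of the most active editor minus
    the share of the second most active editor (0 if there is none).
    """
    counts = Counter(editors)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("page has no recorded edits")
    ranked = [n for _, n in counts.most_common(2)]
    top = ranked[0]
    second = ranked[1] if len(ranked) > 1 else 0
    return (top - second) / total


print(bias(["A"] + ["B"] * 9))      # 0.8
print(bias(["A"] * 5 + ["B"] * 5))  # 0.0
print(bias(["A"] * 3))              # 1.0
```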
- If a move action is detected, the trending service should update the local store to remove the old page and copy its content over to the new page. https://gerrit.wikimedia.org/r/323042
- If a delete action is detected, the page should be removed from the local store https://gerrit.wikimedia.org/r/323043
- If a protect action is detected, the page in the store should be marked as protected [not needed in v1]
- Actually subscribe to move/delete events https://gerrit.wikimedia.org/r/323196
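The move and delete handling could look something like the sketch below. It assumes, purely for illustration, that the store is an in-memory mapping keyed by page title; the names `Store`, `handle_move` and `handle_delete` are hypothetical.

```python
from typing import Any, Dict

# Hypothetical store shape: page title -> recorded data for that page.
Store = Dict[str, Dict[str, Any]]


def handle_move(store: Store, old_title: str, new_title: str) -> None:
    """Remove the old page and carry its data over to the new title."""
    record = store.pop(old_title, None)
    if record is not None:
        # Assumption: any existing entry under the new title is overwritten.
        store[new_title] = record


def handle_delete(store: Store, title: str) -> None:
    """Drop the page from the store; pages that are not tracked are ignored."""
    store.pop(title, None)


store: Store = {"Foo": {"edits": 3}, "Bar": {"edits": 1}}
handle_move(store, "Foo", "Foo (disambiguation)")
handle_delete(store, "Bar")
print(store)  # {'Foo (disambiguation)': {'edits': 3}}
```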
An edit should only be processed once, and the store should be accessible to all REST endpoints.
Every 100 edits, a purge script is run. During the purge, pages that meet the following criteria are removed from the store:
- Its edit speed drops below 0.1 edits per minute (configurable) https://gerrit.wikimedia.org/r/323239
- It hasn't been updated in 40 minutes (configurable) https://gerrit.wikimedia.org/r/323239
- It was added to the store more than a day ago (configurable) https://gerrit.wikimedia.org/r/323239
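A rough sketch of the purge check follows. Several things here are assumptions rather than part of the requirements: the `PageRecord` fields, the reading that a page is purged when any one criterion is met, and the definition of edit speed as total edits divided by minutes since the page was added. Running the purge every 100 edits is left to the caller.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class PageRecord:
    edits: int      # number of edits recorded for the page
    added: float    # time the page was added to the store (epoch seconds)
    updated: float  # time of the most recent edit (epoch seconds)


def should_purge(page: PageRecord, now: float, min_speed: float = 0.1,
                 max_inactivity_min: float = 40,
                 max_age_min: float = 24 * 60) -> bool:
    """Return True if the page meets any purge criterion (assumed semantics)."""
    age_min = (now - page.added) / 60
    idle_min = (now - page.updated) / 60
    # Assumed definition of edit speed: edits per minute since the page was added.
    speed = page.edits / age_min if age_min > 0 else float("inf")
    return (speed < min_speed
            or idle_min > max_inactivity_min
            or age_min > max_age_min)


def purge(store: Dict[str, PageRecord], now: float, **limits: float) -> None:
    """Remove every page that should be purged from the store, in place."""
    stale = [title for title, page in store.items()
             if should_purge(page, now, **limits)]
    for title in stale:
        del store[title]
```

The three thresholds are keyword arguments so that each can be configured independently, matching the "(configurable)" notes above.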
Production requirement:
- The number of pages stored in the service's memory should never exceed a configurable number