
Have PageAssessments store all the assessments in ExtensionData until the page is finished parsing
Closed, Resolved · Public · 3 Story Points

Description

Right now PageAssessments writes each assessment to the database as soon as its {{#assessment}} parser function is parsed. This has two disadvantages:

  1. We have no way of knowing when an assessment has been removed from the page and thus needs to be deleted.
  2. Each page parse may trigger several separate database inserts. It would be better if we could batch the inserts (and deletions) in a single transaction.

To improve things, we should instead have PageAssessments temporarily store each assessment in the ParserOutput's ExtensionData. Then, once the page parsing is complete, we can retrieve all the data using getExtensionData(), figure out which updates need to be made, and batch the updates into a single transaction (using begin and commit).
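The reconcile-and-batch step described above could look roughly like the following. This is a language-agnostic sketch written in Python against SQLite, not the extension's actual PHP code; the table name `page_assessments`, its columns, and the function `sync_assessments` are all assumptions made for illustration.

```python
import sqlite3

# Hypothetical schema: one row per (page, project) pair.
SCHEMA = """
CREATE TABLE page_assessments (
    page_id    INTEGER NOT NULL,
    project    TEXT    NOT NULL,
    class      TEXT,
    importance TEXT,
    PRIMARY KEY (page_id, project)
)
"""


def sync_assessments(conn, page_id, parsed):
    """Reconcile the stored rows for one page with the assessments seen in a parse.

    `parsed` maps project -> (class, importance), as collected during parsing.
    Returns the projects that were deleted and the projects that were written.
    """
    cur = conn.execute(
        "SELECT project, class, importance FROM page_assessments WHERE page_id = ?",
        (page_id,),
    )
    stored = {project: (cls, imp) for project, cls, imp in cur}

    # Rows whose project no longer appears on the page must be deleted.
    to_delete = [p for p in stored if p not in parsed]
    # New or changed assessments must be written; unchanged ones are skipped.
    to_write = [(p, v) for p, v in parsed.items() if stored.get(p) != v]

    with conn:  # one transaction: commits on success, rolls back on error
        conn.executemany(
            "DELETE FROM page_assessments WHERE page_id = ? AND project = ?",
            [(page_id, p) for p in to_delete],
        )
        conn.executemany(
            "INSERT OR REPLACE INTO page_assessments "
            "(page_id, project, class, importance) VALUES (?, ?, ?, ?)",
            [(page_id, p, cls, imp) for p, (cls, imp) in to_write],
        )
    return sorted(to_delete), sorted(p for p, _ in to_write)
```

Because the full set of assessments is known before any write happens, removed assessments show up as stored rows with no match in `parsed`, which addresses the first disadvantage listed above.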

Event Timeline

kaldari created this task. Dec 10 2015, 4:23 AM
kaldari raised the priority of this task from to Needs Triage.
kaldari updated the task description.
kaldari added a project: Community-Tech-Sprint.
kaldari added a subscriber: kaldari.
Restricted Application added subscribers: StudiesWorld, Aklapper. Dec 10 2015, 4:23 AM

I ran into a fundamental problem with using getExtensionData(): we need to know the keys to use it, and we don't.
Each set would look like ( project => array( class, importance ) ), but to fetch that data back I have to know the list of WikiProjects used on the page.

DannyH edited a custom field.
brion added a subscriber: brion. Dec 10 2015, 7:03 PM

Could also do this by storing the list in a known key, or by just putting everything under a single array or object that's indexed by a known key. I'm a bit leery of exposing the full list of other extensions' internal data...

> Could also do this by storing the list in a known key, or by just putting everything under a single array or object that's indexed by a known key.

I need to store multiple key-value pairs at different times during parsing. Doing the above would mean repeatedly calling getExtensionData( 'known key' ) --> appending to the array --> setExtensionData( 'known key', 'changed array' ). Seems a bit hacky.

> I'm a bit leery of exposing the full list of other extensions' internal data...

Could you tell me what the possible concerns with doing this are? It'd be interesting to know how/why this could be a problem.

@brion, @NiharikaKohli: Another option would be to create a function for appending data to a known key, rather than replacing it. How does that sound?
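For illustration, an append helper of that kind might wrap the get/append/set cycle as sketched below. This is a Python stand-in, not MediaWiki code: `ParserOutputStub`, `append_assessment`, and the key name `ext-pageassessments` are hypothetical, and returning `None` for an unset key is an assumption about how the real store behaves.

```python
class ParserOutputStub:
    """Minimal stand-in for a per-parse key/value store (not the MediaWiki class)."""

    def __init__(self):
        self._extension_data = {}

    def set_extension_data(self, key, value):
        # Replaces whatever was stored under `key`.
        self._extension_data[key] = value

    def get_extension_data(self, key):
        # Assumed behaviour: None when the key was never set.
        return self._extension_data.get(key)


KNOWN_KEY = "ext-pageassessments"  # hypothetical key name


def append_assessment(output, project, cls, importance):
    """Add one assessment under a single known key via read-modify-write.

    Everything lives in one dict keyed by project, so the reader only needs
    KNOWN_KEY and never the list of projects used on the page.
    """
    assessments = output.get_extension_data(KNOWN_KEY) or {}
    assessments[project] = (cls, importance)
    output.set_extension_data(KNOWN_KEY, assessments)
```

Each parser-function call would then invoke `append_assessment(...)` once, and the post-parse step would read the whole dict back with a single `get_extension_data(KNOWN_KEY)` call.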

Change 259522 had a related patch set uploaded (by Niharika29):
Initial commit for PageAssessments extension

https://gerrit.wikimedia.org/r/259522

Niharika claimed this task. Dec 16 2015, 4:53 PM

This is basically finished, but the patch is awaiting security review.

Change 259522 merged by jenkins-bot:
Initial commit for PageAssessments extension

https://gerrit.wikimedia.org/r/259522

DannyH closed this task as Resolved. Jan 11 2016, 9:31 PM
DannyH moved this task from Q3 2018-19 to Q1 2018-19 on the Community-Tech-Sprint board.
DannyH added a subscriber: DannyH.
DannyH moved this task from Untriaged to Archive on the Community-Tech board.