User Details
- User Since
- Nov 29 2016, 4:04 AM (285 w, 4 d)
- Availability
- Available
- IRC Nick
- hall1467
- LDAP User
- Unknown
- MediaWiki User
- Hall1467
Oct 1 2018
I updated the ORES utility to 1.2.0 and reran it. It ran for a little while and then hung for 20 minutes. I then hit Ctrl-C and received the following traceback:
Jun 8 2017
@Ladsgroup: I believe this is the only related open patch right now: https://gerrit.wikimedia.org/r/#/c/355104/. Is that right, @hoo?
May 22 2017
@hoo: elwiki sounds good to me.
Apr 12 2017
@jcrespo: Regarding your first comment, the patch that I linked to (in my comment from one month ago) is now invalid, since we are no longer planning to use a separate table to implement statement tracking. A new patch implementing statement tracking with wbc_entity_usage should be ready soon.
@jcrespo: We have decided to update the existing wbc_entity_usage table to allow for statement tracking (probably done via the "eu_aspect_id" field). We would do an initial deployment to a medium-sized Wikipedia first and calculate the database IO load. We will be able to undo any database changes made in this initial deployment, and it will not affect the tracking currently being done in wbc_entity_usage. Three medium-sized Wikipedias that @hoo and I have considered for this deployment are:
Mar 29 2017
@madhuvishy Yes, the file is there now. Thank you!
Mar 7 2017
@jcrespo: Asking my question on here per your request.
Feb 17 2017
Dec 23 2016
To follow up on @Halfak's database usage assessments, the estimate of 5 properties per entity/page relationship seems reasonable and conservative, since the average number of statements per entity is in fact ~5, as seen here: https://grafana.wikimedia.org/dashboard/db/wikidata-datamodel-statements (it's worth noting that the average is increasing). This assumes that, most of the time, a Wikipedia page references data from just its corresponding Wikidata entity.
Dec 19 2016
If we choose to go the database route instead of EventLogging, my initial thought would be to create a new table, modeled after @Halfak's reference and the current entity usage documentation (https://github.com/wikimedia/mediawiki-extensions-Wikibase/blob/master/docs/usagetracking.wiki), containing:
- page_id (UNSIGNED INT) -- Wikipedia page using the entity's data
- entity_id (UNSIGNED INT) -- the entity id
- property_id (UNSIGNED INT) -- the property used by the entity
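To make the proposed columns concrete, here is a rough sketch using Python's built-in sqlite3 module with an in-memory database. The table name statement_usage, the composite primary key, and the record_usage helper are assumptions made for illustration only; they are not part of the proposal or of Wikibase, and SQLite's INTEGER stands in for UNSIGNED INT.

```python
import sqlite3

# Hypothetical table name and primary key; only the three columns
# come from the list above. SQLite's INTEGER stands in for UNSIGNED INT.
SCHEMA = """
CREATE TABLE IF NOT EXISTS statement_usage (
    page_id     INTEGER NOT NULL,  -- Wikipedia page using the entity's data
    entity_id   INTEGER NOT NULL,  -- the entity id
    property_id INTEGER NOT NULL,  -- the property used by the entity
    PRIMARY KEY (page_id, entity_id, property_id)
)
"""


def record_usage(conn, page_id, entity_id, property_id):
    """Record one (page, entity, property) usage; repeats are ignored."""
    conn.execute(
        "INSERT OR IGNORE INTO statement_usage VALUES (?, ?, ?)",
        (page_id, entity_id, property_id),
    )


conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)

record_usage(conn, 10, 42, 31)
record_usage(conn, 10, 42, 31)   # same usage again, ignored
record_usage(conn, 10, 42, 569)

# All properties that page 10 uses, across entities.
rows = conn.execute(
    "SELECT property_id FROM statement_usage "
    "WHERE page_id = ? ORDER BY property_id",
    (10,),
).fetchall()
```

With a key like this, a page that reads the same property twice still costs one row, which is the case the "properties per entity/page relationship" estimate is counting.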
Dec 15 2016
I'd be happy to take the lead on the implementation of this task. Would that be okay?
Dec 14 2016
I think a good way to go about this tracking would be to override the Lua direct access method for the table that represents the entity.
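The idea of intercepting direct access can be sketched roughly as follows. This is a Python analogue, not the Lua code itself: in Lua the interception would presumably be done on the entity table, while here a small wrapper class overrides item access. The class name TrackedEntity, its fields, and the placeholder values are all hypothetical.

```python
class TrackedEntity:
    """Wraps an entity's claims and remembers which properties were read.

    Hypothetical illustration only; not part of Wikibase or its Lua library.
    """

    def __init__(self, entity_id, claims):
        self.entity_id = entity_id
        self._claims = claims          # property id -> list of values
        self.used_properties = set()   # filled in as properties are read

    def __getitem__(self, property_id):
        # Look the property up first so that a failed lookup (KeyError)
        # is not recorded as a usage.
        value = self._claims[property_id]
        self.used_properties.add(property_id)
        return value


entity = TrackedEntity("Q1", {"P1": ["value-a"], "P2": ["value-b"]})
first = entity["P1"]   # direct access is intercepted and recorded
```

After rendering a page, used_properties would hold exactly the properties that the page read, which is the information that statement tracking needs to store.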