
Detailed Reports from game DB
Closed, Resolved · Public

Authored By
Tarrow
May 25 2020, 11:41 AM
Referenced Files
F31947136: item_property_value_ratio.csv
Jul 24 2020, 6:15 PM
F31947135: datatype_ratio.csv
Jul 24 2020, 6:15 PM
F31947134: property_ratio.csv
Jul 24 2020, 6:15 PM
F31942426: item_property_value_ratio.csv
Jul 20 2020, 10:39 PM
F31942424: datatype_ratio.csv
Jul 20 2020, 10:39 PM
F31942425: property_ratio.csv
Jul 20 2020, 10:39 PM
F31868192: image.png
Jun 16 2020, 4:39 PM
F31845337: rejected-matches.csv
May 28 2020, 12:48 PM

Description

Both the dev team and the community want to better understand which of the references we found are good and which are bad. The dev team will use this information to see how to improve the extraction process in the future. The community will use this information to better understand which parts of the found references can be imported without additional human curation.

In order to make these decisions we need an overview of the decisions already made in the reference hunter game. We want an overview similar to this:

image.png (178×705 px, 18 KB)

Event Timeline


I believe the most helpful/highest priority ones are:

  • per external ID: total number of accepted references, total number of rejected references, ratio between the two <- with this we can know which Properties to potentially blacklist as well as which references we can potentially just import whole-sale
  • per datatype: total number of accepted references, total number of rejected references, ratio between the two <- with this we can know which of the data types are especially problematic and worthy of improving to increase the acceptance rate
  • per Property: total number of accepted references, total number of rejected references, ratio between the two <- with this we can know if there are Properties like country which are particularly difficult for us to get correct references for
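For illustration, the three breakdowns above come down to the same aggregation with a different grouping key. A minimal Python sketch, assuming each decision is a record with the hypothetical fields `property`, `datatype` and `accepted` (the game's real schema may differ):

```python
from collections import defaultdict

def tally(decisions, key):
    """Count accepted/rejected decisions per value of `key` and add their ratio."""
    counts = defaultdict(lambda: {"accepted": 0, "rejected": 0})
    for d in decisions:
        counts[d[key]]["accepted" if d["accepted"] else "rejected"] += 1
    for c in counts.values():
        # The ratio is undefined when nothing was rejected, so use None there.
        c["ratio"] = c["accepted"] / c["rejected"] if c["rejected"] else None
    return dict(counts)

# Hypothetical sample records, not taken from the game database.
decisions = [
    {"property": "P17", "datatype": "wikibase-item", "accepted": True},
    {"property": "P17", "datatype": "wikibase-item", "accepted": True},
    {"property": "P17", "datatype": "wikibase-item", "accepted": False},
    {"property": "P569", "datatype": "time", "accepted": True},
]

per_property = tally(decisions, "property")
per_datatype = tally(decisions, "datatype")
```

The same function would cover the per external ID case if the records carried a field identifying the external ID (again an assumption about the data).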

Ah and I forgot the examples: an example of 5 random references for each of those cases would help get a better understanding of them

ItamarWMDE subscribed.

This task depends on T251111, and can only be picked up when it is complete.

> Ah and I forgot the examples: an example of 5 random references for each of those cases would help get a better understanding of them

@Lydia_Pintscher Random rejected or random accepted? (assuming rejected for now)

Follow-up after our meeting @darthmon_wmde:

Could someone please ping me when we have the data for this and let me know where the data live?

Also, I can now see some sample data on https://tools.wmflabs.org/wd-ref-island/stats.php from T251111.

> > Ah and I forgot the examples: an example of 5 random references for each of those cases would help get a better understanding of them
>
> @Lydia_Pintscher Random rejected or random accepted? (assuming rejected for now)

Both would be useful I think.
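A minimal sketch of drawing up to 5 random accepted and 5 random rejected references per group, in Python. The record fields (`property`, `accepted`, `url`) and the fixed seed are assumptions for illustration, not the game's actual schema:

```python
import random
from collections import defaultdict

def sample_references(decisions, key, k=5, seed=0):
    """Return up to k random accepted and up to k random rejected references per group."""
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    groups = defaultdict(lambda: {"accepted": [], "rejected": []})
    for d in decisions:
        groups[d[key]]["accepted" if d["accepted"] else "rejected"].append(d)
    return {
        group: {
            outcome: rng.sample(refs, min(k, len(refs)))
            for outcome, refs in outcomes.items()
        }
        for group, outcomes in groups.items()
    }

# Hypothetical records; `url` stands in for whatever identifies a reference.
decisions = [
    {"property": "P17", "accepted": i % 3 != 0, "url": f"https://example.org/{i}"}
    for i in range(30)
]
samples = sample_references(decisions, "property")
```

Groups with fewer than 5 references for an outcome simply return all of them.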

> Follow-up after our meeting @darthmon_wmde:
>
> Could someone please ping me when we have the data for this and let me know where the data live?
>
> Also, I can now see some sample data on https://tools.wmflabs.org/wd-ref-island/stats.php from T251111.

I uploaded the accepted and rejected matches here. It's not a sample. It's all of them:


@GoranSMilovanovic, @ItamarWMDE will be adding you to toolforge, where that data lives.

@GoranSMilovanovic Regarding when we will have all the data: as of now, there are already potential references in the game, so there must be some data already. Other than that, the scraping is running, and we will wait for the full dump of potential references, which can take a minimum of 20 days, before loading it into the game. It would be great to be able to visualise the acceptance of the references from that moment on, if that's possible.

> Follow-up after our meeting @darthmon_wmde:
>
> Could someone please ping me when we have the data for this and let me know where the data live?
>
> Also, I can now see some sample data on https://tools.wmflabs.org/wd-ref-island/stats.php from T251111.

Hi Goran,

Nice to (virtually) meet you :). Following up after a quick talk with @darthmon_wmde: in order to access the data, we will need to add you to the game's Toolforge account as a maintainer; there you will have access to run your own scripts on the game's database. Can you please provide me with your Toolforge username so I can add you?

Also, as Monica might have mentioned, we are currently running a pipeline to find new matches for the game, which we estimate will be done in a few days or so. @Ladsgroup can notify you once the pipeline has completed running and the game's database has been populated, so that you can run your analysis. You can find more details about the data and how it is structured in https://github.com/wmde/reference-island and in the following files in particular (documentation WIP):

Feel free to ping us if there are any additional questions or requirements.

@Ladsgroup Thanks for the datasets.
@darthmon_wmde Thanks for the follow up.
@ItamarWMDE Nice to meet you too :) My LDAP is GoranSMilovanovic and I was able to log in to Toolforge with it (+2FA) just a minute ago.

Preliminary results based on T253552#6172533 @Ladsgroup datasets:

Per datatype:

datatype    accepted rejected ratio
entity-type      419      119  3.52
text               3        1  3.00
time              66        6 11.00

Per property:

   Property accepted rejected  ratio
1       P17      105        1 105.00
2      P495       46        1  46.00
3       P27       44        3  14.67
4      P580       13        1  13.00
5      P407       81        7  11.57
6      P569       29        3   9.67
7      P571        5        1   5.00
8       P21       67       15   4.47
9     P2561        3        1   3.00
10     P136        8        4   2.00
11     P277        9        5   1.80
12     P175        4        7   0.57
13     P921       11       25   0.44
14     P361        6       34   0.18
15      P50        1       16   0.06
16      P19        4       NA     NA
17      P20       21       NA     NA
18     P404       11       NA     NA
19     P570        5       NA     NA
20     P577       14       NA     NA
21     P735        1       NA     NA

Note. NA in the rejected column means that no rejections were recorded (i.e. rejected is 0); the ratio would then require division by zero, so it is reported as NA as well.
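As a rough illustration of how such NA ratios can arise, here is a minimal Python sketch that joins separate accepted and rejected counts per property. The function name and the use of `None` for NA are assumptions; this is not the code that produced the table:

```python
def ratio_rows(accepted, rejected):
    """Join per-property counts; a property missing from either input gets None (NA)."""
    rows = []
    for prop in sorted(set(accepted) | set(rejected)):
        acc = accepted.get(prop)
        rej = rejected.get(prop)
        # With no recorded rejections the ratio would divide by zero, so report None.
        ratio = round(acc / rej, 2) if acc is not None and rej else None
        rows.append((prop, acc, rej, ratio))
    return rows

# Counts copied from three rows of the per-property table above.
accepted = {"P17": 105, "P27": 44, "P19": 4}
rejected = {"P17": 1, "P27": 3}
rows = ratio_rows(accepted, rejected)
```

Replacing the missing rejected count with 0 instead of `None` would keep the counts numeric, but the ratio would still have to stay undefined.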

Note. @Lydia_Pintscher With respect to T253552#6162674, external IDs are outside the scope of this preliminary analysis because only one external-ID property was found in this sample.

@Lydia_Pintscher as agreed in our 1:1 today:

  • criterion: do not consider property-value pairs that were not reviewed by at least 5 editors;
  • criterion: an acceptance rate of at least 95%, meaning that any pair with up to 19 decisions must be accepted unanimously (a single rejection among 19 decisions already gives 18/19 ≈ 94.7%).
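To make the arithmetic behind the second criterion concrete, the following Python sketch computes how many rejections a pair can absorb at a given number of decisions. The constant and function names are made up for illustration:

```python
from fractions import Fraction

MIN_DECISIONS = 5                    # criterion 1: at least 5 reviewers
MIN_ACCEPTANCE = Fraction(95, 100)   # criterion 2: at least 95% accepted

def max_rejections_allowed(total):
    """Largest number of rejections that still keeps the acceptance rate >= 95%."""
    if total < MIN_DECISIONS:
        return None  # not enough decisions to judge at all
    rejected = 0
    # Keep allowing one more rejection while the rate would stay at or above 95%.
    while Fraction(total - (rejected + 1), total) >= MIN_ACCEPTANCE:
        rejected += 1
    return rejected

allowed = {n: max_rejections_allowed(n) for n in (4, 5, 19, 20, 40)}
# allowed == {4: None, 5: 0, 19: 0, 20: 1, 40: 2}
```

Exact fractions are used instead of floats so that the boundary case 19/20 = 95% is not affected by rounding.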

I opened the ticket for making the list of accepted references available for download and discussed it with Goran today: T255583

@Lydia_Pintscher

From what we now have on https://wd-ref-island.toolforge.org/stats.php (thanks @ItamarWMDE):

  • three .csv files are delivered here (see below):
  • datatype_ratio.csv - per datatype statistics (aggregated),
  • property_ratio.csv - per property statistics (aggregated), and finally
  • item_property_value_ratio.csv - statistics for each Item x Property x Extracted Value combination.

Columns:

  • accepted - number of users who have accepted the suggested value;
  • rejected - number of users who have rejected the suggested value;
  • ratio - the ratio of accepted to rejected;
  • percent_accepted - % of users who have accepted the suggested value;
  • total_decisions - total number of users who have assessed the suggested value (i.e. num.accepted + num.rejected observations);
  • is_accepted (only in item_property_value_ratio.csv): see T253552#6227594, we decide to accept the suggested value, for a given property and a given item, if (1) there were at least 5 total_decisions made, and (2) percent_accepted is >= 95%.
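The columns above can be derived from the two raw counts. A minimal Python sketch, assuming `None` is used where a value is undefined (the function name and defaults are illustrative, not taken from the analysis code):

```python
def describe(accepted, rejected, min_decisions=5, min_percent=95.0):
    """Derive the columns listed above from raw accepted/rejected counts."""
    total = accepted + rejected
    percent = 100.0 * accepted / total if total else None
    return {
        "accepted": accepted,
        "rejected": rejected,
        # Undefined when there are no rejections.
        "ratio": accepted / rejected if rejected else None,
        "percent_accepted": percent,
        "total_decisions": total,
        # Both criteria must hold: enough decisions and a high enough acceptance rate.
        "is_accepted": total >= min_decisions
        and percent is not None
        and percent >= min_percent,
    }

row = describe(accepted=19, rejected=1)
```

Here `row` has 20 total decisions and exactly 95% accepted, so it meets both criteria; `describe(4, 0)` does not, because it has fewer than 5 decisions.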

Notes:

  • only complete observations were analyzed (e.g. if a datatype was not parsed, the observation was discounted; it happens only once in these datasets);
  • still no data on any external identifiers;
  • all datasets are sorted in a decreasing order of the number of total_decisions made;
  • 1162 item x property x extracted value combinations were observed in item_property_value_ratio.csv;
  • only 26 of these combinations have received >= 5 decisions;
  • only 5 of these combinations satisfy both criteria (1, 2) to be accepted.

Files:

Thank you! The property_ratio and datatype_ratio files seem to be identical.

@Lydia_Pintscher

> The property_ratio and datatype_ratio files seem to be identical.

Sorry, the same variable name was erroneously re-used in my code. Here they are:

@Lydia_Pintscher @GoranSMilovanovic

  • We want to have a dashboard developed for this
  • and updated regularly.

@Lydia_Pintscher What is the status of this ticket? Do we need any additional features or work invested here or should we close it?

@Lydia_Pintscher

As per your request in a recent e-mail: