Structured Data on Commons entities returned by mw.wikibase.getEntity lua function differ based on language of the viewer
Open, Needs Triage · Public

Description

The mw.wikibase.getEntity Lua function, enabled for Structured Data on Commons (SDC) by T223792, returns an SDC entity. The easiest way to inspect it is with the mw.dumpObject function. The output of mw.dumpObject (with the "statements" table collapsed) for File:Indoor_Climbing_Kid.jpg (M4184419) is:

`table#1 {
  metatable = table#2
  ["id"] = "M4184419",
  ["labels"] = table#3 {
    metatable = table#4
    ["en"] = table#5 {
      ["language"] = "en",
      ["value"] = "A five year old hanging around bouldering wall in Sportrock climbing gym in Alexandria, Virginia, USA",
    },
  },
  ["schemaVersion"] = 2,
  ["statements"] = table#6 { ... },
  ["type"] = "mediainfo",
}`

which is correct, as the file has only an English caption. That changes when I switch my language from English to Polish; then I get:

`table#1 {
  metatable = table#2
  ["id"] = "M4184419",
  ["labels"] = table#3 {
    metatable = table#4
    ["en"] = table#5 {
      ["language"] = "en",
      ["value"] = "A five year old hanging around bouldering wall in Sportrock climbing gym in Alexandria, Virginia, USA",
    },
    ["pl"] = table#6 {
      ["language"] = "en",
      ["value"] = "A five year old hanging around bouldering wall in Sportrock climbing gym in Alexandria, Virginia, USA",
    },
  },
  ["schemaVersion"] = 2,
  ["statements"] = table#7 { ... },
  ["type"] = "mediainfo",
}`

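Note that the extra ["pl"] entry has ["language"] = "en" and simply repeats the English caption. The returned entity should not depend on the viewer's language.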
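Until this is resolved, a module could skip such entries by comparing each key of the labels table with the entry's own ["language"] field. The following is only a minimal sketch in plain Lua, assuming the entity has the shape shown in the dumps above; the helper name ownLanguageLabels is hypothetical (it is not part of mw.wikibase), and the sample table is a hand-built stand-in for what mw.wikibase.getEntity returned, not a real call.

```lua
-- Hypothetical helper (not part of mw.wikibase): keep only the labels whose
-- table key matches the label's own "language" field.
local function ownLanguageLabels(entity)
  local result = {}
  for key, label in pairs(entity.labels or {}) do
    if label.language == key then
      result[key] = label.value
    end
  end
  return result
end

-- Hand-built stand-in for the entity seen with Polish as the viewer's
-- language, copied from the second dump above (statements omitted).
local caption = "A five year old hanging around bouldering wall in Sportrock climbing gym in Alexandria, Virginia, USA"
local entityAsSeenInPolish = {
  id = "M4184419",
  labels = {
    en = { language = "en", value = caption },
    pl = { language = "en", value = caption },
  },
  schemaVersion = 2,
  type = "mediainfo",
}

local labels = ownLanguageLabels(entityAsSeenInPolish)
print(labels.en == caption) -- true
print(labels.pl)            -- nil
```

With this filter the result is the same for both dumps above, since the ["pl"] entry is dropped because its ["language"] field is "en". It only hides the symptom on the module side and does nothing about the entity itself changing with the viewer's language.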

Event Timeline

Jarekt created this task. Nov 17 2019, 4:50 AM
Restricted Application added a project: Wikidata. Nov 17 2019, 4:50 AM
Restricted Application added a subscriber: Aklapper.

I am able to reproduce this on Commons, but not with my local setup.
There are a few other related patches coming up soon-ish, and I'll pick this back up once those are settled.

Based on our conversation in T231952, a couple of files on Commons that I suspect have been affected by this bug are:

File:Póvoa de Varzim -i---i- (25379025808).jpg : has a caption in English, while wbc_entity_usage indicates usage of both an English and a Portuguese caption:

SELECT *
    -> FROM wbc_entity_usage
    -> WHERE eu_page_id = 68860692;
+------------+--------------+-----------+------------+
| eu_row_id  | eu_entity_id | eu_aspect | eu_page_id |
+------------+--------------+-----------+------------+
| 2051749177 | M68860692    | L.en      |   68860692 |
| 2055570857 | M68860692    | L.pt      |   68860692 |
+------------+--------------+-----------+------------+
2 rows in set (0.00 sec)

File:Bolide.jpg : has captions in English and Chinese, while wbc_entity_usage indicates 31 languages:

SELECT *
    -> FROM wbc_entity_usage
    -> WHERE eu_page_id = 10184478;
+------------+--------------+-------------+------------+
| eu_row_id  | eu_entity_id | eu_aspect   | eu_page_id |
+------------+--------------+-------------+------------+
| 2037902202 | M10184478    | L.fr        |   10184478 |
| 2059352313 | M10184478    | L.tr        |   10184478 |
| 2063734518 | M10184478    | L.be-tarask |   10184478 |
| 2064893164 | M10184478    | L.ky        |   10184478 |
| 2065176878 | M10184478    | L.en        |   10184478 |
| 2067319129 | M10184478    | L.zh-hk     |   10184478 |
| 2069482808 | M10184478    | L.ru        |   10184478 |
| 2070527405 | M10184478    | L.hr        |   10184478 |
| 2074289864 | M10184478    | L.az        |   10184478 |
| 2074787102 | M10184478    | L.uk        |   10184478 |
| 2079819984 | M10184478    | L.fa        |   10184478 |
| 2079915614 | M10184478    | L.zh        |   10184478 |
| 2082093218 | M10184478    | L.et        |   10184478 |
| 2083792222 | M10184478    | L.zh-tw     |   10184478 |
| 2085056560 | M10184478    | L.ar        |   10184478 |
| 2085263594 | M10184478    | L.lv        |   10184478 |
| 2091202957 | M10184478    | L.pl        |   10184478 |
| 2097044668 | M10184478    | L.fi        |   10184478 |
| 2097635187 | M10184478    | L.cs        |   10184478 |
| 2098189412 | M10184478    | L.ta        |   10184478 |
| 2100409677 | M10184478    | L.mk        |   10184478 |
| 2102465127 | M10184478    | L.sr        |   10184478 |
| 2105562109 | M10184478    | L.sh        |   10184478 |
| 2106360150 | M10184478    | L.ko        |   10184478 |
| 2110227952 | M10184478    | L.kk        |   10184478 |
| 2116158523 | M10184478    | L.nl        |   10184478 |
| 2130720957 | M10184478    | L.be        |   10184478 |
| 2131162549 | M10184478    | L.af        |   10184478 |
| 2142902515 | M10184478    | L.zh-cn     |   10184478 |
| 2143168287 | M10184478    | L.vi        |   10184478 |
| 2149523316 | M10184478    | L.de        |   10184478 |
+------------+--------------+-------------+------------+
31 rows in set (0.00 sec)
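The comparison made for these files (label usages recorded in wbc_entity_usage versus captions the file actually has) can be sketched offline as follows. This is a rough illustration in plain Lua, not a query against the real database: the helper name usagesWithoutCaption is hypothetical, and the inputs are typed in by hand from the first query result above.

```lua
-- Hypothetical helper: given eu_aspect values and the set of caption languages
-- a file actually has, return the languages of label usages ("L.<lang>")
-- that have no matching caption.
local function usagesWithoutCaption(aspects, captionLanguages)
  local extra = {}
  for _, aspect in ipairs(aspects) do
    local lang = aspect:match("^L%.(.+)$")
    if lang and not captionLanguages[lang] then
      extra[#extra + 1] = lang
    end
  end
  table.sort(extra)
  return extra
end

-- Values for page 68860692, taken from the first query result above;
-- the file has a caption in English only.
local aspects = { "L.en", "L.pt" }
local captions = { en = true }

print(table.concat(usagesWithoutCaption(aspects, captions), ", ")) -- pt
```

For the first file this reports only pt, the usage that has no corresponding caption.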