Investigate coupling of i18n messages used in both places
Closed, Resolved · Public

Description

  • Messages in Lib are probably used in both Repo and Client
  • Hint: Check ResourceLoader resource files.

Event Timeline

Restricted Application added a project: User-Ladsgroup. · Jun 12 2020, 7:07 PM

I wrote a script to grep for each of the i18n message keys in the code base and analyze the hits.
This is the result:

Empties:
wikibase-lib-desc
specialpages-group-wikibase
wikibase-deletedentity-item
wikibase-deletedentity-property
wikibase-deletedentity-query
wikibase-error-save-connection
wikibase-error-remove-connection
wikibase-error-autocomplete-connection
wikibase-error-autocomplete-response
wikibase-sitelinks-wikipedia
wikibase-tooltip-error-details
wikibase-validator-label-too-short
wikibase-validator-description-too-short
wikibase-validator-alias-too-short
datatypes-type-string
datatypes-type-quantity
datatypes-type-monolingualtext
datatypes-type-multilingualtext
datatypes-type-number
datatypes-type-mediawiki-title
datatypes-type-unknown
datatypes-type-boolean
datatypes-type-globe-coordinate
datatypes-type-time
datatypes-type-wikibase-item
datatypes-type-wikibase-property
datatypes-type-commonsMedia
datatypes-type-geo-shape
datatypes-type-tabular-data
datatypes-type-external-id
datatypes-type-entity-schema
wikibase-entity-summary-wbsetitem
wikibase-entity-summary-wbcreate-new
wikibase-entity-summary-wbcreateredirect
wikibase-entity-summary-wbeditentity
wikibase-entity-summary-wbeditentity-create
wikibase-entity-summary-wbeditentity-update
wikibase-entity-summary-wbeditentity-override
wikibase-entity-summary-wbeditentity-update-languages
wikibase-entity-summary-wbeditentity-update-languages-short
wikibase-entity-summary-wbeditentity-update-languages-and-other
wikibase-entity-summary-wbeditentity-update-languages-and-other-short
wikibase-entity-summary-wbsetreference
wikibase-entity-summary-wbsetreference-add
wikibase-entity-summary-wbsetreference-set
wikibase-entity-summary-wbsetlabel-set
wikibase-entity-summary-wbsetlabel-remove
wikibase-entity-summary-wbsetdescription-add
wikibase-entity-summary-wbsetdescription-set
wikibase-entity-summary-wbsetdescription-remove
wikibase-entity-summary-wbsetaliases-set
wikibase-entity-summary-wbsetaliases-add-remove
wikibase-entity-summary-wbsetaliases-add
wikibase-entity-summary-wbsetaliases-remove
wikibase-entity-summary-wbsetaliases-update
wikibase-entity-summary-wbsetlabeldescriptionaliases
wikibase-entity-summary-wbsetsitelink-add
wikibase-entity-summary-wbsetsitelink-add-both
wikibase-entity-summary-wbsetsitelink-set
wikibase-entity-summary-wbsetsitelink-set-badges
wikibase-entity-summary-wbsetsitelink-set-both
wikibase-entity-summary-wbsetsitelink-remove
wikibase-entity-summary-wblinktitles-create
wikibase-entity-summary-wblinktitles-connect
wikibase-entity-summary-wbcreateclaim-value
wikibase-entity-summary-wbcreateclaim-novalue
wikibase-entity-summary-wbcreateclaim-somevalue
wikibase-entity-summary-wbcreateclaim
wikibase-entity-summary-wbsetclaimvalue
wikibase-entity-summary-wbremoveclaims
wikibase-entity-summary-wbremoveclaims-remove
wikibase-entity-summary-wbremoveclaims-update
wikibase-entity-summary-special-create-item
wikibase-entity-summary-wbcreateclaim-create
wikibase-entity-summary-wbsetclaim-update
wikibase-entity-summary-wbsetclaim-create
wikibase-entity-summary-wbsetclaim-update-qualifiers
wikibase-entity-summary-wbsetclaim-update-references
wikibase-entity-summary-wbsetclaim-update-rank
wikibase-entity-summary-clientsitelink-update
wikibase-entity-summary-clientsitelink-remove
wikibase-entity-summary-wbsetqualifier-add
wikibase-entity-summary-wbsetqualifier-update
wikibase-entity-summary-wbremovequalifiers-remove
wikibase-entity-summary-wbremovereferences-remove
wikibase-entity-summary-wbmergeitems-from
wikibase-entity-summary-wbmergeitems-to
wikibase-item-summary-wbcreate-new
wikibase-item-summary-wbeditentity
wikibase-item-summary-wbeditentity-create
wikibase-item-summary-wbeditentity-create-item
wikibase-item-summary-wbeditentity-update
wikibase-item-summary-wbeditentity-override
wikibase-item-summary-wblinktitles-create
wikibase-item-summary-wblinktitles-connect
wikibase-property-summary-wbcreate-new
wikibase-property-summary-wbeditentity-create
wikibase-property-summary-wbeditentity-create-property
wikibase-property-summary-wbeditentity-update
wikibase-property-summary-wbeditentity-override
wikibase-property-summary-special-create-property
------
Repo:
wikibase-error-entity-too-big 	 php,php
wikibase-parse-error 	 php,php,php,php,php,php
wikibase-parse-error-coordinate 	 php
wikibase-parse-error-entity-id 	 php
wikibase-parse-error-quantity 	 php
wikibase-parse-error-time 	 php
wikibase-validator-invalid 	 php,php
wikibase-validator-missing-field 	 php
wikibase-validator-bad-type 	 php
wikibase-validator-too-long 	 php,php
wikibase-validator-label-too-long 	 php,php
wikibase-validator-description-too-long 	 php,php
wikibase-validator-alias-too-long 	 php,php
wikibase-validator-too-short 	 php,php
wikibase-validator-too-high 	 php
wikibase-validator-too-low 	 php
wikibase-validator-malformed-value 	 php
wikibase-validator-bad-entity-id 	 php
wikibase-validator-bad-entity-type 	 php
wikibase-validator-no-such-entity 	 php
wikibase-validator-no-such-property 	 php
wikibase-validator-bad-value 	 php
wikibase-validator-bad-value-type 	 php
wikibase-validator-bad-data-type 	 php
wikibase-validator-bad-url 	 php,php
wikibase-validator-url-scheme-missing 	 php
wikibase-validator-bad-url-scheme 	 php
wikibase-validator-unknown-unit 	 php
wikibase-validator-not-allowed 	 php
wikibase-validator-no-validators 	 php
wikibase-time-precision-Gannum 	 php
wikibase-time-precision-Mannum 	 php
wikibase-time-precision-annum 	 php
wikibase-time-precision-millennium 	 php
wikibase-time-precision-century 	 php
wikibase-time-precision-10annum 	 php
wikibase-time-precision-BCE-Gannum 	 php
wikibase-time-precision-BCE-Mannum 	 php
wikibase-time-precision-BCE-annum 	 php
wikibase-time-precision-BCE-millennium 	 php
wikibase-time-precision-BCE-century 	 php
wikibase-time-precision-BCE-10annum 	 php
------
Lib:
wikibase-error-unknown 	 js,js,js
wikibase-error-save-generic 	 js,js
wikibase-error-remove-generic 	 js,js
wikibase-error-save-timeout 	 js,js
wikibase-error-remove-timeout 	 js,js,js
wikibase-error-ui-edit-conflict 	 js,js
wikibase-language-fallback-transliteration-hint 	 php
wikibase-quantitydetails-amount 	 php
wikibase-quantitydetails-upperbound 	 php
wikibase-quantitydetails-lowerbound 	 php
wikibase-quantitydetails-unit 	 php
wikibase-timedetails-time 	 php
wikibase-timedetails-isotime 	 php
wikibase-timedetails-timezone 	 php
wikibase-timedetails-calendar 	 php
wikibase-timedetails-precision 	 php
wikibase-timedetails-before 	 php
wikibase-timedetails-after 	 php
wikibase-globedetails-longitude 	 php
wikibase-globedetails-latitude 	 php
wikibase-globedetails-precision 	 php
wikibase-globedetails-globe 	 php
wikibase-snakview-snaktypeselector-somevalue 	 php
wikibase-snakview-snaktypeselector-novalue 	 php
wikibase-undeserializable-value 	 php,php
version-wikibase 	 php
wikibase-time-calendar-gregorian 	 php
wikibase-time-calendar-julian 	 php
wikibase-monolingualtext 	 php
wikibase-snakformatter-valuetype-mismatch 	 php
wikibase-snakformatter-property-not-found 	 php
wikibase-snakformatter-formatting-exception 	 php
wikibase-entity-summary-wbsetlabel-add 	 php
wikibase-reference-formatter-snak-separator 	 php
wikibase-reference-formatter-snak-terminator 	 php
wikibase-reference-formatter-snak-retrieved 	 php
------
Client:
wikibase-sitelinks-sitename-columnheading 	 js,js
wikibase-sitelinks-link-columnheading 	 js,js
------
Multiple:
wikibase-error-unexpected 	 js,js,js,php,php,js,js
wikibase-error-ui-no-external-page 	 js,js,js,js,php
wikibase-time-precision-BCE 	 php,php,php,php,php,php,php,php
wikibase-SortedProperties 	 php,php,php

Basically they can be grouped into five categories:

  • Empty: The grep didn't return any results. Either these messages are unused and can be dropped or (more likely) their keys are built dynamically. At a glance, most of them are used dynamically in Lib code, so we can just move them into the "Lib" group.
  • Client: These are only used in the Client code base; move them to Client.
  • Repo: Similar, but oddly there are lots of messages in this group, and it feels like moving them might break Client. Double-check before moving them to Repo (I think lib/includes/Formatters/MwTimeIsoFormatter.php is building these dynamically).
  • Lib: A large group of i18n messages is not coupled to Repo or Client, but to Lib itself.
  • Multiple: Four messages are used in multiple components.
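
The grouping above can be sketched as a small post-processing step. This is only an illustration, not the script that produced the lists: it assumes the grep hits are available as a mapping from message key to a list of paths relative to the Wikibase root, and the names `component_of` and `categorize` as well as the sample paths are made up for the example.

```python
from collections import defaultdict


def component_of(path):
    """Return the top-level component of a relative path, e.g. 'repo'."""
    return path.split('/', 1)[0]


def categorize(hits_by_key):
    """Group message keys by the component(s) in which they were found."""
    groups = defaultdict(dict)
    for key, paths in hits_by_key.items():
        components = {component_of(p) for p in paths}
        if not components:
            group = 'Empties'
        elif len(components) == 1:
            group = next(iter(components)).capitalize()
        else:
            group = 'Multiple'
        # Summarize the hits by file extension, e.g. 'php,php' or 'js,js,js'.
        groups[group][key] = ','.join(p.rsplit('.', 1)[-1] for p in paths)
    return dict(groups)


# Hypothetical input; the file names are invented for illustration only.
sample = {
    'wikibase-lib-desc': [],
    'wikibase-validator-bad-url': ['repo/includes/A.php', 'repo/includes/B.php'],
    'wikibase-error-unexpected': ['lib/resources/a.js', 'repo/includes/C.php'],
}
for group, keys in categorize(sample).items():
    print(group, keys)
```

A key lands in "Multiple" as soon as its hits span more than one top-level directory, which matches how the four messages in that group were described.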

I put the raw result in P11486.

I don't have a concrete proposal on how to handle this but it seems most of the i18n messages are coupled to lib code and they should go wherever lib goes.

Hours spent: 4 (Awesome, tictac is down).

I wrote a script to grep for each of the i18n message keys in the code base and analyze the hits.

Would you mind sharing the script with the team as well? It might be useful, e.g. when checking the situation after (some chunk of) refactoring has been performed.

Re possibly unused messages (although you mention that most of the messages are likely used "dynamically"): is there any way to find out which message keys have ultimately been used on Wikidata? E.g. some cache where we could check which keys are being stored?
Given this is not the focus of the current work, I suggest not spending time on the unused messages unless an easy way to identify them is already known.

Sure, it's a really simple Python script:

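import json
import subprocess

# Path to the local Wikibase checkout; adjust for your environment.
wikibase_path = '/var/lib/mediawiki2/extensions/Wikibase/'

# Load all message keys defined in Lib's English i18n file.
with open(wikibase_path + 'lib/i18n/en.json', 'r') as f:
    keys = json.load(f).keys()

# Directories to search, relative to wikibase_path.
paths = [
    'lib/includes/',
    'lib/resources/',
    'repo/resources/',
    'repo/includes/',
    'client/includes/',
    'client/resources/'
]

# Maps each message key to the files it was found in (one entry per matching line).
result = {}
for key in keys:
    # Skip metadata entries such as "@metadata".
    if key.startswith('@'):
        continue
    case = []
    for path in paths:
        # Recursive, case-insensitive grep; each output line looks like "file:matching line".
        res = subprocess.run(["grep", "-ir", "--", key, wikibase_path + path], stdout=subprocess.PIPE)
        result_str = res.stdout.decode('utf-8', errors='replace')
        for line in result_str.split('\n'):
            # Keep only the file name, relative to the Wikibase root.
            line = line.split(':')[0].replace(wikibase_path, '')
            if not line:
                continue
            case.append(line)
    result[key] = case

# Raw result (key -> files), followed by a summary (key -> number of hits).
print(json.dumps(result))
print(json.dumps({i: len(result[i]) for i in result}))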
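
One caveat with a plain substring grep: a key that is a prefix of another key (for example wikibase-parse-error and wikibase-parse-error-coordinate) also matches every line that mentions the longer key, which may inflate its hit count. A minimal sketch of a stricter match follows, assuming keys consist only of letters, digits and hyphens; the helper names `exact_key_pattern` and `mentions_exactly` are hypothetical and not part of the script above.

```python
import re


def exact_key_pattern(key):
    """Compile a regex that matches `key` only when it is not part of a longer key."""
    # A word character or hyphen directly before or after the match means
    # we are in the middle of a longer message key, so reject it.
    return re.compile(r'(?<![\w-])' + re.escape(key) + r'(?![\w-])')


def mentions_exactly(key, line):
    """Return True if `line` contains `key` as a complete message key."""
    return exact_key_pattern(key).search(line) is not None


print(mentions_exactly('wikibase-parse-error', "'wikibase-parse-error'"))
print(mentions_exactly('wikibase-parse-error', "'wikibase-parse-error-coordinate'"))
```

The trade-off is that this stricter match also ignores code that builds keys by appending a suffix to a prefix ending in a hyphen, so it suits counting literal uses rather than finding dynamic ones.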

Re possibly unused messages (although you mention that most of the messages are likely used "dynamically"): is there any way to find out which message keys have ultimately been used on Wikidata? E.g. some cache where we could check which keys are being stored?
Given this is not the focus of the current work, I suggest not spending time on the unused messages unless an easy way to identify them is already known.

Good point, I'll try to take a look.

Double-checked some of the messages:

  • I agree the “Client” ones can move to Client. Even the word “columnheading” on its own doesn’t occur anywhere else, so I think the risk of those keys being dynamically built is very low.
  • The validator-*, parse-error and entity-too-big Repo messages are really only used in Repo as far as I can tell (and conceptually, this makes sense, too), and can probably move to Repo.
  • The time-precision-* Repo messages are indeed built dynamically in Lib and can, I think, also be used in Client.
  • The datatypes-type-* unused messages are constructed dynamically and apparently used in Lib and View.
  • All of the *-summary-* unused messages are almost certainly constructed dynamically and used at least in Client, probably in Repo too.
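
A rough way to spot dynamically built keys like these is to look for a hyphen-terminated prefix of the key in the source files. The following is a heuristic sketch only, not something used in this task: `longest_prefix_in_source`, its `min_parts` parameter and the sample PHP line are assumptions made for illustration.

```python
def longest_prefix_in_source(key, sources, min_parts=2):
    """Return the longest hyphen-terminated proper prefix of `key` that
    occurs in any of the given source texts, or None if there is none.

    `min_parts` is the minimum number of hyphen-separated segments in the
    prefix, so that very generic prefixes like 'wikibase-' are ignored.
    """
    parts = key.split('-')
    # Try the longest candidate first, e.g. 'a-b-c-' before 'a-b-'.
    for end in range(len(parts) - 1, min_parts - 1, -1):
        prefix = '-'.join(parts[:end]) + '-'
        if any(prefix in text for text in sources):
            return prefix
    return None


# Hypothetical source line, invented for illustration only.
sources = ["$key = 'wikibase-time-precision-' . $suffix;"]
print(longest_prefix_in_source('wikibase-time-precision-BCE-Gannum', sources))
print(longest_prefix_in_source('datatypes-type-string', sources))
```

A hit is only a hint that a key may be assembled at runtime; short prefixes are weak evidence, so each candidate still needs a manual check like the ones above.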

I also noticed that we send the datatypes-type-* messages to browsers (as part of a ResourceLoader module), even though, as far as I can tell, they haven’t been used there for ages. Removing in Ib060a31449.

More of the empties:

  • specialpages-group-wikibase is the title of the Wikibase group on Special:SpecialPages.
  • wikibase-lib-desc is the descriptionmsg in extension-lib-wip, shown on Special:Version.
  • wikibase-deletedentity-* is constructed and used in Lib.
  • wikibase-error-save-connection and wikibase-error-remove-connection are unused since Ic1881c23b8 and no longer sent with ResourceLoader since I8ad97a4ecb. Remove.
  • wikibase-error-autocomplete-connection and wikibase-error-autocomplete-response are unused since Iabda9f7469 and also no longer sent with ResourceLoader since I8ad97a4ecb. Remove. (One test seems to use wikibase-error-autocomplete-response, but looks like it could just as well use any other message.)
  • wikibase-sitelinks-wikipedia is the title of the Wikipedia sitelinks group, though it’s not clear to me why this message is in Wikibase whereas wikibase-sitelinks-wikibooks, wikibase-sitelinks-wikivoyage etc. are in WikimediaMessages. I would think that wikibase-sitelinks-wikipedia also belongs in WikimediaMessages, but maybe Wikipedia is a default sitelink group of Wikibase?
  • wikibase-tooltip-error-details is truly unused, since T141879. Remove.
  • wikibase-validator-*-too-short are almost certainly constructed in Repo’s StringLengthValidator, just not mentioned in the “possible messages” comment. Mention them.

ItamarWMDE added a comment. · Edited · Jun 19 2020, 10:43 AM

Copy over from meeting notes:

Summary: Lib contains many messages shared between Repo and Client.
Risks, threats, challenges identified: PHP code that depends on i18n messages is harder to move into separate composer packages.
Opportunities noticed: Some Lib messages were only used in one component or fully unused. We moved (T255866) or removed (T255650) those.
Other remarks: (none)

Ladsgroup closed this task as Resolved. · Jun 24 2020, 10:10 AM