Register Abstract Wikipedia as a special kind of wiki with Wikidata so it can be linked to/from distinctly from being just a Wikipedia?
Closed, Resolved (Public)

Description

We want Abstract Wikipedia to be a Wikidata target, so that sitelinks can be added both from and to it. (i.e. abstract.wikipedia.org/view/en/Q42 should cross-link to wikidata.org/wiki/Q42 etc.)

However, it's not really "a Wikipedia", so we probably want to show it apart from the language list of Wikipedias, and instead as a special wiki like we do for Commons. Is this feasible?

Event Timeline

Change #1248589 had a related patch set uploaded (by Jforrester; author: Jforrester):

[mediawiki/extensions/WikimediaMessages@master] [WIP] wikimedia: Add labels for Wikidata to let Abstract Wikipedia be a source

https://gerrit.wikimedia.org/r/1248589

Change #1254359 had a related patch set uploaded (by Jforrester; author: Jforrester):

[operations/mediawiki-config@master] [DNM] Make abstractwiki a multi-lingual Wikidata client

https://gerrit.wikimedia.org/r/1254359

Jdforrester-WMF changed the task status from Open to In Progress. (Mar 18 2026, 5:11 PM)
Jdforrester-WMF claimed this task.
Jdforrester-WMF triaged this task as Medium priority.

This is pending advice from WMDE about whether this is OK from a Product/Engineering perspective.

From my side this makes sense but I think we need a decision on T421151#11808877.

One thing to be aware of: https://www.wikidata.org/wiki/Q5296 and potentially others already have sitelinks in the Wikipedia section. That'd probably need cleanup.

Yes, the community are manually adding the links after @Zabe created Abstract Wikipedia as a "normal" Wikipedia on Wikidata. I'd love to get a steer on what steps we need to do so that we can correct things so they don't get worse, assuming we can do the clean-up afterwards?

I.e., the last time populateSitesTable was run to update the sites table, it added abstractwiki as a wikipedia, and from what I can tell it would still do that, so some config change appears to be necessary here.

MariaDB [wikidatawiki_p]> select * from sites where site_global_key = 'abstractwiki';
+---------+-----------------+-----------+------------+-------------+---------------+---------------+-------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------+--------------+-------------+
| site_id | site_global_key | site_type | site_group | site_source | site_language | site_protocol | site_domain             | site_data                                                                                                                                         | site_forward | site_config |
+---------+-----------------+-----------+------------+-------------+---------------+---------------+-------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------+--------------+-------------+
|    1077 | abstractwiki    | mediawiki | wikipedia  | local       | abstract      | https         | gro.aidepikiw.tcartsba. | a:1:{s:5:"paths";a:2:{s:9:"file_path";s:35:"https://abstract.wikipedia.org/w/$1";s:9:"page_path";s:38:"https://abstract.wikipedia.org/wiki/$1";}} |            0 | a:0:{}      |
+---------+-----------------+-----------+------------+-------------+---------------+---------------+-------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------+--------------+-------------+
1 row in set (0,002 sec)

MariaDB [wikidatawiki_p]>

Yup! The technical changes to treat it as a special wiki hadn't been made yet. That's what we filed this task about, but it was overtaken by the very keen creation of T420615: Post-creation work for abstractwiki which I was still editing to say DON'T DO THINGS YET when… they were already done. Oops.

Yeah, I am very sorry, I should have checked.

The section isn’t stored in the saved data (only the site, title and badges are), so I don’t think that’s a problem – if the site is moved to another section, the sitelinks should just move automatically (after a purge).
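
To illustrate the point about what is stored, here is a minimal sketch in Python of how a display section can be derived at render time instead of being saved with the sitelink. The `section_for` helper, the abbreviated `special` set and the placeholder title are hypothetical and are not Wikibase code; only the site/title/badges shape comes from the comment above.

```python
# Hypothetical model: a saved sitelink carries only site, title and badges.
sitelink = {"site": "abstractwiki", "title": "Example page", "badges": []}

def section_for(link, site_groups, special_groups):
    """Derive the display section from site configuration at render time."""
    group = site_groups[link["site"]]
    return "special" if group in special_groups else group

# Abbreviated, illustrative list of groups shown under the special section.
special = {"commons", "meta", "wikifunctions", "abstract"}

# Before the change: abstractwiki is registered in the "wikipedia" group.
before = section_for(sitelink, {"abstractwiki": "wikipedia"}, special)
# After: the same stored sitelink, but a different site configuration.
after = section_for(sitelink, {"abstractwiki": "abstract"}, special)

print(before, after)
```

Under this model nothing in the stored sitelink needs rewriting when the site changes group; only cached renderings need refreshing.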

Aha, excellent. So if we land 1254359 and run the purge it should Just Work™? Or is there more to do?

I think that should Just Work™, yeah. (Can also be tested on mwdebug during the config deploy, of course.) For the purge, I think you’d need to get the list of pages with an abstractwiki via SQL and purge those via the API – I don’t think there’s a generator= module that yields “all entities with a sitelink to wiki X” – or just don’t bother and let the items fall out of the parser cache naturally, I guess.
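
A rough sketch of the purge approach described above, in Python. It assumes the sitelinks can be listed from Wikibase's `wb_items_per_site` table (columns `ips_item_id` and `ips_site_id`) and that `action=purge` accepts up to 50 titles per request for non-bot accounts. The helper name `purge_batches` is made up for illustration, and the sketch only builds request parameters; it does not send anything.

```python
# SQL to list item IDs that have an abstractwiki sitelink (e.g. on a replica).
QUERY = (
    "SELECT ips_item_id FROM wb_items_per_site "
    "WHERE ips_site_id = 'abstractwiki';"
)

def purge_batches(item_ids, batch_size=50):
    """Yield action=purge parameter dicts with batch_size titles per request."""
    titles = [f"Q{n}" for n in item_ids]
    for start in range(0, len(titles), batch_size):
        yield {
            "action": "purge",
            "titles": "|".join(titles[start:start + batch_size]),
            "format": "json",
        }

# Example with the two item IDs mentioned in this task (Q5296 and Q1).
for params in purge_batches([5296, 1]):
    print(params)
```

Each parameter dict would then be sent as a POST request to the Wikidata API endpoint by whatever client is at hand.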

Change #1248589 merged by jenkins-bot:

[mediawiki/extensions/WikimediaMessages@master] wikimedia: Add labels for Wikidata to let Abstract Wikipedia be a source

https://gerrit.wikimedia.org/r/1248589

Change #1254359 merged by jenkins-bot:

[operations/mediawiki-config@master] Make abstractwiki a multi-lingual Wikidata client

https://gerrit.wikimedia.org/r/1254359

Mentioned in SAL (#wikimedia-operations) [2026-04-16T20:43:13Z] <stran@deploy1003> Started scap sync-world: Backport for [[gerrit:1270872|Deploy IRS to enwiki's Event Talk namespace (T423042)]], [[gerrit:1254359|Make abstractwiki a multi-lingual Wikidata client (T420420)]], [[gerrit:1272770|Enable attribution.v0-beta in RestSandboxSpecs for all wikis (T419545)]]

Mentioned in SAL (#wikimedia-operations) [2026-04-16T20:44:54Z] <stran@deploy1003> aaron, stran, jforrester: Backport for [[gerrit:1270872|Deploy IRS to enwiki's Event Talk namespace (T423042)]], [[gerrit:1254359|Make abstractwiki a multi-lingual Wikidata client (T420420)]], [[gerrit:1272770|Enable attribution.v0-beta in RestSandboxSpecs for all wikis (T419545)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.

Mentioned in SAL (#wikimedia-operations) [2026-04-16T20:51:49Z] <stran@deploy1003> Finished scap sync-world: Backport for [[gerrit:1270872|Deploy IRS to enwiki's Event Talk namespace (T423042)]], [[gerrit:1254359|Make abstractwiki a multi-lingual Wikidata client (T420420)]], [[gerrit:1272770|Enable attribution.v0-beta in RestSandboxSpecs for all wikis (T419545)]] (duration: 08m 36s)

I think that should Just Work™, yeah. (Can also be tested on mwdebug during the config deploy, of course.)

Sadly not. Even though "abstract" is now in $wgWBRepoSettings['specialSiteLinkGroups'] on Wikidata.org, and abstract.wikipedia.org has $wbSiteGroup set to "abstract", Wikidata still wants to put abstract entries into the Wikipedias list. I think that is not Wikidata's fault but MW config's:

$sitesModule = new SitesModule(
    WikibaseSettings::isClientEnabled() ? WikibaseClient::getSettings() : null,
    WikibaseSettings::isRepoEnabled() ? WikibaseRepo::getSettings() : null,
    MediaWikiServices::getInstance()->getSiteStore(),
    MediaWikiServices::getInstance()->getLocalServerObjectCache(),
    new LanguageNameLookupFactory(
        MediaWikiServices::getInstance()->getLanguageNameUtils(),
        new MediaWikiMessageInLanguageProvider()
    )
);
$exposedSM = TestingAccessWrapper::newFromObject( $sitesModule );

$exposedSM->getSetting('siteLinkGroups');
 = [
    "wikipedia",
    "wikibooks",
    "wikinews",
    "wikiquote",
    "wikisource",
    "wikiversity",
    "wikivoyage",
    "wiktionary",
    "special",
  ]

$exposedSM->getSetting('specialSiteLinkGroups');
 = [
    "commons",
    "foundation",
    "mediawiki",
    "meta",
    "species",
    "wikidata",
    "wikimania",
    "sources",
    "outreach",
    "wikifunctions",
    "abstract",
  ]

MediaWikiServices::getInstance()->getSiteStore()->getSites()->hasSite('abstractwiki');
 = true

$exposedSM->shouldSiteBeIncluded( MediaWikiServices::getInstance()->getSiteStore()->getSites()->getSite('abstractwiki'), ['special'] );
 = false

$exposedSM->shouldSiteBeIncluded( MediaWikiServices::getInstance()->getSiteStore()->getSites()->getSite('abstractwiki'), ['wikipedia'] );
 = true

MediaWikiServices::getInstance()->getSiteStore()->getSites()->getSite('abstractwiki')->getGroup();
= "wikipedia"

Aha, yes, our entry in the sites table is wrong:

> SELECT site_global_key, site_type, site_group  FROM sites WHERE site_global_key = 'abstractwiki' LIMIT 1;
stdClass Object
(
    [site_global_key] => abstractwiki
    [site_type] => mediawiki
    [site_group] => wikipedia
)

Compare with Meta:

> SELECT site_global_key, site_type, site_group  FROM sites WHERE site_global_key = 'metawiki' LIMIT 1;
stdClass Object
(
    [site_global_key] => metawiki
    [site_type] => mediawiki
    [site_group] => meta
)
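
The shell session and the two sites-table rows above can be summarised with a small model. This is a sketch in Python of the observed behaviour, not the actual Wikibase `SitesModule` code: `expand_groups` and `should_include` are hypothetical names, the group lists are abbreviated, and the expansion of "special" into the `specialSiteLinkGroups` list is an assumption about how the two settings combine.

```python
# Abbreviated copy of the specialSiteLinkGroups setting shown in the session.
SPECIAL_SITE_LINK_GROUPS = ["commons", "meta", "wikifunctions", "abstract"]

def expand_groups(groups, special_groups):
    """Replace the "special" placeholder with the concrete special groups."""
    expanded = [g for g in groups if g != "special"]
    if "special" in groups:
        expanded += special_groups
    return expanded

def should_include(site_group, groups):
    """A site is listed only if its sites-table group is among the groups."""
    return site_group in groups

# With the stale sites-table row, abstractwiki still reports "wikipedia".
stale_special = should_include("wikipedia", ["special"])
stale_wikipedia = should_include("wikipedia", ["wikipedia"])

# With a corrected row, it would match the expanded special groups.
fixed_special = should_include(
    "abstract", expand_groups(["special"], SPECIAL_SITE_LINK_GROUPS)
)

print(stale_special, stale_wikipedia, fixed_special)
```

In this model the settings are already right; the value read from the sites table is what decides where the site is listed.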

Running an SQL UPDATE feels like a bad move, not least because this is (rightly) cached a lot. Do we have a protocol for this?

OK, this has worked! E.g. on https://www.wikidata.org/wiki/Q1 "abstract" is listed down in the Multilingual sites section, and not in the Wikipedias. On e.g. https://simple.wikipedia.org/wiki/Universe it's no longer listed in the ULS language list.

It hasn't shown up in the sidebar ("In other projects") section populated from Wikidata, however. Let's call that a different task.