
Investigate state of LQT where it is still live
Open, Needs Triage, Public

Description

The following wikis still have LQT enabled. Investigate whether anything is blocking them from being converted:

  • enwikinews
  • enwiktionary
  • huwiki
  • svwikisource

And the frozen wikis:

  • fiwikimedia
  • mediawikiwiki
  • officewiki // replaced with Flow 2014-11-25
  • sewikimedia
  • strategywiki
  • testwiki
  • test2wiki
  • wikimania2010wiki

Check off for each wiki once status is in https://phabricator.wikimedia.org/T385290

Related Objects

(Related-task list: the task titles and links were not captured in this copy.)

Event Timeline

On enwiktionary, LQT was only ever enabled on a limited number of pages, mostly user talk pages. When pinged regarding the potential removal of LQT, these users did not comment (most of them are only sporadically active). I do not foresee any community concern if LQT were to be removed from enwiktionary.

My suggestion for undeploying LQT from enwiktionary is to write a script that converts LQT discussions back to standard wikitext discussions using == headers, : indentation, wikitext signatures and oldest-to-newest ordering of threads, and inserts them onto the user's talk page in the correct chronological ordering among the existing wikitext discussions. Then the Thread and Summary namespaces can simply be deleted.
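To make the shape of that conversion concrete, here is a minimal sketch, assuming a thread is already available as a nested structure; the field names (subject, author, timestamp, text, replies) are hypothetical, not LQT's actual storage schema:

```python
# Hypothetical sketch: flatten an LQT thread tree into standard wikitext
# with a == header, ":" indentation, and wikitext-style signatures.

def post_to_wikitext(post, depth):
    """Render one post (and its replies) as indented, signed wikitext lines."""
    indent = ":" * depth
    # Re-indent any internal newlines so multi-line posts stay at one depth.
    body = post["text"].replace("\n", "\n" + indent) if indent else post["text"]
    sig = f" [[User:{post['author']}|{post['author']}]] {post['timestamp']}"
    lines = [f"{indent}{body}{sig}"]
    for reply in post.get("replies", []):
        lines.extend(post_to_wikitext(reply, depth + 1))
    return lines

def thread_to_wikitext(thread):
    """Render a whole thread as a == section == of wikitext."""
    lines = [f"== {thread['subject']} =="]
    lines.extend(post_to_wikitext(thread, 0))
    return "\n".join(lines)
```

Threads rendered this way could then be sorted by their root timestamp and spliced into the existing wikitext page in chronological order.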

Out of curiosity I decided to check how the other wikis are using LQT:

Wiki | Activity in Thread: namespace in last 30 days | LQT is currently enabled on...
enwikinews | 2 edits + 2 page deletions | Entire Comments: namespace (this is set up so that readers can leave blog-style threaded comments on news articles using LQT)
enwiktionary | 1 edit (on a user talk page) | 40 pages - a handful of user talk pages of inactive or sporadically active users & a handful of gadget talk pages in MediaWiki talk: space
huwiki | 0 | ~8 pages (test pages and a few user talks) - but it seems like it was previously enabled on more pages judging by the content of the Thread: namespace
ptwikibooks | 1 edit (on a user talk page) | All talk pages by default (LQT is set to opt-out mode on this wiki)
svwikisource | 0 | Archive pages only

For live wikis:

  • The script from T397422: Identify talk pages with LQT threads where LQT has been disabled on ptwikibooks is probably worth running on every wiki where LQT is live, as that's a generic issue that can happen with any LQT use, although ptwikibooks' oddities aggravated it (and you've already written the code). On most of these wikis, {{#useliquidthreads:0}} does nothing, so what you need to check for is the absence of {{#useliquidthreads:1}}.
    • But don't actually apply the fixes, since that may well break things; just report the pages that need fixing.
  • svwikisource is probably a good first test case - all of its LQT pages are years-old archives anyway. Enwiktionary and huwiki both make some live use of LQT but it has taken a second-class citizen role and can probably be deprecated without complaints.
  • Enwikinews has built entire community processes around the LQT "Comments:" namespace. Probably DiscussionTools could have a similar role, but going any further would need heavy community involvement.
  • Ptwikibooks is the godawful mess you already know about.
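As a sketch, the per-page check on the opt-in wikis might look like the following (the helper name and the has_lqt_threads flag are hypothetical; on an opt-out wiki like ptwikibooks the test would be inverted, looking for an explicit {{#useliquidthreads:0}}):

```python
import re

# On opt-in wikis, a talk page that has LQT threads but whose wikitext
# does not contain {{#useliquidthreads:1}} is a candidate for repair.
LQT_ON = re.compile(r"\{\{\s*#useliquidthreads\s*:\s*1\s*\}\}", re.IGNORECASE)

def needs_repair(page_wikitext, has_lqt_threads):
    """True if the page has LQT threads but LQT is not explicitly enabled."""
    return has_lqt_threads and not LQT_ON.search(page_wikitext)
```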

For frozen wikis:

  • The thing to check for here is: are there any "Thread:" pages that aren't redirects? (And, for paranoia, any "Thread talk:" or "Summary talk:" pages, although there almost certainly won't be.) If there are any, someone should review them to make sure important content isn't lost forever. (This also applies to wikis that become frozen after a Flow import, although if there were no lost LQT pages before the import, there shouldn't be any afterwards either.)
    • I already did this check on MediaWiki.org (as one of my home wikis) in December 2024, so (in my opinion, haven't discussed this with the rest of the community), it would be safe to uninstall LQT and just delete all "Thread:" and "Summary:" pages without replacing them with anything right now.
    • Fiwikimedia appears to be totally clean
    • I obviously can't access officewiki
    • Sewikimedia also appears to be totally clean in this regard
    • Strategywiki is in a kind of broken state and needs a closer review
    • Testwiki is technically unclean but it's testwiki so I don't think I care
    • Test2wiki is also unclean, and some of the content lost is actually meta-discussion of test2wiki itself rather than test posts, so needs a closer look
    • Wikimania2010wiki is in a similar state as strategywiki.
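The non-redirect check above could be sketched against the action API roughly like this (namespace 90 is LQT's default Thread: namespace ID, worth verifying per-wiki; the fetch hook is only there so the pagination logic can be exercised without network access):

```python
# Sketch: list non-redirect pages in a namespace via list=allpages,
# following API continuation until exhausted.
THREAD_NS = 90  # LQT's default Thread: namespace ID

def nonredirect_threads(api_url, fetch=None, namespace=THREAD_NS):
    """Yield titles of non-redirect pages in the given namespace."""
    if fetch is None:
        import requests  # only needed when doing real HTTP
        fetch = lambda params: requests.get(api_url, params=params).json()
    params = {
        "action": "query", "list": "allpages", "format": "json",
        "apnamespace": namespace, "apfilterredir": "nonredirects",
        "aplimit": "max",
    }
    while True:
        data = fetch(params)
        for page in data.get("query", {}).get("allpages", []):
            yield page["title"]
        cont = data.get("continue")
        if not cont:
            break
        params.update(cont)  # standard MediaWiki continuation
```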

I'll probably have to work my way through this sporadically, so I apologise in advance for things coming in comment by comment.

I've confirmed that svwikisource has 1396 threads, the most recent of which was in 2018. This seems like a good candidate to freeze immediately. However, it doesn't have Flow installed so if we want to do the Flow workflow we either need to install it (ew) or write an archival tool.

I'm having a quick poke to see if I can extract thread contents and get a feel for how hard conversion might be.

Quick survey of what's got Flow and what has LQT:

(edit: and active users, so we can get a feel for impact)

Site | Flow | LQT | Active Users | Threads
en.wiktionary.org | | | 2,432 | 8598
hu.wikipedia.org | | | 1,265 | 2562
en.wikinews.org | | | 144 | 10958
sv.wikisource.org | | | 20 | 1397
strategy.wikimedia.org | | | 2 | 9914
wikimania2010.wikimedia.org | | | 1 | 25
mediawiki.org | | | 1,573 | 58892
test.wikipedia.org | | | 226 | 167
test2.wikipedia.org | | | 30 | 142
se.wikimedia.org | | | 23 | 2223
fi.wikimedia.org | | | 10 | 22
officewiki | | | |

Reminder: "freezing" an LQT wiki ends up making the content of all of its threads much harder to access, rather than just making them read-only; it's implemented by forcibly marking every page as a non-LQT page.

Ah, good catch. Well, I'll be treading delicately...

I've modified my script which downloads thread info to accept an endpoint and an output file, and summarised the sizes of these wikis in terms of LQT threads:

$ cat tmp/uris | while IFS=" " read -r wiki endpoint; do echo -ne "$wiki "; bzcat cache/lqt-$wiki.json.bz2 | jq length; done
enwikinews 10956
enwiktionary 8598
huwiki 2562
svwikisource 1397
fiwikimedia 22
mediawikiwiki 58892
sewikimedia 2223
strategywiki 9914
testwiki 167
test2wiki 142
wikimania2010wiki 25

I've updated the table in the comment above.

Next up I'll be modifying the other script, the one that uses database dumps to pull out revisions, to see if we need to do any fixups for any of these wikis. I _think_ I can avoid downloading the complete dumps by using Toolforge; I think I need to bounce through the login host to get my hands on a Python venv though.

Logging some useful commands. On Toolforge, after running become zoe-oneoff-scripts:

cat easy-uris | while IFS=" " read -r wiki endpoint; do toolforge jobs run $wiki --mem 2G --email all --image python3.9 --command "python3 lqt-porting-tools/build-lqt-repairs.py --inputdir /public/dumps/public/$wiki/latest --outputdir lqt-porting-tools/output --cachedir lqt-porting-tools/cache --threadfile lqt-porting-tools/input/lqt-$wiki.json.bz2 $wiki"; done

easy-uris is a subset of uris, which maps wiki names to API endpoints – I only really need the wiki names here.

Here are the successes (no bzip there, so I had to move the files locally again):

outputs $ for x in *-proposed*; do echo -n "$x: "; bzcat $x | jq length; done
fiwikimedia-proposed-lqt-repairs.json.bz2: 2
sewikimedia-proposed-lqt-repairs.json.bz2: 405
strategywiki-proposed-lqt-repairs.json.bz2: 0
svwikisource-proposed-lqt-repairs.json.bz2: 0
testwiki-proposed-lqt-repairs.json.bz2: 0
wikimania2010wiki-proposed-lqt-repairs.json.bz2: 0

and for missing threads:

outputs $ for x in *-missing*; do echo -n "$x: "; bzcat $x | jq length; done
fiwikimedia-missing-lqt-parents.json.bz2: 0
sewikimedia-missing-lqt-parents.json.bz2: 1
strategywiki-missing-lqt-parents.json.bz2: 79
svwikisource-missing-lqt-parents.json.bz2: 1
testwiki-missing-lqt-parents.json.bz2: 27
wikimania2010wiki-missing-lqt-parents.json.bz2: 0

Outputs here:

Ongoing issues to resolve if I keep on this path:

  1. The really big wikis don't have a single-file dump and my code doesn't account for that. Should be a simple fix.
  2. I'm hitting OOM for the medium wikis like testwiki, test2wiki and mediawikiwiki.
  3. Output files are not resilient against failure: either delete 'em on an exception or create temporary files and then move on success.
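Point 3 can be handled with the standard write-then-rename pattern; a sketch (the helper name is mine, not from the actual scripts):

```python
import json
import os
import tempfile

# Write to a temp file in the same directory, then atomically rename it
# over the final path only once the write has fully succeeded, so a
# crash mid-write never leaves a truncated output file behind.

def write_atomically(path, data):
    """Dump `data` as JSON to `path` without ever exposing a partial file."""
    dirname = os.path.dirname(path) or "."
    fd, tmp = tempfile.mkstemp(dir=dirname, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as f:
            json.dump(data, f)
        os.replace(tmp, path)  # atomic rename on POSIX filesystems
    except BaseException:
        os.unlink(tmp)  # clean up the temp file on any failure
        raise
```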

It probably makes sense to switch approaches to an API-based one. The decision to use dumps was because I already had a local copy, already had the XML streaming code, and could iterate without hitting rate limits. At this point it's probably prudent to instead query the APIs. There's probably an endpoint I can use to search for Talk pages with LQT disabled, but if there's not, I have a relatively small candidate list of pages to fetch.
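For the fallback of fetching a small candidate list directly, the API side might be sketched like this (batching up to 50 titles per request, the usual limit for non-bot accounts; the fetch hook is only for offline testing):

```python
# Sketch: fetch current wikitext for a list of candidate pages via
# action=query & prop=revisions, 50 titles per request.

def fetch_wikitext(api_url, titles, fetch=None):
    """Return {title: wikitext} for the given page titles."""
    if fetch is None:
        import requests  # only needed when doing real HTTP
        fetch = lambda params: requests.get(api_url, params=params).json()
    out = {}
    for i in range(0, len(titles), 50):
        params = {
            "action": "query", "format": "json", "prop": "revisions",
            "rvprop": "content", "rvslots": "main",
            "titles": "|".join(titles[i:i + 50]),
        }
        data = fetch(params)
        for page in data["query"]["pages"].values():
            revs = page.get("revisions")
            if revs:  # skip missing/deleted pages
                out[page["title"]] = revs[0]["slots"]["main"]["*"]
    return out
```

Each fetched page could then be checked for the absence of {{#useliquidthreads:1}} as discussed above.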

Also, on a more personal note my health is going to shit again so @Esanders is likely to be picking up from here.

I fixed up my caches so that partial caches from failed runs don't mess up the subsequent run. I tried lxml, which allegedly is better at avoiding OOMs while parsing XML, but it's still OOMing, so possibly the problem wasn't there.
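For what it's worth, the usual memory-bounded idiom with the stdlib's iterparse is to clear elements as they complete; if OOM persists even with this pattern, the retained memory is probably elsewhere. A sketch:

```python
import xml.etree.ElementTree as ET

# Stream <page> elements from a MediaWiki-style XML dump without building
# the whole tree: clear each element once handled, and drop the root's
# references to finished children, so peak memory is bounded by the
# largest single page rather than the whole file.

def iter_pages(xml_source):
    """Yield each <page> element from the dump, one at a time."""
    context = ET.iterparse(xml_source, events=("start", "end"))
    _, root = next(context)  # the opening tag of the root element
    for event, elem in context:
        # rsplit handles namespaced tags like "{...}page" as well as "page"
        if event == "end" and elem.tag.rsplit("}", 1)[-1] == "page":
            yield elem
            elem.clear()  # free this page's subtree
            root.clear()  # drop references held by the root
```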

I think the move is probably to query the API instead.

Some notes from a quick glance:

  • Don't consider LQT threads that are themselves redirects (this wasn't a problem on ptwikibooks but it is a problem on wikis that have already migrated to Flow)
  • On ptwikibooks LQT was enabled by default, so you had to check for pages that explicitly used {{#useliquidthreads:0}}. On most of these other wikis, LQT is not enabled by default, so the thing to check for is the absence of {{#useliquidthreads:1}} (I said this above, but it's not clear if you noticed it).

It might not actually be worth doing this, now that I think about it: if it's not done, the only consequence is that you have to run the export-to-Flow script in an extra pass, which is probably easier than updating your preemptive tooling.

zoe removed zoe as the assignee of this task. Dec 11 2025, 6:23 PM
ppelberg subscribed.

@Esanders: what (if any) work is left to be done here and when do you think we ought to prioritize doing it and why?