Sat, Mar 9
A year has passed, and this bug hasn't even been triaged. The memory problems on the English Wiktionary are still there, and ugly workarounds are needed to get some pages to render without errors.
Wed, Mar 6
Wed, Feb 20
Fix was deployed a long time ago; closing this now.
Jan 20 2019
I pinged some editors who work with hieroglyphs on the English Wiktionary and got a reply from @Vorziblix:
Jan 19 2019
Oct 1 2018
I am very curious about this statement. How do you think you can convince 100+ communities to change the way data are structured?
Sep 29 2018
Sep 5 2018
what's missing? frwikt support?
Sep 2 2018
Aug 31 2018
Bot approval vote passes 11-0-0.
Aug 23 2018
Then, do a test run (under the bot account) on some 10-50 entries until you’re certain everything goes well (bots with a single function, like interwiki, should not run more than 25 edits). If the edits are minor (as most bot-performed tasks should be), please mark the edits as minor so that they do not swamp the Recent Changes list. If all goes well, post a request for bot status at the votes page.
Jul 11 2018
Sorry for the confusion, yes this is correct (I have not been through the process).
Jul 10 2018
Jun 7 2018
@Cyberpower678 do you only need translations for frwikt? Is anything else needed from enwikt?
May 15 2018
@Lea_Lacroix_WMDE my main planned use case right now is T190210. Maybe it could also be used to automate the "See also" section at the top of the page (currently maintained manually with Template:also + bot).
Apr 13 2018
Looks like the storage part of RESTBase isn't conceptually a cache which can be "thrown away" (that was my assumption). Maybe this has to do with a requirement to serve older revisions for other endpoints (not applicable for definitions)?
I think I still don't understand some parts of the service architecture. Why do all 5+ million entries have to be cached in one go? If the cache gets purged, won't it automatically/lazily repopulate the most requested ones? Some entries will probably never/rarely get requested, why should they get pre-cached?
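For illustration, the lazy repopulation this question has in mind can be sketched as a read-through cache. This is only the model the question assumes, not a description of how RESTBase stores content; `LazyCache` and the `render` callback are hypothetical names.

```python
from typing import Callable, Dict


class LazyCache:
    """Read-through cache: an entry is rendered only when first requested."""

    def __init__(self, render: Callable[[str], str]) -> None:
        self._render = render  # hypothetical backend call that builds a response
        self._store: Dict[str, str] = {}
        self.misses = 0  # number of times the backend had to be called

    def get(self, title: str) -> str:
        # On a miss, render once and keep the result for later requests.
        if title not in self._store:
            self.misses += 1
            self._store[title] = self._render(title)
        return self._store[title]

    def invalidate(self, title: str) -> None:
        # Drop a single entry, e.g. after its source page was edited.
        self._store.pop(title, None)

    def purge(self) -> None:
        # Drop everything, e.g. after a schema change.
        self._store.clear()


cache = LazyCache(lambda title: f"definition of {title}")
cache.get("fazer")   # miss: rendered and stored
cache.get("fazer")   # hit: served from the store
cache.purge()        # whole cache thrown away
cache.get("fazer")   # miss again: repopulated on demand
print(cache.misses)
```

Under this model, rarely requested entries are never rendered at all, and a purge costs only one extra render per entry that is actually requested again.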
Also, if I make a normal change to a Wiktionary page, shouldn't that invalidate the cache and return the new schema? What are the typical delays for the change propagation?
Is there no way to just invalidate the entire cache? It seems quite complicated to make changes to the API.
why not clear the whole cache on schema change?
yes, maybe T131092, that shouldn't take too long. it would also be good to get a sense of what else is needed from a client's perspective.
Apr 12 2018
reopened this task since the API still returns duplicates
Apr 9 2018
@bearND ok, thanks. if the new code is deployed, where would the old responses come from? or is the header change required for the cache invalidation? is the cache invalidated on source page change (e.g. /wiki/fazer)? what about changes in dependent templates?
Maybe a duplicate of T131092?
Apr 8 2018
@bearND Looks like the change hasn't been deployed yet? Still getting duplicate sentences
Apr 3 2018
Apr 1 2018
I've been in this situation a few times now, where unwanted changes were recovered.
Mar 30 2018
I had a look through the codebase to see how hard it would be to restore iOS9 compatibility. Sadly, iOS 10+ calls are a bit all over the shop.
Mar 27 2018
It always displays "Discard my changes and switch", even without changes. Which means that you need two clicks to switch to visual editor.
Mar 26 2018
Mar 20 2018
Mar 18 2018
Ok, thanks. For reference, T55784 seems to be the ticket to follow.
@Halfak ah ok, that makes sense. how many edits need to be labeled? how sensitive is the whole approach to template/code level changes?
Mar 16 2018
@Halfak ok, so it's separate from a user's normal patrolling activity, meaning you work on an older sample instead of labeling recent changes? I think the participation rate could be a lot higher if it is integrated into the "normal" patrolling activity (opt-in, of course).
Mar 14 2018
@bearND I can handle Gerrit, just remembered that I made some contributions to the Wikipedia iOS app using GH a while back, it's just so much easier.
I have a first patch ready for review. Can I submit PRs via GitHub or do I have to go through Gerrit 😱?
Mar 8 2018
Feb 28 2018
Feb 23 2018
@ArielGlenn so these would include the output from RESTBase, with Parsoid-annotated DOM? That would be very helpful for all sorts of processing tasks.
Feb 20 2018
@bearND I think it would be preferable to extract the definition from the glossary and not the linked Wiktionary definition page (which might contain some other unrelated content). But I also understand that you don't want to add extra parsing code, so it might be a good first compromise. Maybe there's something simple we could do on Wiktionary to make things easier?
Feb 15 2018
Jan 9 2018
@Mholloway an approximation for the primary definition could be the first gloss/sense of an entry, skipping all obsolete/archaic senses which are sometimes listed first (presumably to illustrate the semantic development). in case of several part-of-speech headers this is trickier, since there's often no clear ordering.
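A minimal sketch of that heuristic, assuming each sense has already been parsed into a dict with a `gloss` string and a list of `labels` (this structure, the label set, and the sample glosses are made up for illustration and are not what the API returns):

```python
from typing import Dict, List, Optional

# Labels that mark a sense as no longer current (an assumed set).
SKIPPED_LABELS = {"obsolete", "archaic"}


def primary_gloss(senses: List[Dict]) -> Optional[str]:
    """Return the first gloss that is not labelled obsolete or archaic.

    Falls back to the very first gloss if every sense is labelled as dated,
    and returns None for an empty list.
    """
    for sense in senses:
        labels = {label.lower() for label in sense.get("labels", [])}
        if not labels & SKIPPED_LABELS:
            return sense["gloss"]
    return senses[0]["gloss"] if senses else None


example = [
    {"gloss": "To be pleasing to.", "labels": ["obsolete"]},
    {"gloss": "To enjoy; to be in favor of.", "labels": []},
]
print(primary_gloss(example))
```

This only covers a single part-of-speech section; choosing between several sections would need an ordering that the entries themselves often don't provide.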
Dec 18 2017
@Noe There are a lot of dictionaries in Wikisource, but most of them are "scan-only". By "tagging", you mean creating an index to entries, so you can quickly navigate to the scanned page?
Dec 16 2017
@Noe excellent, one stone, two birds (une pierre deux coups?)
Dec 15 2017
Dec 8 2017
there's consensus to use IABot: https://en.wiktionary.org/wiki/Wiktionary:Beer_parlour#Inviting_IABot
Dec 3 2017
Dec 2 2017
Nov 21 2017
I really don’t see how having enough technical debt already justifies adding it. There is literally no ability to turn off a skin from a Wikimedia project, so local technicians end up supporting Modern for those three people using it and now will be expected to support Timeless for those five people that would experiment with it.
Nov 7 2017
enwikt's multistream is still missing, or just late?
Would just like to add that it's not just dying connections, but all sorts of backend errors (timeouts etc). Anyway, I've been hitting this bug more frequently recently.
Nov 5 2017
Nov 4 2017
Oct 31 2017
In the meantime, maybe this can be "fixed" with custom CSS/gadget or something similar. Need to remove the 'user-select: none' style from the heading.
Oct 23 2017
thanks, much appreciated
Oct 22 2017
Still seems to happen on https://dumps.wikimedia.org/enwiktionary/20171020/ , or is it not complete yet?
Oct 14 2017
Thanks for bringing together the various strands of discussion here. I finally managed to read through the various proposals linked in this task, there are quite a few (balanced / typed templates, template data, wikitext 2.0).
Oct 13 2017
@ArielGlenn thank you!
Oct 12 2017
@SBisson it works now as expected, thanks.
Oct 10 2017
Ok, I'll have a look for it then. Do the tasks get automatically updated when the fix gets deployed?
@SBisson great, thanks. for me this always happens, not just when using the back button. probably related though.
Oct 5 2017
Jun 28 2017
It looks like the anti-spam mechanism is not working as expected. Almost all (90%) of the OTRS emails I currently get are spam emails, to the point that I don't feel like contributing any longer.
Jun 8 2017
Ok, sounds good. If I understand it correctly, you can update labels from within Wiktionary (with a gadget), without leaving the site? How is that different from just setting entries as "patrolled"?
Jun 7 2017
@Halfak yes, I'm interested in helping with this. What would I need to do?
Mar 15 2017
Yes, that's the idea: editors wouldn't even notice that extra markup gets generated. However, it would also mean promoting the use of templates wherever possible, and possibly automating the conversion of non-templated content with bots.
No, I haven't. This was just an initial test / proof of concept. To me at least it has proven useful: I can now extract usage examples quite easily from the HTML output of the templates, provided that they actually get used (Wiktionary has many cases where templates are recommended but are in fact optional).
@Lydia_Pintscher OK, so making Wiktionary easier to parse right now will help with that transition. It will be great to have at least some of the data easily accessible.
+1 for HTML dumps. I work with Wiktionary XML dumps and getting the data out there is really tricky. A big portion of the content is generated via Scribunto and therefore not extractable from the XML alone.
Feb 23 2017
@Lydia_Pintscher I'm aware of the efforts of the Wikidata team, it is great to see that this is happening. The approach presented here is meant to be a temporary solution until we have this data. Then there's also the chicken-and-egg question: we first need to get the data present on Wiktionary into Wikidata. This task will be a lot easier if we already have some semantic information present in the generated output, since it would let us automate that process. That's what I meant in the initial task description:
I finally managed to get some time to work on this and also did some research on microformats. In the last few years this area has become increasingly confusing with a variety of options (microformats1/2, W3C microdata, schema.org, RDFa (lite), JSON-LD etc).
Sep 4 2016
Aug 30 2016
@JMinor great, that was quick! last missing part would be to document these schemes somewhere. Where should this go? The doc folder of the repo?
Aug 18 2016
great! can somebody please add me to the testflight group? have already sent my details to @Fjalapeno.
Aug 12 2016
ready for review:
Jul 29 2016
Need a few clarifications. What should the url look like?
Jul 28 2016
Ah, it wasn't an oversight then, but not sure why you would just want to launch the app.
OK, just had a quick look at the PR (wikipedia-ios/pull/696) and tested locally. The app opens, but I don't see any code which handles the URL passed to the app.