Sun, Mar 18
Ok, thanks. For reference, T55784 seems to be the ticket to follow.
@Halfak ah ok, that makes sense. how many edits need to be labeled? how sensitive is the whole approach to template/code level changes?
Fri, Mar 16
@Halfak ok, so it's separate from a user's normal patrolling activity, meaning you work on an older sample instead of labeling recent changes? I think the participation rate could be a lot higher if it is integrated into the "normal" patrolling activity.
Wed, Mar 14
@bearND I can handle Gerrit, just remembered that I made some contributions to the Wikipedia iOS app using GH a while back, it's just so much easier.
I have a first patch ready for review. Can I submit PRs via GitHub or do I have to go through Gerrit 😱?
Thu, Mar 8
Wed, Feb 28
Fri, Feb 23
@ArielGlenn so these would include the output from RESTBase, with Parsoid-annotated DOM? That would be very helpful for all sorts of processing tasks.
Tue, Feb 20
@bearND I think it would be preferable to extract the definition from the glossary and not the linked Wiktionary definition page (which might contain some other unrelated content). But I also understand that you don't want to add extra parsing code, so it might be a good first compromise. Maybe there's something simple we could do on Wiktionary to make things easier?
Feb 15 2018
Jan 9 2018
@Mholloway an approximation for the primary definition could be the first gloss/sense of an entry, skipping all obsolete/archaic senses, which are sometimes listed first (presumably to illustrate the semantic development). In the case of several part-of-speech headers this is trickier, since there's often no clear ordering.
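A minimal sketch of the heuristic above, assuming the senses of one part-of-speech section have already been parsed into a list in page order. The dict layout (`gloss`, `labels`) and the `primary_definition` name are hypothetical, not an existing API, and the question of ordering several part-of-speech sections is left open here:

```python
# Labels that mark a sense as no longer current (an assumed list).
SKIP_LABELS = {"obsolete", "archaic"}


def primary_definition(senses):
    """Return the gloss of the first sense that carries no skip label.

    `senses` is a hypothetical, already-parsed list of dicts in page order,
    each with a "gloss" string and an optional "labels" list.
    Falls back to the very first gloss if every sense is labeled,
    and returns None for an empty list.
    """
    for sense in senses:
        labels = {label.lower() for label in sense.get("labels", [])}
        if not labels & SKIP_LABELS:
            return sense["gloss"]
    return senses[0]["gloss"] if senses else None
```

The fallback to the first gloss is a design choice: an entry that only has obsolete senses still gets a definition instead of nothing.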
Dec 18 2017
@Noe There are a lot of dictionaries in Wikisource, but most of them are "scan-only". By "tagging", you mean creating an index to entries, so you can quickly navigate to the scanned page?
Dec 16 2017
@Noe excellent, one stone, two birds (une pierre deux coups?)
Dec 15 2017
Dec 8 2017
there's consensus to use IABot: https://en.wiktionary.org/wiki/Wiktionary:Beer_parlour#Inviting_IABot
Dec 3 2017
Dec 2 2017
Nov 21 2017
I really don’t see how having enough technical debt already justifies adding it. There is literally no ability to turn off a skin from a Wikimedia project, so local technicians end up supporting Modern for those three people using it and now will be expected to support Timeless for those five people that would experiment with it.
Nov 7 2017
enwikt's multistream is still missing, or just late?
Would just like to add that it's not just dying connections, but all sorts of backend errors (timeouts etc). Anyway, I've been hitting this bug more frequently recently.
Nov 5 2017
Nov 4 2017
Oct 31 2017
In the meantime, maybe this can be "fixed" with custom CSS/gadget or something similar. Need to remove the 'user-select: none' style from the heading.
Oct 23 2017
thanks, much appreciated
Oct 22 2017
Still seems to happen on https://dumps.wikimedia.org/enwiktionary/20171020/ , or is it not complete yet?
Oct 14 2017
Thanks for bringing together the various strands of discussion here. I finally managed to read through the proposals linked in this task, and there are quite a few (balanced / typed templates, template data, wikitext 2.0).
Oct 13 2017
@ArielGlenn thank you!
Oct 12 2017
@SBisson it works now as expected, thanks.
Oct 10 2017
OK, I'll have a look for it then. Do the tasks get automatically updated when the fix gets deployed?
@SBisson great, thanks. for me this always happens, not just when using the back button. probably related though.
Oct 5 2017
Jun 28 2017
It looks like the anti-spam mechanism is not working as expected. Almost all (90%) of the OTRS emails I currently get are spam emails, to the point that I don't feel like contributing any longer.
Jun 8 2017
OK, sounds good. If I understand it correctly, you can update labels from within Wiktionary (with a gadget), without leaving the site? How does that differ from just marking entries as "patrolled"?
Jun 7 2017
@Halfak yes, I'm interested in helping with this. What would I need to do?
Mar 15 2017
Yes, that's the idea: editors wouldn't even notice that extra markup gets generated. However, it would also mean promoting the use of templates wherever possible, and possibly automating the conversion of non-templated content with bots.
No, I haven't; this was just an initial test / proof of concept. To me at least it has proven useful: I can now extract usage examples quite easily from the HTML output of the templates, provided that they actually get used (Wiktionary has many cases where templates are recommended but are in fact optional).
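To illustrate the kind of extraction meant here, a rough sketch using only Python's standard `html.parser`. It assumes the template wraps each usage example in an element with a known CSS class; the class name `usage-example` and the helper names are placeholders, not the markup the templates actually emit:

```python
from html.parser import HTMLParser

# Elements that never get a closing tag; they must not affect nesting depth.
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta", "wbr"}


class UsageExampleParser(HTMLParser):
    """Collects the text of every element carrying a given CSS class."""

    def __init__(self, css_class):
        super().__init__()
        self.css_class = css_class
        self.depth = 0        # > 0 while inside a matching element
        self.current = []     # text fragments of the element being read
        self.examples = []    # finished examples, in document order

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self.depth:
            self.depth += 1
        elif self.css_class in (dict(attrs).get("class") or "").split():
            self.depth = 1
            self.current = []

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or not self.depth:
            return
        self.depth -= 1
        if self.depth == 0:
            self.examples.append("".join(self.current).strip())

    def handle_data(self, data):
        if self.depth:
            self.current.append(data)


def extract_examples(html, css_class="usage-example"):
    """Return the text of all elements marked with `css_class`."""
    parser = UsageExampleParser(css_class)
    parser.feed(html)
    parser.close()
    return parser.examples
```

This only finds examples that were entered through the template; hand-formatted examples without the class are invisible to it, which is exactly the limitation mentioned above.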
@Lydia_Pintscher OK, so making Wiktionary easier to parse right now will help with that transition. It will be great to have at least some of the data easily accessible.
+1 for HTML dumps. I work with Wiktionary XML dumps and getting the data out there is really tricky. A big portion of the content is generated via Scribunto and therefore not extractable from the XML alone.
Feb 23 2017
@Lydia_Pintscher I'm aware of the efforts of the Wikidata team, and it is great to see that this is happening. The approach presented here is meant to be a temporary solution until we have this data. Then there's also the chicken-and-egg question: we first need to get the data present on Wiktionary into Wikidata. This task will be a lot easier if we already have some semantic information present in the generated output, since it would let us automate that process. That's what I meant in the initial task description:
I finally managed to get some time to work on this and also did some research on microformats. In the last few years this area has become increasingly confusing with a variety of options (microformats1/2, W3C microdata, schema.org, RDFa (lite), JSON-LD etc).
Sep 4 2016
Aug 30 2016
@JMinor great, that was quick! last missing part would be to document these schemes somewhere. Where should this go? The doc folder of the repo?
Aug 18 2016
great! can somebody please add me to the TestFlight group? I have already sent my details to @Fjalapeno.
Aug 12 2016
ready for review:
Jul 29 2016
Need a few clarifications. What should the url look like?
Jul 28 2016
Ah, it wasn't an oversight then, but not sure why you would just want to launch the app.
OK, just had a quick look at the PR (wikipedia-ios/pull/696) and tested locally. The app opens, but I don't see any code which handles the URL passed to the app.
Great! Is this in the most recent release (5.0.5)?
Jul 12 2016
Yes, this would be great, doing a manual search on Wikisource + copy/paste is very awkward.
Jul 8 2016
Great, thanks! I'll have a play with it.
Thanks! What are the next steps to actually tag new enwikt edits with this model?
Jun 27 2016
Jun 26 2016
Wait, 40K changes and only 164 reversions? That sounds too low. Maybe lots of bot changes in there?
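For scale, a quick back-of-the-envelope check of the two numbers quoted above ("40K" is rounded, so the result is approximate; the function name is just for illustration):

```python
def revert_rate(reverted, total):
    """Fraction of changes that were reverted."""
    if total <= 0:
        raise ValueError("total must be positive")
    return reverted / total


# 164 reversions out of roughly 40,000 changes.
print(f"{revert_rate(164, 40_000):.2%}")  # prints 0.41%
```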
Apr 19 2016
Current state of Move www.wikisource.org to mul.wikisource.org: 10 support, 2 oppose (one of which qualified as "weak oppose"). Does this count as consensus or do we need more votes?
Jan 11 2016
Nov 10 2015
Nov 6 2015
Sep 18 2015
Jun 17 2015
Jun 12 2015
ok, looks like the full wiktionary dump has successfully completed now. do you expect future dumps will also have this 10 day window from start to finish? seems rather long.
Jun 10 2015
re multistream, no, don't think it was part of the original ticket, just strange that these now get generated almost a week after the initial files, that's why it looked "broken" to me.
Jun 8 2015
one end-user related thing i've noticed is that the multistream dumps are still missing: http://dumps.wikimedia.org/enwiktionary/20150602/
Jun 4 2015
thanks for your work on this ariel! to be fair, the title of the ticket is "snaphot1004 running dumps very slowly", not "wiktionary/(insert other project name) db dumps not available".
May 22 2015
Mar 31 2015
Is there an easy way to set up one (or even several) Wikidata / Wiktionary integration sandboxes where interested parties could just try things out and experiment? I think prototyping an integration with a small subset of the data could be very beneficial. It would allow us to get some quick feedback on which kinds of ideas could work (and where the problem areas are). Planning everything upfront is almost impossible, given the ambition of this project.