User Details
- User Since
- Jun 15 2015, 4:15 PM (442 w, 1 d)
- Availability
- Available
- LDAP User
- Unknown
- MediaWiki User
- RonnieV [ Global Accounts ]
Mar 2 2022
Hi all,
@Ciell told me yesterday about the new model that has been implemented. She suggested implementing links like https://ores.wikimedia.org/v3/scores/nlwiki/60941895 which could help to get even more support for ORES. I had a look at the magic words (https://en.wikipedia.org/wiki/Help:Magic_words), but might have overlooked REVISIONID or REVISIONNUMBER. Is anyone aware of a magic word that would give that number? If it does not exist, would it be hard to get it implemented?
Is there a Lua alternative to get the revision id of the currently viewed page (the page we want to show the most likely scale on)?
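To illustrate what such a link needs: once the revision id is known, building it is trivial. A minimal sketch in Python, assuming only the URL pattern quoted above; the function name ores_score_url is hypothetical.

```python
def ores_score_url(wiki: str, rev_id: int) -> str:
    """Build a link of the form https://ores.wikimedia.org/v3/scores/<wiki>/<rev_id>."""
    if rev_id <= 0:
        raise ValueError("rev_id must be a positive integer")
    return f"https://ores.wikimedia.org/v3/scores/{wiki}/{rev_id}"


# Reproduces the example link from the comment above.
print(ores_score_url("nlwiki", 60941895))
```

The hard part remains getting the revision id on the wiki page itself, which is what the question about a magic word or Lua is about.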
Dec 15 2021
Both 'getWeightedAutoAssignedMentors' (line 343) and 'invalidateCache' (line 320) call 'makeCacheKeyWeightedAutoAssignedMentors'. The latter calls getId directly on the result of getMentorsPage ( $this->getMentorsPage()->getId(), line 335 ), without checking whether the mentors page exists. This could be unsafe.
In getWeightedAutoAssignedMentors there is a check to see that $this->getMentorsPage() does not return null; that check is NOT made in invalidateCache. I think it is safe to have it checked there as well. Or better, make makeCacheKeyWeightedAutoAssignedMentors safe by testing there whether $this->getMentorsPage() equals null.
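The second option could look roughly like this. This is a language-neutral sketch written in Python, not the actual PHP code: the classes Page and MentorStore, the key format and the choice to return None when the page is missing are all assumptions for illustration.

```python
from typing import Dict, List, Optional


class Page:
    """Hypothetical stand-in for the mentors page object."""

    def __init__(self, page_id: int) -> None:
        self.page_id = page_id

    def get_id(self) -> int:
        return self.page_id


class MentorStore:
    """Hypothetical stand-in for the class that owns the cache."""

    def __init__(self, mentors_page: Optional[Page]) -> None:
        self._mentors_page = mentors_page
        self.cache: Dict[str, List[str]] = {}

    def get_mentors_page(self) -> Optional[Page]:
        return self._mentors_page

    def make_cache_key(self) -> Optional[str]:
        # The guard lives here, so every caller is protected at once.
        page = self.get_mentors_page()
        if page is None:
            return None
        return f"weighted-auto-assigned-mentors:{page.get_id()}"

    def invalidate_cache(self) -> None:
        key = self.make_cache_key()
        if key is not None:
            # Nothing to invalidate when there is no mentors page.
            self.cache.pop(key, None)


store = MentorStore(None)
store.invalidate_cache()  # no error, even without a mentors page
```

Putting the check in the key-making function means a future third caller cannot forget it.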
Jul 31 2021
Requirements for a specific class can change over time, but there will always be a top class (and a bottom class). The requirements for these classes are usually the easiest to formulate and the easiest to find out whether an article (revision) meets these. That a top class article according to the requirements of 2005 does not fit the requirements of 2016 (nor 2021) is not a problem. It will move to the next lower class or maybe even drop two classes. The distinctions between a top-C and a bottom-B article can be much less clear.
Jul 30 2021
I like the idea of class names above class labels. Some suggestions:
Etalage
Zeer goed
Goed
Redelijk
Beginnetje
I already commented in WP:DK. I think it is more likely that we get classes between the current classes than above (or below) the current classes. Also, as all classifications of all revisions are calculated at the moment they are shown, using the model of that moment, no (stored) changes to articles are needed to change a current A classification to a B classification (or whatever). When we change the conditions an article has to comply with to get a certain classification, that will just be fine.
Jun 24 2021
I am just looking at the table @Halfak gave on June 3. One of the parameters says 'enwiki' instead of 'nlwiki' (feature.enwiki.revision.paragraphs_without_refs_total_length). Could this be a reason for strange results? It is used twice in the calculation.
Jun 8 2021
Yes, that is the right name of the value.
Would it then be possible to show the link, like https://nl.wikipedia.org/w/index.php?oldid=123125 ? That would make it easier for first-timers to get there. Copy & paste is common usage.
And it would be great if the JSON-output would be something like
+ score
+ + prediction "D"
+ + rating 2.7
(A more meaningful word than rating would be fine).
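To make the idea concrete, here is a minimal Python sketch of how such an output could be derived from per-class probabilities. The numeric weights per class (A = 5 down to E = 1), the weighted sum as the way to compute 'rating', and the example probabilities are all my own assumptions, not something ORES is known to return.

```python
import json
from typing import Dict

# Assumed numeric value per class (A is best); not taken from ORES itself.
CLASS_WEIGHTS = {"A": 5, "B": 4, "C": 3, "D": 2, "E": 1}


def summarise(probabilities: Dict[str, float]) -> dict:
    """Return the most likely class plus a weighted-sum rating."""
    prediction = max(probabilities, key=probabilities.get)
    rating = sum(CLASS_WEIGHTS[cls] * p for cls, p in probabilities.items())
    return {"score": {"prediction": prediction, "rating": round(rating, 1)}}


# Made-up probabilities, only to show the shape of the output.
example = {"A": 0.05, "B": 0.10, "C": 0.25, "D": 0.50, "E": 0.10}
print(json.dumps(summarise(example), indent=2))
```

A rating like this shows whether a 'D' is close to a 'C' or close to an 'E', which the bare prediction does not.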
Jun 3 2021
Thanks for the great meeting we had today!
Jun 2 2021
Halfak, thanks for your elaborate answer. I will dive into the log after a good night of sleep.
May 31 2021
Hi Geert,
May 30 2021
Hi there,
May 24 2021
The main page seems to be protected on the Wikipedias in the surrounding languages (is it in all?). Is protection a reason not to classify the page? I think this is too tight. Permanent protection could be a better criterion. The Danish ([:da:]) and German ([:de:]) main pages have hardly any content in the code, just like the Dutch one. The French ([:fr:]), English ([:en:]) and Portuguese ([:pt:]) ones do have some content in the code of the page. Just going for 'no content' does not seem to be enough, but excluding pages without content might help to save similar pages from being rated. Is it easy (and not too heavy) to check for Wikidata:Q5296? Is it too limited to our ecosystem?
May 23 2021
With this edit A links to Wikipedia:Etalage, B to Wikipedia:Ruwe diamanten, E to Wikipedia:Beginnetje and C and D to Wikipedia:ORES/Kwaliteitsschaal voor artikelen. We can easily change it to other pages, if more relevant pages exist.
As for the main page, it consists of a comment, a template call, a category and a second template call. Maybe pages without content of their own can be excluded?
Hi @Ciell , I did not mean to create 'normal' categories (although Ecritures created categories like https://nl.wikipedia.org/wiki/Categorie:A-Class_articles for A-C), but pages like the explanatory https://nl.wikipedia.org/wiki/Wikipedia:ORES/Kwaliteitsschaal_voor_artikelen for all five levels. Maybe using some of the suggestions Halfak made at https://nl.wikipedia.org/wiki/Gebruiker:EpochFail/Kladblok (https://nl.wikipedia.org/wiki/Wikipedia:Ruwe_diamanten for B). Having five red links does not feel very satisfying.
May 22 2021
Thanks, @Halfak , that did it!
I will redirect the categories later this (long) weekend, or think of another solution (maybe a descriptive page of the rating). As far as I see in the English Wikipedia, the articles are placed in categories based on a manual edit in the talk page using the vital article template. I think the Dutch Wikipedia is not ready (yet?) for a massive manual rating of articles.
It would be great, but I estimate also a heavy operation, to rate all articles on a regular basis using ORES. But let's first see what we can do with the current information and run some user tests.
May 21 2021
Thank you very much for these examples!
I have had the code for the templates restored (thanks, @Ciell). The template works, but I do have to fix some links: they should not just point to 'Categorie:A-klasse artikelen' for classes A through E, but also to the etalage-artikelen (FA articles) and Beginnetjes (stubs). I guess I should point to these from the Class template.
Do I have to call the Class template somewhere to have the outcome shown together with the ORES rating, or should @Psingh07 do this in his part of the code?
May 17 2021
Thank you, @Psingh07, it does look great indeed!
Mar 1 2021
Files have been downloaded and uploaded, so the task is completed.
Files were uploaded in December 2019, see https://commons.wikimedia.org/wiki/File:Gks16330148v.jpg
Jan 24 2021
Excluding redirects can be achieved by requiring a minimal length of (say) 250 or 1000 bytes. For A and B, 1000 should be no problem; for E this might be too long.
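A minimal Python sketch of such a length filter. The function name long_enough is hypothetical, and measuring the UTF-8 size of the wikitext is my assumption about how the length would be counted.

```python
def long_enough(wikitext: str, min_bytes: int = 250) -> bool:
    """Keep a revision only if its wikitext is at least min_bytes long (UTF-8)."""
    return len(wikitext.encode("utf-8")) >= min_bytes


redirect = "#REDIRECT [[Nelson Mandela]]"
article = "Lorem ipsum dolor sit amet. " * 20  # 560 bytes of filler text

print(long_enough(redirect))        # a redirect is far below the threshold
print(long_enough(article))         # passes the default threshold of 250
print(long_enough(article, 1000))   # but not a stricter threshold of 1000
```

The threshold could be set per class, lower for E than for A and B.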
Jan 21 2021
Thanks for your time and attention today
Jan 20 2021
Nov 14 2020
This last remark reminds me of something I wanted to add: pages are numbered in the order in which they are created (except for the first few thousand, as there has been a restart which lost track of the original ordering). As file history5 ends with pages created in February 2014, the more recent opinions on when a page is a 'Beginnetje' are likely to be found in the newer pages, so focusing on history6 might be wise. There will be removals of 'Beginnetje' from older pages, contained in history1..5, but those might be fewer pages. RevisionId 40371396 is an edit of 14 February 2014, 45656960 was made on 1 January 2016 and 50643818 on 1 January 2018.
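The three revision ids above can serve as rough date markers. A minimal Python sketch, assuming revision ids increase over time; the function name latest_anchor_before is hypothetical.

```python
from bisect import bisect_right
from datetime import date
from typing import Optional

# (revision id, date of that edit), as listed in the comment above.
ANCHORS = [
    (40371396, date(2014, 2, 14)),
    (45656960, date(2016, 1, 1)),
    (50643818, date(2018, 1, 1)),
]


def latest_anchor_before(rev_id: int) -> Optional[date]:
    """Return the date of the newest anchor whose id is <= rev_id, if any."""
    ids = [anchor_id for anchor_id, _ in ANCHORS]
    index = bisect_right(ids, rev_id)
    return ANCHORS[index - 1][1] if index else None


# A revision between the second and third anchor was made on or after 1 January 2016.
print(latest_anchor_before(46000000))
```

This only gives a lower bound on the edit date, which is enough to decide whether a revision reflects recent opinions.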
Nov 12 2020
The results from files 4 and 6 are added. These gave .json files of 1.9 and 2.3 MB. The results from history5 are a bit too big to add: a json file of 27.4 MB is hard to add to github.
History5 contains pages added between December 2010 and February 2014.
Nov 11 2020
Files history2 and history3 have been processed. See
- https://gist.github.com/Ronnie-V/5abd8aa3dd4518e580b652a178495965#file-nlwiki-20201101-pages-meta-history2-json (>380 kB)
- https://gist.github.com/Ronnie-V/5abd8aa3dd4518e580b652a178495965#file-nlwiki-20201101-pages-meta-history3-json (± 1 MB)
The other three files will be done tomorrow.
@Chtnnh Fine that you are running it from a utility script. The __name__ == "__main__" check won't be an issue then, but it won't hurt either and makes it possible to run the script without using utility.
Your source file seems to be somewhere on a central computer. That's fine, I'm running the script on my home computer. In the public dumps there are six files for groups of pages, with a combined (packed with bz2) size of approx. 36 GB. Can you confirm that size for your /mnt/data/xmldatadumps/public/nlwikimedia/latest/nlwikimedia-latest-pages-meta-history.xml.bz2 file, to make sure we are using the same source? pages-meta-history.xml sounds good to me.
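For reference, the guard I mean looks like this. A minimal sketch; the function main and its body are placeholders, not the actual script.

```python
def main() -> int:
    """Placeholder entry point; the real processing would go here."""
    print("processing dump ...")
    return 0


# Runs only when the file is executed directly, not when it is imported
# (for instance by a utility script that calls main() itself).
if __name__ == "__main__":
    main()
```

When the file is imported, __name__ holds the module name instead of "__main__", so nothing runs until the importer calls main().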
Nov 7 2020
Running the script takes (or seems to take) a lot of time. I ran it against the nlwiki-20201101-pages-meta-history2.xml-p134539p484052 file for page numbers up to 136000, and got a result in json. I added it to the gist mentioned in the previous message, so we can see what result can be obtained.
@Chtnnh, will this be enough for you to run it, see what the outcome is and do something with that?
Nov 6 2020
My version, to run the first file, is at https://gist.github.com/Ronnie-V/5abd8aa3dd4518e580b652a178495965
Unfortunately, it will just give a result after processing the whole file, 150GB of (unzipped) data.
The attribute errors seem to be thrown for hidden versions, like https://nl.wikipedia.org/w/index.php?title=Nelson_Mandela&oldid=39191825 Seems like a good reason to exclude the page from the zipped version.
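One way to avoid these errors is to skip hidden revisions explicitly. The sketch below uses Python's standard xml.etree on a tiny inline sample. It assumes that a hidden revision shows up in the dump as a text element with a deleted="deleted" attribute and no content; the sample also leaves out the XML namespace that real dumps use, and the function name visible_revisions is hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import Iterator, Tuple

# Simplified sample; a real dump wraps this in <mediawiki> with a namespace.
SAMPLE = """
<page>
  <title>Example</title>
  <revision><id>1</id><text>Visible text</text></revision>
  <revision><id>2</id><text deleted="deleted" /></revision>
</page>
"""


def visible_revisions(page_xml: str) -> Iterator[Tuple[int, str]]:
    """Yield (revision id, text) for revisions whose text is not hidden."""
    page = ET.fromstring(page_xml)
    for revision in page.iter("revision"):
        text = revision.find("text")
        if text is None or text.get("deleted") is not None:
            continue  # hidden revision: there is no text to inspect
        # An empty (blanked) revision has no text node; treat it as "".
        yield int(revision.findtext("id")), text.text or ""


print(list(visible_revisions(SAMPLE)))
```

Skipping only the hidden revision keeps the rest of the page usable, instead of dropping the page as a whole.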
I have been looking at this script. It needs two more lines at the end:
Aug 10 2020
Good to hear you are going to change it.
And very good to hear that input from the past is not lost, but simply not adequately transferred to Commons. For those interested, it will be possible to restore the missing data manually, and it might even be possible to have a look at re-exporting them.
Aug 6 2020
Maybe not everyone is aware of this pipe trick, or maybe everyone else is even more into previewing the result of their edit than I am. In the Dutch community, there are six usernames with a ( in them. Four of them have a space before it, two do not. Of these four, three are ' (WMF)'.
May 27 2020
I bet there is a big gap between 2 and 3, and most articles will be in there. But if ORES could help identify articles which belong in one of these four categories, I'd be happy if the remainder is in that gap. ORES could then, later on, help categorise the articles from the in-between group and might identify candidates for the four categories.
Hey folks, Thanks for the nice meeting and your information.
The template name is 'Etalage', 42311910 is the revision number that got approved and then follow year, month and day of the decision to recognise this article as 'Etalage'.
The template name is 'Beginnetje'. It is followed by a category (1 out of a fixed list of 46), and then follow year, month and day of the decision.
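A minimal Python sketch of how these two templates could be pulled out of a page's wikitext. It assumes the parameters are positional and separated by pipes, in the order described above; the exact layout in real articles may differ. The dates and the category value in the example are made up, and the function name parse_quality_templates is hypothetical.

```python
import re
from datetime import date
from typing import Dict, List

# Matches {{Etalage|...}} or {{Beginnetje|...}} without nested templates.
TEMPLATE_RE = re.compile(r"\{\{\s*(Etalage|Beginnetje)\s*\|([^{}]*)\}\}")


def parse_quality_templates(wikitext: str) -> List[Dict[str, object]]:
    """Extract template name, first parameter and decision date."""
    results = []
    for name, body in TEMPLATE_RE.findall(wikitext):
        parts = [part.strip() for part in body.split("|")]
        if len(parts) != 4:
            continue  # not the assumed layout; skip rather than guess
        first, year, month, day = parts
        try:
            decided = date(int(year), int(month), int(day))
        except ValueError:
            continue  # non-numeric or impossible date
        # first is the approved revision id for Etalage,
        # and the category for Beginnetje.
        results.append({"template": name, "first_param": first, "decided": decided})
    return results


# Made-up dates and category, only to show the shape of the result.
sample = "{{Etalage|42311910|2014|10|5}} text {{Beginnetje|biologie|2019|3|1}}"
for item in parse_quality_templates(sample):
    print(item)
```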
May 13 2020
Dec 4 2019
Images are downloaded.
Now looking for a convenient way to upload all of these.
Nov 27 2019
The results of this task are this module and this template, being used e.g. on the page about Hugo de Vries and some more. Probably many will follow.
Nov 23 2019
Thanks to Anton for realising this.
Infobox is created and used on some pages about plantages.
Anton is working on it at Wiki Techstorm 2019.
Thank you!
Nov 20 2019
Nov 13 2019
Nov 6 2019
@Ecritures , the file stated above which you call 'metadata' (it's just the data belonging to the specific picture), only contains 200 pictures, not the 10,000+ you said it would contain.
I can make four sets of 50 items each, but that is it. Please give me a file containing all 10k+ records.
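Splitting into sets is the easy part. A minimal Python sketch, with a list of numbers standing in for the 200 records; the function name make_batches is hypothetical.

```python
from typing import List, Sequence


def make_batches(items: Sequence, size: int) -> List[Sequence]:
    """Split items into consecutive batches of at most size elements."""
    if size <= 0:
        raise ValueError("size must be positive")
    return [items[start:start + size] for start in range(0, len(items), size)]


records = list(range(200))  # stand-in for the 200 records in the file
batches = make_batches(records, 50)
print(len(batches), [len(batch) for batch in batches])
```

The same function would work unchanged on the full set once it is available.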
Nov 5 2019
By the way, these images are from the 1980s and 1990s. Are they freely available? I have my doubts about that.
There is now a set of 200 records at https://maior.memorix.nl/api/oai/raa/key/Elsinga/?verb=ListRecords&metadataPrefix=ese . I can chop that into four, but it does not come close to the 10,000 images.
Uploading the whole lot is not a task for this moment anyway, so it should not be on my plate.
I believe Ecritures was going to take care of the downloading. Nice that Ecritures suddenly assigned this to me, but that is not how it works.
Nov 4 2019
What are you doing?
Oct 29 2019
How is this progressing?
Oct 25 2019
Do you expect collecting this information to be the first task during the workshop, or will you obtain it beforehand, preferably via the project, so that (small) batches can be made for use during the workshops and (larger) batches for later use?
Oct 23 2019
Oct 21 2019
I will divide it into batches, smaller ones and bigger ones, once I have received the results of T236032.
Sep 6 2019
I hope it will help the community with rethinking the blocking policy. The large number of indefinite blocks for anons and ranges should be a reason to rethink this, especially with [https://wikimania.wikimedia.org/wiki/2019:Research/Despite_the_ban:_doing_good_work_anonymously_on_Wikipedia this in mind]. The quality of edits by users of tor is about the same as the quality of all anonymous contributions.
Aug 21 2019
Thank you both. Where can I find this translatewiki so I can put in a language reference, instead of the (hardcoded?) 'Engels'?
And which placeholder is for the introducing text ('You are setting label, description and aliases in <Language> for Schema <Entitynumber>')? That one is in English, should be translated (made available in Dutch too).
Hi @Lydia_Pintscher , thanks for testing. I retested it. The problem remains the same, when my language preferences are set to Dutch (nl). [https://www.wikidata.org/wiki/Special:SetEntitySchemaLabelDescriptionAliases/E105/de] then gives me 'You are setting label, description and aliases in Duits for Schema E105.' on top of the page, but 'Het label van het schema in het Engels' in grey in the box for the label.
The first sentence should be in Dutch, the second one should mention 'Duits' (instead of 'Engels').
When I switch to English or German as language, it seems to be working fine.
Aug 16 2019
Aug 15 2019
Any clues where these texts are coming from would be appreciated. I might want to dive into it in order to get this fixed, but I have no clue where to start looking.
May 30 2019
@SIryn, @DanielleJWiki, @Ecritures : If you would like to add more languages, the current language templates are on Google drive. Translations are welcome.
Please note that there are differences between the languages. Maybe we should move to more similar texts, for instance pointing to both the institution's own URL and the commons-page for the institution (which does not always exist).
@SIryn, @DanielleJWiki, @Ecritures : Most of the work is done, see [https://commons.wikimedia.org/wiki/Special:Contributions/RonnieBot] and [https://commons.wikimedia.org/wiki/Template:RonnieVKoninklijke_Bibliotheek].
May 28 2019
@SIryn The documentation points to two categories. Would you like the script to create these as well?
May 27 2019
I've just been looking at this task and have some questions/remarks (sorry, in Dutch):
May 23 2019
It seems the first batch is imported, but not the rest of 'monumenten.xls'.
Not sure about the status of other files.
The script for making pages on the Dutch Wikipedia based on the information in 'Kopie van Erfgoed van Strijd Bevrijding Verzet.xls' was kind of working at the end of WTS2018. I might have another look at it, so we can make it work before WTS2019 and have all these articles made.
May 18 2019
Hai Yupik,
Oct 28 2018
Good to see that you found a solution for the strange addition which changes integers to floats. New imports will be fine, so it's just a one-time clean-up for 100 records. Manual handling is a quick solution; writing a script might be nicer and help in other cases.
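Such a clean-up script could be very small. A minimal Python sketch, assuming the affected values are stored as strings like '1945.0'; the function name float_string_to_int is hypothetical and this is not the solution that was actually applied.

```python
def float_string_to_int(value: str) -> str:
    """Turn a value like '1945.0' back into '1945'; leave anything else alone."""
    try:
        number = float(value)
    except ValueError:
        return value  # not numeric, keep as-is
    if number.is_integer():
        return str(int(number))
    return value  # a genuine fraction, keep as-is


for raw in ["1945.0", "12.5", "unknown"]:
    print(raw, "->", float_string_to_int(raw))
```

Values that really are fractions stay untouched, so the script can safely be run over all records.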
The spreadsheet which Slryn made available has also a description of each monument. If this information is available under the right license, we could use this spreadsheet to create pages for monuments and use the infobox and these descriptions to start with articles on a reasonable level. Some script to use a csv to create these pages and fill them was already made yesterday. I'll upload it to github.
If we get so much detailed information into Wikidata, we should try to use a Wikidata-driven template to show this information on Wikipedia. Will we use a special template for this, or update the template [https://nl.wikipedia.org/wiki/Sjabloon:Infobox_beeld]? Not all monuments are sculptures.
Should we make a subtask or a separate task for this?
Oct 24 2018
Looking at places like Bushiribana and their definition of 'population', this doesn't look like valuable information.
To preserve some quality, I'd recommend to find better info or refrain from importing.
Jul 14 2018
Jan 21 2016
I'd like to ask to reconsider this decline.