Sounds good, thank you!
May 7 2021
Jan 11 2021
Jan 10 2021
Oh nice, thanks for the quick and helpful response! This pretty much solves my problem.
Oct 27 2020
Jun 6 2020
Mar 13 2020
Mar 2 2020
Hi, those indexes are autogenerated so I guess they don't need to be migrated and we can just let the cronjob rebuild them as needed, if that's easier.
Nov 30 2019
Hi, thank you all again for your submissions! We chose @AikoChou for this project a couple of weeks ago. Sorry we never wrote back here -- I actually thought applicants would get some communication through the Outreachy platform.
Nov 3 2019
Thank you all for your submissions! I'm looking through them and they will very much be considered along with the final submission (deadline November 5th, as @Samwalton9 mentioned), even if I don't make further comments on GitHub :)
Oct 15 2019
Oct 10 2019
@Surlycyborg But the MediaWiki API usually returns the disambiguation page or the redirect page, not the exact article, so we would need to send another request, assuming the redirect is correct or the first result on the disambiguation page is correct. If we used the search API first, we could get the article by pageid instead.
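To make the idea concrete, here is a minimal sketch of the "search first, then fetch by pageid" flow. It only builds the request URLs and reads the pageid out of a search response; it does not send any requests. The endpoint, the helper names, and the choice of `prop=info` are assumptions for illustration, not something settled in this thread.

```python
from urllib.parse import urlencode

# Assumed endpoint; any MediaWiki wiki exposes the same Action API at /w/api.php.
API_URL = "https://en.wikipedia.org/w/api.php"


def search_url(title, limit=1):
    """Build a list=search request for the given title."""
    params = {
        "action": "query",
        "list": "search",
        "srsearch": title,
        "srlimit": limit,
        "format": "json",
    }
    return API_URL + "?" + urlencode(params)


def first_pageid(search_response):
    """Return the pageid of the top search hit, or None if there are no hits."""
    hits = search_response.get("query", {}).get("search", [])
    return hits[0]["pageid"] if hits else None


def page_url(pageid):
    """Build a follow-up request that addresses the page by pageid, not by title."""
    params = {
        "action": "query",
        "pageids": pageid,
        "prop": "info",
        "format": "json",
    }
    return API_URL + "?" + urlencode(params)
```

The JSON returned for `search_url(...)` would be decoded and passed to `first_pageid`, and the result fed into `page_url`. Whether the top search hit is actually the intended article still needs checking.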
Hi @Miriam @Surlycyborg,
Here's my repo: https://github.com/achillesheel02/wikimedia-citation-classification-task-2.git
Oct 7 2019
Hello! I found it useful to install the packages in a virtual environment, had some issues probably with packages from before, and having a virtual environment solved the "No module found" errors. Here is some info: https://docs.python-guide.org/dev/virtualenvs/
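If it helps anyone debugging the same "No module found" errors, a quick way to confirm that the interpreter is really running inside a virtual environment is the check below. This is a small sketch; the function name `in_virtualenv` is made up for illustration.

```python
import sys


def in_virtualenv():
    """Return True if this interpreter appears to run inside a virtual environment."""
    # Environments created with the venv module set sys.base_prefix to the
    # original installation, so it differs from sys.prefix.
    base = getattr(sys, "base_prefix", sys.prefix)
    # Older releases of the virtualenv tool set sys.real_prefix instead.
    return sys.prefix != base or hasattr(sys, "real_prefix")


if __name__ == "__main__":
    print("virtual environment active:", in_virtualenv())
```

If this prints `False` after activating the environment, the script is probably being run with a different Python than the one the packages were installed into.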
Oct 6 2019
Those sections typically don't have a lot of usable text that is not links or references, so I think no, we can just ignore those.
That is probably a question for @Miriam - I imagine we want the same level of section title that was used to train the model.
Oct 4 2019
- Outreachy will ask us to choose an applicant around mid- to end-October according to their timeline, which we'll do based on which tasks they have completed, and how. We have no deadlines other than that - so let's say, any time in the next couple of weeks is perfectly fine.
- I believe question 2 was answered a couple of comments ago: so, email for this one, please. You can find Miriam's email in this user page, but I'll also edit this and the parent task to make that obvious, sorry for the omission.
- This and https://phabricator.wikimedia.org/T234606 are the only onboarding tasks we have in mind.
Sorry, that was a bit ambiguous I guess :) I mean to make a comment on this Phabricator entry, like we're doing now, with the URL to your repository.
The only deadline is really the Application period for Outreachy, which goes until mid-October.
Sep 7 2019
This is great, thank you for jumping on this @Magnus !
Feb 25 2019
Feb 23 2019
Hi, if the lower limit is here to stay, would it make sense to make a quick announcement on labs-l, or perhaps even send a note to tools that have exceeded the limit in the past N days? Apologies if this was done and I missed it, but I can't seem to find it. Per-user limits make perfect sense, and I'm adapting my batch jobs to them, but an email would have saved a bit of head-scratching as I saw my jobs failing.
OK, I've just deployed the fixes for the two issues I mentioned above: nicer retries and removing the extra CREATE commands from the serving path.
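For readers wondering what "nicer retries" could look like in general, here is a generic sketch of retrying with exponential backoff. It is not the code that was deployed; `with_retries` and its parameters are hypothetical names chosen for illustration.

```python
import time


def with_retries(operation, attempts=3, base_delay=0.5, sleep=time.sleep):
    """Call operation(), retrying on failure with exponentially growing delays.

    The last failure is re-raised so callers still see the real error.
    """
    for attempt in range(attempts):
        try:
            return operation()
        except Exception:
            if attempt == attempts - 1:
                raise
            # Wait base_delay, then 2x, 4x, ... before the next attempt.
            sleep(base_delay * 2 ** attempt)
```

Passing `sleep` as a parameter keeps the function easy to test without real waiting. In practice it would make sense to catch only the database errors that are known to be transient rather than every `Exception`.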
Feb 22 2019
If there is still impact to Toolforge and reasonable evidence that this tool is responsible, please feel free to disable it. Otherwise, that seems unwarranted. As I mentioned above, it makes a lot of sense to remove the redundant CREATEs and I hope to do it over the weekend, but it doesn't sound like we should be treating this as an urgent issue at this stage (fwiw the tool has had these extra statements for the past 4 years).
Feb 21 2019
Sorry, I haven't got around to it yet. Hope to get to it in the next few days though, it should be tracked in the GitHub issues I mentioned above.
Feb 15 2019
Thanks! Yes, sql tools does work now. Good luck with the rest of the recovery!
OK, so this morning I'd already disabled the batch jobs that make the heaviest use of the database, and we can definitely survive without them until Toolforge is fully recovered. However, the tool itself is still down (https://tools.wmflabs.org/citationhunt) and I still don't seem to have access to the database:
Hi, thanks for filing this. Here's some background I have:
Sep 6 2018
Aug 29 2018
Trust me that there is nothing self-explanatory about this. I don't know what a Petscan ID is. So...I'm stuck, a non-starter.
Aug 28 2018
Link to the tool: https://tools.wmflabs.org/worklist-tool .
@Aklapper, right, I believe @Meghasharma213, the GSoC student I mentored in writing the tool, is interested in staying around as a volunteer to develop and maintain the tool. I think I'll also be able to do some work on it sporadically, but Megha will likely remain the main developer going forward.
Apr 23 2018
We've accepted @Meghasharma213 for this project. Congratulations!
Mar 27 2018
The Phabricator task will be considered the draft and the PDF file uploaded to Google is the final proposal, but they should really just look the same.
Mar 26 2018
My idea was that if a list with the same name already exists, then why create a new one? Why can't people just add the articles to the existing list? But still, we can discuss it before implementation. For now, should I leave it as is in the proposal?
Mar 25 2018
Quick reminder to students applying for this: it seems you're required to submit your proposal both as a Phabricator task and a PDF to Google, as per step 9 in https://www.mediawiki.org/wiki/Google_Summer_of_Code/Participants.
Cool, this looks great and is just about ready for final submission. I've made a few more comments below but none are blocking to the proposal -- I'm convinced you understand the problem and have a plan to solve it :)
Mar 23 2018
Very nice, thank you! I'll make some inline comments below, but generally I like the extra features you've suggested, and I'd like to see a bit more detail on how you'd design and implement some of the things you've mentioned. Do let me know if you have any questions of course.
Mar 22 2018
Hi @Meghasharma213 -- great to hear you're interested! I'm definitely planning on being extra-available for feedback and finishing touches on proposals over the next few days, so it might actually not be a big deal that you're late: we'll just iterate faster :) Just let us know when you have something and we'll take a look!
I think different users can view each other's worklists, right? Do we have a reason to show only the logged-in user's worklists on the home page? If yes, we are already storing 'created_by' in the worklist table, so it would be easy to filter the worklists for the currently logged-in user.
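As a rough illustration of that filter, the sketch below uses an in-memory SQLite database. Only the `worklist` table and its `created_by` column come from the discussion; the `name` column, the sample rows, and the function names are assumptions made up for the example, and the real tool may use a different database.

```python
import sqlite3


def make_demo_db():
    """Create an in-memory database with a tiny, made-up worklist table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE worklist (name TEXT, created_by TEXT)")
    conn.executemany(
        "INSERT INTO worklist (name, created_by) VALUES (?, ?)",
        [("Birds", "alice"), ("Rivers", "bob"), ("Bridges", "alice")],
    )
    return conn


def worklists_for_user(conn, username):
    """Return the names of worklists created by the given user."""
    rows = conn.execute(
        "SELECT name FROM worklist WHERE created_by = ? ORDER BY name",
        (username,),
    )
    return [name for (name,) in rows]
```

The username is passed as a query parameter rather than formatted into the SQL string, which avoids injection problems if the value ever comes from user input.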
Mar 21 2018
Hey, nice to see this taking shape. Here's a couple more questions / comments based on your proposal and things that came up when I last met with the other mentors.
Mar 13 2018
Many thanks for your comments and suggestions, which seem very interesting.
I think that most of the info is already in my proposal, apart from the front end. When I say
To design the database architecture and refine specs
it is all about DB architecture and implementation, unless you need me to add more detail.
Mar 11 2018
Great, thanks for the answers! By the way, you're probably planning to do this already, but please move the results of these conversations we're having into the proposal itself at some point, just so it's complete and self-contained.
Mar 8 2018
Hey, thank you for this! By the way, I'd left some comments on https://semestriel.framapad.org/p/functional-spec-gsoc18 which you could address here.
Sorry for the delay, I've left some inline comments on your Google doc. Thanks for submitting this!
Mar 6 2018
Mar 5 2018
Nice, thank you! I can take a closer look in the morning :)
Mar 3 2018
Hey, sure, thanks for asking. I'm in UTC.
I've left a couple of questions and ideas in an issue in @AdityaJ 's repository to help prepare for the proposal.
Feb 20 2018
@BamLifa Very cool, that was fast! I've filed an issue in your repository with a suggestion that I think we'll need (and also updated the original project description to make it more explicit). Let me know if you have questions or problems :)
Feb 17 2018
I've added some more information to the initial post, including the mentors and points of contact and some suggestions for small tasks that applicants can start on before writing their proposal.
Feb 14 2018
I believe the idea would be to have this as a separate tool, not an on-wiki gadget. I'm not too familiar with gadgets, so please correct me if I'm wrong, but my understanding is that they'd typically be used to augment existing wiki pages, right? The idea here would be to have a tool that, given as input a selection of articles, facilitates collaboration for multiple users editing those articles, so it doesn't seem like a great use case for a gadget.
Oct 29 2017
Oct 6 2017
I'm most likely going to throw away this prototype, so there's no point pursuing this. Thank you all for your input!
Sep 24 2017
The code is here: https://github.com/eggpi/similarity/blob/master/app.py . Basically I'm trying to load a ~400MiB pickled file in the tool's home directory at start-up. That file contains a sparse matrix that gets traversed on each request.
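For context, the general pattern is "load the pickle once, reuse it on every request". The sketch below illustrates that pattern only and is not the code in the linked repository: a plain dict of dicts stands in for the sparse matrix, and `load_matrix` and `neighbors` are hypothetical names.

```python
import functools
import pickle


@functools.lru_cache(maxsize=None)
def load_matrix(path):
    """Unpickle the data once per path; later calls return the cached object."""
    with open(path, "rb") as f:
        return pickle.load(f)


def neighbors(path, row):
    """Per-request lookup: return the non-zero entries stored for one row."""
    matrix = load_matrix(path)
    return matrix.get(row, {})
```

With this layout the unpickling cost is paid on the first call, but the whole structure still has to fit in the memory available to the process.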
Sep 23 2017
Hmm, I did try a couple more low-hanging fruits to reduce the size of the data, which does make a difference locally. Still no luck launching the tool on the grid though.
Sep 22 2017
Aug 5 2017
Thanks for reporting!
Oct 7 2015
It's working again. Thank you very much!