Fri, Apr 9
If I can help, let me know. We have a few non-MediaWiki jobs in the codehealth pipeline for various languages (Java, Python) using a few different configuration setups.
Thu, Apr 8
Can we do something silly (but also quick and effective), like ignore the last section (maybe last two sections) when parsing the page, on the assumption that these sections may be sources/references sections?
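A minimal sketch of that idea, assuming the page is available as raw wikitext and that "section" means a level-2 heading (`== Title ==`). The function name `drop_trailing_sections` and the regex are hypothetical illustrations, not the link recommendation service's actual parser:

```python
import re

# Matches level-2 wikitext headings such as "== References ==" on their own line.
# Level-3 and deeper headings ("=== ... ===") are deliberately not matched.
HEADING_RE = re.compile(r"^==[^=].*?==\s*$", re.MULTILINE)


def drop_trailing_sections(wikitext, n=1):
    """Return wikitext with its last n level-2 sections removed.

    Hypothetical helper: assumes the trailing sections are the
    sources/references sections we want to skip.
    """
    starts = [m.start() for m in HEADING_RE.finditer(wikitext)]
    if n <= 0 or not starts:
        # Nothing to drop, or the page has no level-2 sections at all.
        return wikitext
    # If n exceeds the number of sections, keep only the lead text.
    cut = starts[-n] if n <= len(starts) else starts[0]
    return wikitext[:cut]
```

Calling `drop_trailing_sections(text, n=2)` would cut everything from the second-to-last level-2 heading onward. The cut is purely positional, so it would also drop a non-reference section if that happens to come last.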
I'm kind of tempted to block this task on T271603: Add a link engineering: Recommendation version. What are your thoughts, @Tgr @mewoph @MGerlach?
This is stalled pending the completion of T279427: Republish datasets with primary key ID column included and T279053: Grant ALTER privileges to adminlinkrecommendation user on m2. Hopefully, once those two tasks are completed, all that is left to do in this task is to review the logs of the Kubernetes container doing the dataset imports and confirm that the datasets have been loaded and are up-to-date for all wikis.
Updated query information:
Wed, Apr 7
@akosiaris @JMeybohm wondering if you all have ideas here. Comparing the local run with the profiler output from staging, there are several calls that seem to take significantly longer: the SQL queries, ngram_iterator, the (many) regex calls, and numpy. Granted, staging is underpowered compared to the other production deploys, but we see slowness in the external and internal traffic releases too. Is it possible that some kind of resource limit or configuration issue in the Kubernetes deployment is affecting the performance of the application?
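For anyone wanting to reproduce the comparison, here is a rough sketch of capturing a comparable profile in each environment with Python's standard `cProfile` and `pstats` modules. `sample_workload` is a hypothetical stand-in for the real request handler, and `profile_call` is an illustrative helper rather than anything in the service:

```python
import cProfile
import io
import pstats
import re


def sample_workload(text):
    # Hypothetical stand-in for the slow code path (regex-heavy on purpose).
    return sum(1 for _ in re.finditer(r"\w+", text))


def profile_call(func, *args, top=10):
    """Run func(*args) under cProfile and return (result, text report).

    The report lists the `top` entries sorted by cumulative time, so the
    same call can be profiled locally and in staging and compared.
    """
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(top)
    return result, buffer.getvalue()


if __name__ == "__main__":
    _, report = profile_call(sample_workload, "some example text " * 1000)
    print(report)
```

Sorting by cumulative time makes it easier to see whether the extra time is spread across all calls (which would hint at CPU throttling) or concentrated in a few, such as the SQL queries.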
@Gehel any thoughts on where this task might fit (or not) into your team's scheduled work?
I lean towards adding this to the post-release backlog.
Here's the profiler output from running this query in the staging release in Kubernetes:
Tue, Apr 6
We have Special:NewcomerTasksInfo, Grafana integration, and an API endpoint. The Toolforge site that consumes data from the API endpoint will be handled in T249987: Scale: GrowthExperiments wiki monitoring dashboard, so I think we could mark this task as resolved.
Thank you @Tgr!
@mepps do you mind updating the status of this task please? Is there anything to do here?
One thing I was thinking about was just to use the phrase "Suggestions mode" – does it matter to the end user if it's a computer or other humans providing the suggestions? Suppose in the future we had a structured task that involved one user making annotations in a document (like in Google Docs "Suggest" mode) and another user accepting/rejecting them – would that be a similar user experience to what we are providing with link recommendations?
This is blocked on T278864: Add a link: evaluate link recommendation (Mar 30 2021) but could be QA'd after.
The SE module contains logic to filter out protected articles, which Special:NewcomerTasksInfo doesn't -- T259346: Add page protection filter to CirrusSearch is probably the best way to do this, rather than re-implementing the logic in every place where we want to query data about tasks. Maybe that explains the off-by-one discrepancy.
I think we can declare this one resolved.