Wikidata software engineer, open source enthusiast, mediawiki volunteer developer, long-term Wikipedian
Babel: fa-N, en-4, de-2, tr-1, hu-1
This is basically a ticking time bomb.
I won't be around, but I also want to mention that I'm not sure reportupdater would be a great idea here. We are not using its codebase, reportupdater is not under active development (44 commits according to GitHub), its YAML files are scattered everywhere, and we have our own system of refinery scripts that we can use.
Okay, it should be fixed now.
My suggestion is to add be-tarask to the langlist file. That solved the issue and generated the hotfix I just deployed. The hotfix will get undeployed if 602675 doesn't get merged soon (and someone runs scap update-interwiki-cache). @Urbanecm Having the language name in the langlist file seems less hacky than putting it into the interwiki map. What do you think?
Looking at the logic of dumpInterwiki.php, it uses the db name, which might be the reason; let me dig deeper and find a solution.
I was wrong; we somehow removed the APCu cache bit altogether. I should find out what happened.
I think I found out what's wrong. The APCu cache seems to be per-wiki, but it doesn't need to be. Let me fix it.
Seconding Krinkle. I'll investigate this more. I have lots of ideas on how to improve memcached read pressure (we sometimes load lots of items from memcached, for example), but I need more metrics on which keys are being accessed too many times, what large data is being sent often, which requests cause lots of reads, etc. If that's not possible, it's fine, just let me know.
This really, really shouldn't happen. I'll investigate.
I assume it's good to go now? Shall we deploy soon?
Why is bidi so fun?
I looked at it a little. We can use the ::placeholder pseudo-element, though it's not supported in IE: https://caniuse.com/#search=%3A%3Aplaceholder
I set up a set of patches that basically allows redirects to be set as sitelinks in Wikidata (whether or not they have a badge). Is that acceptable @Lydia_Pintscher?
I can't reproduce this. Got fixed automagically?
Which option shall we pursue? Any preference?
My two cents:
- Has any sort of profiling been done on @ vs. AtEase? One might be quite a bit faster. We should check.
- I'm always supportive of reducing the codebase we need to maintain; one less repository to maintain, no matter how small, means more capacity for work in areas that are critical for us.
Wed, Jun 3
@Addshore Maybe I'm missing something obvious, but looking at https://gerrit.wikimedia.org/r/c/integration/config/+/573235/3/jjb/job-templates.yaml it only checks the main Wikibase package.json and not the important ones like bridge, TR, and termbox (honestly, running npm audit on these submodules exploded majestically).
The most recent run says these two need updating:
# Run  npm install --save-dev email@example.com  to resolve 1 vulnerability
┌───────────────┬──────────────────────────────────────────────────────────────┐
│ Low           │ Prototype Pollution                                          │
├───────────────┼──────────────────────────────────────────────────────────────┤
│ Package       │ yargs-parser                                                 │
├───────────────┼──────────────────────────────────────────────────────────────┤
│ Dependency of │ stylelint-config-wikimedia [dev]                             │
├───────────────┼──────────────────────────────────────────────────────────────┤
│ Path          │ stylelint-config-wikimedia > stylelint > meow > yargs-parser │
├───────────────┼──────────────────────────────────────────────────────────────┤
│ More info     │ https://nodesecurity.io/advisories/1500                      │
└───────────────┴──────────────────────────────────────────────────────────────┘
ladsgroup@mwmaint1002:~$ mwscript extensions/Wikibase/repo/maintenance/changePropertyDataType.php --wiki=wikidatawiki --property-id P1150 --new-data-type external-id
Successfully updated the property data type to external-id.
ladsgroup@mwmaint1002:~$ mwscript extensions/Wikibase/repo/maintenance/changePropertyDataType.php --wiki=wikidatawiki --property-id P1161 --new-data-type external-id
Successfully updated the property data type to external-id.
ladsgroup@mwmaint1002:~$ mwscript extensions/Wikibase/repo/maintenance/changePropertyDataType.php --wiki=wikidatawiki --property-id P1162 --new-data-type external-id
Successfully updated the property data type to external-id.
ladsgroup@mwmaint1002:~$ mwscript extensions/Wikibase/repo/maintenance/changePropertyDataType.php --wiki=wikidatawiki --property-id P5858 --new-data-type external-id
Successfully updated the property data type to external-id.
ladsgroup@mwmaint1002:~$ mwscript extensions/Wikibase/repo/maintenance/changePropertyDataType.php --wiki=wikidatawiki --property-id P1036 --new-data-type external-id
Successfully updated the property data type to external-id.
ladsgroup@mwmaint1002:~$ mwscript extensions/Wikibase/repo/maintenance/changePropertyDataType.php --wiki=wikidatawiki --property-id P1149 --new-data-type external-id
Successfully updated the property data type to external-id.
ladsgroup@mwmaint1002:~$ mwscript extensions/Wikibase/repo/maintenance/changePropertyDataType.php --wiki=wikidatawiki --property-id P1190 --new-data-type external-id
Successfully updated the property data type to external-id.
ladsgroup@mwmaint1002:~$
Let me take a look at this.
As for bots: although they sometimes log in more frequently than human users, that is not true of the majority of them. Bots that make significant numbers of edits usually log in only once in a while, so their footprint should be small.
I'm removing the Wikidata tag as there's currently nothing for the team to review. Re-add the tag once there's a patch we can review.
LoginNotify is not great here: it only counts logins from a new device and unsuccessful logins. Looking at Logstash, the real number is massive, a couple of times larger. And it's mostly bots; sometimes they log in several times a minute, continuously (which wouldn't be captured in LoginNotify). The number of unsuccessful logins is small though.
Looking at Grafana for LoginNotify, enabling this would add more than 200 rows per minute in total. Assuming the cu_changes table keeps data for 90 days, that's more than 26M rows.
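As a sanity check on that estimate (assuming a steady 200 rows per minute, which is itself a rough reading of the Grafana dashboard):

```python
# Rough estimate of cu_changes growth if every login is recorded,
# assuming a constant 200 rows per minute and 90 days of retention.
rows_per_minute = 200
retention_days = 90

total_rows = rows_per_minute * 60 * 24 * retention_days
print(f"{total_rows:,}")  # 25,920,000 — roughly 26M rows
```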
I'm pretty sure there are more tickets. I think I also filed one.
Tue, Jun 2
Has this been approved by Legal, T&S and DBA? I think we should get it before moving forward.
Clarification: I think we need to decide on a case-by-case basis whether we need another table or hardcoded numbers in the code, but we should not use enum for any of those.
Sat, May 30
I just wanted to say that, as a Wikipedian of 15 years, every time I see the design at the top of the ticket, my heart melts. Great work. Keep it up <3
Now the bot automatically adds and updates the description; here's an example: https://phabricator.wikimedia.org/transactions/detail/PHID-XACT-TASK-7yfvmpcvr7dcd5l/ It checks the open tickets of the wiki creation project once a day. Do not touch the automatic text, as it will get overridden, or worse: if you remove the beginning or the ending, the bot appends the whole thing again.
okay, to recap:
- The DNS and puppet patches are blockers for me
- The comments made by @Jdforrester-WMF haven't been answered yet. I think this is really important, as it can't be easily changed later: either a fishbowl wiki or a public wiki connected to SUL. We might get away with a public wiki (everyone can create an account) that is not connected to SUL, but as I said, these things are really hard to fix later.
- Nitpick: "API Portal" doesn't look right as a project name. Do you mean "Wikimedia API Portal"? It's not that important, as it can be changed later.
I still don't see the suggested edits section in the beta cluster, even though the database says I have it enabled:
MariaDB [fawiki]> select * from user_properties where up_user = 35 and up_property = 'growthexperiments-homepage-suggestededits-activated';
+---------+-----------------------------------------------------+----------+
| up_user | up_property                                         | up_value |
+---------+-----------------------------------------------------+----------+
|      35 | growthexperiments-homepage-suggestededits-activated | 1        |
+---------+-----------------------------------------------------+----------+
1 row in set (0.00 sec)
I added the patch to make it use ORES, but another problem was that the configuration pages don't exist yet. I created one, and I will add more later.
Okay, it's deployed in the beta cluster. The only thing I haven't figured out yet is the configuration for suggested edits. I think "morelike" would work for now, but I get this error in my console when trying it on my homepage:
Unable to load topic data for suggested edits: The configuration title does not exist.
After refresh, the whole thing is gone now :(
Okay, let's get this patch merged and deployed; once it's there, we can enable it in production. What do you think?
No worries at all. I'm also changing my mind quickly here.
One problem with using lag as the metric is that it doesn't go negative, so the integral will not be pulled down while the service is idle. We could subtract a target lag, say 1 minute, but that loses some of the supposed benefit of including an integral term. A better metric would be updater load, i.e. demand/capacity. When the load is more than 100%, the lag increases at a rate of 1 second per second, but there's no further information in there as to how heavily overloaded it is. When the load is less than 100%, lag decreases until it reaches zero. While it's decreasing, the slope tells you something about how underloaded it is, but once it hits zero, you lose that information.
Load is average queue size, if you take the currently running batch as being part of the queue. WDQS currently does not monitor the queue size. I gather (after an hour or so of research, I'm new to all this) that with some effort, KafkaPoller could obtain an estimate of the queue size by subtracting the current partition offsets from KafkaConsumer.endOffsets().
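The offset arithmetic itself is simple once both sets of offsets are in hand. A minimal sketch (the plain dicts here are stand-ins for whatever KafkaPoller would get back from KafkaConsumer.endOffsets() and its current partition positions; this is not KafkaPoller's actual code):

```python
def estimate_queue_size(end_offsets: dict, positions: dict) -> int:
    """Estimate the number of unconsumed messages across partitions
    by subtracting the consumer's current position from the end offset
    of each partition and summing the differences."""
    return sum(end_offsets[tp] - positions[tp] for tp in end_offsets)

# Two hypothetical partitions, with the consumer slightly behind on each:
print(estimate_queue_size({"p0": 120, "p1": 80},
                          {"p0": 100, "p1": 75}))  # 25
```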
Failing that, we can make a rough approximation from available data. We can get the average utilisation of the importer from the rdf-repository-import-time-cnt metric. You can see in Grafana that the derivative of this metric hovers between 0 and 1 when WDQS is not lagged, and remains near 1 when WDQS is lagged. The metric I would propose is to add replication lag to this utilisation metric, appropriately scaled: utilisation + K_lag * lag - 1 where K_lag is say 1/60s. This is a metric which is -1 at idle, 0 when busy with no lag, and 1 with 1 minute of lag. The control system would adjust the request rate to keep this metric (and its integral) at zero.
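To make the proposed metric concrete, here is a small sketch of the formula above (the function name is made up; the 60-second scale is the suggested K_lag of 1/60):

```python
# Sketch of the proposed control metric:
#   metric = utilisation + K_lag * lag - 1
# where utilisation is the derivative of rdf-repository-import-time-cnt
# (a fraction between 0 and 1) and lag is the replication lag in seconds.
# K_lag = 1/60 per second, implemented here as a division by 60.

def control_metric(utilisation: float, lag_seconds: float) -> float:
    """-1 at idle, 0 when fully busy with no lag, +1 at 1 minute of lag."""
    return utilisation + lag_seconds / 60.0 - 1

print(control_metric(0.0, 0))   # idle          -> -1.0
print(control_metric(1.0, 0))   # busy, no lag  ->  0.0
print(control_metric(1.0, 60))  # 1 min of lag  ->  1.0
```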
With PID, we need to define three constants: K_p, K_i, and K_d. If we already had trouble finding the pool size, this is going to be three times more complicated (I didn't find a standard way to determine these coefficients; maybe I'm missing something obvious).
One way to simplify it is to set K_d=0, i.e. make it a PI controller. Having the derivative in there probably doesn't add much. Then it's only two times more complicated. Although I added K_lag, so I suppose we are still at 3. The idea is that it shouldn't matter too much exactly what K_p and K_i are set to: the system should be stable and have low lag across a wide range of parameter values. So you just pick some values and see if it works.
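A toy PI controller over such a metric might look like the following (all gains and the update interval are made-up illustration values, not a tuned configuration):

```python
class PIController:
    """Toy PI controller: produces a rate adjustment that drives the
    control metric toward zero. Not a tuned or production design."""

    def __init__(self, k_p: float, k_i: float, dt: float):
        self.k_p = k_p        # proportional gain
        self.k_i = k_i        # integral gain
        self.dt = dt          # seconds between updates
        self.integral = 0.0   # accumulated error over time

    def update(self, metric: float) -> float:
        """Negative metric (underloaded) -> positive adjustment (speed up);
        positive metric (lagged) -> negative adjustment (slow down)."""
        self.integral += metric * self.dt
        return -(self.k_p * metric + self.k_i * self.integral)

controller = PIController(k_p=0.5, k_i=0.1, dt=10)
# An idle system (metric = -1) yields a positive rate adjustment:
print(controller.update(-1.0))  # 1.5
```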
We currently don't have infrastructure to hold the "maxlag" data over time so we can calculate its derivative and integral. Should we use Redis? What would that look like? These are questions I don't have answers for. Do you have ideas?
WDQS lag is currently obtained by having an ApiMaxLagInfo hook handler which queries Prometheus, caching the result. Prometheus has a query language which can perform derivatives ("rate") and integrals ("sum_over_time") on metrics. So it would be the same system as now, just with a different Prometheus query.
Fri, May 29
I need the DNS patch and the puppet patch to be merged and deployed before I can continue. The SRE team should take a look.
Thu, May 28
I made this as a possible solution: https://github.com/wmde/reference-island/pull/63 which seems easy and straightforward.
I'm slightly concerned about putting Wikidata in front of all production wikis. Issues in Wikidata usually propagate to all wikis; can Wikidata take it after group0?
Wed, May 27
With the announcement done, this can move forward on our side.
Tue, May 26
I honestly think we should just drop the Wikibase codesniffer. We have been maintaining too much code.
I sorta accidentally picked it up while working on the reading improvements. One thing I realized is that even batching it to 2 items per write makes it quite a bit faster:
amsa@amsa-Latitude-7480:~/workspace/ref2$ time python3 wikidatarefisland/run.py --step extract_items --side-service-input "whitelisted_ext_idefs.json" --input "/home/amsa/workspace/latest-all.json.gz" --output "extracted_unreferenced_statements.jsonl" --write-batch 0
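The speedup presumably comes from grouping statements before each write. A minimal batching helper, mirroring the --write-batch flag where 0 means unbatched (this is a hypothetical sketch, not the script's actual implementation):

```python
def batched(items, batch_size):
    """Yield lists of up to batch_size items each.
    batch_size <= 0 disables batching (one item per write)."""
    if batch_size <= 0:
        for item in items:
            yield [item]
        return
    batch = []
    for item in items:
        batch.append(item)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:  # flush the final partial batch
        yield batch

# With a batch size of 2, five statements need three writes instead of five:
print(list(batched(range(5), 2)))  # [[0, 1], [2, 3], [4]]
```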