Sascha (Sascha Brawer)
User

Projects

User does not belong to any projects.

User Details

User Since
Feb 19 2018, 8:29 PM (226 w, 6 d)
Availability
Available
LDAP User
Sascha
MediaWiki User
Unknown

Recent Activity

Apr 25 2022

TheresNoTime awarded T306790: Set up monitoring for community cronjobs a Like token.
Apr 25 2022, 1:16 PM · observability, Cloud-Services
Sascha added a comment to T306790: Set up monitoring for community cronjobs.

This seems like a duplicate

Somewhat, although this ticket here was meant specifically for monitoring cronjob completions. This is different (and simpler) than setting up Cortex/Thanos-like monitoring on metrics exposed by continuously running services.

Apr 25 2022, 12:35 PM · observability, Cloud-Services
Sascha added a comment to T279621: Set up Misc Object Storage Service (moss).

Will Toolforge and Cloud VPS jobs be able to read and write into their own custom buckets? (That would be super helpful.)

Apr 25 2022, 12:16 PM · SRE-swift-storage
Sascha created T306790: Set up monitoring for community cronjobs.
Apr 25 2022, 11:44 AM · observability, Cloud-Services

Feb 2 2022

Sascha added a comment to T236992: Order Wikidata search result by number of statements/labels/sitelinks/identifiers.

Possibly relevant: https://qrank.wmcloud.org/ which ranks Wikidata items by how often their pages get viewed on Wikipedia, Wikitravel, etc.; updated ~weekly.

Feb 2 2022, 6:45 AM · CirrusSearch, Discovery-Search, Wikidata

Jan 25 2022

Sascha added a comment to T194332: [Epic] Make Toolforge a proper platform as a service with push-to-deploy and build packs.

@nskaggs, it’s in Go, although I’m working on a web frontend that’ll eventually have a part written in JavaScript/React. Here’s my current “release process” that I’m hoping to make less manual. The environment variable GOOS=linux can be omitted when the compiler runs on a Linux machine.

Jan 25 2022, 6:20 AM · User-dcaro, Cloud-Services-Origin-Team, Cloud-Services-Worktype-Project, Toolforge Build Service, Cloud Services Proposals, cloud-services-team (Kanban), Epic

Jan 21 2022

Sascha added a comment to T194332: [Epic] Make Toolforge a proper platform as a service with push-to-deploy and build packs.

Does the push-to-deploy pipeline accept early adopters? I’d gladly volunteer as a guinea pig.

Jan 21 2022, 8:23 AM · User-dcaro, Cloud-Services-Origin-Team, Cloud-Services-Worktype-Project, Toolforge Build Service, Cloud Services Proposals, cloud-services-team (Kanban), Epic

Jan 6 2022

Sascha added a comment to T215438: Aggregate pageviews to Wikidata entities.

Meanwhile I’ve set up Wikidata QRank, which computes this dataset on a weekly basis and offers the results for public download. So, feel free to close this ticket. But if the data engineering team was interested in joining/improving/taking over the project, please feel welcome; it’d be great to work together!

Jan 6 2022, 6:44 AM · Data-Engineering

Dec 23 2021

Sascha created T298228: Request creation of qrank VPS project.
Dec 23 2021, 7:03 AM · Cloud-VPS (Project-requests)

Nov 17 2021

Sascha created T295879: Wikidata: Support language gmh for monolingual text.
Nov 17 2021, 1:39 PM · Language codes, Wikidata

Jun 15 2021

Sascha added a comment to T284947: Shut down certmon?.

Oh, if it’s useful to you, I’ll gladly keep it running. Do you want it to monitor other domains beyond the current four?

Jun 15 2021, 1:21 PM · Tools

Jun 14 2021

Sascha created T284947: Shut down certmon?.
Jun 14 2021, 5:40 PM · Tools

May 17 2021

Sascha added a comment to T215098: Playing audio for the second time always stutters.

Friendly ping? The bug is still present.

May 17 2021, 2:59 PM · Kaltura player

May 14 2021

Sascha added a comment to T282264: Monitor certificate validity for Cloud VPS.

Cool, glad it’s useful! When you set up Prometheus rules, consider alerting when certmon_tls_certificate_expiration_timestamp - time() drops below roughly 2 weeks for a domain; see Prometheus recommendations for timestamps. Then the SRE team would get plenty of advance notice for expiring TLS certificates, allowing problems to be fixed long before they become user-visible outages. (Apologies if I’m stating the obvious here; you’ll know more about this than me.)
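
The threshold check described above can be sketched in Python. This is only an illustration of the arithmetic, not certmon's code or an actual Prometheus rule; the function names and the exact 14-day constant are assumptions made for this example.

```python
import time

# Assumed threshold: "~2 weeks", expressed in seconds.
TWO_WEEKS_SECONDS = 14 * 24 * 60 * 60


def seconds_until_expiry(expiration_timestamp, now=None):
    """Return the remaining validity in seconds (negative once expired)."""
    if now is None:
        now = time.time()
    return expiration_timestamp - now


def should_alert(expiration_timestamp, now=None, threshold=TWO_WEEKS_SECONDS):
    """True when the certificate expires in less than `threshold` seconds."""
    return seconds_until_expiry(expiration_timestamp, now) < threshold
```

Passing `now` explicitly keeps the check deterministic in tests; in a real alerting rule the comparison would be evaluated by the monitoring system itself.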

May 14 2021, 9:46 AM · cloud-services-team (Kanban), Cloud-VPS

May 11 2021

Sascha added a comment to T282102: certificate for Cloud VPS has expired.

If it helps, feel free to adopt https://certmon.toolforge.org/ which was quickly thrown together in an attempt to help Wikimedia to improve its monitoring. See source code and the metrics endpoint for Prometheus monitoring. Feel free to fork, send pull requests, whatever. Please do tell if you end up using it, I’m quite curious. If it’s useful, my personal preference would be that you’d clone the repo into a better place (perhaps a Phabricator project) and run it yourself, so the Wikimedia SRE team could change things without me getting involved.

May 11 2021, 1:36 PM · SRE, Traffic, HTTPS, Cloud-VPS
Sascha added a comment to T282264: Monitor certificate validity for Cloud VPS.

If it helps, feel free to adopt https://certmon.toolforge.org/ which was quickly thrown together in an attempt to help Wikimedia to improve its monitoring. See source code and the metrics endpoint for Prometheus monitoring. Feel free to fork, send pull requests, whatever. Please do tell if you end up using it, I’m quite curious. If it’s useful, my personal preference would be that you’d clone the repo into a better place (perhaps a Phabricator project) and run it yourself, so the Wikimedia SRE team could change things without me getting involved.

May 11 2021, 1:36 PM · cloud-services-team (Kanban), Cloud-VPS

May 7 2021

Sascha created T282264: Monitor certificate validity for Cloud VPS.
May 7 2021, 5:45 PM · cloud-services-team (Kanban), Cloud-VPS

Apr 28 2021

Sascha added a comment to T209390: Output some meta data about the wikidata JSON dump.

Hm, good point. Could the dumps be made consistent? Maybe like this: Before starting a dump, find the current last revision; pass this cut-off revision ID to the dumping shards; change the dump-producing code to not consider changes after the cut-off revision. But I wouldn’t know how hard this would be. Actually, DumpEntities already seems to take a last-page-id flag, but I don’t know if/where that is getting set in production (and if that’s really enough).
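
As a rough sketch of the cut-off idea (not how DumpEntities actually works), each dumping shard could pick, per entity, the newest revision at or below the agreed cut-off. The data layout and the function names here are hypothetical.

```python
def revision_at_cutoff(revision_ids, cutoff_revision):
    """Return the newest revision ID that is not after the cut-off, or None."""
    eligible = [rev for rev in revision_ids if rev <= cutoff_revision]
    return max(eligible) if eligible else None


def plan_dump(history, cutoff_revision):
    """Map each entity ID to the revision a shard should dump.

    `history` is a hypothetical dict of entity ID -> list of revision IDs.
    Entities that only have revisions after the cut-off are left out.
    """
    plan = {}
    for entity_id, revision_ids in history.items():
        rev = revision_at_cutoff(revision_ids, cutoff_revision)
        if rev is not None:
            plan[entity_id] = rev
    return plan
```

With every shard using the same cut-off, the resulting dump would reflect one consistent point in the revision sequence.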

Apr 28 2021, 9:14 AM · wdwb-tech, Dumps-Generation, Wikidata
Sascha added a comment to T87283: Wikidata dumps should have revision ID or other sequence mark.

Regarding dump-level metadata, it would be super useful to know what timestamp should be passed to EventStreams for catching up with user edits after the dump was produced. To find this timestamp, can clients extract the entity ID with the highest lastrevid from a Wikidata dump, and then retrieve the corresponding timestamp via Special:EntityData like this? Or would a sync-up client lose some edits if it were to do this? (For example, if dumps get produced by parallel workers, they’d probably have to agree on a cut-off revision before starting the dumping process; otherwise, the JSON file wouldn’t necessarily contain all changes before the highest lastrevid in the dump file... correct?)

Apr 28 2021, 8:59 AM · MW-1.34-notes (1.34.0-wmf.5; 2019-05-14), Wikidata
Sascha added a comment to T209390: Output some meta data about the wikidata JSON dump.

To find the timestamp of the last Wikidata change that went into a dump file, couldn’t one — while processing the dump — extract the entity and revision ID with the highest lastrevid value in the entire dump, and then retrieve the corresponding modified timestamp for that single edit via Special:EntityData like in this query? The lastrevid field seems to have been added to dumps by T87283 in changeset 500806.
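
The scan for the highest lastrevid could look roughly like this in Python. The sketch assumes the dump is a JSON array with one entity per line (each line ending in a comma) and that every entity carries `id` and `lastrevid` fields; the function name is made up for this example.

```python
import json


def find_highest_revision(lines):
    """Return (entity_id, lastrevid) for the entity with the highest lastrevid.

    `lines` is any iterable of text lines from the dump. The opening "[" and
    closing "]" lines and blank lines are skipped.
    """
    best_id, best_rev = None, -1
    for line in lines:
        text = line.strip().rstrip(",")
        if text in ("", "[", "]"):
            continue
        entity = json.loads(text)
        rev = entity.get("lastrevid")
        if rev is not None and rev > best_rev:
            best_id, best_rev = entity["id"], rev
    return best_id, best_rev
```

Because it keeps only the current maximum, the scan runs in a single streaming pass and never holds the whole dump in memory.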

Apr 28 2021, 8:19 AM · wdwb-tech, Dumps-Generation, Wikidata

Apr 9 2021

YFdyh000 awarded T275024: Toolforge: Update go runtime a Love token.
Apr 9 2021, 1:23 PM · Toolforge (Software install/update), cloud-services-team (Kanban)

Apr 6 2021

Sascha added a comment to T277749: [Toolforge] Generic webservice not working on Kubernetes.

Sure, glad to try. I’ve changed the qrank-builder job config to use the new image. It seems to work fine.

Apr 6 2021, 6:12 AM · Kubernetes, Toolforge

Mar 25 2021

Sascha created T278416: Mention QRank in “Analytics Datasets”.
Mar 25 2021, 9:42 AM · Analytics-Kanban, Dumps-Generation, Analytics
Sascha created T278409: [Legal] Downloads license should mention CC0 for Analytics datasets.
Mar 25 2021, 7:17 AM · Analytics-Kanban, Analytics, Datasets-General-or-Unknown

Mar 22 2021

Sascha created T278176: Use ranking signal for Special:Search.
Mar 22 2021, 7:49 PM · Discovery-Search, MediaWiki-Search, Advanced-Search
Sascha added a watcher for community-labs-monitoring: Sascha.
Mar 22 2021, 10:56 AM
Sascha changed the status of T278097: Monitoring and alerting for Toolforge tools from Open to Stalled.

Thanks for the pointer! Indeed, I was hoping the Wikimedia Cloud had something like Cortex or Thanos running on behalf of custom tools. Hm, considering for how long these discussions seem to already have been taking place, it doesn’t really look like this will be coming anytime soon. So, closing this ticket here as stalled; things won’t go any faster with more tickets around.

Mar 22 2021, 10:43 AM · Toolforge
Sascha created T278097: Monitoring and alerting for Toolforge tools.
Mar 22 2021, 10:04 AM · Toolforge

Mar 19 2021

Sascha added a comment to T277749: [Toolforge] Generic webservice not working on Kubernetes.

As @bd808 suspected, security is indeed the main reason why I’d like to run my dinky webservice in a constrained environment. As an external volunteer developer, I’m always fearing that my contributions may cause more harm than good. Especially when contributing some minor tool that doesn’t see much attention, I can sleep better when there isn’t much else bundled into the container for my webservice. Of course, the risks can be mitigated with container scanning, actively checking CVEs, etc. — but as an external volunteer, I don’t really want to impose such maintenance burden on others. Of course, keeping containers lean isn’t the universal solution to all problems in production security—still, with less baggage, fewer things can go wrong. Basically, it’s an attempt at taming the beast of system complexity.

Mar 19 2021, 8:35 AM · Kubernetes, Toolforge

Mar 18 2021

Sascha added a comment to T277457: Request increased quota for qrank Toolforge tool.

Thanks, @aborrero! I filed a separate ticket T277808 about deployment since it’s a bit off-topic from the CPU quota.

Mar 18 2021, 9:32 PM · Toolforge (Quota-requests)
Sascha created T277808: [Deployment pipeline] Support Toolforge?.
Mar 18 2021, 9:19 PM · Release-Engineering-Team (Seen), Toolforge
Sascha added a comment to T277749: [Toolforge] Generic webservice not working on Kubernetes.

With the Go programming language, binaries typically get statically linked. So, compiled programs will typically run without any runtime dependencies whatsoever — they wouldn’t access package files, call shared libraries, or use any other files. When compiling for Linux, the compiler builds an ELF binary that directly invokes the operating system kernel through Linux system calls, not even using libc or anything else in a Linux distribution. Rust may be similar in that respect (not sure); static linking can also be done with C and C++, although it’s a bit less common there.

Mar 18 2021, 8:42 PM · Kubernetes, Toolforge
Sascha created T277749: [Toolforge] Generic webservice not working on Kubernetes.
Mar 18 2021, 1:14 PM · Kubernetes, Toolforge
Sascha added a comment to T277457: Request increased quota for qrank Toolforge tool.

Also, you mention building the go program on Toolforge. How do you build it? I guess you build it in the toolforge bastion?

Mar 18 2021, 12:56 PM · Toolforge (Quota-requests)
Sascha added a comment to T277457: Request increased quota for qrank Toolforge tool.

Thank you! Yes, this is a build pipeline for data, it isn’t compiling code. For background, see the technical design document. (Feedback very welcome!)

Mar 18 2021, 10:15 AM · Toolforge (Quota-requests)

Mar 16 2021

Sascha updated the task description for T277457: Request increased quota for qrank Toolforge tool.
Mar 16 2021, 6:35 PM · Toolforge (Quota-requests)
Sascha updated the task description for T277457: Request increased quota for qrank Toolforge tool.
Mar 16 2021, 6:32 PM · Toolforge (Quota-requests)
Sascha added a comment to T143424: Explore the Entity Relevancy Scoring for Wikidata.

Perhaps the QRank signal might be helpful here? The signal is computed in the Wikimedia cloud infrastructure (Toolforge) and gets periodically refreshed. It’s just aggregated pageviews, but I found it pretty useful in my own projects, which is why I contributed it to Toolforge.

Mar 16 2021, 11:30 AM · Wikidata
Sascha added a comment to T174981: Add pageviews total counts to WDQS.

Perhaps the QRank signal might be helpful here? The signal is computed in the Wikimedia cloud infrastructure (Toolforge) and gets periodically refreshed. It’s just aggregated pageviews, but I found it pretty useful in my own projects, which is why I contributed it to Toolforge.

Mar 16 2021, 11:29 AM · Analytics-Radar, Discovery, Wikidata, Wikidata-Query-Service

Mar 15 2021

Sascha created T277457: Request increased quota for qrank Toolforge tool.
Mar 15 2021, 12:16 PM · Toolforge (Quota-requests)

Mar 2 2021

Sascha added a comment to T275371: Toolforge on Kubernetes: Broken symlink to dumps.

Yes, it works now. Thank you!

Mar 2 2021, 6:02 AM · cloud-services-team (Kanban), Kubernetes, Toolforge

Feb 24 2021

Sascha created T275703: [Toolforge] Reading dumps is very slow.
Feb 24 2021, 10:14 PM · cloud-services-team (Kanban), Kubernetes, Toolforge
Sascha added a comment to T275555: [toolforge k8s] Support Cinder volumes.

Note that Kubernetes can also directly mount volumes from Ceph RBD, so this wouldn’t necessarily have to be done via Cinder. If Kubernetes mounted Ceph RBD directly, there would be one less layer to maintain. But I don’t know how well this would fit into Wikimedia’s production setup in terms of quota enforcement, key management, monitoring, etc. Here are some pointers, in case you want to explore this. The example setup actually looks quite simple.

Feb 24 2021, 11:15 AM · cloud-services-team (Kanban), Kubernetes, Toolforge

Feb 23 2021

Chicocvenancio awarded T275555: [toolforge k8s] Support Cinder volumes a Love token.
Feb 23 2021, 8:29 PM · cloud-services-team (Kanban), Kubernetes, Toolforge
Sascha created T275555: [toolforge k8s] Support Cinder volumes.
Feb 23 2021, 8:22 PM · cloud-services-team (Kanban), Kubernetes, Toolforge

Feb 22 2021

Sascha created T275371: Toolforge on Kubernetes: Broken symlink to dumps.
Feb 22 2021, 10:46 AM · cloud-services-team (Kanban), Kubernetes, Toolforge

Feb 17 2021

Sascha created T275024: Toolforge: Update go runtime.
Feb 17 2021, 1:05 PM · Toolforge (Software install/update), cloud-services-team (Kanban)

Nov 12 2019

Sascha added a comment to R2362:938075faf218: Add bulk lexeme creation mode.

Do you need a beta tester? I have a public domain list of 600 Sursilvan verbs including inflected forms. (In Sursilvan, verb inflection is quite complicated and fills entire textbooks; sort of like Latin, but with more exceptions). I’d like to import this knowledge into Wikidata. (Actually, if QuickStatements2 was able to create lexemes and refer to the newly created lexeme from within the same batch, that would probably be enough). Example:

Nov 12 2019, 8:40 AM

May 6 2019

Sascha added a comment to T215438: Aggregate pageviews to Wikidata entities.

Friendly ping?

May 6 2019, 11:04 AM · Data-Engineering

May 3 2019

Sascha added a comment to T222426: Add monolingual language codes rm-rumgr, rm-surmiran, rm-sursilv, rm-sutsilv, rm-vallader, rm-puter.

The codes are valid (and registered) IETF BCP47 language codes.

May 3 2019, 2:48 PM · Language codes, MW-1.35-notes (1.35.0-wmf.30; 2020-04-28), Wikidata-Campsite (Wikidata-Campsite-Iteration-∞ (On Hold)), Wikidata
Sascha added a comment to T210293: Add language codes rm-rumgr, rm-sursilv, rm-surmiran, rm-sutsilv, rm-vallader, rm-puter for Lexemes.

@GerardM Yes, this request is merely about supporting Wikidata *content* in those language variants, e.g. allowing people to enter Sursilvan usage examples for a Sursilvan lexeme (see T222426). No need to translate the *user interface* to Sursilvan, Vallader, etc.

May 3 2019, 2:44 PM · Language codes, MW-1.34-notes (1.34.0-wmf.3; 2019-04-30), User-Michael, Wikidata Lexicographical data, Wikidata-Campsite (Wikidata-Campsite-Iteration-∞ (On Hold)), Wikidata
Sascha added a comment to T222420: Wikidata links to Commons should appear in Special:WhatLinksHere.

Hm... in the sidebar on Wikimedia Commons (see screenshot), would it perhaps make sense to replace the link to Special:WhatLinksHere by a link to Special:GlobalUsage? Currently, there seems to be a usability/UX issue: the feature is already implemented (thanks for the kind explanation on this bug, I had no idea!). However, people might never come across Special:GlobalUsage unless they already know that it exists. Hence the suggestion to remove “What links here” from the sidebar and replace it by “Global usage”, which seems to be a superset. (There’s a risk of cluttering the user experience when the sidebar has too many links).

May 3 2019, 9:33 AM · Commons, Wikidata, MediaWiki-Special-pages
Sascha added a comment to T210293: Add language codes rm-rumgr, rm-sursilv, rm-surmiran, rm-sutsilv, rm-vallader, rm-puter for Lexemes.

@Lea_Lacroix_WMDE Thank you! Filed T222426.

May 3 2019, 7:59 AM · Language codes, MW-1.34-notes (1.34.0-wmf.3; 2019-04-30), User-Michael, Wikidata Lexicographical data, Wikidata-Campsite (Wikidata-Campsite-Iteration-∞ (On Hold)), Wikidata
Sascha created T222426: Add monolingual language codes rm-rumgr, rm-surmiran, rm-sursilv, rm-sutsilv, rm-vallader, rm-puter.
May 3 2019, 7:58 AM · Language codes, MW-1.35-notes (1.35.0-wmf.30; 2020-04-28), Wikidata-Campsite (Wikidata-Campsite-Iteration-∞ (On Hold)), Wikidata
Sascha added a comment to T210293: Add language codes rm-rumgr, rm-sursilv, rm-surmiran, rm-sutsilv, rm-vallader, rm-puter for Lexemes.

Is there anything specific I should do so that people can enter usage examples for Sursilvan lexemes, and likewise for lexemes in the various other Romansh variants? I’ll gladly file more tickets if it helps; just tell me what to do.

May 3 2019, 7:33 AM · Language codes, MW-1.34-notes (1.34.0-wmf.3; 2019-04-30), User-Michael, Wikidata Lexicographical data, Wikidata-Campsite (Wikidata-Campsite-Iteration-∞ (On Hold)), Wikidata
Sascha added a comment to T210293: Add language codes rm-rumgr, rm-sursilv, rm-surmiran, rm-sutsilv, rm-vallader, rm-puter for Lexemes.

Filed T222423 for another (very minor) issue that seems related to language variants.

May 3 2019, 7:15 AM · Language codes, MW-1.34-notes (1.34.0-wmf.3; 2019-04-30), User-Michael, Wikidata Lexicographical data, Wikidata-Campsite (Wikidata-Campsite-Iteration-∞ (On Hold)), Wikidata
Sascha created T222423: Lexemes should display language name (not code) of Romansh variants in gloss language.
May 3 2019, 7:10 AM · Language codes, I18n, MediaWiki-extensions-CLDR, Wikidata
Sascha reopened T210293: Add language codes rm-rumgr, rm-sursilv, rm-surmiran, rm-sutsilv, rm-vallader, rm-puter for Lexemes as "Open".
May 3 2019, 6:20 AM · Language codes, MW-1.34-notes (1.34.0-wmf.3; 2019-04-30), User-Michael, Wikidata Lexicographical data, Wikidata-Campsite (Wikidata-Campsite-Iteration-∞ (On Hold)), Wikidata
Sascha reopened T210293: Add language codes rm-rumgr, rm-sursilv, rm-surmiran, rm-sutsilv, rm-vallader, rm-puter for Lexemes, a subtask of T144272: [DO NOT USE] new monolingual language code requests for Wikidata (tracking) [superseded by #language_codes], as Open.
May 3 2019, 6:20 AM · Language codes, MediaWiki-extensions-WikibaseRepository, Tracking-Neverending, Wikidata
Sascha added a comment to T210293: Add language codes rm-rumgr, rm-sursilv, rm-surmiran, rm-sutsilv, rm-vallader, rm-puter for Lexemes.

Hm, adding usage examples (and probably similar properties) doesn’t seem to work yet. Try adding the sentence “Ils tgauns vivan dalla naschientscha naven ensemen cullas nuorsas.” as usage example (P5831) in language “rm-sursilv” for tgaun (L45642); see screenshot.

May 3 2019, 6:19 AM · Language codes, MW-1.34-notes (1.34.0-wmf.3; 2019-04-30), User-Michael, Wikidata Lexicographical data, Wikidata-Campsite (Wikidata-Campsite-Iteration-∞ (On Hold)), Wikidata
Sascha created T222420: Wikidata links to Commons should appear in Special:WhatLinksHere.
May 3 2019, 5:52 AM · Commons, Wikidata, MediaWiki-Special-pages

May 2 2019

Sascha closed T210293: Add language codes rm-rumgr, rm-sursilv, rm-surmiran, rm-sutsilv, rm-vallader, rm-puter for Lexemes as Resolved.

Ah, got it. Thank you!

May 2 2019, 7:49 PM · Language codes, MW-1.34-notes (1.34.0-wmf.3; 2019-04-30), User-Michael, Wikidata Lexicographical data, Wikidata-Campsite (Wikidata-Campsite-Iteration-∞ (On Hold)), Wikidata
Sascha closed T210293: Add language codes rm-rumgr, rm-sursilv, rm-surmiran, rm-sutsilv, rm-vallader, rm-puter for Lexemes, a subtask of T144272: [DO NOT USE] new monolingual language code requests for Wikidata (tracking) [superseded by #language_codes], as Resolved.
May 2 2019, 7:49 PM · Language codes, MediaWiki-extensions-WikibaseRepository, Tracking-Neverending, Wikidata
Sascha reopened T210293: Add language codes rm-rumgr, rm-sursilv, rm-surmiran, rm-sutsilv, rm-vallader, rm-puter for Lexemes as "Open".

Is something else needed to activate lexemes in variants of Romansh? See screenshot:

May 2 2019, 10:20 AM · Language codes, MW-1.34-notes (1.34.0-wmf.3; 2019-04-30), User-Michael, Wikidata Lexicographical data, Wikidata-Campsite (Wikidata-Campsite-Iteration-∞ (On Hold)), Wikidata
Sascha reopened T210293: Add language codes rm-rumgr, rm-sursilv, rm-surmiran, rm-sutsilv, rm-vallader, rm-puter for Lexemes, a subtask of T144272: [DO NOT USE] new monolingual language code requests for Wikidata (tracking) [superseded by #language_codes], as Open.
May 2 2019, 10:20 AM · Language codes, MediaWiki-extensions-WikibaseRepository, Tracking-Neverending, Wikidata

Apr 8 2019

Sascha added a comment to T210293: Add language codes rm-rumgr, rm-sursilv, rm-surmiran, rm-sutsilv, rm-vallader, rm-puter for Lexemes.

Just to clarify, the codes in this ticket (rm-rumgr etc.) are not made up; they have been standardized by IETF and appear in the IANA language subtag registry.

Apr 8 2019, 5:09 PM · Language codes, MW-1.34-notes (1.34.0-wmf.3; 2019-04-30), User-Michael, Wikidata Lexicographical data, Wikidata-Campsite (Wikidata-Campsite-Iteration-∞ (On Hold)), Wikidata

Apr 2 2019

Sascha created T219914: Outdated project codes in pagecounts-ez.
Apr 2 2019, 6:23 PM · Analytics

Apr 1 2019

Sascha added a comment to T215438: Aggregate pageviews to Wikidata entities.

If nobody else has time to do this, may I volunteer to write the code? Please tell me where to start (which programming language, what framework, etc.)

Apr 1 2019, 11:55 AM · Data-Engineering

Mar 25 2019

Sascha added a comment to T210293: Add language codes rm-rumgr, rm-sursilv, rm-surmiran, rm-sutsilv, rm-vallader, rm-puter for Lexemes.

Curious, is it possible to estimate by what date this might get implemented? Is there anything I can do to help?

Mar 25 2019, 2:12 PM · Language codes, MW-1.34-notes (1.34.0-wmf.3; 2019-04-30), User-Michael, Wikidata Lexicographical data, Wikidata-Campsite (Wikidata-Campsite-Iteration-∞ (On Hold)), Wikidata

Mar 20 2019

Ninjastrikers awarded T213576: Display a warning when entering Zawgyi-encoded Burmese a Like token.
Mar 20 2019, 5:48 AM · Wikidata

Mar 14 2019

Sascha added a comment to T124758: [Story] Show all available languages in monolingual text value's suggester.

Oh, all you need from CLDR is an English label? Nothing else? In that case, this Wikidata query might be helpful:

Mar 14 2019, 5:24 PM · MW-1.36-notes (1.36.0-wmf.34; 2021-03-09), Patch-For-Review, Wikidata-Campsite (Wikidata-Campsite-Iteration-∞ (On Hold)), UX-Debt, UniversalLanguageSelector, WMDE-Design, Design, Story, Wikidata-Sprint-2016-04-26, Wikidata-Sprint-2016-03-01, Wikidata
Sascha added a comment to T124758: [Story] Show all available languages in monolingual text value's suggester.

Sure, but it will take a while until the next official release of CLDR so you'd have to read the CLDR data from the development branch ("trunk"). I do wonder, though, if you could read the IANA registry in addition to CLDR and use IANA as fallback for the English names when CLDR has no data yet. Then, you would immediately get an English name for every language with an ISO 639 or IETF BCP 47 code, so you'd add support for a couple thousand languages at once.
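
A minimal sketch of the fallback lookup in Python, assuming both sources have already been loaded into plain dictionaries keyed by language code. The function name and the dictionary arguments are hypothetical; parsing CLDR or the IANA registry is not shown.

```python
def english_name(code, cldr_names, iana_names):
    """Return the English name for a language code.

    CLDR is preferred; the IANA registry is only consulted when CLDR
    has no entry. Returns None when neither source knows the code.
    """
    if code in cldr_names:
        return cldr_names[code]
    return iana_names.get(code)
```

For example, with illustrative stand-in data (not real extracts from either source), `english_name("ccp", {"rm": "Romansh"}, {"ccp": "Chakma"})` falls through to the IANA dictionary.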

Mar 14 2019, 2:53 PM · MW-1.36-notes (1.36.0-wmf.34; 2021-03-09), Patch-For-Review, Wikidata-Campsite (Wikidata-Campsite-Iteration-∞ (On Hold)), UX-Debt, UniversalLanguageSelector, WMDE-Design, Design, Story, Wikidata-Sprint-2016-04-26, Wikidata-Sprint-2016-03-01, Wikidata
Sascha added a comment to T124758: [Story] Show all available languages in monolingual text value's suggester.

The easiest way to add a new language to CLDR is preparing ‘seed’ files in XML format;

Mar 14 2019, 7:00 AM · MW-1.36-notes (1.36.0-wmf.34; 2021-03-09), Patch-For-Review, Wikidata-Campsite (Wikidata-Campsite-Iteration-∞ (On Hold)), UX-Debt, UniversalLanguageSelector, WMDE-Design, Design, Story, Wikidata-Sprint-2016-04-26, Wikidata-Sprint-2016-03-01, Wikidata

Feb 28 2019

MichaelSchoenitzer awarded T213535: Normalize loudness a Love token.
Feb 28 2019, 12:17 PM · Lingua Libre

Feb 8 2019

Sascha added a comment to T210293: Add language codes rm-rumgr, rm-sursilv, rm-surmiran, rm-sutsilv, rm-vallader, rm-puter for Lexemes.

Friendly ping, is there anything I can do to help with this ticket?

Feb 8 2019, 1:47 PM · Language codes, MW-1.34-notes (1.34.0-wmf.3; 2019-04-30), User-Michael, Wikidata Lexicographical data, Wikidata-Campsite (Wikidata-Campsite-Iteration-∞ (On Hold)), Wikidata

Feb 6 2019

Sascha created T215438: Aggregate pageviews to Wikidata entities.
Feb 6 2019, 5:00 PM · Data-Engineering

Feb 2 2019

Sascha created T215098: Playing audio for the second time always stutters.
Feb 2 2019, 9:06 AM · Kaltura player

Jan 11 2019

Sascha added a comment to T210293: Add language codes rm-rumgr, rm-sursilv, rm-surmiran, rm-sutsilv, rm-vallader, rm-puter for Lexemes.

@GerardM, is there anything I can do to help with this ticket? There’s a sizable Romansh dictionary whose data can be donated to Wikidata, but this is currently blocked on this ticket. (Try an exact search for a few German words, e.g. “Hund” or “Gelbsucht”, to see how the words differ across the variants of the Romansh language).

Jan 11 2019, 8:54 PM · Language codes, MW-1.34-notes (1.34.0-wmf.3; 2019-04-30), User-Michael, Wikidata Lexicographical data, Wikidata-Campsite (Wikidata-Campsite-Iteration-∞ (On Hold)), Wikidata
Sascha updated subscribers of T208641: Improve categorisation for languages that do not have ISO 639-3 code.

For languages that have no language code yet, perhaps Lingua Libre could use “mis-x-Q12345” (where Q12345 would be the Wikidata item for the language of the pronunciation audio). That would be a syntactically valid IETF BCP 47 tag, and you wouldn’t lump unrelated languages into the same category. Once the language does get a code, some bot could change the categories of uploaded files on Wikimedia Commons. @GerardM, what do you think?
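
Building such a tag could be sketched as follows in Python; the function name is made up for this example. One caveat worth checking: BCP 47 limits each private-use subtag to 8 alphanumeric characters, so this sketch rejects item IDs longer than that instead of emitting an invalid tag.

```python
import re

# A Wikidata item ID: "Q" followed by a positive integer without leading zeros.
_QID = re.compile(r"Q[1-9][0-9]*")

# BCP 47 private-use subtags are 1 to 8 alphanumeric characters long.
_MAX_SUBTAG_LENGTH = 8


def private_use_tag(qid):
    """Return a tag of the form "mis-x-Q12345" for a Wikidata item ID."""
    if not _QID.fullmatch(qid):
        raise ValueError("not a Wikidata item ID: %r" % qid)
    if len(qid) > _MAX_SUBTAG_LENGTH:
        raise ValueError("too long for a private-use subtag: %r" % qid)
    return "mis-x-" + qid
```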

Jan 11 2019, 8:45 PM · Lingua Libre
Sascha created T213576: Display a warning when entering Zawgyi-encoded Burmese.
Jan 11 2019, 8:26 PM · Wikidata
Sascha added a comment to T213556: Detect Zawgyi encoding for Burmese strings.

Sorry, here’s the correct link to the Unicode FAQ about Zawgyi: https://www.unicode.org/faq/myanmar.html

Jan 11 2019, 4:13 PM · Lingua Libre
Sascha created T213556: Detect Zawgyi encoding for Burmese strings.
Jan 11 2019, 4:12 PM · Lingua Libre
Sascha created T213535: Normalize loudness.
Jan 11 2019, 11:16 AM · Lingua Libre
Sascha created T213534: Compress audio before uploading?.
Jan 11 2019, 11:08 AM · Lingua Libre
Sascha added a comment to T208641: Improve categorisation for languages that do not have ISO 639-3 code.

Have you considered using IETF BCP 47 language tags instead of ISO 639-3? Every language with an ISO code also has an IETF code (usually the same, since IETF draws on ISO 639 among others). But unlike ISO 639, IETF tags let you make finer-grained distinctions. That’s why all the internet standards (such as HTTP, HTML, XML) use IETF BCP 47 instead of ISO 639. For example, Brazilian Portuguese, Sursilvan and Zürich German have IETF language tags but no ISO code. If LinguaLibre is asked to support languages without an IETF code, you can request the addition of a language tag.

Jan 11 2019, 10:53 AM · Lingua Libre

Jan 8 2019

Sascha added a comment to T210311: Add monolingual language code ccp for Chakma.

Agree. The Chakma language is sometimes written in other scripts than the Chakma writing system, such as Bengali or Latin, but this seems to be rare. (In the future, other writing systems will probably get used more rarely than today, because support for the Chakma writing system is getting rolled out to modern computer operating systems only now). In the Unicode CLDR project, we’ve therefore made Cakm the default script for language ccp; see the line <likelySubtag from="ccp" to="ccp_Cakm_BD"/> in likelySubtags.xml. Also, in Unicode CLDR, all Chakma translations are currently kept in the Chakma writing system; we haven’t received any requests to support (in CLDR) the Chakma language ccp in other writing systems than Cakm. Just a data point; not sure if/how this matters for Wikimedia.

Jan 8 2019, 9:52 AM · Language codes, MW-1.35-notes (1.35.0-wmf.8; 2019-11-26), Wikidata-Campsite (Wikidata-Campsite-Iteration-∞ (On Hold)), User-Kizule, good first task, Wikidata
Sascha added a comment to T210293: Add language codes rm-rumgr, rm-sursilv, rm-surmiran, rm-sutsilv, rm-vallader, rm-puter for Lexemes.

Friendly ping?

Jan 8 2019, 9:12 AM · Language codes, MW-1.34-notes (1.34.0-wmf.3; 2019-04-30), User-Michael, Wikidata Lexicographical data, Wikidata-Campsite (Wikidata-Campsite-Iteration-∞ (On Hold)), Wikidata
Sascha added a comment to T210311: Add monolingual language code ccp for Chakma.

Friendly ping?

Jan 8 2019, 9:06 AM · Language codes, MW-1.35-notes (1.35.0-wmf.8; 2019-11-26), Wikidata-Campsite (Wikidata-Campsite-Iteration-∞ (On Hold)), User-Kizule, good first task, Wikidata

Nov 30 2018

Sascha added a comment to T210293: Add language codes rm-rumgr, rm-sursilv, rm-surmiran, rm-sutsilv, rm-vallader, rm-puter for Lexemes.

Given that the codes should adhere to standards, what is the basis for these codes?

Nov 30 2018, 8:29 PM · Language codes, MW-1.34-notes (1.34.0-wmf.3; 2019-04-30), User-Michael, Wikidata Lexicographical data, Wikidata-Campsite (Wikidata-Campsite-Iteration-∞ (On Hold)), Wikidata
Sascha added a comment to T210293: Add language codes rm-rumgr, rm-sursilv, rm-surmiran, rm-sutsilv, rm-vallader, rm-puter for Lexemes.

Place names can vary by variant, for example St. Moritz [de] = San Murezzan [rm-rumgr] = Sogn Murezi [rm-sursilv] = San Murezi [rm-sutsilv] = Son Murezzi [rm-surmiran]. But that’s a multilingual label, not a monolingual text statement.

Nov 30 2018, 6:05 AM · Language codes, MW-1.34-notes (1.34.0-wmf.3; 2019-04-30), User-Michael, Wikidata Lexicographical data, Wikidata-Campsite (Wikidata-Campsite-Iteration-∞ (On Hold)), Wikidata

Nov 28 2018

Sascha added a comment to T210293: Add language codes rm-rumgr, rm-sursilv, rm-surmiran, rm-sutsilv, rm-vallader, rm-puter for Lexemes.

@Nikki, do you know where/how I should request adding these BCP47 tags so they become available for Wikidata lexemes? I have a large dictionary in multiple Romansh variants, which I’d like to import into Wikidata lexemes, so this isn’t just an academic request.

Nov 28 2018, 1:56 PM · Language codes, MW-1.34-notes (1.34.0-wmf.3; 2019-04-30), User-Michael, Wikidata Lexicographical data, Wikidata-Campsite (Wikidata-Campsite-Iteration-∞ (On Hold)), Wikidata

Nov 24 2018

Sascha added a comment to T210311: Add monolingual language code ccp for Chakma.

Sounds good.

Nov 24 2018, 7:40 AM · Language codes, MW-1.35-notes (1.35.0-wmf.8; 2019-11-26), Wikidata-Campsite (Wikidata-Campsite-Iteration-∞ (On Hold)), User-Kizule, good first task, Wikidata

Nov 23 2018

Sascha created T210311: Add monolingual language code ccp for Chakma.
Nov 23 2018, 10:49 PM · Language codes, MW-1.35-notes (1.35.0-wmf.8; 2019-11-26), Wikidata-Campsite (Wikidata-Campsite-Iteration-∞ (On Hold)), User-Kizule, good first task, Wikidata
Sascha added a comment to T210293: Add language codes rm-rumgr, rm-sursilv, rm-surmiran, rm-sutsilv, rm-vallader, rm-puter for Lexemes.

For illustration, here’s the English word ‘dog’ in various variants of Romansh:

  • rm-rumgr: chaun
  • rm-sursilv: tgaun
  • rm-sutsilv: tgàn
  • rm-surmiran: tgang
  • rm-puter: chaun
  • rm-vallader: chan
Nov 23 2018, 10:38 PM · Language codes, MW-1.34-notes (1.34.0-wmf.3; 2019-04-30), User-Michael, Wikidata Lexicographical data, Wikidata-Campsite (Wikidata-Campsite-Iteration-∞ (On Hold)), Wikidata
Sascha created T210293: Add language codes rm-rumgr, rm-sursilv, rm-surmiran, rm-sutsilv, rm-vallader, rm-puter for Lexemes.
Nov 23 2018, 2:45 PM · Language codes, MW-1.34-notes (1.34.0-wmf.3; 2019-04-30), User-Michael, Wikidata Lexicographical data, Wikidata-Campsite (Wikidata-Campsite-Iteration-∞ (On Hold)), Wikidata