Fri, Feb 21
Thanks @kaldari. Yeah, "Regional society" and "Regional geography" and "Regional interest" were intentionally general, sort of as last-ditch categories when there weren't higher confidence topic assignments. Part of this was to avoid false positive topic assignment. I like the set intersection notion @Halfak suggests. For analytic needs, agreed in general on how it would be ideal to have a little more precision on the probable most salient topic cluster if a small enough set of certain clusters emerges regularly (and I'm pretty sure they do).
Mon, Feb 10
Jan 21 2020
Thanks @Bstorm. Please advise when I should re-check.
Jan 20 2020
Jan 16 2020
@Halfak Is it possible to enable this model in production, but dark launch it or mark that part of the API as unstable?
Dec 21 2019
I did another run, and pointed to the details of forming the data set at https://github.com/dr0ptp4kt/dr0ptp4kt.github.io/blob/master/topic-20191211.ipynb. Some of the scripts run out of band, which are referenced in the notebook commentary, have been copied into the same directory as this notebook in the repo.
Dec 6 2019
After the update to iOS 13.2.3 the problem went away. So I guess we'll chalk this one up to a browser or OS bug fixed by an OS upgrade.
Thanks. I hadn't yet been prompted to upgrade from iOS 13.1.3 (it seems sometimes the carrier backplane configuration is a little slow on that for initiating the prompt), but will try on 13.2.3 and report back later (gotta run now). Meantime, see the attached animated GIF for w:Main Page, followed by a timeline recording captured a little later. Both from pristine freshly instantiated app and browser instances in private mode. In my timeline inspector it implies that initial handshake is around 500ms (although for some reason I can't seem to hover on it). Anyway, looks pretty fast. I tried a larger article on a presidential candidate and that didn't seem to be a problem.
It seems (?) like all of the network transfer (including TLS handshake, first page, subsequent assets) from this timeline recording happens within 1.5 seconds. That part's okay (even though I wish TLS handshaking and the networking subsystem went faster!). The waiting for layout part, though, at about 8 seconds, seems like it may relate to something with style recalculation, no? The timeline doesn't suggest anything unusual in terms of CPU usage (not sure about memory pressure, but I typically only have one app active).
Dec 3 2019
Dec 2 2019
Kicking over to Mark as manager of the Structured Data engineering team, where maintenance of FKA Multimedia team projects is performed.
Nov 26 2019
@cchen copying you in for visibility. iPad iOS 13 is a desktop UA, in case that's useful info in other contexts.
Access requested for me - dr0ptp4kt
Nov 20 2019
Hi team, I found one reviewer somewhat serendipitously. @pmiazga will provide a review of https://gerrit.wikimedia.org/r/#/c/mediawiki/extensions/EventStreamConfig/+/545654/ and this has been okay'd with @ovasileva provided it flows through the Web team's workboard per usual process. @pmiazga see T233634#5664331 for the context - much of the architectural specification has already been worked out over the past months, but @Ottomata is looking for conformance to good MediaWiki PHP conventions and sensible API endpoint and field names.
Nov 14 2019
@Tgr do you know if this one is a simple fix or does it look more complicated? I incidentally was trying the API out this morning and happened to come across this task.
Nov 13 2019
@Halfak, ah, thanks, here you go:
Nov 12 2019
Upon second thought, I'm not sure this really needs mwparserfromhell just yet. No point letting the perfect be the enemy of the good, either. Patch posted ^, please do let me know in case of any questions.
Nov 6 2019
Nov 4 2019
While <textarea> based editing ought to continue to work without JS, yeah, getting an understanding of how much editing happens that way without JS would be helpful as a proxy.
Nov 1 2019
I'd like to thank everyone who contributed to this discussion and commend the Web team on taking into consideration the tradeoffs. Looking forward to the results ahead.
Oct 30 2019
@MarkTraceur I filed this request - you should probably be the one to manage the request given you'll be project managing the contractor, but let's discuss. Feel free to edit the request, as I may not be thinking holistically enough...or I may be thinking too holistically!
Oct 29 2019
Pasting in an email I sent:
Oct 27 2019
@Yurik good point. I think the guidance for use of the functionality will necessarily need to note such complications. I imagine instrumentation would be the fastest path to flagging particularly slow cases for remediation.
Oct 25 2019
@Yair_rand agreed the relatively larger JS component size is not ideal. We'll need to compensate with deferred/lazy loading. As for the comparison of the different media types, also agreed on the notion that there's a qualitative difference between static rasters of graphs that wouldn't have follow-on interactivity and plays of video/audio - this is more to note that JS as a requirement for some non-wikitext content is somewhat normal. Video and 3D have a raster, too, it's just that the stuff is intended to be played to enrich the experience. It's certainly a tradeoff around maintenance cost versus the benefits of inbuilt static rendering.
Oct 23 2019
@kzimmerman possibly, depending on time pressure. I just set up a meeting for tomorrow to discuss with you and Connie.
Oct 14 2019
Oct 8 2019
For those still following along: I haven't forgotten about this. Looking at this a bit more, I think mwparserfromhell parsing of templates will improve this hierarchy a bit further in addition to relaxing the regex. I'll be interested in upstreaming this stuff hopefully sometime this Q2 or next.
@Milimetric the visual treatment depends on a few factors, although yes, I think we'll want a placeholder to account for any deferred / lazy fetch or client compute timing things.
Thanks @kzimmerman - yeah, some cursory review of the "predicted" values and whether they're approximately sensible and whether addition of stuff of this nature to Hive for a starting point might be of use would be most appreciated!
Sep 25 2019
I'm about to start the work to derive country from Infobox settlement bearing subjects (effectively replacing the Geography.* mid-level category assignment for such subjects), as that's something we're exploring for the purpose of counting pageviews by topic in a little more fine tuned way as a starting point, but meantime here's what the heuristic output looks like.
Sep 20 2019
Hello all. We're going to turn this into a client-side feature and divest of the server side rendering componentry.
Sep 13 2019
Sep 12 2019
The intent here was to capture the material gathered via T102318 (Convert startup blacklist to feature test), but with richer detail and on an ongoing basis.
Sep 7 2019
Sep 5 2019
We're mulling this over still.
Aug 30 2019
Aug 29 2019
Thanks! Okay, expanding the heap size helped here. Thank you for the offer of assistance as well!
Aug 27 2019
Aug 22 2019
Okay, let me look into this.
@ovasileva okay to redirect Safari desktop to mdot?
Aug 2 2019
Here's approximately what I had in mind. The mwparserfromhell library might streamline extraction of key-value pairs, but this gets at the typical whitespacing patterns and it seems editors typically follow convention for parameter ordering.
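For readers following along, here is a minimal sketch of the kind of regex-based extraction described above. It is not the original snippet; the pattern, the `extract_params` name, and the sample wikitext are illustrative assumptions. It relies on the convention of one `| key = value` pair per line and does not handle nested templates or multi-line values, which is where mwparserfromhell would likely do better.

```python
import re

# Hypothetical pattern: one "| key = value" pair per line, tolerant of the
# usual spacing around the pipe and the equals sign.
PARAM_RE = re.compile(
    r"^[ \t]*\|[ \t]*([^=|\n]+?)[ \t]*=[ \t]*(.*?)[ \t]*$",
    re.MULTILINE,
)


def extract_params(wikitext):
    """Return a dict of template parameter names to raw values."""
    return {key: value for key, value in PARAM_RE.findall(wikitext)}


# Illustrative sample only; not taken from a real article.
sample = """{{Infobox settlement
| name              = Rhinebeck
| subdivision_type  = Country
| subdivision_name  = United States
}}"""

params = extract_params(sample)
print(params["subdivision_name"])
```

Because later keys overwrite earlier ones in the dict, a page with several templates on it would need the text split per template before calling this.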
Aug 1 2019
@Pchelolo I may have missed it but has a standardized persistence mechanism been defined?
Jul 31 2019
Jul 25 2019
I'm in support of permanent stashing.
Jul 22 2019
Thanks @santhosh and @Pginer-WMF. Lydia and I spoke. Just to close the loop on one of my earlier questions, template key-to-Wikidata property association does not clash with Wikidata client editing plans as of now. Lydia also reinforced the value of data mining in working through potential futures (some of which is happening I see). There are no simple approaches, to be sure, but I did want to close the loop on this one item.
Jul 18 2019
Hi team, just to follow up on what I've let some of you know by email, I'm going to investigate the option of finite temp resources to help handle the more pressing item on EOL to buy us more time. We'll need to scope narrowly so that we address the EOL problems at a minimum and hopefully sidestep any further challenges like this in the future.
Jul 12 2019
Jul 11 2019
Approved as Engineering Director.
I'm in support of the canonical identifier of template parameter keys being Wikidata properties in the cases they're available. I lean toward TemplateData for the time being as the place to forge the connection between template keys and Wikidata properties. As people inevitably copy TemplateData templates across wikis in the current state that ought to make for more consistency on the semantic meaning of the templates, at least.
Jul 10 2019
Jul 9 2019
@chelsyx I forget some of the details, but I think it's okay to allow the multiple domains to be the referrer for the pageviews.
Jun 19 2019
Hi all. There isn't further work on this. Right now the Multimedia team is focused on the Structured Data on Commons project, for which the grant runs until the end of the current calendar year. Presently additional work on stuff outside of SDC or urgent bugfixing or adding test coverage is generally lower priority. Tagging @Ramsey-WMF for any further prioritization or work decomposition if time becomes available in the new calendar year.
Jun 18 2019
Jun 17 2019
Thank you, @chelsyx, great work! Is there a Phabricator paste or Jupyter notebook with the queries and results for our future selves?
Jun 7 2019
Hi @eranroz. Heads up, I'm on time off so my reply may be delayed a couple weeks.
May 23 2019
I was wondering, how well does the parameter name-based approach apply to the set of TemplateData-backed templates themselves?
May 22 2019
One of my hopes is that people will be concentrated together on topics. I'd prefer single track with a well structured format if at all possible.
May 16 2019
May 15 2019
Thanks @kzimmerman. All the signals I've seen and that we could think of seem to suggest there's a positive traffic increase attributable to the intervention. My gut instinct given typical user behavior on search engine result pages and the nature of this intervention is that there's very little cannibalization, no detrimental cannibalization, and in fact there's a real boost to content availability and consumption.
May 13 2019
@elukey thanks for the follow up here. No need to block on me for the GPU. Fully agreed on the need for a secure supply chain.
May 3 2019
As an example, https://upload.wikimedia.org/wikipedia/commons/thumb/7/7f/NY_308_in_Rhinebeck_4.jpg/800px-NY_308_in_Rhinebeck_4.jpg maps to https://ms-fe.svc.eqiad.wmnet/wikipedia/commons/thumb/7/7f/NY_308_in_Rhinebeck_4.jpg/800px-NY_308_in_Rhinebeck_4.jpg in the cluster if fetching directly from stat1005.
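A rough sketch of that mapping, assuming it amounts to swapping the hostname and leaving the path untouched. That is all the example above shows, so treat the generalization as an assumption. The `to_internal_url` name is hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit

PUBLIC_HOST = "upload.wikimedia.org"
INTERNAL_HOST = "ms-fe.svc.eqiad.wmnet"  # host taken from the example above


def to_internal_url(public_url):
    """Swap the public upload host for the in-cluster one, keeping the path."""
    parts = urlsplit(public_url)
    if parts.netloc != PUBLIC_HOST:
        raise ValueError("not an upload.wikimedia.org URL: " + public_url)
    # Assumption: scheme, path and query carry over unchanged.
    return urlunsplit(parts._replace(netloc=INTERNAL_HOST))


public = (
    "https://upload.wikimedia.org/wikipedia/commons/thumb/7/7f/"
    "NY_308_in_Rhinebeck_4.jpg/800px-NY_308_in_Rhinebeck_4.jpg"
)
print(to_internal_url(public))
```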
May 1 2019
Apr 30 2019
To clarify, was that via the internet or internal cluster?