Fri, Dec 14
@nettrom_WMF Are you using IPython alone or within Jupyter?
@Reedy Thanks for posting on the talk page. I hope that helps some folks figure out their issue if they run into it.
Thu, Dec 13
@chasemp I've linked the accounts now.
The upshot of all of this is the code will live in MediaWiki Core.
Wed, Dec 12
Tue, Dec 11
I wonder if we could have access to turn this on for the grantmetrics database: https://mariadb.com/kb/en/library/slow-query-log-overview/
Mon, Dec 10
Is this the RSVG we are talking about? https://www.npmjs.com/package/rsvg
Wed, Dec 5
Thanks @Aklapper. I'm still learning about the layers of communication that exist around here.
Tue, Dec 4
Is there community input required to +2 @Reedy's patch for MW core? If so, we should start that now. I guess that would happen on Meta?
Fri, Nov 30
Noting here that we are no longer going to provide the Contributions report.
Interesting. It does make me wonder if we should message the user about partial translations existing. Had we decided previously that we wouldn't let users choose partial languages from which to translate? It makes sense because if there's no text then the translator doesn't know what to write.
Thu, Nov 29
This is primarily a socialization task because this usage in the core code is quite small. Seems like this:
Nov 16 2018
Hahahaha. I can make a ticket.
Nov 15 2018
@Niharika Good question. We can make a task. It may be working the way they want it to?
I've made further comments on the PR on GitHub.
Looks good on my browser.
Nov 13 2018
I wouldn't call it a beta personally but I take your point.
It's the first iteration of such a tool. Future iterations may get closer to *all* than this first iteration will.
I want to point out that the word "report" here might be problematic. We aren't going to deliver a report but will provide the actual stored data. So, it's not just a list but is more like a data dump.
Nov 12 2018
As this is merged, should it be marked "Done" or "Product Review"?
Nov 11 2018
The decision has already been made to go with JSON. The original confusion came from a misunderstanding about whether we needed a human-readable or a machine-readable format.
Nov 9 2018
@jmatazzoni Thanks for getting this all cleaned up.
Nov 8 2018
Here's an example query for "cott" in the API Sandbox: https://commons.wikimedia.org/wiki/Special:ApiSandbox#action=query&format=json&list=search&meta=&srsearch=cott%20filetype%3Adrawing&srnamespace=6
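For anyone who wants to reproduce that sandbox query outside the browser, here is a rough sketch that builds the equivalent request URL. It assumes the standard Action API endpoint at /w/api.php on Commons; the helper name build_search_url is made up for illustration, and the parameters are copied from the sandbox link above.

```python
from urllib.parse import urlencode

# Assumed endpoint: the standard Action API path on Commons.
API_ENDPOINT = "https://commons.wikimedia.org/w/api.php"


def build_search_url(term, namespace=6):
    """Build a list=search URL mirroring the API Sandbox query (hypothetical helper)."""
    params = {
        "action": "query",
        "format": "json",
        "list": "search",
        # Same filter as the sandbox link: restrict results to drawings.
        "srsearch": f"{term} filetype:drawing",
        # Namespace 6 is the File namespace.
        "srnamespace": namespace,
    }
    return f"{API_ENDPOINT}?{urlencode(params)}"


url = build_search_url("cott")
print(url)
```

Fetching that URL with any HTTP client should return the same JSON the sandbox shows, which makes it easier to compare results for different search terms.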
@Niharika We are using the search API. We can take a look but it's possible these are just the vagaries of the search API and not much can be done about it by us.
Thanks for the clarification @Nuria.
It seems to fall into the same class as our other discussion about isBlocked and sitewide. We've changed the whole paradigm so we have to have a more holistic view of what the desired outcomes are and not just what this small block of code does.
Nov 7 2018
I commented on the PR. There seems to be something wrong with the asset loading. Possibly it's just me?
Reviewed and had one small question/comment.
Max thinks the page save latency graph might be the best measure here.
Nov 6 2018
It can be left behind. It's a "nice to have."
I would say yes. But it could likely be part of taking what we learned from the SVG Symfony bundle work and applying it to Event Metrics.
@jmatazzoni My feeling is that this can stay as one task and that investigation is probably limited to identifying the sources of data. I honestly don't believe CSV conversion is an issue.
That works for me.
Nov 5 2018
That's exactly what it would look like if we don't process it. However, we could pluck out the "usercontribs" bit maybe to clean it up some. And do something similar for other kinds of data.
@Milimetric I made the mistake of posting before reading all the comments or it would have been obvious from your earlier comment that you likely tried that already.
@colewhite I think you tagged the wrong Phab task in your commit.
Can we release this wiki-by-wiki in case there are performance issues?
@jmatazzoni I think this should just be in the backlog for now. No estimate (yet) and not a blocker for release.
@jmatazzoni JSON is human-readable and has structure. We could quibble over whether a user should be able to easily import into Excel which CSV provides and JSON doesn't (at least not easily). The difference is a "letter" vs "spirit" discussion.
Nov 2 2018
I'd go with making it the width of the text.
Have we tried breaking this into two queries?
- Get the page IDs that are in the category and the timeframe.
- Query the revisions to those page IDs.
At least then we'd have a good sense which part of this is particularly slow.
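A rough sketch of what the two-step split could look like, timing each step separately. This uses an in-memory SQLite database with a heavily simplified stand-in schema (the table and column names are borrowed from MediaWiki but the data, the "Example" category, and the timed helper are all invented for illustration), so it only shows the shape of the approach, not the real queries.

```python
import sqlite3
import time

# Simplified stand-in schema and sample data (assumptions, not the real tables).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE categorylinks (cl_from INTEGER, cl_to TEXT);
    CREATE TABLE revision (rev_id INTEGER PRIMARY KEY, rev_page INTEGER, rev_timestamp TEXT);
    INSERT INTO categorylinks VALUES (1, 'Example'), (2, 'Example'), (3, 'Other');
    INSERT INTO revision VALUES
        (10, 1, '20181101000000'),
        (11, 1, '20181102000000'),
        (12, 2, '20180101000000'),
        (13, 3, '20181101120000');
""")


def timed(label, sql, params):
    """Run a query and print how long it took (hypothetical helper)."""
    start = time.perf_counter()
    rows = conn.execute(sql, params).fetchall()
    print(f"{label}: {len(rows)} rows in {time.perf_counter() - start:.6f}s")
    return rows


start_ts, end_ts = "20181101000000", "20181103000000"

# Step 1: page IDs that are in the category and were edited in the timeframe.
page_rows = timed(
    "step 1 (page IDs)",
    """SELECT DISTINCT cl_from FROM categorylinks
       JOIN revision ON rev_page = cl_from
       WHERE cl_to = ? AND rev_timestamp BETWEEN ? AND ?""",
    ("Example", start_ts, end_ts),
)
page_ids = [row[0] for row in page_rows]

# Step 2: the revisions to those page IDs within the same timeframe.
placeholders = ",".join("?" for _ in page_ids)
revisions = timed(
    "step 2 (revisions)",
    f"""SELECT rev_id, rev_page FROM revision
        WHERE rev_page IN ({placeholders}) AND rev_timestamp BETWEEN ? AND ?
        ORDER BY rev_id""",
    (*page_ids, start_ts, end_ts),
)
```

Running both steps against the real database with timing around each one would show whether the category lookup or the revision scan dominates.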
Nov 1 2018
I don’t know how $preparedParams works exactly, but the names used don’t seem to match the replacement tokens, i.e., int_id != :id and int_lastId != :lastId.
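To illustrate the kind of mismatch I mean, here is a small sketch that compares the named tokens in a query against the keys of the parameter array. The SQL string and the find_token_mismatches helper are invented for the example, and the regex is deliberately naive, so this is not how $preparedParams itself behaves.

```python
import re


def find_token_mismatches(sql, params):
    """Return (tokens with no matching param, params with no matching token).

    Naive check: treats any ":name" in the SQL as a replacement token.
    """
    tokens = set(re.findall(r":([A-Za-z_]\w*)", sql))
    names = set(params)
    return sorted(tokens - names), sorted(names - tokens)


# Hypothetical query and parameter names modeled on the ones in the patch.
sql = "SELECT * FROM t WHERE id = :id AND last_id > :lastId"
prepared = {"int_id": 5, "int_lastId": 10}

missing, unused = find_token_mismatches(sql, prepared)
print("tokens without a param:", missing)
print("params without a token:", unused)
```

If the parameter layer strips a type prefix such as int_ before binding, then the names would line up after all, which is the part I am unsure about.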
I reckon this is OOUI territory but it seems like we'd need to decide what action/character means you are done with that item. Spaces won't work because pages have spaces in the name, right? Some apps that do this with tags use the comma as a signal. At any rate, blurring the field should submit these at a minimum.
The way I understood this is that with the current architecture, yes, the process would terminate. However, we intend to make this a background process as part of the job queue work we've discussed. Once that is available, this could become a queued process that wouldn't require the user to stay on the page.
Oct 31 2018
I am taking a look at this from the Partial Blocks perspective just in case.
Thanks for all the info. We are looking into this right now.
Oct 30 2018
PHP does have a word count function. Sure, it won't skip wikitext but it would be consistently wrong and maybe that would suffice?
I'd vote for making CSSJanus another task and considering this one closed as the bundle is working.
@Reedy Thanks for the additional details. It seems like the bulk of the work will be in ensuring the messaging to the users is correct and appropriate.
Oct 29 2018
Are we stalled on this? Seems like most of it is ready to go.
If we are counting wikis and there's no notion of time left, then it seems fine.
Created T208246 to track actual dev work. I believe this investigation is complete.
@Reedy Thanks for pointing us to that and for getting that work done. I'll update the description here to not include that work.
I don't think we can accurately build an actual timer. To do so, we'd need to know how long the process will take and what portion has already been completed.
Oct 26 2018
That's good to hear. At least we know that policies can't be created that would override the ones we'd like to modify. I didn't read that code too closely so thanks for verifying this.
The defaults for password policy are here in DefaultSettings.php.
Should or could this be part of our standard ToolForge Docker Symfony Bundle Container Thingie?
Oct 25 2018
We could do it but I don't think we should consider it our responsibility directly.
Oct 24 2018
@Samwilson I added a few comments.
Oct 23 2018
It's here: https://github.com/wikimedia/grantmetrics
Oct 17 2018
@Samwilson Thanks for clarifying that about the files. I wasn't sure if there'd be SVGs or PNGs lying around.
Oct 16 2018
Get the size of the most recent revision and the size of the oldest revision in the given time period, then subtract the oldest from the newest. That's the total bytes changed on that page during the event.
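As a minimal sketch of that calculation, assuming we already have each revision's timestamp and size for one page within the event period (the bytes_changed name and the sample data are made up):

```python
def bytes_changed(revisions):
    """Newest size minus oldest size for one page.

    revisions: list of (timestamp, size_in_bytes) tuples that fall
    within the event period. Timestamps are assumed to sort
    chronologically as strings (e.g. YYYYMMDDHHMMSS).
    """
    if not revisions:
        return 0
    ordered = sorted(revisions)
    oldest_size = ordered[0][1]
    newest_size = ordered[-1][1]
    return newest_size - oldest_size


# Invented sample data for a single page, deliberately out of order.
sample = [
    ("20181016120000", 1500),
    ("20181016090000", 1200),
    ("20181016100000", 1350),
]
print(bytes_changed(sample))  # 300
```

Note that this gives the net difference between the two endpoints, so edits that were later reverted within the period cancel out, and a page that shrank produces a negative number.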
@Niharika Yes, I think per-label is much more feasible. If we build toward that ideal and we find performance issues, we can back off but I wouldn't anticipate that.
I just looked at some of the raw data instead of the counts and I need to redo these queries. It's definitely not correct.
Our current implementation plan is that all SVG reading and writing happens in the back end. If you wanted to render a preview on every character typed, that is a LOT of round-trips to the server and many, many unused files lying around. That leads me to say that I don't think it's feasible.
@TBolliger Take a look and see if this is the data you were expecting: https://docs.google.com/spreadsheets/d/1ORQKUYUa0XaCDCfB-2tJGSlyXBopRkF0k5JbWu6OKcQ/edit?usp=sharing
Way out of scope for this task but this kind of data and the slices you want to look at would be PERFECT for an ELK stack. You could even have a dashboard that showed these numbers in real-time.