Isaac (Isaac Johnson)
Research Scientist

Projects

Policy-Admins
Group
Trusted-Contributors
Group

User Details

User Since: Oct 1 2018, 2:19 PM (308 w, 5 d)
Availability: Available
IRC Nick: isaacj
LDAP User: Isaac Johnson
MediaWiki User: Isaac (WMF) [ Global Accounts ]

Recent Activity

Fri, Aug 30

Isaac added a comment to T369120: Determine evaluation strategy for article-country model.

Weekly updates:

Starting to put together a more formal set of articles for evaluation. As I've said before, this is challenging because of how broad the space is. If you think about the user experience of "quality" for this model, I assume most will be using it only ever for a specific country (probably where they live) and specific language (their "home" wiki). So their perception of quality will be for that very narrow slice of the model. If we were to provide coverage for all of these entrypoints, however, that would be ~300 languages x 250 countries, which is 75,000 prediction spaces where you'd want at least 10 to get a sense and ideally 50 outputs evaluated to know robustly whether the model will perform well for that user need. Obviously that is way too many model outputs for individuals to evaluate. Even just doing a separate sample for every country on a single wiki would be several thousand articles that would need evaluation. So we have to be much more efficient about how we choose areas to focus on that should hopefully leave us with a good sense of model performance while catching perhaps some of these edge-cases where a particular wiki+country intersection might have issues so we can do something about it.
My goal with this evaluation is two-fold: 1) get enough general data to convince Product stakeholders that the model is good enough to proceed, and, 2) identify any areas where the model is making mistakes that we can actually fix in some way. Eventually, we might have specific language communities who want to do some additional evaluation before enabling the model in some way, but that would be addressed through much more focused evaluations related to whatever their concerns are -- i.e. not the goal here.
My proposed approach (I'll talk with Miriam about this when she's back) is as follows:
- Get someone else to look at my random sample of 50 articles (and maybe extend to 100 to be a bit more robust). This will be the top-level metric that we share that says overall how well the model performs.
- Do a more focused evaluation of a few countries where we're seeing high rates of predictions coming from the wikilinks alone (this is the most uncertain signal in the model and so where false positives are most likely to occur). I put some data on the proportion of predictions for a given country (across all language editions) that were only from wikilinks below. I'll probably draw a small sample (~10) from each country that's above e.g., 20% (0.2) of predictions based on wikilinks (so everything down to Papua New Guinea) and check those to see if any clear issues are identified. I will note that, given the way that the topic keywords work on Search, we could easily set it so country predictions coming from categories/wikidata are weighted more heavily and so show up higher in the search results. This would reduce the likelihood of false positives being shown to the users if the wikilinks are generating any.
- Do a more focused evaluation of English Wikipedia as well. This is because Content Translation is a likely user of these model outputs and many translations start from English Wikipedia as the source, so the country filter is much more likely to be applied in that setting. This will likely be a simple random sample too. When I did the proportion-from-wikilinks for just English Wikipedia to see if any outliers stood out, it looked pretty similar to the ranking below (not surprising given size of English Wikipedia).
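
A rough sketch of the scale argument and the wikilinks-based selection described above. The figures (~300 languages, 250 countries, 10 to 50 outputs per slice, a 20% threshold, ~10 articles per country) come from the comment; the function names and the shape of the inputs are hypothetical and are not the actual pipeline.

```python
import random

# Rough figures taken from the comment above.
N_LANGUAGES = 300
N_COUNTRIES = 250


def evaluation_budget(per_slice):
    """Outputs to review if every language x country slice got its own sample."""
    return N_LANGUAGES * N_COUNTRIES * per_slice


def countries_to_review(wikilink_only_share, threshold=0.2):
    """Countries whose share of wikilink-only predictions is at or above the threshold.

    wikilink_only_share is a hypothetical {country: proportion} mapping.
    """
    return sorted(
        (c for c, share in wikilink_only_share.items() if share >= threshold),
        key=lambda c: wikilink_only_share[c],
        reverse=True,
    )


def draw_samples(articles_by_country, countries, k=10, seed=0):
    """Draw up to k articles for each selected country (hypothetical helper)."""
    rng = random.Random(seed)
    return {
        c: rng.sample(articles_by_country[c], min(k, len(articles_by_country[c])))
        for c in countries
    }


print(evaluation_budget(10))  # 750000
print(evaluation_budget(50))  # 3750000
```

Even at 10 outputs per slice the exhaustive approach needs 750,000 reviews, which is why the selection step narrows the focus to countries where the least certain signal dominates.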

Fri, Aug 30, 1:21 PM · OKR-Work, Research (FY2024-25-Research-July-September)

Isaac added a comment to T361637: Support for topic infrastructure work.

Weekly updates:

Some advance planning with AS around topic taxonomy workshops with the community that we have planned for Q2. I'm going to put together some basic prototype models in September -- e.g., a model that has a "Human Rights" category -- to help us think through the feasibility of various approaches ahead of talking with community members. This is good timing because I'm also having discussions with the Content Translation folks right now about the taxonomy and how to show it in the UI (T113257).

Fri, Aug 30, 12:42 PM · OKR-Work, Research (FY2024-25-Research-July-September)

Isaac added a comment to T354559: Put together diff blogpost on AI + Wikimedia + Datasets.

Weekly updates:

Continued iteration on workshop paper. Now waiting on decision about whether to move forward before final polishing.

Fri, Aug 30, 12:36 PM · Essential-Work, Research (FY2024-25-Research-July-September)

Thu, Aug 29

Isaac updated subscribers of T368422: Custom translation suggestions: Basic topic selection.

To the general question of the mismatch between the ORES taxonomy and the Newcomer Tasks taxonomy, I'll try to explain but I admit it's a bit messy. Feel free to skip if not interested :)

Thu, Aug 29, 8:54 PM · Patch-For-Review, LPL Hypothesis, CX-boost

Isaac added a comment to T373630: Investigate recent rise in Unique Devices.

I find this interesting because I can come up with reasons why VPN usage could increase OR decrease the unique device counts. My initial thought was that increased VPN usage would actually lead to a drop in unique device counts. The reason is that unique devices is the sum of known unique devices (based on the WMF-last-access cookie: uniques_underestimate in the hive table) and an additional estimate of how many of the pageviews we see from devices that lack cookies are actually unique (uniques_offset in the hive table). That latter estimate is based on checking how many of the nocookies pageviews have a unique actor signature (essentially UA+IP). So if more folks are using VPNs, then we have more folks on shared IP addresses and therefore a lower count of unique actor signatures and a lower unique device count. However, if VPNs are also clearing folks' cookies (and not just obscuring their IP address), then you could see an increase in unique devices because a lot more devices are showing up with nocookies and, even if there are more shared IP addresses, you still end up with these devices getting double-, triple-, etc. counted and an inflated unique device count.
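
The two opposing effects can be illustrated with a deliberately simplified model. It assumes the offset is just the number of distinct (UA, IP) signatures among cookieless pageviews, which is a simplification of whatever the real job does; the function name and the toy data are made up, and the "cookies cleared" scenario is only one possible way the double counting could arise.

```python
def estimate_unique_devices(cookie_devices, nocookie_pageviews):
    """Simplified model: known devices plus distinct (UA, IP) actor signatures.

    cookie_devices: device identifiers seen with the last-access cookie.
    nocookie_pageviews: (user_agent, ip) pairs for pageviews without cookies.
    """
    uniques_underestimate = len(set(cookie_devices))
    uniques_offset = len(set(nocookie_pageviews))
    return uniques_underestimate + uniques_offset


known = ["d1", "d2", "d3"]

# Baseline: two cookieless devices on distinct IPs.
baseline = estimate_unique_devices(known, [("ua", "1.1.1.1"), ("ua", "2.2.2.2")])

# VPN puts both cookieless devices on one shared IP: signatures collapse.
shared_ip = estimate_unique_devices(known, [("ua", "9.9.9.9"), ("ua", "9.9.9.9")])

# VPN also clears cookies: d1 is still counted via its cookie earlier on, then
# reappears cookieless under two different exit IPs and is counted again.
cookies_cleared = estimate_unique_devices(
    known,
    [("ua", "1.1.1.1"), ("ua", "2.2.2.2"), ("ua-d1", "9.9.9.1"), ("ua-d1", "9.9.9.2")],
)

print(baseline, shared_ip, cookies_cleared)  # 5 4 7
```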

Thu, Aug 29, 7:43 PM · Movement-Insights

Wed, Aug 28

Isaac added a comment to T368422: Custom translation suggestions: Basic topic selection.

is the article country model a distinct model with its own search index tag and query syntax (i.e. "articlecountry:Ghana") or is it a bunch of new values for the article topic tag (i.e. "articletopic:country-ghana")?

@SBisson technically the country model is a separate model from the other topics but right now the plan is to merge these two outputs under the existing articletopic: tag on the Search index. So probably something more like articletopic:country-ghana. Reasoning being that from the standpoint of our end-users, I don't think there's a major difference between the "standard" topics and countries. You can see a bit of the discussion at T301671#10052411 and then Eric's reply.
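
If the country predictions do end up under the existing articletopic: keyword, a search request might look something like the sketch below. The `country-ghana` value is the tentative example from the comment above and not a confirmed tag; `topic_search_url` is a hypothetical helper that only builds a standard `list=search` request URL.

```python
from urllib.parse import urlencode


def topic_search_url(tag_value, lang="en", limit=10):
    """Build a search API URL filtering on an articletopic: value (hypothetical helper)."""
    params = {
        "action": "query",
        "format": "json",
        "list": "search",
        "srsearch": f"articletopic:{tag_value}",
        "srlimit": limit,
    }
    return f"https://{lang}.wikipedia.org/w/api.php?" + urlencode(params)


# "country-ghana" is the tentative naming from the discussion, not a live tag.
print(topic_search_url("country-ghana"))
```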

Wed, Aug 28, 9:03 PM · Patch-For-Review, LPL Hypothesis, CX-boost

Isaac added a comment to T368422: Custom translation suggestions: Basic topic selection.

One challenge of this feature is getting the ORES topics and their localized labels.

@santhosh @SBisson good questions. I don't have a strong opinion about the technical solution for getting localized names for the topics beyond that it be shared across Newcomer Tasks (@KStoller-WMF) and Content Translation so there isn't duplicate work to translate the labels. Ideally future recommender systems would be able to use the functionality as well -- e.g., if Android wanted to expand their Suggested Edits module.

Wed, Aug 28, 6:36 PM · Patch-For-Review, LPL Hypothesis, CX-boost

Mon, Aug 26

Isaac created T373374: Interwiki links with translation tags and multiple colons seem broken in Parsoid.

Mon, Aug 26, 5:15 PM · Parsoid-Read-Views (Phase 1 - DiscussionTools support), Content-Transform-Team-WIP, Parsoid

Fri, Aug 16

Isaac added a comment to T354559: Put together diff blogpost on AI + Wikimedia + Datasets.

Weekly updates:

Draft of resource paper completed and shared separately for review

Fri, Aug 16, 7:45 PM · Essential-Work, Research (FY2024-25-Research-July-September)

Isaac added a comment to T301671: Investigate what would be required to include countries in ORES and accessible via a search keyword.

@EBernhardson thanks for all these notes! Sounds like no major blockers then and we can work out the specific details of tags/flushing/etc. when the model is up on LiftWing. Bulk update definitely falls in the nice-to-have category at the moment so if that's not ready when the stream turns on, that's probably okay. But also it should be easy for me to prepare the files in spark if we go that route.

Fri, Aug 16, 1:12 PM · OKR-Work, Growth-Team, Machine-Learning-Team, Discovery-Search, ORES

Isaac added a comment to T361637: Support for topic infrastructure work.

Weekly updates:

Small-scale evaluation of 50 random articles for article-country model as described in T369120#10069239 shows strong early performance. Doing a stratified sample will help us understand if there are any more problematic pockets though.

Fri, Aug 16, 11:59 AM · OKR-Work, Research (FY2024-25-Research-July-September)

Isaac added a comment to T369120: Determine evaluation strategy for article-country model.

Weekly updates:

I started with a random sample of 50 articles to evaluate myself to get an early sense and see how difficult it was. Generally I found it to be pretty fast: reading the article via Google Translate (which works for most language editions) and inspecting Wikidata when useful meant that I could do most of them in about 30 seconds.
Out of the 50, I was in agreement with the predicted labels for 44 (88%), which is a good sign (full data)! And if you treat each prediction as independent, the model is at 100% precision with just imperfect recall, which is a much better situation than having incorrect predictions.
Out of the six disagreements:
- Three were species-related, where endemic areas mentioned in the article were not predicted (low recall). This is a known issue but we've identified strategies for improving coverage. Additionally, species articles make up a substantial proportion of articles on Wikipedia due to bot generation and simple notability criteria, but they do not receive a proportionate level of attention so I'm okay with lower recall in this topic area.
- Two were a version of "how much time do you have to spend in a country for it to be relevant to your life?". One was a painter who moved from UK to US at age 8 and I assigned the UK but the model only predicted US. The other was a British diplomat to France who I assigned UK + France and model only predicted UK. Again, both are recall issues and borderline cases so I'm okay with this.
- The final one was a Canadian film by an Ethiopian filmmaker that is set in Ethiopia, so I assigned Canada+Ethiopia but the model only predicted Canada. More links to the Ethiopia component in the article or Wikidata properties that mentioned Ethiopia probably would have led to alignment but there's no obvious fix here and again it's an issue of recall, which I'm comfortable with.
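
The distinction between article-level agreement and per-prediction precision/recall can be sketched as below. The three disagreement cases mirror the painter, diplomat, and film examples above (reading the painter case as UK + US assigned, which is an assumption), and the single agreeing article is made up; the full evaluation data is not reproduced here.

```python
def evaluate(pairs):
    """pairs: list of (human_labels, model_labels) sets, one pair per article."""
    exact = sum(1 for human, model in pairs if human == model)
    true_pos = sum(len(human & model) for human, model in pairs)
    predicted = sum(len(model) for _, model in pairs)
    expected = sum(len(human) for human, _ in pairs)
    return {
        "agreement": exact / len(pairs),
        # Share of model predictions that the human reviewer also assigned.
        "precision": true_pos / predicted if predicted else 1.0,
        # Share of human-assigned countries that the model predicted.
        "recall": true_pos / expected if expected else 1.0,
    }


toy_sample = [
    ({"United Kingdom", "United States"}, {"United States"}),  # painter
    ({"United Kingdom", "France"}, {"United Kingdom"}),        # diplomat
    ({"Canada", "Ethiopia"}, {"Canada"}),                      # film
    ({"Ghana"}, {"Ghana"}),                                    # made-up agreement
]

print(evaluate(toy_sample))
print(44 / 50)  # 0.88, the article-level agreement reported above
```

In the toy sample every prediction is correct (precision 1.0) while recall is 4/7, which is the pattern described: missing labels rather than wrong ones.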

Fri, Aug 16, 11:55 AM · OKR-Work, Research (FY2024-25-Research-July-September)

Mon, Aug 12

Isaac added a comment to T354559: Put together diff blogpost on AI + Wikimedia + Datasets.

I'd like to do a pass before the blog post is published. Let me know when is a good time for me to do that. (I won't be able to do it this week, but the week of August 19th or after is fine.)

@leila acknowledging. I'm considering adapting it to be a resource paper for my EMNLP Wikipedia+NLP workshop (Aug 29th deadline) with then a blogpost likely being a follow-up action to share more broadly. If that were the case, I'd complete a full draft this week and then the following week while I'm out, I'd ask you and other folks to provide feedback. This puts us right around the two-week window for review of papers so I understand if upon reading through, you think it would need more time. Either way, I'll let you know by end of week which direction I'd like to try for.

Mon, Aug 12, 8:41 PM · Essential-Work, Research (FY2024-25-Research-July-September)

Isaac added a project to T371900: [Research Engineering Request] Productionize article-country data dependencies: Research.

Mon, Aug 12, 4:14 PM · Research, Research-engineering

Thu, Aug 8

Isaac added a comment to T354559: Put together diff blogpost on AI + Wikimedia + Datasets.

Weekly updates:

Started thinking about where might be appropriate places to share this work -- e.g., expand literature review and submit to the Wikipedia-NLP EMNLP workshop I'm co-organizing?

Thu, Aug 8, 7:26 PM · Essential-Work, Research (FY2024-25-Research-July-September)

Isaac added a comment to T361637: Support for topic infrastructure work.

Weekly updates:

Provided feedback on translation lists prototype and feeling pretty good about where that work by the Language team is headed (T371515)
Put in requests for support from ML Platform, Research Engineering, and Search Platform around article-country model to start those conversations early. I'll have to make sure that they're aware that we would like to make updates again later in the year based on feedback about the topic taxonomy so we should make sure that we don't duplicate work unnecessarily.

Thu, Aug 8, 7:25 PM · OKR-Work, Research (FY2024-25-Research-July-September)

Isaac added a comment to T369120: Determine evaluation strategy for article-country model.

Weekly updates:

Paused evaluation to focus on putting in requests for hosting model on LiftWing (T360455), Research Engineering support for the data dependencies (T371900), and Search platform support for incorporating the predictions as tags (T301671#10052411). These should hopefully help identify any red flags with the approach early in the process and make sure the productionization goes smoothly when we're ready to move to that stage.

Thu, Aug 8, 7:20 PM · OKR-Work, Research (FY2024-25-Research-July-September)

Isaac added a comment to T301671: Investigate what would be required to include countries in ORES and accessible via a search keyword.

@EBernhardson I want to re-invigorate this task as I have started to make progress on a model for assigning countries to articles (WE 2.1.1) and am in the early stages of asking that this model be hosted on LiftWing (T371897) so I think it's a good point to restart the conversation about how to incorporate these predictions into the Search index and when that might be possible.

Thu, Aug 8, 6:18 PM · OKR-Work, Growth-Team, Machine-Learning-Team, Discovery-Search, ORES

Isaac updated the task description for T371897: Request to host article-country model on Lift Wing.

Thu, Aug 8, 3:22 PM · OKR-Work, Lift-Wing, Machine-Learning-Team

Wed, Aug 7

santhosh awarded T371934: [medium] Analyze localization and maintenance of translated content a Love token.

Wed, Aug 7, 3:34 AM · research-ideas

Tue, Aug 6

Isaac created T371934: [medium] Analyze localization and maintenance of translated content.

Tue, Aug 6, 8:42 PM · research-ideas

Isaac added a comment to T371865: [WE 1.3.] Who are moderators?.

Just linking this to the request for productionizing edit types in case we decide that's a core part of this work: T351225

Tue, Aug 6, 3:54 PM · Research (FY2024-25-Research-October-December)

Isaac added a comment to T371515: Community-defined translation lists: List definition, storage & recommendation.

@santhosh oh cool, yeah, this is looking good! Agreed that supporting both wikilinks and QIDs would be ideal. And my gut feeling is that caching everything about the articles to enable faster filtering is ideal so long as that doesn't fill up the cache. Two optional extensions I could imagine:

Let organizers maintain their pages on their local wikis and just create a page with a soft redirect (template that we could parse) that the backend could follow? Or probably easier just add a "campaign-list-page" as an optional parameter in the Translation campaign template? And then you could also support an organizer adding multiple translation campaign templates to a single page. I think it would be minor for the code to support it and would allow campaigns that already maintain extensive lists on their local wikis to take part without duplicating that work or moving it.
Maybe have an expiration date parameter on the list? Trying to think how we don't use up cache space for old translation lists. Maybe that's not necessary but could help with ensuring the translation lists are maintained.

Tue, Aug 6, 3:32 PM · OKR-Work, LPL Hypothesis

Isaac added a subtask for T366273: Article country model: T371900: [Research Engineering Request] Productionize article-country data dependencies.

Tue, Aug 6, 3:09 PM · Research, Epic

Isaac added a parent task for T371900: [Research Engineering Request] Productionize article-country data dependencies: T366273: Article country model.

Tue, Aug 6, 3:09 PM · Research, Research-engineering

Isaac created T371900: [Research Engineering Request] Productionize article-country data dependencies.

Tue, Aug 6, 3:08 PM · Research, Research-engineering

Isaac added a subtask for T366273: Article country model: T371897: Request to host article-country model on Lift Wing.

Tue, Aug 6, 2:48 PM · Research, Epic

Isaac added a parent task for T371897: Request to host article-country model on Lift Wing: T366273: Article country model.

Tue, Aug 6, 2:48 PM · OKR-Work, Lift-Wing, Machine-Learning-Team

Isaac created T371897: Request to host article-country model on Lift Wing.

Tue, Aug 6, 2:45 PM · OKR-Work, Lift-Wing, Machine-Learning-Team

Mon, Aug 5

Isaac added a comment to T369865: Fix API Gateway examples for Javascript.

Just acknowledging my thanks here on phabricator too -- I appreciate you figuring out why this was happening and fixing!

Mon, Aug 5, 6:59 PM · Machine-Learning-Team

Fri, Aug 2

Isaac added a comment to T360455: Add Article Quality Model to LiftWing.

@isarantopoulos excited to see this up! It helped me notice a small bug in my code that's now fixed and so my experimental API endpoint is matching the outputs from staging. Latency sounds good though I didn't do any formal testing of it. I'd say we're good to move to the next step and begin to coordinate with Enterprise about traffic to the endpoint!

Fri, Aug 2, 8:10 PM · Patch-For-Review, Content-Transform-Team, Research, Machine-Learning-Team

Isaac added a comment to T369120: Determine evaluation strategy for article-country model.

Weekly updates:

I took a pause from evaluation to focus on setting up conversations with ML Platform about hosting the model so we get those started earlier. It was also a good point to pause and reflect and do some writing to make sure I felt good about where we're at (I am). To that end, I put together a draft model card and am now working on filling out the phab request for hosting the model on LiftWing.
I also uploaded the code for some of the data pipelines I built for generating the bulk predictions: https://github.com/geohci/wiki-region-groundtruth/tree/main/notebooks

Fri, Aug 2, 8:08 PM · OKR-Work, Research (FY2024-25-Research-July-September)

Isaac added a comment to T361637: Support for topic infrastructure work.

Weekly updates:

Article country model: put together draft model card as pre-requisite towards starting conversations about hosting this model on LiftWing. Also uploaded notebooks that cover various aspects of the data pipeline.
Discussion with Evelin/Ilana/Alex/Felipe about WikiProjects (context of Spanish Wikipedia Climate Change WikiProject). One of the issues they're dealing with is how to store lists of references (e.g., new climate change reports) and the articles/topics they might be relevant to, and to do this in a way that future editors can easily discover these resources and use them for editing. I put forth a few ideas. We're not working on building this yet but it's interesting to ponder and some of it can build on the article-oriented topic infrastructure that we're working directly on:
- Add the sources to Wikidata and set their main-subject property. Then have a tool that periodically caches the sources in a way that they'd be discoverable from articles related to the main-subject property. Some non-trivial challenges there but probably could make something work.
- Build separate tool that uses AI to analyze the report and recommend relevant articles. Or use list-building tool or something similar for organizers to quickly build a list of relevant articles. Then store that somewhere in a structured way that lets it be surfaced to editors who are editing one of the listed articles.
- Incorporate the sources into at least a few relevant articles. Then build generic tooling for recommending sources to editors that could also surface these sources. Some options for that:
  - Use current formula for e.g., images but apply to sources: for any given article, collect sources (this often means external URLs) in other language editions and surface the most prevalent that aren't used yet in that language edition. The drawback here is that presumably many of these sources are in other languages too so they might not be useful.
  - Use list-building tool (or even just basic morelike functionality) to find similar articles to the one in question and aggregate sources from these.
  - Same as above but instead of aggregating sources (and therefore decontextualizing them), you instead filter the list to higher-quality articles and present them as potential examples that the editor can follow for improving their current article.
  - You aggregate top-used sources for the WikiProject and surface them to editors (example). This has the challenge that WikiProject is often too high-level of a topic area for a specific source.
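
The first option in the list above (surface the most prevalent sources from other language editions that are not yet used locally) could be sketched roughly as follows. The input shape, function name, and example URLs are hypothetical; nothing here is built yet.

```python
from collections import Counter


def suggest_sources(sources_by_language, target_language, top_n=5):
    """Rank sources used in other language editions but missing from the target one.

    sources_by_language: hypothetical {language: [source_url, ...]} for one article.
    """
    already_used = set(sources_by_language.get(target_language, []))
    counts = Counter()
    for language, sources in sources_by_language.items():
        if language == target_language:
            continue
        # Count each source at most once per language edition.
        counts.update(set(sources) - already_used)
    return [source for source, _ in counts.most_common(top_n)]


example = {
    "en": ["https://example.org/a", "https://example.org/b"],
    "es": ["https://example.org/b", "https://example.org/c"],
    "fr": ["https://example.org/b", "https://example.org/c", "https://example.org/d"],
    "de": ["https://example.org/a"],
}
print(suggest_sources(example, "de", top_n=2))
```

This does nothing about the drawback noted above: the highest-ranked sources may be in languages the editor cannot use.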

Fri, Aug 2, 8:06 PM · OKR-Work, Research (FY2024-25-Research-July-September)

Isaac added a comment to T354559: Put together diff blogpost on AI + Wikimedia + Datasets.

Weekly updates:

Left thoughts on open questions for AI strategy white paper: T340693#10033198

Fri, Aug 2, 7:44 PM · Essential-Work, Research (FY2024-25-Research-July-September)

Isaac added a comment to T340693: Align on high level direction of ML/AI development in the coming years.

(And to confirm: that was an example to help with the thinking. It's not a decision.:)

Ahh okay because that would have been a huge shift in direction so I was pretty confused. Something like half of our Research model development at the moment is NLP (e.g., Diego's reference-need and peacock work, Martin's text simplification and add-a-link improvements), we're engaging more in their conferences (ACL/EMNLP), and ML Platform is also heavily invested at the moment in hosting LLMs.

Fri, Aug 2, 3:49 PM · Essential-Work, Research

Isaac added a comment to T340693: Align on high level direction of ML/AI development in the coming years.

@leila thanks for the acknowledgment. One question about the new open question: when you say "I will argue that investing in models that use NLP is something we should avoid for the coming few years unless we have very strong reasons to do so", what do you mean by "investing"? Like what sorts of activities are you suggesting that we avoid?

Fri, Aug 2, 2:45 PM · Essential-Work, Research

Isaac updated subscribers of T371515: Community-defined translation lists: List definition, storage & recommendation.

@santhosh just following up on your summary in T368713#10030506 but moving discussion here. All of what's outlined here generally makes sense to me for a start. I guess the design research question is whether to ask organizers to add specific interwiki links to source articles or just the generic Wikidata IDs for the items they see fit for translation. I'm going to use a recent worklist from Wiki Loves Sports as an example because it conveniently seems very close to what you're envisioning (screenshot below). They have a table of articles that includes both the various individual interwiki links (those green cells with + in them are actually links to the language version) and the Wikidata items for the articles of interest.

Fri, Aug 2, 1:48 PM · OKR-Work, LPL Hypothesis

Aug 1 2024

Isaac added a comment to T360455: Add Article Quality Model to LiftWing.

For testing purposes, this API should be hosting the same model so should match LiftWing outputs: https://misalignment.wmcloud.org/api/v1/quality-revid-html?lang=en&revid=1228403723. It's coded slightly differently and there might be tiny rounding errors but in that sense it's a nice independent verification :)

Aug 1 2024, 2:44 PM · Patch-For-Review, Content-Transform-Team, Research, Machine-Learning-Team

Jul 31 2024

Isaac added a comment to T370746: CX Unified Dashboard: Support suggestions based on previous edits.

Thanks for sharing that @Pginer-WMF ! I'd agree that less of a concern for section translation but yeah if we ever incorporated it for article translation, maybe not just a straight ranking and instead we define some ideal range -- e.g., 3000 - 10000 characters -- between which we prioritize source articles. Certainly good food for thought on not just what is a good topical overlap with the query but perhaps what are also good candidate articles for translation in general.
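
A minimal sketch of the "ideal range" idea, assuming candidates arrive in relevance order with a known length. The 3000 to 10000 bounds are the example from the comment; the function and input shape are hypothetical.

```python
def prioritize_by_length(candidates, min_chars=3000, max_chars=10000):
    """Move in-range candidates to the front, keeping relevance order within each group.

    candidates: list of (title, length) tuples in relevance order.
    """
    in_range = [c for c in candidates if min_chars <= c[1] <= max_chars]
    out_of_range = [c for c in candidates if not (min_chars <= c[1] <= max_chars)]
    return in_range + out_of_range


ranked = prioritize_by_length([("A", 500), ("B", 4000), ("C", 50000), ("D", 9000)])
print(ranked)  # [('B', 4000), ('D', 9000), ('A', 500), ('C', 50000)]
```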

Jul 31 2024, 6:34 PM · Patch-For-Review, LPL Essential (LPL Essential 2024 Jul-Sep), ContentTranslation

Isaac added a comment to T340693: Align on high level direction of ML/AI development in the coming years.

Some of my past writing around this space:

A few different ways I categorize ML models at Wikimedia that has helped me both be clear about where our challenges are and what is needed to intervene: https://meta.wikimedia.org/wiki/User:Isaac_(WMF)/ML_modeling_at_Wikimedia
My thoughts on the need for Wikimedia AI datasets/benchmarks: https://meta.wikimedia.org/wiki/User:Isaac_(WMF)/AI_Datasets (or slightly more developed version in a Google doc)
Major data gaps on the Wikimedia platform that hinder our ability to develop equitable ML models in various classification domains: https://meta.wikimedia.org/wiki/User:Isaac_(WMF)/Content_tagging/Data_gaps

Jul 31 2024, 6:03 PM · Essential-Work, Research

Isaac added a comment to T360455: Add Article Quality Model to LiftWing.

We're going to solve the numpy issue by relaxing the kserve restriction by using our wmf kserve fork.

@isarantopoulos thanks!

Jul 31 2024, 4:44 PM · Patch-For-Review, Content-Transform-Team, Research, Machine-Learning-Team

Jul 30 2024

Isaac added a comment to T360455: Add Article Quality Model to LiftWing.

After checking Isaac's notebook I found that the model has been trained using numpy 2.0.0, so ideally this would be the numpy version we would want to use while unpickling.

@isarantopoulos if this continues to be an issue, let me know and I can see about re-training/pickling with an earlier version of numpy. In theory it should be pretty easy to force an older version of statsmodels that depends on an earlier version of numpy (aka shouldn't throw any errors or require many code changes) and I don't think it should affect the model parameters in any serious way as the core logic should all be the same.

Jul 30 2024, 1:55 PM · Patch-For-Review, Content-Transform-Team, Research, Machine-Learning-Team

Jul 26 2024

Isaac added a comment to T354559: Put together diff blogpost on AI + Wikimedia + Datasets.

But if this emerges in some form, I'll look into integrating it in tools I develop.

Sounds good!

Jul 26 2024, 6:22 PM · Essential-Work, Research (FY2024-25-Research-July-September)

Isaac added a comment to T369120: Determine evaluation strategy for article-country model.

Weekly updates:

I prepared a data dump of all country predictions (+ topics) for all Wikipedia articles and made that public (data; README). I'll be able to use this to easily generate samples of articles to evaluate with people once the strategy is set.
This also allowed me to begin to identify subsets of the projects where country predictions are largely driven by wikilinks as the places that are probably most important to verify (as the wikilinks method has the highest likelihood of false positives). Biology (species) stood out as expected but so did Music (albums/songs/charts) and a few other wiki-specific areas (e.g., Chinese biographies). There's a ton of data and different ways to slice it so I haven't decided exactly yet how to visualize it but I can create visualizations like the one from last week for any wiki/topic combination now.

Jul 26 2024, 3:36 PM · OKR-Work, Research (FY2024-25-Research-July-September)

Isaac added a comment to T361637: Support for topic infrastructure work.

Weekly updates:

Continued to monitor and be very happy with Language's work to switch all of their recommendation functionality to the new LiftWing API (they moved section translation over this week)
Connected with Seddon briefly about Android's recommendation API usage (context: T338430) and my willingness to provide guidance when they port that code over as I've done for Language + Content Translation
Prepared data dump of all country predictions (+ topics courtesy of Muniza's new Airflow job) for all Wikipedia articles and made that public so other folks can use it if desired (data; README)
Built simple API for mapping WikiProjects to LiftWing topics for Community Wishlist exploration (details: T370951#10015459). This might help with making wikiprojects more discoverable to new editors.

Jul 26 2024, 3:34 PM · OKR-Work, Research (FY2024-25-Research-July-September)

Jul 25 2024

Isaac added a comment to T360794: Implement stream of HTML content on mw.page_change event.

I think we could try and get to this or the batch ingestion one next quarter (starting October).
This is something I really want to do but it’s hard to find the time.
Would you have any interest in building a Flink job with our support - that could help speed up getting this done? We have a very stable job that enriches page change events with wikitext and this would be very similar. Let me know what you think.

Jul 25 2024, 9:26 PM · Data-Engineering, Event-Platform

Isaac created T371062: [Research Engineering Request] Build HTML page stream.

Jul 25 2024, 9:25 PM · Research-engineering

Isaac added a comment to T370746: CX Unified Dashboard: Support suggestions based on previous edits.

Patch looked good to me around adding support for section translation recommendations. Just a few thoughts in case you ever want to work on optimizing the part where you check for available section recommendations. Currently the code starts with a large set of candidate articles and then checks each one asynchronously until enough are found with section translations available. I don't know what percentage of candidates actually have section translations available, but if it's a low-ish percentage, then this can potentially run for a while before reaching the desired result set size. Because there are too many possible combinations of pages + target languages to cache, I think the best approach would be to optimize the likelihood of finding candidates quickly that have section translations available. So instead of ranking the candidates by relevance when you check for section translations, this could probably be optimized by ranking instead by:

Page length. This is also pretty reasonable and already can be gathered from the APIs (prop=info and then use the resulting length value) -- e.g., https://en.wikipedia.org/w/api.php?action=query&format=json&formatversion=2&prop=langlinks|langlinkscount|pageprops|info&lllimit=max&lllang=es&generator=search&gsrsearch=morelike:Japanese_iris&ppprop=wikibase_item|disambiguation&gsrlimit=10
Article quality, where high article quality probably means more sections probably means higher likelihood of a section translation available. This feature isn't ready yet but we're working on adding an article quality model to LiftWing (T360455) and then we'd load those article quality predictions into the Search index as a weighted tag that could be gathered along with langlink counts, QID, etc. as above (prop=cirrusdoc&cdincludes=weighted_tags) when the tool is generating the candidates -- e.g., https://en.wikipedia.org/w/api.php?action=query&format=json&formatversion=2&prop=langlinks|langlinkscount|pageprops|cirrusdoc&lllimit=max&lllang=es&generator=search&gsrsearch=morelike:Japanese_iris&ppprop=wikibase_item|disambiguation&gsrlimit=10&cdincludes=weighted_tags
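A minimal sketch of the length-based ranking idea above, assuming a formatversion=2 response from the first example query (pages come back as a list, each carrying a length value from prop=info). The function names are hypothetical and are not part of the existing Content Translation or recommendation API code.

```python
from typing import Any


def build_candidate_query(seed_title: str, target_lang: str, limit: int = 10) -> dict[str, str]:
    """Parameters mirroring the example URL (morelike search used as a generator)."""
    return {
        "action": "query",
        "format": "json",
        "formatversion": "2",
        "prop": "langlinks|langlinkscount|pageprops|info",
        "lllimit": "max",
        "lllang": target_lang,
        "generator": "search",
        "gsrsearch": f"morelike:{seed_title}",
        "ppprop": "wikibase_item|disambiguation",
        "gsrlimit": str(limit),
    }


def rank_by_length(response: dict[str, Any]) -> list[str]:
    """Return candidate titles, longest page first (missing length counts as 0)."""
    pages = response.get("query", {}).get("pages", [])
    ranked = sorted(pages, key=lambda page: page.get("length", 0), reverse=True)
    return [page["title"] for page in ranked]
```

The ranked titles would then be checked for section translations in that order, on the assumption that longer pages are more likely to have sections worth translating.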

Jul 25 2024, 5:01 PM · Patch-For-Review, LPL Essential (LPL Essential 2024 Jul-Sep), ContentTranslation

Isaac added a comment to T370951: Investigation: How do we get WikiProject topics, as defined by LiftWing?.

Sharing some thoughts! For a given WikiProject, we have two potential pathways to link it to a LiftWing topic:

Use the main-subject property (P921) for WikiProjects. The challenge is then connecting the values for main-subject to their corresponding LiftWing topics. The two challenges here are:
- How many WikiProjects have a main-subject? This seems to be 50% (sparql) but this is at least easily solvable by the community and for an MVP I assume it's more important to have accurate information than full coverage anyways.
- How do we connect the main subject to a LiftWing topic? I came up with one hacky approach which is to take any sitelinks for the main subjects associated with a WikiProject (e.g., for WikiProject African Diaspora (Q15304953), this is African Diaspora (Q385967) which has 23 sitelinks though in practice I only use a random subset of 10 that is set to always include English). For each of those Wikipedia articles, I get the LiftWing topic predictions and average them together. Here's the final result: https://wiki-topic.toolforge.org/wikiproject-topic?qid=Q15304953 and you can test others (Rihanna; France). This seems to work well and I think could reasonably run as a regular batch job to update the Community List as new WikiProjects are created or main subjects are added etc.
Use the worklist for a given WikiProject and get topic predictions for e.g., 10 of those articles and do the same averaging. I think this would also work quite well but it does require us to map the WikiProject item to a language edition that uses the PageAssessments extension (to get the worklist). Eventually I want us to get there, but for now I think this might just be messier and the above seems to work alright so I'd recommend going with that. Especially because I know you're using the labels etc. from Wikidata so adding in the requirement that a WikiProject be tracked on Wikidata isn't unnecessary overhead.
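The averaging step described in both pathways can be sketched roughly as follows. This assumes each article's topic predictions are already available as a topic -> score mapping; the function name is hypothetical, and treating a topic that is missing for an article as a score of 0 is an assumption, not necessarily what the prototype does.

```python
from collections import defaultdict


def average_topic_scores(predictions: list[dict[str, float]]) -> dict[str, float]:
    """Average per-article topic scores into one score per topic.

    Assumption: a topic absent from an article's predictions contributes 0.
    """
    if not predictions:
        return {}
    totals: dict[str, float] = defaultdict(float)
    for article_scores in predictions:
        for topic, score in article_scores.items():
            totals[topic] += score
    return {topic: total / len(predictions) for topic, total in totals.items()}
```

The input would be one mapping per sampled article (e.g., the up-to-10 sitelinked articles for a main subject, or 10 articles from a worklist).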

Jul 25 2024, 4:07 PM · CampaignEvents, Event-Discovery, Campaign-Tools (Campaign-Tools-Current-Sprint)

Jul 23 2024

Isaac updated subscribers of T338430: Prepare the recommendation-api service for usage without RESTbase.

Is there any usage of RESTBase-specific technology? Or could the modernization/replacement safely be routed through REST Gateway?

That I don't know -- looping in @Seddon who can probably route that question more effectively.

Jul 23 2024, 5:55 PM · RESTBase Sunsetting, Recommendation-API

Isaac added a comment to T338430: Prepare the recommendation-api service for usage without RESTbase.

Just quickly chiming in to make a few connections to related workstreams:

Older task about this: T340854
Modernization of recommendation API being led by Content Translation which is almost complete and will make for a strong base on which to incorporate the Apps functionality (what is hitting the RESTbase endpoint described in this task): T369484

Jul 23 2024, 5:19 PM · RESTBase Sunsetting, Recommendation-API

Jul 22 2024

Isaac added a comment to T370381: Make header and footer as standalone objects.

SSG sounds reasonable then to me, especially if it's already working for wikiworkshop and design strategy. I have no knowledge to guide what is a better/worse choice so I'll let you guide but simple is always nice :)

Jul 22 2024, 1:48 PM · Research-landing-page

Jul 19 2024

Isaac added a comment to T354559: Put together diff blogpost on AI + Wikimedia + Datasets.

thanks for sharing your valuable perspective.

Happily -- thanks for engaging with the work!

Jul 19 2024, 5:51 PM · Essential-Work, Research (FY2024-25-Research-July-September)

Isaac added a comment to T361637: Support for topic infrastructure work.

Weekly updates:

Did some brainstorming and left some feedback about how we might actually make article translation lists accessible from tooling like Content Translation (T368713#9990651)
Categories that have Wikidata items with the "category combines topics" property (P971) are proving pretty useful for the article-country model (notebook). This is particularly exciting because categories historically are how the Wikimedia communities organize content and annotate similarities between articles but the network itself is so messy that it's very difficult to make use of categories within tooling (background). This modeling on Wikidata, however, provides a nice structured way of using categories that, because it depends on explicit Wikidata properties (as opposed to the category hierarchy), seems less prone to unintended consequences and might be useful in other topic-related initiatives.

Jul 19 2024, 2:03 PM · OKR-Work, Research (FY2024-25-Research-July-September)

Isaac added a comment to T370381: Make header and footer as standalone objects.

FYI as context about a past discussion about this: T181588

Jul 19 2024, 12:59 PM · Research-landing-page

Isaac added a comment to T360455: Add Article Quality Model to LiftWing.

@isarantopoulos your summary is accurate - thanks! We can adjust model card schema if it's non-standard and you want to adjust the keys / structure. In particular, as I think about it more based on your point around latency and double-running the model, I'd be open to a very simple default configuration where we just return the quality score (number between 0 and 1) as that's the "official" output of the model with the optional mode you mention that also returns the feature values and class label.

Jul 19 2024, 12:51 PM · Patch-For-Review, Content-Transform-Team, Research, Machine-Learning-Team

Jul 18 2024

Isaac added a comment to T369120: Determine evaluation strategy for article-country model.

Weekly updates:

Incorporated categories with explicit countries set per Wikidata into the prototype API.
- This can be explored simply via the UI: https://wiki-topic.toolforge.org/countries
- Or more detailed information directly from the API -- e.g., https://wiki-region.wmcloud.org/regions?lang=en&title=Blue-headed_parrot
Compared the overlap (notebook) between these three signals (Wikidata properties, article links, Wikipedia categories) for a small sample of 100 random articles on English Wikipedia to begin to get a sense of what's going on:
- Most articles (77%) have at least one associated country
- US dominates at about 1/3 of the countries predicted
- Good agreement between Wikidata and wikilinks (~50%) when a country is predicted
- Category coverage is lowest but does provide an additional 10% of the countries
- Wikilinks give the most lift by themselves (20%)
- Wikidata only provides unique predictions 7% of the time
In summary: right now, all the signals seem valuable. There are a fair number of wikilink-based predictions, which are the least confident of the predictions, so those will warrant deeper evaluation by people at some point to verify they are precise. These conclusions could easily change as I expand my samples beyond English Wikipedia though.
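One way to compute this kind of "unique lift" per signal is sketched below. It is not the code from the notebook: the data layout (one dict per article mapping a signal name to its set of predicted countries) and the function name are assumptions for illustration.

```python
from collections import Counter


def unique_contribution(articles: list[dict[str, set[str]]]) -> dict[str, float]:
    """Share of all (article, country) predictions made by exactly one signal.

    Each article is a dict of signal name -> set of predicted countries.
    """
    total = 0
    unique: Counter[str] = Counter()
    for signals in articles:
        all_countries = set().union(*signals.values())
        total += len(all_countries)
        for name, countries in signals.items():
            # Countries predicted by any of the other signals for this article.
            others = set().union(*(c for n, c in signals.items() if n != name))
            unique[name] += len(countries - others)
    if total == 0:
        return {}
    return {name: count / total for name, count in unique.items()}
```

A signal with a high share here is adding countries that no other signal finds, which is also where precision is hardest to verify without human review.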

Jul 18 2024, 6:36 PM · OKR-Work, Research (FY2024-25-Research-July-September)

Jul 17 2024

Isaac added a comment to T360794: Implement stream of HTML content on mw.page_change event.

Hey @lbowmaker -- I wanted to check in on the status of this. For the article quality model (T360455), I would like to run a batch job that builds a distribution of a bunch of features from article HTML. For the moment, I've had to move the individual dump files onto HDFS but this isn't sustainable long-term and the ability to make incremental updates to these distributions based on a stream would be fantastically helpful.

Jul 17 2024, 7:35 PM · Data-Engineering, Event-Platform

Isaac added a comment to T360581: [SPIKE] Test MGAD Model on LiftWing.

Yes, just linking back to some old performance data to back up @OTichonova's findings: T343123#9573432. @Dbrant is right and I would advocate for launching. In parallel, the one thing that can be done is to follow up with ML Platform to see how close they are to being able to host on GPUs. That should be a silent change from your perspective if they can (no code updates needed) while having a noticeable impact on latency per T343123#9520331.

Jul 17 2024, 7:19 PM · Wikipedia-Android-App-Backlog (Android Release - FY2024-25)

Isaac added a comment to T360455: Add Article Quality Model to LiftWing.

Ok, for the V1 of the model, I have everything ready to go! Specifically:

Normalization values for all language editions
- Artifact: https://analytics.wikimedia.org/published/datasets/one-off/isaacj/quality/V4-HTML/html-features-all-wikis-2024-07-01.tsv
- Code to generate: https://gitlab.wikimedia.org/isaacj/miscellaneous-wikimedia/-/blob/master/article-features/quality-model-normalization-values-html-dumps.ipynb
Model binary
- Artifact: https://analytics.wikimedia.org/published/datasets/one-off/isaacj/quality/V4-HTML/quality-model-html-ordinal-logistic-regression.pkl
- Code to generate: https://public-paws.wmcloud.org/User:Isaac%20(WMF)/Quality/html-qual-exploration.ipynb
Model code (in theory I think this is all you need as it contains links to the above artifacts too): https://public-paws.wmcloud.org/User:Isaac%20(WMF)/Quality/quality-code-for-liftwing.ipynb

Jul 17 2024, 7:08 PM · Patch-For-Review, Content-Transform-Team, Research, Machine-Learning-Team

Isaac added a comment to T368713: Community-defined translation lists: Technical Exploration.

I've been thinking about this challenge too: a translation list is about creating articles on a specific target Wikipedia but redlinks aren't tracked as entities in our APIs so the worklist must be accessible via our APIs on the potential source articles. For an MVP, I guess you go with whatever works but eventually we'll need to resolve this tension in a way that:

Makes it easy for organizers to build and update translation lists -- i.e. ideally they can create a list either via a list of Wikidata items associated with the articles they want created or via a list of specific source articles that could be translated over. I don't think it's reasonable to expect them to edit other language editions to generate their list. Maybe it's reasonable to ask them to edit Wikidata?
Does not create a bunch of bloat / noise on the source wikis -- i.e. I think we want to avoid solutions that require adding some sort of tag to source articles or their talk pages. That realm should belong to that language edition and patrollers on that language edition won't want to receive watchlist notifications etc. whenever a source article is tagged as a good translation candidate. One exception is that again I think it might be reasonable to consider having this tagging happen on Wikidata though that community would have to weigh in on whether something like on focus list of Wikimedia project (P5008) has been an approach that they would like to build on or not.
Does not greatly complicate the existing recommendation API -- i.e. we currently rely on a small number of quick API calls that take as inputs the desired source and target language (as well as various topic filters to apply to the target language). To Santhosh's point, I'm pretty sure the tagging of an article as part of a translation campaign has to also be done via a filter that can be applied directly to a Search query. This could be via things like incontent or hastemplate but I think long-term we'll probably have to create another WeightedTag like we do for articletopic. The only other alternative that has occurred to me is fully switching to Wikidata as the backend for this -- i.e. using P5008 that I mentioned above to tag articles, WDQS for gathering candidate lists and sitelinks for filtering, and then finding some way to recreate the morelike and articletopic functionality in Wikidata. But that would be a big lift and latency would be a major challenge.
Allows the translation lists to be filtered by our existing filters like articletopic or morelike -- here I'm thinking of the example where you have a worklist of all women scientists that is created via SPARQL and an editor comes in and they want to create an article in Spanish Wikipedia with English Wikipedia as the source and to further filter down to women who are connected to their home country of Ecuador (this could either be done soon via the country filter I'm working on or just by adding morelike:Ecuador as a filter). If the translation list is accessible as a filter on the Search index of English Wikipedia, this is easy. If it's not, it's much much harder/slower.

Jul 17 2024, 3:15 PM · Patch-For-Review, OKR-Work, LPL Hypothesis

Jul 15 2024

Isaac added a comment to T351118: [Research Engineering Request] Produce regular snapshots of all Wikipedia article topics.

this all looks and sounds great -- thanks @MunizaA and good to resolve from my end!

Jul 15 2024, 8:19 PM · Research-engineering, Research

Jul 12 2024

Isaac added a comment to T354559: Put together diff blogpost on AI + Wikimedia + Datasets.

Weekly updates:

Responding to Miriam's feedback about the draft blogpost -- in particular I'm pondering the calls-to-action.
- For Wikimedians, I think it's pretty clear that there's just the continued request for sharing needs/experiences so we're aware. I'm curious to see if an AI Focus Area comes out of the new Community Wishlist or if the different topical domains (citations, translation, etc.) just all have asks embedded that could potentially use AI.
- For Researchers/Practitioners, I'll have to think about what the main asks would be. Obviously helping to develop and implement Wikimedia-related benchmarks. Probably something about doing more massively multilingual work. And maybe that's where I slot the existing postscript about not forgetting about the core, non-flashy technologies like OCR or NER that are essential for these generative AI tools actually working effectively.

Jul 12 2024, 7:56 PM · Essential-Work, Research (FY2024-25-Research-July-September)

Isaac closed T367549: Cloud VPS "recommendation-api" project Buster deprecation as Resolved.

Fully deprecated - thanks all!

Jul 12 2024, 3:47 PM · Cloud-VPS (Debian Buster Deprecation)

Isaac updated the task description for T367549: Cloud VPS "recommendation-api" project Buster deprecation.

Jul 12 2024, 3:44 PM · Cloud-VPS (Debian Buster Deprecation)

Isaac added a comment to T365347: Update endpoints used in Content and Section Translation to use the LiftWing version of the Recommendation API.

Just a note that the old wmflabs endpoint is fully deprecated -- thanks @ngkountas for enabling this and confirming that we were ready to pull the plug!

Jul 12 2024, 3:41 PM · MW-1.43-notes (1.43.0-wmf.14; 2024-07-16), LPL Essential (LPL Essential 2024 Jul-Sep), Unplanned-Sprint-Work, ContentTranslation

Isaac added a comment to T361637: Support for topic infrastructure work.

Weekly updates:

Presented on article-country model at Topical Collaboration Learning Circle. Pau reminded me of the importance of having clear criteria for the countries to present to community members, which exists in this README and is based on the ISO 3166-1 codes.
Deprecated GapFinder officially and provided code review on a major overhaul of the recommendation API on LiftWing by Santhosh to simplify the code-base, improve latency, and improve result quality (gerrit). He also has a separate patch for adding in the topic support, which was simple to add on top of his refactoring (gerrit).

Jul 12 2024, 12:40 PM · OKR-Work, Research (FY2024-25-Research-July-September)

Isaac added a comment to T369120: Determine evaluation strategy for article-country model.

Weekly updates:

Took feedback that AS gathered from WikiProject Biodiversity folks around how to better incorporate species data into the model. This identified a new Wikidata property to include (taxon range) and the existence of categories on Wikipedia that are modeled as being explicitly connected to a given country on Wikidata. These could be used in the model but also present an opportunity as an independent signal to Wikidata and Wikipedia links that could be used to evaluate the recall of those approaches. I wrote up some quick code to make sure it was simple to extract them and see what sort of coverage they have. Details: https://meta.wikimedia.org/wiki/Research:Language-Agnostic_Topic_Classification/Countries/Species

Jul 12 2024, 12:35 PM · OKR-Work, Research (FY2024-25-Research-July-September)

Jul 11 2024

Isaac created T369865: Fix API Gateway examples for Javascript.

Jul 11 2024, 8:16 PM · Machine-Learning-Team

Jul 10 2024

Isaac closed T367444: Replace or remove Debian Buster VMs in 'wmf-research-tools' cloud-vps project as Resolved.

The other three have been migrated over / deprecated now too (https://os-deprecation.toolforge.org/buster/wmf-research-tools.html will hopefully update soon to reflect this) so I'm closing the task. Thanks all!

Jul 10 2024, 8:57 PM · cloud-services-team, Cloud-VPS (Debian Buster Deprecation), Research

Isaac closed T367444: Replace or remove Debian Buster VMs in 'wmf-research-tools' cloud-vps project, a subtask of T186519: Request creation of "wmf-research-tools" VPS project, as Resolved.

Jul 10 2024, 8:56 PM · cloud-services-team (Kanban), Research, Cloud-VPS (Project-requests)

Isaac added a comment to T367873: Technical exploration to support topic-based suggestions with the current Recommendation API.

the usecase that you described (e.g., intersection between similar articles to a given one, and a set of topics) seems quite relevant, and something we were planning to support in the UI.

Great to hear!

Jul 10 2024, 6:49 PM · LPL Hypothesis, ContentTranslation

Isaac awarded T358604: Update labpawspublic extension to jupyterlab 4 system a Love token.

Jul 10 2024, 1:12 PM · PAWS

Jul 9 2024

Isaac added a comment to T367873: Technical exploration to support topic-based suggestions with the current Recommendation API.

Commenting here rather than on the above patch because it's a high-level question that I thought Pau might have thoughts about too.

Jul 9 2024, 8:26 PM · LPL Hypothesis, ContentTranslation

Isaac added a comment to T365347: Update endpoints used in Content and Section Translation to use the LiftWing version of the Recommendation API.

@ngkountas I'm looking to deprecate the old wmflabs endpoint this week if possible. Based on a quick on-wiki test, it seems that Content Translation is now using the LiftWing API but I wanted to check first before I take down that instance.

Jul 9 2024, 8:01 PM · MW-1.43-notes (1.43.0-wmf.14; 2024-07-16), LPL Essential (LPL Essential 2024 Jul-Sep), Unplanned-Sprint-Work, ContentTranslation

Isaac added a comment to T360455: Add Article Quality Model to LiftWing.

I was wondering if you'd be open to use a gradient boosting regressor model (xgboost, catboost, lightgbm) so that we don't have to do much feature preprocessing (normalization). In this case we wouldn't need to maintain the feature values (the one we have now in the csv) and model maintenance/updates would be easier. wdyt?

Jul 9 2024, 3:09 PM · Patch-For-Review, Content-Transform-Team, Research, Machine-Learning-Team

Jul 2 2024

Isaac added a comment to T360455: Add Article Quality Model to LiftWing.

Just commenting that this is very exciting! I'll be working on getting you all the final model etc. in the next week or two.

Jul 2 2024, 9:29 PM · Patch-For-Review, Content-Transform-Team, Research, Machine-Learning-Team

Isaac moved T369120: Determine evaluation strategy for article-country model from Backlog to FY2024-25-Research-July-September on the Research board.

Jul 2 2024, 9:28 PM · OKR-Work, Research (FY2024-25-Research-July-September)

Isaac created T369120: Determine evaluation strategy for article-country model.

Jul 2 2024, 9:27 PM · OKR-Work, Research (FY2024-25-Research-July-September)

Andrew awarded T367444: Replace or remove Debian Buster VMs in 'wmf-research-tools' cloud-vps project a Like token.

Jul 2 2024, 9:03 PM · cloud-services-team, Cloud-VPS (Debian Buster Deprecation), Research

Isaac added a comment to T354559: Put together diff blogpost on AI + Wikimedia + Datasets.

There is another initiative in Wikimedia that intends to engage in natural language generation involving Wikidata statements, but with algorithms – Abstract Wikipedia. I'm curious, are there any points of contact between these areas of effort?

Jul 2 2024, 9:01 PM · Essential-Work, Research (FY2024-25-Research-July-September)

Isaac added a comment to T367873: Technical exploration to support topic-based suggestions with the current Recommendation API.

Thanks Santhosh and Pau for kicking this off! Commenting here what I put in Slack for ease of access / transparency and I added a bit more detail. The specific updates that I'd love to see made as part of this clean-up (beyond removing unused code and general modernization/standardization):

Flip ranking order (it's currently "backwards") -- see: T293648#9284550
No longer gather each candidate article's set of claims from Wikidata as they're not being used -- see: T347475#9226750 and T347475#9239002
Consider reducing down the number of candidates that are checked on Wikidata for inclusion from 500 to something more dynamic that cuts off the process when enough candidates have been found -- also see: T347475#9226750.
- The current process makes a search query (params + code) to gather 500 candidate articles for translation. And then for each of these candidates, it applies a few filters to find which already exist in the target language and remove disambiguation/list articles (though the List filter is pretty basic). When the API was being ported to LiftWing, we experimented with reducing the number of candidate articles to check from 500 down to e.g., 250 but didn't see much change in latency because the API calls for each chunk of 50 candidates are done in parallel.
- What I'd suggest considering instead is moving the "does this article exist in the target language" filter (which is the main one for removing candidates) to the original Search API call instead of relying on Wikibase API. So instead of just requesting articles that are morelike the seed and then making additional API calls to filter them, you could make the morelike API call a generator and use the langlinks API to filter out articles that already exist in the target language. In parallel, you could also filter out disambiguation pages with a similar API call. For example, if you were looking for articles like "Banana" to translate from English -> German, you could do something like:
  - Similar articles (and whether they already exist in German): https://en.wikipedia.org/w/api.php?action=query&generator=search&format=json&gsrnamespace=0&gsrwhat=text&gsrsearch=morelike:Banana&gsrlimit=max&prop=langlinks&lllang=de&lllimit=max
  - Same set of similar articles but check for disambiguation pages and their Wikidata ID: https://en.wikipedia.org/w/api.php?action=query&generator=search&format=json&gsrnamespace=0&gsrwhat=text&gsrsearch=morelike:Banana&gsrlimit=max&prop=pageprops&ppprop=wikibase_item|disambiguation
- Then you probably don't need any calls to the Wikibase API unless you want to know how many sitelinks an article already has, but at least at that point it's only for the final result set (which in practice is much smaller). You also don't necessarily need to request all 500 candidates at once if you don't want to and could easily use the Search continuation parameters to e.g., do chunks of 100 and only get more as needed.
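A rough sketch of the filtering suggested above, applied to the JSON returned by the Search generator. Combining langlinks and pageprops into a single request is an assumption (the example URLs issue them separately), and the function names are hypothetical rather than taken from the recommendation API code. Because lllimit applies across the whole result set, a real client may also need to follow continuation before trusting that a page has no langlink.

```python
from typing import Any


def build_params(seed: str, target_lang: str, chunk: int = 100) -> dict[str, str]:
    """Search-generator query that also returns langlinks and page props."""
    return {
        "action": "query",
        "format": "json",
        "generator": "search",
        "gsrnamespace": "0",
        "gsrwhat": "text",
        "gsrsearch": f"morelike:{seed}",
        "gsrlimit": str(chunk),
        "prop": "langlinks|pageprops",
        "lllang": target_lang,
        "lllimit": "max",
        "ppprop": "wikibase_item|disambiguation",
    }


def untranslated_candidates(response: dict[str, Any]) -> list[str]:
    """Titles with no langlink to the target language that are not disambiguation pages."""
    pages = response.get("query", {}).get("pages", {})
    # format=json keys pages by page ID; formatversion=2 returns a list instead.
    page_list = list(pages.values()) if isinstance(pages, dict) else list(pages)
    keep = []
    for page in page_list:
        if page.get("langlinks"):
            continue  # already exists in the target language
        if "disambiguation" in page.get("pageprops", {}):
            continue  # disambiguation page, not a translation candidate
        keep.append(page["title"])
    return keep
```

Requesting in chunks (the chunk parameter here) and stopping once enough candidates survive the filter is the part that would replace the fixed 500-candidate check.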

Jul 2 2024, 8:37 PM · LPL Hypothesis, ContentTranslation

Isaac updated the task description for T361637: Support for topic infrastructure work.

Jul 2 2024, 6:54 PM · OKR-Work, Research (FY2024-25-Research-July-September)

Isaac closed T367551: Cloud VPS "research-collaborations-api" project Buster deprecation as Resolved.

Being bold and resolving. Both instances have been migrated and when the report next updates (sometime after 2024-07-02 16:24:57), it should show no more Buster instances: https://os-deprecation.toolforge.org/buster/research-collaborations-api.html

Jul 2 2024, 6:46 PM · Research, Cloud-VPS (Debian Buster Deprecation)

Isaac updated the task description for T367551: Cloud VPS "research-collaborations-api" project Buster deprecation.

Jul 2 2024, 4:34 PM · Research, Cloud-VPS (Debian Buster Deprecation)

Isaac updated subscribers of T367444: Replace or remove Debian Buster VMs in 'wmf-research-tools' cloud-vps project.

Thanks for flagging this @Andrew -- apologies, I hadn't seen something from StrikerBot so this hadn't been on my radar. Overview:

Live list of projects that will need to be migrated or deleted: https://os-deprecation.toolforge.org/buster/wmf-research-tools.html
Details for how to migrate (mainly create a new instance and copy stuff over if you still need it): https://wikitech.wikimedia.org/wiki/News/Buster_deprecation
Following other tasks, we're aiming for July 15th. If any of us need more time, please comment below and why (otherwise instances might be shut off).

Jul 2 2024, 3:50 PM · cloud-services-team, Cloud-VPS (Debian Buster Deprecation), Research

Jun 28 2024

Isaac added a comment to T354559: Put together diff blogpost on AI + Wikimedia + Datasets.

Weekly updates:

Began iterating based on Miriam's feedback -- we'll discuss next steps in our 1:1 next week
Talked with DE on Future Audiences to get his experience with using LLMs for editor-oriented tasks on-wiki and sense of the state of benchmarks. A few notes:
- Generally he agreed on the lack of editor-oriented benchmarks for LLMs and that that leaves him and others in a state of making choices based on anecdotal data or practical considerations as opposed to choosing models that should be the most performant
- He raised the possibility of the various plugins themselves being used to gather some groundtruth data for these sorts of benchmarks
- He emphasized the value of natural-language or Wikipedia content -> Wikidata statements as a key editing workflow to support via LLMs
- Mentioned importance of linking facts with their relevant citations / external sources as a key challenge to editors / tooling in this space too
- We discussed a bit what it would mean to evaluate LLMs on their ability to adhere to NPOV and how this could be heavily affected by prompt design and whether a binary "this violates NPOV" edit diff task would be sufficient or not

Jun 28 2024, 3:48 PM · Essential-Work, Research (FY2024-25-Research-July-September)

Isaac added a comment to T361637: Support for topic infrastructure work.

Weekly updates:

Left some feedback for Language as they begin their explorations for topics on Content Translation
Working on some clean-up steps for the quality model so it doesn't get stuck in staging on LiftWing
Otherwise waiting for Q1 to kick off to begin planning for eval of the article-country model

Jun 28 2024, 3:04 PM · OKR-Work, Research (FY2024-25-Research-July-September)

Isaac added a comment to T360572: Extend Article Quality Model to use HTML.

Weekly update:

Wrapped up internship. Waiting on final model comparison to make decision on which to host. I'll need to also run the job to compute normalization scores for all wikis (not just the six we used in training) but that should be reasonably straightforward to automate and do because I already ran English and that's the biggest by far.

Jun 28 2024, 2:57 PM · Essential-Work, Research, Epic

Jun 27 2024

Isaac added a comment to T351118: [Research Engineering Request] Produce regular snapshots of all Wikipedia article topics.

@Isaac can we close this task? Anything you see that's not completed yet?

Generally I checked the parquet files mentioned above and they looked great as far as largely matching up with descriptive stats from past topic datasets! Two clarifying questions for @MunizaA before we do close out:

I presume we're keeping the most recent snapshot and not storing prior runs? If so, that makes sense to me. I could see justification for storing maybe the previous snapshot too (just to be able to easily detect changes if desired) but I see no reason for storing the topics from older runs.
Sorry I didn't spot this earlier but can we align with the model currently being used by LiftWing (assuming this is the model used by the DAG)?

Jun 27 2024, 8:02 PM · Research-engineering, Research

Isaac added a comment to T113257: Custom translation suggestions: Find opportunities to translate in topic areas selected by the user.

Just a few comments on the technical side in case they're helpful. I'm generally excited to see this moving forward!

We are also exploring making available a quality model on LiftWing (T360455). Not a topic per se but a filter that might also be exposed if Content Translation (or other products) had a clear use-case for it.
The topic taxonomy should be evolving a bit over the first few quarters of next year. Some of that will be smaller changes to the existing articletopic filters (add one here, remove one there) but the main thing to know is that the geographic topics will shift from the current set of regions to countries. There are obviously a ton of countries but also hopefully we can take advantage of the hierarchy (regions, continents) to make them available without overwhelming the UI?
For topic filters, I'm just confirming that you can do both AND and OR queries as needed -- e.g., Southern Africa + architecture as an OR (this example makes less sense but e.g., Southern Africa + Western Africa maybe makes more sense as an OR) and the same two topics as an AND (which does make sense for identifying buildings in Southern Africa).
As far as making the code changes to the recommendation API to support new behavior, I see that as relatively straightforward. The topics are supported in every language edition and so it's really just a question of passing the user-selected topics to the API to implement as filters when it does the search to build the initial set of candidate articles (that are then filtered down based on whether they exist in the target wiki or not). I'm not the person to do that but it should be relatively straightforward. At that point, I'd also recommend allotting a little engineer time to just improving the API. ML Platform smartly focused on just hosting it as-is but there are some pretty commonsense improvements that should speed it up (T347475#9226750) and improve the result quality (T293648#9284550).
And huge thanks for the 2. Direct access to filtered suggestions: URL parameters and persistence aspect!
Regarding 6. External lists to support group translation (Campaigns, Wiki projects...), we aren't there yet where these worklists are standardized in such a way that they're accessible via tooling, but my goal for whatever team tackles this challenge is to have them also slurped up into the Search Platform so we can incorporate them as tags just like any other topic filter. There will obviously be too many to show to the user, but perhaps the communities can curate a few that they want elevated at any given time or they could also be searchable.
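For the AND vs. OR topic filters mentioned above, a small sketch of how the Search query strings might be assembled. It assumes the articletopic keyword accepts pipe-separated values for OR and repeated keywords for AND, and that the topic identifiers are southern-africa and architecture; the helper itself is hypothetical and not part of the recommendation API.

```python
def topic_filter(topics: list[str], mode: str = "AND") -> str:
    """Build an articletopic search filter for the given topics.

    OR  -> one keyword with pipe-separated topics.
    AND -> one keyword per topic, space-separated.
    """
    if not topics:
        return ""
    if mode == "OR":
        return "articletopic:" + "|".join(topics)
    if mode == "AND":
        return " ".join(f"articletopic:{topic}" for topic in topics)
    raise ValueError(f"mode must be 'AND' or 'OR', got {mode!r}")
```

For example, topic_filter(["southern-africa", "architecture"], "AND") would target buildings in Southern Africa, while the OR form would return articles in either topic.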

Jun 27 2024, 12:21 PM · Epic, CX-boost, OKR-Work, Design

Jun 25 2024

Isaac added a comment to T367551: Cloud VPS "research-collaborations-api" project Buster deprecation.

Oh, the current instance also has access to the dumps so we never downloaded files over the internet.

Oh great, this makes it even easier. I just enabled on the new wikinav-bookworm instance so we should be good to go!

Jun 25 2024, 5:02 PM · Research, Cloud-VPS (Debian Buster Deprecation)

Isaac closed T368432: Request access to NFS mount /public/dumps for research-collaborations-api Cloud VPS project as Resolved.

Actually sorry I realize that the project already has access, I just needed to enable in hiera. Sorry for spam :)

Jun 25 2024, 5:02 PM · VPS-Projects, Data-Services

Isaac created T368432: Request access to NFS mount /public/dumps for research-collaborations-api Cloud VPS project.

Jun 25 2024, 4:57 PM · VPS-Projects, Data-Services

Jun 24 2024

Isaac added a comment to T321224: Wikidata Item Quality Model.

just double checking - what is the status of this? Should we close this / move to freezer? Any update we can add here?

@Miriam thanks for checking - this seems to be a victim of my sabbatical last year. Summary of where we are at:

I was feeling pretty good about where the model was and had an API (example) and bulk cluster job (code) ready.
The bulk analysis raised one issue that I wanted to address - how to handle items that are subclasses but that do not have instance-of properties. I added some logic to create an expectation for any item that has a subclass property but I don't think it's great so I'd want to continue to iterate on that. That said, it affects a very small proportion of items.
We wanted to do an evaluation with Wikidata editors to see if this model does a better job of meeting their expectations than the approach taken by the original ORES itemquality model.

Jun 24 2024, 9:21 PM · Essential-Work, Movement-Insights, Research, Linked-Open-Data-Network-Program, Wikidata

Isaac added a comment to T367551: Cloud VPS "research-collaborations-api" project Buster deprecation.

I've containerized these and added a docker-compose.yml file (PR here) so that all this can be easily deployed on any instance that has docker and really only takes a single command to do so, though note that I haven't touched any application code.

Makes sense to me. I haven't worked with Docker so you might need to do a brief walkthrough with me to understand how to deploy with docker on Cloud VPS but that's useful knowledge for me if you don't mind that extra overhead. I went ahead and created a new instance (wikinav-bookworm.research-collaborations-api) that's the same RAM/CPU but new OS just to reserve the space but if it's the wrong flavor etc., don't hesitate to delete and create a new one.

Jun 24 2024, 5:17 PM · Research, Cloud-VPS (Debian Buster Deprecation)

Jun 21 2024

Isaac added a comment to T360572: Extend Article Quality Model to use HTML.

Weekly updates:

None -- waiting on final model comparison before deciding what should be hosted on LiftWing. ML Platform did get an initial version on staging, which is an exciting step towards deployment!

Jun 21 2024, 4:31 PM · Essential-Work, Research, Epic

Isaac added a comment to T354559: Put together diff blogpost on AI + Wikimedia + Datasets.

Weekly updates:

Received feedback from Miriam but have not begun to process/iterate yet

Jun 21 2024, 4:30 PM · Essential-Work, Research (FY2024-25-Research-July-September)

Isaac added a comment to T361637: Support for topic infrastructure work.

Weekly updates:

Supported interviews for contract role that will in part help with evaluating the list-building tool and related functionality
Updated Meta page for country hypothesis to include status updates and shared on Annual Plan: https://meta.wikimedia.org/wiki/Research:Language-Agnostic_Topic_Classification/Countries
Quality model is now hosted on LiftWing staging which is a big step towards having quality scores available in our infrastructure to use as an additional filter around list-building etc.

Jun 21 2024, 4:29 PM · OKR-Work, Research (FY2024-25-Research-July-September)

Jun 20 2024

Isaac added a comment to T308164: Migrate Content Translation Recommendation API to Lift Wing.

Thanks for digging this up @kevinbazira ! I glanced through and much of it was the Content Translation Extension (which Language is working on porting) or copies of configuration from that repo. I did leave a message around the beta-labs settings from CodeSearch because that felt like something that should be removed when appropriate (T365347#9910134) and could possibly be missed. The only piece I wasn't sure about was the uMatrix code. And then maybe some other random uses but they all seemed to be older, unmaintained repos. Feel free to reach out obviously where you feel useful but based on the searches you pulled together, I feel like we're in a pretty good place!

Jun 20 2024, 3:06 PM · Language-Team, Machine-Learning-Team, Epic
