Ethical ML: Establish Initial Guidance
Closed, Resolved (Public)

Description

This task focuses on leveraging our research skills to help the Wikimedia community (most directly WMF but also the broader editor community) adapt to the new AI capabilities emerging via a few complementary streams of work:

  • Community engagement: beginning more engagement with community members to better understand potential use-cases and pain points / concerns with using AI on the Wikimedia projects. The core focus of this will be around the Wikimedia Hackathon in May.
  • Guidance on 3rd-party ML: I'm helping to chart out a path for how WMF should work with 3rd-party ML services and working on providing guidance that can help product teams in making good decisions around these services.
  • Piloting new models: most relevant is my work supporting the Android team in deploying a generative AI model for Wikidata descriptions but this line generally will consider what other prototypes can help us better understand how to appropriately incorporate these more powerful models into Wikimedian workflows.

Note that this goal originally was going to have a component that focused on data gaps but that work will now hopefully occur as part of my edit types project. This shift has been made in response to the emergent need to address questions / interest around generative AI over the past few months and shifts in annual planning.

Event Timeline

Weekly updates:

  • Worked with Leila to generate some remaining questions for Human Rights folks about how that policy might be used to support the ethical ML space. They're out this week, but hopefully we'll get responses next week that allow us to move forward.

Weekly updates:

  • Moving forward with Human Rights approach -- waiting to hear on next steps with them.
  • Android pilot seems to be going well -- been monitoring VPS instance to make sure it stays up and will do some evaluation of the edits to get familiar with any issues that are popping up in usage.
  • Started working on session for hackathon. Initial focus is on something like WikiGPT but for Wikitech Help namespaces, both as a potentially useful tool for developers there and to showcase what's possible with open-source tech. Example: https://public-paws.wmcloud.org/User:Isaac_(WMF)/hackathon-2023/wikitech-natural-language-search.ipynb

Weekly updates:

  • Reviewing first draft of Human Rights checklist
  • Reviewed some of the enwiki edits from the Android pilot and all were looking reasonable.
  • Continued work to pull together best practices / tips around hosting ML on cloud services.

Weekly updates:

  • Participated in AI + Wikimedia panel at WikiWorkshop
  • Figured out issue with Hackathon demo (cloud vps configuration) and so that is working now (current endpoint: https://wikitech-search.wmcloud.org/docs)! Working on putting together learnings now for the session.

Weekly updates:

  • Participated in Hackathon and processing outcomes from that!
  • Put in a few patches for wikigpt plugin to improve logging so we can better analyze the quality of the different search options
  • Provided feedback on AI Human Rights checklist and signed up to share out with team in two weeks

Weekly updates:

  • Working on iterative feedback sessions on Human Rights Impact checklist. TODOs to help craft a few and potentially do a pilot implementation with some of Diego's models
  • Put together some patches for plugin to help improve quality of internal Wikipedia Search Results in anticipation of doing some testing of how well it does. Early indications are that it does just fine as ChatGPT is generally passing a standard list of keywords as opposed to the raw user question

Weekly updates:

  • Presented overview of space to team during last Tuesday meeting (slides)
  • Android article-description model showing initially promising results! I provided some feedback on the analyses, and there are still some per-language analyses for them to do, but it was a promising check-in. Looking into an issue around generating false dates that I think might stem from the training approach but should be easily patchable.
  • Provided suggested questions for open-licensed models and freedom to publish outputs under an open license for the Human Rights checklist.

Weekly updates:

  • Worked on plugin logging so we'll be able to better evaluate the plugin when it launches.

Resolving this task as I think I have done a good job of establishing initial guidance -- status on the major components:

  • Community engagement: attended community hackathon and learned about importance of summarization and challenges to running AI models on cloud infrastructure. Considering next steps in that area including releasing more AI datasets to encourage greater investment in Wikimedia-related AI challenges and finding ways to establish APIs that the community can use for some of these common tasks.
  • Guidance on 3rd-party ML: continued iteration on human rights AI checklist with intention to pilot with automoderator tool soon.
  • Piloting new models: waiting on final evaluation of machine-generated article descriptions and release of the plugin, but both projects are moving forward and have resulted in a lot of learnings about evaluation of generative AI models and what our gaps are. Gave a presentation to the team highlighting some of these.

This work will continue in the next FY but under a new task.