
[Session] Enterprise For All
Closed, Resolved (Public)

Description

(Please set yourself as task assignee of this session)

  • Title of session: Enterprise For All
  • Session description: An introduction to Enterprise APIs + an open discussion on what community devs could need from our APIs
  • Username for contact: FNavas-WMF & prabhat
  • Session duration (25 or 50 min): 40 min
  • Session type (presentation, workshop, discussion, etc.): Discussion
  • Language of session (English, Arabic, etc.): English
  • Prerequisites (some Python, etc.): API
  • Any other details to share?:
  • Interested? Add your username below:

Notes from session:

Enterprise For All

Date and time: 2024-05-03, 15:00-15:30

Relevant links

Presenter

[[metawiki:User:FNavas-WMF|FNavas-WMF]]
[[metawiki:User:PTiwary_(WMF)|PTiwary_(WMF)]]

Participants

Notes

text from slides

  • What is Enterprise? SLA grade APIs
  • Using Wikimedia content in a third-party environment at high volume carries challenges:

Speed
Usability
Machine Readability
Content Integrity

  • Wiki content and data serve as a backbone for training LLMs, and as information-retrieval sources or identifiers in knowledge graphs. And for you, too (!)
  • A Goal of Universal Usability
  • WME APIs

Authentication API
Metadata APIs

    • Snapshot API
    • On-demand API
    • Structured Contents API (Beta)
  • On-demand API

Used to fetch articles in their latest revision/version from all supported projects and languages.
878 projects are supported by WME APIs (as of Mar 2024)
Each article follows a consistent schema [1].
Allows filtering and field selection.
Allows limiting the number of articles returned in cross-project, cross-language lookups.
Refer to the examples here [2] and documentation here [3].
10,000 total requests per account (community members can create new accounts)
10 requests/sec rate limit
[1] Schema : https://gitlab.wikimedia.org/repos/wme/wikimedia-enterprise/-/blob/main/general/schema/article.go
[2] SDK : https://gitlab.wikimedia.org/repos/wme/wme-sdk-go
[3] Docs: https://enterprise.wikimedia.com/docs/
[4] Structured-contents use case: https://wikimedia-enterprise.github.io/structured-contents-use-case/
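
The two limits noted above (10 requests/sec and 10,000 total requests per account) can be respected on the client side. The following is a minimal sketch, not part of the WME SDK: the class name RequestBudget and its parameters are hypothetical, and it only spaces out calls and counts them locally. It does not talk to the API.

```python
import time


class RequestBudget:
    """Hypothetical client-side guard for a per-second rate limit and a total quota."""

    def __init__(self, per_second=10, total=10_000,
                 clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 1.0 / per_second  # minimum spacing between requests
        self.remaining = total                # requests left in the local count
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = None             # earliest time the next request may go out

    def acquire(self):
        """Wait until a request may be sent; raise once the quota is used up."""
        if self.remaining <= 0:
            raise RuntimeError("request quota exhausted")
        now = self._clock()
        if self._next_allowed is not None and now < self._next_allowed:
            self._sleep(self._next_allowed - now)
            now = max(self._clock(), self._next_allowed)
        self._next_allowed = now + self.min_interval
        self.remaining -= 1
```

Calling budget.acquire() immediately before each HTTP request keeps a single-threaded script under both limits. The local count is only an estimate of what the server has recorded for the account.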

  • Credibility Signals
  • Credibility Signals support practices of data validation. They are not declarative.
  • Existing Credibility Signals
    • version-level signals

version.comment
version.tags
version.is_minor_edit
version.maintenance_tags
version.is_flagged_stable
version.scores
Revertrisk
Referencerisk (upcoming)
version.size
version.number_of_characters
version.is_breaking_news
version.noindex
visibility

  • editor-level signals

version.editor.identifier
version.editor.name
version.editor.edit_count
version.editor.groups
version.editor.is_bot
version.editor.anonymous
version.editor.date_started
version.editor.is_patroller
version.editor.has_advanced_rights
version.editor.is_admin
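
Since the signals are not declarative, a consumer has to decide what to do with them. The sketch below shows one possible validation pass over a few of the signals listed above. It assumes the dotted names map to nested dictionary keys in a parsed article record. The helper names, the edit-count threshold, and the sample record are made up for illustration and do not come from the session.

```python
def get_path(record, dotted, default=None):
    """Look up a dotted path such as 'version.editor.is_bot' in nested dicts."""
    node = record
    for key in dotted.split("."):
        if not isinstance(node, dict) or key not in node:
            return default
        node = node[key]
    return node


def review_reasons(article, min_edit_count=50):
    """Return reasons a version might deserve a closer look (thresholds are assumptions)."""
    reasons = []
    if get_path(article, "version.is_breaking_news"):
        reasons.append("breaking news")
    if get_path(article, "version.editor.anonymous"):
        reasons.append("anonymous editor")
    count = get_path(article, "version.editor.edit_count")
    if count is not None and count < min_edit_count:
        reasons.append("low editor edit count")
    return reasons


# Made-up record for illustration only.
sample = {
    "version": {
        "is_breaking_news": True,
        "editor": {"anonymous": False, "edit_count": 12},
    }
}
print(review_reasons(sample))  # ['breaking news', 'low editor edit count']
```

Missing fields are treated as "no signal" rather than as an error, so the same check works when field selection has trimmed the record.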

  • Demo

Questions

Q: What is the relation to Action API etc?
Our APIs are independent of those.

Q: Which use cases have you identified for the features that are implemented?
A lot of it came from existing research on Wikipedia. Some of it comes from customer feedback, for example the breaking news signal.

Q: Which projects are you supporting? Also Wikidata?
We currently support 8 projects (Wikipedia, Wikisource, ...), but not Wikidata or Commons.

Details

Other Assignee
prabhat

Event Timeline

debt subscribed.

Hello! 👋 The 2024 Hackathon Program is now open for scheduling! If you are still interested in organizing a session, you can claim a slot on a first-come, first-serve basis by adding your session to the daily program, following these instructions. We look forward to hearing your presentation!

debt triaged this task as Medium priority. Apr 19 2024, 5:30 PM

As a demo for the presentation, maybe show MR API calls to create a dataset that is used in the LLM RAG demo? See Python code and readme: https://gitlab.enterprise.wikimedia.com/wikimedia-enterprise/experiments/for-blog-llm-rag

Olea updated the task description.
Olea subscribed.