
Technical investigation for learning from extensions that use external APIs [8H]
Closed, Resolved, Public, Aug 25 2020

Description

Goal

Technical investigation to understand how Extension:MachineVision and Popups work, so that we can model our own work on getting IP information from external APIs after them.
What can we learn from them?

  • UI (note that we might not always use hover)
  • API

Details

Due Date
Aug 25 2020, 4:00 AM

Event Timeline

Niharika triaged this task as Medium priority. Aug 5 2020, 4:31 PM
Niharika created this task.
ARamirez_WMF renamed this task from Technical investigation for learning from extensions that use external APIs to Technical investigation for learning from extensions that use external APIs [8H]. Aug 6 2020, 4:44 PM
ARamirez_WMF changed the subtype of this task from "Task" to "Deadline".

The IP info tool will likely need to call an external provider (or providers) to fetch information about IP addresses. The MachineVision extension calls an external API to fetch annotations for images. What can we learn from MachineVision?

What does MachineVision do?

  • A hook handler onUploadComplete requests annotations after a file is uploaded
  • A config defines the available handlers for requesting annotations
  • Each handler creates a new job, added to the JobQueueGroup, run asynchronously (currently in production we only use one job)
  • When the job is run, it builds the request via HttpRequestFactory
  • A config defines a proxy for making the request
  • If the request is unsuccessful, or there are no useful results, a warning is logged
  • If the request is successful, the results are stored in the database, and the user is notified
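
To make the flow above concrete, here is a minimal sketch in Python (not the extension's actual PHP code). Everything in it is an assumption for illustration: the `AnnotationPipeline` class, its method names, and the in-memory queue and store are hypothetical stand-ins for the hook handler, JobQueueGroup, and database. The HTTP request and proxy are left out; each handler is just a callable that returns labels or raises.

```python
import logging
from collections import deque
from dataclasses import dataclass, field
from typing import Callable, Deque, Dict, List, Tuple

logger = logging.getLogger("annotation_sketch")

# Hypothetical: a handler takes a file name and returns labels, or raises on failure.
Handler = Callable[[str], List[str]]


@dataclass
class AnnotationPipeline:
    """Illustrative stand-in for the upload hook, job queue, and storage."""

    handlers: Dict[str, Handler]  # plays the role of the handler config
    queue: Deque[Tuple[str, str]] = field(default_factory=deque)
    stored: Dict[str, List[str]] = field(default_factory=dict)
    notified: List[str] = field(default_factory=list)

    def on_upload_complete(self, file_name: str) -> None:
        # Each configured handler enqueues one job; nothing is fetched yet.
        for handler_name in self.handlers:
            self.queue.append((handler_name, file_name))

    def run_jobs(self) -> None:
        # Jobs run later, outside the upload request.
        while self.queue:
            handler_name, file_name = self.queue.popleft()
            try:
                labels = self.handlers[handler_name](file_name)
            except Exception as exc:
                logger.warning("%s failed for %s: %s", handler_name, file_name, exc)
                continue
            if not labels:
                logger.warning("%s returned nothing useful for %s", handler_name, file_name)
                continue
            # Success: store the results and notify the user.
            self.stored.setdefault(file_name, []).extend(labels)
            self.notified.append(file_name)
```

The point of the sketch is the separation: the upload path only enqueues, and all failure handling happens in the deferred job, where a slow or failing provider does not block the user.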

How is IP info similar?

  • It will need to make requests to external services via the same proxy
  • It will need to handle success/failure, including logging warnings
  • The services to be used should be configurable

How is IP info different?

  • Requests to external services will be initiated by client-side interactions, so the feature will need an API endpoint (as mentioned in T259726)
  • Data will be displayed to the user immediately (rather than by a job that runs later), so waiting for an external API's response may cause performance issues
  • Any failure or timeout will need to be displayed to the user immediately
  • If we want the IP info tool to request data from multiple services (internal and/or external), we'll need to think about asynchronous requests, receiving data at different times, and displaying a mixture of successes and failures
  • The results are not stored anywhere. The same request might be made multiple times by different users or by the same user. Can we temporarily cache results locally for each user?
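
One possible shape for the multiple-services case, sketched in Python with asyncio. This is only an illustration under assumptions: `lookup_ip`, `ProviderResult`, and the provider callables are hypothetical names, and real providers would make HTTP requests through the proxy. Each provider gets its own timeout, and one provider failing or timing out does not hide the results of the others.

```python
import asyncio
from dataclasses import dataclass
from typing import Awaitable, Callable, Dict, Optional

# Hypothetical: a provider takes an IP string and asynchronously returns a dict.
Provider = Callable[[str], Awaitable[dict]]


@dataclass
class ProviderResult:
    provider: str
    ok: bool
    data: Optional[dict] = None
    error: Optional[str] = None


async def _query_one(name: str, provider: Provider, ip: str, timeout: float) -> ProviderResult:
    # Turn every outcome into a value so the caller can show partial results.
    try:
        data = await asyncio.wait_for(provider(ip), timeout=timeout)
        return ProviderResult(name, True, data=data)
    except asyncio.TimeoutError:
        return ProviderResult(name, False, error="timeout")
    except Exception as exc:
        return ProviderResult(name, False, error=str(exc))


async def lookup_ip(ip: str, providers: Dict[str, Provider], timeout: float = 2.0) -> Dict[str, ProviderResult]:
    # Query all providers concurrently; total wait is roughly the slowest one,
    # capped by the per-provider timeout.
    results = await asyncio.gather(
        *(_query_one(name, provider, ip, timeout) for name, provider in providers.items())
    )
    return {result.provider: result for result in results}
```

A caller would run something like `asyncio.run(lookup_ip("192.0.2.1", providers))` and render each entry as either data or an error message. If results should appear as they arrive rather than all at once, `asyncio.as_completed` could replace `gather`, at the cost of a slightly more involved response format.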

I focused this investigation on making requests to external APIs, as per the title/description. I suspect there is also a lot to be learned about the UI from Popups and other similar features; that work has been split out into T260377: Technical investigation into features that use popups to display information [8H].

Summary from team discussion:

  • The approach we take will be dependent on the restrictions of any contracts we have with the 3rd parties (e.g. whether we can store any data)
  • The performance review (T260821) may also raise constraints, e.g. whether it's acceptable to keep a connection open while we wait for a 3rd-party response
  • Action: Until we know the contract details, our technical approach should make minimal assumptions about what we're allowed to do (e.g. don't assume we can store data)
  • Action: Expect to reconsider our approach depending on which contracts get agreed and what the performance constraints turn out to be
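
As an illustration of "minimal assumptions", here is a hedged Python sketch of a short-lived cache that is off by default and only stores anything when explicitly enabled. The class name `OptionalTTLCache` and its parameters are hypothetical; whether any caching is allowed at all depends on the contract terms discussed above.

```python
import time
from typing import Any, Callable, Dict, Hashable, Optional, Tuple


class OptionalTTLCache:
    """In-memory cache with expiry; stores nothing unless enabled."""

    def __init__(
        self,
        enabled: bool = False,  # default: assume we may NOT store data
        ttl_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._enabled = enabled
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[Hashable, Tuple[float, Any]] = {}

    def get(self, key: Hashable) -> Optional[Any]:
        if not self._enabled:
            return None
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            # Expired: drop the entry so stale data is never returned.
            del self._entries[key]
            return None
        return value

    def put(self, key: Hashable, value: Any) -> None:
        if not self._enabled:
            return
        self._entries[key] = (self._clock() + self._ttl, value)
```

With the flag off, calling code can stay the same whether or not storage turns out to be permitted; only the configuration changes once the contracts are known.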

This documentation exists, though it is not completely up to date: https://www.mediawiki.org/wiki/Wikimedia_Product/Machine_vision_middleware
If we have more specific questions, please let me know. Infrastructure & SD teams have internal notes and the PgM can get those for us.