
Add original image and thumbnail parsing for an article
Closed, Resolved | Public | 8 Estimated Story Points

Description

User Story: "As a WME product team we want to have an original image and thumbnail to be added to the article response."

Acceptance criteria

  1. Both the thumbnail and the original image are fetched for a particular article.
  2. The thumbnail and the original image are added to the Kafka stream by the structured data services (article-update and article-bulk).
  3. All related unit and integration tests pass successfully.
  4. For product clients of all services, including On-demand, Realtime Batch, Snapshots, and Realtime Streaming, the response has the following structure:
{
 "article" : {
   "image": {
     "contentUrl": "https://upload.wikimedia.org/wikipedia/commons/3/3e/Einstein_1921_by_F_Schmutzer_-_restoration.jpg",
     "width": 2523,
     "height": 3313,
     "thumbnail": {
       "contentUrl": "https://upload.wikimedia.org/wikipedia/commons/thumb/3/3e/Einstein_1921_by_F_Schmutzer_-_restoration.jpg/300px-Einstein_1921_by_F_Schmutzer_-_restoration.jpg",
       "width": 300,
       "height": 394
     }
   }
 }
}

To accomplish that, we need to make several adjustments to our services and schema:

ToDo

  • Adjust JSON schema documentation by adding image representation for an article (It’s actually done during this investigation)
  • Adjust Avro Schema according to the JSON Schema (also done)
  • Adjust WMF API client library so it can request images for an article.

According to the MediaWiki Action API docs, both the original image and the thumbnail can be retrieved by requesting the pageimages property, that is, by adding the following parameter to the API request:

prop=pageimages

To get both the original image and the thumbnail, the request parameters should be extended with the piprop parameter:

piprop=thumbnail|original

By default, the Action API returns a thumbnail scaled to a default size (which differs from page to page). If we need a thumbnail of a specific size, we can add one more parameter to the API request:

pithumbsize=500 (Value “500” corresponds to width of the thumbnail.)
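
The three parameters above can be combined into a single query. The following is a minimal Python sketch, not the WMF client library itself: the endpoint URL, the function name build_page_images_url, and the use of action=query, format=json, formatversion=2, and titles are assumptions added for illustration.

```python
from urllib.parse import urlencode

# Assumed endpoint for illustration; the task does not name a specific wiki.
API_URL = "https://en.wikipedia.org/w/api.php"


def build_page_images_url(titles, thumb_size=None):
    """Build an Action API query URL that asks for page images.

    `titles` is a list of page titles. `thumb_size` is the optional
    thumbnail width in pixels (sent as pithumbsize).
    """
    params = {
        "action": "query",
        "format": "json",
        "formatversion": "2",  # assumption: return pages as a list
        "prop": "pageimages",
        "piprop": "thumbnail|original",
        "titles": "|".join(titles),
    }
    if thumb_size is not None:
        # Omitted by default so the API falls back to its default size.
        params["pithumbsize"] = str(thumb_size)
    return f"{API_URL}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_page_images_url(["Albert Einstein"], thumb_size=500))
```

Leaving pithumbsize optional keeps the default behaviour described above available to callers that do not care about the thumbnail width.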

Based on this, we can adjust our WMF client library so that it fetches and returns both the original image and the thumbnail. The unit tests should also be adjusted.
Note: We can add required props to GetPages method or add new method.
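
To illustrate the mapping the client library would have to perform, here is a hedged Python sketch that converts one page entry into the "image" structure from the acceptance criteria. The function name to_article_image is hypothetical, and the sample page is hand-written: it assumes each page entry carries "original" and "thumbnail" objects with "source", "width", and "height" keys, and reuses the values from the example response above.

```python
def to_article_image(page):
    """Map one page entry to the article "image" structure.

    Returns None when the page has neither an original image nor a thumbnail.
    """
    def convert(entry):
        # Assumed input keys: source, width, height.
        return {
            "contentUrl": entry["source"],
            "width": entry["width"],
            "height": entry["height"],
        }

    original = page.get("original")
    thumbnail = page.get("thumbnail")
    if original is None and thumbnail is None:
        return None

    image = convert(original) if original else {}
    if thumbnail:
        image["thumbnail"] = convert(thumbnail)
    return image


# Hand-written sample entry (not a recorded API response).
SAMPLE_PAGE = {
    "title": "Albert Einstein",
    "original": {
        "source": "https://upload.wikimedia.org/wikipedia/commons/3/3e/Einstein_1921_by_F_Schmutzer_-_restoration.jpg",
        "width": 2523,
        "height": 3313,
    },
    "thumbnail": {
        "source": "https://upload.wikimedia.org/wikipedia/commons/thumb/3/3e/Einstein_1921_by_F_Schmutzer_-_restoration.jpg/300px-Einstein_1921_by_F_Schmutzer_-_restoration.jpg",
        "width": 300,
        "height": 394,
    },
}

if __name__ == "__main__":
    print(to_article_image(SAMPLE_PAGE))
```

Returning None for pages without images lets callers omit the "image" field entirely instead of emitting an empty object.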

  • Adjust the structured data service so that it adds both the thumbnail and the original image to a Kafka message. To do that, we need to adjust the Aggregate package (by adding a GetPageImages method as an option) and use it in the article-update and article-bulk handlers. The unit tests should also be adjusted to cover the new functionality.
  • Remove thumbnail from json
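
The idea of exposing page images as an aggregation option can be sketched roughly as follows. This is an illustrative Python analogue only, not the Aggregate package: the names aggregate and with_page_images, the fetch_images callable, and the dictionary-based result are all assumptions.

```python
def with_page_images(fetch_images):
    """Return an aggregation option that attaches images to each title.

    `fetch_images` is a hypothetical callable that takes a list of titles
    and returns a dict mapping title -> image structure.
    """
    def option(titles, result):
        images = fetch_images(titles)
        for title in titles:
            image = images.get(title)
            if image is not None:
                result[title]["image"] = image
    return option


def aggregate(titles, *options):
    """Collect per-title data by applying each enabled option in order."""
    result = {title: {} for title in titles}
    for option in options:
        option(titles, result)
    return result


if __name__ == "__main__":
    # Stand-in fetcher so the sketch runs without network access.
    def fake_fetch(titles):
        return {"Albert Einstein": {"contentUrl": "https://example.org/e.jpg",
                                    "width": 10, "height": 20}}

    print(aggregate(["Albert Einstein", "No image"], with_page_images(fake_fetch)))
```

With this shape, handlers such as article-update and article-bulk would opt in by passing the option, and articles without an image simply carry no "image" field.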

Event Timeline

Protsack.stephan renamed this task from Add original image and thumbnail parsing for an article. to Add original image and thumbnail parsing for an article. Mar 27 2023, 1:25 PM
AnnaMikla set the point value for this task to 8. Mar 29 2023, 6:01 PM