Add support for sending batched image annotation requests
Closed, Declined (Public)

Description

Currently, we're set up only for sending image annotation requests for a single image at a time.

Google Cloud Vision also supports asynchronous batched requests of up to 2000 images at a time, as described at https://cloud.google.com/vision/docs/batch. We should update GoogleCloudVisionHandler to use it. I anticipate that all of our annotation requests will ultimately use the batch API.

Note: This will require some additional setup, namely creating a Google Cloud Storage bucket to receive results and configuring our handler to retrieve results from it.
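
For illustration only, here is a minimal sketch of how a list of images might be split to respect the 2000-image limit mentioned above. It is plain Python rather than the extension's PHP, and `chunk_images` and `MAX_BATCH_SIZE` are hypothetical names, not part of MachineVision or the Google client library. It makes no API calls.

```python
from typing import Iterator, List, Sequence

# Per-request limit cited in the task description (assumed to still apply).
MAX_BATCH_SIZE = 2000


def chunk_images(
    image_uris: Sequence[str], batch_size: int = MAX_BATCH_SIZE
) -> Iterator[List[str]]:
    """Yield consecutive slices of image_uris, each at most batch_size long."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(image_uris), batch_size):
        yield list(image_uris[start:start + batch_size])
```

Each yielded list would then back one batched request; an empty input yields no batches at all.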

Open questions

  • What happens in case of error retrieving or labeling an image during a batched request?

Event Timeline

Mholloway updated the task description.
Jhernandez subscribed.

Moved to analysis to answer the open questions before going to implementation.

@Mholloway Kaldari says to ping him tomorrow to get the Google Cloud Storage set up.

Mholloway closed this task as Declined. (Edited Oct 23 2019, 6:38 PM)

Ha: it turns out that making batched async requests actually requires putting the images to be annotated into Google Cloud Storage:

Google\ApiCore\ApiException from line 139 of /var/www/mediawiki/extensions/MachineVision/vendor/google/gax/src/ApiException.php: {
    "message": "Invalid URI provided in AnnotateImageRequest. Note that At this time, only plain GCS uris are supported and a gcs image uri has to be put in image.source.image_uri of each request.",
    "code": 3,
    "status": "INVALID_ARGUMENT",
    "details": []
}

We're not doing that.
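
For the record, a rough sketch of the kind of pre-flight check that would have caught this before sending the request. It only mirrors what the error message says (plain GCS URIs are the only accepted image source); how Google actually validates URIs is not known here. It is plain Python rather than the extension's PHP, and `split_by_gcs_support` and both example URIs are hypothetical.

```python
from typing import Iterable, List, Tuple

# Scheme used by Google Cloud Storage URIs.
GCS_SCHEME = "gs://"


def split_by_gcs_support(image_uris: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Partition URIs into (plain GCS URIs, everything else)."""
    accepted: List[str] = []
    rejected: List[str] = []
    for uri in image_uris:
        # Require the gs:// scheme plus something after it (bucket/object).
        if uri.startswith(GCS_SCHEME) and len(uri) > len(GCS_SCHEME):
            accepted.append(uri)
        else:
            rejected.append(uri)
    return accepted, rejected


# Hypothetical inputs: one GCS object and one ordinary HTTPS image URL.
ok, bad = split_by_gcs_support([
    "gs://example-bucket/Example.jpg",
    "https://example.org/images/Example.jpg",
])
```

With these inputs, the HTTPS URL ends up in `bad`, which is the case that triggered the INVALID_ARGUMENT error above.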