
Implement new Content Translation data collection
Open, Medium, Public

Description

The new data collection should be implemented using the new Event Platform. In practice, this means the following steps are different than before:

If the documentation isn't sufficient, @Ottomata is the best person to consult about the workings of the Event Platform.

The data submitted should follow version 1.0.0 of the content_translation_event schema. The schema itself and the schema readme should contain almost all the information necessary to prepare the data for submission (in particular, note that the schema itself contains two examples of valid events at the bottom). For more information on this new schema format, see wikitech:Event Platform/Schemas.

However, here are a few notes that could not be included in the schema for technical reasons:

  • $schema: this is a special field giving the $id of the schema the backend should use to validate the data. In our case, the value should be: /analytics/mediawiki/content_translation_event/1.0.0
  • meta: this object and its subfields (e.g. meta.dt, meta.stream) will automatically be filled in by the backend, so omit it entirely from the submitted data.
  • http: we aren't interested in this object and its subfields, so omit it entirely from the submitted data.
  • dt: this field will be automatically filled in by the client-side EventLogging code, so omit it entirely from the submitted data.
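The notes above can be sketched as a small helper that prepares event data before it is handed to the client-side EventLogging code. This is only an illustration under assumptions: `prepareEvent`, `EventData`, and `placeholder_field` are hypothetical names that do not come from the schema or from EventLogging. Only the `$schema` value and the three omitted field names are taken from this task.

```typescript
// Schema $id, as given in the task description.
const SCHEMA_ID = "/analytics/mediawiki/content_translation_event/1.0.0";

// Fields the task says to omit entirely from the submitted data.
const OMITTED_FIELDS = ["meta", "http", "dt"] as const;

// Loose type for illustration only; the real field list lives in the schema.
type EventData = Record<string, unknown>;

// Hypothetical helper (not part of EventLogging): returns a copy of `fields`
// with `$schema` set and with `meta`, `http`, and `dt` removed.
function prepareEvent(fields: EventData): EventData {
  const event: EventData = { ...fields, $schema: SCHEMA_ID };
  for (const key of OMITTED_FIELDS) {
    delete event[key];
  }
  return event;
}

// Usage: `placeholder_field` stands in for the real schema fields.
const prepared = prepareEvent({
  placeholder_field: "value",
  dt: "2020-01-01T00:00:00Z", // dropped; filled in by client-side EventLogging
  meta: { stream: "ignored" }, // dropped; filled in by the backend
});
console.log(Object.keys(prepared).sort());
```

The helper copies its input rather than mutating it, so the caller's object is left unchanged even if it carried `meta`, `http`, or `dt`.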

Event Timeline

nshahquinn-wmf edited projects, added ContentTranslation; removed Language-analytics.

Since this is work for the engineers, I think it shouldn't be in the analytics project.

Pginer-WMF moved this task from Needs Triage to Enhancements on the ContentTranslation board.