
TranslationRecommendation* Schemas Event Platform Migration
Open, Needs Triage, Public

Description

See: https://wikitech.wikimedia.org/wiki/Event_Platform/EventLogging_legacy

We will keep client_ip and geocoded data for these schemas.

status
  • 2021-02-22 - schemas merged and edit protected on metawiki.
  • 2021-02-23
    • Hive table evolved
    • event streams declared

Recommendation API code changes needed

Since these events are sent to the legacy eventlogging backend from a custom client (not via the MW EventLogging extension), the code will need to be changed to send events to EventGate instead.

It looks as though this code has both JavaScript and Python logic to send events. Both implementations will need updating to send events to EventGate.

For more info see:

Event Timeline

Ottomata created this task. Jan 4 2021, 9:40 PM
Restricted Application added a project: Analytics. Jan 4 2021, 9:40 PM
Restricted Application added a subscriber: Aklapper.

@Isaac, let us know if this schema needs client IP and/or geocoded data? If not, it will be removed as part of this migration.

Also, do you know what produces these events? I'm assuming it's a MW extension somewhere?

Isaac added a comment. Jan 4 2021, 10:24 PM

let us know if this schema needs client IP and/or geocoded data? If not, it will be removed as part of this migration.

If it's not hard, I'd ask to retain the geocoded data. Client IP is nice for determining unique number of users (UA+IP) but there's a user token in the data that also works for that purpose. I do use the geocoded data (country specifically) for looking at geographic diversity of users for the system though so would prefer to retain it.

Also, not ideal, but my understanding is that eventlogging also always shows up in webrequests so we can always extract that information even if it's not logged in the event.<schemaname> table. Is that going to change too?

Also, do you know what produces these events? I'm assuming it's a MW extension somewhere?

Yeah, the eventlogging is all coming from GapFinder (not technically an extension) which has this codebase -- specifically, UserAction, UIRequests, APIRequests.

If it's not hard, I'd ask to retain the geocoded data.

It isn't hard, we can do!

eventlogging also always shows up in webrequests so we can always extract that information even if it's not logged in the event.<schemaname> table. Is that going to change too?

All webrequests are logged, so the POST of the event will be available in the webrequest logs. However, it won't be possible to extract the event data from it, since the data is now sent as part of the POST body, which isn't logged in webrequest.
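As an illustration of the point above, here is a minimal Python sketch of why the legacy transport left the event recoverable from a logged request URL. The host `example.org`, the helper names, the sample event fields, and the exact encoding are assumptions made for illustration and are not taken from GapFinder's code; only the `/beacon/event` path appears in this task.

```python
import json
from urllib.parse import quote, unquote, urlsplit

# Hypothetical host; only the /beacon/event path is mentioned in this task.
LEGACY_BEACON = "https://example.org/beacon/event"


def legacy_beacon_url(event):
    """Legacy style (assumed encoding): the whole event rides in the query string."""
    payload = json.dumps(event, separators=(",", ":"))
    return LEGACY_BEACON + "?" + quote(payload, safe="")


def event_from_logged_url(url):
    """Recover an event from a logged URL; only possible for the legacy style."""
    return json.loads(unquote(urlsplit(url).query))


# Made-up event content, for illustration only.
sample_event = {
    "schema": "TranslationRecommendationUserAction",
    "event": {"action": "click"},
}
logged_url = legacy_beacon_url(sample_event)
```

With a POST-based transport, the logged URL would be only the intake endpoint, so there is no query payload left to decode from the webrequest logs.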

Yeah, the eventlogging is all coming from GapFinder (not technically an extension)

Ok, interesting, this will need code changes then. We'll put this off for now, but when we take it up, who should we work with to make the changes?

Isaac added a comment. Jan 5 2021, 3:14 PM

If it's not hard, I'd ask to retain the geocoded data.

It isn't hard, we can do!

Thanks!

All webrequests are logged, so the POST of the event will be available in the webrequest logs. However, it won't be possible to extract the event data from it, since the data is now sent as part of the POST body, which isn't logged in webrequest.

Ahh...bummer but makes sense.

Ok, interesting, this will need code changes then. We'll put this off for now, but when we take it up, who should we work with to make the changes?

I'd start with reaching out to Leila. If the work is more technical, Baha knows the most about the codebase. If it's just code review / guidance, I can probably do that. Fabian might have an interest. Either way, Leila will be able to direct appropriately.

Ottomata updated the task description.
Ottomata added a subscriber: leila.
razzi moved this task from Incoming to Event Platform on the Analytics board. Jan 14 2021, 5:59 PM

@leila o/ We're getting closer to being done with the low hanging fruit parts of this EventLogging -> Event Platform migration. The TranslationRecommendation events that come from GapFinder are 'medium hanging' :) GapFinder will need code changes. Specifically, the code that sends events as URL-encoded JSON query parameters to /beacon/event needs to be changed to POST a fully formatted event to https://intake-analytics.wikimedia.org/v1/events?hasty=true, following the same logic as implemented in the EventLogging extension.

I don't think the code changes would be difficult, but I'd prefer if someone on your team could do the code. I'd be very available to advise and review. I understand if this is unexpected work that y'all don't have time for, but the sooner we get this done the better! The more migrations we finish, the sooner we can turn off the legacy eventlogging backend.
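For reference, the requested change could look roughly like the following Python sketch. It is not the actual recommendation-api patch. The intake URL comes from the comment above; the `$schema` path, the `1.0.0` version, the `eventlogging_` stream prefix, the envelope field names, and all helper names are assumptions modeled on the Event Platform legacy convention and should be checked against the merged schema and stream config.

```python
import json
import urllib.request
from datetime import datetime, timezone

# Intake endpoint quoted in this task.
EVENTGATE_URL = "https://intake-analytics.wikimedia.org/v1/events?hasty=true"


def build_event(schema_name, event_data, schema_version="1.0.0", now=None):
    """Wrap legacy event data in an Event Platform style envelope.

    The $schema path, stream name, and version are assumptions; verify them
    against the migrated schema and the declared event streams.
    """
    now = now or datetime.now(timezone.utc)
    return {
        "$schema": f"/analytics/legacy/{schema_name.lower()}/{schema_version}",
        "meta": {"stream": f"eventlogging_{schema_name}"},
        "schema": schema_name,
        # ISO 8601 UTC timestamp truncated to millisecond precision.
        "client_dt": now.strftime("%Y-%m-%dT%H:%M:%S.%f")[:-3] + "Z",
        "event": event_data,
    }


def build_request(event, url=EVENTGATE_URL):
    """Build a POST request that carries the event as a JSON body."""
    body = json.dumps(event).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_event(event, url=EVENTGATE_URL, timeout=5):
    """Send the event and return the HTTP status code (performs network I/O)."""
    with urllib.request.urlopen(build_request(event, url), timeout=timeout) as resp:
        return resp.status
```

Keeping `build_event` and `build_request` separate from `send_event` lets the payload be inspected or unit tested without hitting the intake service.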

leila added a comment. Jan 28 2021, 9:19 PM

On it, meaning: I'll check on our end to see if we can pick this up and one of us will get back to you here.

Change 660658 had a related patch set uploaded (by Bmansurov; owner: Bmansurov):
[research/recommendation-api@master] Send events to EventGate

https://gerrit.wikimedia.org/r/660658

Change 661399 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[schemas/event/secondary@master] Migrate TranslationRecommendation from metawiki

https://gerrit.wikimedia.org/r/661399

Ottomata updated the task description. Mon, Feb 22, 5:26 PM

Change 661399 merged by Ottomata:
[schemas/event/secondary@master] Migrate TranslationRecommendation from metawiki

https://gerrit.wikimedia.org/r/661399

Change 666392 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[operations/mediawiki-config@master] Declare TranslationRecommendation event streams

https://gerrit.wikimedia.org/r/666392

Change 666392 merged by Ottomata:
[operations/mediawiki-config@master] Declare TranslationRecommendation event streams

https://gerrit.wikimedia.org/r/666392

Ottomata updated the task description. Tue, Feb 23, 4:02 PM

Mentioned in SAL (#wikimedia-operations) [2021-02-23T16:02:55Z] <otto@deploy1001> Synchronized wmf-config/InitialiseSettings.php: Declare TranslationRecommendation event streams - T271163 (duration: 00m 58s)

Ottomata updated the task description. Tue, Feb 23, 4:08 PM

@bmansurov you should be able to produce these events now using your code. This should work in both beta and production. See also https://wikitech.wikimedia.org/wiki/Event_Platform/Instrumentation_How_To#Viewing_and_querying_events