CentralNoticeBannerHistory and CentralNoticeImpression Event Platform Migration
Closed, Resolved (Public)

Description

See: https://wikitech.wikimedia.org/wiki/Event_Platform/EventLogging_legacy

Unless otherwise notified, client IP and consequently geocoded data will no longer be collected for this event data after this migration. Please let us know if this should continue to be captured. See also T262626.

  • 1. Pick a schema to migrate
  • 2. Create a new task to track this schema's migration
  • 3. Create the /analytics/legacy/ schema
  • 4. Edit-protect the metawiki Schema page at https://meta.wikimedia.org/wiki/Schema:<SchemaName>
  • 5. Manually evolve the Hive table to use the new schema
  • 6. Add entry to wgEventStreams, wgEventLoggingStreamNames and wgEventLoggingSchemas in operations/mediawiki-config
  • 7. Once the legacy stream's data is fully produced through EventGate, switch to using a Refine job that uses the schema repo instead of meta.wm.org
  • 8. Edit the producer extension.json and set EventLoggingSchemas to the new schema URI
  • 9. Once the producer extension.json is fully deployed, edit wgEventLoggingSchemas in operations/mediawiki-config InitialiseSettings.php and remove the schema's entry.
  • 10. Mark the schema as migrated in the EventLogging Schema Migration Audit spreadsheet
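
As a rough illustration of steps 3, 6 and 8, the sketch below derives the names a migrated legacy schema would typically end up with. The `eventlogging_<SchemaName>` stream naming and the `/analytics/legacy/<lowercased name>/<version>` URI layout are assumptions based on the usual legacy-migration conventions, the `1.0.0` version is a placeholder, and the helper `legacy_migration_entries` is hypothetical. The real entries live in PHP config and extension.json, not in Python.

```python
def legacy_migration_entries(schema_name: str, version: str = "1.0.0") -> dict:
    """Build illustrative config values for one legacy EventLogging schema.

    Naming conventions here are assumptions, not taken from this task.
    """
    # Assumed stream naming: legacy streams are prefixed with "eventlogging_".
    stream_name = f"eventlogging_{schema_name}"
    # Assumed schema title: lowercased schema name under analytics/legacy/.
    schema_title = f"analytics/legacy/{schema_name.lower()}"
    return {
        "stream_name": stream_name,
        "schema_title": schema_title,
        # Placeholder version; the real one comes from the schema repository.
        "schema_uri": f"/{schema_title}/{version}",
    }


for name in ("CentralNoticeBannerHistory", "CentralNoticeImpression"):
    print(legacy_migration_entries(name))
```

Both schemas in this task would go through the same steps, which is why the helper takes only the schema name as input.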

Event Timeline

Restricted Application added a subscriber: Aklapper.

@AndyRussG
Let us know if this schema needs client IP and/or geocoded data. If not, it will be removed as part of this migration.

Hi! Yes, I think both are potentially important for debugging and deeper research with this data, so if it's not a problem to keep them going forward, that'd be fantastic. Thanks so much!!!!

Ottomata renamed this task from CentralNoticeBannerHistory Event Platform Migration to CentralNoticeBannerHistory and CentralNoticeImpression Event Platform Migration. Jan 5 2021, 2:18 PM

Oh, I just noticed you also have CentralNoticeImpression. Same question for that @AndyRussG: Does CentralNoticeImpression need client IP and/or geocoded data?

@AndyRussG, FYI we plan to perform this migration next week (week of Jan 11th), unless you have objections.

Hi all! CCing @AndyRussG - do we in Fundraising need IP data for either CentralNoticeImpressions or CentralNoticeBannerHistory? Having the geodata at the state/territory level is important for us for both reporting and diagnostic reasons, but the IP level is perhaps more micro than we need.

Also, what event stream feeds the pgehres data feed? That would be of interest for geodata (at the more macro level) as well.

Hi all! CCing @AndyRussG - do we in Fundraising need IP data for either CentralNoticeImpressions or CentralNoticeBannerHistory? Having the geodata at the state/territory level is important for us for both reporting and diagnostic reasons, but the IP level is perhaps more micro than we need.

Hey! IP data has been important at times for debugging, so, if possible, I'd like to keep it for both.

Also, what event stream feeds the pgehres data feed? That would be of interest for geodata (at the more macro level) as well.

The pgehres feed doesn't arrive via either of these, but rather our custom beacon/impression endpoint. CentralNoticeImpressions is the so-far-mostly-unused feed that was designed to replace beacon/impression. (A bit off-topic... IIRC the adblocking issues previously affecting that feed are now gone, so we could start looking at finally switching to that, maybe?)

@Ottomata hey also (apologies if this question has already been resolved) is there any action needed on our part for this migration? Note that the way CentralNotice uses the client-side EventLogging API is... uhh a bit unconventional (see here, here and here). Thanks so much!

Oh...that is a bit of a problem then. Can we change that so that it logs using the EventLogging client JS?

@AndyRussG, FYI we plan to perform this migration next week (week of Jan 11th), unless you have objections.

We will not be doing this migration this week as this will require more dev work than expected.

@AndyRussG if ok with you, I'll make some patches to CentralNotice to make it always use mw.eventLog.logEvent. This will allow us to migrate these schemas to Event Platform and EventGate. I guess this will undo some code you added in 2015 to support users without sendBeacon but hopefully this is not needed anymore.

Actually, @AndyRussG does anything actually need to be done? I see that mw.eventLog.logEvent is called as long as the browser has sendBeacon. There is some custom logic to minimize the size of the event for the legacy URL-encoded query param, but that will no longer be needed after we migrate. Since we'll be removing support for browsers without sendBeacon anyway, perhaps we can go ahead and migrate to Event Platform now, and just remove the unneeded code in CentralNotice about sendBeacon and checkEventLoggingURLSize afterwards?

Hi @Ottomata! Thanks so much for your help here!!

we'll be removing support for browsers without sendBeacon

Ah ok I was just going to ask about that... since it looks like we currently do still support IE 11, which doesn't support sendBeacon. Do you know the timeline for removing JS support for IE? Should we wait until that happens?

There is some custom logic to minimize the size of the event for the legacy URL-encoded query param, but that will no longer be needed after we migrate.

Ah yeah that was the other question I had... So everything will get sent (or maybe is already sent?) as a POST, I guess, so we don't have to worry about payload size at all?

If the payload-size checking code isn't blocking the migration (i.e., if the method mw.eventLog.makeBeaconUrl() is not going away) I think we're probably fine leaving it in for now and removing it (and probably also updating our super-compressed schema) a bit later on...

Apologies for the super-old code, and thanks again! :)
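
For readers unfamiliar with the payload-size concern discussed above, here is a minimal sketch of the idea, assuming the legacy transport puts the URL-encoded JSON event into the query string. The 2000-character limit, the example base URL, and the helper names `make_beacon_url` and `fits_in_url` are illustrative assumptions; this is not the actual mw.eventLog or CentralNotice code.

```python
import json
from urllib.parse import quote

# Assumed limit for illustration only; the real limit is set by the client library.
MAX_URL_SIZE = 2000


def make_beacon_url(base_url: str, event: dict) -> str:
    """Approximate a legacy beacon URL: the event as URL-encoded JSON in the query string."""
    payload = quote(json.dumps(event, separators=(",", ":")), safe="")
    return f"{base_url}?{payload}"


def fits_in_url(base_url: str, event: dict, max_size: int = MAX_URL_SIZE) -> bool:
    """Return True if the encoded event fits within the assumed URL size limit."""
    return len(make_beacon_url(base_url, event)) <= max_size


small_event = {"n": 1}
large_event = {"l": ["x" * 50] * 100}  # hypothetical oversized history log
print(fits_in_url("https://example.org/beacon/event", small_event))
print(fits_in_url("https://example.org/beacon/event", large_event))
```

Percent-encoding inflates JSON punctuation to three characters each, which is why a compressed schema helps under a URL limit and matters much less once events are sent in a POST body.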

since it looks like we currently do still support IE 11, which doesn't support sendBeacon. Do you know the timeline for removing JS support for IE? Should we wait until that happens?

Hm interesting! I don't know what that timeline is, but as far as I know, this CentralNotice code is the only custom code that works around the lack of support for sendBeacon. According to https://analytics.wikimedia.org/dashboards/browsers/#all-sites-by-browser, IE 11 is < 1% of our traffic. @Milimetric told me that dropping support in MW for old browsers is more about regional percentages of traffic rather than global (e.g. we don't want to drop a browser that most of some country still uses). But, perhaps the decision to drop support from MW is different than just dropping support for instrumentation of MW in old browsers?

We are hoping to turn off the legacy eventlogging backend service as soon as possible, perhaps in Q1 of next fiscal year.

Q: is collecting CentralNotice* events from IE 11 users essential for usage and analysis of those events?

I don't know what that timeline is, but as far as I know, this CentralNotice code is the only custom code that works around the lack of support for sendBeacon. According to https://analytics.wikimedia.org/dashboards/browsers/#all-sites-by-browser, IE 11 is < 1% of our traffic. @Milimetric told me that dropping support in MW for old browsers is more about regional percentages of traffic rather than global (e.g. we don't want to drop a browser that most of some country still uses). But, perhaps the decision to drop support from MW is different than just dropping support for instrumentation of MW in old browsers?

Agreed, important point. The potential concern in this case is also about regional impact. For example, IE's share is non-negligible in Japan. That said, I'm not suggesting at this point that any extra work is needed to preserve this workaround. I guess I'd just like to double-check with others before confirming we can turn it off.

We are hoping to turn off the legacy eventlogging backend service as soon as possible, perhaps in Q1 of next fiscal year.

Q: is collecting CentralNotice* events from IE 11 users essential for usage and analysis of those events?

Almost certainly not... I'll try to get back to you by Monday with a more definitive answer. Apologies for the delay and thanks so much for your patience!! :)

@Pcoombe and @spatton as FYI. I'm wondering if you have feedback on older browser versions here.

For regular banner testing we're still using the pgehres data which comes in via a different method. So I don't think this would affect us at the moment.

Ah yeah important point--for the time being this would only impact banner history data (since the other event schema we defined is currently not used for production).

Change 677015 had a related patch set uploaded (by Ottomata; author: Ottomata):

[mediawiki/extensions/CentralNotice@master] Use mw.eventLog.logEvent for all EventLogging logs

https://gerrit.wikimedia.org/r/677015

Change 677015 merged by jenkins-bot:

[mediawiki/extensions/CentralNotice@master] Use mw.eventLog.logEvent for all EventLogging logs

https://gerrit.wikimedia.org/r/677015

@AndyRussG is CentralNotice special for deployment? We merged the change > 2 weeks ago but it isn't included in either php-1.37.0-wmf.4 or php-1.37.0-wmf.5. I just looked at git log on deploy1002, do we need to merge that change into the wmf_deploy branch?

Thanks @Ottomata... Ahh, yes, it does need to be merged there to deploy. Probably easiest would be for us to merge all accumulated changes in master into wmf_deploy just a bit before the new branch is cut for next week's train deploy. If we did that, the EventLogging changes would go out on next week's train. Or, if you need it out sooner (no problem if that's the case btw) we can cherry-pick and put it on a backport deploy today or tomorrow. :) Thanks again!

Ottomata added a project: Analytics-Kanban.

@AndyRussG hi! Did ^ happen?

I just looked, php-1.37.0-wmf.9 has this code so we should be good to proceed!

Change 699753 had a related patch set uploaded (by Ottomata; author: Ottomata):

[schemas/event/secondary@master] Add centralnoticeimpression and centralnoticebannerhistory legacy schemas

https://gerrit.wikimedia.org/r/699753

Ottomata updated the task description.

Change 699753 merged by jenkins-bot:

[schemas/event/secondary@master] Add centralnoticeimpression and centralnoticebannerhistory legacy schemas

https://gerrit.wikimedia.org/r/699753

Change 699759 had a related patch set uploaded (by Ottomata; author: Ottomata):

[operations/mediawiki-config@master] Migrate CentralNotice{BannerHistory,Impression} to EventGate on testwiki

https://gerrit.wikimedia.org/r/699759

Change 699759 merged by Ottomata:

[operations/mediawiki-config@master] Migrate CentralNotice{BannerHistory,Impression} to EventGate on testwiki

https://gerrit.wikimedia.org/r/699759

Mentioned in SAL (#wikimedia-operations) [2021-06-14T14:17:05Z] <otto@deploy1002> Synchronized wmf-config/InitialiseSettings.php: Migrate CentralNotice{BannerHistory,Impression} to EventGate on testwiki - T271168 (duration: 00m 57s)

Change 699762 had a related patch set uploaded (by Ottomata; author: Ottomata):

[operations/mediawiki-config@master] Migrate CentralNotice{BannerHistory,Impression} to EventGate on all wikis

https://gerrit.wikimedia.org/r/699762

Change 699762 merged by Ottomata:

[operations/mediawiki-config@master] Migrate CentralNotice{BannerHistory,Impression} to EventGate on all wikis

https://gerrit.wikimedia.org/r/699762

Mentioned in SAL (#wikimedia-operations) [2021-06-14T14:27:47Z] <otto@deploy1002> Synchronized wmf-config/InitialiseSettings.php: Migrate CentralNotice{BannerHistory,Impression} to EventGate on all wikis - T271168 (duration: 00m 57s)

Change 699787 had a related patch set uploaded (by Ottomata; author: Ottomata):

[mediawiki/extensions/CentralNotice@master] Finalize migration to Event Plaform for EL schemas

https://gerrit.wikimedia.org/r/699787

Change 699787 merged by jenkins-bot:

[mediawiki/extensions/CentralNotice@master] Finalize migration to Event Plaform for EL schemas

https://gerrit.wikimedia.org/r/699787

Change 706689 had a related patch set uploaded (by Ottomata; author: Ottomata):

[operations/mediawiki-config@master] Finalize several EventLogging -> Event Platfom migrations

https://gerrit.wikimedia.org/r/706689

Change 706689 merged by jenkins-bot:

[operations/mediawiki-config@master] Finalize several EventLogging -> Event Platfom migrations

https://gerrit.wikimedia.org/r/706689

Mentioned in SAL (#wikimedia-operations) [2021-07-22T19:26:12Z] <otto@deploy1002> Synchronized wmf-config/InitialiseSettings.php: Finalize several EventLogging -> Event Platfom migrations - T282855 T238138 T282562 T271168 (duration: 00m 55s)