Page MenuHomePhabricator

CentralNoticeBannerHistory and CentralNoticeImpression Event Platform Migration
Open, Needs TriagePublic

Description

See: https://wikitech.wikimedia.org/wiki/Event_Platform/EventLogging_legacy

Unless otherwise notified, client IP and consequently geocoded data will no longer be collected for this event data after this migration. Please let us know if this should continue to be captured. See also T262626.

Event Timeline

Restricted Application added a subscriber: Aklapper. · View Herald Transcript

@AndyRussG
Let us know if this schema needs client IP and/or geocoded data. If not, it will be removed as part of this migration.

@AndyRussG
Let us know if this schema needs client IP and/or geocoded data. If not, it will be removed as part of this migration.

Hi! Yes, I think both are potentially important for debugging and deeper research with this data, so if it's not a problem to keep them going forward, that'd be fantastic. Thanks so much!!!!

Ottomata renamed this task from CentralNoticeBannerHistory Event Platform Migration to CentralNoticeBannerHistory and CentralNoticeImpression Event Platform Migration.Jan 5 2021, 2:18 PM

Oh, I just noticed you also have CentralNoticeImpression. Same question for that @AndyRussG: Does CentralNoticeImpression need client IP and/or geocoded data?

@AndyRussG, FYI we plan to perform this migration next week (week of Jan 11th), unless you have objections.

Hi all! CCing @AndyRussG - do we in Fundraising need IP data for either CentralNoticeImpressions or CentralNoticeBannerHistory? Having the geodata at the state/territory level is important for us for both reporting and diagnostic reasons, but the IP level is perhaps more micro than we need.

Also, what event stream feeds the pgehres data feed? That would be of interest for geodata (at the more macro level) as well.

Hi all! CCing @AndyRussG - do we in Fundraising need IP data for either CentralNoticeImpressions or CentralNoticeBannerHistory? Having the geodata at the state/territory level is important for us for both reporting and diagnostic reasons, but the IP level is perhaps more micro than we need.

Hey! IP data has been important at times for debugging, so, if possible, I'd like to keep it for both.

Also, what event stream feeds the pgehres data feed? That would be of interest for geodata (at the more macro level) as well.

The pgehres feed doesn't arrive via either of these, but rather our custom beacon/impression endpoint. CentralNoticeImpressions is the so-far-mostly-unused feed that was designed to replace beacon/impression. (A bit off-topic... IIRC the adblocking issues previously affecting that feed are now gone, so we could start looking at finally switching to that, maybe?)

@Ottomata hey also (apologies if this question has already been resolved) is there any action needed on our part for this migration? Note that the way CentralNotice uses the client-side EventLogging API is... uhh a bit unconventional (see here, here and here). Thanks so much!

Oh...that is a bit of a problem then. Can we change that so that it logs using the EventLogging client JS?

@AndyRussG, FYI we plan to perform this migration next week (week of Jan 11th), unless you have objections.

We will not be doing this migration this week as this will require more dev work than expected.

@AndyRussG if ok with you, I'll make some patches to CentralNotice to make it always use mw.eventLog.logEvent. This will allow us to migrate these schemas to Event Platform and EventGate. I guess this will undo some code you added in 2015 to support users without sendBeacon but hopefully this is not needed anymore.

Actually, @AndyRussG is anything actually needed to be done? I see that mw.eventLog.logEvent is called as long as the browser has sendBeacon. There is some custom logic to minimize the size of the event for the legacy urr enconded query param, but that will no longer be needed after we migrate. Since we'll be removing support for browsers without sendBeacon anyway, perhaps we can go ahead and migrate to Event Platform now, and just remove the unneeded code in CentralNotice about sendBeacon and checkEventLoggingURLSize afterwards?

Hi @Ottomata! Thanks so much for your help here!!

we'll be removing support for browsers without sendBeacon

Ah ok I was just going to ask about that... since it looks like we currently do still support IE 11, which doesn't support sendBeacon. Do you know the timeline for removing JS support for IE? Should we wait until that happens?

There is some custom logic to minimize the size of the event for the legacy urr enconded query param, but that will no longer be needed after we migrate.

Ah yeah that was the other question I had... So everything will get sent (or maybe is already sent?) as a POST, I guess, so we don't have to worry about payload size at all?

If the payload-size checking code isn't blocking the migration (i.e., if the method mw.eventLog.makeBeaconUrl() is not going away) I think we're probably fine leaving it in for now and removing it (and probably also updating our super-compressed schema) a bit later on...

Apologies for the super-old code, and thanks again! :)

since it looks like we currently do still support IE 11, which doesn't support sendBeacon. Do you know the timeline for removing JS support for IE? Should we wait until that happens?

Hm interesting! I don't know what that timeline is, but as far know I this CentralNotice code is the only custom code that works around the lack of support for sendBeacon. According to https://analytics.wikimedia.org/dashboards/browsers/#all-sites-by-browser, IE 11 is < 1% of our traffic. @Milimetric told me that dropping support in MW for old browsers is more about regional percentages of traffic rather than global (e.g. we don't want to drop a browser that most of some country still uses). But, perhaps the decision to drop support from MW is different than just dropping support for instrumentation of MW in old browsers?

We are hoping to turn off the legacy eventlogging backend service as soon as possible, perhaps in Q1 of next fiscal year.

Q: is collecting CentralNotice* events from IE 11 users essential for usage and analysis of those events?

I don't know what that timeline is, but as far know I this CentralNotice code is the only custom code that works around the lack of support for sendBeacon. According to https://analytics.wikimedia.org/dashboards/browsers/#all-sites-by-browser, IE 11 is < 1% of our traffic. @Milimetric told me that dropping support in MW for old browsers is more about regional percentages of traffic rather than global (e.g. we don't want to drop a browser that most of some country still uses). But, perhaps the decision to drop support from MW is different than just dropping support for instrumentation of MW in old browsers?

Agreed, important point. The potential concern in this case is also about regional impact. For example, IE's share is non-negligible in Japan. That said, I'm not suggesting at this point that any extra work is needed to preserve this workaround. I guess I'd just like to double-check with others before confirming we can turn it off.

We are hoping to turn off the legacy eventlogging backend service as soon as possible, perhaps in Q1 of next fiscal year.

Q: is collecting CentralNotice* events from IE 11 users essential for usage and analysis of those events?

Almost certainly not... I'll try to get back to you by Monday with a more definitive answer. Apologies for the delay and thanks so much for ur patience!! :)

@Pcoombe and @spatton as FYI. I'm wondering if you have feedback on older browser versions here.

For regular banner testing we're still using the pgehres data which comes in via a different method. So I don't think this would affect us at the moment.

For regular banner testing we're still using the pgehres data which comes in via a different method. So I don't think this would affect us at the moment.

Ah yeah important point--for the time being this only would impact on banner history data (since the other event schema we defined is currently not used for production).