This task is about fixing the banner bump, that is, the jump in page content due to banners loading and being injected in the background.
Following initial exploration of a mostly client-focused solution (called "pageview+1", described in T279034), I would like to try a different approach: run the main banner selection logic in the Varnish layer. Here is how it would work:
- The MediaWiki backend provides a single JSON representation of all campaign and banner settings.
- On the client, we count and store banners seen, banners closed, bucket assignments and similar data, as happens in the current system. All data points needed for banner selection are placed in a single cookie.
- Logic for banner selection is written in inline C in Varnish configuration files. This code reads the JSON representation of campaign and banner settings from the backend and the cookie from the client, selects a banner (or no banner) for the pageview, and injects it into the base HTML.
- Both the JSON for settings and the banner content are small, so they can be cached in RAM. Also, almost no cache fragmentation (i.e., very few Varnish hash values) would be needed for this.
- Other parts of CentralNotice don't change: analytics reporting code remains on the client, and the admin UI doesn't change (or changes very little).
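To make the selection step above a bit more concrete, here is a rough sketch in plain C of what the core logic might look like once the settings JSON and the cookie have been decoded. Everything in it is an assumption for illustration only: the struct layouts, the `b=..&s=..&c=..` cookie format, the function names `parse_client_cookie` and `select_banner`, and the simplified rules (a closed banner suppresses everything, highest priority wins, throttling is a plain percentage). It is not the existing CentralNotice algorithm and it does not use any Varnish APIs; JSON decoding and HTML injection are left out.

```c
#include <assert.h>
#include <stdio.h>

#define MAX_BUCKETS 2  /* hypothetical limit, for the sketch only */

/* Hypothetical per-client state, as it might be decoded from the single cookie. */
struct client_state {
    int bucket;         /* bucket assignment, 0-based */
    int banners_seen;   /* impressions counted on the client */
    int banner_closed;  /* nonzero if the user has closed a banner */
};

/* Hypothetical campaign entry, as it might be decoded from the settings JSON. */
struct campaign {
    const char *banner_names[MAX_BUCKETS];  /* one banner per bucket */
    int num_buckets;
    int max_impressions;   /* stop after this many impressions */
    int throttle_percent;  /* share of pageviews (0..100) that get the campaign */
    int priority;          /* higher wins */
    int enabled;
};

/*
 * Parse a cookie value in the invented format "b=<bucket>&s=<seen>&c=<closed>".
 * Returns 0 on success, -1 if the value is missing or malformed.
 */
static int parse_client_cookie(const char *value, struct client_state *out)
{
    int b, s, c;

    assert(out != NULL);
    if (value == NULL || sscanf(value, "b=%d&s=%d&c=%d", &b, &s, &c) != 3)
        return -1;
    if (b < 0 || s < 0)
        return -1;
    out->bucket = b;
    out->banners_seen = s;
    out->banner_closed = (c != 0);
    return 0;
}

/*
 * Pick a banner name for this pageview, or NULL for "no banner".
 * `roll` is a number in 0..99 supplied by the caller (random per pageview
 * in a real system), compared against each campaign's throttle.
 */
static const char *select_banner(const struct campaign *campaigns, size_t n,
                                 const struct client_state *st, int roll)
{
    const struct campaign *best = NULL;
    size_t i;

    if (st->banner_closed)
        return NULL;

    for (i = 0; i < n; i++) {
        const struct campaign *c = &campaigns[i];

        if (!c->enabled || c->num_buckets <= 0 || c->num_buckets > MAX_BUCKETS)
            continue;
        if (st->banners_seen >= c->max_impressions)
            continue;
        if (roll >= c->throttle_percent)
            continue;
        if (best == NULL || c->priority > best->priority)
            best = c;
    }

    if (best == NULL)
        return NULL;
    return best->banner_names[st->bucket % best->num_buckets];
}
```

The point of the sketch is that the per-pageview work is a short loop over a small in-memory table, with no extra network requests. In the actual proposal this would live in inline C inside the Varnish configuration, and the returned banner would be injected into the base HTML.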
Advantages over pageview+1:
- Better use of cache-layer resources and better cache performance, since we'd be adding essentially no extra cache fragmentation, everything could be cached in RAM, and there would be no additional internal network requests on every pageview.
- Simpler, cleaner architecture, since there would be only one code path for displaying banners, and we wouldn't have to guess whether a user is a frequent visitor or not.
- Guaranteed not to show fewer banners than under the current system.
- No impact on the current one-hour (or otherwise short) A/B tests.
- All banners are injected into the base HTML, so there are no banner bumps for any users.
Here are some general requirements I'd propose for a fix to the banner bump. (These could apply to any option for a technical solution, not just the one described here.)
- Banners don't cause content to shift after page load.
- Banners don't negatively impact upcoming SEO metrics.
- Banners are shown to at least the same number of users as in the current system.
- We keep existing features for banner/campaign setup, including:
  - Targeted emergency/maintenance/blackout campaigns taking all pageviews.
  - Banners for a campaign can be switched and updated, and these changes take effect quickly.
  - Banners can be taken down quickly.
  - Low-level or short A/B tests can be performed as in the current system.
- CentralNotice administration UI remains the same or very similar.
- Analytics reporting unchanged.
- Site performance is not affected.
- No or almost no additional risks of site outages.
- If there needs to be significant restructuring of code, we'll do it once, not twice or thrice.
- The system's architecture will allow, as a possible follow-on feature, long-running, low-level tests involving a representative sample of devices (rather than just a random sample of pageviews).
- The system works as simply as possible, given these requirements.
(Note: I am working on this on my own time, so, not as part of my work as an FR-Tech engineer. Unless otherwise noted, this task is not included in planned work by FR-Tech. Please consider my contributions here as you would those of a volunteer contributor. Thanks so so much!! - @AndyRussG)