CentralNotice code to fix the banner bump with “pageview+1 with exceptions for infrequent visitors and as needed”
Open, Lowest | Public | 4 Estimated Story Points

Description

This task is for CentralNotice code to fix the banner bump, that is, the jump in page content due to banners loading and being injected in the background.

The general approach suggested is: for most users, only show banners on their second pageview over the course of a campaign. On the first pageview, we decide which banner (if any) to show on the next pageview, and set a cookie with the requested banner name and some additional data points. On the next pageview, we add the banner to the HTML, either on the server in the cache layer, or very early on during page display.
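The flow above can be sketched roughly as follows. This is only an illustration, not the CentralNotice code: the cookie name `cn_next_banner`, the `PageviewResult` type and the `choose_banner` callback are hypothetical, a plain dict stands in for the browser's cookie jar, and the cookie holds only a banner name rather than the additional data points mentioned above.

```python
from dataclasses import dataclass
from typing import Callable, Optional

COOKIE_NAME = "cn_next_banner"  # hypothetical cookie name


@dataclass
class PageviewResult:
    injected_banner: Optional[str]   # banner added to the HTML on this pageview
    requested_banner: Optional[str]  # banner requested for the next pageview


def handle_pageview(cookies: dict,
                    choose_banner: Callable[[], Optional[str]]) -> PageviewResult:
    """Simulate one pageview under pageview+1."""
    # Inject whatever the previous pageview requested, then clear the request.
    injected = cookies.pop(COOKIE_NAME, None)
    # Banner selection (client-side in the proposal) decides about the NEXT pageview.
    requested = choose_banner()
    if requested is not None:
        cookies[COOKIE_NAME] = requested
    return PageviewResult(injected, requested)


jar = {}
first = handle_pageview(jar, lambda: "B_example")  # nothing shown yet, cookie set
second = handle_pageview(jar, lambda: None)        # banner from the cookie is injected
```

In this sketch the first pageview never shows a banner, which is exactly the gap the mitigation options below are meant to cover.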

There are options to consider in two areas for this approach:

  1. how to inject the banner; and
  2. how to ensure banners are still seen by infrequent users, or users for whom cookie persistence is an issue.

Regarding the first point, how to inject the banner, the options are:
a) Use ESI (Edge Side Includes) in the Varnish (cache) layer to add banners directly to the base HTML.
b) On the first pageview, in addition to setting the cookie, load banner content into LocalStorage. On the second pageview, the server dynamically modifies the site CSS based on the cookie to reserve space for the requested banner (page display waits on this CSS), and JS in the ResourceLoader startup process then injects the banner content from LocalStorage.

On the second point, mitigation for users excluded by pageview+1, options are:
i) Dynamically inject banners on the first pageview for users that seem likely not to return before the campaign ends. This is the simplest option, but will still cause a banner bump for those users.
ii) Choose a banner on the server for such users, based on targeting data points available server-side. This seems feasible since banner selection mostly only considers information stored on the client for frequent visitors.
iii) Start setting cookies on clients a long time before a campaign starts, but only have the cookies trigger banner injection once it starts.
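As an illustration of option iii), here is a minimal sketch of a check that lets a cookie exist before the campaign starts but only trigger injection while the campaign is active. The function name and its parameters are hypothetical and not part of the attached patch.

```python
from datetime import datetime, timezone
from typing import Optional


def banner_to_inject(cookie_banner: Optional[str], now: datetime,
                     campaign_start: datetime,
                     campaign_end: datetime) -> Optional[str]:
    """Return the banner named in the cookie, but only inside the campaign window."""
    if cookie_banner is None:
        return None
    if campaign_start <= now < campaign_end:
        return cookie_banner
    return None  # cookie set early (or left over): inject nothing


start = datetime(2021, 12, 1, tzinfo=timezone.utc)
end = datetime(2022, 1, 1, tzinfo=timezone.utc)
before_campaign = banner_to_inject(
    "B_example", datetime(2021, 11, 20, tzinfo=timezone.utc), start, end)
during_campaign = banner_to_inject(
    "B_example", datetime(2021, 12, 5, tzinfo=timezone.utc), start, end)
```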

The incomplete, WIP patch currently attached to this task uses option a) ESI for banner injection, and has not yet implemented any measures for pageview+1-excluded users.

Notes on the a) ESI approach

ESI (Edge Side Includes) would be used in the simplest possible way, in the Varnish layer, to add banners directly to the base HTML.

Banner content would be retrieved via a call to Special:BannerLoader, made from Varnish. Responses from that page would be deterministically based on the contents of the cookie set by client-side code on the previous pageview. The cookie would contain at least the name of the banner to be injected and the campaign associated with it, though it could also be configured to include additional targeting data points, depending on cache capacity. (Those additional data points are not hard requirements for this approach, but they would help make the system fully responsive to changes in campaign and banner settings.)
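To show why determinism matters for caching, here is a minimal sketch of deriving a cache key from the cookie's fields. The field names `banner` and `campaign` and the function itself are assumptions for illustration; the real cookie format and Varnish logic are not specified in this task.

```python
from urllib.parse import urlencode


def banner_loader_cache_key(cookie_params: dict) -> str:
    """Build an order-independent cache key from the banner cookie's fields."""
    required = ("banner", "campaign")  # hypothetical field names
    missing = [name for name in required if name not in cookie_params]
    if missing:
        raise ValueError(f"cookie is missing fields: {missing}")
    # Sorting means the same set of parameters always maps to the same key,
    # so identical requests can share one cached response.
    return "Special:BannerLoader?" + urlencode(sorted(cookie_params.items()))


key_a = banner_loader_cache_key({"campaign": "C1", "banner": "B1"})
key_b = banner_loader_cache_key({"banner": "B1", "campaign": "C1"})
```

Every additional targeting data point added to the cookie multiplies the number of distinct keys, which is why the description ties those extra fields to cache capacity.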

Advantages of a pageview+1 approach

In addition to allowing banner content to be fully deterministic based on cookie content, this method would require relatively little new code, compared to other possible ways of fixing the banner bump, since existing CentralNotice client-side code would continue to do the heavy lifting of banner selection, as occurs in the current system. We would run basically all existing client-side code and would retain all existing features for banner selection, impression limiting, data reporting, large-then-small banners, A/B testing, other banner sequences, banner hiding, etc. From the perspective of client-side code, the main difference would be that, instead of loading and injecting a banner, a cookie would be set to request banner injection by the cache on the user’s next pageview.

Notes on mitigation

It's currently not clear what percentage of users who typically click on banners might be affected by an exclusively pageview+1 approach. Initial analysis of a single Fundraising campaign indicates it may be around 10% (see T280478).

Measures should also be put in place for campaigns that must show a banner on every pageview (like maintenance or blackout campaigns). This can be implemented server-side.

(Note: I am working on this on my own time, so, not as part of my work as an FR-Tech engineer. Unless otherwise noted, this task is not included in planned work by FR-Tech. Please consider my contributions here as you would those of a volunteer contributor. Thanks so so much!! - @AndyRussG)

Event Timeline

@AndyRussG your proposed solution is not ideal. It could seriously hamper non-fundraising campaigns, which are generally run by the community. There are a number of other types of campaign that would need to be excluded, including maintenance and advocacy campaigns.

Even so, community campaigns typically use substantially smaller banners and therefore create a smaller bump. To handicap the least problematic campaigns seems a little inequitable here.

> @AndyRussG your proposed solution is not ideal. It could seriously hamper non-fundraising campaigns, which are generally run by the community. There are a number of other types of campaign that would need to be excluded, including maintenance and advocacy campaigns.

Hi @Seddon! Thanks for your feedback... Can you please explain what you mean here about how advocacy campaigns would be hampered? Maybe there's a misunderstanding?

In theory the exception for infrequent users could be extended to other types of campaigns too, and we'd still have an improvement in usability for the vast majority of pageviews.

I feel the best option would be to try to pull some actual data about click-through rates for infrequent users for both Fundraising and community campaigns. That would give us a basis for knowing what the impact of an infrequent-user exclusion might be, and also would provide data for tuning how we might flag pageviews that are likely from infrequent users.

DStrine triaged this task as Lowest priority. (Apr 1 2021, 3:24 PM)
DStrine moved this task from Triage to Blocked or not fr-tech on the Fundraising-Backlog board.

The rest of my comment about inequitably treating community campaigns is the main point but advocacy campaigns include blackout campaigns where full screen banners are used. That's only possible when banner impressions are delivered on all page views.

> The rest of my comment about inequitably treating community campaigns is the main point but advocacy campaigns include blackout campaigns where full screen banners are used. That's only possible when banner impressions are delivered on all page views.

Ah right... so for blackouts, maintenance campaigns or other notices that should be shown on all pageviews, there's another solution, which is to add a config option for server-side code just to force the banner injection at the cache layer (based just on targeting data points that can be known on the server). I think that's fine for such cases, since we don't need to limit impressions, do A/B testing, or basically use any client-side data or randomness for such campaigns. I can add an explanation of this option to the task description.
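A minimal sketch of what such a server-side config option might look like, using only data points known on the server. The `ForcedCampaign` type, its fields and the `forced_banner` function are hypothetical names for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class ForcedCampaign:
    """A campaign whose banner is injected on every pageview (e.g. a blackout)."""
    banner: str
    countries: Set[str] = field(default_factory=set)  # empty set = all countries
    languages: Set[str] = field(default_factory=set)  # empty set = all languages


def forced_banner(campaigns: List[ForcedCampaign],
                  country: str, language: str) -> Optional[str]:
    """Pick a banner from server-side targeting only; no cookie or client data."""
    for campaign in campaigns:
        country_ok = not campaign.countries or country in campaign.countries
        language_ok = not campaign.languages or language in campaign.languages
        if country_ok and language_ok:
            return campaign.banner
    return None


config = [ForcedCampaign(banner="Blackout_example", countries={"US"})]
shown = forced_banner(config, country="US", language="en")
not_shown = forced_banner(config, country="FR", language="fr")
```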

Regarding which campaigns might use the option for immediate banners for infrequent users, many apologies that it seemed inequitable... I guess it was based on the assumption that community campaigns would be less likely to get much of a response from infrequent visitors. However that assumption could well be totally wrong! Really how such an exception might be applied should be data-based, I think (see edit to my previous comment). I can also change the task name and description to reflect this. :)

Thanks again!!

There are a number of technical issues here. Using cookies is going to be problematic for privacy-sensitive users (theoretically they might never see a community banner), and on mobile, where cookie setting is known to be problematic for us. We could see real impacts on campaigns like Wiki Loves Monuments, which is heavily skewed towards mobile, or in non-European countries, where the majority of our traffic is mobile.

> There are a number of technical issues here whereby using cookies is going to be problematic for privacy-sensitive users (theoretically they might never see a community banner)

Any campaigns that use impression limiting are already not shown to users with cookies or local storage disabled (see this code). All the currently active community campaigns are using this feature.

So, in the current system, we already have this exact issue, for community and also Fundraising campaigns. I think there's really no way around it in the current system, or in practically any other setup we could devise.

Also to note, the cookies would contain no personal information, and would be removed immediately by client-side code on the next pageview (though the same code might also set a new banner cookie if a banner is requested for the pageview after that).

> or on mobile where cookie setting is known to be problematic

Hmmm, I'm not aware of any make-or-break issues in this regard. Cookies are used throughout the MediaWiki ecosystem for many features. Can you provide details and links? In theory, if there are real problems with cookies on certain platforms, we could also detect those platforms in Varnish and place them in the "immediate banner view" category. If you're worried about users visiting the site in "web view", accessed inside a mobile app, this can also be detected server-side.

> we could see real impacts on campaigns like Wiki Loves Monuments which is heavily skewed towards mobile or in non-European countries

I think this is incorrect. (Again, if you have specific details on mobile cookie issues, please elaborate. :) )

However, we should also note that the banner bump is almost certainly a worse problem for mobile users with slower data service anywhere, since for those users, banners will take longer to load, causing the bump to happen later. How many thousands of users experience a few seconds of annoyance every day because of banner bumps, when they mis-click or lose their place in text they were reading? How much effort is it worthwhile to expend to remove a tiny annoyance from thousands of lives every day? Undoubtedly the problem is skewed to affect underprivileged regions and demographics more severely.

AndyRussG renamed this task from CentralNotice code to fix the banner bump with “ESI and pageview+1 with Fundraising exception for infrequent visitors” to CentralNotice code to fix the banner bump with “ESI and pageview+1 with possible exception for infrequent visitors”. (Apr 1 2021, 6:21 PM)
AndyRussG updated the task description.

> I can add an explanation of this option to the task description.
> [...]
> I can also change the task name and description to reflect this. :)

OK, I've updated the task name and description accordingly. Thanks again! :)

From your description: on pageview 1 a cookie would be set that defines what banner content should be served on pageview 2.

On mobile, if a user changes app, they would again get served "pageview 1" and never get to pageview 2. You could easily do that 3, 4 or 5 times across a number of apps and never get to pageview 2, in which case the user is never served a banner.

> From your description: on pageview 1 a cookie would be set that defines what banner content should be served on pageview 2.
>
> On mobile, if a user changes app, they would again get served "pageview 1" and never get to pageview 2. You could easily do that 3, 4 or 5 times across a number of apps and never get to pageview 2, in which case the user is never served a banner.

OK, so you're talking about WebView (as it's called on Android, at least). This is detectable in the UA string. So, we can also place users visiting the site in that mode in the "immediate banner view" bucket.
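For illustration, a minimal sketch of that kind of UA check. It relies on the `wv` token that Android WebView (Lollipop and later) places in the platform section of its UA string; it does not cover iOS in-app browsers, and the function name is hypothetical. A production check would more likely live in Varnish or in an existing UA-parsing library.

```python
import re

# Android WebView adds a "wv" token at the end of the parenthesised
# platform section, e.g. "(Linux; Android 5.1.1; Nexus 5 Build/LMY48B; wv)".
_ANDROID_WEBVIEW = re.compile(r"\(Linux; Android [^)]*;\s*wv\)")


def is_android_webview(user_agent: str) -> bool:
    """Heuristic: True if the UA string looks like an Android WebView."""
    return bool(_ANDROID_WEBVIEW.search(user_agent))


webview_ua = ("Mozilla/5.0 (Linux; Android 5.1.1; Nexus 5 Build/LMY48B; wv) "
              "AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 "
              "Chrome/43.0.2357.65 Mobile Safari/537.36")
browser_ua = ("Mozilla/5.0 (Linux; Android 10; Pixel 3) "
              "AppleWebKit/537.36 (KHTML, like Gecko) "
              "Chrome/88.0.4324.93 Mobile Safari/537.36")
```

Pageviews flagged this way could be placed in the "immediate banner view" bucket, and the same predicate could be applied to pageview data to estimate how many impressions are affected.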

Since it's in the UA string, it's also in our pageview data. So, we can also run some queries to find what percentage of banners are currently served to users visiting the site in that mode, and for many campaigns, what percentage of banner impressions result in click-throughs in that mode. I expect it's not a lot, but it's definitely worth looking at the data.

AndyRussG renamed this task from CentralNotice code to fix the banner bump with “ESI and pageview+1 with possible exception for infrequent visitors” to CentralNotice code to fix the banner bump with “ESI and pageview+1 with possible exceptions for infrequent visitors or as needed”. (Apr 1 2021, 6:44 PM)

OK, I updated the task title to reflect that the exception might be needed for more than just infrequent users. :)

Hi all! @BBlack, @bd808, @ema do you think this is a potentially feasible approach from a Varnish perspective? I think it's close to the most basic possible way to try out ESI...? If implemented, probably it should be rolled out gradually while cache load and performance are monitored?

(Also, just drawing everyone's attention to the note at the bottom of the task description, in case it's relevant: I'm working on this in my free time, not as part of my work, and it's not planned work for FR-Tech. Thanks much!! :) )

Without being cached Special:BannerLoader would need to be quite fast, as it will be the bottleneck in terms of response time when the cookie is present, including (currently) latency to the active main datacenter. Could calls to Special:BannerLoader made by the edge cache be cached for some time, varying by the set of parameters sent to it?

I know this would create stickiness in terms of campaign start and end (eg. if you cache Special:BannerLoader response for 5 minutes, your campaign might start and end 5 minutes late), but it would greatly reduce the cost per pageview for pages where a banner needs to be injected.

> Without being cached Special:BannerLoader would need to be quite fast, as it will be the bottleneck in terms of response time when the cookie is present, including (currently) latency to the active main datacenter. Could calls to Special:BannerLoader made by the edge cache be cached for some time, varying by the set of parameters sent to it?
>
> I know this would create stickiness in terms of campaign start and end (eg. if you cache Special:BannerLoader response for 5 minutes, your campaign might start and end 5 minutes late), but it would greatly reduce the cost per pageview for pages where a banner needs to be injected.

Hi! Yep, that's exactly the idea! Special:BannerLoader would be cached based on its parameters, which is how it works already. (The only thing that would change is that the cache would vary based on parameters in a cookie, instead of the URL parameters, as occurs now.) That's why it's important that the response continue to be deterministic based on the parameters.

We already have campaign stickiness like that, due to Special:BannerLoader and another request (ResourceLoader/load.php, which is where the client gets its list of possible campaigns and banners to choose from) also being cached. For emergencies (like a mistake in a banner that needs to be taken down right away) we also already have a UI to flush part of the Special:BannerLoader cache, making it possible to at least stop a banner immediately, if need be. I imagine this mechanism could be adapted to the new system (and might even just work with almost no changes).
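The stickiness and the emergency flush can be illustrated with a toy time-to-live cache. This is a sketch only: `TtlCache` is a hypothetical class, not how Varnish or the existing purge UI work internally, and time is passed in explicitly to keep the example deterministic.

```python
from typing import Callable, Dict, Tuple


class TtlCache:
    """Toy cache: responses are reused until they are older than ttl_seconds."""

    def __init__(self, ttl_seconds: float, fetch: Callable[[str], str]):
        self.ttl = ttl_seconds
        self.fetch = fetch
        self._entries: Dict[str, Tuple[float, str]] = {}

    def get(self, key: str, now: float) -> str:
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self.ttl:
            return entry[1]              # cache hit, possibly stale
        value = self.fetch(key)          # cache miss: ask the backend
        self._entries[key] = (now, value)
        return value

    def purge(self, key: str) -> None:
        """Emergency flush of one cached response."""
        self._entries.pop(key, None)


backend = {"banner=B1&campaign=C1": "<div>banner</div>"}
cache = TtlCache(ttl_seconds=300, fetch=lambda key: backend.get(key, ""))

at_start = cache.get("banner=B1&campaign=C1", now=0)
backend.clear()                                             # campaign ends on the backend
still_cached = cache.get("banner=B1&campaign=C1", now=100)  # old banner still served
cache.purge("banner=B1&campaign=C1")
after_purge = cache.get("banner=B1&campaign=C1", now=101)   # empty after the flush
```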

Thanks so much!! :)

Change 677637 had a related patch set uploaded (by AndyRussG; author: AndyRussG):

[mediawiki/extensions/CentralNotice@master] [WIP DO NOT MERGE] ESI banners

https://gerrit.wikimedia.org/r/677637

This task presupposes that the "banner bump" as it currently exists is a problem or will affect SEO. I have several responses to this:

#1
The current shift of content due to a banner is acceptable to all stakeholders. There are some banners that hurt performance more and we tend to debate those on a per-banner basis. This task is not needed to address any of those circumstances.

#2
The impact to SEO is not yet known and no stakeholder would want an engineering solution right now. There are ways of changing banners to minimize impact. WMF stakeholders would want to monitor the situation for a while before changing any plans let alone implementing an engineering solution.

#3
The feature described in this task would be detrimental to all use cases of CN. I can't think of a single CN user who would want this implementation. There are many casual readers who drop in for a single fact from an article and may not return for some time. The very first impression is critical to all CN campaigns. It is the source of a large share of donations during a fundraising campaign.

I will not allow a change like this to go to production for the foreseeable future.

> Hi all! @BBlack, @bd808, @ema do you think this is a potentially feasible approach from a Varnish perspective? I think it's close to the most basic possible way to try out ESI...? If implemented, probably it should be rolled out gradually while cache load and performance are monitored?

Supporting ESI in our edge, in production, is a really complex matter, in ways that limited testing of the feature in isolation doesn't capture. Even if we assume that resourcing and priority are non-issues: there are a lot of complexities in blending ESI with the existing logic and functionality, and a lot of risks that are difficult to manage, and also issues around vendor-locking ourselves into parts of our technical stack based on how (and how well, and how efficiently, and with what interactions with other logic) they implement ESI, which complicates other ongoing evolutions of the stack. As much as I would love for our team to provide the magic feature bullet which makes mitigating this easier, it's almost certainly not something we could reasonably support in the kinds of short timeframes that have been discussed for the search-ranking side of the issue for mobile, if ever.

Personally (but I grant that there are many constraints here that I'm only dimly aware of!), I'd much rather see solutions in the direction of not shifting the layout at all: either deciding on a minimal fixed height and filling it with blanks or "article of the day" sorts of quips when no banner is warranted, or switching to some kind of non-shifting overlay at the top or bottom.

> This task presupposes that the "banner bump" as it currently exists is a problem or will affect SEO. I have several responses to this:
>
> #1
> The current shift of content due to a banner is acceptable to all stakeholders. There are some banners that hurt performance more and we tend to debate those on a per-banner basis. This task is not needed to address any of those circumstances.

You've forgotten our most important stakeholders: the millions of people who use the site every day!

(And that's exactly whose experience the new SEO metrics take into account.)

> #2
> The impact to SEO is not yet known

No, this is wrong. It has been fully demonstrated that the banner bump negatively impacts our score on the new metrics that Google is rolling out. And Google has stated unequivocally that the new metrics will be taken into account in search rankings.

So, it is absolutely known that the banner bump will hurt our search rankings.

What is not known is how great the impact will be. As far as we can tell, at this point, it could be a little, or it could be a lot. Impact will probably be greater for small language wikis, and in regions where connectivity is worse and people use slower devices. This is exactly where the movement is seeking to grow communities. Google search results are our bread and butter, and it's our responsibility to provide technical support (in this case, in the form of a performant, usable, SEO-optimized site) for continued movement growth.

Also, even though we don't know how much SEO will be impacted, we almost certainly have the ability to stop any negative impact, via a solution like the one described here (at least in the medium term--see this comment about related cache issues).

Significant resources have already been spent studying SEO of our sites and improving them for it. See this epic task for improving SEO, with dozens of subtasks, on which we've expended a huge cumulative effort. See also this research, and this other research, another task, more research, work on a dashboard and on archiving SEO data, this library for processing SEO data, this dashboard prototype, and all the tasks with the SEO tag. I'm sure I missed some.

In summary: the banner bump will affect search rankings; it will likely impact most the communities where the movement seeks to grow and where more people are getting online for the first time; the importance of SEO is broadly recognized throughout the movement and the WMF; many hundreds of hours, probably more, have been spent by staff and volunteers on it. Even without knowing how great the impact of the new metrics will be, preventative measures are absolutely warranted.

> There are ways of changing banners to minimize impact.

This is also wrong. All the possible ways to change banners either (1) run afoul of another new metric that Google is implementing at the same time, or (2) involve drastic changes that would indeed affect campaigns (unlike the solution proposed here, see below for replies on that) as well as site usability.
Let's review these options:

  1. Banners overlay content. For this to work, banners would have to be small, because of the new Largest Contentful Paint (LCP) metric that goes into effect at the same time as CLS (the metric that the banner bump impacts). LCP will penalize sites when the largest element in the viewport takes a long time to render. So, if a banner is the largest element on a mobile screen, and it renders late (as it would in the current system), our rankings will also take a hit. In addition, if all banners have to be manually closed to reveal content, I'm sure readers and editors will find it incredibly annoying; the objections will likely be... strident. And I'd even be concerned about readership going down due to the annoyance.
  2. Banners are animated and overlay non-content elements. So, this option is like the "nag" elements that Fundraising banners sometimes have when you scroll away from the main banner at the top of the page. You wouldn't have to close them to access content, but they'd be annoying. As in the previous option, the banners would have to be small to not affect our score on the LCP metric, and also to not obscure too much actual content. This seems a bit better than the previous option, though I think objections from readers and editors would still be quite significant. Also, the small size and animated nature would probably mean reworking almost all banner layouts and content currently in use.
  3. Always reserve space, even if there is no banner. Banners would again have to be small, because of LCP, as with the other two options. There is a significant danger that users would get used to ignoring whatever's in this space, since it would almost never contain the information they're looking for (though some amount of animation or eye-catching design might help mitigate this). Reserving space like this would also throw a wrench into ongoing work to clean up site design.
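For context on the CLS side, Google's published definition scores a single layout shift as impact fraction times distance fraction. The sketch below applies that formula to a simplified case that is an assumption for illustration: one full-width content block pushed down by a late-loading banner. The function names and the example dimensions are hypothetical.

```python
from typing import Optional, Tuple


def visible_span(top: float, height: float,
                 viewport_h: float) -> Optional[Tuple[float, float]]:
    """Vertical span of the element that lies inside the viewport, if any."""
    start, end = max(0.0, top), min(viewport_h, top + height)
    return (start, end) if end > start else None


def layout_shift_score(top: float, height: float, shift: float,
                       viewport_w: float, viewport_h: float) -> float:
    """Score one vertical shift of a full-width element."""
    before = visible_span(top, height, viewport_h)
    after = visible_span(top + shift, height, viewport_h)
    spans = [s for s in (before, after) if s is not None]
    if not spans:
        return 0.0
    if len(spans) == 2 and spans[0][1] >= spans[1][0] and spans[1][1] >= spans[0][0]:
        # Overlapping spans: the union runs from the lowest start to the highest end.
        union = max(s[1] for s in spans) - min(s[0] for s in spans)
    else:
        union = sum(s[1] - s[0] for s in spans)
    impact_fraction = union / viewport_h  # full width, so area ratio = height ratio
    distance_fraction = abs(shift) / max(viewport_w, viewport_h)
    return impact_fraction * distance_fraction


# Article content fills a 360x800 viewport and is pushed down by a 150px banner.
score = layout_shift_score(top=0, height=2000, shift=150,
                           viewport_w=360, viewport_h=800)
```

Under these assumed dimensions the single shift scores 0.1875, above the 0.1 that Google's guidance treats as the upper bound for a good CLS.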

So, just to understand correctly: in the midst of a global economic crisis, you're suggesting radical, untested changes to our banner layouts and designs, and to all Fundraising banner content? Expected community and reader objections aside, this just seems incredibly risky. I see no way the risk wouldn't outweigh engineering effort needed to just fix the problem.

And there's another risk here. Changes to banner layout and design seem likely to have to go hand-in-hand with a reduction in the overall number of banners. This would probably be needed to keep the level of annoyance to a minimum and/or to reduce the impact on LCP that even small or medium-sized banners could have with very small screens. In that event, there would almost certainly be strong differences of opinion about which banners (Fundraising vs. different community campaigns vs. surveys, etc.) would have to be made smaller and/or reduced in number. Are you sure we should risk conflicts over banner real estate, rather than just fixing the problem? Again here, I don't see how not fixing the issue is justified.

To put the last point more anecdotally: some people have reacted to this by saying, "Oh, good, we'll show less Fundraising banners. We already show too many." But others say, "Oh, good, less community banners. That'll reduce banner fatigue, so people will notice Fundraising banners more, and we'll make more money." Truly I do not think that would be a healthy conflict to have. (And, of course, both positions are wrong. No one should use this issue, and avoid fixing a real usability problem on the way, for the purposes of an unrelated goal regarding how many or which banners we show. Anyone seeking to change banner numbers or types or formats should address those issues on their own terms, via the appropriate channels.)

> WMF stakeholders would want to monitor the situation for a while before changing any plans let alone implementing an engineering solution

Even if the most likely outcome were that the sites' rankings only decrease a little, there's still a non-negligible chance that they'll go down a lot. This is a needless risk. We have a duty of care for these sites.

> #3
> The feature described in this task would be detrimental to all use cases of CN.

No. This is false, and demonstrably so.

> I can't think of a single CN user who would want this implementation.

CentralNotice administrators, I guess you mean? Hmmm, so wouldn't they be concerned that people need to get to the site to see banners to begin with?

> There are many casual readers who drop in for a single fact from an article and may not return for some time.

How many is unclear, and the proposed solution accounts for those users. I think that almost all of them would still see banners.

> The very first impression is critical to all CN campaigns. It is the source of a large share of donations during a fundraising campaign.

No. This is 100% a misunderstanding of Fundraising data.

While most donations occur on the first banner a user sees, it does not follow that most of those users do not return to the site over the course of the campaign. Although we still need to determine how many users click a banner while only visiting the site once over the course of a campaign, the raw data is available and we should do the analysis. And, again, as explained, the proposal here aims to take those users into account. The data will show how many such users there are and how well the proposed measures can work to ensure they still get banners.

> I will not allow a change like this to go to production for the foreseeable future.

I very much appreciate you taking the time to write your thoughts here, thanks so much for that. Hopefully my replies above aren't too harsh? Ahhh... :) Thanks so so much once again!! :) :)

AndyRussG updated the task description.
AndyRussG added a subscriber: mpopov.

>> Hi all! @BBlack, @bd808, @ema do you think this is a potentially feasible approach from a Varnish perspective? I think it's close to the most basic possible way to try out ESI...? If implemented, probably it should be rolled out gradually while cache load and performance are monitored?
>
> Supporting ESI in our edge, in production, is a really complex matter, in ways that limited testing of the feature in isolation doesn't capture. Even if we assume that resourcing and priority are non-issues: there are a lot of complexities in blending ESI with the existing logic and functionality, and a lot of risks that are difficult to manage, and also issues around vendor-locking ourselves into parts of our technical stack based on how (and how well, and how efficiently, and with what interactions with other logic) they implement ESI, which complicates other ongoing evolutions of the stack. As much as I would love for our team to provide the magic feature bullet which makes mitigating this easier, it's almost certainly not something we could reasonably support in the kinds of short timeframes that have been discussed for the search-ranking side of the issue for mobile, if ever.

Hi! Thanks so much for this. :) So, I guess to summarize: it's complicated, definitely can't be done quickly and easily, might even require additional hardware, and has non-trivial implications for the future of the stack, though it's also probably not downright impossible to do one day, either. It would at least require careful planning and would have to be prioritized above other important work. Does that sound more or less right?

> Personally (but I grant that there are many constraints here that I'm only dimly aware of!), I'd much rather see solutions in the direction of not shifting the layout at all: either deciding on a minimal fixed height and filling it with blanks or "article of the day" sorts of quips when no banner is warranted, or switching to some kind of non-shifting overlay at the top or bottom.

Right... So, this is something many folks have suggested, though unfortunately I think it's not feasible, or at the very least, not ideal, and risky. (See my last comment, above, for details. Just briefly: overlays will also cause SEO issues under the new metrics, and as for reserved space, it would need to be small, and people learn to ignore what they don't care about pretty quickly. At the end of the day, all the other options trade one usability problem for another, or for a different sort of risk or problem. That's why just fixing the banner bump the right way--even if it's difficult and requires assignment of already scarce resources--really seems like the best option. Also, apologies if you feel I've underestimated the difficulty of the cache changes!)

Thanks again!! :)

If the banner was position fixed (or absolute, sticky) to the bottom of the screen by default might that dodge the issue entirely by requiring a repaint but not a cumulative layout shift/reflow? Have we done any tests to see if such banners would fare worse?

ESI would be a game-changer for CentralNotice, thank you @AndyRussG for exploring this possibility!

I have some scattered notes to contribute,

  • First-visit or cookie-blocking users could still be included by having them fall back to a dynamically-loaded banner (if any). ESI could simply inject an empty string, and already-present JS detects this, or we could inject special JS to handle these users. Maybe that solves the problem? If not, then we need to look at some analytics correlating visitors' WMF-Last-Access cookie to banner impressions and to donations.
  • We have to be careful to serve the same content to search indexing spiders as we do to users. Even accidentally producing "ad-free" content for the spiders will be punished by our algorithmic overlords. Industry jargon for this is "cloaking". (1) (2)
  • Caching the Special:BannerLoader results is critical as already mentioned. It needs to serve at least as fast as the actual page content, or we risk slowing down the entire page load which is obviously bad for everyone.
  • Just to mention, Varnish can do fancy processing of cookies and GeoIP information, allowing us to construct most of the bannerloader params even on a first pageload.

> If the banner was position fixed (or absolute, sticky) to the bottom of the screen by default might that dodge the issue entirely by requiring a repaint but not a cumulative layout shift/reflow?

Hi! Thanks! Yes, I think that's right... and this does seem to be one of the least bad options for minimizing SEO impact via changes in banner format. The footer would have to be small. If it's done well, I think it might be only a bit more annoying than current banners? Still, I wouldn't be surprised if many readers and editors don't like it. It wouldn't be compatible with Fundraising's current initial-large-banner approach.

Have we done any tests to see if such banners would fare worse?

Not that I know of... However, it does seem likely that, at least temporarily, we'll have to switch to SEO-friendly banner formats. So, I'm pretty sure we should do some tests! @Pcoombe, do you know if there have been any?

I do feel SEO-friendly banner format options should be considered, given how soon the new Google metrics are coming online... So, spinning off a new task for that.. see below... hope that makes sense... thanks again!!

ESI would be a game-changer for CentralNotice, thank you @AndyRussG for exploring this possibility!

I have some scattered notes to contribute,

  • First-visit or cookie-blocking users could still be included by having them fall back to a dynamically-loaded banner (if any). ESI could simply inject an empty string, and already-present JS detects this, or we could inject special JS to handle these users. Maybe that solves the problem? If not, then we need to look at some analytics correlating visitors' WMF-Last-Access cookie to banner impressions and to donations.

Yes! Note that cookie-blocking users mostly don't see banners, but first-visit and cookie-clearing users do. In JS, we can tell whether cookies are actively blocked vs. just not present. Definitely agreed that cookieless-but-not-cookie-blocking users should be flagged to get a banner on the first pageview.
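
Roughly, the blocked-vs-absent distinction could look like the sketch below. This is only one possible way to do it, under stated assumptions: the `CookieJar` interface, the probe cookie name, and `memoryJar` are all hypothetical stand-ins (in a browser the jar would wrap `document.cookie`), not existing CentralNotice code.

```typescript
// Hypothetical cookie abstraction, so the logic can run outside a browser.
interface CookieJar {
  get(name: string): string | undefined;
  set(name: string, value: string): void;
  remove(name: string): void;
}

type CookieState = "present" | "absent" | "blocked";

function classifyCookieState(jar: CookieJar, bannerCookie: string): CookieState {
  if (jar.get(bannerCookie) !== undefined) return "present";
  // Probe: write a throwaway cookie and read it back. If the write does not
  // stick, cookies are actively blocked rather than merely not set yet.
  const probe = "cn_cookie_probe"; // hypothetical name
  jar.set(probe, "1");
  const writable = jar.get(probe) === "1";
  jar.remove(probe);
  return writable ? "absent" : "blocked";
}

// Cookieless-but-not-blocking users get flagged for a first-pageview banner.
function showOnFirstPageview(state: CookieState): boolean {
  return state === "absent";
}

// In-memory jar for demonstration; `blocked` simulates a cookie-blocking browser.
function memoryJar(blocked: boolean): CookieJar {
  const store = new Map<string, string>();
  return {
    get: (name) => store.get(name),
    set: (name, value) => { if (!blocked) store.set(name, value); },
    remove: (name) => { store.delete(name); },
  };
}
```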

However, that still leaves the case of infrequent users, who may not come back to the site over the course of a campaign, since with a pageview+1 approach, they'd only see a banner on their second pageview while the campaign is active. So, I think WMF-Last-Access analytics work is needed.

  • We have to be careful to serve the same content to search indexing spiders as we do to users. Even accidentally producing "ad-free" content for the spiders will be punished by our algorithmic overlords. Industry jargon for this is "cloaking". (1) (2)

Ohhh yeah good point, thanks!!

  • Caching the Special:BannerLoader results is critical as already mentioned. It needs to serve at least as fast as the actual page content, or we risk slowing down the entire page load which is obviously bad for everyone.
  • Just to mention, Varnish can do fancy processing of cookies and GeoIP information, allowing us to construct most of the bannerloader params even on a first pageload.

Right! So, this is also a route for mitigation for infrequent users... Most of the info that we keep on the client (banner impression counts, large-then-small flags, assigned bucket) isn't relevant for infrequent users, so server-side banner selection, on the first pageview, would in theory be possible for them. At least some randomness might still be needed, at least for initial bucket selection, though as @Krinkle has pointed out elsewhere, we could generate some pseudo-randomness from other components of the request (like article name). Maybe it'll be time to revive the old slices (or what was it called) feature?
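
For the pseudo-randomness idea, a rough sketch of deriving a bucket from request components might look like this. It is an illustration under assumptions, not a worked-out design: `pickBucket` is a hypothetical name, the hash is an FNV-1a variant applied to UTF-16 code units (not the byte-wise standard form), and in the real proposal this would presumably run in the cache layer rather than in TypeScript.

```typescript
// FNV-1a-style 32-bit hash over UTF-16 code units (a simplification of the
// byte-wise algorithm; fine for spreading values, not for cryptography).
function fnv1a32(text: string): number {
  let hash = 0x811c9dc5; // FNV offset basis
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0; // multiply by FNV prime, keep unsigned 32-bit
  }
  return hash >>> 0;
}

// Hypothetical helper: map (campaign, article title) to a bucket index.
function pickBucket(campaign: string, articleTitle: string, bucketCount: number): number {
  if (!Number.isInteger(bucketCount) || bucketCount < 1) {
    throw new RangeError("bucketCount must be a positive integer");
  }
  return fnv1a32(`${campaign}|${articleTitle}`) % bucketCount;
}
```

One caveat with this approach: everyone requesting the same article lands in the same bucket, so the split is only pseudo-random across articles, not across users.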

Another possible mitigation could be a campaign "preparation" period, during which banners are not shown, but cookies are set to trigger future banner display, once the campaign begins?
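
A minimal sketch of how such a "preparation" period could fit into pageview+1 logic, purely as an assumption about the shape it might take (the `CampaignWindow` fields and function names are hypothetical, and timestamps are epoch milliseconds):

```typescript
// Hypothetical campaign window with an extra preparation phase before the start.
interface CampaignWindow {
  prepStart: number; // preparation begins: cookies set, no banners shown
  start: number;     // campaign goes live
  end: number;       // campaign ends (exclusive)
}

type Phase = "inactive" | "preparation" | "live";

function campaignPhase(now: number, w: CampaignWindow): Phase {
  if (now >= w.start && now < w.end) return "live";
  if (now >= w.prepStart && now < w.start) return "preparation";
  return "inactive";
}

interface PageviewAction {
  setBannerCookie: boolean; // record which banner to show on the next pageview
  showBanner: boolean;      // show a banner on this pageview
}

function pageviewAction(phase: Phase, hasBannerCookie: boolean): PageviewAction {
  switch (phase) {
    case "preparation":
      // Prime the cookie so the first pageview after launch can show a banner.
      return { setBannerCookie: true, showBanner: false };
    case "live":
      // Plain pageview+1: show only if an earlier pageview set the cookie.
      return { setBannerCookie: true, showBanner: hasBannerCookie };
    case "inactive":
      return { setBannerCookie: false, showBanner: false };
  }
}
```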

AndyRussG renamed this task from CentralNotice code to fix the banner bump with “ESI and pageview+1 with possible exceptions for infrequent visitors or as needed” to CentralNotice code to fix the banner bump with “pageview+1 with exceptions for infrequent visitors and as needed”. Apr 19 2021, 6:29 AM
AndyRussG updated the task description.

Updates:

  • Initial queries on data from the recent Italy Fundraising campaign show that, after clicking on a banner, almost 90% of donors returned to the site before the campaign ended. (Details in the task linked below.)
  • I updated the task description with explanations about additional possible ways to ensure banners for users who would be left behind by an exclusively pageview+1 system.
  • There is also a fully client-side option for fixing the banner bump, which does not require ESI; I also updated the task description to include it. (Thanks @Krinkle for going over this!) (I still like the ESI option better, though...)
  • Status of the attached patch: probably about 60% finished CentralNotice code for ESI banners.

So, also, it seems there are just too many facets to the general banner bump/SEO topic for this single task! Perhaps this task can focus on technical aspects of pageview+1 solutions? Here are some additional tasks for related issues that have come up:

I don't want to derail the discussion with weird ideas I've just had - the current proposals are great - but since they're unusual and haven't been discussed before I wanted to bring them up. Feel free to spin that into a separate task to discuss those ideas separately.

When we look for alternative banner formats that don't cause layout shifts, we know that we have to avoid making the banner too big, so that instead of being captured by Cumulative Layout Shift, it doesn't end up being captured by Largest Contentful Paint.

There is an important factor in the definition of Largest Contentful Paint:

The browser will stop reporting new entries as soon as the user interacts with the page (via a tap, scroll, or keypress).

In other words, we could wait for such user interactions to happen before injecting a (non-layout-shift-inducing) banner. Maybe with a little delay so it's not too jarring, but it might work. In this case we wouldn't have to worry about the banner size.
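
As a rough sketch of the "wait for interaction, then inject after a short delay" idea: the code below is illustrative only. `injectAfterFirstInteraction` is a hypothetical helper, the event names are an assumption about which events would count as the first interaction, and the target and scheduler are passed in so the logic does not depend on a browser (in a page they would presumably be `window` and a `setTimeout` wrapper).

```typescript
// Minimal slice of the DOM EventTarget API that this sketch needs.
interface InteractionTarget {
  addEventListener(type: string, listener: () => void): void;
  removeEventListener(type: string, listener: () => void): void;
}

// Injectable timer, e.g. (cb, ms) => { setTimeout(cb, ms); } in a browser.
type Scheduler = (callback: () => void, delayMs: number) => void;

function injectAfterFirstInteraction(
  target: InteractionTarget,
  schedule: Scheduler,
  inject: () => void,
  delayMs: number = 300, // small delay so the banner is less jarring (assumed value)
): void {
  // Assumed set of events standing in for "tap, scroll, or keypress".
  const events = ["pointerdown", "keydown", "scroll"];
  let fired = false;
  const onInteraction = (): void => {
    if (fired) return; // only react to the first interaction
    fired = true;
    for (const type of events) target.removeEventListener(type, onInteraction);
    schedule(inject, delayMs);
  };
  for (const type of events) target.addEventListener(type, onInteraction);
}
```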

If we want to insert a (non-layout-shift-inducing) banner before any user interaction, we can also consider dynamically sizing it to always be smaller than the biggest element in the viewport, which most of the time is the first paragraph, or can be a big thumbnail. We can find via JS which element is the biggest and shrink the banner to be smaller if necessary. Or we can just not display the banner at all if the viewport has such extreme dimensions that even a shrunken banner wouldn't fit without being the biggest element in view.
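
The sizing arithmetic could be sketched like this. It is a simplification under assumptions: `fitBannerBelowLargest` is a hypothetical helper, sizes are in CSS pixels, only the banner height is shrunk, and "biggest element" is reduced to a single area number that some other (not shown) code would have to measure in the viewport.

```typescript
interface Size {
  width: number;
  height: number;
}

// Returns a banner size whose area is strictly below the largest visible
// element's area, or null if that would push the height below minHeight.
function fitBannerBelowLargest(
  banner: Size,
  largestVisibleArea: number, // area of the biggest element in view (assumed measured elsewhere)
  minHeight: number,          // below this the banner is not worth showing (assumed threshold)
): Size | null {
  if (banner.width <= 0 || banner.height <= 0) return null;
  if (banner.width * banner.height < largestVisibleArea) return banner; // already small enough
  // Largest integer height with width * height strictly below the limit.
  const maxHeight = Math.ceil(largestVisibleArea / banner.width) - 1;
  if (maxHeight < minHeight) return null; // skip the banner entirely
  return { width: banner.width, height: maxHeight };
}
```

For example, an 800x300 banner against a largest element of 200000 square pixels would be shrunk to 800x249 (199200 square pixels), while against a largest element of 40000 square pixels with a 60px minimum it would not be shown at all.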

These mitigation strategies actually work because those banners would be injected with JS and because that injection happens relatively late in the page's initial rendering. In effect, we'd be using those infrastructural weaknesses to our advantage.

[...] ideas I've just had [...] Feel free to spin that into a separate task to discuss those ideas separately.

Cool beans, thanks!! I did copy your comment to T280477, and updated the explanations there. :)

Just noting here, Google has postponed the rollout of its new search ranking metrics to mid-June.

And the rollout will be gradual until August, with no details on how the gradual part will work.

Just thinking this over yet again... so, if, for mitigation for infrequent users, we use option (ii) (see the current task description) "choose a banner server-side based on targeting data points available there", we could actually invert the whole proposal. That is, instead of "pageview+1 with exceptions for infrequent visitors and as needed", it could become "server-side banner selection and immediate banner injection, with pageview+1 mitigation for frequent users". (In this case, "mitigation for frequent users" would mean, more or less, preventing them from seeing too many banners, as we currently do, using data stored on the client.)

I do think such an implementation is possible. Careful planning and testing would still be essential. Server-side selection could take into account donate cookies... and there would be plenty more details to figure out, too.

Nonetheless, that method would probably end up showing a few more banners, not fewer, than the current system. Maybe that could help allay concerns we've heard about an engineering solution to the banner bump? And if banners are only ever injected right away (either via ESI or early JS/css injection during page display) we'd be pretty certain to stop CentralNotice's SEO impact, I think?

(Thanks so much @awight for suggestions on this, btw!)

Mentioned this at today's Front-End Standards Group meeting, but I think this problem would be a good candidate for the new Technical Decision Making Process because it cuts across the concerns of several different teams. This process already includes a framework for evaluating different possible solutions (including the "solution" of leaving the current status quo in place).

See https://phabricator.wikimedia.org/tag/tech-decision-forum/ for some other comparable proposals.

@egardner thanks for the note. This is not currently a priority for fr-tech. There are no plans to work on this in the foreseeable future. If it becomes a priority I will initiate the process. Thanks!

@egardner thanks for the note. This is not currently a priority for fr-tech. There are no plans to work on this in the foreseeable future. If it becomes a priority I will initiate the process. Thanks!

@DStrine heyy quick question: if a technical solution were available that we could guarantee would show the same number of banners as the current system, and if work were organized/assigned in such a way that fr-tech was not impacted at all (for example, if a team other than fr-tech took ownership) do you imagine there might still be significant concerns about such a solution?

Thanks so so much, and many apologies for the bother...!!!! (just wishing to perhaps learn a bit more about everyone's points of view here... :) )

Hi @Jdlrobson... thanks so so much for having taken a look at this!! :) Did you see this alternate, possibly superior approach? T283521: Proposal: Fix banner bump with server-side cache-layer banner selection Thx again!