MW returns uncacheable responses for en.wikipedia.org when specific XFF values are sent
Closed, Resolved · Public · Security

Description

While installing the new cp hosts in eqiad (T349244), we noticed strange behavior when requesting resources from the new text hosts (the same behavior occurs when the request is made from other hosts):

fabfur@cp1100:~$ curl -H "X-Forwarded-For: 2620:0:861:ed1a::1, 10.64.16.240" -H "Host: en.wikipedia.org" -H "X-Forwarded-Proto: https" https://appservers-rw.discovery.wmnet/wiki/Foobar -v -o /dev/null -s 2>&1  | egrep -i 'expires|cache'
< Expires: Thu, 01 Jan 1970 00:00:00 GMT
< Cache-Control: private, must-revalidate, max-age=0

Because ATS cannot cache this response, it is passed through to Varnish.
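As a rough illustration of why the header above counts as uncacheable for a shared cache, while values such as `s-maxage=86400, must-revalidate, max-age=0` do not, here is a minimal sketch. The function name `classify_cache_control` is hypothetical, and the rules are deliberately simplified assumptions: `private`, `no-store` and `no-cache` are all treated as uncacheable, `s-maxage` takes precedence over `max-age`, and `Expires` is ignored. It is not how ATS or Varnish actually decide.

```shell
# Hypothetical helper: simplified view of how a shared cache might read
# a Cache-Control value. Not the real ATS/Varnish logic.
classify_cache_control() {
  value=$(printf '%s\n' "$1" | tr '[:upper:]' '[:lower:]')

  # Assumption: any of these directives makes the response uncacheable.
  case "$value" in
    *private*|*no-store*|*no-cache*) echo "uncacheable"; return 0 ;;
  esac

  # Shared caches prefer s-maxage; fall back to max-age when it is absent.
  ttl=$(printf '%s\n' "$value" | sed -n 's/.*s-maxage=\([0-9][0-9]*\).*/\1/p')
  if [ -z "$ttl" ]; then
    ttl=$(printf '%s\n' "$value" | sed -n 's/.*max-age=\([0-9][0-9]*\).*/\1/p')
  fi

  # A positive lifetime means the response may be stored and reused.
  if [ "${ttl:-0}" -gt 0 ]; then
    echo "cacheable"
  else
    echo "uncacheable"
  fi
}
```

For example, `classify_cache_control "private, must-revalidate, max-age=0"` reports the response as uncacheable under these assumptions.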

Some notes to help troubleshooting:

  • The XFF header is populated by both HAProxy and Varnish
  • This happens only in eqiad|codfw, when the XFF is set to the local text-lb address:
vgutierrez@cp6009:~$ for dc in eqiad codfw esams ulsfo eqsin drmrs; do echo $dc && curl -H "X-Forwarded-For: $(dig +short text-lb.$dc.wikimedia.org), 10.136.0.10" -H "X-Forwarded-Proto: https" -H "Host: en.wikipedia.org" https://appservers-rw.discovery.wmnet/wiki/Foobar -v -o /dev/null -s 2>&1 |egrep -i "Expires|cache"; done
eqiad
< Expires: Thu, 01 Jan 1970 00:00:00 GMT
< Cache-Control: private, must-revalidate, max-age=0
codfw
< Expires: Thu, 01 Jan 1970 00:00:00 GMT
< Cache-Control: private, must-revalidate, max-age=0
esams
< Cache-Control: s-maxage=86400, must-revalidate, max-age=0
ulsfo
< Cache-Control: s-maxage=86400, must-revalidate, max-age=0
eqsin
< Cache-Control: s-maxage=86400, must-revalidate, max-age=0
drmrs
< Cache-Control: s-maxage=86400, must-revalidate, max-age=0
  • This happens only on the en.wikipedia.org domain; as a counter-example:
vgutierrez@cp6009:~$ for dc in eqiad codfw esams ulsfo eqsin drmrs; do echo $dc && curl -H "X-Forwarded-For: $(dig +short text-lb.$dc.wikimedia.org), 10.136.0.10" -H "X-Forwarded-Proto: https" -H "Host: it.wikipedia.org" https://appservers-rw.discovery.wmnet/wiki/Friedrich_von_Kenner -v -o /dev/null -s 2>&1 |egrep -i "Expires|cache"; done
eqiad
< Cache-Control: s-maxage=1209600, must-revalidate, max-age=0
codfw
< Cache-Control: s-maxage=1209600, must-revalidate, max-age=0
esams
< Cache-Control: s-maxage=1209600, must-revalidate, max-age=0
ulsfo
< Cache-Control: s-maxage=1209600, must-revalidate, max-age=0
eqsin
< Cache-Control: s-maxage=1209600, must-revalidate, max-age=0
drmrs
< Cache-Control: s-maxage=1209600, must-revalidate, max-age=0
  • Not setting the XFF header, or setting it to only the non-text-lb address, results in a cacheable response:
fabfur@cp1100:~$ curl -H "Host: en.wikipedia.org" -H "X-Forwarded-Proto: https" https://appservers-rw.discovery.wmnet/wiki/Foobar -v -o /dev/null -s 2>&1  | egrep -i 'expires|cache'
< Cache-Control: s-maxage=86400, must-revalidate, max-age=0

fabfur@cp1100:~$ curl -H "X-Forwarded-For: 10.136.0.10" -H "Host: en.wikipedia.org" -H "X-Forwarded-Proto: https" https://appservers-rw.discovery.wmnet/wiki/Foobar -v -o /dev/null -s 2>&1  | egrep -i 'expires|cache'
< Cache-Control: s-maxage=86400, must-revalidate, max-age=0

fabfur@cp1100:~$ curl -H "X-Forwarded-For: 208.80.154.224" -H "Host: en.wikipedia.org" -H "X-Forwarded-Proto: https" https://appservers-rw.discovery.wmnet/wiki/Foobar -v -o /dev/null -s 2>&1  | egrep -i 'expires|cache' # 208.80.154.224 is text-lb.eqiad.wikimedia.org
< Expires: Thu, 01 Jan 1970 00:00:00 GMT
< Cache-Control: private, must-revalidate, max-age=0

Quick recap: this behavior (an uncacheable response) seems to be reproducible only when the XFF contains the 'text-lb.eqiad|codfw.wikimedia.org' IPs (both IPv4 and IPv6) and the request is for the en.wikipedia.org domain.

Details

Risk Rating
High
Author Affiliation
Wikimedia Communities

Event Timeline

Fabfur updated the task description.
Fabfur updated the task description.
Fabfur shifted this object from the S1 Public space to the Restricted Space space. Nov 9 2023, 2:26 PM
Joe shifted this object from the Restricted Space space to the S1 Public space. Nov 9 2023, 3:55 PM
Joe changed the visibility from "Public (No Login Required)" to "acl*security_sre (Project)".
Joe changed the edit policy from "All Users" to "acl*security_sre (Project)".
Joe changed the visibility from "acl*security_sre (Project)" to "Public (No Login Required)".
Joe changed the edit policy from "acl*security_sre (Project)" to "All Users".
Joe set Security to Software security bug.
Joe added projects: Security, Security-Team.
Joe changed the visibility from "Public (No Login Required)" to "Custom Policy".
Joe changed the subtype of this task from "Task" to "Security Issue".
Joe subscribed.

potential abuse

Vgutierrez added subscribers: taavi, Vgutierrez.

As pointed out by @taavi on IRC, this behavior is triggered by the presence of User talk pages for certain IPs, such as https://en.wikipedia.org/wiki/User_talk:2620:0:861:ED1A:0:0:0:1 or https://en.wikipedia.org/wiki/User_talk:208.80.154.224. This triggers a notification on every page that a user visits from that IP, making the response uncacheable:

vgutierrez@cp6009:~$ curl -H "X-Forwarded-For: $(dig +short text-lb.eqiad.wikimedia.org), 10.136.0.10" -H "X-Forwarded-Proto: https" -H "Host: en.wikipedia.org" https://appservers-rw.discovery.wmnet/wiki/Mittelherwigsdorf -s |grep -i message
                        <div class="mw-message-box cdx-message cdx-message--block mw-message-box-notice cdx-message--notice vector-language-sidebar-alert"><span class="cdx-message__icon"></span><div class="cdx-message__content">Language links are at the top of the page across from the title.</div></div>
                                        <div class="usermessage"><span id="mw-youhavenewmessages">You have <a href="/wiki/User_talk:208.80.154.224" title="User talk:208.80.154.224">a new message</a> (<a href="/w/index.php?title=User_talk:208.80.154.224&amp;diff=cur" title="User talk:208.80.154.224">last change</a>).</span></div>
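The check above can be narrowed to the notification itself. The following is a minimal sketch, assuming the banner is always rendered with the `mw-youhavenewmessages` id seen in the output above; the function name `newtalk_banner_state` is hypothetical.

```shell
# Hypothetical helper: reads an HTML page on stdin and reports whether the
# "You have a new message" notice is present.
# Assumption: the notice always carries id="mw-youhavenewmessages".
newtalk_banner_state() {
  if grep -q 'id="mw-youhavenewmessages"'; then
    echo "banner"
  else
    echo "no-banner"
  fi
}
```

It could be used by piping the body of one of the curl requests above (with `-s` and without `-o /dev/null`) into `newtalk_banner_state`.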

These specific User talk pages were added back in 2021 by https://en.wikipedia.org/wiki/User:Who_are_you_and_him.

This behavior from MW has the following issues:

  • Anybody can create the User talk page for an IP that they don't own and thereby trigger uncacheable responses every time a user from that IP visits the Wikipedia where the User talk page was created.
  • If the visited article is a cold page (cached by neither ATS nor Varnish), Varnish will cache its uncacheability as a HfP (hit-for-pass) object, triggering a cache bypass for every anonymous user who attempts to visit that article.
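The second issue can be pictured with a toy model of a frontend cache. This is only a sketch under stated assumptions: `CACHE_STATE` and `frontend_request` are hypothetical names, the "cache" is a temporary file, and object lifetimes are ignored (a real hit-for-pass marker expires after its TTL). It is not Varnish's actual implementation.

```shell
# Toy model of hit-for-pass (HfP). Each line of the state file is
# "<url> hit" (a cached object) or "<url> hfp" (a hit-for-pass marker).
CACHE_STATE=$(mktemp)

# frontend_request URL VERDICT
# VERDICT is what the backend would answer for this particular client:
# "cacheable" or "uncacheable". Prints hit, pass or miss.
frontend_request() {
  url=$1
  backend_verdict=$2
  state=$(awk -v u="$url" '$1 == u { s = $2 } END { print s }' "$CACHE_STATE")

  case "$state" in
    hit) echo "hit" ;;    # served from cache
    hfp) echo "pass" ;;   # marker found: bypass the cache for everyone
    *)
      # Cold URL: remember either the object or its uncacheability.
      if [ "$backend_verdict" = "uncacheable" ]; then
        printf '%s hfp\n' "$url" >> "$CACHE_STATE"
      else
        printf '%s hit\n' "$url" >> "$CACHE_STATE"
      fi
      echo "miss"
      ;;
  esac
}
```

In this model, one request for a cold URL from an affected IP (verdict `uncacheable`) records the marker, and the next request for the same URL is a `pass` even when the backend would have returned a cacheable response to that client.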

From a product perspective, unregistered users reliably receive talk page notification banners (a.k.a. the orange bar of doom) during their anonymous edit session, which starts with their first edit and lasts as long as their browser keeps the session cookie (typically 2-3 weeks, I believe, but this depends on browsing habits and the browser's privacy settings).

When someone is browsing the site without any session, they generally get their page views from the Varnish cache, and this happens correctly regardless of whether there happen to be unread talkpage notifications.

However, by accident there exists a set of conditions today under which an unregistered user can see the "new talk page notification" banner, even if they don't have a session cookie:

  1. You are logged-out and don't have an edit session or other session cookie.
  2. There exists a User_talk page for your IP.
  3. There are unread messages on said user talk page (i.e. when you last had a session cookie, you didn't click the banner or otherwise visit your own talk page after the message was sent; noting that "visiting" the talk page is what marks it as read).
  4. You are the first to view an article after it expires from the CDN cache.
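The conditions above can be written down as a small decision function. This is a simplified sketch, not MediaWiki's logic: the function name `banner_shown` and its yes/no arguments are hypothetical, and the session case simply assumes the banner is shown whenever there is an unread message.

```shell
# Hypothetical model of when the "new messages" banner appears.
# Arguments (each "yes" or "no"):
#   1: visitor has a session cookie
#   2: a User_talk page exists for the visitor's IP
#   3: that talk page has unread messages
#   4: this page view is a CDN cache miss
banner_shown() {
  has_session=$1
  talk_page_exists=$2
  has_unread=$3
  cdn_cache_miss=$4

  # No talk page, or nothing unread: there is nothing to notify about.
  if [ "$talk_page_exists" != "yes" ] || [ "$has_unread" != "yes" ]; then
    echo "no"
    return 0
  fi

  # Assumption: with a session, the banner is shown reliably.
  if [ "$has_session" = "yes" ]; then
    echo "yes"
    return 0
  fi

  # Without a session, the banner appears only when the CDN has to ask
  # MediaWiki for the page (conditions 1 to 4 all hold).
  if [ "$cdn_cache_miss" = "yes" ]; then
    echo "yes"
  else
    echo "no"
  fi
}
```

For instance, `banner_shown no yes yes no` models a session-less visitor with unread messages who is served from the CDN cache, in which case no banner appears.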

In all likelihood, even when this does happen, the banner will be gone again on the very next article you navigate to, or even when you merely refresh the page, since you're likely to hit the CDN cache.

In summary: despite not having seen the orange notification banner for the several days between your session expiring and the random CDN cache miss you got while reading an article, you can sometimes get random page views with a banner on them. Those intermittent responses are (thankfully) correctly marked uncacheable and do not poison the cache with a personalised response for the world to see. But, as per the task description, this does "teach" Varnish that the URL is sometimes uncacheable even without a session cookie, which is an open attack vector.

Conceptually, this feature is similar to other personalisation features in MediaWiki, and we generally only apply those when there is a session cookie. The user talk page notification is an exception to that principle.

I estimate this as low-risk to fix, and fixing it would generally meet user expectations. If anything, it is not unlikely that session-less visitors who are served this banner are unrelated people on the same IP (e.g. imagine a school or office, where you get a banner meant for someone else on the same connection). Similarly, when a floating IP is re-assigned after many months, you might get "notified" about something that happened a long time ago and may have been addressed to someone else.

This also matches the future Temp users model, under which there literally isn't a way to show notifications on session-less page views: we wouldn't know which Temp user the visitor is, since the cookie is what assigns the user ID.

Krinkle reassigned this task from Krinkle to pmiazga.
Krinkle moved this task from Soon to Current Sprint on the MediaWiki-Platform-Team board.

Patch was made a couple days ago but somehow the phab ticket wasn't labelled. Adding a label manually.

> but somehow the phab ticket wasn't labelled

This ticket is not public so @gerritbot cannot access it.

@Aklapper good point. Thanks - I didn't know that.

Could we make this task public, or do the IP addresses etc. posted here make it PermanentlyPrivate? (it seems that they're our own IPs)

> Could we make this task public, or do the IP addresses etc. posted here make it PermanentlyPrivate? (it seems that they're our own IPs)

I'm fine with this task being public now, unless other folks have concerns about potentially (still) sensitive data within this task.

> This ticket is not public so @gerritbot cannot access it.

One can always add @gerritbot as a subscriber to protected tasks like these. Of course, if a public patch is being created within gerrit, then there often isn't much additional value in keeping the relevant task protected.

sbassett changed Author Affiliation from N/A to Wikimedia Communities. Jan 18 2024, 2:44 AM
sbassett changed the visibility from "Custom Policy" to "Public (No Login Required)".
sbassett changed Risk Rating from N/A to High.