Page MenuHomePhabricator

Invalidate PHP7's opcache when needed
ClosedPublic

Authored by Joe on Jan 2 2019, 11:45 AM.

Details

Reviewers
mmodell
thcipriani
Group Reviewers
Restricted Owners Package(Owns No Changed Paths)
Release-Engineering-Team
Commits
rMSCA2d725a7b4ee3: Fix "F632 use ==/!= to compare str, bytes, and int literals"
rMSCA7bc0a15f0b7a: Drop Arcanist files after move to Gerrit
rMSCAe1b8fc19e3c6: Invalidate PHP7's opcache when needed
Patch without arc
git checkout -b D1135 && curl -L https://phabricator.wikimedia.org/D1135?download=true | git apply
Summary

PHP7 has its bytecode cache called opcache; it can operate in different
ways, but ideally we want it to never revalidate files once they're
parsed the first time, and to be able to only release new code quickly after
everything's synced. To this aim, an "admin site" for PHP7 is present in
the WMF production clusters to allow to selectively invalidate the
opcache.

Bug: T211964

Diff Detail

Lint
Lint Skipped
Unit
Unit Tests Skipped

Event Timeline

Joe created this revision.Jan 2 2019, 11:45 AM
Restricted Application added a reviewer: Restricted Owners Package.Jan 2 2019, 11:45 AM
Restricted Application added a reviewer: mmodell. · View Herald TranscriptJan 2 2019, 11:45 AM
Restricted Application added a reviewer: Release-Engineering-Team. · View Herald Transcript
Restricted Application added a project: Release-Engineering-Team. · View Herald Transcript
Joe requested review of this revision.Jan 2 2019, 11:46 AM
Joe added a comment.Jan 2 2019, 11:49 AM

I frankly have no idea why these unit test issues pop up; I'm pretty sure I didn't touch anything remotely related.

In D1135#22657, @Joe wrote:

I frankly have no idea why these unit test issues pop up; I'm pretty sure I didn't touch anything remotely related.

Hrm, these look to be the remaining python3 tests that are failing for us.

A few nits inline. The only show-stopper is that sync-wikiversions won't work as implemented. Should this be feature-flagged to hide warnings pre-switchover?

scap/main.py
909

sync-wikiversions doesn't accept a file argument. self.include is probably the closest thing that is used for everything that inherits from AbstractSync and might be useful in the general case.

self.include is either None or a file path relative to /srv/mediawiki-staging used by rsync (that is, it has things like wikiversions.*). I'm not sure what opscache-free accepts in terms of file paths, so that may not be useful.

scap/utils.py
846

you could do:

try:
    ...
    r = requests.get(url, params=params)
    r.raise_for_status()
except requests.exceptions.RequestException as e:
    failed[host] = str(e)

And get rid of the r.status_code check. That would get you more detailed info about non-200 return codes, for instance:

>>> import requests
>>> try:
...   r = requests.get('https://www.sparkfun.com/404')
...   r.raise_for_status()
... except requests.exceptions.RequestException as e:
...   print(str(e))
...   pass
... 
404 Client Error: Not Found for url: https://www.sparkfun.com/404
>>> try:
...   r = requests.get('https://www.sparkfun.com/418')
...   r.raise_for_status()
... except requests.exceptions.RequestException as e:
...   print(str(e))
...   pass
... 
418 Client Error: I'm a teapot for url: https://www.sparkfun.com/418
847

can we use requests.exceptions.RequestException rather than a blanket Exception?

Joe added a comment.Jan 3 2019, 5:54 AM

Thanks for the comments @thcipriani, amending now. See my comments inline.

scap/main.py
909

Yeah sorry, I should've added a comment here - I wanted your feedback.

AFAIR, sync-wikiversions generates the wikiversions.php file, so I just need to invalidate 'wikiversions.php' here, right?

opcache-free accepts file paths relative to /srv/mediawiki

scap/utils.py
846

I don't particularly like raising an exception I catch 3 lines below just as a GOTO, and I don't think it adds a lot of information we need, but OK, the code is probably a little more terse this way.

847

No, we shouldn't. We definitely don't want scap to fail in the middle of invalidating the opcache because (say) a dns lookup fails temporarily.

We should catch the broadest range of exceptions here.

Joe updated this revision to Diff 2950.Jan 3 2019, 7:34 AM
Joe marked an inline comment as done.

Incorporated @thcipriani's comments

thcipriani accepted this revision.Jan 3 2019, 10:13 PM

Added a few inline comments that may require some changes, but code looks like it should work as is so feel free to land when you feel like it's ready.

aside: the arc errors you're seeing from your local tests seem to imply that your local python is python3. I work around this for arc by activating the virtualenv tox creates before doing arc [whatever], .e.g,:

tox -epy27
. .tox/py27/bin/activate
arc diff # or arc land <branch>
deactivate
scap/main.py
468

More a note for me: we'll need to ensure that this key is in /etc/scap.cfg before we release this code.

909

re:codepaths -- /srv/mediawiki should be the same as /srv/mediawiki-staging by this point for anything inheriting from AbstractSync.

We sync both wikiversions.json and wikiversions.php (hence self.include = wikiversions.*) although IIRC MediaWiki only cares about wikiversions.php, but I'm not sure about that.

scap/utils.py
847

I think that dns failure would still be a requests.exceptions.RequestException since all exceptions in requests library inherit from that class, but this was more a nitpick than anything: Feel free to ignore this suggestion.

This revision is now accepted and ready to land.Jan 3 2019, 10:13 PM
Joe added a comment.Jan 9 2019, 3:48 PM

This change is still not ready to ship sadly, I need to fix the logic as we should:

1 - only try to invalidate the opcache on servers that have php-fpm running
2 - do the invalidation per-cluster
3 - parallelize the invalidation phase.

Joe updated this revision to Diff 2954.Jan 15 2019, 12:39 PM
thcipriani accepted this revision.Jan 15 2019, 4:42 PM

Looks really cool!

Manually checked unittests — all seems to work.

Couple of inline comments.

In addition to the puppet config change for scap, we'll need to add gevent to the debian packaging stuff on the release branch.

scap/main.py
468

nitpick: file is a type in python2

scap/opcache_manager.py
2

should either be "the" or "ze" :)

3

After reading the docs here...this is neat :)

49

will this just print "(True, None)"?

Joe marked an inline comment as done.Jan 21 2019, 9:33 AM
Joe added inline comments.
scap/main.py
468

oh, sigh, sure. Amending here and everywhere else.

468

I just changed the code to make it safe to deploy without defining the admin port.

scap/opcache_manager.py
49

yup it's a leftover from development.

Joe updated this revision to Diff 2958.Jan 21 2019, 9:33 AM

I suppose we can go ahead and merge this version, and release it as soon as possible as it's important for the php7 migration.

Let's decide on when to do the deploy, I would say after all-hands is the right time.

thcipriani accepted this revision.Jan 24 2019, 2:51 PM
In D1135#22756, @Joe wrote:

I suppose we can go ahead and merge this version, and release it as soon as possible as it's important for the php7 migration.

I'll go ahead and land this diff, hopefully you have the honor of the last differential patch!

Let's decide on when to do the deploy, I would say after all-hands is the right time.

+1 for post-all-hands

This revision was automatically updated to reflect the committed changes.