
Wheels built on ores-misc-01 are incompatible with ores* and scb*
Closed, Resolved (Public)

Description

I'm having trouble testing our newest builds, because the python libs installed to /srv/deployment/ores/venv aren't rebuilt during the deployment process. In fact, they haven't been updated since July 7.

Our scap.cfg is: https://github.com/wikimedia/mediawiki-services-ores-deploy/blob/master/scap/scap.cfg

My first thought is that we need to copy cmd_worker.sh to cmd_cluster.sh, but this is just a random guess and I'd appreciate some confirmation from @thcipriani when possible.

Event Timeline

Change 386663 had a related patch set uploaded (by Awight; owner: Awight):
[mediawiki/services/ores/deploy@master] Add checks to cause virtualenv rebuild on the new ORES cluster.

https://gerrit.wikimedia.org/r/386663

Change 386663 merged by Ladsgroup:
[mediawiki/services/ores/deploy@master] Add checks to cause virtualenv rebuild on the new ORES cluster.

https://gerrit.wikimedia.org/r/386663

Now I'm seeing a mostly empty venv directory...

The target machine's logs don't say exactly what happens. All I see is:

Executing check 'cluster_checks'
config_deploy is not enabled in scap.cfg, skipping.
Restarting service 'uwsgi-ores'

Inside cluster_checks, we should be running cmd_worker.sh, and judging by the bad changes to venv/, this seems to be happening. What I'm looking for is the output of that command. Is there a way to increase logging verbosity?
This line in particular must be failing:

$venv/bin/pip install --use-wheel --no-deps $deploy_dir/submodules/wheels/*.whl

Running in a user virtualenv, I obtained the clue:

pip install --use-wheel --no-deps /srv/deployment/ores/deploy/submodules/wheels/*.whl

numpy-1.10.4-cp34-cp34m-manylinux1_x86_64.whl is not a supported wheel on this platform.

I've rebuilt the wheels again on ores-misc-01. The Python version and the machine architecture are the same; the kernel version differs, but most of the *.whl binaries are unchanged from master, numpy included. I'm able to install the wheels on ores-misc-01, but not on ores1002. pip install -v gives a stack trace, which shows that a wheel must share at least one tag with the running Python's pip.pep425tags.supported_tags.

On ores1002, in the deployed virtualenv,

source /srv/deployment/ores/venv/bin/activate
python
from pip import pep425tags
pep425tags.supported_tags
[('cp34', 'cp34m', 'linux_x86_64'),
('cp34', 'abi3', 'linux_x86_64'),
('cp34', 'none', 'linux_x86_64'),
('cp34', 'none', 'any'),
('cp3', 'none', 'any'),
('cp33', 'none', 'any'),
('cp32', 'none', 'any'),
('cp31', 'none', 'any'),
('cp30', 'none', 'any'),
('py34', 'none', 'any'),
('py3', 'none', 'any'),
('py33', 'none', 'any'),
('py32', 'none', 'any'),
('py31', 'none', 'any'),
('py30', 'none', 'any')]

On ores-misc-01, in the virtualenv where I'm rebuilding the wheels,

[('cp34', 'cp34m', 'manylinux1_x86_64'),
('cp34', 'cp34m', 'linux_x86_64'),
('cp34', 'abi3', 'manylinux1_x86_64'),
('cp34', 'abi3', 'linux_x86_64'),
('cp34', 'none', 'manylinux1_x86_64'),
('cp34', 'none', 'linux_x86_64'),
('cp33', 'abi3', 'manylinux1_x86_64'),
('cp33', 'abi3', 'linux_x86_64'),
('cp32', 'abi3', 'manylinux1_x86_64'),
('cp32', 'abi3', 'linux_x86_64'),
('py3', 'none', 'manylinux1_x86_64'),
('py3', 'none', 'linux_x86_64'),
('cp34', 'none', 'any'),
('cp3', 'none', 'any'),
('py34', 'none', 'any'),
('py3', 'none', 'any'),
('py33', 'none', 'any'),
('py32', 'none', 'any'),
('py31', 'none', 'any'),
('py30', 'none', 'any')]

Inspecting the numpy wheel,

from pip import wheel
w = wheel.Wheel("/srv/awight/wheels/numpy-1.10.4-cp34-cp34m-manylinux1_x86_64.whl")
w.file_tags
{('cp34', 'cp34m', 'manylinux1_x86_64')}

Yep, that's not supported on ores1002.
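The check that fails here can be reproduced without pip's internals. The following is a minimal sketch, not pip's actual implementation: the function names are made up for illustration, and the two tag lists are abbreviated copies of the ones pasted above. It expands the tags encoded in a wheel's filename and tests whether any of them appears in a host's supported tags.

```python
# Sketch only: mirrors the idea behind pip's wheel compatibility check,
# not its real code. Function names here are hypothetical.

def wheel_file_tags(filename):
    """Expand the tags in a wheel filename into (python, abi, platform) triples."""
    stem = filename.rsplit("/", 1)[-1]
    if not stem.endswith(".whl"):
        raise ValueError("not a wheel: %r" % filename)
    parts = stem[:-len(".whl")].split("-")
    # Layout is name-version[-build]-python-abi-platform, so the tags are
    # always the last three fields.
    if len(parts) < 5:
        raise ValueError("unexpected wheel filename: %r" % filename)
    pythons, abis, platforms = parts[-3:]
    # Each field may hold several dot-separated values (e.g. "py2.py3").
    return {
        (py, abi, plat)
        for py in pythons.split(".")
        for abi in abis.split(".")
        for plat in platforms.split(".")
    }


def is_supported(filename, supported_tags):
    """True if the wheel shares at least one tag with the supported tags."""
    return bool(wheel_file_tags(filename) & set(supported_tags))


# Abbreviated from the lists pasted above.
ORES1002_TAGS = [
    ("cp34", "cp34m", "linux_x86_64"),
    ("cp34", "none", "any"),
    ("py3", "none", "any"),
]
ORES_MISC_01_TAGS = [
    ("cp34", "cp34m", "manylinux1_x86_64"),
    ("cp34", "cp34m", "linux_x86_64"),
    ("py3", "none", "any"),
]

NUMPY_WHEEL = "numpy-1.10.4-cp34-cp34m-manylinux1_x86_64.whl"

if __name__ == "__main__":
    print("ores1002:", is_supported(NUMPY_WHEEL, ORES1002_TAGS))
    print("ores-misc-01:", is_supported(NUMPY_WHEEL, ORES_MISC_01_TAGS))
```

The numpy wheel carries only a manylinux1_x86_64 platform tag, so it matches on ores-misc-01 and has nothing in common with the ores1002 list.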

I tried pip install pip==1.5.6, then rebuilt wheels, but we're still getting the "manylinux" tag.

awight renamed this task from Scap doesn't rebuilt virtualenv directory when deploying to ores* targets to Wheels built on ores-misc-01 are incompatible with ores* and scb*. Oct 27 2017, 4:57 PM

Comparing environments:

Host          OS          Python  pip
scb*          Jessie 8.6  3.4.2   1.5.6
ores-misc-01  Jessie 8.7  3.4.2   1.5.6
ores*         Jessie 8.8  3.4.2   1.5.6

This is helpful information from https://github.com/pypa/manylinux:

Wheel packages compliant with [manylinux] tags ... can be installed with pip 8.1 and later.

That was it. So my plan is to include pip-9 in the wheels repo, and install that first, then continue with the rest of the wheels.
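A rough sketch of that two-step order, in Python for illustration. This is not the content of the actual deploy patch: the function names, the example paths, and the use of "python -m pip" instead of venv/bin/pip are all assumptions, and it presumes the pip wheel's filename starts with "pip-". The version check simply encodes the "pip 8.1 and later" note quoted above.

```python
import glob
import os

# Minimum pip version for manylinux wheels, per the pypa/manylinux note.
MANYLINUX1_MIN_PIP = (8, 1)


def supports_manylinux1(pip_version):
    """True if a plain numeric version such as '1.5.6' or '9.0.1' is >= 8.1.

    Assumes purely numeric major and minor fields (no 'b1' or 'rc' suffixes).
    """
    major, minor = (int(p) for p in (pip_version.split(".") + ["0"])[:2])
    return (major, minor) >= MANYLINUX1_MIN_PIP


def install_steps(venv_dir, wheels_dir):
    """Return pip commands in order: the pip wheel first, then all other wheels.

    Hypothetical helper: it only builds argument lists and runs nothing.
    """
    python = os.path.join(venv_dir, "bin", "python")
    wheels = sorted(glob.glob(os.path.join(wheels_dir, "*.whl")))
    pip_wheels = [w for w in wheels if os.path.basename(w).startswith("pip-")]
    other_wheels = [w for w in wheels if w not in pip_wheels]
    base = [python, "-m", "pip", "install", "--no-deps"]
    # Skip a step entirely if it has no wheels to install.
    return [base + group for group in (pip_wheels, other_wheels) if group]


if __name__ == "__main__":
    # Example paths only; adjust to the real deployment layout.
    for step in install_steps("/srv/deployment/ores/venv",
                              "/srv/deployment/ores/deploy/submodules/wheels"):
        print(" ".join(step))
```

Each returned list could be handed to subprocess.check_call in turn. The second command only benefits from the upgrade because it runs in a fresh process, after the new pip has been installed.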

Change 386898 had a related patch set uploaded (by Awight; owner: Awight):
[research/ores/wheels@master] Add pip 9 wheel, to let us install multilinux wheels.

https://gerrit.wikimedia.org/r/386898

Change 386898 merged by Halfak:
[research/ores/wheels@master] Add pip 9 wheel, to let us install multilinux wheels.

https://gerrit.wikimedia.org/r/386898

Change 386901 had a related patch set uploaded (by Awight; owner: Awight):
[mediawiki/services/ores/deploy@master] Install pip 9 to allow us to use multilinux wheels

https://gerrit.wikimedia.org/r/386901

Change 386901 merged by Halfak:
[mediawiki/services/ores/deploy@master] Install pip 9 to allow us to use multilinux wheels

https://gerrit.wikimedia.org/r/386901

Not quite complete: my pip hack left ores-beta in a broken state. The issue seems to be that upgrading pip doesn't correctly update the binary in venv/bin/pip, which became unusable because it was incompatible with the installed pip library.

Once this is solved, we need to test the upgrade from the revscoring 1.x stable deployment to the new code, keeping a close eye on how the virtualenv is rebuilt.
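One blunt way to avoid a stale venv/bin/pip is to discard the virtualenv and build a fresh one on every deploy. The sketch below illustrates that idea only; it is not the deploy repository's script. It assumes the standard library venv module (the real deployment may well use the virtualenv tool instead), and rebuild_venv is a made-up name.

```python
import os
import shutil
import venv


def rebuild_venv(venv_dir, with_pip=True):
    """Delete venv_dir if it exists, then create a fresh virtualenv there.

    Hypothetical helper. with_pip=True relies on ensurepip being available,
    which some distributions package separately.
    """
    if os.path.isdir(venv_dir):
        # Drop everything, including any bin/pip left by an earlier upgrade.
        shutil.rmtree(venv_dir)
    venv.EnvBuilder(with_pip=with_pip).create(venv_dir)
    return venv_dir


if __name__ == "__main__":
    import tempfile

    # Demonstrate against a throwaway directory, not a real deployment path.
    target = os.path.join(tempfile.mkdtemp(), "venv")
    rebuild_venv(target, with_pip=False)
    print(sorted(os.listdir(target)))
```

The trade-off is a slower deploy, since every wheel is reinstalled each time, in exchange for never carrying over a half-upgraded environment.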

Change 387613 had a related patch set uploaded (by Awight; owner: Awight):
[mediawiki/services/ores/deploy@master] Remove the virtualenv directory each time

https://gerrit.wikimedia.org/r/387613

Change 387619 had a related patch set uploaded (by Awight; owner: Awight):
[mediawiki/services/ores/deploy@master] Remove the virtualenv directory each time

https://gerrit.wikimedia.org/r/387619

Change 387622 had a related patch set uploaded (by Awight; owner: Awight):
[mediawiki/services/ores/deploy@STABLE_REVSCORING_1] Remove the virtualenv directory each time

https://gerrit.wikimedia.org/r/387622

Change 387613 abandoned by Awight:
Remove the virtualenv directory each time

https://gerrit.wikimedia.org/r/387613

Change 387619 merged by Ladsgroup:
[mediawiki/services/ores/deploy@master] Remove the virtualenv directory each time

https://gerrit.wikimedia.org/r/387619

Change 387622 merged by Ladsgroup:
[mediawiki/services/ores/deploy@STABLE_REVSCORING_1] Remove the virtualenv directory each time

https://gerrit.wikimedia.org/r/387622

Looks like this is working nicely. I'm moving it to the done column.