
Magul's quick tests doesn't run anymore
Open, High, Public

Description

See https://travis-ci.org/magul/pywikibot/builds

tl;dr: Test a specific Gerrit patchset on request, using several Python versions and several environments, in Wikimedia-CI

Event Timeline

Xqt created this task. Feb 1 2018, 12:48 PM
Restricted Application added subscribers: pywikibot-bugs-list, Aklapper. Feb 1 2018, 12:48 PM
Xqt renamed this task from Magul'ss quick tests doesn't run anymore to Magul's quick tests doesn't run anymore. Feb 1 2018, 12:51 PM
Dvorapa added a subscriber: Dvorapa. Feb 1 2018, 1:05 PM

Perhaps we could ask Release-Engineering-Team? We would sometimes like to run Travis and AppVeyor builds on patchsets. Until now it worked like this:

  1. @Magul was added as a reviewer to a patch
  2. Somehow (I am not sure how) Magul's GitHub fork of Pywikibot was updated and a pull request with the change was created
  3. Travis and AppVeyor builds were run on the patchset and, when done, links to the outputs were added in a comment from Magul's account

Perhaps we could achieve something similar with Jenkins:

  1. Write a keyword in a comment (like recheck, which is used to make Jenkins retest a patchset)
  2. Travis and AppVeyor builds are run from Jenkins
  3. The output is posted as a comment on the patchset
Dalba added a subscriber: Dalba. Edited Jun 6 2018, 2:55 AM

I think it would be rather easy to add live tests to tox.ini so that Jenkins will run them with py2.7 and py3.4. The problem is that live tests are not fully deterministic and fail from time to time for various reasons, which will result in false -1s on patchsets. Also, they take nearly 10-15 minutes to complete... Is that OK?

Dvorapa added a comment. Edited Jun 6 2018, 3:04 AM

It seems too long to me, especially for patches which do not change the code at all (doc patches...). I like that we reduced the Jenkins test time to 2 minutes lately. I would love the keyword option (test only on request), but Release-Engineering-Team seems to be busy and has not responded to my question.

Maybe we could also try to email @Magul, as in the readthedocs issue, where we finally succeeded this way?

We have a lot of flexibility with Jenkins: we could have it run the longer tests in a separate queue (so the developer gets feedback on the standard voting tests faster) and post the result back to the patch, but not give it a vote based on the results of the tests. Would that meet the current needs?

Dvorapa added a comment. Edited Jun 10 2018, 1:19 PM

I think it would do, yes! (see also T132138)

Xqt added a comment. Jun 10 2018, 7:21 PM

Would that meet the current needs?

+1

OK the one thing I missed...what is the command to run the slow/full test suite?

Dalba added a comment. Edited Aug 1 2018, 2:00 PM

OK the one thing I missed...what is the command to run the slow/full test suite?

Assuming that dependencies are installed and the user-config.py is generated (with a valid username and password_file), python -m unittest tests while the environment variable PYWIKIBOT2_TEST_WRITE=1 is set, should run all the tests that we need.
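
The invocation described above can be sketched in Python. This is only a minimal sketch, assuming it is run from the repository root with dependencies installed and user-config.py already generated; the helper names full_suite_invocation and run_full_suite are hypothetical, and only the command and the environment variable come from the comment above.

```python
import os
import subprocess
import sys


def full_suite_invocation(base_env=None):
    """Return (command, environment) for the slow/full test run."""
    env = dict(os.environ if base_env is None else base_env)
    # Set exactly as described in the comment above.
    env['PYWIKIBOT2_TEST_WRITE'] = '1'
    # Equivalent to "python -m unittest tests" with the current interpreter.
    command = [sys.executable, '-m', 'unittest', 'tests']
    return command, env


def run_full_suite():
    """Run the suite and return the exit status of the test runner."""
    command, env = full_suite_invocation()
    return subprocess.call(command, env=env)
```

A CI job could call run_full_suite() and pass its return value to sys.exit(), so that a failing test run fails the job.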

Dvorapa updated the task description. Feb 23 2019, 10:01 PM
Dvorapa updated the task description. Feb 23 2019, 10:30 PM
Dvorapa updated the task description. Mar 4 2019, 3:40 PM
hashar claimed this task. Mar 4 2019, 6:00 PM
hashar added a subscriber: hashar.

Hello, this got raised at the Release-Engineering-Team meeting last Monday, though I was busy with other random duties.
This week will be easier so I gotta review the request, but in short Wikimedia CI does not support Travis or AppVeyor at all.

For Travis, it is tightly tied to GitHub and we can't reproduce that with Gerrit and our CI (Jenkins/Zuul/Docker). That being said, most of the time it is very straightforward to rewrite the Travis logic as a Jenkins job.

AppVeyor: I don't think I have ever heard of it. But I guess it is third-party closed-source software and we would probably not be able to reproduce it.

Note: we used to have a Jenkins job that ran the pywikibot tests against the Beta-Cluster-Infrastructure (901f445d1058f55d8ceae23cff4cf7b58537a778, T100903) but it was broken/never completed and I have removed it (T188256). The job was doing:

export PYWIKIBOT2_DIR=$WORKSPACE
export TOX_TESTENV_PASSENV=PY_COLORS
export PY_COLORS=1
tox -e venv -- pip install --upgrade ndg-httpsclient
tox -e venv -- python -m generate_family_file 'http://en.wikipedia.beta.wmflabs.org/' 'wpbeta' 'y'
tox -e venv -- python -m generate_user_files -dir:$WORKSPACE -family:wpbeta -lang:en -v
echo "console_encoding='utf8'" >> $WORKSPACE/user-config.py
tox -e venv -- pip install -rdev-requirements.txt
tox -e venv -- python setup.py nosetests --tests tests --verbosity=2 -a '"family=wpbeta,code=en"'

We can look at moving that as a script directly into pywikibot/core.git, add a new testenv in tox (eg: integration-beta) and then simply have a Jenkins job that invokes that testenv. Then move forward and fix it :)

Dvorapa added a comment. Edited Mar 4 2019, 7:34 PM

Hello, this got raised at the Release-Engineering-Team meeting last Monday, though I was busy with other random duties.
This week will be easier so I gotta review the request, but in short Wikimedia CI does not support Travis or AppVeyor at all.

I understand. Some other devs already told me you were all busy last week and you are busy in general.

For Travis, it is tightly tied to GitHub and we can't reproduce that with Gerrit and our CI (Jenkins/Zuul/Docker). That being said, most of the time it is very straightforward to rewrite the Travis logic as a Jenkins job.

The basic reason we run Travis and Appveyor is to run unittests, nosetests and some others on multiple setups, like 32bit, 64bit, Win, Linux, Mac, Python 2.7, 3.4, 3.5, 3.6, 3.7 and 3.8. I mean various combinations of those and various presets (pwb env variables on/off, several Wikimedia wikis). The logic of both services is really close to one another, so it should be ok to merge the main points in Wikimedia CI. Also, in my opinion there is no need to test py2.7, 3.4 and perhaps also 32bit anymore (as these are slowly reaching the end of their life).

AppVeyor: I don't think I have ever heard of it. But I guess it is third-party closed-source software and we would probably not be able to reproduce it.

The config and the logs could be examined to get the idea of what these services are doing:
https://phabricator.wikimedia.org/diffusion/PWBC/browse/master/.travis.yml
https://travis-ci.org/wikimedia/pywikibot/builds

https://phabricator.wikimedia.org/diffusion/PWBC/browse/master/.appveyor.yml
https://ci.appveyor.com/project/ladsgroup/pywikibot-g4xqx

Note: we used to have a Jenkins job that ran the pywikibot tests against the Beta-Cluster-Infrastructure (901f445d1058f55d8ceae23cff4cf7b58537a778, T100903) but it was broken/never completed and I have removed it (T188256). The job was doing:

export PYWIKIBOT2_DIR=$WORKSPACE
export TOX_TESTENV_PASSENV=PY_COLORS
export PY_COLORS=1
tox -e venv -- pip install --upgrade ndg-httpsclient
tox -e venv -- python -m generate_family_file 'http://en.wikipedia.beta.wmflabs.org/' 'wpbeta' 'y'
tox -e venv -- python -m generate_user_files -dir:$WORKSPACE -family:wpbeta -lang:en -v
echo "console_encoding='utf8'" >> $WORKSPACE/user-config.py
tox -e venv -- pip install -rdev-requirements.txt
tox -e venv -- python setup.py nosetests --tests tests --verbosity=2 -a '"family=wpbeta,code=en"'

We can look at moving that as a script directly into pywikibot/core.git, add a new testenv in tox (eg: integration-beta) and then simply have a Jenkins job that invokes that testenv. Then move forward and fix it :)

I see, the point was to test every new MediaWiki version coming up. We currently run some beta-cluster tests on Travis and Appveyor, so it should be easy to mimic it. It should be ok to add a script or a plaintext list of commands into the repo, add a new tox environment and run it. I can prepare a draft of what it could look like.

The main issue (as you can see by the priority) is this one, formerly known as Magul's quick Travis test. Every Gerrit patchset is tested using Jenkins by unittest, nosetest, doctest and some other similar stuff, but all on one platform with one fixed preset. The Travis/Appveyor tests are quite heavy and time-consuming (they run in several environments for comparison), so there is no need to run them on every patchset. But back then, if we added Magul as a reviewer, he or his tool ran the unittests on Travis on demand. This is what we currently miss the most, I think, as the postmerge Travis and Appveyor runs are still working for us (although we have no access to the Wikimedia Appveyor account, and perhaps neither has Wikimedia), so there is no need to hurry with moving completely to Wikimedia CI. We also run the beta-cluster tests there, so that is not a priority either, I think.

The easy solution for the top priority Magul's test replacement would be something like this: Someone enters a magic keyword in the Gerrit patchset comment -> Wikimedia CI registers it and creates a pull request to GitHub with the code (perhaps using GH API, also closes it immediately as it will not be needed on GitHub) -> and GitHub-Travis-Appveyor handles the rest of the work.

The harder solution would be to mimic Travis/Appveyor in Wikimedia CI as you said, then when someone enters a magic keyword in the Gerrit patchset comment or someone gives the patchset +2 -> Wikimedia CI registers it and runs tox on a predefined set of environments including several Python versions, three major OS platforms and several presets.
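
Both variants start from the same step: recognizing a magic keyword in a patchset comment. A minimal sketch of that step, assuming a hypothetical keyword "check all" (no keyword has been chosen in this task) and a hypothetical helper name; how Wikimedia CI would actually register the comment is not covered here:

```python
import re

# Hypothetical keyword; the discussion only says "a magic keyword".
# The keyword has to stand alone on its own line of the comment.
TRIGGER = re.compile(r'^\s*check all\s*$', re.IGNORECASE | re.MULTILINE)


def requests_full_tests(comment):
    """Return True if a patchset comment asks for the heavy test matrix."""
    return TRIGGER.search(comment) is not None
```

Requiring the keyword on a line of its own keeps ordinary review comments that merely mention it from starting the heavy tests.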

Dvorapa added a comment. Edited Mar 4 2019, 8:07 PM

Broken:

When | What | Where | status
on demand for one patchset | nose and unittest on various wikis, py versions, environments, platforms and architectures | originally Magul or his tool activating Travis CI | down, high priority

Bypassed:

When | What | Where | status
before new MediaWiki version comes out | nose and unittest of current master on beta-cluster | Wikimedia CI | wish, low priority
postmerge | nose and unittest on beta-cluster | Travis, Appveyor | ok
postmerge | nose and unittest on various wikis, py versions, environments, platforms and architectures | Wikimedia CI instead of Travis/Appveyor | wish, lowest priority
postmerge | nose and unittest on various wikis, py versions, environments, platforms and architectures | Travis, Appveyor | ok with some smaller issues

@Dvorapa thank you so much for your replies. I guess I have all the context to jump in and analyze a proper solution :-]

The easy solution for the top priority Magul's test replacement would be something like this: Someone enters a magic keyword in the Gerrit patchset comment -> Wikimedia CI registers it and creates a pull request to GitHub with the code (perhaps using GH API, also closes it immediately as it will not be needed on GitHub) -> and GitHub-Travis-Appveyor handles the rest of the work.

Our situation with GitHub is already rather a mess. Others previously suggested a Gerrit -> GitHub -> Travis -> 3rd party tools chain; I vetoed it as not robust enough. There are too many stacks and it is asking for trouble. That being said, people do take advantage of repositories being synced to GitHub to activate jobs that run once a change is merged. That is fine since it does not touch Gerrit / Wikimedia CI and thus Release-Engineering-Team is not involved :-]

The harder solution would be to mimic Travis/Appveyor in Wikimedia CI as you said, then when someone enters a magic keyword in the Gerrit patchset comment or someone gives the patchset +2 -> Wikimedia CI registers it and runs tox on a predefined set of environments including several Python versions, three major OS platforms and several presets.

The Wikimedia CI uses a Docker container that runs tox. It recently got support for python 2.7, 3.4, 3.5, 3.6 and 3.7 T191764. We should be able to add them right now in tox.ini and add CI jobs for them.


32bits, yes I agree we can drop it. I don't think pywikibot would have many issues with it anyway.

That leaves us with the .travis.yml. Seems all of that can be converted to testenv in tox.ini, possibly by invoking different testing scripts. It does not seem too complicated :]

The Wikimedia CI uses a Docker container that runs tox. It recently got support for python 2.7, 3.4, 3.5, 3.6 and 3.7 T191764. We should be able to add them right now in tox.ini and add CI jobs for them.

Amazing! I didn't know about this. That makes things much easier.

Dvorapa updated the task description. Mar 5 2019, 9:44 AM
Dvorapa updated the task description.
Dvorapa added a comment. Edited Mar 5 2019, 9:54 AM

Other tests we are running (just for the context):

When | What | Where | status
on every patchset and on +2 | style, nose, nose34 and doctest | Jenkins CI | ok
postmerge | doctest and doc generation | Jenkins CI | ok
postmerge | coverage and redundancy test | Codecov and Codeclimate | ok
on every patchset for novices | styletest | Jenkins CI | ok
on every patchset on i18n repo | npm and grunttest | Jenkins CI | ok

From the .travis.yml file. Some global environment variables:

env:
  global:
    - TEST_TIMEOUT=300
    - PYWIKIBOT_NO_L10N_TESTS=1

Then there are two sets of globals:

env:
  matrix:
    - LANGUAGE=en FAMILY=wikipedia PYWIKIBOT_TEST_PROD_ONLY=1
    - LANGUAGE=zh FAMILY=wikisource PYSETUP_TEST_EXTRAS=1 PYWIKIBOT_TEST_PROD_ONLY=1 PYWIKIBOT_TEST_NO_RC=1

Then there are 13 entries in the matrix, one of which is allowed to fail (python 3.8-dev). The huge combination of 26 axes ends up being:

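python | FAMILY | LANGUAGE | OAUTH_DOMAIN | PYSETUP_TEST_EXTRAS | NO_L10N_TESTS | SITE_ONLY | TEST_NO_RC | PROD_ONLY | TEST_TIMEOUT
2.7_with_system_site_packages | wikipedia | nb | None | 1 | 1 | None | 1 | 1 | 300
2.7 | wpbeta | en | "en.wikipedia.beta.wmflabs.org" | None | 1 | 1 | None | 1 | 300
3.6 | wpbeta | zh | "zh.wikipedia.beta.wmflabs.org" | None | 1 | 1 | None | 1 | 300
3.4 | wsbeta | en | None | None | 0 | 1 | None | 1 | 300
2.7 | wikia | wikia | None | None | 1 | None | 1 | 1 | 300
3.5 | musicbrainz | en | None | None | 1 | 1 | None | 1 | 300
3.4 | wikipedia | test | "test.wikipedia.org" | None | 1 | 1 | None | 1 | 300
3.4 | wikidata | test | None | None | 1 | 1 | None | 1 | 300
3.4 | wiktionary | ar | None | None | 1 | None | 1 | 1 | 300
3.6 | wikidata | wikidata | None | None | 1 | 1 | None | 1 | 300
3.7 | wikipedia | de | None | None | 1 | None | None | 1 | 300
3.8-dev | wikipedia | test | None | None | 1 | 1 | None | 1 | 300
3.8-dev (failing) | wikipedia | test | None | None | 1 | 1 | None | 1 | 300
2.7_with_system_site_packages | wikipedia | nb | None | 1 | 1 | None | 1 | 1 | 300
2.7 | wpbeta | en | "en.wikipedia.beta.wmflabs.org" | 1 | 1 | 1 | 1 | 1 | 300
3.6 | wpbeta | zh | "zh.wikipedia.beta.wmflabs.org" | 1 | 1 | 1 | 1 | 1 | 300
3.4 | wsbeta | en | None | 1 | 0 | 1 | 1 | 1 | 300
2.7 | wikia | wikia | None | 1 | 1 | None | 1 | 1 | 300
3.5 | musicbrainz | en | None | 1 | 1 | 1 | 1 | 1 | 300
3.4 | wikipedia | test | "test.wikipedia.org" | 1 | 1 | 1 | 1 | 1 | 300
3.4 | wikidata | test | None | 1 | 1 | 1 | 1 | 1 | 300
3.4 | wiktionary | ar | None | 1 | 1 | None | 1 | 1 | 300
3.6 | wikidata | wikidata | None | 1 | 1 | 1 | 1 | 1 | 300
3.7 | wikipedia | de | None | 1 | 1 | None | 1 | 1 | 300
3.8-dev | wikipedia | test | None | 1 | 1 | 1 | 1 | 1 | 300
3.8-dev (failing) | wikipedia | test | None | 1 | 1 | 1 | 1 | 1 | 300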

Which I have generated with the very lame script at F28367661.

hashar added a comment. Edited Mar 11 2019, 1:48 PM

A few things that are worth investigating and could probably be simplified.

sudo

The default is sudo: false, but a few environments have sudo: required when they have dist: xenial:

  include:
    - python: '2.7_with_system_site_packages'
      env: LANGUAGE=nb FAMILY=wikipedia PYSETUP_TEST_EXTRAS=1 PYWIKIBOT_TEST_NO_RC=1
      dist: xenial
      sudo: required
      addons:
        apt:
          packages:
            - djvulibre-bin
            - graphviz
            - liblua5.1-0-dev
            - python-ipaddr
    - python: '3.7'
      env: LANGUAGE=de FAMILY=wikipedia
      dist: xenial
      sudo: required
    - python: '3.8-dev'
      env: LANGUAGE=test FAMILY=wikipedia PYWIKIBOT_SITE_ONLY=1
      dist: xenial
      sudo: required
  allow_failures:
    - python: '3.8-dev'
      env: LANGUAGE=test FAMILY=wikipedia PYWIKIBOT_SITE_ONLY=1
      dist: xenial
      sudo: required

One sure thing: we can't right now support the env '2.7_with_system_site_packages', since the CI job does not have sudo access to install extra packages. For the others, I am not sure why sudo is required on xenial. Maybe to get python 3.7 or 3.8-dev installed?

global env

env:
  global:
    - TEST_TIMEOUT=300
    - PYWIKIBOT_NO_L10N_TESTS=1

  matrix:
    - LANGUAGE=en FAMILY=wikipedia PYWIKIBOT_TEST_PROD_ONLY=1
    - LANGUAGE=zh FAMILY=wikisource PYSETUP_TEST_EXTRAS=1 PYWIKIBOT_TEST_PROD_ONLY=1 PYWIKIBOT_TEST_NO_RC=1

The global env matrix sets LANGUAGE and FAMILY but they are always set in the matrix below. I would drop them from the global and have the script: part fail immediately when either is not set.
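
Failing immediately could look roughly like the following. This is only a sketch of the idea: the helper name require_env is hypothetical, and an empty value is treated as "not set", which is an assumption.

```python
import os


def require_env(names, environ=None):
    """Return the required variables as a dict, or exit with an error."""
    environ = os.environ if environ is None else environ
    # Assumption: an empty value counts as missing.
    missing = [name for name in names if not environ.get(name)]
    if missing:
        # SystemExit with a message prints it and gives a non-zero status.
        raise SystemExit('Missing environment variable(s): '
                         + ', '.join(missing))
    return {name: environ[name] for name in names}
```

Calling require_env(['LANGUAGE', 'FAMILY']) at the very start of the test script would stop the job before any test is run.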

PYWIKIBOT_TEST_PROD_ONLY=1 is always set, could be moved up in env.global. Note the environment variable is only listed in tests/aspects.py:

for data in cls.sites.values():
    if ('code' in data and data['code'] in ('test', 'mediawiki')
            and 'PYWIKIBOT_TEST_PROD_ONLY' in os.environ and not dry):
        raise unittest.SkipTest(
            'Site code "{}" and PYWIKIBOT_TEST_PROD_ONLY is set.'
            .format(data['code']))

Which seems to indicate tests are skipped when they are marked with code=test or code=mediawiki. Seems to me the variable can be set when LANGUAGE is set to test or mediawiki instead of always setting it.
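
The condition in that snippet, pulled out into a standalone function to make the behaviour easier to see. This is a sketch rather than the code in the repository: the function name is hypothetical and the site data is reduced to its code value.

```python
import os

# Site codes that the quoted snippet treats as non-production.
NON_PROD_CODES = ('test', 'mediawiki')


def skipped_by_prod_only(code, environ=None, dry=False):
    """Return True if a test for this site code would be skipped."""
    environ = os.environ if environ is None else environ
    # Like the original, this checks only that the variable is present,
    # not what value it has.
    return (code in NON_PROD_CODES
            and 'PYWIKIBOT_TEST_PROD_ONLY' in environ
            and not dry)
```

Since only the presence of the variable is checked, setting PYWIKIBOT_TEST_PROD_ONLY=0 would still skip these tests.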

The second global matrix sets PYWIKIBOT_TEST_NO_RC=1, which seems to be set because the test suite does not accommodate some customization made on sites.

PYSETUP_TEST_EXTRAS seems to be a huge hack that affects the build a lot, with a few hacks all over the place.

script

The super long yaml script is not convenient. It would be nicer to have all the logic extracted to a standalone file and just invoke that. Eg:

script:
  - citestrunner.sh

OAUTH_DOMAIN

It is set from the matrix, seems to me it is easier to set it in the script itself.

Dvorapa added a subscriber: zhuyifei1999. Edited Mar 11 2019, 3:28 PM

sudo

sudo and xenial are required by Travis to make py3.7 and py3.8 work, you are right. On WM CI any setup that can run py3.5-3.8 could be used for the job, sudo and xenial are not required.

global env

I think the 2.7_with_system_site_packages env does not install anything using pip and uses what's installed on the system instead, and that is how it differs from the others (but I'm not sure, pinging @Xqt, @Dalba, @zhuyifei1999). Anyway this test is needed, perhaps you haven't understood its purpose correctly?

TEST_PROD_ONLY is not used globally, see e.g. the last build https://travis-ci.org/wikimedia/pywikibot, it is used only on some of the tests. You are right, this is for tests that can be run only on production wikis (not mw or test). I would leave all the env vars as they are and leave the topic of deprecating some env vars to other task(s).

TEST_NO_RC is usually used for sites whose RC is usually empty. This is also needed.

The global env matrix sets LANGUAGE and FAMILY

I don't feel you got the .travis.yml right. It creates a matrix of enwp and zhws with python versions and then 12 extra tests.
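
To make the counting explicit, here is a small sketch of how the job list is put together. The Python versions and the 12 extra entries are copied from the 20-row table in this comment, the two env lines from the env.matrix quoted earlier; the function names are only for illustration.

```python
from itertools import product

# Top-level Python versions (first eight rows of the 20-row table).
PYTHON_VERSIONS = ['2.7', '3.4', '3.5', '3.6']

# The two env.matrix entries from .travis.yml.
ENV_MATRIX = [
    'LANGUAGE=en FAMILY=wikipedia PYWIKIBOT_TEST_PROD_ONLY=1',
    'LANGUAGE=zh FAMILY=wikisource PYSETUP_TEST_EXTRAS=1 '
    'PYWIKIBOT_TEST_PROD_ONLY=1 PYWIKIBOT_TEST_NO_RC=1',
]

# (python, FAMILY, LANGUAGE) of the 12 extra entries.
INCLUDES = [
    ('2.7_with_system_site_packages', 'wikipedia', 'nb'),
    ('2.7', 'wpbeta', 'en'),
    ('3.6', 'wpbeta', 'zh'),
    ('3.4', 'wsbeta', 'en'),
    ('2.7', 'wikia', 'wikia'),
    ('3.5', 'musicbrainz', 'en'),
    ('3.4', 'wikipedia', 'test'),
    ('3.4', 'wikidata', 'test'),
    ('3.4', 'wiktionary', 'ar'),
    ('3.6', 'wikidata', 'wikidata'),
    ('3.7', 'wikipedia', 'de'),
    ('3.8-dev', 'wikipedia', 'test'),
]


def parse_env(line):
    """Turn 'A=1 B=2' into {'A': '1', 'B': '2'}."""
    return dict(item.split('=', 1) for item in line.split())


def expand_jobs():
    """Return (python, FAMILY, LANGUAGE) for every job of a build."""
    jobs = []
    # Every top-level Python version is combined with every env line...
    for python, env_line in product(PYTHON_VERSIONS, ENV_MATRIX):
        env = parse_env(env_line)
        jobs.append((python, env['FAMILY'], env['LANGUAGE']))
    # ...and the extra entries are appended once each, not multiplied.
    jobs.extend(INCLUDES)
    return jobs
```

That is 4 x 2 = 8 jobs from the matrix plus 12 extra entries, 20 in total.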

The huge combination of 26 axis

Ummm, the number differs from the tests that are actually done. See the last build: https://travis-ci.org/wikimedia/pywikibot runs only 20 tests, not 26. What's the difference between your table and the Travis table?

script
The super long yaml script is not convenient. It would be nicer to have all the logic extracted to a standalone file and just invoke that.

Yeah, we discussed that earlier, I agree.

OAUTH_DOMAIN
It is set from the matrix, seems to me it is easier to set it in the script itself.

I'm not sure I understand what you mean by this.

Here is the corrected table of the 20 tests that are actually run:

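python | FAMILY | LANGUAGE | OAUTH_DOMAIN | PYSETUP_TEST_EXTRAS | NO_L10N_TESTS | SITE_ONLY | TEST_NO_RC | PROD_ONLY | TEST_TIMEOUT
2.7 | wikipedia | en | None | None | 1 | None | None | 1 | 300
3.4 | wikipedia | en | None | None | 1 | None | None | 1 | 300
3.5 | wikipedia | en | None | None | 1 | None | None | 1 | 300
3.6 | wikipedia | en | None | None | 1 | None | None | 1 | 300
2.7 | wikisource | zh | None | 1 | 1 | None | 1 | 1 | 300
3.4 | wikisource | zh | None | 1 | 1 | None | 1 | 1 | 300
3.5 | wikisource | zh | None | 1 | 1 | None | 1 | 1 | 300
3.6 | wikisource | zh | None | 1 | 1 | None | 1 | 1 | 300
2.7_with_system_site_packages | wikipedia | nb | None | 1 | 1 | None | 1 | None | 300
2.7 | wpbeta | en | "en.wikipedia.beta.wmflabs.org" | None | 1 | 1 | None | None | 300
3.6 | wpbeta | zh | "zh.wikipedia.beta.wmflabs.org" | None | 1 | 1 | None | None | 300
3.4 | wsbeta | en | None | None | 0 | 1 | None | None | 300
2.7 | wikia | wikia | None | None | 1 | None | 1 | None | 300
3.5 | musicbrainz | en | None | None | 1 | 1 | None | None | 300
3.4 | wikipedia | test | "test.wikipedia.org" | None | 1 | 1 | None | None | 300
3.4 | wikidata | test | None | None | 1 | 1 | None | None | 300
3.4 | wiktionary | ar | None | None | 1 | None | 1 | None | 300
3.6 | wikidata | wikidata | None | None | 1 | 1 | None | None | 300
3.7 | wikipedia | de | None | None | 1 | None | None | None | 300
3.8-dev (allowed failure) | wikipedia | test | None | None | 1 | 1 | None | None | 300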
Dvorapa added a comment. Edited Mar 11 2019, 4:02 PM

Let's describe it better:

  • first there is a matrix of supported py versions and two wikis with different settings
  • then no pip package installed - only system site packages
  • then a matrix of multiple wikis in the lowest supported py3 (3.4) version (enwpbeta, zhwpbeta, enwsbeta, wikia, musicbrainz, testwiki, testwd, arwikt, wd), sometimes with a different py version or a different environment to make it interesting :D + three oauth tests
  • then py3.7 and failing 3.8-dev

Note: py3.7 may be added into the original matrix instead of having a special entry

hashar removed hashar as the assignee of this task. May 10 2019, 4:29 PM

Despite a first try at looking at pywikibot/core and attempting to run the test suite / get familiar with the code, I eventually dropped/forgot about this. I had too many tasks to handle on the CI infrastructure front :-(

Dvorapa added a subscriber: greg. Edited May 14 2019, 11:03 PM

This will be a challenge for the Hackathon then. I heard that @greg will attend, so perhaps I will try to work this out with his help?

Change 513700 had a related patch set uploaded (by Dvorapa; owner: Dvorapa):
[pywikibot/core@master] [TEST] Test Pywikibot in Wikimedia CI (1/2)

https://gerrit.wikimedia.org/r/513700

Change 510927 had a related patch set uploaded (by Dvorapa; owner: Dvorapa):
[integration/config@master] [WIP] Test Pywikibot in Wikimedia CI (2/2)

https://gerrit.wikimedia.org/r/510927

Huji moved this task from Backlog to Needs Review on the Pywikibot board. Sep 10 2019, 11:30 AM
Dvorapa moved this task from Backlog to Framework on the Pywikibot-tests board. Apr 3 2020, 4:29 PM

Mentioned in SAL (#wikimedia-cloud) [2020-05-10T08:58:46Z] <Dvorapa> requested Pywikibot-gerritbot account on Toolforge - T186208