Setup trending service CI
Closed, DeclinedPublic

Description

To be able to test a critical piece of the service - subscribing to Kafka and receiving messages from it - we need to have a running Kafka instance in the CI.

Event Timeline

Given the importance of the service being able to properly follow events from Kafka, it is essential to be able to test this as part of regular testing, so +1k from me.

I agree that testing this service properly is a requirement. I cc'ed Release-Engineering-Team to give them a chance to weigh in on whether they could provide a solution for testing with Kafka in Jenkins. We know that this will work in Travis, so unless Release-Engineering-Team provide a different solution, it seems that moving to github / travis will be the most straightforward solution.

It's unlikely we would be able to support Kafka on a short timeline. Use Github/Travis to your heart's content, no objections here :)

Our current jenkins infrastructure is not able to install kafka

Well you partly can. Marko wrote a puppet define service::packages which is meant to declare the packages needed on production and at the same time define development packages that can then be installed on the CI machines.

Graphoid as an example https://phabricator.wikimedia.org/diffusion/OPUP/browse/production/modules/graphoid/manifests/packages.pp

class graphoid::packages {

    service::packages { 'graphoid':
        pkgs     => ['libcairo2', 'libgif4', 'libjpeg62-turbo', 'libpango1.0-0'],
        dev_pkgs => ['libcairo2-dev', 'libgif-dev', 'libpango1.0-dev',
                     'libjpeg62-turbo-dev'],
    }

}

Our CI images are refreshed once per day (at 14:14 UTC) using puppet.git as a reference along with our own set of manifests. So we can then just include trendingedits::package and the CI images would have whatever packages it declares. https://phabricator.wikimedia.org/diffusion/CICF/browse/master/dib/puppet/ciimage.pp;589390283c41dd7e31212a0d89d3ad3bdb709c56$132-135

But I guess having Kafka setup properly is a bit more challenging, though since you already did it for Travis, it should be straightforward to set it up for our CI images.

But I guess having Kafka setup properly is a bit more challenging, though since you already did it for Travis, it should be straightforward to set it up for our CI images.

Basically, we don't require a lot of configuration for running tests so it's a matter of installing it and starting it up before. In travis installing is done via wget and for starting it up we have a simple script.
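As a rough illustration of the "install via wget" step, here is a minimal sketch. It is not the project's actual Travis configuration: the function names, the Scala build suffix (2.11), and the use of the Apache archive mirror are assumptions, and the Kafka version defaults to the 0.9.0.1 mentioned later in this thread.

```shell
# Hypothetical helpers; names and defaults are assumptions, not the real Travis setup.
KAFKA_VERSION="${KAFKA_VERSION:-0.9.0.1}"   # version named later in this thread
SCALA_VERSION="${SCALA_VERSION:-2.11}"      # assumed Scala build of the tarball

# Print the download URL for the chosen Kafka release (Apache archive mirror assumed).
kafka_tarball_url() {
    echo "https://archive.apache.org/dist/kafka/${KAFKA_VERSION}/kafka_${SCALA_VERSION}-${KAFKA_VERSION}.tgz"
}

# Download and unpack Kafka into the directory given as $1.
install_kafka() {
    local dest="$1"
    mkdir -p "$dest" &&
    wget -q "$(kafka_tarball_url)" -O "$dest/kafka.tgz" &&
    tar -xzf "$dest/kafka.tgz" -C "$dest" --strip-components=1
}
```

Calling `install_kafka "$HOME/kafka"` would leave the usual `bin/` and `config/` directories under that path, which a start script can then use.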

I just want to say here: let's not make any hasty decisions on the day before turkey-day in the US :) I'd like to see what would be involved with supporting this on our infrastructure first.

Can we just mirror this or do we need to completely migrate?

This has been brought up during the releng meeting. What needs to be done:

In puppet create a trendingedits::package puppet class that uses service::packages to define dependencies of the service for production and development purposes (eg CI).

We can get the Kafka package from our jessie-wikimedia installed on the CI images, that is quite trivial. A bit harder would be to pass delete.topic.enable=true and use the start_kafka.sh script, but we can surely hack together a dedicated Jenkins job that runs exactly the same sequence of commands.

I am all open to pair on this. With PST the best overlap for me is noon-2pm PST or 9pm-11pm CET.

A bit harder would be to pass delete.topic.enable=true

This part is not needed in CI, it's only needed for local testing when the tests are run repeatedly to clean up kafka from events remaining from the previous test run.
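For the local-testing case, one way to switch the setting on is to write it into the broker's properties file before starting Kafka. The sketch below is only an illustration; the function name is hypothetical and the path to `server.properties` depends on how Kafka was installed.

```shell
# Hypothetical helper: make sure delete.topic.enable=true is set in the
# broker properties file passed as $1 (e.g. config/server.properties).
enable_topic_deletion() {
    local props="$1" tmp
    tmp="$(mktemp)"
    # Drop any existing setting, then append the desired one.
    grep -v '^delete\.topic\.enable=' "$props" > "$tmp" || true
    echo 'delete.topic.enable=true' >> "$tmp"
    mv "$tmp" "$props"
}
```

The helper is idempotent, so running it before every local test run leaves a single `delete.topic.enable=true` line in the file.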

Neat! So if we have Kafka + Zookeeper packages installed in the CI images, probably the Jenkins job will boil down to something like:

git clone && git checkout
npm install
test/utils/start_kafka.sh
npm run coverage

So probably straightforward.
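The sequence above could be wrapped into a job script roughly like the following. This is a sketch under assumptions: it presumes start_kafka.sh may return before the broker accepts connections, that ZooKeeper and Kafka listen on their default ports (2181 and 9092), and that `nc` is available on the image. The function names are hypothetical.

```shell
# Probe a TCP port once; relies on netcat being installed (assumption).
port_open() {
    nc -z "$1" "$2" >/dev/null 2>&1
}

# Poll host $1, port $2 up to $3 times (default 30), sleeping between attempts.
wait_for_port() {
    local host="$1" port="$2" tries="${3:-30}" i=0
    while [ "$i" -lt "$tries" ]; do
        if port_open "$host" "$port"; then
            return 0
        fi
        sleep "${WAIT_INTERVAL:-1}"
        i=$((i + 1))
    done
    return 1
}

# Hypothetical job body, run from the root of the checked-out repository.
run_job() {
    npm install &&
    test/utils/start_kafka.sh &&
    wait_for_port localhost 2181 &&   # ZooKeeper default port
    wait_for_port localhost 9092 &&   # Kafka default port
    npm run coverage
}
```

Chaining with `&&` makes the job stop at the first failing step, so a broker that never comes up fails the build instead of producing confusing test errors.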

Puppet-wise, it looks like we need to transcribe the list of deps from the package.json into a service::packages:

"dependencies": {
  "debian": [
    {
      "repo_url": "https://apt.wikimedia.org/wikimedia",
      "release": "jessie-wikimedia",
      "pool": "main",
      "packages": [
        "librdkafka-dev"
      ]
    },
    "libsasl2-dev"
  ]
}

And maybe add kafka/zookeeper to it.

Should we re-title/focus this task or decline this one and make a new task to track making this work? :)

Change 325330 had a related patch set uploaded (by Hashar):
Add trendingedits packages to CI image

https://gerrit.wikimedia.org/r/325330

Too late for me to refresh the CI image that adds trendingedits::packages https://gerrit.wikimedia.org/r/325330

To take those patches into account, they have to be merged, then we have to update the Jessie snapshot ( https://wikitech.wikimedia.org/wiki/Nodepool#Manually_generate_a_new_snapshot ) and newly spawned instances will get the packages installed.

Change 325330 merged by jenkins-bot:
Add trendingedits packages to CI image

https://gerrit.wikimedia.org/r/325330

If I am the only one that thinks this way feel free to ignore but I do not really buy the notion of turning travis into a testing environment for all services. I think we are trying to fix the lack of proper staging environments with scripts on travis and that is not likely to be solid enough for testing purposes given that Travis is a CI system, not a virtual cloud with the ability to spawn and clean up environments.

We use puppet in production to make things play together well. I think trying to bring the install of kafka and consumers into travis will create work when it comes to managing proper versions/patches, and all that work will be happening outside of the existing puppet software specifications.

I have refreshed the CI image. Turns out that the librdkafka-dev package was already installed since it is required by kafka-confluence ( cf2cbd82f51538cf812564e01c680fe9dc0dc3e7 ). So at least that part is covered and the CI image should be able to support some baby steps toward running integration tests with Kafka.


Now I am confused by the integration tests you would want to work on. Is change-propagation at the heart of the system? Do we care about the version of Kafka? How do other services work on top of that and what kind of combination do we want to test?

If we want to test all the services together with our kafka flavor, it is surely doable with the Wikimedia Jenkins. We could even create a new repo holding all the scripts used to set up an integration environment and run all the tests from all the repositories. For example we could have tests for:

  • a change to change-propagation which runs tests of trendingedits master branch and trendingedits/deploy (if you don't want to break prod).
  • operations/debs/kafka changes running the integration tests of all the services

Or whatever else our imagination can end up with.


Petr mentioned a boot script start_kafka.sh which is part of the change-propagation service. Looks like it solves exactly the feature requested for trending-edits. So maybe we can look at running change-propagation on Jenkins first and trending-edits will then be straightforward? I am proposing to create a subtask to set up change-propagation.

greg renamed this task from Move primary trending service development to github to Setup trending service CI. Dec 5 2016, 9:29 PM
greg triaged this task as Medium priority.
greg updated the task description.

Is change-propagation at the heart of the system?

Trending service is completely independent of ChangeProp, I'm using CP as an example of the other service dependent on kafka and as a reference for scripts we will need in the trending service too.

Now I am confused by the integration tests you would want to work on.

We don't have any integration tests written for trending service because we don't have CI to run them. I can add a simple integration test tomorrow to be able to check that CI is working for it.

Do we care about the version of Kafka?

Yes, we need 0.9.0.1 - the same version we use in production.

We could even create a new repo holding all the scripts used to set up an integration environment and run all the tests from all the repositories.

The start_kafka.sh will probably be the only one that needs to be shared. There's also a clean_kafka.sh, but it's a bit service-dependent as it contains a full list of topics used by the service. However, the functions from it could be moved to a separate script and sourced in a smaller per-service script. Overall I think it's a good idea; we could call it something like kafka-management-tools.
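The split described here, generic functions in a shared script with a small per-service script supplying the topic list, might look roughly like this. It is a sketch, not the contents of the real clean_kafka.sh: the function names, the `DRY_RUN` switch, the `KAFKA_HOME` default and the topic names in the usage comment are all hypothetical. It uses the `kafka-topics.sh --zookeeper ... --delete --topic` form shipped with Kafka 0.9.

```shell
# Shared part (e.g. a lib script in a kafka-management-tools repo; hypothetical).
KAFKA_HOME="${KAFKA_HOME:-/opt/kafka}"      # assumed install location
ZOOKEEPER="${ZOOKEEPER:-localhost:2181}"    # ZooKeeper default address

# Run a command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Delete a single topic.
drop_topic() {
    run "$KAFKA_HOME/bin/kafka-topics.sh" --zookeeper "$ZOOKEEPER" --delete --topic "$1"
}

# Delete every topic passed as an argument.
drop_topics() {
    local topic
    for topic in "$@"; do
        drop_topic "$topic"
    done
}

# Per-service part: source the shared script, then list the service's own topics, e.g.
#   . path/to/shared-lib.sh
#   drop_topics "example.topic-one" "example.topic-two"
```

Keeping the topic list out of the shared script means each service only maintains a few lines, while the Kafka command-line details live in one place.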

Petr mentioned a boot script start_kafka.sh which is part of the change-propagation service. Looks like it solves exactly the feature requested for trending-edits. So maybe we can look at running change-propagation on Jenkins first and trending-edits will then be straightforward? I am proposing to create a subtask to set up change-propagation.

We mostly use github+travis for ChangeProp development and it works pretty well for us.

Hm, so this is a bit of a chicken and egg problem. I can't set up tests until CI is ready, and we need to test CI with integration tests. @hashar Would it be useful if I make some tests first, or better set up the CI first and then make some tests?

FYI, I’ve made some slight modifications to Petr’s kafka .sh scripts over here: https://github.com/wikimedia/KafkaSSE/tree/master/test/utils

Most notably, the alteration lets me use the same scripts with the Kafka .deb package installed in mediawiki-vagrant by overriding some commands with environment variables.

Change 325937 had a related patch set uploaded (by Hashar):
dib: provision netcat-openbsd

https://gerrit.wikimedia.org/r/325937

Change 325937 merged by jenkins-bot:
dib: provision netcat-openbsd

https://gerrit.wikimedia.org/r/325937

Mentioned in SAL (#wikimedia-releng) [2016-12-08T15:28:55Z] <hashar> Updating Nodepool Jessie image to ship netcat T151469 T152684

@Pchelolo since change-propagation already has the Kafka-related glue, I have spun off T152684 and added the npm job to it. There is probably not much to add to the clean_kafka.sh so it downloads/starts Kafka whenever it detects it runs under the Jenkins env. Let's follow up on T152684 and we can come back here.

(just realized I have forked this task and the related discussion is happening on the sub task T152684) sorry I am terrible with Phabricator and keep filing too many tasks :o\

Was in "Next up" on Trending-Service workboard as of 04/25/2017.

The trending service has been undeployed from production, so this is not needed anymore.