Page MenuHomePhabricator

Jenkins: Re-enable lint checks for Apache config in operations-puppet
Closed, ResolvedPublic

Description

In the past Apache configs were in the apache-config repo.

When they were there and since Tim Starling wrote the generator for redirects you had to edit redirects.dat, then generate redirects.conf from that and upload both to Gerrit. Then jenkins would check if they matched and vote on that.

Then, the Apache configs have been moved into the mediawiki module (which i'm not entirely sure i support because many redirects have nothing to do with mediawiki, but that's not the point here). The point of this bug is that now f.e. on https://gerrit.wikimedia.org/r/#/c/149890/ you can see how it's possible to change just the .dat file, not upload the .conf file, but jenkins doesn't say anything about it, where it should vote it down. That is a regression compared to for example https://gerrit.wikimedia.org/r/#/c/146334/.

Can we have the Apache lint checks back even though now the configs are in the Mediawiki module?


Version: unspecified
Severity: major

Event Timeline

bzimport raised the priority of this task from to High.Nov 22 2014, 3:29 AM
bzimport set Reference to bz70068.

also i'm going to note that now a change as the one above turns into "puppet merge" & "apache deployment" at the same time if merged both has to be done, and this could have broken all the redirects

Do we know when this behavior changed?

I am just noticing this bug :-/ The apache configuration got moved from the standalone repository operations/apache-config.git to puppet.

The Jenkins job need to be ported as well.

Change 163812 had a related patch set uploaded by Hashar:
Reenable Apache lint check

https://gerrit.wikimedia.org/r/163812

Change 163813 had a related patch set uploaded by Hashar:
Retrigger operations-apache-config-lint (non voting)

https://gerrit.wikimedia.org/r/163813

Change 163813 merged by jenkins-bot:
Retrigger operations-apache-config-lint (non voting)

https://gerrit.wikimedia.org/r/163813

I have adjusted the job to clone both operations/puppet.git and operations/mediawiki-config.git using Zuul cloner: https://gerrit.wikimedia.org/r/#/c/163812/ . It is deployed though there is a mass amount of sed scripts which I am not fan of.

The job pass, but I am not sure whether it catch all potentials issues that might occurs in Apache configuration files. We need to propose patches to verify the job is failing properly.

I have update Zuul configuration ( https://gerrit.wikimedia.org/r/#/c/163813/ ) to trigger the job on both puppet and mediawiki-config repositories.

I have proposed two test changes to play with, both have a build passing when doing a noop change:

operations/puppet:
https://gerrit.wikimedia.org/r/#/c/163814/
https://integration.wikimedia.org/ci/job/operations-apache-config-lint/569/console

operations/mediawiki-config:
https://gerrit.wikimedia.org/r/#/c/163815/
https://integration.wikimedia.org/ci/job/operations-apache-config-lint/570/console

The patchsets can be reused to propose bad configurations and verify the job are failing properly. Whenever that is known to work, we can make them voting and hence vote Verified -2 in Gerrit.

Change 166033 had a related patch set uploaded by Hashar:
Reenable Apache lint check

https://gerrit.wikimedia.org/r/166033

Change 163812 abandoned by Hashar:
Reenable Apache lint check

Reason:
Migrated to integration/config.git as https://gerrit.wikimedia.org/r/#/c/166033/

https://gerrit.wikimedia.org/r/163812

Patch is pending peer reviewing/testing to verify the test script would properly detects issues.

hashar raised the priority of this task from High to Needs Triage.Nov 24 2014, 7:10 AM
hashar triaged this task as High priority.
hashar lowered the priority of this task from High to Medium.Nov 24 2014, 9:22 AM
hashar added a project: acl*sre-team.
hashar set Security to None.

+SRE

The patch is pending review, I have deployed the job on Sep. 30th https://integration.wikimedia.org/ci/job/operations-apache-config-lint/ but I am unsure whether it is going to catch all potential errors or is actually working properly. So meanwhile it has been non voting and will not prevent a bad apache change to sneak in.

We need people with good knowledge of our Apache configuration files to review what is being executed by the Jenkins job and +1 / validate it is fine. The change is https://gerrit.wikimedia.org/r/#/c/166033/

The relevant code: https://gerrit.wikimedia.org/r/#/c/166033/2/jjb/operations-apache-config.yaml

hashar raised the priority of this task from Medium to Needs Triage.Nov 24 2014, 9:26 AM

I have emailed the ops list to attract some reviewers:


Hello,

The wiki Apache conf has been moved to puppet during the summer and since then the Jenkins job in charge of validating the configuration has been non voting.

I think I have managed to port it properly to run on operations/puppet.git but would need some review to ensure it is working properly and going to catch possible errors.

The job is currently running on operations/puppet but is not voting :-/ The relevant code to review is:

https://gerrit.wikimedia.org/r/#/c/166033/2/jjb/operations-apache-config.yaml

I would appreciate a few other pair of eyes to look at the crazy shell and eventually send dummy commits with errors to verify they are caught properly.

Thanks!

Task reference: https://phabricator.wikimedia.org/T72068


hashar removed hashar as the assignee of this task.Dec 1 2014, 9:30 AM

FYI: Maybe T76955 is related or even blocks this task.

FYI: Maybe T76955 is related or even blocks this task.

Yup some oddity in zuul-cloner caused the Jenkins job workspace to be left in a bad state (missing git remote) which caused the job to always fail early on). I have fixed it up by deleting all the files and the job is building fine.

Krinkle triaged this task as Medium priority.Dec 16 2014, 5:21 PM

Could the operations-apache-config-lint job be completely removed until it is enabled again?

Change 194261 had a related patch set uploaded (by Krinkle):
Disable operations-apache-config-lint

https://gerrit.wikimedia.org/r/194261

Change 194261 merged by jenkins-bot:
Disable operations-apache-config-lint

https://gerrit.wikimedia.org/r/194261

Could the operations-apache-config-lint job be completely removed until it is enabled again?

Yep, makes sense. Done.

Nobody bothered reviewing the patches I proposed back in October / November though :( I kind of loose interest in pushing this though.

Quoting an earlier comment on this task:

+SRE

The patch is pending review, I have deployed the job on Sep. 30th https://integration.wikimedia.org/ci/job/operations-apache-config-lint/ but I am unsure whether it is going to catch all potential errors or is actually working properly. So meanwhile it has been non voting and will not prevent a bad apache change to sneak in.

We need people with good knowledge of our Apache configuration files to review what is being executed by the Jenkins job and +1 / validate it is fine. The change is https://gerrit.wikimedia.org/r/#/c/166033/

The relevant code: https://gerrit.wikimedia.org/r/#/c/166033/2/jjb/operations-apache-config.yaml

Or we can get the test script embeded in operations/puppet.git directly

I'm wondering if it's an option to just go back to the separate apache-config repo we had before and run the same tests we had before. But that would be a question for Ori and _joe_.

You are right about the general review problem, but this change is also in the integration/config repository and the 2 reviewers on it are both not in ops.

I'd recommend for someone experienced with apache config and operations/puppet (maybe @Dzahn) add an executable script to the puppet repository. Then we'll add a job that executes that. I don't think we have the resources for developing (and maintaining) such test right now.

As added bonus it would make it easy for maintainers of operations/puppet to then run it locally, too.

Why do we have to write something new? I wasn't asking for a new feature, just reported a regression from what we had in the past.

agreed if that's a regression we should automatically check whether the redirects files are synched, what do you think @Joe ?

Hashar, if this is still blocked on reviews then ping me on IRC this week and I'll try to get things unblocked.

Change 204994 had a related patch set uploaded (by Legoktm):
redirects: Add test to verify redirects.conf has been regenerated from redirects.dat

https://gerrit.wikimedia.org/r/204994

greg changed the task status from Open to Stalled.Apr 28 2015, 4:28 PM

We talked about this task during our weekly RelEng checkin. The Jenkins job definition is quite trivial, but the testing script is a bit more challenging. The script I came up with is not robust and is hardcoded in the Jenkins job.

Ideally the test script would be directly in operations/puppet.git repo and Jenkins would simply invoke it. That offers much more liberty to define what is being tested.

Change 204994 merged by Yuvipanda:
mediawiki: Add test to verify redirects.conf has been regenerated from redirects.dat

https://gerrit.wikimedia.org/r/204994

Good Job Lego (As he asked me to type) - the redirects are checked now. There is lots of other apache configs, not sure if those also need to be checked?

tested it.

just changed redirects.conf, but not .dat

https://gerrit.wikimedia.org/r/#/c/240626/1
AssertionError: redirects.conf not regenerated. Run ./refreshDomainRedirects.

then changed redirects.dat, but not .conf

https://gerrit.wikimedia.org/r/#/c/240626/3
AssertionError: redirects.conf not regenerated. Run ./refreshDomainRedirects.

then got them both in sync:
https://gerrit.wikimedia.org/r/#/c/240626/4
verified

Dzahn claimed this task.
Dzahn added a subscriber: Legoktm.

thank you @Legoktm

@yuvipanda maybe there should be more checks, yea. But since i created this just for the original regression and that works again like before i'll call it resolved

Change 243923 had a related patch set uploaded (by Hashar):
Remove operations-apache-config-lint

https://gerrit.wikimedia.org/r/243923

This is not yet doing any apache linting / parse checking, which it did before. Does this need to be reopened?

@JanZerebecki new task that is linked there though please?

We no have the redirects expansion being done. Still need to run the Apache2 configtest. I removed the Jenkins job but you can get some idea with https://gerrit.wikimedia.org/r/#/c/243923/1/jjb/operations-apache-config.yaml,unified

Change 243923 merged by jenkins-bot:
Remove operations-apache-config-lint

https://gerrit.wikimedia.org/r/243923