
Graduate codesearch to production
Open, LowPublic

Assigned To
None
Authored By
Legoktm
Nov 19 2020, 1:43 AM
Referenced Files
None
Tokens
"Heartbreak" token, awarded by Misfortunesdaughter. "Pterodactyl" token, awarded by LSobanski. "Love" token, awarded by bd808. "Like" token, awarded by Dzahn.

Description

codesearch has become a tool that developers increasingly rely upon. It's probably past time we graduate it into production.

Moving into production will bring other benefits, primarily allowing Gerrit to replicate Git repos instead of codesearch having to poll. Also better monitoring, etc.

The current architecture is roughly documented at https://www.mediawiki.org/wiki/Codesearch/Admin and is already fully puppetized.

Event Timeline

Legoktm updated the task description.

In general I like the idea, but I'm slightly worried about the limited time and energy we have. Given that we will eventually migrate to GitLab, we could spend that time and energy on something more useful. I'm not saying we shouldn't do it; I'm saying that with limited resources we could do something more lasting. If anyone feels strongly otherwise, I'm definitely okay with helping them get it to production.

I am for this. This is actually pretty easy to do since we already puppetized it. If we just create a Ganeti VM and put the Puppet role on it, all that is left should basically be creating a certificate, including one more profile, and adding a few lines of config in Trafficserver. The resources it needs are not an issue either when compared to other VMs.

This would finally be an example of a service actually moving from cloud into prod, instead of becoming something that we rely on but that stays "semi-prod" in cloud forever, which is what has usually happened.

If it's that simple and a Ganeti VM makes things much easier, then doing it should be pretty easy. Two points: 1) we need to use webproxy for pulling from GitHub; 2) does it need a security review?

Yes, you are right, it would need a security review; that's a good point. That might take a while to get scheduled.

Using webproxy should not be an issue, we do that for other things as well, for example planet pulling in RSS feeds.
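To make the webproxy point concrete, here is a minimal sketch of how a Git fetch could be routed through an outbound proxy by setting the standard `http_proxy` / `https_proxy` environment variables, which Git honors for HTTP(S) remotes. The proxy address and the helper names `proxied_env` and `fetch_repo` are assumptions made for illustration; this thread does not say how codesearch would actually be configured.

```python
import os
import subprocess

# Hypothetical proxy address; the real webproxy host and port are not
# stated in this task.
PROXY_URL = "http://webproxy.example.invalid:8080"


def proxied_env(proxy_url, base_env=None):
    """Return a copy of the environment with HTTP(S) proxy variables set."""
    env = dict(os.environ if base_env is None else base_env)
    env["http_proxy"] = proxy_url
    env["https_proxy"] = proxy_url
    return env


def fetch_repo(repo_dir, proxy_url=PROXY_URL):
    """Run `git fetch` in repo_dir, sending HTTP(S) traffic via the proxy."""
    subprocess.run(
        ["git", "fetch", "--quiet"],
        cwd=repo_dir,
        env=proxied_env(proxy_url),
        check=True,  # raise CalledProcessError if the fetch fails
    )
```

Setting the variables only for the child process keeps the proxy scoped to the fetch itself rather than to everything running on the host.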

Dzahn triaged this task as Medium priority. Nov 23 2020, 11:01 PM

We will also need to build the Docker image ourselves instead of relying on upstream's image on Docker Hub.

In general I like the idea, but I'm slightly worried about the limited time and energy we have. Given that we will eventually migrate to GitLab, we could spend that time and energy on something more useful. I'm not saying we shouldn't do it; I'm saying that with limited resources we could do something more lasting. If anyone feels strongly otherwise, I'm definitely okay with helping them get it to production.

I felt the same way with the GitLab migration coming, but having reviewed what GitLab CE offers and what the migration process is expected to be like, I think we're going to end up supporting codesearch in some fashion for at least another year, very likely two. In that case, moving to production will probably be worth it. Also, if we can switch to using Gerrit replication instead of polling after we migrate, that gets rid of the biggest failure point too.

As for moving forward, I think we have some cleanup tasks to do first. I'll look through those and mark some as blockers.

I agree that we should not fall into the trap of not doing things "because soon GitLab", because the migration will likely take a long time, during which we will have Gerrit, Phabricator, GitHub and GitLab repos.

LSobanski raised the priority of this task from Medium to Needs Triage. Nov 4 2022, 3:36 PM
LSobanski moved this task from Incoming to Backlog on the collaboration-services board.
LSobanski subscribed.

Need to figure out long term ownership before committing to this work.

given that we would eventually migrate to gitlab

GitLab CE does not have any cross project code search capabilities. Cross project search is an aspect of the advanced search feature which is only available in the non-gratis, non-libre Premium and Ultimate product offerings.

Now we have to upgrade the buster machines in cloud anyway, for T367479.

This is a good opportunity because it's basically a good chunk of the work also needed for this ticket.

Change #1043901 had a related patch set uploaded (by Dzahn; author: Dzahn):

[operations/puppet@production] codesearch: add support for docker-ce on bookworm

https://gerrit.wikimedia.org/r/1043901

Change #1043901 abandoned by Dzahn:

[operations/puppet@production] codesearch: add support for docker-ce on bookworm

Reason:

in favor of https://gerrit.wikimedia.org/r/c/operations/puppet/+/1046724

https://gerrit.wikimedia.org/r/1043901

One thing to be done here:

add a systemd unit for the frontend. See T367479#9904556