
Add analytics/* gerrit repos to code search
Closed, Resolved · Public

Description

I was just about to try to locate the code for the sqoop of wb_terms into hadoop and I figured code search might be a good starting place.

However, I can't seem to find the analytics repos in the tool at all.

Would we be able to add them?

EDIT: the list of repos is:

Important repos:

  • analytics/analytics.wikimedia.org
  • analytics/aqs
  • analytics/camus
  • analytics/dashiki
  • analytics/kafkatee
  • analytics/mediawiki-storage
  • analytics/quarry/web
  • analytics/refinery
  • analytics/refinery/scap
  • analytics/refinery/source
  • analytics/reportupdater
  • analytics/reportupdater-queries
  • analytics/statsv
  • analytics/ua-parser
  • analytics/ua-parser/uap-java
  • analytics/wikihadoop
  • analytics/wikimetrics
  • analytics/wikistats
  • analytics/wikistats2

All matching analytics.*: analytics/abacist, analytics/aggregator, analytics/aggregator/data, analytics/aggregator/projectview/data, analytics/analytics.wikimedia.org, analytics/aqs, analytics/aqs/deploy, analytics/asana-stats, analytics/blog, analytics/camus, analytics/dashiki, analytics/data-warehouse, analytics/dclass, analytics/discovery-stats, analytics/geowiki, analytics/geowiki/data-public, analytics/glass, analytics/global-dev/dashboard, analytics/global-dev/dashboard-data, analytics/hdfs-tools/deploy, analytics/jupyterhub/deploy, analytics/kafkatee, analytics/kraken, analytics/kraken/deploy, analytics/libanon, analytics/libcidr, analytics/limn, analytics/limn-analytics-data, analytics/limn-edit-data, analytics/limn-ee-data, analytics/limn-extdist-data, analytics/limn-flow-data, analytics/limn-language-data, analytics/limn-mobile-data, analytics/limn-multimedia-data, analytics/limn-wikidata-data, analytics/limn-wikidata-data/vendor, analytics/log2udp2, analytics/mediawiki-storage, analytics/metrics, analytics/multimedia, analytics/multimedia/config, analytics/pageview-api, analytics/pivot, analytics/pivot/deploy, analytics/proof-of-concept, analytics/quarry/web, analytics/refinery, analytics/refinery/scap, analytics/refinery/source, analytics/reportcard, analytics/reportcard/data, analytics/reportupdater, analytics/reportupdater-queries, analytics/snuggle, analytics/statsd-ganglia, analytics/statsv, analytics/superset/deploy, analytics/swap/deploy, analytics/tools/kripke, analytics/turnilo/deploy, analytics/ua-parser, analytics/ua-parser/uap-core, analytics/ua-parser/uap-java, analytics/udp-filters, analytics/udplog, analytics/user-metrics, analytics/vagrant/build, analytics/vagrant/kraken, analytics/websites_maintenance, analytics/webstatscollector, analytics/wikihadoop, analytics/wikimetrics, analytics/wikimetrics-deploy, analytics/wikipagestats, analytics/wikistats, analytics/wikistats2, analytics/wmde, analytics/wmde/NewEditors, 
analytics/wmde/NewEditors/wmdeBannerCampaigns_Dashboard, analytics/wmde/TW, analytics/wmde/TW/AdvancedSearchExtension-Dashboard, analytics/wmde/WD, analytics/wmde/WD/WD_identifierLandscape, analytics/wmde/WD/WD_languagesLandscape, analytics/wmde/WD/WD_percentUsageDashboard, analytics/wmde/WDCM, analytics/wmde/WDCM-Biases-Dashboard, analytics/wmde/WDCM-GeoDashboard, analytics/wmde/WDCM-Journal, analytics/wmde/WDCM-Overview-Dashboard, analytics/wmde/WDCM-Semantics-Dashboard, analytics/wmde/WDCM-ShinyServerFrontPage, analytics/wmde/WDCM-Sitelinks-Dashboard, analytics/wmde/WDCM-Structure-Dashboard, analytics/wmde/WDCM-Titles-Dashboard, analytics/wmde/WDCM-Usage-Dashboard, analytics/wmde/WDCM-WikipediaSemantics-Dashboard, analytics/wmde/WDCM-packages, analytics/wmde/Wiktionary, analytics/wmde/WiktionaryCognateDashboard, analytics/wmde/scripts, analytics/wmde/toolkit-analyzer, analytics/wmde/toolkit-analyzer-build, analytics/wmf-product, analytics/wp-zero, analytics/zero-sms

Event Timeline

Addshore created this task. Apr 3 2020, 10:28 AM
Restricted Application added a subscriber: Aklapper. Apr 3 2020, 10:28 AM

Oh boy that's a LOT of repos, listing them would be pretty fun. I was thinking maybe that needs a dedicated group. What do you think @Legoktm? Maybe we can add the most important ones for now

Milimetric moved this task from Incoming to Ops Week on the Analytics board.
Milimetric added a subscriber: Milimetric.

I don't know how to add them. But would love to.

Awesome. It's in this file: https://gerrit.wikimedia.org/r/plugins/gitiles/labs/codesearch/+/master/write_config.py

The script builds the Hound config JSON file.
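
As a rough illustration of what "building the Hound config" involves, here is a minimal sketch. It is not the actual contents of write_config.py: the function name `build_hound_config`, the clone base URL, and the `max-concurrent-indexers` and `dbpath` values are all assumptions chosen for the example. Only the general shape, a top-level `repos` mapping from repo name to a `url` entry, follows Hound's documented config format.

```python
import json


def build_hound_config(repo_names, base_url="https://gerrit.wikimedia.org/r/"):
    """Return a Hound-style config dict for the given repo names.

    Hypothetical helper: the base URL and the tuning values below are
    illustrative assumptions, not the values used by labs/codesearch.
    """
    repos = {}
    # De-duplicate and sort so the generated file is stable between runs.
    for name in sorted(set(repo_names)):
        repos[name] = {"url": base_url + name}
    return {
        "max-concurrent-indexers": 2,  # assumed value
        "dbpath": "data",              # assumed value
        "repos": repos,
    }


if __name__ == "__main__":
    config = build_hound_config(["analytics/refinery", "analytics/aqs"])
    print(json.dumps(config, indent=2))
```

Sorting and de-duplicating the names keeps the output deterministic, which makes changes to the generated file easier to review.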

Milimetric added a comment. Edited Apr 16 2020, 7:55 PM

OK, easy enough. I'm putting the list in the description so others can annotate it. I'm not sure what you mean by making a group; that Python script looks rather manual. Shall I just start a new convention and make a YAML file to list the repos, then load from there?

Milimetric updated the task description. Apr 16 2020, 8:04 PM

poke @Ladsgroup / @Legoktm: any guidance here before I start a refactor?

Your idea seems good and sensible to me. I don't see anything wrong with it. Maybe write tests to make sure things work? That's standard guidance in every refactor :P

I had some old code lying around that automatically generates the repo lists from Gerrit's prefix search; I just committed it in https://gerrit.wikimedia.org/r/c/labs/codesearch/+/608193

Creating an analytics filter using that will be trivial, I'll submit a patch in a few.
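
For readers unfamiliar with Gerrit's prefix search, the idea can be sketched roughly as follows. This is not the code from the linked change. The function names are hypothetical, and the sketch assumes Gerrit's REST endpoint `/projects/?p=<prefix>`, whose JSON response is preceded by a `)]}'` guard line that has to be stripped before parsing.

```python
import json
import urllib.parse
import urllib.request

# Gerrit prepends this line to JSON responses to guard against XSSI.
GERRIT_MAGIC_PREFIX = ")]}'"


def parse_projects_response(body):
    """Strip the guard line and return the project names, sorted."""
    if body.startswith(GERRIT_MAGIC_PREFIX):
        body = body[len(GERRIT_MAGIC_PREFIX):]
    # The response is a JSON object keyed by project name.
    return sorted(json.loads(body))


def list_repos(prefix, base_url="https://gerrit.wikimedia.org/r"):
    """Fetch all project names starting with `prefix` (hypothetical helper).

    Performs a network request; the base URL is an assumption.
    """
    query = urllib.parse.quote(prefix, safe="")
    url = f"{base_url}/projects/?p={query}"
    with urllib.request.urlopen(url) as resp:
        return parse_projects_response(resp.read().decode("utf-8"))
```

Under these assumptions, a call such as `list_repos("analytics/")` would return every project under that prefix, so a search profile would not need a hand-maintained list.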

Change 608200 had a related patch set uploaded (by Legoktm; owner: Legoktm):
[labs/codesearch@master] Add analytics search profile

https://gerrit.wikimedia.org/r/c/labs/codesearch/+/608200

Change 608200 merged by jenkins-bot:
[labs/codesearch@master] Add analytics search profile

https://gerrit.wikimedia.org/r/c/labs/codesearch/+/608200

Change 608203 had a related patch set uploaded (by Legoktm; owner: Legoktm):
[operations/puppet@production] codesearch: Add port for analytics search profile

https://gerrit.wikimedia.org/r/c/operations/puppet/+/608203

I deployed the codesearch part; once the Puppet patch is merged, this should go live.

Change 608203 merged by Dzahn:
[operations/puppet@production] codesearch: Add port for analytics search profile

https://gerrit.wikimedia.org/r/c/operations/puppet/+/608203

https://codesearch.wmflabs.org/analytics/ is live now, except... uBlock Origin blocks analytics/js by default, so it doesn't work unless you disable that rule. How big of a problem is that going to be? We could rename the path to something like analytics-real to prevent the rule from matching I suppose...

Milimetric closed this task as Resolved. Thu, Jul 9, 5:44 PM

Thank you very much for doing it better than I was going to!

I don't use uBlock so I have no opinions, but it seems like the main request here is done.