Extract bot classification into new private repo
Closed, Resolved, Public

Description

As part of revamping bot classification, we would like to extract the new process and centralize it in its own private repo.

Details

Related Changes in GitLab:
| Title | Reference | Author | Source Branch | Dest Branch |
|---|---|---|---|---|
| Use main image in canary events dag | repos/data-engineering/airflow-dags!2063 | aqu | T415874_use_main_airflow_image_in_canary_events_dag | main |
| Fix Airflow image with missing refinery jars | repos/data-engineering/airflow-dags!2062 | aqu | T415874_fix_airflow_image | main |
| bot-detection pipeline fix | repos/data-engineering/airflow-dags!2059 | aqu | T415874_switch_bot_pipeline_repository_fix2 | main |
| Hotfix: Remove optimization on webrequest_actor pipeline | repos/data-engineering/airflow-dags!2058 | aqu | T415874_switch_bot_pipeline_repository_fix | main |
| analytics-test: Fix a dag of test using new Airflow image | repos/data-engineering/airflow-dags!2054 | aqu | T415874_fix_analytics_test_test_dag | main |
| Bump workflow-utils from 0.30.0 to 0.32.0 | repos/data-engineering/airflow-dags!2051 | aqu | T415874_bump_prod_image_with_workflow_utils | main |
| Update blunderbuss with last version of workflow_utils | repos/data-engineering/blunderbuss!15 | aqu | T415874_workflow_utils_bump | main |
| Fix fsspec_options passing | repos/data-engineering/workflow_utils!64 | aqu | T415874_pass_fsspec_options_to_fs | main |
| Bump blunderbuss bugler version | repos/data-engineering/airflow-dags!2044 | aqu | T415874_bump_blunderbuss_bugler_1.5.2 | main |
| Print debug lines about env vars | repos/data-engineering/blunderbuss-bugler!24 | aqu | T415874_add_debug_printing | main |
| Bump blunderbuss Bugler to support extra parameter | repos/data-engineering/airflow-dags!2043 | aqu | T415874_bump_blunderbuss_bugler | main |
| Analytics-test: Test new Airflow image with bump in workflow-utils | repos/data-engineering/airflow-dags!2035 | aqu | T415874_test_new_docker_image_in_analytics_test | main |
| Bump workflow-utils from 0.13 to 0.30.0 | repos/data-engineering/airflow-dags!2029 | aqu | T415874_bump_workflow_utils_in_main | main |
| Add Jinja2 rendering of artifact_sources config via blunderbuss_bugler_env_vars | repos/data-engineering/blunderbuss!13 | aqu | T415874_private_repo_support | main |
| Add fsspec_options parameter to FsArtifactSource | repos/data-engineering/workflow_utils!59 | aqu | T415874_pass_custom_headers_for_url | main |
| Add blunderbuss_bugler_env_vars to request payload | repos/data-engineering/blunderbuss-bugler!22 | aqu | T415874_private_repo_support | main |
| Use refinery-private archive for webrequest actor HQL scripts | repos/data-engineering/airflow-dags!1988 | aqu | T415874_switch_bot_pipeline_repository | main |

Event Timeline

Change #1237928 had a related patch set uploaded (by Aqu; author: Aqu):

[analytics/refinery@master] Move bot detection pipeline into new repo

https://gerrit.wikimedia.org/r/1237928

aqu opened https://gitlab.wikimedia.org/repos/data-engineering/blunderbuss/-/merge_requests/13

Add Jinja2 rendering of artifact_sources config via blunderbuss_bugler_env_vars

aqu merged https://gitlab.wikimedia.org/repos/data-engineering/blunderbuss/-/merge_requests/13

Add Jinja2 rendering of artifact_sources config via blunderbuss_bugler_env_vars

Change #1243221 had a related patch set uploaded (by Aqu; author: Aqu):

[operations/deployment-charts@master] Bump Blunderbuss image

https://gerrit.wikimedia.org/r/1243221

Change #1243221 merged by jenkins-bot:

[operations/deployment-charts@master] Bump Blunderbuss image

https://gerrit.wikimedia.org/r/1243221

Change #1247570 had a related patch set uploaded (by Aqu; author: Aqu):

[operations/deployment-charts@master] dse-k8s airflow-analytics-test: Bump image

https://gerrit.wikimedia.org/r/1247570

Change #1248015 had a related patch set uploaded (by Aqu; author: Aqu):

[operations/deployment-charts@master] dse-k8s-services Blunderbuss: Bump image

https://gerrit.wikimedia.org/r/1248015

Change #1248015 merged by jenkins-bot:

[operations/deployment-charts@master] dse-k8s-services Blunderbuss: Bump image

https://gerrit.wikimedia.org/r/1248015

Change #1247570 merged by Brouberol:

[operations/deployment-charts@master] dse-k8s airflow-analytics-test: Bump image

https://gerrit.wikimedia.org/r/1247570

Change #1249094 had a related patch set uploaded (by Aqu; author: Aqu):

[operations/deployment-charts@master] dse-k8s-services Airflow: Bump image

https://gerrit.wikimedia.org/r/1249094

Change #1249094 merged by Brouberol:

[operations/deployment-charts@master] dse-k8s-services Airflow: Bump image

https://gerrit.wikimedia.org/r/1249094

Antoine_Quhen renamed this task from "Extract bot classification into new repo" to "Extract bot classification into new private repo". Mar 9 2026, 8:59 AM
Antoine_Quhen updated the task description.

Summary of the work done here.

Change #1249936 had a related patch set uploaded (by Aqu; author: Aqu):

[operations/deployment-charts@master] Bump Airflow image to include missing jars

https://gerrit.wikimedia.org/r/1249936

Change #1249936 merged by Brouberol:

[operations/deployment-charts@master] Bump Airflow image to include missing jars

https://gerrit.wikimedia.org/r/1249936

Change #1237928 merged by Aqu:

[analytics/refinery@master] Move bot detection pipeline into new repo

https://gerrit.wikimedia.org/r/1237928

The legacy pipeline has been extracted and removed from analytics-refinery.