Investigate whether we should switch from miniconda to miniforge
Open, Medium, Public

Description

The Data-Platform team currently uses Miniconda to provide a number of Anaconda-based environments.

These include:

As well as some other projects that make use of WMF Data Workflow Utils

Anaconda has announced that the licensing for their software will be changing in such a way that non-profit and academic institutions will have to start paying for their software.
cf. https://www.theregister.com/2024/08/08/anaconda_puts_the_squeeze_on/

We may want to start thinking about a migration from Miniconda to Miniforge, before this becomes a problem.

Event Timeline

Gehel triaged this task as Medium priority. (Aug 14 2024, 8:33 AM)
Gehel moved this task from Incoming to Scratch on the Data-Platform-SRE board.
Gehel moved this task from Scratch to Software Upgrades on the Data-Platform-SRE board.

It seems I was able to replace miniconda with miniforge for airflow-dags quite easily: https://gitlab.wikimedia.org/repos/data-engineering/airflow-dags/-/merge_requests/829/diffs

In the conda env build job output, we clearly see that the installed packages come from the conda-forge channel:

#16 4.409   Package                         Version  Build               Channel           Size
#16 4.409 ───────────────────────────────────────────────────────────────────────────────────────
#16 4.409   Install:
#16 4.409 ───────────────────────────────────────────────────────────────────────────────────────
#16 4.409 
#16 4.409   + _libgcc_mutex                     0.1  conda_forge         conda-forge     Cached
#16 4.409   + _openmp_mutex                     4.5  2_gnu               conda-forge     Cached
#16 4.409   + boltons                        23.0.0  pyhd8ed1ab_0        conda-forge     Cached
#16 4.409   + brotli-python                   1.0.9  py310hd8f1fbe_9     conda-forge     Cached
#16 4.409   + bzip2                           1.0.8  h7f98852_4          conda-forge     Cached
#16 4.409   + c-ares                         1.19.1  hd590300_0          conda-forge     Cached
#16 4.409   + ca-certificates             2023.7.22  hbcca054_0          conda-forge     Cached
#16 4.409   + certifi                     2023.7.22  pyhd8ed1ab_0        conda-forge     Cached
#16 4.409   + cffi                           1.15.1  py310h255011f_3     conda-forge     Cached
#16 4.409   + charset-normalizer              3.2.0  pyhd8ed1ab_0        conda-forge     Cached
#16 4.409   + colorama                        0.4.6  pyhd8ed1ab_0        conda-forge     Cached
#16 4.409   + conda                          23.3.1  py310hff52083_0     conda-forge     Cached
#16 4.409   + conda-libmamba-solver          23.3.0  pyhd8ed1ab_0        conda-forge     Cached
#16 4.409   + conda-package-handling          2.2.0  pyh38be061_0        conda-forge     Cached
#16 4.409   + conda-package-streaming         0.9.0  pyhd8ed1ab_0        conda-forge     Cached
...
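To make the "everything comes from conda-forge" check repeatable rather than eyeballed, the build log can be parsed. The following is a minimal sketch, assuming the install rows keep the column layout shown above (`+ name version build channel size`); the function name `packages_by_channel` and the regular expression are illustrative and not part of any existing tooling.

```python
import re

# Matches install rows such as:
#   "#16 4.409   + boltons   23.0.0  pyhd8ed1ab_0   conda-forge   Cached"
# Assumption: five whitespace-separated columns follow the leading "+".
ROW = re.compile(
    r"\+\s+(?P<name>\S+)\s+(?P<version>\S+)\s+(?P<build>\S+)"
    r"\s+(?P<channel>\S+)\s+\S+\s*$"
)


def packages_by_channel(log_text):
    """Group package names from a build log by the channel they came from."""
    channels = {}
    for line in log_text.splitlines():
        match = ROW.search(line)
        if match:
            channels.setdefault(match["channel"], []).append(match["name"])
    return channels


# A few rows copied from the job output above.
sample = """\
#16 4.409   + _libgcc_mutex                     0.1  conda_forge         conda-forge     Cached
#16 4.409   + boltons                        23.0.0  pyhd8ed1ab_0        conda-forge     Cached
#16 4.409   + conda                          23.3.1  py310hff52083_0     conda-forge     Cached
"""

found = packages_by_channel(sample)
# Any channel other than conda-forge would be worth a closer look.
unexpected = sorted(set(found) - {"conda-forge"})
print(found)
print("channels other than conda-forge:", unexpected)
```

Running this over the full job log instead of `sample` would list any package that still resolves from another channel.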

Actually, I won't claim that task until we discuss it further. We now do have a PoC that shows how we could swap miniconda for miniforge. Let's discuss it further.

Actually, now that we've merged the work to the airflow-dags repo, I can tackle the same thing in the conda-analytics repo, and I'll defer the mediawiki-content-dump change to someone else, I think.

mediawiki-content-dump will require extra surgery; it will not just be that particular repo, but also the repo that has all the tools to build conda-packs, which is https://gitlab.wikimedia.org/repos/data-engineering/workflow_utils. The good thing is that this would also cover upstream dependencies for all other PySpark jobs.

A question though because I am confused: are we committing to do this work or is this still just a PoC?

I think that given that it's an easy switch and that we might have legal exposure, I'm just making it happen in codebases I "control".

At least for the repositories where the transition is easy, we should do it. For more complex projects, we can wait on Legal's evaluation of the urgency.

Data-Platform-SRE will document the process and create tasks for the use cases we know, but we will let the owners of each project do the implementation.

I'm going to have a look at this for a little while, since we have a gritty problem with the CI in https://gitlab.wikimedia.org/repos/data-engineering/conda-analytics/-/merge_requests/52

I think that it's something to do with the way that conda-analytics-clone is working, but I'm not sure yet.

I have been attempting to recreate this locally, rather than in CI.
The command that I am using to build the environment is:

docker build --platform linux/amd64 -f docker/Dockerfile -t conda-analytics .

I can get a shell in the conda-analytics environment with:

docker run --platform linux/amd64 --memory="4g" --cpus="2.0" --rm -it conda-analytics

I have reproduced the issue with the dependencies.

However, when I run the command conda-analytics-clone mytestenv interactively in the conda-analytics container, it succeeds.

btullis@marlin:~/wmf/conda-analytics$ docker run --platform linux/amd64 --memory="4g" --cpus="2.0" --rm -it conda-analytics
root@5207f88f0826:/# conda-analytics-clone mytestenv
Creating new cloned conda env mytestenv...
Source:      /opt/conda-analytics
Destination: /root/.conda/envs/mytestenv
The following packages cannot be cloned out of the root environment:
 - conda-forge/linux-64::conda-23.10.0-py310hff52083_1
 - conda-forge/noarch::conda-libmamba-solver-23.12.0-pyhd8ed1ab_0
Packages: 222
Files: 958

Downloading and Extracting Packages:


Downloading and Extracting Packages:

Preparing transaction: done
Verifying transaction: done
Executing transaction: done
#
# To activate this environment, use
#
#     $ conda activate mytestenv
#
# To deactivate an active environment, use
#
#     $ conda deactivate

Collecting package metadata (current_repodata.json): done
Solving environment: done

## Package Plan ##

  environment location: /root/.conda/envs/mytestenv

  added / updated specs:
    - conda-libmamba-solver=23.12.0
    - conda=23.10.0


The following NEW packages will be INSTALLED:

  conda              conda-forge/linux-64::conda-23.10.0-py310hff52083_1 
  conda-libmamba-so~ conda-forge/noarch::conda-libmamba-solver-23.12.0-pyhd8ed1ab_0 



Downloading and Extracting Packages:

Preparing transaction: done
Verifying transaction: done
Executing transaction: done
Wed Sep 25 17:34:53 UTC 2024 Created user conda environment mytestenv

To activate this environment with vanilla conda run:
  source /opt/conda-analytics/etc/profile.d/conda.sh
  conda activate mytestenv

Alternatively, you can use the conda-analytic helper script:
  source conda-analytics-activate mytestenv

root@5207f88f0826:/#

The difference seems to be the versions of conda and conda-libmamba-solver that cannot be cloned out of the root environment.
The version from the CI output shows this:

The following packages cannot be cloned out of the root environment:
- conda-forge/linux-64::conda-23.10.0-py310hff52083_1
- conda-forge/noarch::conda-libmamba-solver-24.7.0-pyhd8ed1ab_0

...whereas the versions that were shown when I ran the command interactively were this:

The following packages cannot be cloned out of the root environment:
 - conda-forge/linux-64::conda-23.10.0-py310hff52083_1
 - conda-forge/noarch::conda-libmamba-solver-23.12.0-pyhd8ed1ab_0
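The two lists can also be compared mechanically. This is a small sketch, assuming the specs always have the form `channel/subdir::name-version-build` as in the output above; `parse_spec` and `version_diff` are hypothetical helper names, not part of conda or conda-analytics.

```python
def parse_spec(spec):
    """Split 'channel/subdir::name-version-build' into its parts."""
    location, _, filename = spec.partition("::")
    channel, _, subdir = location.partition("/")
    # Package names may contain hyphens, so split from the right.
    name, version, build = filename.rsplit("-", 2)
    return {
        "channel": channel,
        "subdir": subdir,
        "name": name,
        "version": version,
        "build": build,
    }


def version_diff(left, right):
    """Return {name: (left_version, right_version)} for packages that differ."""
    a = {p["name"]: p["version"] for p in map(parse_spec, left)}
    b = {p["name"]: p["version"] for p in map(parse_spec, right)}
    return {n: (a[n], b.get(n)) for n in a if a[n] != b.get(n)}


# Specs copied from the CI output and from the interactive run.
ci = [
    "conda-forge/linux-64::conda-23.10.0-py310hff52083_1",
    "conda-forge/noarch::conda-libmamba-solver-24.7.0-pyhd8ed1ab_0",
]
interactive = [
    "conda-forge/linux-64::conda-23.10.0-py310hff52083_1",
    "conda-forge/noarch::conda-libmamba-solver-23.12.0-pyhd8ed1ab_0",
]

diff = version_diff(ci, interactive)
print(diff)
```

Only conda-libmamba-solver differs between the two runs (24.7.0 in CI, 23.12.0 interactively); conda itself is 23.10.0 in both.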

I think I have found out what the issue is. It is down to the way that we were running the command to install the conda and conda-libmamba-solver packages to the newly cloned environment.
The patch is here: https://gitlab.wikimedia.org/repos/data-engineering/conda-analytics/-/merge_requests/52/diffs?commit_id=f8fe991aac0b561118e65bbce8d5dc243953f699

Running a build now, to see if the package checks out.

Woohoo, it worked! Thanks a ton Ben :)

I'll just test the build on an-test-client1002 before merging, to make sure that things like jupyterhub and pyspark don't show any problems. If they work, then I will merge it, add the deb to the apt repo, then push out the new version of conda-analytics to the test cluster.

btullis@an-test-client1002:~$ wget https://gitlab.wikimedia.org/api/v4/projects/359/packages/generic/conda-analytics/0.0.36/conda-analytics-0.0.36_amd64.deb
--2024-09-26 12:21:38--  https://gitlab.wikimedia.org/api/v4/projects/359/packages/generic/conda-analytics/0.0.36/conda-analytics-0.0.36_amd64.deb
Resolving gitlab.wikimedia.org (gitlab.wikimedia.org)... 2620:0:860:1:208:80:153:8, 208.80.153.8
Connecting to gitlab.wikimedia.org (gitlab.wikimedia.org)|2620:0:860:1:208:80:153:8|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 1049362336 (1001M) [application/octet-stream]
Saving to: ‘conda-analytics-0.0.36_amd64.deb’

conda-analytics-0.0.36_amd64.deb                     100%[=====================================================================================================================>]   1001M  87.3MB/s    in 10s     

2024-09-26 12:21:48 (97.9 MB/s) - ‘conda-analytics-0.0.36_amd64.deb’ saved [1049362336/1049362336]

btullis@an-test-client1002:~$ sudo dpkg -i conda-analytics-0.0.36_amd64.deb 
(Reading database ... 266242 files and directories currently installed.)
Preparing to unpack conda-analytics-0.0.36_amd64.deb ...
Unpacking conda-analytics (0.0.36) over (0.0.35) ...
Setting up conda-analytics (0.0.36) ...
Post install script.
  Running /opt/conda-analytics/bin/python /opt/conda-analytics/bin/conda-unpack...
btullis@an-test-client1002:~$ sudo systemctl restart jupyterhub-conda.service 
btullis@an-test-client1002:~$

Oh, sadly starting a new Jupyter session with a freshly cloned environment didn't work.

Checking the logs of my server with journalctl -u jupyter-btullis-singleuser-conda-analytics.service I can see an error:

-- Journal begins at Thu 2024-09-26 01:20:36 UTC, ends at Thu 2024-09-26 12:35:08 UTC. --
Sep 26 12:28:58 an-test-client1002 systemd[1]: Started /bin/bash -c cd /home/btullis && exec /etc/jupyterhub-conda/jupyterhub-singleuser-conda-env.sh __NEW__ --port=36157 --SingleUserNotebookApp.default_url=/la>
Sep 26 12:28:58 an-test-client1002 jupyterhub-conda-singleuser[1067451]: Creating new cloned conda env 2024-09-26T12.28.58_btullis...
Sep 26 12:29:05 an-test-client1002 jupyterhub-conda-singleuser[1067474]: Source:      /opt/conda-analytics
Sep 26 12:29:05 an-test-client1002 jupyterhub-conda-singleuser[1067474]: Destination: /home/btullis/.conda/envs/2024-09-26T12.28.58_btullis
Sep 26 12:29:05 an-test-client1002 jupyterhub-conda-singleuser[1067474]: The following packages cannot be cloned out of the root environment:
Sep 26 12:29:05 an-test-client1002 jupyterhub-conda-singleuser[1067474]:  - conda-forge/linux-64::conda-23.10.0-py310hff52083_1
Sep 26 12:29:05 an-test-client1002 jupyterhub-conda-singleuser[1067474]:  - conda-forge/noarch::conda-libmamba-solver-24.7.0-pyhd8ed1ab_0
Sep 26 12:29:05 an-test-client1002 jupyterhub-conda-singleuser[1067474]: Packages: 225
Sep 26 12:29:05 an-test-client1002 jupyterhub-conda-singleuser[1067474]: Files: 1257
Sep 26 12:29:05 an-test-client1002 jupyterhub-conda-singleuser[1067474]: Downloading and Extracting Packages: ...working... done
Sep 26 12:29:05 an-test-client1002 jupyterhub-conda-singleuser[1067474]: Downloading and Extracting Packages: ...working... done
Sep 26 12:29:07 an-test-client1002 jupyterhub-conda-singleuser[1067474]: Preparing transaction: ...working... done
Sep 26 12:29:16 an-test-client1002 jupyterhub-conda-singleuser[1067474]: Verifying transaction: ...working... done
Sep 26 12:29:42 an-test-client1002 jupyterhub-conda-singleuser[1067474]: Executing transaction: ...working... done
Sep 26 12:29:42 an-test-client1002 jupyterhub-conda-singleuser[1067474]: #
Sep 26 12:29:42 an-test-client1002 jupyterhub-conda-singleuser[1067474]: # To activate this environment, use
Sep 26 12:29:42 an-test-client1002 jupyterhub-conda-singleuser[1067474]: #
Sep 26 12:29:42 an-test-client1002 jupyterhub-conda-singleuser[1067474]: #     $ conda activate 2024-09-26T12.28.58_btullis
Sep 26 12:29:42 an-test-client1002 jupyterhub-conda-singleuser[1067474]: #
Sep 26 12:29:42 an-test-client1002 jupyterhub-conda-singleuser[1067474]: # To deactivate an active environment, use
Sep 26 12:29:42 an-test-client1002 jupyterhub-conda-singleuser[1067474]: #
Sep 26 12:29:42 an-test-client1002 jupyterhub-conda-singleuser[1067474]: #     $ conda deactivate
Sep 26 12:29:42 an-test-client1002 jupyterhub-conda-singleuser[1067451]: Installing the conda and conda-libmamba-solver packages to the newly cloned environment: 2024-09-26T12.28.58_btullis
Sep 26 12:29:44 an-test-client1002 jupyterhub-conda-singleuser[1067673]: Channels:
Sep 26 12:29:44 an-test-client1002 jupyterhub-conda-singleuser[1067673]:  - conda-forge
Sep 26 12:29:44 an-test-client1002 jupyterhub-conda-singleuser[1067673]: Platform: linux-64
Sep 26 12:30:03 an-test-client1002 jupyterhub-conda-singleuser[1067673]: Collecting package metadata (repodata.json): ...working... done
Sep 26 12:31:01 an-test-client1002 jupyterhub-conda-singleuser[1067673]: Solving environment: ...working... failed
Sep 26 12:31:01 an-test-client1002 jupyterhub-conda-singleuser[1067673]: LibMambaUnsatisfiableError: Encountered problems while solving:
Sep 26 12:31:01 an-test-client1002 jupyterhub-conda-singleuser[1067673]:   - package conda-libmamba-solver-24.7.0-pyhd8ed1ab_0 is excluded by strict repo priority
Sep 26 12:31:01 an-test-client1002 systemd[1]: jupyter-btullis-singleuser-conda-analytics.service: Main process exited, code=exited, status=1/FAILURE
Sep 26 12:31:01 an-test-client1002 systemd[1]: jupyter-btullis-singleuser-conda-analytics.service: Failed with result 'exit-code'.
Sep 26 12:31:01 an-test-client1002 systemd[1]: jupyter-btullis-singleuser-conda-analytics.service: Consumed 1min 45.288s CPU time.

So it's the same command that is failing, but I don't yet understand why.

This one worked.
I removed the strict channel priority from /opt/conda-analytics/condarc in https://gitlab.wikimedia.org/repos/data-engineering/conda-analytics/-/merge_requests/52/diffs?commit_id=583705655587d8be8c81497564e0068f4efb4728 and then reinstalled the deb and restarted the jupyterhub-conda service.
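For readers unfamiliar with the setting: under strict channel priority, conda does not consider a package from a lower-priority channel if a package with the same name exists in a higher-priority channel. The toy model below illustrates that rule only. It is not conda's or libmamba's implementation, and the channel names (`high`, `low`), the package `examplepkg`, and its versions are invented for illustration; it does not claim to reproduce the exact failure seen above.

```python
def candidates(name, channels, index, strict):
    """Return (channel, version) pairs a solver may consider for `name`.

    channels: channel names, highest priority first.
    index:    {channel: {package_name: [versions]}} (a made-up toy index).
    strict:   if True, stop at the first channel that provides `name`.
    """
    found = []
    for channel in channels:
        versions = index.get(channel, {}).get(name, [])
        if versions:
            found.extend((channel, v) for v in versions)
            if strict:
                break  # lower-priority channels are ignored for this name
    return found


# Hypothetical data: the pinned version 2.0 exists only in the
# lower-priority channel.
channels = ["high", "low"]
index = {
    "high": {"examplepkg": ["1.0"]},
    "low": {"examplepkg": ["1.0", "2.0"]},
}

strict_result = candidates("examplepkg", channels, index, strict=True)
flexible_result = candidates("examplepkg", channels, index, strict=False)

print("strict:  ", strict_result)
print("flexible:", flexible_result)
print("2.0 installable under strict priority:",
      any(v == "2.0" for _, v in strict_result))
```

In this model, a pin on `examplepkg=2.0` cannot be satisfied under strict priority, because the only copy of 2.0 lives in a channel that is excluded for that package name. Relaxing the priority makes it visible again.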

Sep 26 13:59:02 an-test-client1002 jupyterhub-conda-singleuser[1114433]: Creating new cloned conda env 2024-09-26T13.59.02_btullis...
Sep 26 13:59:10 an-test-client1002 jupyterhub-conda-singleuser[1114458]: Source:      /opt/conda-analytics
Sep 26 13:59:10 an-test-client1002 jupyterhub-conda-singleuser[1114458]: Destination: /home/btullis/.conda/envs/2024-09-26T13.59.02_btullis
Sep 26 13:59:10 an-test-client1002 jupyterhub-conda-singleuser[1114458]: The following packages cannot be cloned out of the root environment:
Sep 26 13:59:10 an-test-client1002 jupyterhub-conda-singleuser[1114458]:  - conda-forge/linux-64::conda-23.10.0-py310hff52083_1
Sep 26 13:59:10 an-test-client1002 jupyterhub-conda-singleuser[1114458]:  - conda-forge/noarch::conda-libmamba-solver-24.7.0-pyhd8ed1ab_0
Sep 26 13:59:10 an-test-client1002 jupyterhub-conda-singleuser[1114458]: Packages: 225
Sep 26 13:59:10 an-test-client1002 jupyterhub-conda-singleuser[1114458]: Files: 1257
Sep 26 13:59:10 an-test-client1002 jupyterhub-conda-singleuser[1114458]: Downloading and Extracting Packages: ...working... done
Sep 26 13:59:10 an-test-client1002 jupyterhub-conda-singleuser[1114458]: Downloading and Extracting Packages: ...working... done
Sep 26 13:59:12 an-test-client1002 jupyterhub-conda-singleuser[1114458]: Preparing transaction: ...working... done
Sep 26 13:59:21 an-test-client1002 jupyterhub-conda-singleuser[1114458]: Verifying transaction: ...working... done
Sep 26 13:59:44 an-test-client1002 jupyterhub-conda-singleuser[1114458]: Executing transaction: ...working... done
Sep 26 13:59:44 an-test-client1002 jupyterhub-conda-singleuser[1114458]: #
Sep 26 13:59:44 an-test-client1002 jupyterhub-conda-singleuser[1114458]: # To activate this environment, use
Sep 26 13:59:44 an-test-client1002 jupyterhub-conda-singleuser[1114458]: #
Sep 26 13:59:44 an-test-client1002 jupyterhub-conda-singleuser[1114458]: #     $ conda activate 2024-09-26T13.59.02_btullis
Sep 26 13:59:44 an-test-client1002 jupyterhub-conda-singleuser[1114458]: #
Sep 26 13:59:44 an-test-client1002 jupyterhub-conda-singleuser[1114458]: # To deactivate an active environment, use
Sep 26 13:59:44 an-test-client1002 jupyterhub-conda-singleuser[1114458]: #
Sep 26 13:59:44 an-test-client1002 jupyterhub-conda-singleuser[1114458]: #     $ conda deactivate
Sep 26 13:59:44 an-test-client1002 jupyterhub-conda-singleuser[1114433]: Installing the conda and conda-libmamba-solver packages to the newly cloned environment: 2024-09-26T13.59.02_btullis
Sep 26 13:59:46 an-test-client1002 jupyterhub-conda-singleuser[1114646]: Channels:
Sep 26 13:59:46 an-test-client1002 jupyterhub-conda-singleuser[1114646]:  - defaults
Sep 26 13:59:46 an-test-client1002 jupyterhub-conda-singleuser[1114646]: Platform: linux-64
Sep 26 13:59:50 an-test-client1002 jupyterhub-conda-singleuser[1114646]: Collecting package metadata (repodata.json): ...working... done
Sep 26 13:59:52 an-test-client1002 jupyterhub-conda-singleuser[1114646]: Solving environment: ...working... done
Sep 26 13:59:53 an-test-client1002 jupyterhub-conda-singleuser[1114646]: ## Package Plan ##
Sep 26 13:59:53 an-test-client1002 jupyterhub-conda-singleuser[1114646]:   environment location: /home/btullis/.conda/envs/2024-09-26T13.59.02_btullis
Sep 26 13:59:53 an-test-client1002 jupyterhub-conda-singleuser[1114646]:   added / updated specs:
Sep 26 13:59:53 an-test-client1002 jupyterhub-conda-singleuser[1114646]:     - conda-libmamba-solver=24.7.0
Sep 26 13:59:53 an-test-client1002 jupyterhub-conda-singleuser[1114646]:     - conda=23.10.0
Sep 26 13:59:53 an-test-client1002 jupyterhub-conda-singleuser[1114646]: The following NEW packages will be INSTALLED:
Sep 26 13:59:53 an-test-client1002 jupyterhub-conda-singleuser[1114646]:   conda              conda-forge/linux-64::conda-23.10.0-py310hff52083_1
Sep 26 13:59:53 an-test-client1002 jupyterhub-conda-singleuser[1114646]:   conda-libmamba-so~ conda-forge/noarch::conda-libmamba-solver-24.7.0-pyhd8ed1ab_0
Sep 26 13:59:53 an-test-client1002 jupyterhub-conda-singleuser[1114646]: Downloading and Extracting Packages: ...working... done
Sep 26 13:59:53 an-test-client1002 jupyterhub-conda-singleuser[1114646]: Preparing transaction: ...working... done
Sep 26 13:59:54 an-test-client1002 jupyterhub-conda-singleuser[1114646]: Verifying transaction: ...working... done
Sep 26 13:59:54 an-test-client1002 jupyterhub-conda-singleuser[1114646]: Executing transaction: ...working... done
Sep 26 13:59:55 an-test-client1002 jupyterhub-conda-singleuser[1114433]: Thu 26 Sep 2024 01:59:55 PM UTC Created user conda environment 2024-09-26T13.59.02_btullis
Sep 26 13:59:55 an-test-client1002 jupyterhub-conda-singleuser[1114433]: To activate this environment with vanilla conda run:
Sep 26 13:59:55 an-test-client1002 jupyterhub-conda-singleuser[1114433]:   source /opt/conda-analytics/etc/profile.d/conda.sh
Sep 26 13:59:55 an-test-client1002 jupyterhub-conda-singleuser[1114433]:   conda activate 2024-09-26T13.59.02_btullis
Sep 26 13:59:55 an-test-client1002 jupyterhub-conda-singleuser[1114433]: Alternatively, you can use the conda-analytic helper script:
Sep 26 13:59:55 an-test-client1002 jupyterhub-conda-singleuser[1114433]:   source conda-analytics-activate 2024-09-26T13.59.02_btullis

I have added the new conda-analytics version 0.0.36 to the apt repository:

btullis@apt1002:~$ wget https://gitlab.wikimedia.org/api/v4/projects/359/packages/generic/conda-analytics/0.0.36/conda-analytics-0.0.36_amd64.deb
--2024-09-27 12:01:58--  https://gitlab.wikimedia.org/api/v4/projects/359/packages/generic/conda-analytics/0.0.36/conda-analytics-0.0.36_amd64.deb
Resolving gitlab.wikimedia.org (gitlab.wikimedia.org)... 2620:0:860:1:208:80:153:8, 208.80.153.8
Connecting to gitlab.wikimedia.org (gitlab.wikimedia.org)|2620:0:860:1:208:80:153:8|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 1049427424 (1001M) [application/octet-stream]
Saving to: ‘conda-analytics-0.0.36_amd64.deb’

conda-analytics-0.0.36_amd64.deb                     100%[=====================================================================================================================>]   1001M   110MB/s    in 9.4s    

2024-09-27 12:02:08 (107 MB/s) - ‘conda-analytics-0.0.36_amd64.deb’ saved [1049427424/1049427424]

btullis@apt1002:~$ sudo -i reprepro -C main includedeb bookworm-wikimedia `pwd`/conda-analytics-0.0.36_amd64.deb
Exporting indices...
btullis@apt1002:~$ sudo -i reprepro -C main includedeb bullseye-wikimedia `pwd`/conda-analytics-0.0.36_amd64.deb
Exporting indices...
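Since the same includedeb call is repeated once per distribution, the commands can be generated rather than typed by hand. This is a dry-run sketch only: it prints the command lines and executes nothing. The helper name `includedeb_commands` and the placeholder path are assumptions; the reprepro arguments mirror the invocations shown above.

```python
import shlex


def includedeb_commands(deb_path, distributions, component="main"):
    """Build one reprepro includedeb command (as an argv list) per distribution."""
    return [
        ["sudo", "-i", "reprepro", "-C", component, "includedeb", dist, deb_path]
        for dist in distributions
    ]


# Placeholder path; the transcript above used `pwd` to build an absolute path.
deb = "/path/to/conda-analytics-0.0.36_amd64.deb"
commands = includedeb_commands(deb, ["bookworm-wikimedia", "bullseye-wikimedia"])

for argv in commands:
    # Print only; swap in subprocess.run(argv, check=True) to execute for real.
    print(shlex.join(argv))
```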

I'll push this out to the test cluster today.

I have deployed this new version to the hadoop-test cluster.

btullis@cumin1002:~$ sudo debdeploy deploy -u 2024-09-27-conda-analytics.yaml -s hadoop-test
Rolling out conda-analytics:
Library update, several services might need to be restarted

conda-analytics was updated: 0.0.35 -> 0.0.36
  an-test-coord1001.eqiad.wmnet,an-test-
master[1001-1002].eqiad.wmnet,an-test-worker[1001-1003].eqiad.wmnet (6
hosts)

These hosts are already up-to-date:
  an-test-client1002.eqiad.wmnet (1 hosts)

The package to be updated isn't installed on these hosts:
  an-test-ui1001.eqiad.wmnet (1 hosts)

If everything is OK over the weekend, then I'll upgrade production early next week.

BTullis updated the task description.
BTullis updated the task description.

Mentioned in SAL (#wikimedia-analytics) [2024-09-30T11:19:55Z] <btullis> rolling out conda-analytics v0.0.36 to production for T372417