Rebuild all hadoop packages for bullseye with different distribution suffix mechanism
Closed, Resolved (Public)

Description

Problem Statement

We originally built hadoop packages for bullseye in the following ticket: T310643: Build Bigtop 1.5 Hadoop packages for Bullseye
In doing so, we encountered a problem in that the bigtop build scripts create identically named packages for each distribution.
For example, the deb file for hive-hcatalog was named: hive-hcatalog_2.3.6-3_all.deb whether it was built for buster or bullseye.
This identical filename causes a conflict with reprepro, which is how we host our packages for local distribution on apt.wikimedia.org.

In order to try to work around this problem, we tried modifying the bigtop.bom to add a -deb11 suffix to the package version string.

Sadly, we have just discovered that this attempted workaround causes subtle problems with the generated packages, which may be difficult to detect. The original ticket description below explains how the first issue was detected.

Suggested fix

The suggested fix is to build a set of packages for bullseye using an unmodified bigtop.bom and then use the dpkg-repack tool to modify them in order to add the suffix.

This will be scripted to handle all of the packages that we need.

Original ticket description follows

We have upgraded one of our Hadoop workers to bullseye, but have discovered a problem with one of the packages.

This is the hive-hcatalog package, which is missing a vital set of symlinks in the /usr/lib/hive-hcatalog/share/hcatalog/ directory.

That directory on a buster host contains this:

btullis@an-test-worker1002:~$ ls -l /usr/lib/hive-hcatalog/share/hcatalog/
total 516
-rw-r--r-- 1 root root 264740 Jan  4  2022 hive-hcatalog-core-2.3.6.jar
lrwxrwxrwx 1 root root     28 Jan  4  2022 hive-hcatalog-core.jar -> hive-hcatalog-core-2.3.6.jar
-rw-r--r-- 1 root root  53963 Jan  4  2022 hive-hcatalog-pig-adapter-2.3.6.jar
lrwxrwxrwx 1 root root     35 Jan  4  2022 hive-hcatalog-pig-adapter.jar -> hive-hcatalog-pig-adapter-2.3.6.jar
-rw-r--r-- 1 root root  73711 Jan  4  2022 hive-hcatalog-server-extensions-2.3.6.jar
lrwxrwxrwx 1 root root     41 Jan  4  2022 hive-hcatalog-server-extensions.jar -> hive-hcatalog-server-extensions-2.3.6.jar
-rw-r--r-- 1 root root 128401 Jan  4  2022 hive-hcatalog-streaming-2.3.6.jar
lrwxrwxrwx 1 root root     33 Jan  4  2022 hive-hcatalog-streaming.jar -> hive-hcatalog-streaming-2.3.6.jar

On the bullseye host, those unversioned symlinks are missing:

btullis@an-test-worker1001:~$ ls -l /usr/lib/hive-hcatalog/share/hcatalog/
total 520
-rw-r--r-- 1 root root 264798 Aug 12  2022 hive-hcatalog-core-2.3.6.jar
-rw-r--r-- 1 root root  54023 Aug 12  2022 hive-hcatalog-pig-adapter-2.3.6.jar
-rw-r--r-- 1 root root  73772 Aug 12  2022 hive-hcatalog-server-extensions-2.3.6.jar
-rw-r--r-- 1 root root 128459 Aug 12  2022 hive-hcatalog-streaming-2.3.6.jar

The install_hive.sh script contains a section that was supposed to create those symlinks at the time of the package creation:

for DIR in ${HCATALOG_SHARE_DIR} ; do
    (cd $DIR &&
     for j in hive-hcatalog-*.jar; do
       if [[ $j =~ hive-hcatalog-(.*)-${HIVE_VERSION}.jar ]]; then
         name=${BASH_REMATCH[1]}
         ln -s $j hive-hcatalog-$name.jar
       fi
    done)
done

We need to understand why that script didn't work as expected and rebuild the package.
At the same time as fixing this issue, we need to:

  1. Be on the lookout for any other implications or occurrences of this issue about missing symlinks.
  2. Update the instructions on Wikitech for building our hadoop packages - Currently the best instructions are in T310643: Build Bigtop 1.5 Hadoop packages for Bullseye
  3. Refine the build process if necessary/possible
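
For item 1, one way to look for other missing symlinks is to compare the symlinks listed in a known-good package with those in a suspect build. The sketch below is only an illustration: the function names are hypothetical, and it assumes the listings were saved from `dpkg-deb --contents`, in the same format as the listings shown in this task.

```shell
#!/bin/bash
# Hypothetical helpers, not part of the bigtop build scripts.

# Read `dpkg-deb --contents` output on stdin and print one sorted
# "link -> target" line per symlink (entries whose mode starts with "l").
list_symlinks() {
    awk '$1 ~ /^l/ { print $6, $7, $8 }' | sort
}

# Print the symlinks present in listing file $1 but absent from listing file $2.
missing_symlinks() {
    local good bad
    good=$(mktemp)
    bad=$(mktemp)
    list_symlinks < "$1" > "$good"
    list_symlinks < "$2" > "$bad"
    comm -23 "$good" "$bad"
    rm -f "$good" "$bad"
}
```

A possible use would be to save `dpkg-deb --contents` output for the buster and bullseye builds of each package into two files and pass them to `missing_symlinks`; any line printed is a symlink that only the first package contains. Paths containing spaces are not handled.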

Details

Reference | Source Branch | Dest Branch | Author | Title
repos/data-engineering/bigtop!4 | fix_hive_build | branch-1.5 | btullis | Fix a problem with the hive builds
repos/data-engineering/bigtop!3 | fix_oozie_build | branch-1.5 | btullis | Configure oozie builds to use archiva
repos/data-engineering/bigtop!2 | fix_hbase_build | branch-1.5 | btullis | Update the version to slf4j-reload4j
repos/data-engineering/bigtop!1 | update_bigtop_1.5_build | branch-1.5 | btullis | Add a new build script to allow us to rebuild bigtop easily for WMF

Event Timeline

BTullis triaged this task as High priority.

Expediting this, since it will:

  1. block any further work on bullseye upgrades for the analytics cluster
  2. prevent any security patching or any other work involving upgrades to packages that we use from bigtop

OK, this is looking better now. I've added a patch to the hive component that allows us to use archiva's mirror for the missing jar files.
I built a version with this command:

docker run --rm  -v `pwd`:/ws --workdir /ws bigtop/slaves:1.5.0-debian-11 bash -c '. /etc/profile.d/bigtop.sh; ./gradlew allclean hive-pkg'

And the symlinks are present in the generated deb file.

btullis@marlin:~/wmf/bigtop$ dpkg-deb --contents output/hive/hive-hcatalog_2.3.6-3_all.deb |tail -n 5
lrwxrwxrwx root/root         0 2023-06-08 16:05 ./usr/lib/hive-hcatalog/etc/hcatalog -> /etc/hive-hcatalog/conf
lrwxrwxrwx root/root         0 2023-06-08 16:05 ./usr/lib/hive-hcatalog/share/hcatalog/hive-hcatalog-core.jar -> hive-hcatalog-core-2.3.6.jar
lrwxrwxrwx root/root         0 2023-06-08 16:05 ./usr/lib/hive-hcatalog/share/hcatalog/hive-hcatalog-pig-adapter.jar -> hive-hcatalog-pig-adapter-2.3.6.jar
lrwxrwxrwx root/root         0 2023-06-08 16:05 ./usr/lib/hive-hcatalog/share/hcatalog/hive-hcatalog-server-extensions.jar -> hive-hcatalog-server-extensions-2.3.6.jar
lrwxrwxrwx root/root         0 2023-06-08 16:05 ./usr/lib/hive-hcatalog/share/hcatalog/hive-hcatalog-streaming.jar -> hive-hcatalog-streaming-2.3.6.jar

The only thing is that I had temporarily reverted this change that modifies the package name to append -deb11.
This was originally done because of a conflict with reprepro hosting identically named files with different versions.

I will try again without this reversion to see if that might have caused the issue with the symlinks.

Confirmed, I built the package again with the patch to change the suffix in the name of the package and the symlinks are indeed missing again.

btullis@marlin:~/wmf/bigtop$ dpkg-deb --contents output/hive/hive-hcatalog_2.3.6-deb11-4_all.deb
drwxr-xr-x root/root         0 2023-06-08 16:49 ./
drwxr-xr-x root/root         0 2023-06-08 16:49 ./etc/
drwxr-xr-x root/root         0 2023-06-08 16:49 ./etc/hive-hcatalog/
drwxr-xr-x root/root         0 2023-06-08 16:49 ./etc/hive-hcatalog/conf.dist/
-rw-r--r-- root/root      1505 2019-08-12 23:04 ./etc/hive-hcatalog/conf.dist/jndi.properties
-rw-r--r-- root/root      4593 2019-08-12 23:04 ./etc/hive-hcatalog/conf.dist/proto-hive-site.xml
drwxr-xr-x root/root         0 2023-06-08 16:49 ./usr/
drwxr-xr-x root/root         0 2023-06-08 16:49 ./usr/bin/
-rwxr-xr-x root/root       471 2023-06-08 16:49 ./usr/bin/hcat
drwxr-xr-x root/root         0 2023-06-08 16:49 ./usr/lib/
drwxr-xr-x root/root         0 2023-06-08 16:49 ./usr/lib/hive-hcatalog/
drwxr-xr-x root/root         0 2023-06-08 16:49 ./usr/lib/hive-hcatalog/bin/
-rwxr-xr-x root/root      1608 2019-08-12 23:04 ./usr/lib/hive-hcatalog/bin/common.sh
-rwxr-xr-x root/root      5501 2019-08-12 23:06 ./usr/lib/hive-hcatalog/bin/hcat
-rwxr-xr-x root/root      5170 2019-08-12 23:04 ./usr/lib/hive-hcatalog/bin/hcat.py
-rwxr-xr-x root/root      3400 2019-08-12 23:04 ./usr/lib/hive-hcatalog/bin/hcatcfg.py
drwxr-xr-x root/root         0 2023-06-08 16:49 ./usr/lib/hive-hcatalog/etc/
drwxr-xr-x root/root         0 2023-06-08 16:49 ./usr/lib/hive-hcatalog/libexec/
-rwxr-xr-x root/root      2595 2019-08-12 23:04 ./usr/lib/hive-hcatalog/libexec/hcat-config.sh
drwxr-xr-x root/root         0 2023-06-08 16:49 ./usr/lib/hive-hcatalog/sbin/
-rwxr-xr-x root/root      5450 2019-08-12 23:04 ./usr/lib/hive-hcatalog/sbin/hcat_server.py
-rwxr-xr-x root/root      4569 2019-08-12 23:04 ./usr/lib/hive-hcatalog/sbin/hcat_server.sh
-rwxr-xr-x root/root      3400 2019-08-12 23:04 ./usr/lib/hive-hcatalog/sbin/hcatcfg.py
-rwxr-xr-x root/root     10013 2019-08-12 23:04 ./usr/lib/hive-hcatalog/sbin/update-hcatalog-env.sh
drwxr-xr-x root/root         0 2023-06-08 16:49 ./usr/lib/hive-hcatalog/share/
drwxr-xr-x root/root         0 2023-06-08 16:49 ./usr/lib/hive-hcatalog/share/hcatalog/
-rw-r--r-- root/root    264799 2023-06-08 16:49 ./usr/lib/hive-hcatalog/share/hcatalog/hive-hcatalog-core-2.3.6.jar
-rw-r--r-- root/root     54023 2023-06-08 16:49 ./usr/lib/hive-hcatalog/share/hcatalog/hive-hcatalog-pig-adapter-2.3.6.jar
-rw-r--r-- root/root     73773 2023-06-08 16:49 ./usr/lib/hive-hcatalog/share/hcatalog/hive-hcatalog-server-extensions-2.3.6.jar
-rw-r--r-- root/root    128458 2023-06-08 16:49 ./usr/lib/hive-hcatalog/share/hcatalog/hive-hcatalog-streaming-2.3.6.jar
drwxr-xr-x root/root         0 2023-06-08 16:49 ./usr/share/
drwxr-xr-x root/root         0 2023-06-08 16:49 ./usr/share/doc/
drwxr-xr-x root/root         0 2023-06-08 16:49 ./usr/share/doc/hive-hcatalog/
-rw-r--r-- root/root       133 2023-06-08 16:49 ./usr/share/doc/hive-hcatalog/changelog.Debian.gz
-rw-r--r-- root/root       409 2023-06-08 16:49 ./usr/share/doc/hive-hcatalog/copyright
drwxr-xr-x root/root         0 2023-06-08 16:49 ./usr/share/man/
drwxr-xr-x root/root         0 2023-06-08 16:49 ./usr/share/man/man1/
-rw-r--r-- root/root      1331 2023-06-08 16:49 ./usr/share/man/man1/hive-hcatalog.1.gz
drwxr-xr-x root/root         0 2023-06-08 16:49 ./var/
drwxr-xr-x root/root         0 2023-06-08 16:49 ./var/lib/
drwxr-xr-x root/root         0 2023-06-08 16:49 ./var/lib/hive-hcatalog/
drwxr-xr-x root/root         0 2023-06-08 16:49 ./var/log/
drwxr-xr-x root/root         0 2023-06-08 16:49 ./var/log/hive-hcatalog/
lrwxrwxrwx root/root         0 2023-06-08 16:49 ./usr/lib/hive-hcatalog/etc/hcatalog -> /etc/hive-hcatalog/conf
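
One plausible explanation, which this task does not confirm, is that the `-deb11` suffix from bigtop.bom ends up in `HIVE_VERSION`, so the pattern in install_hive.sh no longer matches jars that are still named after the upstream version. The sketch below reuses the same regular expression in a hypothetical helper, `hcatalog_link_name`, to show how the match behaves for the two values.

```shell
#!/bin/bash
# Hypothetical helper: print the unversioned link name that the loop in
# install_hive.sh would create for jar $1 when HIVE_VERSION is $2.
# Prints nothing and returns 1 when the pattern does not match.
hcatalog_link_name() {
    local j=$1 HIVE_VERSION=$2
    if [[ $j =~ hive-hcatalog-(.*)-${HIVE_VERSION}.jar ]]; then
        echo "hive-hcatalog-${BASH_REMATCH[1]}.jar"
    else
        return 1
    fi
}

# Unmodified version string: the jar name matches.
hcatalog_link_name hive-hcatalog-core-2.3.6.jar 2.3.6
# Assumed suffixed version string: no match, so no symlink would be created.
hcatalog_link_name hive-hcatalog-core-2.3.6.jar 2.3.6-deb11 || echo "no match"
```

Because the original loop has no `else` branch, a non-matching pattern fails silently, which would be consistent with a build that completes normally but produces a package without the symlinks.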

So now I've got to work out how best to proceed.

Maybe there's a different way that I can add the suffix to the filenames, other than modifying bigtop.bom.

BTullis renamed this task from "Rebuild hive-hcatalog package for bullseye to address missing symlinks" to "Rebuild all hadoop packages for bullseye with different distribution suffix mechanism". (Jun 9 2023, 12:30 PM)
BTullis updated the task description.

I think I've made good progress on this now. After discussions with @MoritzMuehlenhoff I have settled on a method of extracting each package, modifying its metadata to append the distribution suffix, then repacking it.

Here is a script to build all of our bigtop components and carry out this post-build step on each package.

There is also a simple wrapper script that performs the build for both buster and bullseye.

I'm waiting for this combined build to complete successfully, then I'll merge it to our bigtop 1.5 branch.
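
The extract, modify and repack step described in this comment can be sketched roughly as follows. This is not the script referenced in this task: the function names and the suffix format are assumptions, and only the `Version:` field is changed here. `dpkg-deb -R` unpacks a package together with its `DEBIAN/` control directory, and `dpkg-deb -b` builds a package from such a tree.

```shell
#!/bin/bash
# Hypothetical sketch; names and suffix format are assumptions.

# Append suffix $2 to the Version: field of the control file $1.
# The suffix must not contain characters that are special to sed ("/", "&").
append_version_suffix() {
    local control=$1 suffix=$2 tmp
    tmp=$(mktemp)
    sed -E "s/^(Version: .*)\$/\1${suffix}/" "$control" > "$tmp" && cat "$tmp" > "$control"
    rm -f "$tmp"
}

# Unpack the package $1, suffix its version with $2, and rebuild it into the
# existing directory $3. Run as root or under fakeroot so that file
# ownership inside the package is preserved.
repack_with_suffix() {
    local deb=$1 suffix=$2 outdir=$3 workdir
    workdir=$(mktemp -d)
    dpkg-deb -R "$deb" "$workdir/pkg" &&
        append_version_suffix "$workdir/pkg/DEBIAN/control" "$suffix" &&
        dpkg-deb -b "$workdir/pkg" "$outdir"
    local status=$?
    rm -rf "$workdir"
    return "$status"
}
```

For example, `repack_with_suffix output/hive/hive-hcatalog_2.3.6-3_all.deb deb11 repackaged/deb11` would produce a package whose version is `2.3.6-3deb11`. Packages that pin each other to exact versions need their `Depends:` fields updated as well, which this sketch does not do.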

btullis opened https://gitlab.wikimedia.org/repos/data-engineering/bigtop/-/merge_requests/4

Draft: Add some build scripts to allow us to build bigtop for WMF easily

I've now got a successful rebuild of all of bigtop for buster and bullseye, using this new script. I'll start adding them to apt.wikimedia.org tomorrow.

btullis@marlin:~/wmf/bigtop$ ./build_all_bigtop_distros_wmf.sh
<snip snip>
btullis@marlin:~/wmf/bigtop$ tree repackaged/
repackaged/
├── deb10
│   ├── bigtop-groovy_2.5.4-1_all-deb10.deb
│   ├── bigtop-jsvc_1.0.15-1_amd64-deb10.deb
│   ├── bigtop-tomcat_8.5.57-1_all-deb10.deb
│   ├── bigtop-utils_1.5.0-1_all-deb10.deb
│   ├── hadoop_2.10.2-1_amd64-deb10.deb
│   ├── hadoop-client_2.10.2-1_amd64-deb10.deb
│   ├── hadoop-conf-pseudo_2.10.2-1_amd64-deb10.deb
│   ├── hadoop-doc_2.10.2-1_all-deb10.deb
│   ├── hadoop-hdfs_2.10.2-1_amd64-deb10.deb
│   ├── hadoop-hdfs-datanode_2.10.2-1_amd64-deb10.deb
│   ├── hadoop-hdfs-fuse_2.10.2-1_amd64-deb10.deb
│   ├── hadoop-hdfs-journalnode_2.10.2-1_amd64-deb10.deb
│   ├── hadoop-hdfs-namenode_2.10.2-1_amd64-deb10.deb
│   ├── hadoop-hdfs-secondarynamenode_2.10.2-1_amd64-deb10.deb
│   ├── hadoop-hdfs-zkfc_2.10.2-1_amd64-deb10.deb
│   ├── hadoop-httpfs_2.10.2-1_amd64-deb10.deb
│   ├── hadoop-kms_2.10.2-1_amd64-deb10.deb
│   ├── hadoop-mapreduce_2.10.2-1_amd64-deb10.deb
│   ├── hadoop-mapreduce-historyserver_2.10.2-1_amd64-deb10.deb
│   ├── hadoop-yarn_2.10.2-1_amd64-deb10.deb
│   ├── hadoop-yarn-nodemanager_2.10.2-1_amd64-deb10.deb
│   ├── hadoop-yarn-proxyserver_2.10.2-1_amd64-deb10.deb
│   ├── hadoop-yarn-resourcemanager_2.10.2-1_amd64-deb10.deb
│   ├── hadoop-yarn-timelineserver_2.10.2-1_amd64-deb10.deb
│   ├── hbase_1.5.0-1_amd64-deb10.deb
│   ├── hbase-doc_1.5.0-1_all-deb10.deb
│   ├── hbase-master_1.5.0-1_all-deb10.deb
│   ├── hbase-regionserver_1.5.0-1_all-deb10.deb
│   ├── hbase-rest_1.5.0-1_all-deb10.deb
│   ├── hbase-thrift_1.5.0-1_all-deb10.deb
│   ├── hive_2.3.6-3_all-deb10.deb
│   ├── hive-hbase_2.3.6-3_all-deb10.deb
│   ├── hive-hcatalog_2.3.6-3_all-deb10.deb
│   ├── hive-hcatalog-server_2.3.6-3_all-deb10.deb
│   ├── hive-jdbc_2.3.6-3_all-deb10.deb
│   ├── hive-metastore_2.3.6-3_all-deb10.deb
│   ├── hive-server2_2.3.6-3_all-deb10.deb
│   ├── hive-webhcat_2.3.6-3_all-deb10.deb
│   ├── hive-webhcat-server_2.3.6-3_all-deb10.deb
│   ├── libhdfs0_2.10.2-1_amd64-deb10.deb
│   ├── libhdfs0-dev_2.10.2-1_amd64-deb10.deb
│   ├── mahout_0.13.0-1_all-deb10.deb
│   ├── mahout-doc_0.13.0-1_all-deb10.deb
│   ├── oozie_4.3.0-2_all-deb10.deb
│   ├── oozie-client_4.3.0-2_all-deb10.deb
│   ├── solr_6.6.6-1_all-deb10.deb
│   ├── solr-doc_6.6.6-1_all-deb10.deb
│   ├── solr-server_6.6.6-1_all-deb10.deb
│   ├── spark-core_2.4.5-1_all-deb10.deb
│   ├── spark-datanucleus_2.4.5-1_all-deb10.deb
│   ├── spark-external_2.4.5-1_all-deb10.deb
│   ├── spark-history-server_2.4.5-1_all-deb10.deb
│   ├── spark-master_2.4.5-1_all-deb10.deb
│   ├── spark-python_2.4.5-1_all-deb10.deb
│   ├── spark-sparkr_2.4.5-1_all-deb10.deb
│   ├── spark-thriftserver_2.4.5-1_all-deb10.deb
│   ├── spark-worker_2.4.5-1_all-deb10.deb
│   ├── spark-yarn-shuffle_2.4.5-1_all-deb10.deb
│   ├── sqoop_1.4.6-1_all-deb10.deb
│   ├── sqoop2_1.99.4-1_all-deb10.deb
│   ├── sqoop2-client_1.99.4-1_all-deb10.deb
│   ├── sqoop2-server_1.99.4-1_all-deb10.deb
│   └── sqoop-metastore_1.4.6-1_all-deb10.deb
└── deb11
    ├── bigtop-groovy_2.5.4-1_all-deb11.deb
    ├── bigtop-jsvc_1.0.15-1_amd64-deb11.deb
    ├── bigtop-tomcat_8.5.57-1_all-deb11.deb
    ├── bigtop-utils_1.5.0-1_all-deb11.deb
    ├── hadoop_2.10.2-1_amd64-deb11.deb
    ├── hadoop-client_2.10.2-1_amd64-deb11.deb
    ├── hadoop-conf-pseudo_2.10.2-1_amd64-deb11.deb
    ├── hadoop-doc_2.10.2-1_all-deb11.deb
    ├── hadoop-hdfs_2.10.2-1_amd64-deb11.deb
    ├── hadoop-hdfs-datanode_2.10.2-1_amd64-deb11.deb
    ├── hadoop-hdfs-fuse_2.10.2-1_amd64-deb11.deb
    ├── hadoop-hdfs-journalnode_2.10.2-1_amd64-deb11.deb
    ├── hadoop-hdfs-namenode_2.10.2-1_amd64-deb11.deb
    ├── hadoop-hdfs-secondarynamenode_2.10.2-1_amd64-deb11.deb
    ├── hadoop-hdfs-zkfc_2.10.2-1_amd64-deb11.deb
    ├── hadoop-httpfs_2.10.2-1_amd64-deb11.deb
    ├── hadoop-kms_2.10.2-1_amd64-deb11.deb
    ├── hadoop-mapreduce_2.10.2-1_amd64-deb11.deb
    ├── hadoop-mapreduce-historyserver_2.10.2-1_amd64-deb11.deb
    ├── hadoop-yarn_2.10.2-1_amd64-deb11.deb
    ├── hadoop-yarn-nodemanager_2.10.2-1_amd64-deb11.deb
    ├── hadoop-yarn-proxyserver_2.10.2-1_amd64-deb11.deb
    ├── hadoop-yarn-resourcemanager_2.10.2-1_amd64-deb11.deb
    ├── hadoop-yarn-timelineserver_2.10.2-1_amd64-deb11.deb
    ├── hbase_1.5.0-1_amd64-deb11.deb
    ├── hbase-doc_1.5.0-1_all-deb11.deb
    ├── hbase-master_1.5.0-1_all-deb11.deb
    ├── hbase-regionserver_1.5.0-1_all-deb11.deb
    ├── hbase-rest_1.5.0-1_all-deb11.deb
    ├── hbase-thrift_1.5.0-1_all-deb11.deb
    ├── hive_2.3.6-3_all-deb11.deb
    ├── hive-hbase_2.3.6-3_all-deb11.deb
    ├── hive-hcatalog_2.3.6-3_all-deb11.deb
    ├── hive-hcatalog-server_2.3.6-3_all-deb11.deb
    ├── hive-jdbc_2.3.6-3_all-deb11.deb
    ├── hive-metastore_2.3.6-3_all-deb11.deb
    ├── hive-server2_2.3.6-3_all-deb11.deb
    ├── hive-webhcat_2.3.6-3_all-deb11.deb
    ├── hive-webhcat-server_2.3.6-3_all-deb11.deb
    ├── libhdfs0_2.10.2-1_amd64-deb11.deb
    ├── libhdfs0-dev_2.10.2-1_amd64-deb11.deb
    ├── mahout_0.13.0-1_all-deb11.deb
    ├── mahout-doc_0.13.0-1_all-deb11.deb
    ├── oozie_4.3.0-2_all-deb11.deb
    ├── oozie-client_4.3.0-2_all-deb11.deb
    ├── solr_6.6.6-1_all-deb11.deb
    ├── solr-doc_6.6.6-1_all-deb11.deb
    ├── solr-server_6.6.6-1_all-deb11.deb
    ├── spark-core_2.4.5-1_all-deb11.deb
    ├── spark-datanucleus_2.4.5-1_all-deb11.deb
    ├── spark-external_2.4.5-1_all-deb11.deb
    ├── spark-history-server_2.4.5-1_all-deb11.deb
    ├── spark-master_2.4.5-1_all-deb11.deb
    ├── spark-python_2.4.5-1_all-deb11.deb
    ├── spark-sparkr_2.4.5-1_all-deb11.deb
    ├── spark-thriftserver_2.4.5-1_all-deb11.deb
    ├── spark-worker_2.4.5-1_all-deb11.deb
    ├── spark-yarn-shuffle_2.4.5-1_all-deb11.deb
    ├── sqoop_1.4.6-1_all-deb11.deb
    ├── sqoop2_1.99.4-1_all-deb11.deb
    ├── sqoop2-client_1.99.4-1_all-deb11.deb
    ├── sqoop2-server_1.99.4-1_all-deb11.deb
    └── sqoop-metastore_1.4.6-1_all-deb11.deb

3 directories, 126 files
(base) btullis@marlin:~/wmf/bigtop$ du -sh repackaged/
3.8G	repackaged/

I noticed that there were some packages incorrectly added for i386 in hadoop, so I removed them:

sudo -i reprepro -C thirdparty/bigtop15 -A i386 list bullseye-wikimedia | awk -F ":" '{print $2}'| awk -F " " '{print $1}' > i386-packages.txt
for p in $(cat i386-packages.txt); do sudo -i reprepro -C thirdparty/bigtop15 -A i386 remove bullseye-wikimedia $p; done

I have to remove the existing packages from reprepro, since the change in the suffix is interpreted as a downgrade.
Here's an example with the bigtop-groovy package.

btullis@apt1001:~/deb11$ sudo -i reprepro list bullseye-wikimedia bigtop-groovy
bullseye-wikimedia|thirdparty/bigtop15|amd64: bigtop-groovy 2.5.4-deb11-1

btullis@apt1001:~/deb11$ sudo -i reprepro -C thirdparty/bigtop15 -A amd64 includedeb bullseye-wikimedia `pwd`/bigtop-groovy_2.5.4-1_all-deb11.deb
Skipping inclusion of 'bigtop-groovy' '2.5.4-1deb11' in 'bullseye-wikimedia|thirdparty/bigtop15|amd64', as it has already '2.5.4-deb11-1'.
Deleting files just added to the pool but not used.
(to avoid use --keepunusednewfiles next time)

btullis@apt1001:~/deb11$ sudo -i reprepro -C thirdparty/bigtop15 -A amd64 remove bullseye-wikimedia bigtop-groovy
Exporting indices...
Deleting files no longer referenced...

btullis@apt1001:~/deb11$ sudo -i reprepro -C thirdparty/bigtop15 -A amd64 includedeb bullseye-wikimedia `pwd`/bigtop-groovy_2.5.4-1_all-deb11.deb
Exporting indices...

btullis@apt1001:~/deb11$ sudo -i reprepro list bullseye-wikimedia bigtop-groovy
bullseye-wikimedia|thirdparty/bigtop15|amd64: bigtop-groovy 2.5.4-1deb11
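
The refusal makes sense given how Debian version strings are split: the revision is whatever follows the last hyphen, and everything before it is the upstream version. The sketch below uses hypothetical helper names and ignores epochs; it only shows the split for the two strings above.

```shell
#!/bin/bash
# Hypothetical helpers illustrating the Debian upstream/revision split.
# Epochs ("N:" prefixes) are not handled.
upstream_of() { case $1 in *-*) echo "${1%-*}" ;; *) echo "$1" ;; esac; }
revision_of() { case $1 in *-*) echo "${1##*-}" ;; *) echo "" ;; esac; }

for v in 2.5.4-deb11-1 2.5.4-1deb11; do
    echo "$v: upstream=$(upstream_of "$v") revision=$(revision_of "$v")"
done
```

The upstream parts are compared first, and `2.5.4` sorts before `2.5.4-deb11`, so the repacked version is the lower of the two. On a Debian host this can be checked with `dpkg --compare-versions 2.5.4-1deb11 lt 2.5.4-deb11-1`, which should exit with status 0.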

On the hosts where this is installed it is also interpreted as a downgrade, but that's OK.

btullis@an-test-worker1001:~$ apt-cache policy bigtop-groovy
bigtop-groovy:
  Installed: 2.5.4-deb11-1
  Candidate: 2.5.4-1deb11
  Version table:
 *** 2.5.4-deb11-1 100
        100 /var/lib/dpkg/status
     2.5.4-1deb11 1001
       1001 http://apt.wikimedia.org/wikimedia bullseye-wikimedia/thirdparty/bigtop15 amd64 Packages
btullis@an-test-worker1001:~$ sudo apt install bigtop-groovy
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
The following packages will be DOWNGRADED:
  bigtop-groovy
0 upgraded, 0 newly installed, 1 downgraded, 0 to remove and 11 not upgraded.
Need to get 4,847 kB of archives.
After this operation, 0 B of additional disk space will be used.
Do you want to continue? [Y/n]

I'm going to proceed, but I'm only going to change the packages for bullseye at the moment.

Removed existing bullseye packages with:

for p in $(sudo -i reprepro -C thirdparty/bigtop15 -A amd64 list bullseye-wikimedia | awk -F ":" '{print $2}'| awk -F " " '{print $1}'); do sudo -i reprepro -C thirdparty/bigtop15 -A amd64 remove bullseye-wikimedia $p;done

Added new packages with:

for p in $(ls *); do sudo -i reprepro -C thirdparty/bigtop15 -A amd64 includedeb bullseye-wikimedia `pwd`/$p; done
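
The two loops above could be made a little more defensive. The sketch below is illustrative only: the helper names are hypothetical, the listing format is taken from the `reprepro list` output shown earlier in this task, and the second helper only prints the commands rather than running them.

```shell
#!/bin/bash
# Read `reprepro list` output on stdin, for example
#   bullseye-wikimedia|thirdparty/bigtop15|amd64: bigtop-groovy 2.5.4-deb11-1
# and print each package name once.
reprepro_package_names() {
    awk '{ print $2 }' | sort -u
}

# Print (do not run) one includedeb command per .deb file in directory $1.
# Globbing on *.deb avoids parsing the output of ls.
print_includedeb_commands() {
    local dir=$1 deb
    for deb in "$dir"/*.deb; do
        [ -e "$deb" ] || continue   # the directory contains no .deb files
        echo "reprepro -C thirdparty/bigtop15 -A amd64 includedeb bullseye-wikimedia $deb"
    done
}
```

Printing the commands first gives a chance to review exactly what would be added before anything in the repository changes.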

Ah, it was going so well. I tried to downgrade the packages on an-test-worker1001, but there are dependency problems.

It looks like I will have to modify more values in the control file.

btullis@an-test-worker1001:~$ dpkg -l |egrep "\-deb11"|awk '{print $2}'|xargs sudo apt install
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
Some packages could not be installed. This may mean that you have
requested an impossible situation or if you are using the unstable
distribution that some required packages have not yet been created
or been moved out of Incoming.
The following information may help to resolve the situation:

The following packages have unmet dependencies:
 hadoop-client : Depends: hadoop (= 2.10.2-1) but 2.10.2-1deb11 is to be installed
                 Depends: hadoop-hdfs (= 2.10.2-1) but 2.10.2-1deb11 is to be installed
                 Depends: hadoop-yarn (= 2.10.2-1) but 2.10.2-1deb11 is to be installed
                 Depends: hadoop-mapreduce (= 2.10.2-1) but 2.10.2-1deb11 is to be installed
 hadoop-hdfs : Depends: hadoop (= 2.10.2-1) but 2.10.2-1deb11 is to be installed
 hadoop-hdfs-datanode : Depends: hadoop-hdfs (= 2.10.2-1) but 2.10.2-1deb11 is to be installed
 hadoop-hdfs-journalnode : Depends: hadoop-hdfs (= 2.10.2-1) but 2.10.2-1deb11 is to be installed
 hadoop-mapreduce : Depends: hadoop-yarn (= 2.10.2-1) but 2.10.2-1deb11 is to be installed
 hadoop-yarn : Depends: hadoop (= 2.10.2-1) but 2.10.2-1deb11 is to be installed
 hadoop-yarn-nodemanager : Depends: hadoop-yarn (= 2.10.2-1) but 2.10.2-1deb11 is to be installed
 hive : Depends: hive-jdbc (= 2.3.6-3) but 2.3.6-3deb11 is to be installed
 libhdfs0 : Depends: hadoop (= 2.10.2-1) but 2.10.2-1deb11 is to be installed
E: Unable to correct problems, you have held broken packages.

Some progress, but it's still not quite right. If we look at one of the packages, we can see that one reference to a package in the Depends: field has been correctly updated, but the second one hasn't.

image.png (508×981 px, 104 KB)

I'll keep iterating on it, but I don't think it's far off now.
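
One possible cause of a half-updated `Depends:` line, which is not confirmed here, is a substitution that only replaces the first match on each line. The sketch below is a hypothetical helper, not the actual build script, that appends a suffix to every exact-version pin of the form `(= VERSION)` in a control file by using sed's `g` flag.

```shell
#!/bin/bash
# Hypothetical helper: append suffix $2 to every "(= VERSION)" pin in the
# control file $1. The trailing "g" makes sed rewrite all pins on a line,
# not just the first one. Other relations such as "(>= VERSION)" are left
# untouched.
suffix_exact_pins() {
    local control=$1 suffix=$2 tmp
    tmp=$(mktemp)
    sed -E "s/\(= ([^)]+)\)/(= \1${suffix})/g" "$control" > "$tmp" && cat "$tmp" > "$control"
    rm -f "$tmp"
}
```

Note that this rewrites every exact pin it finds, so it is only appropriate when all exactly pinned dependencies are themselves packages that receive the same suffix.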

I've got another set of packages that I'm happy with now and an upgrade on an-test-worker1001 was seemingly successful.

btullis@an-test-worker1001:~$ dpkg -l |egrep "\-deb11"|awk '{print $2}'|xargs sudo apt install
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
The following packages will be upgraded:
  bigtop-groovy bigtop-jsvc bigtop-utils hadoop hadoop-client hadoop-hdfs hadoop-hdfs-datanode hadoop-hdfs-journalnode hadoop-mapreduce hadoop-yarn hadoop-yarn-nodemanager hive hive-hcatalog hive-jdbc libhdfs0
  sqoop
16 upgraded, 0 newly installed, 0 to remove and 3 not upgraded.
Need to get 501 MB of archives.
After this operation, 33.4 MB of additional disk space will be used.
Get:1 http://apt.wikimedia.org/wikimedia bullseye-wikimedia/thirdparty/bigtop15 amd64 bigtop-utils all 1.5.0-deb11-2 [4,296 B]
Get:2 http://apt.wikimedia.org/wikimedia bullseye-wikimedia/thirdparty/bigtop15 amd64 bigtop-groovy all 2.5.4-deb11-2 [4,847 kB]
Get:3 http://apt.wikimedia.org/wikimedia bullseye-wikimedia/thirdparty/bigtop15 amd64 bigtop-jsvc amd64 1.0.15-deb11-2 [26.5 kB]
Get:4 http://apt.wikimedia.org/wikimedia bullseye-wikimedia/thirdparty/bigtop15 amd64 hadoop-hdfs-journalnode amd64 2.10.2-deb11-2 [4,108 B]
Get:5 http://apt.wikimedia.org/wikimedia bullseye-wikimedia/thirdparty/bigtop15 amd64 hadoop-hdfs-datanode amd64 2.10.2-deb11-2 [4,132 B]
Get:6 http://apt.wikimedia.org/wikimedia bullseye-wikimedia/thirdparty/bigtop15 amd64 hadoop-client amd64 2.10.2-deb11-2 [3,452 B]
Get:7 http://apt.wikimedia.org/wikimedia bullseye-wikimedia/thirdparty/bigtop15 amd64 hadoop-hdfs amd64 2.10.2-deb11-2 [30.4 MB]
Get:8 http://apt.wikimedia.org/wikimedia bullseye-wikimedia/thirdparty/bigtop15 amd64 hadoop-yarn-nodemanager amd64 2.10.2-deb11-2 [4,084 B]
Get:9 http://apt.wikimedia.org/wikimedia bullseye-wikimedia/thirdparty/bigtop15 amd64 hadoop-mapreduce amd64 2.10.2-deb11-2 [123 MB]
Get:10 http://apt.wikimedia.org/wikimedia bullseye-wikimedia/thirdparty/bigtop15 amd64 hadoop-yarn amd64 2.10.2-deb11-2 [49.7 MB]
Get:11 http://apt.wikimedia.org/wikimedia bullseye-wikimedia/thirdparty/bigtop15 amd64 libhdfs0 amd64 2.10.2-deb11-2 [27.1 kB]
Get:12 http://apt.wikimedia.org/wikimedia bullseye-wikimedia/thirdparty/bigtop15 amd64 hadoop amd64 2.10.2-deb11-2 [29.2 MB]
Get:13 http://apt.wikimedia.org/wikimedia bullseye-wikimedia/thirdparty/bigtop15 amd64 hive all 2.3.6-deb11-4 [194 MB]
Get:14 http://apt.wikimedia.org/wikimedia bullseye-wikimedia/thirdparty/bigtop15 amd64 hive-jdbc all 2.3.6-deb11-4 [58.1 MB]
Get:15 http://apt.wikimedia.org/wikimedia bullseye-wikimedia/thirdparty/bigtop15 amd64 hive-hcatalog all 2.3.6-deb11-4 [481 kB]
Get:16 http://apt.wikimedia.org/wikimedia bullseye-wikimedia/thirdparty/bigtop15 amd64 sqoop all 1.4.6-deb11-2 [11.4 MB]
Fetched 501 MB in 6s (77.8 MB/s)
su: warning: cannot change directory to /nonexistent: No such file or directory
INFO:debmonitor:Got 16 updates from dpkg hook version 3
INFO:debmonitor:Successfully sent the dpkg_hook update to the DebMonitor server
(Reading database ... 176627 files and directories currently installed.)
Preparing to unpack .../00-bigtop-utils_1.5.0-deb11-2_all.deb ...
Unpacking bigtop-utils (1.5.0-deb11-2) over (1.5.0-deb11-1) ...
Preparing to unpack .../01-bigtop-groovy_2.5.4-deb11-2_all.deb ...
Unpacking bigtop-groovy (2.5.4-deb11-2) over (2.5.4-deb11-1) ...
Preparing to unpack .../02-bigtop-jsvc_1.0.15-deb11-2_amd64.deb ...
Unpacking bigtop-jsvc (1.0.15-deb11-2) over (1.0.15-deb11-1) ...
Preparing to unpack .../03-hadoop-hdfs-journalnode_2.10.2-deb11-2_amd64.deb ...
Unpacking hadoop-hdfs-journalnode (2.10.2-deb11-2) over (2.10.2-deb11-1) ...
Preparing to unpack .../04-hadoop-hdfs-datanode_2.10.2-deb11-2_amd64.deb ...
Unpacking hadoop-hdfs-datanode (2.10.2-deb11-2) over (2.10.2-deb11-1) ...
Preparing to unpack .../05-hadoop-client_2.10.2-deb11-2_amd64.deb ...
Unpacking hadoop-client (2.10.2-deb11-2) over (2.10.2-deb11-1) ...
Preparing to unpack .../06-hadoop-hdfs_2.10.2-deb11-2_amd64.deb ...
Unpacking hadoop-hdfs (2.10.2-deb11-2) over (2.10.2-deb11-1) ...
Preparing to unpack .../07-hadoop-yarn-nodemanager_2.10.2-deb11-2_amd64.deb ...
Unpacking hadoop-yarn-nodemanager (2.10.2-deb11-2) over (2.10.2-deb11-1) ...
Preparing to unpack .../08-hadoop-mapreduce_2.10.2-deb11-2_amd64.deb ...
Unpacking hadoop-mapreduce (2.10.2-deb11-2) over (2.10.2-deb11-1) ...
Preparing to unpack .../09-hadoop-yarn_2.10.2-deb11-2_amd64.deb ...
Unpacking hadoop-yarn (2.10.2-deb11-2) over (2.10.2-deb11-1) ...
Preparing to unpack .../10-libhdfs0_2.10.2-deb11-2_amd64.deb ...
Unpacking libhdfs0 (2.10.2-deb11-2) over (2.10.2-deb11-1) ...
Preparing to unpack .../11-hadoop_2.10.2-deb11-2_amd64.deb ...
Unpacking hadoop (2.10.2-deb11-2) over (2.10.2-deb11-1) ...
Preparing to unpack .../12-hive_2.3.6-deb11-4_all.deb ...
Unpacking hive (2.3.6-deb11-4) over (2.3.6-deb11-3) ...
Preparing to unpack .../13-hive-jdbc_2.3.6-deb11-4_all.deb ...
Unpacking hive-jdbc (2.3.6-deb11-4) over (2.3.6-deb11-3) ...
Preparing to unpack .../14-hive-hcatalog_2.3.6-deb11-4_all.deb ...
Unpacking hive-hcatalog (2.3.6-deb11-4) over (2.3.6-deb11-3) ...
Preparing to unpack .../15-sqoop_1.4.6-deb11-2_all.deb ...
Unpacking sqoop (1.4.6-deb11-2) over (1.4.6-deb11-1) ...
Setting up bigtop-utils (1.5.0-deb11-2) ...
Setting up bigtop-groovy (2.5.4-deb11-2) ...
Setting up hadoop (2.10.2-deb11-2) ...
Setting up bigtop-jsvc (1.0.15-deb11-2) ...
Setting up hadoop-yarn (2.10.2-deb11-2) ...
Setting up hadoop-hdfs (2.10.2-deb11-2) ...
Setting up hadoop-mapreduce (2.10.2-deb11-2) ...
Setting up hadoop-yarn-nodemanager (2.10.2-deb11-2) ...
Job for hadoop-yarn-nodemanager.service failed because the control process exited with error code.
See "systemctl status hadoop-yarn-nodemanager.service" and "journalctl -xe" for details.
invoke-rc.d: initscript hadoop-yarn-nodemanager, action "start" failed.
● hadoop-yarn-nodemanager.service - LSB: Hadoop nodemanager
     Loaded: loaded (/etc/init.d/hadoop-yarn-nodemanager; generated)
    Drop-In: /etc/systemd/system/hadoop-yarn-nodemanager.service.d
             └─puppet-override.conf
     Active: failed (Result: exit-code) since Thu 2023-06-22 15:59:58 UTC; 12ms ago
       Docs: man:systemd-sysv-generator(8)
    Process: 3944072 ExecStart=/etc/init.d/hadoop-yarn-nodemanager start (code=exited, status=1/FAILURE)
        CPU: 16.738s

Jun 22 15:59:52 an-test-worker1001 systemd[1]: Starting LSB: Hadoop nodemanager...
Jun 22 15:59:52 an-test-worker1001 runuser[3944093]: pam_unix(runuser:session): session opened for user yarn(uid=904) by (uid=0)
Jun 22 15:59:52 an-test-worker1001 hadoop-yarn-nodemanager[3944094]: starting nodemanager, logging to /var/log/hadoop-yarn/yarn-yarn-nodemanager-an-test-worker1001.out
Jun 22 15:59:53 an-test-worker1001 runuser[3944093]: pam_unix(runuser:session): session closed for user yarn
Jun 22 15:59:58 an-test-worker1001 hadoop-yarn-nodemanager[3944072]: Failed to start Hadoop nodemanager. Return value: 1 ...
Jun 22 15:59:58 an-test-worker1001 hadoop-yarn-nodemanager[3944526]:  failed!
Jun 22 15:59:58 an-test-worker1001 systemd[1]: hadoop-yarn-nodemanager.service: Control process exited, code=exited, status=1/FAILURE
Jun 22 15:59:58 an-test-worker1001 systemd[1]: hadoop-yarn-nodemanager.service: Failed with result 'exit-code'.
Jun 22 15:59:58 an-test-worker1001 systemd[1]: Failed to start LSB: Hadoop nodemanager.
Jun 22 15:59:58 an-test-worker1001 systemd[1]: hadoop-yarn-nodemanager.service: Consumed 16.738s CPU time.
Setting up hadoop-hdfs-datanode (2.10.2-deb11-2) ...
Setting up hadoop-client (2.10.2-deb11-2) ...
Setting up sqoop (1.4.6-deb11-2) ...
update-alternatives: using /etc/sqoop/conf.dist to provide /etc/sqoop/conf (sqoop-conf) in auto mode
Setting up libhdfs0 (2.10.2-deb11-2) ...
Setting up hadoop-hdfs-journalnode (2.10.2-deb11-2) ...
Setting up hive-jdbc (2.3.6-deb11-4) ...
Setting up hive (2.3.6-deb11-4) ...
Setting up hive-hcatalog (2.3.6-deb11-4) ...
update-alternatives: using /etc/hive-hcatalog/conf.dist to provide /etc/hive-hcatalog/conf (hive-hcatalog-conf) in auto mode
Processing triggers for man-db (2.9.4-2) ...
Processing triggers for libc-bin (2.31-13+deb11u6) ...

The failure of the hadoop-yarn-nodemanager.service is expected, because this host is currently excluded from YARN. I'll add it back in now.

The symlinks are present in the hive-hcatalog package, where we expect to find them:

btullis@an-test-worker1001:~$ ls -l /usr/lib/hive-hcatalog/share/hcatalog/
total 520
-rw-r--r-- 1 root root 266097 Jun 22 12:18 hive-hcatalog-core-2.3.6.jar
lrwxrwxrwx 1 root root     28 Jun 22 12:18 hive-hcatalog-core.jar -> hive-hcatalog-core-2.3.6.jar
-rw-r--r-- 1 root root  55326 Jun 22 12:18 hive-hcatalog-pig-adapter-2.3.6.jar
lrwxrwxrwx 1 root root     35 Jun 22 12:18 hive-hcatalog-pig-adapter.jar -> hive-hcatalog-pig-adapter-2.3.6.jar
-rw-r--r-- 1 root root  75056 Jun 22 12:18 hive-hcatalog-server-extensions-2.3.6.jar
lrwxrwxrwx 1 root root     41 Jun 22 12:18 hive-hcatalog-server-extensions.jar -> hive-hcatalog-server-extensions-2.3.6.jar
-rw-r--r-- 1 root root 129745 Jun 22 12:18 hive-hcatalog-streaming-2.3.6.jar
lrwxrwxrwx 1 root root     33 Jun 22 12:18 hive-hcatalog-streaming.jar -> hive-hcatalog-streaming-2.3.6.jar

I believe that this is now complete. I've merged the build script into our branch and created some instructions here: https://wikitech.wikimedia.org/wiki/Analytics/Cluster/Bigtop_Packages
The single Hadoop worker that we have on bullseye has been running YARN jobs successfully since yesterday, and puppet runs cleanly.

Hopefully this unblocks a big part of upgrading Hadoop to bullseye.

Mentioned in SAL (#wikimedia-analytics) [2023-09-26T10:11:23Z] <btullis> running 'dpkg -l |egrep "\-deb11"|awk '{print $2}'|xargs sudo apt install` on an-test-client1002 for T337465

Mentioned in SAL (#wikimedia-analytics) [2023-09-26T10:28:12Z] <btullis> upgrading outdated bigtop packages on stat1009 with dpkg -l |egrep "\-deb11"|awk '{print $2}'|xargs sudo apt install for T337465