
Upgrade Presto to version 0.283
Closed, Resolved, Public

Description

We wish to use the latest build of Presto, which updates the bundled version of Alluxio to 2.9.3.

See: https://github.com/prestodb/presto/pull/19594

We believe that this will fix an issue with Presto's local caching feature when it is used with HDFS external storage.

Event Timeline

BTullis renamed this task from Upgrade Presto to version 0.282 to Upgrade Presto to version 0.283. Aug 18 2023, 2:27 PM

Presto version 0.284 was released yesterday: https://github.com/prestodb/presto/releases/tag/0.284

The release notes seem to have been slightly delayed, but are available here.

There are some changes mentioned regarding cache performance for the history based optimizer, plus some hive changes and some iceberg changes.

However, now that I look again, it seems that the upgrade of the Alluxio client jar was reverted.
Here is the pom.xml file at version 0.284.
It shows that the Alluxio version is back at 2.8.1.

It was reverted in this commit, which was between version 0.283 and 0.284. I wonder why.
So for now I suggest that we don't upgrade beyond 0.283 and we try to find out why the alluxio jar was downgraded.

Indeed that's weird - I'll contact Alluxio folks.

Beginning work on the new build now, referring to the build for 0.281 (T337335) and the README.Debian file.

Configured the version:

export VERSION=0.283

Downloaded the release tarball:

wget -P /tmp https://repo1.maven.org/maven2/com/facebook/presto/presto-server/$VERSION/presto-server-$VERSION.tar.gz

Checked the checksum:

[ "$(curl -s https://repo1.maven.org/maven2/com/facebook/presto/presto-server/$VERSION/presto-server-$VERSION.tar.gz.sha1)" == "$(shasum -a 1 /tmp/presto-server-$VERSION.tar.gz | awk '{print $1}')" ] && echo "presto-server sha1 checksum matches, continue" || echo "presto-server sha1 checksum does not match!"

The following output was shown:

presto-server sha1 checksum matches, continue
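
The one-liner above can also be written as a small function that returns a non-zero exit status on a mismatch, which makes it easier to reuse for the presto-cli jar. This is only a sketch: the name `verify_sha1` is made up for illustration, and it uses `sha1sum` from GNU coreutils rather than `shasum`.

```shell
# Hypothetical helper (not part of the original procedure): compare a file's
# SHA-1 digest against an expected hex string.
verify_sha1() {
    local file="$1" expected="$2" actual
    # sha1sum prints "<digest>  <filename>"; keep only the digest.
    actual="$(sha1sum "$file" | awk '{print $1}')"
    if [ "$actual" = "$expected" ]; then
        echo "sha1 checksum matches, continue"
    else
        echo "sha1 checksum does not match!" >&2
        return 1
    fi
}

# Example usage, assuming $VERSION is set as above:
#   url=https://repo1.maven.org/maven2/com/facebook/presto/presto-server/$VERSION/presto-server-$VERSION.tar.gz.sha1
#   verify_sha1 "/tmp/presto-server-$VERSION.tar.gz" "$(curl -s "$url")"
```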

Updated the tarball from which the Debian package is built:

gbp import-orig -u $VERSION --upstream-branch=master --debian-branch=debian --merge-mode=replace /tmp/presto-server-$VERSION.tar.gz

The following output was shown:

gbp:info: Importing '/tmp/presto-server-0.283.tar.gz' to branch 'master'...
gbp:info: Source package is presto
gbp:info: Upstream version is 0.283
gbp:info: Replacing upstream source on 'debian'
gbp:info: Successfully imported version 0.283 of /tmp/presto-server-0.283.tar.gz

Removed the old presto-cli jar file:

git rm debian/lib/presto-cli-*-executable.jar

The following output was shown:

rm 'debian/lib/presto-cli-0.281-executable.jar'

Downloaded the new jar:

wget -P debian/lib https://repo1.maven.org/maven2/com/facebook/presto/presto-cli/$VERSION/presto-cli-$VERSION-executable.jar

Checked the checksum of this jar:

[ "$(curl -s https://repo1.maven.org/maven2/com/facebook/presto/presto-cli/$VERSION/presto-cli-$VERSION-executable.jar.sha1)" == "$(shasum -a 1 ./debian/lib/presto-cli-$VERSION-executable.jar | awk '{print $1}')" ] && echo "presto-cli sha1 checksum matches, continue" || echo "presto-cli sha1 checksum does not match!"

The following output was shown:

presto-cli sha1 checksum matches, continue

Configured the presto-cli jar to be executable:

chmod 755 debian/lib/presto-cli-$VERSION-executable.jar

Generated the list of binaries that need to be included:

find {debian/lib,lib,plugin} -type f -exec file {} \; | grep -v text | awk -F ':' '{print $1}' | sort > debian/source/include-binaries

Edited the version of the presto-cli jar file mentioned in debian/presto-cli.links and debian/presto-cli.install from 0.281 to 0.283.
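
This manual edit could be scripted with `sed`. A minimal sketch, assuming that the old version only appears in these files as part of the jar file name; `bump_cli_version` is a hypothetical helper name, and `sed -i` is used in its GNU form.

```shell
# Hypothetical helper: replace presto-cli-OLD-executable.jar with
# presto-cli-NEW-executable.jar in the given files.
bump_cli_version() {
    local old="$1" new="$2" old_re
    shift 2
    # Escape the dots so that "0.281" is matched literally by sed.
    old_re="$(printf '%s' "$old" | sed 's/\./\\./g')"
    sed -i "s/presto-cli-${old_re}-executable\.jar/presto-cli-${new}-executable.jar/g" "$@"
}

# Example usage:
#   bump_cli_version 0.281 0.283 debian/presto-cli.links debian/presto-cli.install
```

Running `git diff debian/presto-cli.links debian/presto-cli.install` afterwards shows whether only the intended lines changed.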

Created a new changelog entry with:

dch -v 0.283-1 -D bullseye-wikimedia --force-distribution

Committed with:

git add debian
git commit -m "Upstream release $VERSION"

I pushed both the master and debian branches to gerrit. Last time I forgot to push the master branch, which resulted in a build failure.

(base) btullis@marlin:~/wmf/debs/presto$ git push
Enumerating objects: 261, done.
Counting objects: 100% (220/220), done.
Delta compression using up to 16 threads
Compressing objects: 100% (170/170), done.
Writing objects: 100% (170/170), 139.65 MiB | 3.10 MiB/s, done.
Total 170 (delta 46), reused 0 (delta 0), pack-reused 0
remote: Resolving deltas: 100% (46/46)
remote: Processing changes: refs: 1, done    
To ssh://gerrit.wikimedia.org:29418/operations/debs/presto
   d2f7f77..d45707c  debian -> debian
(base) btullis@marlin:~/wmf/debs/presto$ git checkout master 
Switched to branch 'master'
Your branch is ahead of 'origin/master' by 1 commit.
  (use "git push" to publish your local commits)
(base) btullis@marlin:~/wmf/debs/presto$ git push
Total 0 (delta 0), reused 0 (delta 0), pack-reused 0
remote: Processing changes: refs: 1, done    
To ssh://gerrit.wikimedia.org:29418/operations/debs/presto
   3ecfb7e..af0a3de  master -> master

On build2001, check out the repository.

git clone "https://gerrit.wikimedia.org/r/operations/debs/presto" && (cd "presto" && mkdir -p `git rev-parse --git-dir`/hooks/ && curl -Lo `git rev-parse --git-dir`/hooks/commit-msg https://gerrit.wikimedia.org/r/tools/hooks/commit-msg; chmod +x `git rev-parse --git-dir`/hooks/commit-msg)

Check that the version on the debian branch is correct.

btullis@build2001:~$ cd presto/
btullis@build2001:~/presto$ git checkout debian 
Branch 'debian' set up to track remote branch 'debian' from 'origin'.
Switched to a new branch 'debian'
btullis@build2001:~/presto$ git log -n 1
commit d45707cce2a8c87070384aea64d98044842985ec (HEAD -> debian, origin/debian)
Author: Ben Tullis <btullis@wikimedia.org>
Date:   Tue Oct 10 14:53:16 2023 +0100

    Upstream release 0.283
    
    Change-Id: I1a2a9bf1e2ec8b19b63d1918aad2e05d862bed08
btullis@build2001:~/presto$
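
As an extra check before building, the version at the top of debian/changelog can be compared with the intended package version. The sketch below assumes the standard Debian changelog header format, `package (version) distribution; urgency=...`, and uses a hypothetical helper name, `changelog_version`. Where `dpkg-dev` is installed, `dpkg-parsechangelog -S Version` should give the same information.

```shell
# Hypothetical helper: print the version from the first line of a Debian
# changelog. The header is assumed to look like:
#   presto (0.283-1) bullseye-wikimedia; urgency=medium
changelog_version() {
    local changelog="${1:-debian/changelog}"
    # Capture whatever sits between the first pair of parentheses on line 1.
    sed -n '1s/^[^(]*(\([^)]*\)).*$/\1/p' "$changelog"
}

# Example usage from the top of the repository:
#   [ "$(changelog_version)" = "0.283-1" ] && echo "changelog version is correct"
```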

Now issue the package build command.

GIT_PBUILDER_AUTOCONF=no DIST=bullseye WIKIMEDIA=yes gbp buildpackage -sa -us -uc --git-builder=git-pbuilder --source-option="--include-removal"

Copied the debs to apt1001.

btullis@apt1001:~$ rsync rsync://build2001.codfw.wmnet/pbuilder-result/bullseye-amd64/presto* .

Added the debs to the apt-repository with:

btullis@apt1001:~$ sudo -i reprepro include bullseye-wikimedia `pwd`/presto_0.283-1_amd64.changes
btullis@apt1001:~$ sudo -i reprepro --ignore=wrongdistribution include buster-wikimedia `pwd`/presto_0.283-1_amd64.changes

Checked that they are available for installation.

btullis@apt1001:~$ sudo -i reprepro ls presto-cli
presto-cli | 0.283-1 |   buster-wikimedia | amd64, i386
presto-cli | 0.283-1 | bullseye-wikimedia | amd64, i386
btullis@apt1001:~$ sudo -i reprepro ls presto-server
presto-server | 0.283-1 |   buster-wikimedia | amd64, i386
presto-server | 0.283-1 | bullseye-wikimedia | amd64, i386
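
The same availability check can be scripted by parsing the `reprepro ls` output. A minimal sketch, assuming the pipe-separated column layout shown above; `has_version` is a hypothetical helper name.

```shell
# Hypothetical helper: read `reprepro ls <package>` output on stdin and succeed
# only if the given distribution lists the given version.
has_version() {
    local version="$1" dist="$2"
    awk -F'|' -v v="$version" -v d="$dist" '
        {
            # Trim the padding spaces around the version and distribution columns.
            gsub(/^ +| +$/, "", $2)
            gsub(/^ +| +$/, "", $3)
            if ($2 == v && $3 == d) found = 1
        }
        END { exit(found ? 0 : 1) }
    '
}

# Example usage:
#   sudo -i reprepro ls presto-cli | has_version 0.283-1 bullseye-wikimedia
```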
BTullis triaged this task as High priority. Oct 11 2023, 9:10 AM

I'm pushing out the new version of presto to the test cluster with:

btullis@cumin1001:~$ sudo debdeploy deploy -u 2023-10-12-presto.yaml -Q 'P{O:analytics_test_cluster::presto::server} or P{O:analytics_test_cluster::coordinator}'

I've done some basic tests on the test cluster and things appear to be working properly.

presto> SELECT node_id,node_version FROM system.runtime.nodes;
            node_id             | node_version  
--------------------------------+---------------
 an-test-coord1001-eqiad-wmnet  | 0.283-1fa586a 
 an-test-presto1001-eqiad-wmnet | 0.283-1fa586a 
(2 rows)

Query 20231012_131619_00009_h599v, FINISHED, 2 nodes
Splits: 17 total, 17 done (100.00%)
[Latency: client-side: 219ms, server-side: 182ms] [2 rows, 184B] [10 rows/s, 1010B/s]

presto> select uri_host,uri_path from wmf.webrequest where year=2023 and month=10 and day=12 and hour=4 limit 10;
      uri_host      |        uri_path         
--------------------+-------------------------
 zh.wikipedia.org   | /w/api.php              
 en.m.wikipedia.org | /wiki/Beer_on_the_Table 
 meta.wikimedia.org | /w/index.php            
 en.wikipedia.org   | /w/api.php              
 en.wikipedia.org   | /wiki/John_Fetterman    
 de.wikipedia.org   | /w/api.php              
 www.wikipedia.org  | /                       
 en.m.wikipedia.org | /wiki/Greater_Khorasan  
 en.wikipedia.org   | /w/load.php             
 meta.wikimedia.org | /w/index.php            
(10 rows)

Query 20231012_131637_00010_h599v, FINISHED, 1 node
Splits: 273 total, 59 done (21.61%)
[Latency: client-side: 0:01, server-side: 0:01] [630 rows, 721KB] [914 rows/s, 1.02MB/s]

@JAllemandou - should I push this out to the production presto cluster today, or would you rather that we do some more testing on this cluster first?

Hi @BTullis - I'm sorry I missed the ping yesterday.
I think we can go to prod with the new version.
Let's do that on Monday :)

Mentioned in SAL (#wikimedia-analytics) [2023-10-16T10:06:03Z] <btullis> deploying presto version 0.283 to production for T342343 with sudo debdeploy deploy -u 2023-10-12-presto.yaml -Q 'P{O:analytics_cluster::presto::server} or P{O:analytics_cluster::coordinator} or A:stat'

This looks good. All Presto servers and clients have been deployed.

btullis@stat1009:~$ presto --catalog analytics_hive
Picked up JAVA_TOOL_OPTIONS: -Dfile.encoding=UTF-8
presto> SELECT node_id,node_version FROM system.runtime.nodes;
          node_id          | node_version  
---------------------------+---------------
 an-coord1001-eqiad-wmnet  | 0.283-1fa586a 
 an-presto1001-eqiad-wmnet | 0.283-1fa586a 
 an-presto1002-eqiad-wmnet | 0.283-1fa586a 
 an-presto1003-eqiad-wmnet | 0.283-1fa586a 
 an-presto1004-eqiad-wmnet | 0.283-1fa586a 
 an-presto1005-eqiad-wmnet | 0.283-1fa586a 
 an-presto1006-eqiad-wmnet | 0.283-1fa586a 
 an-presto1007-eqiad-wmnet | 0.283-1fa586a 
 an-presto1008-eqiad-wmnet | 0.283-1fa586a 
 an-presto1009-eqiad-wmnet | 0.283-1fa586a 
 an-presto1010-eqiad-wmnet | 0.283-1fa586a 
 an-presto1011-eqiad-wmnet | 0.283-1fa586a 
 an-presto1012-eqiad-wmnet | 0.283-1fa586a 
 an-presto1013-eqiad-wmnet | 0.283-1fa586a 
 an-presto1014-eqiad-wmnet | 0.283-1fa586a 
 an-presto1015-eqiad-wmnet | 0.283-1fa586a 
(16 rows)

Query 20231016_101803_00018_vrxm9, FINISHED, 2 nodes
Splits: 17 total, 17 done (100.00%)
[Latency: client-side: 344ms, server-side: 290ms] [16 rows, 1.29KB] [55 rows/s, 4.44KB/s]

presto>
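
To confirm that every node reports the same version without reading the table by eye, the query output can be reduced to its distinct `node_version` values. This is a sketch under assumptions: `distinct_versions` is a hypothetical helper that expects two tab-separated columns on stdin, and the commented invocation assumes the Presto CLI's `--execute` and `--output-format TSV` options.

```shell
# Hypothetical helper: read "node_id<TAB>node_version" lines on stdin and
# print each distinct version once.
distinct_versions() {
    cut -f2 | sort -u
}

# Assumed invocation:
#   presto --catalog analytics_hive --output-format TSV \
#     --execute 'SELECT node_id, node_version FROM system.runtime.nodes' \
#     | distinct_versions
# A single line of output means that all nodes run the same version.
```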

A simple query runs successfully on 15 nodes.

presto> select uri_host,uri_path from wmf.webrequest where year=2023 and month=10 and day=16 and hour=4 limit 10;
      uri_host      |                                           uri_path                                           
--------------------+----------------------------------------------------------------------------------------------
 en.wikipedia.org   | /wiki/File:World_energy_consumption.svg                                                      
 en.m.wikipedia.org | /w/api.php                                                                                   
 www.wikidata.org   | /wiki/Special:EntityData/Q117789109.ttl                                                      
 es.m.wikipedia.org | /w/load.php                                                                                  
 el.wikipedia.org   | /api/rest_v1/page/summary/%CE%92%CE%B1%CF%80%CF%84%CE%B9%CF%83%CF%84%CE%AE%CF%81%CE%B9%CE%BF 
 en.wikipedia.org   | /w/load.php                                                                                  
 fa.m.wikipedia.org | /w/load.php                                                                                  
 en.wikipedia.org   | /w/load.php                                                                                  
 en.wikipedia.org   | /w/extensions/Wikibase/client/resources/images/edit.svg                                      
 it.m.wikipedia.org | /w/load.php                                                                                  
(10 rows)

Query 20231016_101923_00019_vrxm9, FINISHED, 15 nodes
Splits: 1,391 total, 810 done (58.23%)
[Latency: client-side: 0:04, server-side: 0:04] [2.81K rows, 1.3GB] [655 rows/s, 309MB/s]

presto>