
Decide whether to migrate from Presto to Trino
Open, Medium, Public

Description

In 2019, Presto forked into two separate projects: Presto (formerly PrestoDB), backed by Facebook and run under the Linux Foundation, and Trino (formerly PrestoSQL), backed by the original Presto creators and run by a separate foundation. Some information on their differences:

@elukey says "Given my experience with other Facebook projects and https://github.com/prestodb/presto/issues/15207, it seems to me that Trino is more focused on community-driven development, but it could be my own impression. I'd be inclined to test the latest Trino and migrate if it suits our needs. Trino definitely has a big community, and the project seems very active."

Event Timeline

fdans triaged this task as Medium priority. Oct 29 2020, 4:47 PM
fdans moved this task from Incoming to Operational Excellence on the Analytics board.

PrestoSQL has been rebranded to Trino: https://trino.io/blog/2020/12/27/announcing-trino.html

In my opinion, we could upgrade to the latest Presto version by simply reusing our current package, and get all the goodies and perf improvements without lagging too far behind upstream, and then migrate to Trino as a separate project (since it will likely need a new package etc.).

Thoughts?

Team decision: move to the latest upstream release of PrestoDB for the moment, since we are lagging ~20 versions behind :(

Today I built version 0.246 of PrestoDB and uploaded it to our APT repository. I tested it in the Hadoop test cluster and it worked, but then it failed when I tried to roll it out on an-coord1001 and an-presto1001.

I have created an-test-presto1001 to replicate the setup and have a better upgrade experience next time (also to ease the Alluxio testing).

What I found is that we have another occurrence of https://github.com/prestodb/presto/issues/15207, that in theory was fixed with ad-hoc settings. Will keep testing to find a solution..

Opened https://github.com/prestodb/presto/pull/15655 against prestodb; it contains all the info related to the problem that I am seeing.

To test the fix, I have done the following:

  • cloned the prestodb git repo
  • applied the above patch
  • executed ./mvnw clean install -DskipTests
  • copied the presto-main-246.jar file to an-test-coord1001 and an-test-presto1001
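
The steps above can be sketched roughly as the script below. This is only an illustration, not the exact commands that were run: fetching the patch through GitHub's ".patch" URL for the pull request, the jar's location in the build tree, and the destination directory on the hosts are all assumptions. It runs in dry-run mode by default and only prints the commands.

```shell
#!/bin/sh
# Sketch of the manual build-and-copy steps. Dry-run by default: each
# command is printed instead of executed. Set DRY_RUN=0 to really run them.
DRY_RUN="${DRY_RUN:-1}"

# Assumptions (not from the task): the patch is downloaded from the pull
# request's ".patch" URL, and the jar path and destination are placeholders.
PATCH_URL="https://github.com/prestodb/presto/pull/15655.patch"
JAR="presto-main/target/presto-main-246.jar"   # directory is a guess; file name as written in the task
DEST_DIR="/path/to/presto/lib"                 # hypothetical install location

# Print the command in dry-run mode, otherwise execute it in this shell.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

build_and_copy() {
    run git clone https://github.com/prestodb/presto.git &&
    run cd presto &&
    run curl -L -o pr-15655.patch "$PATCH_URL" &&
    run git apply pr-15655.patch &&
    run ./mvnw clean install -DskipTests || return 1
    for host in an-test-coord1001 an-test-presto1001; do
        run scp "$JAR" "$host:$DEST_DIR/" || return 1
    done
}

build_and_copy
```

Running it with DRY_RUN=0 would execute the commands for real, after replacing the placeholder paths with the actual ones.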

The TLS authentication issues disappeared and the cluster finally works with 0.246. We can wait for the review of the pull request and then either use the custom jar for a new Debian 0.246-1~wmf version (since we package the upstream tarball release with pre-built jars, we cannot really apply a clean Debian patch), or we can wait for a new release (which might take a while).

This is also a good test to see how upstream handles the community, whether patches are welcome or not, etc.

Change 660756 had a related patch set uploaded (by Elukey; owner: Elukey):
[operations/puppet@production] role::analytics_test_cluster::presto::server: add new TLS configs

https://gerrit.wikimedia.org/r/660756

Change 660756 merged by Elukey:
[operations/puppet@production] role::analytics_test_cluster::presto::server: add new TLS configs

https://gerrit.wikimedia.org/r/660756

Some updates:

  • my patch was merged upstream; it will not go out in 0.247, but probably in 0.248
  • the option hive.force-local-scheduling was deprecated in favor of hive.node-selection-strategy.
  • if the /srv/presto/var/log dir is owned by root:root (need to figure out if it is the package or puppet at fault), some logs like http-request.log are not created (failing silently, sigh).
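
A small helper like the one below could be used to spot config files that still set the deprecated option before an upgrade. It is a minimal sketch: the function name is made up for illustration, and it assumes the option is written as a plain "key=value" line in the properties files.

```shell
#!/bin/sh
# Print every file under a config directory that still sets the deprecated
# hive.force-local-scheduling option.
# Exit status: 0 if at least one file matches, non-zero otherwise.
# (Hypothetical helper, assumes "key=value" style properties files.)
find_deprecated_option() {
    conf_dir="$1"
    grep -rl '^[[:space:]]*hive\.force-local-scheduling[[:space:]]*=' "$conf_dir"
}
```

For example, find_deprecated_option /etc/presto would list the offending files, where /etc/presto is just a placeholder for wherever the catalog configs actually live.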

The only weird thing is this:

presto:wmf> select * from webrequest where webrequest_source='test_text' and year=2021 and day=20 and month=1 and hour=1 limit 10;
Query 20210204_172606_00009_8uag6 failed: / by zero

I ran the same query in beeline and it worked..

elukey renamed this task from Decide to move or not to PrestoSQL to Decide to move or not to PrestoSQL/Trino. Feb 5 2021, 10:17 AM
elukey updated the task description.

@Ottomata https://github.com/prestodb/presto/pull/15655 got merged by upstream, but it will be released in 0.248 and I am not sure when they have scheduled it. I uploaded the 0.246 deb a few days ago (before discovering the issues; the first tests were fine with the coordinator-only node) and tested my patch by rebuilding presto myself and uploading the right jar to an-test-presto1001 (I found the right one by inspecting the classes via jar tf etc.). If you are ok I'd package 0.246-1~wmf1 with the "custom" jar built on deneb, and then roll out the upgrade. Otherwise, if it is not ok, I'll wait for 248 and package it :)
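
For reference, the "inspecting the classes via jar tf" step can be sketched as a loop over the jars in a directory. This is only an illustration: the function name is made up, and the class name to search for is whatever class the patch touches (not stated here).

```shell
#!/bin/sh
# Print each jar in a directory whose file listing (from `jar tf`) matches
# the given pattern, e.g. part of a class name.
# (Hypothetical helper; requires the JDK's `jar` tool on the PATH.)
find_jar_with_class() {
    dir="$1"
    pattern="$2"
    for jar in "$dir"/*.jar; do
        [ -e "$jar" ] || continue        # the glob matched nothing
        if jar tf "$jar" | grep -q "$pattern"; then
            echo "$jar"
        fi
    done
}
```

Usage would look like find_jar_with_class /path/to/presto/lib SomeClassName, with both arguments being placeholders.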

If you are ok I'd package 0.246-1~wmf1 with the "custom" jar built on deneb,

Sure of course!

Test cluster upgraded with the new package, all good! Going to wait for the +1 from somebody in the team to proceed with the prod upgrade :)

\o/
I ran a query on the test-cluster, it's all good on my side :)
+1!

Change 665067 had a related patch set uploaded (by Elukey; owner: Elukey):
[operations/puppet@production] presto: remove hive.force-local-scheduling from config

https://gerrit.wikimedia.org/r/665067

Change 665067 merged by Elukey:
[operations/puppet@production] presto: remove hive.force-local-scheduling from config

https://gerrit.wikimedia.org/r/665067

Change 665069 had a related patch set uploaded (by Elukey; owner: Elukey):
[operations/puppet@production] presto: add new specific settings for internal TLS conns

https://gerrit.wikimedia.org/r/665069

Change 665069 merged by Elukey:
[operations/puppet@production] presto: add new specific settings for internal TLS conns

https://gerrit.wikimedia.org/r/665069

Cluster deployed, we are using 0.246 now! This will unblock Alluxio testing..

Still to decide, now that we have a more up to date version.. Trino or PrestoDB?

nshahquinn-wmf renamed this task from Decide to move or not to PrestoSQL/Trino to Decide whether to migrate from Presto to Trino. Sep 2 2021, 10:10 PM
nshahquinn-wmf added a subscriber: nshahquinn-wmf.

We should find out whether Trino is a mostly drop-in replacement.

  • Can we re-use our existing Presto Debian packaging scripts?
  • Does Superset just work with Trino as is?

etc.