Support expired tile deduplication
Closed, Resolved, Public

Description

Context

On each OSM import, imposm exports a list of expired tiles at a specified zoom level. Because tile expiration propagates to other zoom levels, the expanded list contains overlapping tiles (e.g. two tiles that share the same parent tile). To avoid pregenerating the same tile multiple times, we need a mechanism to deduplicate the list.
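As a rough illustration of the idea (not the code from the linked repository), here is a minimal sketch. It assumes each entry is a `z/x/y` line and that the parent of a tile is `(z - 1, x // 2, y // 2)`; the names `parse_tile` and `dedupe_with_parents` and the `min_zoom` parameter are hypothetical.

```python
from typing import Iterable, Set, Tuple

Tile = Tuple[int, int, int]  # (z, x, y)


def parse_tile(entry: str) -> Tile:
    """Parse a 'z/x/y' entry, ignoring surrounding whitespace and newlines."""
    z, x, y = (int(part) for part in entry.strip().split("/"))
    return z, x, y


def dedupe_with_parents(entries: Iterable[str], min_zoom: int = 0) -> Set[Tile]:
    """Return each expired tile and its ancestors down to min_zoom, once each."""
    seen: Set[Tile] = set()
    for entry in entries:
        if not entry.strip():
            continue  # skip blank lines
        z, x, y = parse_tile(entry)
        # Walk up the pyramid. Once a tile is already in the set, all of its
        # ancestors were added earlier, so we can stop.
        while z >= min_zoom and (z, x, y) not in seen:
            seen.add((z, x, y))
            z, x, y = z - 1, x // 2, y // 2
    return seen


if __name__ == "__main__":
    expired = ["2/0/0\n", "2/1/1\n", "2/3/3\n"]
    for z, x, y in sorted(dedupe_with_parents(expired, min_zoom=1)):
        print(f"{z}/{x}/{y}")
```

In this example 2/0/0 and 2/1/1 share the parent 1/0/0, so that parent appears only once in the result. A set makes the deduplication trivial, but it also means the output order is not defined unless the caller sorts it.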

Acceptance criteria
  • Implement a simple CLI util that does the deduplication
  • Package the CLI util
  • Install package in maps master nodes

Event Timeline

Jgiannelos triaged this task as Medium priority. Sep 14 2021, 3:30 PM
Jgiannelos created this task.

Here is the small python util for that purpose: https://gitlab.wikimedia.org/jgiannelos/maps-deduped-tilelist
It has no dependencies on other libraries, and I added configuration so that the Python and Debian packages are built by CI.

Here is the latest release: https://gitlab.wikimedia.org/jgiannelos/maps-deduped-tilelist/-/releases/0.0.2
I also added some docs here with examples: https://gitlab.wikimedia.org/jgiannelos/maps-deduped-tilelist/-/blob/main/README.md

I used GitLab out of curiosity/dogfooding/interest to see how CI works. We can always move it to Gerrit or GitHub.

The approach of the CLI looks good to me. We should now see how to backport the script to debian buster to use on the maps clusters.

@MoritzMuehlenhoff do you have any thoughts regarding the debian packaging backport? How can we proceed with this?

For a simple script like this we don't necessarily need a deb, we could also simply have shipped it via Puppet. But if the work is already done by Yiannis, let's use it :-) I'll have a look tomorrow.

@Jgiannelos One of the tests fails with Python 3.7 (the Python version in Buster):

test_main (tests.test_cli.CliTest) ... FAIL
test_tile_equal (tests.test_tileset.TestTile) ... ok
test_tile_hashable (tests.test_tileset.TestTile) ... ok
test_tile_not_equal (tests.test_tileset.TestTile) ... ok
test_parse_entry (tests.test_tileset.TestTileSet) ... ok
test_parse_entry_trailing_newline (tests.test_tileset.TestTileSet) ... ok
test_tileset_add_multiple_maxzoom (tests.test_tileset.TestTileSet) ... ok
test_tileset_add_multiple_minzoom (tests.test_tileset.TestTileSet) ... ok
test_tileset_add_multiple_overlapping_maxzoom (tests.test_tileset.TestTileSet) ... ok
test_tileset_add_multiple_overlapping_minzoom (tests.test_tileset.TestTileSet) ... ok
test_tileset_add_multiple_overlapping_z_between_maxzoom (tests.test_tileset.TestTileSet) ... ok
test_tileset_add_multiple_z_between_maxzoom (tests.test_tileset.TestTileSet) ... ok
test_tileset_add_single_maxzoom (tests.test_tileset.TestTileSet) ... ok
test_tileset_add_single_minzoom (tests.test_tileset.TestTileSet) ... ok
test_tileset_add_single_z_between_maxzoom (tests.test_tileset.TestTileSet) ... ok
test_tileset_read (tests.test_tileset.TestTileSet) ... ok

======================================================================
FAIL: test_main (tests.test_cli.CliTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/lib/python3.7/unittest/mock.py", line 1204, in patched
    return func(*args, **keywargs)
  File "/build/maps-deduped-tilelist-0.0.2/.pybuild/cpython3_3.7_maps-deduped-tilelist/build/tests/test_cli.py", line 32, in test_main
    mock_print.assert_has_calls(call_list)
  File "/usr/lib/python3.7/unittest/mock.py", line 861, in assert_has_calls
    ) from cause
AssertionError: Calls not found.
Expected: [call('1/0/1'), call('1/1/0'), call('0/0/0'), call('1/0/0'), call('1/1/1')]
Actual: [call('1/1/0'), call('1/0/0'), call('1/0/1'), call('0/0/0'), call('1/1/1')]

----------------------------------------------------------------------
Ran 16 tests in 0.005s

I think the test assumes an ordering that isn't guaranteed. Looking at it.
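For reference, one common way to make such an assertion order-independent is the `any_order=True` argument of `unittest.mock`'s `assert_has_calls`. The sketch below only illustrates that technique and is not necessarily how the test was fixed upstream; `print_tiles` and `check_any_order` are hypothetical names.

```python
from unittest import mock


def print_tiles(tiles):
    """Print each tile; iterating a set yields no guaranteed order."""
    for tile in tiles:
        print(tile)


def check_any_order():
    tiles = {"1/0/1", "1/1/0", "0/0/0", "1/0/0", "1/1/1"}
    expected = [
        mock.call(t) for t in ["1/0/1", "1/1/0", "0/0/0", "1/0/0", "1/1/1"]
    ]
    with mock.patch("builtins.print") as mock_print:
        print_tiles(tiles)
    # any_order=True only requires that every expected call is present.
    mock_print.assert_has_calls(expected, any_order=True)
    # Also check there were no additional calls.
    assert mock_print.call_count == len(expected)
    return True
```

An alternative is to sort the tiles before printing, which makes the CLI output deterministic and lets the test keep a strict ordering.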

Mentioned in SAL (#wikimedia-operations) [2021-09-20T07:49:03Z] <moritzm> uploaded maps-deduped-tilelist 0.0.3~deb10u1 to buster-wikimedia/main T290982

Change 722264 had a related patch set uploaded (by Muehlenhoff; author: Muehlenhoff):

[operations/puppet@production] Install python3-maps-deduped-tilelist on maps masters

https://gerrit.wikimedia.org/r/722264

Should be fixed here after this change.

Ack, that fixed it. I've built the package, uploaded it to apt.wikimedia.org and prepared a patch to install it on the maps masters.

Change 722264 merged by Muehlenhoff:

[operations/puppet@production] Install python3-maps-deduped-tilelist on maps masters

https://gerrit.wikimedia.org/r/722264

Script is now deployed on the masters

Puppet currently fails on deployment-maps08.deployment-prep.eqiad1.wikimedia.cloud:

Error: Execution of '/usr/bin/apt-get -q -y -o DPkg::Options::=--force-confold install python3-maps-deduped-tilelist' returned 100: Reading package lists...
Building dependency tree...
Reading state information...
E: Unable to locate package python3-maps-deduped-tilelist
Error: /Stage[main]/Profile::Maps::Osm_master/Package[python3-maps-deduped-tilelist]/ensure: change from 'purged' to 'present' failed: Execution of '/usr/bin/apt-get -q -y -o DPkg::Options::=--force-confold install python3-maps-deduped-tilelist' returned 100: Reading package lists...
Building dependency tree...
Reading state information...
E: Unable to locate package python3-maps-deduped-tilelist

Seems related to this task.

Closing this one since the package is already installed in maps production masters. I will file a ticket for deployment-prep specifically.

I can easily rebuild/upload a fixed package for apt.wikimedia.org, though. Just let me know.