chad@wmf3179 analytics % cat .gitattributes
*.whl filter=fat -text
*.jar filter=fat -text
chad@wmf3179 analytics % find . -name '*.whl' -ls -exec cat {} \; -o -name '*.jar' -ls -exec cat {} \;
28980522 4 -rw-rw-r-- 1 ebernhardson ebernhardson 74 Sep 1 09:04 ./artifacts/decorator-5.0.9-py3-none-any.whl
#$# git-fat e0c9be2a8af8c22ed30c73ff8138ac8461099a7a 8901
28982159 4 -rw-rw-r-- 1 ebernhardson ebernhardson 74 Sep 1 09:04 ./artifacts/idna-2.10-py2.py3-none-any.whl
#$# git-fat 999b6718b4d789d8ca0d2ddf7c07826154291825 58811
28982368 0 lrwxrwxrwx 1 ebernhardson ebernhardson 49 Sep 1 09:04 ./artifacts/rdf-spark-tools-latest-jar-with-dependencies.jar -> rdf-spark-tools-0.3.114-jar-with-dependencies.jar
#$# git-fat dc9ec28bbf636718020316dccb8f3ded9066e250 24748809
28982355 4 -rw-rw-r-- 1 ebernhardson ebernhardson 74 Sep 1 09:04 ./artifacts/networkx-1.11-py2.py3-none-any.whl
#$# git-fat 3209bca45fb613b7a4507cff1927b1fd44622e6c 1317927
28982157 0 lrwxrwxrwx 1 ebernhardson ebernhardson 37 Sep 1 09:04 ./artifacts/glent-latest-jar-with-dependencies.jar -> glent-0.2.6-jar-with-dependencies.jar
#$# git-fat f0f391b831f3f09da31b4138f3e1d553d39fe6ae 40995064
28982359 4 -rw-rw-r-- 1 ebernhardson ebernhardson 74 Sep 1 09:04 ./artifacts/prometheus_client-0.11.0-py2.py3-none-any.whl
#$# git-fat 30fef728e9993f3ea69c0b71525f6362508ecc9d 56435
28982364 4 -rw-rw-r-- 1 ebernhardson ebernhardson 74 Sep 1 09:04 ./artifacts/python_json_logger-2.0.1-py34-none-any.whl
#$# git-fat 3726718fd7272fdc4b1f8fae6ebbe7a861662869 7374
28982156 4 -rw-rw-r-- 1 ebernhardson ebernhardson 74 Sep 1 09:04 ./artifacts/glent-0.2.6-jar-with-dependencies.jar
#$# git-fat f0f391b831f3f09da31b4138f3e1d553d39fe6ae 40995064
28982369 4 -rw-rw-r-- 1 ebernhardson ebernhardson 74 Sep 1 09:04 ./artifacts/requests-2.25.1-py2.py3-none-any.whl
#$# git-fat b1009d9fd6acadc64e1a3cecb6f0083fe047e753 61216
28982371 4 -rw-rw-r-- 1 ebernhardson ebernhardson 74 Sep 1 09:04 ./artifacts/setuptools-57.0.0-py3-none-any.whl
#$# git-fat 0b0fcb339be89ae1b6360dbfb2be2075ae9f84c9 821665
28982362 4 -rw-rw-r-- 1 ebernhardson ebernhardson 74 Sep 1 09:04 ./artifacts/pyrsistent-0.17.3-cp37-cp37m-linux_x86_64.whl
#$# git-fat b83fc6cfcacc712024f2803ef0035fb08f5c0c6f 98627
28982158 4 -rw-rw-r-- 1 ebernhardson ebernhardson 74 Sep 1 09:04 ./artifacts/hyperopt-0.1.2-py3-none-any.whl
#$# git-fat 4eaf5f249a184a12bfe3e4fbbe37d39fd4192f90 115233
28982374 4 -rw-rw-r-- 1 ebernhardson ebernhardson 74 Sep 1 09:04 ./artifacts/tqdm-4.61.0-py2.py3-none-any.whl
#$# git-fat 80cc9df9545b54fe3e18b790d6c5187b65aad762 75783
28982352 4 -rw-rw-r-- 1 ebernhardson ebernhardson 74 Sep 1 09:04 ./artifacts/mock-4.0.3-py3-none-any.whl
#$# git-fat 89e027f3561efa6fb1dc9ab30ec60e507695bb76 28536
28980352 4 -rw-rw-r-- 1 ebernhardson ebernhardson 74 Sep 1 09:04 ./artifacts/attrs-21.2.0-py2.py3-none-any.whl
#$# git-fat a72511421b1aca19cc12b17e2859cf755e0a1ca3 53716
28982365 4 -rw-rw-r-- 1 ebernhardson ebernhardson 74 Sep 1 09:04 ./artifacts/python_logstash-0.4.6-py3-none-any.whl
#$# git-fat 04e52db2cb1f3e55ca2bb2b5b24a9ff9e5f2bda0 8150
28982354 4 -rw-rw-r-- 1 ebernhardson ebernhardson 74 Sep 1 09:04 ./artifacts/mwapi-0.5.1-py2.py3-none-any.whl
#$# git-fat 03d878921284b9c2f6af86f7ba8923e29f782a92 10639
28980355 4 -rw-rw-r-- 1 ebernhardson ebernhardson 74 Sep 1 09:04 ./artifacts/chardet-4.0.0-py2.py3-none-any.whl
#$# git-fat e9eb83c71c09b3c8249bd7d6d2619b65fff03874 178743
28982154 4 -rw-rw-r-- 1 ebernhardson ebernhardson 74 Sep 1 09:04 ./artifacts/elasticsearch-hadoop-7.10.2.jar
#$# git-fat d752857f3fb54f51f4bc353075a2aabc2843cc1b 1021515
28980354 4 -rw-rw-r-- 1 ebernhardson ebernhardson 74 Sep 1 09:04 ./artifacts/certifi-2021.5.30-py2.py3-none-any.whl
#$# git-fat 2fcaa39108a9c99700c6f3f4198fcaa47b8ed707 145532
28982372 4 -rw-rw-r-- 1 ebernhardson ebernhardson 74 Sep 1 09:04 ./artifacts/six-1.15.0-py2.py3-none-any.whl
#$# git-fat 8730d16507db66e828c696ecc7cb785e557900bb 10963
28982358 4 -rw-rw-r-- 1 ebernhardson ebernhardson 74 Sep 1 09:04 ./artifacts/pip-21.1.2-py3-none-any.whl
#$# git-fat 296a5082c1e300e302d2d11a447bd92ce20d3d9d 1547997
28982153 4 -rw-rw-r-- 1 ebernhardson ebernhardson 74 Sep 1 09:04 ./artifacts/elasticsearch-5.5.3-py2.py3-none-any.whl
#$# git-fat 0007d4e42ed7bf76489549eabbf437a1d6c328b5 119268
28982356 4 -rw-rw-r-- 1 ebernhardson ebernhardson 74 Sep 1 09:04 ./artifacts/numpy-1.20.3-cp37-cp37m-manylinux_2_12_x86_64.manylinux2010_x86_64.whl
#$# git-fat 07ebc9f06abf992c1f15e6cd430d8867a3a45fd0 15307196
28982353 4 -rw-rw-r-- 1 ebernhardson ebernhardson 74 Sep 1 09:04 ./artifacts/more_itertools-8.7.0-py3-none-any.whl
#$# git-fat 859eff022eea6153860536e0a943c6967e507d33 48425
28982360 4 -rw-rw-r-- 1 ebernhardson ebernhardson 74 Sep 1 09:04 ./artifacts/py4j-0.10.9.2-py2.py3-none-any.whl
#$# git-fat 8e97d429ab19777c9cc934a36ffe8699081e7455 198796
28982379 4 -rw-rw-r-- 1 ebernhardson ebernhardson 74 Sep 1 09:04 ./artifacts/wmf_mjolnir-1.1-py3-none-any.whl
#$# git-fat bd1d3cac73e3f9b8c712885010192ac1268e6802 155623
28982367 4 -rw-rw-r-- 1 ebernhardson ebernhardson 74 Sep 1 09:04 ./artifacts/rdf-spark-tools-0.3.114-jar-with-dependencies.jar
#$# git-fat dc9ec28bbf636718020316dccb8f3ded9066e250 24748809
28982370 4 -rw-rw-r-- 1 ebernhardson ebernhardson 74 Sep 1 09:04 ./artifacts/scipy-1.6.3-cp37-cp37m-manylinux1_x86_64.whl
#$# git-fat db19626ba45d0b8f81792c89e457d1f6fc817a34 27390562
28982378 4 -rw-rw-r-- 1 ebernhardson ebernhardson 74 Sep 1 09:04 ./artifacts/wheel-0.36.2-py2.py3-none-any.whl
#$# git-fat 9e78f9fc756bc09c02c717fae6610cfd6d6a0fe7 35046
28980353 4 -rw-rw-r-- 1 ebernhardson ebernhardson 74 Sep 1 09:04 ./artifacts/certifi-2020.12.5-py2.py3-none-any.whl
#$# git-fat 7dff15a2066b8809c8772a243991bd1a25740ec3 147526
28982350 4 -rw-rw-r-- 1 ebernhardson ebernhardson 74 Sep 1 09:04 ./artifacts/jsonschema-3.2.0-py2.py3-none-any.whl
#$# git-fat 13a9abc0b85f73adfea760809110f4520118e1a4 56305
28982361 4 -rw-rw-r-- 1 ebernhardson ebernhardson 74 Sep 1 09:04 ./artifacts/pymongo-3.11.4-cp37-cp37m-manylinux2014_x86_64.whl
#$# git-fat b126006fdaa0044fff3f61a6d3e116e62a5359b3 512687
28982380 4 -rw-rw-r-- 1 ebernhardson ebernhardson 74 Sep 1 09:04 ./artifacts/xgboost-0.90-py2.py3-none-manylinux1_x86_64.whl
#$# git-fat 94a40d56f0fd37ed4683c775b455a7264d259037 142822578
28982377 4 -rw-rw-r-- 1 ebernhardson ebernhardson 74 Sep 1 09:04 ./artifacts/urllib3-1.26.5-py2.py3-none-any.whl
#$# git-fat effca0b8a9f0a0d7e546c880da06e9972357b742 138144
28982155 4 -rw-rw-r-- 1 ebernhardson ebernhardson 74 Sep 1 09:04 ./artifacts/future-0.18.2-py3-none-any.whl
#$# git-fat 58b165a584aa5236e44651894736ef781d92f387 491059
28980351 4 -rw-rw-r-- 1 ebernhardson ebernhardson 74 Sep 1 09:04 ./artifacts/PyYAML-5.4.1-cp37-cp37m-manylinux1_x86_64.whl
#$# git-fat 4d04149ec1b0035d5d828dd861009039b54069f5 636647
28982363 4 -rw-rw-r-- 1 ebernhardson ebernhardson 74 Sep 1 09:04 ./artifacts/python_dateutil-2.8.1-py2.py3-none-any.whl
#$# git-fat 3005ff67df93ee276fb8631e17c677df852254ad 227183
28982357 4 -rw-rw-r-- 1 ebernhardson ebernhardson 74 Sep 1 09:04 ./artifacts/oresapi-0.1.0-py2.py3-none-any.whl
#$# git-fat 95f1700b17dfc9a3d7294f79fd1891f7f192caf1 8060
28982373 4 -rw-rw-r-- 1 ebernhardson ebernhardson 74 Sep 1 09:04 ./artifacts/six-1.16.0-py2.py3-none-any.whl
#$# git-fat 79e6f2e4f9e24898f1896df379871b9c9922f147 11053
28982152 4 -rw-rw-r-- 1 ebernhardson ebernhardson 74 Sep 1 09:04 ./artifacts/docopt-0.6.2-py2.py3-none-any.whl
#$# git-fat 15032b3ee3c325e618abb8468116c2c6be633e0e 13704
28980523 4 -rw-rw-r-- 1 ebernhardson ebernhardson 74 Sep 1 09:04 ./artifacts/dnspython-1.16.0-py2.py3-none-any.whl
#$# git-fat 9c44f537aa5fcaa2a3b6529bba9c59fc4dae8c50 188353
28982351 4 -rw-rw-r-- 1 ebernhardson ebernhardson 74 Sep 1 09:04 ./artifacts/kafka_python-1.4.7-py2.py3-none-any.whl
#$# git-fat 2d5dee2f09d2ad3e67addaee9a923dd2751c3a10 266121
28982375 4 -rw-rw-r-- 1 ebernhardson ebernhardson 74 Sep 1 09:04 ./artifacts/typing_extensions-3.10.0.0-py3-none-any.whl
#$# git-fat 6bb39b4a1d4882bb6889c4830c44a7c22eae5bc5 26127
28982366 4 -rw-rw-r-- 1 ebernhardson ebernhardson 74 Sep 1 09:04 ./artifacts/python_snappy-0.6.0-cp37-cp37m-manylinux2010_x86_64.whl
#$# git-fat 0a3c96080d53c90097f3979de90207b340fcb451 55288
28982160 4 -rw-rw-r-- 1 ebernhardson ebernhardson 74 Sep 1 09:04 ./artifacts/importlib_metadata-4.4.0-py3-none-any.whl
#$# git-fat 1fa9299575d630882893a204172782254cb993fa 17263
28982151 4 -rw-rw-r-- 1 ebernhardson ebernhardson 74 Sep 1 09:04 ./artifacts/dnspython-2.0.0-py3-none-any.whl
#$# git-fat 01e7db5fa5fca5b7ee00c45e2fdbb2f209c1b744 208262
28982376 4 -rw-rw-r-- 1 ebernhardson ebernhardson 74 Sep 1 09:04 ./artifacts/urllib3-1.26.3-py2.py3-none-any.whl
#$# git-fat bc1f2e29068a85cefc6c7652ae77eea287e0c9d8 137023
28982381 4 -rw-rw-r-- 1 ebernhardson ebernhardson 74 Sep 1 09:04 ./artifacts/zipp-3.4.1-py3-none-any.whl
#$# git-fat 93ce9842312d434a7f2270ec9b02d903e22d7017 5191
26498286 4 -rw-rw-r-- 1 ebernhardson ebernhardson 74 Sep 1 09:04 ./.mvn/wrapper/maven-wrapper.jar
#$# git-fat 99c11907918309fe94d7e7574a144c7c08077dd4 50710
Description
Details
| Status | Subtype | Assigned | Task |
|---|---|---|---|
| Open | | None | T388129 remove git-fat objects from deployment server |
| Resolved | | dancy | T279509 git-fat replacement/removal |
| Invalid | | None | T316877 wikimedia/discovery/analytics: replace git-fat with git-lfs |
Event Timeline
I can't explain why the find command above doesn't find them, but the above list is missing a number of things:
find . -name '*.whl'
./artifacts/decorator-5.0.9-py3-none-any.whl
./artifacts/idna-2.10-py2.py3-none-any.whl
./artifacts/networkx-1.11-py2.py3-none-any.whl
./artifacts/prometheus_client-0.11.0-py2.py3-none-any.whl
./artifacts/python_json_logger-2.0.1-py34-none-any.whl
./artifacts/requests-2.25.1-py2.py3-none-any.whl
./artifacts/setuptools-57.0.0-py3-none-any.whl
./artifacts/pyrsistent-0.17.3-cp37-cp37m-linux_x86_64.whl
./artifacts/hyperopt-0.1.2-py3-none-any.whl
./artifacts/tqdm-4.61.0-py2.py3-none-any.whl
./artifacts/mock-4.0.3-py3-none-any.whl
./artifacts/attrs-21.2.0-py2.py3-none-any.whl
./artifacts/python_logstash-0.4.6-py3-none-any.whl
./artifacts/mwapi-0.5.1-py2.py3-none-any.whl
./artifacts/chardet-4.0.0-py2.py3-none-any.whl
./artifacts/certifi-2021.5.30-py2.py3-none-any.whl
./artifacts/six-1.15.0-py2.py3-none-any.whl
./artifacts/pip-21.1.2-py3-none-any.whl
./artifacts/elasticsearch-5.5.3-py2.py3-none-any.whl
./artifacts/numpy-1.20.3-cp37-cp37m-manylinux_2_12_x86_64.manylinux2010_x86_64.whl
./artifacts/more_itertools-8.7.0-py3-none-any.whl
./artifacts/py4j-0.10.9.2-py2.py3-none-any.whl
./artifacts/wmf_mjolnir-1.1-py3-none-any.whl
./artifacts/scipy-1.6.3-cp37-cp37m-manylinux1_x86_64.whl
./artifacts/wheel-0.36.2-py2.py3-none-any.whl
./artifacts/certifi-2020.12.5-py2.py3-none-any.whl
./artifacts/jsonschema-3.2.0-py2.py3-none-any.whl
./artifacts/pymongo-3.11.4-cp37-cp37m-manylinux2014_x86_64.whl
./artifacts/xgboost-0.90-py2.py3-none-manylinux1_x86_64.whl
./artifacts/urllib3-1.26.5-py2.py3-none-any.whl
./artifacts/future-0.18.2-py3-none-any.whl
./artifacts/PyYAML-5.4.1-cp37-cp37m-manylinux1_x86_64.whl
./artifacts/python_dateutil-2.8.1-py2.py3-none-any.whl
./artifacts/oresapi-0.1.0-py2.py3-none-any.whl
./artifacts/six-1.16.0-py2.py3-none-any.whl
./artifacts/docopt-0.6.2-py2.py3-none-any.whl
./artifacts/dnspython-1.16.0-py2.py3-none-any.whl
./artifacts/kafka_python-1.4.7-py2.py3-none-any.whl
./artifacts/typing_extensions-3.10.0.0-py3-none-any.whl
./artifacts/python_snappy-0.6.0-cp37-cp37m-manylinux2010_x86_64.whl
./artifacts/importlib_metadata-4.4.0-py3-none-any.whl
./artifacts/dnspython-2.0.0-py3-none-any.whl
./artifacts/urllib3-1.26.3-py2.py3-none-any.whl
./artifacts/zipp-3.4.1-py3-none-any.whl
find . -name '*.whl' -or -name '*.jar' -ls -exec cat {} \;
This should instead be (I didn't expect this either):
find . -name '*.whl' -ls -exec cat {} \; -o -name '*.jar' -ls -exec cat {} \;
find is indeed tricky from time to time; well done on figuring out the issue.
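For anyone puzzled by why the first form skips the wheels: in find, the implicit "and" between adjacent tests and actions binds tighter than -o, so the actions attach only to the -name '*.jar' branch. A minimal sketch of that behaviour follows, assuming a throwaway directory with made-up file names (a.whl, b.jar, c.txt) and using -print in place of -ls -exec cat so the output stays predictable. Grouping the name tests with \( ... \) is an alternative to repeating the actions on each branch.

```shell
#!/bin/sh
set -eu

# Scratch directory with hypothetical file names, removed on exit.
dir=$(mktemp -d)
trap 'rm -rf "$dir"' EXIT
touch "$dir/a.whl" "$dir/b.jar" "$dir/c.txt"

# Ungrouped: parsed as  -name '*.whl'  -o  ( -name '*.jar' -a -print ),
# so the action runs only for the .jar file.
ungrouped=$(cd "$dir" && find . -name '*.whl' -o -name '*.jar' -print | sort)

# Grouped: the action applies to whatever either -name test matches.
grouped=$(cd "$dir" && find . \( -name '*.whl' -o -name '*.jar' \) -print | sort)

echo "ungrouped:"; echo "$ungrouped"
echo "grouped:";   echo "$grouped"
```

With the grouped form, the original command could be written as find . \( -name '*.whl' -o -name '*.jar' \) -ls -exec cat {} \; without repeating the actions.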
I have an alternative that lists the files carrying the filter=fat git attribute. It works for any repo and saves having to enumerate the file extensions to manage (it also handles the hypothetical case of a .gitattributes in a subdirectory that sets the filter for additional file extensions beyond the ones defined in /.gitattributes):
git ls-files | git check-attr --stdin filter|grep -oP '(.*)(?=: filter: fat)'
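A rough way to try the attribute-based listing without touching a real checkout is sketched below. Everything in it is an assumption made up for illustration: the scratch repo, the file names, and the vendor/.gitattributes that adds *.bin. It also swaps grep -oP for sed, since -P is a GNU grep extension that may be missing on other systems; the selection logic is otherwise the same.

```shell
#!/bin/sh
set -eu

# Scratch repository with hypothetical contents, removed on exit.
repo=$(mktemp -d)
trap 'cd / && rm -rf "$repo"' EXIT
cd "$repo"
git init -q .

# Top-level attributes, plus a nested .gitattributes adding another extension.
printf '*.whl filter=fat -text\n' > .gitattributes
mkdir -p artifacts vendor
printf '*.bin filter=fat -text\n' > vendor/.gitattributes

: > artifacts/a.whl
: > vendor/b.bin
: > README
git add .

# check-attr prints "<path>: filter: <value>"; keep only paths whose value is "fat".
fat_files=$(git ls-files | git check-attr --stdin filter | sed -n 's/: filter: fat$//p')
printf '%s\n' "$fat_files"
```

The nested vendor/.gitattributes is picked up without naming *.bin anywhere in the command, which is the case the extension-based find would miss.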
The Search Platform team will just be watching; I don't think we need to do additional work on this. But ping us if needed!
Change 1010249 had a related patch set uploaded (by Ahmon Dancy; author: Ahmon Dancy):
[All-Projects@refs/meta/config] Enable LFS for wikimedia/discovery/analytics
Change 1010249 merged by Ahmon Dancy:
[All-Projects@refs/meta/config] Enable LFS for wikimedia/discovery/analytics
Change 1010297 had a related patch set uploaded (by Ahmon Dancy; author: Ahmon Dancy):
[wikimedia/discovery/analytics@master] Migrate from git-fat to git-lfs
I tried to move this forward today but my change https://gerrit.wikimedia.org/r/c/wikimedia/discovery/analytics/+/1010297 was rejected by CI, saying that the repo has been archived. Does that mean this ticket can be closed?
Hrm, the repo is still active in Gerrit, but I see that the commit that moved it to archived in integration/config is for T346176: Archive wikimedia/discovery/analytics. So it sounds like we just need to archive it in Gerrit too, so folks can't push.
Change 1010297 abandoned by Ahmon Dancy:
[wikimedia/discovery/analytics@master] Migrate from git-fat to git-lfs
Reason:
task declined