This task is tracking work already in progress. We're considering a workaround (successfully demonstrated in the beta cluster) where scap won't rewrite our submodule URLs, to avoid an idiosyncrasy in git-lfs. Unfortunately, our submodules are hosted on Phabricator for historical reasons, and those repos aren't accessible from deployment targets. Therefore, we're creating additional git mirrors in Gerrit which will be accessible during deployment.
|Resolved||awight||T187217 [Epic] Support word2vec for production ORES models|
|Resolved||awight||T188446 Package word2vec binaries|
|Resolved||Halfak||T171619 [Epic] ORES should use a git large file plugin for storing serialized binaries|
|Resolved||awight||T181678 Plan migration of ORES repos to git-lfs|
|Resolved||None||T176324 Scoring platform team FY18 Q2|
|Resolved||Halfak||T183198 Scoring Platform FY18 Q3|
|Resolved||awight||T176336 Deploy drafttopic model to production ORES|
|Resolved||mmodell||T180627 Support git-lfs in scap|
|Resolved||mmodell||T192042 Create gerrit mirrors for all github-based ORES repos|
After migrating the GitHub repos to git-lfs, resetting the submodule to origin/master now fails with this error:
amsa@C235:~/ores-prod-deploy/submodules/articlequality$ git reset --hard origin/master
Downloading models/enwiki.nettrom_wp10.gradient_boosting.model (60 MB)
Error downloading object: models/enwiki.nettrom_wp10.gradient_boosting.model (9b52295): Smudge error: Error downloading models/enwiki.nettrom_wp10.gradient_boosting.model (9b52295b6bd398ab72f2a6bdace7450672f47d502b893dd8d3b333bfd8077e37): [9b52295b6bd398ab72f2a6bdace7450672f47d502b893dd8d3b333bfd8077e37] Object '9b52295b6bd398ab72f2a6bdace7450672f47d502b893dd8d3b333bfd8077e37' not found:  Object '9b52295b6bd398ab72f2a6bdace7450672f47d502b893dd8d3b333bfd8077e37' not found
It's great we've gotten this far; just one more nudge and LFS for all models will be in production \o/
Yeah, the thing is that the whole history has been rewritten, so the commit hashes are different and your local clone now thinks the branches have diverged. Nothing to worry about: you basically need to throw everything away and build it again :D. I think I've got a solution for this :P
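For anyone hitting the same divergence, here is a minimal sketch of what is going on and how a stale clone can be brought in line with the rewritten history. It runs entirely in a temporary directory; the `upstream` and `checkout` repository names and the single amended commit are assumptions made for illustration and stand in for the real repos and the real LFS rewrite.

```shell
#!/bin/sh
set -eu

# Throwaway sandbox; all paths and repository names here are hypothetical.
work=$(mktemp -d)
cd "$work"

# Inline identity so commits succeed without any global git configuration.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.org
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.org

# "upstream" stands in for the hosted repository.
git init -q upstream
git -C upstream symbolic-ref HEAD refs/heads/master
echo "model v1" > upstream/model.txt
git -C upstream add model.txt
git -C upstream commit -q -m "Add model"

# "checkout" stands in for an existing clone made before the migration.
git clone -q upstream checkout

# Simulate the migration: rewriting the commit gives it a new hash.
git -C upstream commit -q --amend -m "Add model (rewritten)"

cd checkout
git fetch -q origin

# Two counts: commits only on local master, commits only on origin/master.
divergence=$(git rev-list --left-right --count master...origin/master)
echo "before reset: $divergence"

# Throw away the stale local history and adopt the rewritten one.
git reset -q --hard origin/master
after=$(git rev-list --left-right --count master...origin/master)
echo "after reset: $after"
```

Both counts being non-zero before the reset is the same situation `git status` reports as "have diverged". Note that the reset only fixes the history mismatch; it does not help if the LFS objects themselves cannot be downloaded.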
I get an error on the same object:
$ git status
On branch master
Your branch and 'origin/master' have diverged,
and have 251 and 251 different commits each, respectively.
(use "git pull" to merge the remote branch into yours)
nothing to commit, working tree clean
$ git checkout origin/master
Downloading models/enwiki.nettrom_wp10.gradient_boosting.model (60 MB)
Error downloading object: models/enwiki.nettrom_wp10.gradient_boosting.model (9b52295): Smudge error: Error downloading models/enwiki.nettrom_wp10.gradient_boosting.model (9b52295b6bd398ab72f2a6bdace7450672f47d502b893dd8d3b333bfd8077e37): batch response: Authorization error: https://phabricator.wikimedia.org/source/wikiclass.git/info/lfs/objects/batch
Check that you have proper access to the repository
Errors logged to /X/ores-prod-deploy/.git/modules/submodules/articlequality/lfs/objects/logs/20180820T142630.554616.log
Use `git lfs logs last` to view the log.
error: external filter 'git-lfs filter-process' failed
fatal: models/enwiki.nettrom_wp10.gradient_boosting.model: smudge filter lfs failed
In the future it seems prudent to do this sort of rewrite into a named branch, but I guess that's not part of the LFS migration recommendations.
Thank you for that! I'm thinking of something else though: you could do the entire migration as a dry run, and only rename the result over to the master branch once we've proven that things like continuous integration still work. Luckily, I think it's fine for this repo because we never need it "bare"; it's only a submodule in production, so we can simply skip over any corrupted commits.
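The dry-run idea could look roughly like the sketch below. It is only an illustration under assumptions: the branch names `lfs-migration` and `pre-lfs-master` are hypothetical, and the actual rewrite is replaced by a simple amend so the example runs without git-lfs installed. In a real migration that step would be the LFS rewrite itself (for example something along the lines of `git lfs migrate import --include="models/*"`).

```shell
#!/bin/sh
set -eu

# Throwaway sandbox; repository and branch names are hypothetical.
work=$(mktemp -d)
cd "$work"

export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.org
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.org

git init -q repo
cd repo
git symbolic-ref HEAD refs/heads/master
echo "model v1" > model.txt
git add model.txt
git commit -q -m "Add model"
original=$(git rev-parse master)

# Do the history rewrite on a named branch instead of on master.
git checkout -q -b lfs-migration
# Placeholder for the real rewrite step (assumed to be an LFS migration).
git commit -q --amend -m "Add model (rewritten for LFS)"

# master is untouched while CI and test deployments run against lfs-migration.
master_during_dry_run=$(git rev-parse master)

# Once the dry run is proven, keep a backup and rename the branch over master.
git branch pre-lfs-master master
git branch -M lfs-migration master

echo "old master kept at:  $(git rev-parse pre-lfs-master)"
echo "new master is now:   $(git rev-parse master)"
```

Keeping the old history under a backup branch means existing clones and CI jobs can still be compared against it if something turns out to be broken after the switch.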
The phabricator repos themselves are broken:
amsa@C235:~$ git clone http://phabricator.wikimedia.org/source/editquality.git editquality-phab
Cloning into 'editquality-phab'...
warning: redirecting to https://phabricator.wikimedia.org/source/editquality.git/
remote: Counting objects: 3744, done.
remote: Compressing objects: 100% (2644/2644), done.
remote: Total 3744 (delta 1909), reused 2716 (delta 1054)
Receiving objects: 100% (3744/3744), 3.77 MiB | 6.34 MiB/s, done.
Resolving deltas: 100% (1909/1909), done.
Downloading models/arwiki.damaging.gradient_boosting.model (378 KB)
Error downloading object: models/arwiki.damaging.gradient_boosting.model (c1f9cb0): Smudge error: Error downloading models/arwiki.damaging.gradient_boosting.model (c1f9cb02bac12da3a421ee146c34a9fe7fa89a4993cd471d2f9bbd0ce977719d): batch response: Authorization error: http://phabricator.wikimedia.org/source/editquality.git/info/lfs/objects/batch
Check that you have proper access to the repository
Errors logged to /home/amsa/editquality-phab/.git/lfs/logs/20180906T173459.087885996.log
Use `git lfs logs last` to view the log.
error: external filter 'git-lfs filter-process' failed
fatal: models/arwiki.damaging.gradient_boosting.model: smudge filter lfs failed
warning: Clone succeeded, but checkout failed.
You can inspect what was checked out with 'git status'
and retry the checkout with 'git checkout -f HEAD'
Phabricator doesn't have proper git-lfs support, and I've been told not to put any resources into Phabricator's git hosting infrastructure. I want to express my exasperation at this situation; I really don't like it at all. Unfortunately it simply isn't a priority, and Gerrit is apparently our only officially supported git service for now.
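Given that, pointing the deploy repo's submodules at the Gerrit mirrors might look something like the sketch below. The submodule path matches the one seen in the logs above, but both URLs are placeholders on example.org and are assumptions, not the real Phabricator or Gerrit locations.

```shell
#!/bin/sh
set -eu

# Throwaway sandbox standing in for the deploy repository.
work=$(mktemp -d)
cd "$work"
git init -q deploy
cd deploy

# Hypothetical submodule entry as it might look before the change.
name="submodules/articlequality"
git config -f .gitmodules "submodule.$name.path" "$name"
git config -f .gitmodules "submodule.$name.url" \
    "https://phabricator.example.org/source/articlequality.git"

# Point the submodule at the mirror instead (placeholder URL).
git config -f .gitmodules "submodule.$name.url" \
    "https://gerrit.example.org/r/mirror/articlequality"

# In an existing checkout, `git submodule sync` would then copy the new URL
# from .gitmodules into the local configuration of each submodule.
url=$(git config -f .gitmodules --get "submodule.$name.url")
echo "submodule URL is now: $url"
```

Because git-lfs derives its default endpoint from the remote URL, changing the submodule URL also changes where LFS objects are fetched from, which is the point of hosting the mirrors somewhere deployment targets can reach.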