
Rebuild models for new revscoring (2.3.0)
Closed, Resolved, Public

Event Timeline


Just posted a work in progress: https://gerrit.wikimedia.org/r/#/c/mediawiki/services/ores/deploy/+/481210/

This includes config updates but not the new models as they are still under review.

I'm trying to finish this patch and ran into what looks like another rewrite glitch:

Fetched in submodule path 'submodules/articlequality', but it did not contain 73e7aa93845023d58032d2cc9cc350aa2ded6131. Direct fetching of that commit failed.

Okay, this one is scarier. The history wasn't rewritten, and the gerrit repo seems to be completely empty. Here's what I see, from the new master back to 73e7aa which has been deployed to production forever. I'll open a task to look at the issue.

* commit 5efaaa898761d5e426c983c54ad6666dc259f507 (HEAD -> master, origin/master, origin/HEAD)
| Author: Aaron Halfaker <aaron.halfaker@gmail.com>
| Date:   Fri Dec 21 15:50:53 2018 -0600
| 
|     Updates for revscoring 2.3.0 (#71)
|     
|     * Updated for revscoring 2.3.0 -- make, features, requirments
|     
|     * Models for revscoring 2.3.0 updates.
|     
|     * Minor flake8 fix
|     
|     * Adds model_info for all models.
|     
|     * Adds model_info calls to the Makefile.
| 
* commit 57edf786636548bed466aa4e9d9e213fe8d1093b (hoo/master)
| Author: Aaron Halfaker <aaron.halfaker@gmail.com>
| Date:   Thu Nov 29 15:21:19 2018 -0600
| 
|     glwiki article quality model (#70)
|     
|     * Adds glwiki to the makefile.
|     
|     * Adds features for glwiki
|     
|     * Adds tuning and model for glwiki.
|     
|     Adds tuning and model for glwiki to LFS.  Also includes basic LFS
|     config so future models will be automatically added to LFS.
|   
| * commit 84e2daf5296b614b8296e04fbf170e51cc4102a5 (hoo/test, test)
|/  Author: Marius Hoch <hoo@online.de>
|   Date:   Mon Dec 17 19:39:16 2018 +0100
|   
|       WIP
|   
| * commit 8809c55f6547d357fad3aa6da736cba6b6d33cc0 (origin/glwiki, glwiki)
| | Author: Aaron Halfaker <ahalfaker@wikimedia.org>
| | Date:   Wed Nov 28 21:34:20 2018 +0000
| | 
| |     Adds tuning and model for glwiki.
| | 
| * commit de50f65923690c67f824b3c27bb0464e52f2772a
| | Author: halfak <aaron.halfaker@gmail.com>
| | Date:   Wed Nov 28 15:09:57 2018 -0600
| | 
| |     Adds features for glwiki
| | 
| * commit a364ff0f866d5ea9be1763609d38653f0f8c4106
|/  Author: halfak <aaron.halfaker@gmail.com>
|   Date:   Wed Nov 28 15:00:12 2018 -0600
|   
|       Adds glwiki to the makefile.
| 
* commit 480c879881d5286453e75e5fcdc83c0619efea9f
| Author: Amir Sarabadani <ladsgroup@gmail.com>
| Date:   Wed Aug 22 16:57:59 2018 +0200
| 
|     Update link to the github repo
|   
*   commit 73e7aa93845023d58032d2cc9cc350aa2ded6131
|\  Merge: e3e224d e9c0568
| | Author: Aaron Halfaker <aaron.halfaker@gmail.com>
| | Date:   Mon Jul 16 11:42:58 2018 -0500
| | 
| |     Merge pull request #63 from wiki-ai/fawiki_wp10
| |     
| |     Start fawiki wp10 model
| | 
....
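For the "did not contain 73e7aa..." failure above, a quick way to tell whether a given checkout actually has the pinned commit is to ask git for the object directly. This is a minimal sketch, not what the deploy tooling does: `commit_exists` and its injectable `run` parameter are hypothetical names for illustration, and it assumes `git` is on the PATH.

```python
import subprocess


def commit_exists(repo_path, sha, run=subprocess.run):
    """Return True if `sha` names a commit object present in the repo.

    `git cat-file -e` exits 0 when the object exists and non-zero
    otherwise; the `^{commit}` suffix also rejects non-commit objects.
    `run` is injectable (hypothetical) so the check can be tested
    without a real repository.
    """
    result = run(
        ["git", "-C", repo_path, "cat-file", "-e", sha + "^{commit}"],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


# Example call against the submodule path from the error message:
# commit_exists("submodules/articlequality",
#               "73e7aa93845023d58032d2cc9cc350aa2ded6131")
```

Running this against a fresh clone of each remote (gerrit, GitHub) would show which of them is missing the commit that the deploy repo pins.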

Ship's log, star date 2019.1.2

We're surrounded by broken git repos. I suspect LFS af, but the main sensors can't penetrate the warp fields emanating from the rupture in the space-time continuum.

This time, it's the missing LFS object error:

Error downloading object: models/enwiki.drafttopic.gradient_boosting.model (145b278): Smudge error: Error downloading models/enwiki.drafttopic.gradient_boosting.model (145b278a923bafcf130835553d2a2663cf951087c070d97910c8f751f71bce7b):

If I try to clone from the Phabricator upstream, I get a slightly different error:

Error downloading object: models/enwiki.drafttopic.gradient_boosting.model (145b278): Smudge error: Error downloading models/enwiki.drafttopic.gradient_boosting.model (145b278a923bafcf130835553d2a2663cf951087c070d97910c8f751f71bce7b): batch response: Authorization error: http://phabricator.wikimedia.org/source/drafttopic.git/info/lfs/objects/batch

Cloning from Github is fine.
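When a smudge fails or is skipped, a model file in the working tree may end up being the small LFS pointer text rather than the real binary. A rough sketch for telling the two apart, based on the Git LFS pointer format (a `version` line followed by `oid` and `size` keys, under 1024 bytes). `parse_lfs_pointer` is a hypothetical helper, not part of git-lfs or ORES.

```python
LFS_SPEC_LINE = "version https://git-lfs.github.com/spec/v1"


def parse_lfs_pointer(data):
    """Return {'oid': ..., 'size': ...} if `data` is an LFS pointer, else None.

    `data` is the raw bytes of a file from the working tree.
    """
    # Pointer files are small text files; real models are large binaries.
    if len(data) >= 1024:
        return None
    try:
        text = data.decode("utf-8")
    except UnicodeDecodeError:
        return None
    lines = text.splitlines()
    if not lines or lines[0] != LFS_SPEC_LINE:
        return None
    # Remaining lines are "key value" pairs separated by a single space.
    fields = dict(line.split(" ", 1) for line in lines[1:] if " " in line)
    if "oid" not in fields or "size" not in fields:
        return None
    try:
        size = int(fields["size"])
    except ValueError:
        return None
    return {"oid": fields["oid"], "size": size}
```

Reading `models/enwiki.drafttopic.gradient_boosting.model` and passing its bytes through this would show whether the checkout holds a pointer for `145b278...` or an actual model.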

Whelp, borrowing @Ladsgroup's workaround, we're closing in: T212544#4841979

Change 481210 had a related patch set uploaded (by Awight; owner: Halfak):
[mediawiki/services/ores/deploy@master] (WIP) Updates for revscoring 2.3.0.

https://gerrit.wikimedia.org/r/481210

Change 482154 had a related patch set uploaded (by Halfak; owner: Halfak):
[research/ores/wheels@master] Updates for revscoring 2.3.0

https://gerrit.wikimedia.org/r/482154

Change 482154 merged by Halfak:
[research/ores/wheels@master] Updates for revscoring 2.3.0

https://gerrit.wikimedia.org/r/482154

https://github.com/wikimedia/ores-wmflabs-deploy/pull/100 is ready for wmflabs.

We're still blocked on fixing the articlequality repo for prod/beta.

I don't see an issue with updating articlequality from gerrit so I have removed the "(WIP)" and todo item. Should be good to go.

Unrelated, I tried to do a deployment to wmflabs and got the following error:

[ores-staging-01.eqiad.wmflabs] out: Error downloading object: models/lvwiki.goodfaith.gradient_boosting.model (b0a5123): Smudge error: Error downloading models/lvwiki.goodfaith.gradient_boosting.model (b0a51232680bddd6079a51ba7ac77ad57a7eb6073e58e19c11442f7ed2738666): batch response: Rate limit exceeded: https://github.com/wiki-ai/editquality.git/info/lfs/objects/batch

Could we be getting rate limited by GitHub? It seems like it. I get a 405 error when I try to access https://github.com/wiki-ai/editquality.git/info/lfs/objects/batch
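One caveat on the 405: in the Git LFS API the batch endpoint is a POST endpoint, so a plain GET (for example from a browser) may return 405 whether or not rate limiting is in play. Below is a sketch of how a download batch request is shaped per the LFS Batch API. It only builds the request and does not send it; `build_lfs_batch_request` is a hypothetical helper, and the `size` value has to come from the pointer file, since it is not shown in the error above.

```python
import json
import urllib.request

LFS_MEDIA_TYPE = "application/vnd.git-lfs+json"


def build_lfs_batch_request(batch_url, oid, size):
    """Build (but do not send) an LFS batch 'download' request."""
    body = {
        "operation": "download",
        "transfers": ["basic"],
        # `oid` is the bare sha256 hex digest; `size` is in bytes.
        "objects": [{"oid": oid, "size": size}],
    }
    return urllib.request.Request(
        batch_url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Accept": LFS_MEDIA_TYPE, "Content-Type": LFS_MEDIA_TYPE},
        method="POST",
    )
```

Sending such a request with `urllib.request.urlopen` and looking at the status code and response body should distinguish a real rate-limit response from the 405 seen on GET.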

Either way, this seems to have left git on ores-staging-02 in a bad state. So I need to manually reconstruct the repo. What a pain!

I just added --force to the git submodule update --init call in fabfile.py, which let me work around the damaged state, but now I'm getting a 500 error. Digging.
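If the rate limiting turns out to be transient, wrapping the forced submodule update in a retry with backoff might avoid leaving the repo half-updated. A minimal sketch using plain `subprocess` rather than Fabric, since the fabfile itself isn't shown here; `update_submodules`, its defaults, and the injectable `run`/`sleep` parameters are all hypothetical.

```python
import subprocess
import time


def update_submodules(repo_path, attempts=3, base_delay=5.0,
                      run=subprocess.run, sleep=time.sleep):
    """Run `git submodule update --init --force`, retrying on failure.

    Waits base_delay, then 2 * base_delay, ... seconds between attempts.
    Returns True on the first successful run, False if all attempts fail.
    """
    cmd = ["git", "-C", repo_path, "submodule", "update", "--init", "--force"]
    for attempt in range(attempts):
        if run(cmd).returncode == 0:
            return True
        if attempt < attempts - 1:
            sleep(base_delay * 2 ** attempt)
    return False
```

The `--force` flag discards local changes in the submodule checkouts, which is what makes it usable on a damaged working tree, so this should only be pointed at a deploy checkout and never at a development clone.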

I'm seeing a lot of this in the logs:

[2019-01-04T06:25:36] statsd_send_metric()/sendto(): Resource temporarily unavailable [plugins/stats_pusher_statsd/plugin.c line 40]

Aha! After a bunch of digging, I found this:

[2019-01-04T15:47:41] Traceback (most recent call last):
[2019-01-04T15:47:41]   File "/srv/ores/config/ores_wsgi.py", line 6, in <module>
[2019-01-04T15:47:41]     application = wsgi.build()
[2019-01-04T15:47:41]   File "./ores/applications/wsgi.py", line 55, in build
[2019-01-04T15:47:41]     return server.configure(config)
[2019-01-04T15:47:41]   File "./ores/wsgi/server.py", line 28, in configure
[2019-01-04T15:47:41]     scoring_system = ScoringSystem.from_config(config, ss_name)
[2019-01-04T15:47:41]   File "./ores/scoring_systems/scoring_system.py", line 337, in from_config
[2019-01-04T15:47:41]     return Class.from_config(config, name)
[2019-01-04T15:47:41]   File "./ores/scoring_systems/celery_queue.py", line 231, in from_config
[2019-01-04T15:47:41]     config, name, section_key=section_key)
[2019-01-04T15:47:41]   File "./ores/scoring_systems/scoring_system.py", line 294, in _kwargs_from_config
[2019-01-04T15:47:41]     config, section['scoring_contexts'])
[2019-01-04T15:47:41]   File "./ores/scoring_context.py", line 207, in map_from_config
[2019-01-04T15:47:41]     scorer_model = Model.from_config(config, key)
[2019-01-04T15:47:41]   File "/srv/ores/venv/lib/python3.5/site-packages/revscoring/scoring/models/model.py", line 131, in from_config
[2019-01-04T15:47:41]     return Class.load(stream)
[2019-01-04T15:47:41]   File "/srv/ores/venv/lib/python3.5/site-packages/revscoring/scoring/models/model.py", line 104, in load
[2019-01-04T15:47:41]     model = pickle.load(f)
[2019-01-04T15:47:41]   File "./editquality/feature_lists/translatewiki.py", line 4, in <module>
[2019-01-04T15:47:41]     import langdetect
[2019-01-04T15:47:41] ImportError: No module named 'langdetect'
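The traceback shows why this was hard to spot: the missing dependency only surfaces when `pickle.load` unpickles a model, which imports `editquality.feature_lists.translatewiki`, which in turn imports `langdetect`. A small pre-deploy check along these lines could catch a missing wheel before uWSGI starts. This is just a sketch: `missing_modules` is a hypothetical helper, and the list of names to check would have to be maintained by hand.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names from `names` that cannot be found.

    Uses importlib.util.find_spec, which locates a module without
    importing it. Only top-level names are supported here; dotted
    names would cause the parent package to be imported.
    """
    return [name for name in names if importlib.util.find_spec(name) is None]


# Example, run inside the deployment virtualenv:
# missing = missing_modules(["revscoring", "langdetect"])
# if missing:
#     raise SystemExit("Missing modules: " + ", ".join(missing))
```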

Change 482319 had a related patch set uploaded (by Halfak; owner: Halfak):
[research/ores/wheels@master] Adds langdetect-1.0.7

https://gerrit.wikimedia.org/r/482319

Change 482319 merged by Halfak:
[research/ores/wheels@master] Adds langdetect-1.0.7

https://gerrit.wikimedia.org/r/482319

Change 481210 merged by Awight:
[mediawiki/services/ores/deploy@master] Updates for revscoring 2.3.0.

https://gerrit.wikimedia.org/r/481210

Mentioned in SAL (#wikimedia-operations) [2019-01-07T21:00:57Z] <awight@deploy1001> Started deploy [ores/deploy@9253beb]: T212530: new ORES models; revscoring 2.3.0

Mentioned in SAL (#wikimedia-operations) [2019-01-07T21:16:24Z] <awight@deploy1001> Finished deploy [ores/deploy@9253beb]: T212530: new ORES models; revscoring 2.3.0 (duration: 15m 28s)