Having added sfn templates to the fawiki features in T314302, the next step is to retrain the fawiki articlequality model so that it performs better when ref tags are changed to sfn templates.
Description
Status | Assigned | Task
---|---|---
Resolved | kevinbazira | T314302 ORES doesn't support fawiki family template
Resolved | kevinbazira | T317531 Retrain fawiki articlequality model
Event Timeline
The training pipeline ran into an Error 127, as shown in the screenshot below:
After some digging, I noticed that this error happens when the pipeline tries to fetch labels from wikilabels. It turns out that wikilabels is down:
Will discuss with the team and see the best way we can move forward.
Thanks to the SREs on the team, wikilabels was respawned and the training pipeline was completed successfully.
The new model performs better than the old one. Here is how I tested this:
import articlequality
from revscoring import Model

# Load the trained model from disk
scorer_model = Model.load(open("fawiki.wp10.gradient_boosting.model", "rb"))

# Excerpt from revid 35130784 (mostly ref tags)
text_35130784 = "کتاب زوسیموس با ترجمهٔ [[لاتین]]، برای نخستین بار توسط یوهانس لونکلاویوس و با همراهی جمعی از مورخان منتشر شد.{{sfn|Leunclavius|1576}} کتابهای اول و دوم او بهزبان یونانی، براساس نسخهٔ لونکلاویوس در سال ۱۵۸۱ توسط استفانوس منتشر شد.<ref>{{پک|Stephanus|1581|ک=Hērōdianou Historiōn Biblia|زبان=en}}</ref> اولین نسخهٔ کامل از اثر زوسیموس بهزبان یونانی، توسط فریدریش زیلبورگ انتشار یافت.<ref>{{پک|Sylburg|1679|ک=Historia nova|زبان=en}}</ref> بعدها نسخهٔ ویراستهٔ سلاریوس با حاشیهنویسی خود او و دیگران در شهر [[تسایتس]] منتشر گردید.<ref>{{پک|Cellarius|1679|ک=Historia nova|زبان=en}}</ref> ویرایش بعدی از اثر زوسیموس توسط یوهان فریدریش رایتهمایر ارائه شد که هرچند از نسخههای خطی تازه بهره نگرفتهبود، ولی از اظهارات انتقادی کریستیان هینه و دیگر پژوهشگران بهخوبی استفادهکرد.<ref>{{پک|Reitemeier|Heyne|1784|ک=Historiae|زبان=en}}</ref> ایمانوئل بکر نسخهای قابل اعتماد از اثر زوسیموس را در سال ۱۸۳۷ در شهر [[بن|بُن]] انتشار داد،<ref>{{پک|Bekker|1837|ک=Zosimus|زبان=en}}</ref> این نسخه مبنای ترجمهٔ آلمانی زیبولد و هیلر<ref>{{پک|Heyler|Seybold|1802|ک=Geschichte des Zosimus|زبان=en}}</ref> و همچنین ترجمههای فرانسوی و انگلیسی بود."
articlequality.score(scorer_model, text_35130784)

# Same excerpt from revid 35130948 (ref tags converted to sfn templates)
text_35130948 = "کتاب زوسیموس با ترجمهٔ [[لاتین]]، برای نخستین بار توسط یوهانس لونکلاویوس و با همراهی جمعی از مورخان منتشر شد.{{sfn|Leunclavius|1576}} کتابهای اول و دوم او بهزبان یونانی، براساس نسخهٔ لونکلاویوس در سال ۱۵۸۱ توسط استفانوس منتشر شد.{{sfn|Stephanus|1581}} اولین نسخهٔ کامل از اثر زوسیموس بهزبان یونانی، توسط فریدریش زیلبورگ انتشار یافت.{{sfn|Sylburg|1679}} بعدها نسخهٔ ویراستهٔ سلاریوس با حاشیهنویسی خود او و دیگران در شهر [[تسایتس]] منتشر گردید.{{sfn|Cellarius|1679}} ویرایش بعدی از اثر زوسیموس توسط یوهان فریدریش رایتهمایر ارائه شد که هرچند از نسخههای خطی تازه بهره نگرفتهبود، ولی از اظهارات انتقادی کریستیان هینه و دیگر پژوهشگران بهخوبی استفادهکرد.{{sfn|Reitemeier|Heyne|1784}} ایمانوئل بکر نسخهای قابل اعتماد از اثر زوسیموس را در سال ۱۸۳۷ در شهر [[بن|بُن]] انتشار داد،{{sfn|Bekker|1837}} این نسخه مبنای ترجمهٔ آلمانی زیبولد و هیلر{{sfn|Heyler|Seybold|1802}} و همچنین ترجمههای فرانسوی و انگلیسی بود."
articlequality.score(scorer_model, text_35130948)
The results:
Input | Old Model | New Model
---|---|---
Text from revid 35130784 with ref tag returns | {'prediction': 'C', 'probability': {'B': 0.4435163263437092, 'C': 0.508963457085965, 'FA': 1.1977367525449404e-06, 'GA': 3.863449235322138e-05, 'Start': 0.04552491449325536, 'Stub': 0.00195546984796453}} | {'prediction': 'Start', 'probability': {'B': 0.03103463681079601, 'C': 0.01721821754538123, 'FA': 6.384165198092277e-06, 'GA': 1.5625649217676394e-05, 'Start': 0.9517184760076839, 'Stub': 6.659821723053249e-06}}
Text from revid 35130948 with sfn template returns | {'prediction': 'B', 'probability': {'B': 0.7429307638247918, 'C': 6.817841016698369e-06, 'FA': 6.280020210598454e-07, 'GA': 2.194884614066426e-08, 'Start': 0.257061178972003, 'Stub': 5.894113211414715e-07}} | {'prediction': 'B', 'probability': {'B': 0.9993340768033717, 'C': 0.0003954072383387585, 'FA': 2.5283058740737492e-05, 'GA': 4.24481661723868e-06, 'Start': 0.0001622044405462296, 'Stub': 7.878364238520293e-05}}
@kevinbazira I see that your results are based on substrings extracted from revision_35130784 and revision_35130948, but I don't think comparing those results is evidence that the issue the community raised in T314302 has been solved.
In my opinion, we should use the full revision text instead of arbitrary paragraphs that contain ref tags and sfn templates. What we expect to see is that the new model's quality scores remain stable between revision_35130784 and revision_35130948, while the old model's quality scores drop drastically from revision_35130784 to revision_35130948 (the ref-to-sfn conversion), because the old model can't recognize "sfn" templates.
A more reasonable way is to test the new model in a docker container (please look at Example_1_-_Testing_enwiki-goodfaith). You can simply put the new model in your local models/ directory, start a container, and query revisions 35130948 and 35130784 in turn. You can also put the old model in the local models/ directory and repeat the same steps to get the old model's predictions, though we know they will be the same as the results from ORES:
https://ores.wikimedia.org/v3/scores/fawiki?models=articlequality&revids=35130784
https://ores.wikimedia.org/v3/scores/fawiki?models=articlequality&revids=35130948
This way we can compare the quality scores obtained from the old model and the new model, and also make sure the new model works fine as an isvc.
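A minimal Python sketch of how the two ORES URLs above could be built and their responses unpacked. The helper names (`ores_url`, `extract_score`, `fetch_score`) are hypothetical, and the sketch assumes the ORES v3 response is nested as wiki, then "scores", then revision ID, then model name, then "score", and that the endpoint is still reachable.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Base of the ORES v3 scores endpoint, taken from the URLs in this thread.
ORES_BASE = "https://ores.wikimedia.org/v3/scores"


def ores_url(wiki, model, rev_id):
    """Build the ORES v3 URL for one model and one revision."""
    query = urlencode({"models": model, "revids": rev_id})
    return f"{ORES_BASE}/{wiki}?{query}"


def extract_score(response, wiki, model, rev_id):
    """Pull the score dict out of a decoded response (assumed layout)."""
    return response[wiki]["scores"][str(rev_id)][model]["score"]


def fetch_score(wiki, model, rev_id):
    """Fetch and unpack one score. Performs a network request."""
    with urlopen(ores_url(wiki, model, rev_id)) as resp:
        return extract_score(json.load(resp), wiki, model, rev_id)
```

Calling `fetch_score("fawiki", "articlequality", 35130784)` and then the same for 35130948 would give the two score dicts to compare against the container results.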
Thank you for the suggestion @achou. I have tested the models using the full revision text and below is the workflow I used.
Built the docker image using the steps mentioned in T322006#8358866 and then ran the inference service:
- Old model.
### made sure old model.bin exists in a "models" directory as this would be the bind mount volume
$ docker run -p 8080:8080 -e INFERENCE_NAME=fawiki-articlequality -e WIKI_URL=https://fa.wikipedia.org --rm -v `pwd`/models:/mnt/models kevinbazira/articlequality-fawiki
### open new tab
$ echo '{ "rev_id": 35130784 }' > input.json
$ curl localhost:8080/v1/models/fawiki-articlequality:predict -i -X POST -d@input.json --header "Content-type: application/json" --header "Accept-Encoding: application/json"
{"fawiki": {"models": {"articlequality": {"version": "0.8.0"}}, "scores": {"35130784": {"articlequality": {"score": {"prediction": "GA", "probability": {"B": 0.005598481542125568, "C": 0.07391991195477102, "FA": 0.0025928406974106847, "GA": 0.9173855317099725, "Start": 0.00019281307821254978, "Stub": 0.000310421017507702}}}}}}}
$ echo '{ "rev_id": 35130948 }' > input.json
$ curl localhost:8080/v1/models/fawiki-articlequality:predict -i -X POST -d@input.json --header "Content-type: application/json" --header "Accept-Encoding: application/json"
{"fawiki": {"models": {"articlequality": {"version": "0.8.0"}}, "scores": {"35130948": {"articlequality": {"score": {"prediction": "C", "probability": {"B": 0.0006751337116796307, "C": 0.9257694534416373, "FA": 0.0008941132199661709, "GA": 0.039574031101213734, "Start": 0.00010995899268381878, "Stub": 0.03297730953281936}}}}}}}
- New model.
### moved new model.bin to bind mount volume then run the same steps as mentioned above and got the results below:
### "rev_id": 35130784
{"fawiki": {"models": {"articlequality": {"version": "0.9.0"}}, "scores": {"35130784": {"articlequality": {"score": {"prediction": "C", "probability": {"B": 0.02355581843869042, "C": 0.8159629659839032, "FA": 0.0299107760372183, "GA": 0.12797507831238739, "Start": 0.0017379622969669738, "Stub": 0.0008573989308341193}}}}}}}
### "rev_id": 35130948
{"fawiki": {"models": {"articlequality": {"version": "0.9.0"}}, "scores": {"35130948": {"articlequality": {"score": {"prediction": "C", "probability": {"B": 0.0007451703396024546, "C": 0.9921960326428709, "FA": 0.00011850338813056061, "GA": 0.006851237426350331, "Start": 5.1592171143097236e-05, "Stub": 3.7464031902595555e-05}}}}}}}
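For repeated runs, the curl calls above could be scripted in Python. This is only a sketch using the standard library: the function names `build_predict_request` and `predict` are hypothetical, and the host, port, and model name are simply the values from the docker command in this comment.

```python
import json
from urllib.request import Request, urlopen


def build_predict_request(rev_id,
                          model_name="fawiki-articlequality",
                          host="http://localhost:8080"):
    """Build the POST request that the curl command sends."""
    url = f"{host}/v1/models/{model_name}:predict"
    body = json.dumps({"rev_id": rev_id}).encode("utf-8")
    return Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(rev_id, **kwargs):
    """Send the request to the running container and decode the JSON reply."""
    with urlopen(build_predict_request(rev_id, **kwargs)) as resp:
        return json.load(resp)
```

With the container running, `predict(35130784)` and `predict(35130948)` would return the same JSON documents as the two curl calls; swapping model.bin in the bind mount and restarting the container switches between the old and new model.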
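In conclusion, the new model still performs better than the old one: it predicts C for both revisions, whereas the old model's prediction drops from GA to C after the ref-to-sfn conversion.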
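One way to put a number on "stable" versus "drops" is to collapse each probability distribution into a single weighted score. The sketch below does that with the probabilities reported above. The ordinal weights (Stub=0 through FA=5) are an assumption made for illustration and are not something the models or this task define.

```python
# Class probabilities copied from the container responses reported above.
OLD = {
    35130784: {"B": 0.005598481542125568, "C": 0.07391991195477102,
               "FA": 0.0025928406974106847, "GA": 0.9173855317099725,
               "Start": 0.00019281307821254978, "Stub": 0.000310421017507702},
    35130948: {"B": 0.0006751337116796307, "C": 0.9257694534416373,
               "FA": 0.0008941132199661709, "GA": 0.039574031101213734,
               "Start": 0.00010995899268381878, "Stub": 0.03297730953281936},
}
NEW = {
    35130784: {"B": 0.02355581843869042, "C": 0.8159629659839032,
               "FA": 0.0299107760372183, "GA": 0.12797507831238739,
               "Start": 0.0017379622969669738, "Stub": 0.0008573989308341193},
    35130948: {"B": 0.0007451703396024546, "C": 0.9921960326428709,
               "FA": 0.00011850338813056061, "GA": 0.006851237426350331,
               "Start": 5.1592171143097236e-05, "Stub": 3.7464031902595555e-05},
}

# Assumption: treat the quality classes as an ordinal scale.
WEIGHTS = {"Stub": 0, "Start": 1, "C": 2, "B": 3, "GA": 4, "FA": 5}


def weighted_score(probability):
    """Expected class index under the assumed ordinal weights."""
    return sum(WEIGHTS[label] * p for label, p in probability.items())


def top_class(probability):
    """Label with the highest probability."""
    return max(probability, key=probability.get)


def shift(scores, before, after):
    """Change in weighted score from one revision to the next."""
    return weighted_score(scores[after]) - weighted_score(scores[before])


for name, scores in (("old", OLD), ("new", NEW)):
    print(name,
          top_class(scores[35130784]), "->", top_class(scores[35130948]),
          "shift:", round(shift(scores, 35130784, 35130948), 3))
```

Under these assumed weights, the old model's score falls by roughly 1.8 across the ref-to-sfn conversion, while the new model's falls by roughly 0.35.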
@kevinbazira thanks for working on this. I can see that the quality predictions of the new model remain at the C level for both revisions, indicating that the model takes into account both ref tags and sfn templates. That's great! :) Although I'm a bit surprised that the predicted quality is not GA, I think it might be because the new sfn features also affect other related features, such as the proportion of references in the article, so never mind.
One little thing that was very confusing when looking at your results is that the model name is "editquality" (also in your docker image). It's a typo, right? The model we're fixing is the articlequality model.