
AI/ML Infrastructure Request: Assistance in Rolling out Revert Risk to wikis that don't have damaging/goodfaith models
Closed, Resolved (Public)

Description

Please respond to the following questions, and provide as much detail as possible for each.

  • Problem: What problem are you facing that could be resolved or mitigated with infrastructural improvements?

To make the Recent Activity module in the Moderator Dashboard, which displays edits filtered by their revert risk score, widely available (T408388), we'd like the Machine Learning team to help deploy the revert risk model to the wikis in this list (P84306) that don't have goodfaith/damaging models, as it did a quarter ago (T348298).

  • [Optional] Possible solutions: What infrastructural improvement(s) would most meaningfully help you with this problem? Feel free to suggest multiple ideas.

As discussed on Slack, Machine Learning could help by running an analysis to derive thresholds for these wikis, in a similar way to idwiki, and adding them to the MediaWiki config. Moderator Tools would then be able to turn on the model and run a script to backfill scores for edits when deploying to those wikis.
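A minimal sketch of what such a threshold analysis could look like, assuming each sampled edit has a revert risk score and a "was reverted" label. The function names and the toy data are hypothetical, and the rule used here (lowest observed score whose false positive rate stays at or below a target, 15% in the results reported later in this task) is an assumption, not necessarily the method the Machine Learning team uses.

```python
import bisect
from typing import Sequence, Tuple


def threshold_for_target_fpr(
    scores: Sequence[float],
    reverted: Sequence[bool],
    target_fpr: float = 0.15,
) -> float:
    """Lowest observed score t such that flagging `score >= t` keeps the
    false positive rate on non-reverted edits at or below target_fpr."""
    negatives = sorted(s for s, r in zip(scores, reverted) if not r)
    if not negatives:
        raise ValueError("need at least one non-reverted edit")
    for t in sorted(set(scores)):
        # Non-reverted edits that would be flagged at threshold t.
        false_positives = len(negatives) - bisect.bisect_left(negatives, t)
        if false_positives / len(negatives) <= target_fpr:
            return t
    raise ValueError("no observed score meets the target FPR")


def confusion_counts(
    scores: Sequence[float], reverted: Sequence[bool], threshold: float
) -> Tuple[int, int, int, int]:
    """Return (tn, fp, fn, tp) when edits with score >= threshold are flagged."""
    tn = fp = fn = tp = 0
    for s, r in zip(scores, reverted):
        flagged = s >= threshold
        if r and flagged:
            tp += 1
        elif r:
            fn += 1
        elif flagged:
            fp += 1
        else:
            tn += 1
    return tn, fp, fn, tp


# Toy data (hypothetical): 10 non-reverted edits and 3 reverted edits.
negative_scores = [0.05, 0.10, 0.20, 0.25, 0.30, 0.40, 0.45, 0.55, 0.60, 0.80]
positive_scores = [0.35, 0.70, 0.90]
scores = negative_scores + positive_scores
reverted = [False] * len(negative_scores) + [True] * len(positive_scores)

threshold = threshold_for_target_fpr(scores, reverted, target_fpr=0.15)
print("threshold:", threshold)
print("tn, fp, fn, tp:", confusion_counts(scores, reverted, threshold))
```

This sketch treats the target as a strict upper bound. An analysis that instead picks the ROC point closest to the target could land marginally above it.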

  • Enabled projects: Which specific user-facing features or experiments would be unblocked or meaningfully enabled (in terms of development ease, velocity, etc.) by solving this problem? Which teams are launching these features or experiments?

The Recent Activity module on the Moderator Dashboard (PersonalDashboard).

  • Urgency and importance: When are these features or experiments expected to launch? How essential is this infrastructure for unblocking development?

The Moderator Dashboard is expected to launch by the end of November, so getting the thresholds before then would be ideal for the hypothesis.

  • [Optional] Notes: Is there anything else you'd like to share?

The main rollout work is being tracked in T408388.

Event Timeline

Update

✅ The RevertRisk-Threshold-Analysis has finished running for all the wikis listed here: https://phabricator.wikimedia.org/P84306.

✅ In the paste below you can find the results per wiki, which include: statistics for each sample, sample size, optimal threshold for achieving 15% FPR, and the confusion matrix.
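Each wiki's reported false positive rate can be re-derived from its confusion matrix. The sketch below does this for the dewiki counts in the paste. It assumes rows are actual labels and columns are predictions, and the class and function names are hypothetical.

```python
from typing import NamedTuple


class ConfusionCounts(NamedTuple):
    tn: int  # actual not reverted, predicted not reverted
    fp: int  # actual not reverted, predicted reverted
    fn: int  # actual reverted, predicted not reverted
    tp: int  # actual reverted, predicted reverted


def false_positive_rate(c: ConfusionCounts) -> float:
    """Share of non-reverted edits that the threshold flags."""
    return c.fp / (c.fp + c.tn)


def true_positive_rate(c: ConfusionCounts) -> float:
    """Share of reverted edits that the threshold flags."""
    return c.tp / (c.tp + c.fn)


# dewiki counts copied from the results paste.
dewiki = ConfusionCounts(tn=2759048, fp=486875, fn=26904, tp=108265)

print(f"dewiki edits in matrix: {sum(dewiki)}")
print(f"dewiki FPR: {false_positive_rate(dewiki):.6f}")
print(f"dewiki TPR: {true_positive_rate(dewiki):.6f}")
```

The four counts sum to 3381092, the dewiki sample size after reverts are removed, and the FPR agrees with the value logged for dewiki.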

✅ The threshold analysis ROC plots can be found in this ROC_DriveFolder (public for WMF members).

✅ The Results are updated in the following paste:

============ - dewiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (3584827, 17)
 - Duplicate rows found and removed: 102417
 - Clean data shape: (3482410, 17)
 - Unique revision_ids: 3482410 | Data Shape: 3482410 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (3381092, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_dewiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.5600801110267639
 - confusion_matrix_dewiki.png saved!
 - False Positive Rate is: 0.14999585634039994
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted       2759048    486875
reverted             26904    108265

============ - jawiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (2175370, 17)
 - Duplicate rows found and removed: 66848
 - Clean data shape: (2108522, 17)
 - Unique revision_ids: 2108522 | Data Shape: 2108522 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (2072019, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_jawiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.7807799577713013
 - confusion_matrix_jawiki.png saved!
 - False Positive Rate is: 0.1500011156591438
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted       1714231    302514
reverted             24482     30792

============ - viwiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (423602, 17)
 - Duplicate rows found and removed: 16144
 - Clean data shape: (407458, 17)
 - Unique revision_ids: 407458 | Data Shape: 407458 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (392368, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_viwiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.672927737236023
 - confusion_matrix_viwiki.png saved!
 - False Positive Rate is: 0.1500055161947405
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted        315886     55747
reverted              3017     17718

============ - thwiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (355836, 17)
 - Duplicate rows found and removed: 7081
 - Clean data shape: (348755, 17)
 - Unique revision_ids: 348755 | Data Shape: 348755 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (335084, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_thwiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.7088555693626404
 - confusion_matrix_thwiki.png saved!
 - False Positive Rate is: 0.14999475309329
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted        267302     47169
reverted              2915     17698

============ - nowiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (219267, 17)
 - Duplicate rows found and removed: 4787
 - Clean data shape: (214480, 17)
 - Unique revision_ids: 214480 | Data Shape: 214480 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (210459, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_nowiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.567750096321106
 - confusion_matrix_nowiki.png saved!
 - False Positive Rate is: 0.14995173769563921
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted        168205     29672
reverted               992     11590

============ - elwiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (214446, 17)
 - Duplicate rows found and removed: 12313
 - Clean data shape: (202133, 17)
 - Unique revision_ids: 202133 | Data Shape: 202133 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (196606, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_elwiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.8070309162139893
 - confusion_matrix_elwiki.png saved!
 - False Positive Rate is: 0.15001066465405502
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted        155418     27429
reverted              3345     10414

============ - hywiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (98459, 17)
 - Duplicate rows found and removed: 1737
 - Clean data shape: (96722, 17)
 - Unique revision_ids: 96722 | Data Shape: 96722 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (95113, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_hywiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.6658406257629395
 - confusion_matrix_hywiki.png saved!
 - False Positive Rate is: 0.14995779494837996
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted         78549     13857
reverted               487      2220

============ - hiwiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (66121, 17)
 - Duplicate rows found and removed: 1266
 - Clean data shape: (64855, 17)
 - Unique revision_ids: 64855 | Data Shape: 64855 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (62611, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_hiwiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.8786675930023193
 - confusion_matrix_hiwiki.png saved!
 - False Positive Rate is: 0.14998627504803733
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted         46449      8196
reverted              3445      4521

============ - bgwiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (132472, 17)
 - Duplicate rows found and removed: 3490
 - Clean data shape: (128982, 17)
 - Unique revision_ids: 128982 | Data Shape: 128982 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (123505, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_bgwiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.8672560453414917
 - confusion_matrix_bgwiki.png saved!
 - False Positive Rate is: 0.1500045516613564
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted         93372     16478
reverted              3266     10389

============ - dawiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (89244, 17)
 - Duplicate rows found and removed: 939
 - Clean data shape: (88305, 17)
 - Unique revision_ids: 88305 | Data Shape: 88305 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (87250, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_dawiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.7409616112709045
 - confusion_matrix_dawiki.png saved!
 - False Positive Rate is: 0.14990577351379133
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted         69468     12250
reverted               576      4956

============ - hrwiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (92623, 17)
 - Duplicate rows found and removed: 2677
 - Clean data shape: (89946, 17)
 - Unique revision_ids: 89946 | Data Shape: 89946 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (85694, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_hrwiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.5204998850822449
 - confusion_matrix_hrwiki.png saved!
 - False Positive Rate is: 0.15007170661749641
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted         66376     11720
reverted               635      6963

============ - skwiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (72376, 17)
 - Duplicate rows found and removed: 2844
 - Clean data shape: (69532, 17)
 - Unique revision_ids: 69532 | Data Shape: 69532 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (65873, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_skwiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.8890475034713745
 - confusion_matrix_skwiki.png saved!
 - False Positive Rate is: 0.15
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted         49963      8817
reverted              1975      5118

============ - mswiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (126531, 17)
 - Duplicate rows found and removed: 3069
 - Clean data shape: (123462, 17)
 - Unique revision_ids: 123462 | Data Shape: 123462 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (116344, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_mswiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.9427589774131775
 - confusion_matrix_mswiki.png saved!
 - False Positive Rate is: 0.149979524979525
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted         91333     16115
reverted              6203      2693

============ - euwiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (109812, 17)
 - Duplicate rows found and removed: 367
 - Clean data shape: (109445, 17)
 - Unique revision_ids: 109445 | Data Shape: 109445 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (108889, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_euwiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.4504416882991791
 - confusion_matrix_euwiki.png saved!
 - False Positive Rate is: 0.14992396119659573
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted         89995     15872
reverted               400      2622

============ - slwiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (70752, 17)
 - Duplicate rows found and removed: 216
 - Clean data shape: (70536, 17)
 - Unique revision_ids: 70536 | Data Shape: 70536 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (69708, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_slwiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.8655992746353149
 - confusion_matrix_slwiki.png saved!
 - False Positive Rate is: 0.14985160946655507
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted         55859      9846
reverted               913      3090

============ - ltwiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (110380, 17)
 - Duplicate rows found and removed: 350
 - Clean data shape: (110030, 17)
 - Unique revision_ids: 110030 | Data Shape: 110030 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (108931, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_ltwiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.3519546389579773
 - confusion_matrix_ltwiki.png saved!
 - False Positive Rate is: 0.14997548477652692
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted         88417     15600
reverted               132      4782

============ - tawiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (68969, 17)
 - Duplicate rows found and removed: 125
 - Clean data shape: (68844, 17)
 - Unique revision_ids: 68844 | Data Shape: 68844 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (68354, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_tawiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.6359190344810486
 - confusion_matrix_tawiki.png saved!
 - False Positive Rate is: 0.14996949359365466
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted         55728      9832
reverted               331      2463

============ - zh_yuewiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (46976, 17)
 - Duplicate rows found and removed: 1303
 - Clean data shape: (45673, 17)
 - Unique revision_ids: 45673 | Data Shape: 45673 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (44725, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_zh_yuewiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.8573755621910095
 - confusion_matrix_zh_yuewiki.png saved!
 - False Positive Rate is: 0.15008259079170835
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted         36532      6451
reverted              1049       693

============ - kawiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (62524, 17)
 - Duplicate rows found and removed: 438
 - Clean data shape: (62086, 17)
 - Unique revision_ids: 62086 | Data Shape: 62086 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (60871, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_kawiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.4605609178543091
 - confusion_matrix_kawiki.png saved!
 - False Positive Rate is: 0.14997275389959813
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted         49917      8807
reverted               484      1663

============ - eowiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (99614, 17)
 - Duplicate rows found and removed: 523
 - Clean data shape: (99091, 17)
 - Unique revision_ids: 99091 | Data Shape: 99091 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (98754, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_eowiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.34750446677207947
 - confusion_matrix_eowiki.png saved!
 - False Positive Rate is: 0.15014543042302395
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted         83273     14712
reverted               247       522

============ - glwiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (131204, 17)
 - Duplicate rows found and removed: 381
 - Clean data shape: (130823, 17)
 - Unique revision_ids: 130823 | Data Shape: 130823 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (129653, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_glwiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.2695479691028595
 - confusion_matrix_glwiki.png saved!
 - False Positive Rate is: 0.1500246828450309
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted        108473     19146
reverted               203      1831

============ - urwiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (809991, 17)
 - Duplicate rows found and removed: 703
 - Clean data shape: (809288, 17)
 - Unique revision_ids: 809288 | Data Shape: 809288 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (808440, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_urwiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.26034781336784363
 - confusion_matrix_urwiki.png saved!
 - False Positive Rate is: 0.15495951241396233
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted        681875    125039
reverted               307      1219

============ - sqwiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (44462, 17)
 - Duplicate rows found and removed: 1311
 - Clean data shape: (43151, 17)
 - Unique revision_ids: 43151 | Data Shape: 43151 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (41855, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_sqwiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.5832709074020386
 - confusion_matrix_sqwiki.png saved!
 - False Positive Rate is: 0.15289276562191997
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted         34379      6205
reverted               137      1134

============ - mywiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (13635, 17)
 - Duplicate rows found and removed: 130
 - Clean data shape: (13505, 17)
 - Unique revision_ids: 13505 | Data Shape: 13505 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (13104, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_mywiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.8719736337661743
 - confusion_matrix_mywiki.png saved!
 - False Positive Rate is: 0.14975177016358754
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted         10447      1840
reverted               273       544

============ - ckbwiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (54359, 17)
 - Duplicate rows found and removed: 323
 - Clean data shape: (54036, 17)
 - Unique revision_ids: 54036 | Data Shape: 54036 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (53591, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_ckbwiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.2797521650791168
 - confusion_matrix_ckbwiki.png saved!
 - False Positive Rate is: 0.1489317372528222
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted         44933      7863
reverted                71       724

============ - knwiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (23237, 17)
 - Duplicate rows found and removed: 30
 - Clean data shape: (23207, 17)
 - Unique revision_ids: 23207 | Data Shape: 23207 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (23131, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_knwiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.44861292839050293
 - confusion_matrix_knwiki.png saved!
 - False Positive Rate is: 0.1506315089740749
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted         19166      3399
reverted                63       503

============ - shwiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (184833, 17)
 - Duplicate rows found and removed: 606
 - Clean data shape: (184227, 17)
 - Unique revision_ids: 184227 | Data Shape: 184227 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (183350, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_shwiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.11731643974781036
 - confusion_matrix_shwiki.png saved!
 - False Positive Rate is: 0.1495637631888746
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted        154594     27188
reverted                23      1545

============ - uzwiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (126041, 17)
 - Duplicate rows found and removed: 1545
 - Clean data shape: (124496, 17)
 - Unique revision_ids: 124496 | Data Shape: 124496 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (122319, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_uzwiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.5346587300300598
 - confusion_matrix_uzwiki.png saved!
 - False Positive Rate is: 0.14997637370055353
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted        100738     17774
reverted               331      3476

============ - cebwiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (173433, 17)
 - Duplicate rows found and removed: 96
 - Clean data shape: (173337, 17)
 - Unique revision_ids: 173337 | Data Shape: 173337 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (173122, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_cebwiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.010278036817908287
 - confusion_matrix_cebwiki.png saved!
 - False Positive Rate is: 0.15000752410607832
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted        146860     25918
reverted                 0       344

============ - be_x_oldwiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (16161, 17)
 - Duplicate rows found and removed: 463
 - Clean data shape: (15698, 17)
 - Unique revision_ids: 15698 | Data Shape: 15698 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (15541, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_be_x_oldwiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.6138502955436707
 - confusion_matrix_be_x_oldwiki.png saved!
 - False Positive Rate is: 0.0797570056829316
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted         14088      1221
reverted                59       173

============ - aswiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (52428, 17)
 - Duplicate rows found and removed: 73
 - Clean data shape: (52355, 17)
 - Unique revision_ids: 52355 | Data Shape: 52355 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (52199, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_aswiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.25013989210128784
 - confusion_matrix_aswiki.png saved!
 - False Positive Rate is: 0.14994291022390804
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted         43925      7748
reverted               288       238

============ - newiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (9322, 17)
 - Duplicate rows found and removed: 34
 - Clean data shape: (9288, 17)
 - Unique revision_ids: 9288 | Data Shape: 9288 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (9186, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_newiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.8242648243904114
 - confusion_matrix_newiki.png saved!
 - False Positive Rate is: 0.15088920799734895
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted          7687      1366
reverted                52        81

============ - gawiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (14898, 17)
 - Duplicate rows found and removed: 28
 - Clean data shape: (14870, 17)
 - Unique revision_ids: 14870 | Data Shape: 14870 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (14762, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_gawiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.4082460403442383
 - confusion_matrix_gawiki.png saved!
 - False Positive Rate is: 0.1525214408233276
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted         12352      2223
reverted                23       164

============ - kuwiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (35850, 17)
 - Duplicate rows found and removed: 934
 - Clean data shape: (34916, 17)
 - Unique revision_ids: 34916 | Data Shape: 34916 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (34053, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_kuwiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.34288907051086426
 - confusion_matrix_kuwiki.png saved!
 - False Positive Rate is: 0.1499506538366642
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted         27562      4862
reverted               297      1332

============ - scowiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (3887, 17)
 - Duplicate rows found and removed: 11
 - Clean data shape: (3876, 17)
 - Unique revision_ids: 3876 | Data Shape: 3876 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (3743, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_scowiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.8366231918334961
 - confusion_matrix_scowiki.png saved!
 - False Positive Rate is: 0.150074294205052
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted          2860       505
reverted               140       238

============ - arzwiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (1117144, 17)
 - Duplicate rows found and removed: 931
 - Clean data shape: (1116213, 17)
 - Unique revision_ids: 1116213 | Data Shape: 1116213 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (1114143, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_arzwiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.07260608673095703
 - confusion_matrix_arzwiki.png saved!
 - False Positive Rate is: 0.14979367968963228
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted        943871    166296
reverted              1197      2779

============ - bawiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (3383, 17)
 - Duplicate rows found and removed: 26
 - Clean data shape: (3357, 17)
 - Unique revision_ids: 3357 | Data Shape: 3357 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (3273, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_bawiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.6595543622970581
 - confusion_matrix_bawiki.png saved!
 - False Positive Rate is: 0.15389447236180903
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted          2694       490
reverted                23        66

============ - ttwiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (16375, 17)
 - Duplicate rows found and removed: 151
 - Clean data shape: (16224, 17)
 - Unique revision_ids: 16224 | Data Shape: 16224 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (16051, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_ttwiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.3884022533893585
 - confusion_matrix_ttwiki.png saved!
 - False Positive Rate is: 0.15
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted         13379      2361
reverted                30       281

============ - astwiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (54583, 17)
 - Duplicate rows found and removed: 61
 - Clean data shape: (54522, 17)
 - Unique revision_ids: 54522 | Data Shape: 54522 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (54352, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_astwiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.14954642951488495
 - confusion_matrix_astwiki.png saved!
 - False Positive Rate is: 0.1501871055424199
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted         45646      8067
reverted               108       531

============ - jvwiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (24642, 17)
 - Duplicate rows found and removed: 271
 - Clean data shape: (24371, 17)
 - Unique revision_ids: 24371 | Data Shape: 24371 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (23993, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_jvwiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.3876338005065918
 - confusion_matrix_jvwiki.png saved!
 - False Positive Rate is: 0.14963239981301263
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted         20010      3521
reverted                46       416

============ - ocwiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (13512, 17)
 - Duplicate rows found and removed: 93
 - Clean data shape: (13419, 17)
 - Unique revision_ids: 13419 | Data Shape: 13419 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (13106, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_ocwiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.8661872148513794
 - confusion_matrix_ocwiki.png saved!
 - False Positive Rate is: 0.1494781118554292
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted         10919      1919
reverted               134       134

============ - lbwiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (40412, 17)
 - Duplicate rows found and removed: 22
 - Clean data shape: (40390, 17)
 - Unique revision_ids: 40390 | Data Shape: 40390 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (40249, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_lbwiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.4311239421367645
 - confusion_matrix_lbwiki.png saved!
 - False Positive Rate is: 0.15055673714500187
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted         33948      6017
reverted                72       212

============ - satwiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (13345, 17)
 - Duplicate rows found and removed: 3
 - Clean data shape: (13342, 17)
 - Unique revision_ids: 13342 | Data Shape: 13342 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (13325, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_satwiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.23358190059661865
 - confusion_matrix_satwiki.png saved!
 - False Positive Rate is: 0.1493526046371575
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted         11300      1984
reverted                22        19

============ - mnwiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (16184, 17)
 - Duplicate rows found and removed: 237
 - Clean data shape: (15947, 17)
 - Unique revision_ids: 15947 | Data Shape: 15947 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (15475, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_mnwiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.759356677532196
 - confusion_matrix_mnwiki.png saved!
 - False Positive Rate is: 0.14961234895578682
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted         12175      2142
reverted               593       565

============ - azbwiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (5036, 17)
 - Duplicate rows found and removed: 39
 - Clean data shape: (4997, 17)
 - Unique revision_ids: 4997 | Data Shape: 4997 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (4869, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_azbwiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.6210681200027466
 - confusion_matrix_azbwiki.png saved!
 - False Positive Rate is: 0.14925373134328357
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted          3819       670
reverted                39       341

============ - guwiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (7997, 17)
 - Duplicate rows found and removed: 283
 - Clean data shape: (7714, 17)
 - Unique revision_ids: 7714 | Data Shape: 7714 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (7196, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_guwiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.3249874413013458
 - confusion_matrix_guwiki.png saved!
 - False Positive Rate is: 0.150709805216243
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted          5145       913
reverted                48      1090

============ - brwiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (23918, 17)
 - Duplicate rows found and removed: 767
 - Clean data shape: (23151, 17)
 - Unique revision_ids: 23151 | Data Shape: 23151 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (22932, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_brwiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.5656294822692871
 - confusion_matrix_brwiki.png saved!
 - False Positive Rate is: 0.14589478318291876
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted         19401      3314
reverted                76       141

============ - warwiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (100096, 17)
 - Duplicate rows found and removed: 65
 - Clean data shape: (100031, 17)
 - Unique revision_ids: 100031 | Data Shape: 100031 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (99898, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_warwiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.1372111439704895
 - confusion_matrix_warwiki.png saved!
 - False Positive Rate is: 0.12320709833482218
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted         87354     12275
reverted                 4       265

============ - siwiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (14475, 17)
 - Duplicate rows found and removed: 169
 - Clean data shape: (14306, 17)
 - Unique revision_ids: 14306 | Data Shape: 14306 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (13773, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_siwiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.6435396075248718
 - confusion_matrix_siwiki.png saved!
 - False Positive Rate is: 0.15085417937766932
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted         11134      1978
reverted                76       585

============ - minwiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (9955, 17)
 - Duplicate rows found and removed: 43
 - Clean data shape: (9912, 17)
 - Unique revision_ids: 9912 | Data Shape: 9912 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (9850, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_minwiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.17035524547100067
 - confusion_matrix_minwiki.png saved!
 - False Positive Rate is: 0.15234974915531893
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted          8279      1488
reverted                 2        81

============ - wuuwiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (11232, 17)
 - Duplicate rows found and removed: 62
 - Clean data shape: (11170, 17)
 - Unique revision_ids: 11170 | Data Shape: 11170 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (10850, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_wuuwiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.4011421799659729
 - confusion_matrix_wuuwiki.png saved!
 - False Positive Rate is: 0.15026799387442571
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted          8878      1570
reverted                35       367

============ - sowiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (5870, 17)
 - Duplicate rows found and removed: 325
 - Clean data shape: (5545, 17)
 - Unique revision_ids: 5545 | Data Shape: 5545 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (5308, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_sowiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.9336830377578735
 - confusion_matrix_sowiki.png saved!
 - False Positive Rate is: 0.15108783239323126
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted          4214       750
reverted               229       115

============ - orwiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (10688, 17)
 - Duplicate rows found and removed: 29
 - Clean data shape: (10659, 17)
 - Unique revision_ids: 10659 | Data Shape: 10659 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (10629, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_orwiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.2776014506816864
 - confusion_matrix_orwiki.png saved!
 - False Positive Rate is: 0.14711033274956217
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted          8766      1512
reverted                94       257

============ - tgwiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (6459, 17)
 - Duplicate rows found and removed: 17
 - Clean data shape: (6442, 17)
 - Unique revision_ids: 6442 | Data Shape: 6442 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (6396, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_tgwiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.7084928154945374
 - confusion_matrix_tgwiki.png saved!
 - False Positive Rate is: 0.15035720219305532
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted          5114       905
reverted                56       321

============ - yiwiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (1017, 17)
 - Duplicate rows found and removed: 42
 - Clean data shape: (975, 17)
 - Unique revision_ids: 975 | Data Shape: 975 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (915, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_yiwiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.9127264022827148
 - confusion_matrix_yiwiki.png saved!
 - False Positive Rate is: 0.1519302615193026
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted           681       122
reverted                35        77

============ - avkwiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (403, 17)
 - Duplicate rows found and removed: 2
 - Clean data shape: (401, 17)
 - Unique revision_ids: 401 | Data Shape: 401 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (389, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_avkwiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.6609903573989868
 - confusion_matrix_avkwiki.png saved!
 - False Positive Rate is: 0.0582010582010582
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted           356        22
reverted                 3         8

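The achieved False Positive Rate does not always land exactly on the 15% target (avkwiki above reports about 5.8%). The selection rule used by the analysis script is not shown in this log, so the following is only a sketch of one plausible approach, assuming the threshold is chosen from the observed scores as the one whose FPR is closest to the target. The function name `threshold_for_fpr` and the toy data are hypothetical.

```python
def threshold_for_fpr(scores, labels, target_fpr=0.15):
    """Return (threshold, achieved_fpr) with FPR closest to target_fpr.

    An edit is predicted "reverted" when score >= threshold.
    labels: 1 = reverted, 0 = not reverted.
    """
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not negatives:
        raise ValueError("need at least one non-reverted edit")
    best = None
    # Only observed scores are candidate thresholds, so the achievable
    # FPR values form a finite set and may miss the target.
    for t in sorted(set(scores), reverse=True):
        fpr = sum(s >= t for s in negatives) / len(negatives)
        if best is None or abs(fpr - target_fpr) < abs(best[1] - target_fpr):
            best = (t, fpr)
    return best


# Toy data (not from the analysis): 2 reverted edits, 8 non-reverted edits.
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
labels = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]
threshold, achieved = threshold_for_fpr(scores, labels)
print(threshold, achieved)
```

With only 8 non-reverted edits in the toy data, the FPR moves in steps of 0.125, so the closest achievable value to 0.15 is 0.125.
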
============ - kywiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (15408, 17)
 - Duplicate rows found and removed: 195
 - Clean data shape: (15213, 17)
 - Unique revision_ids: 15213 | Data Shape: 15213 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (14849, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_kywiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.4734741747379303
 - confusion_matrix_kywiki.png saved!
 - False Positive Rate is: 0.14982746721877158
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted         12319      2171
reverted                52       307

============ - zh_min_nanwiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (10256, 17)
 - Duplicate rows found and removed: 66
 - Clean data shape: (10190, 17)
 - Unique revision_ids: 10190 | Data Shape: 10190 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (10076, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_zh_min_nanwiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.8590548634529114
 - confusion_matrix_zh_min_nanwiki.png saved!
 - False Positive Rate is: 0.14965291955900367
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted          8330      1466
reverted               180       100

============ - kmwiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (6365, 17)
 - Duplicate rows found and removed: 108
 - Clean data shape: (6257, 17)
 - Unique revision_ids: 6257 | Data Shape: 6257 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (6125, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_kmwiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.8932412266731262
 - confusion_matrix_kmwiki.png saved!
 - False Positive Rate is: 0.15156196361139718
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted          4943       883
reverted               158       141

============ - zh_classicalwiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (1503, 17)
 - Duplicate rows found and removed: 5
 - Clean data shape: (1498, 17)
 - Unique revision_ids: 1498 | Data Shape: 1498 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (1444, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_zh_classicalwiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.8159477710723877
 - confusion_matrix_zh_classicalwiki.png saved!
 - False Positive Rate is: 0.14254224834680382
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted          1167       194
reverted                10        73

============ - hywwiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (3077, 17)
 - Duplicate rows found and removed: 55
 - Clean data shape: (3022, 17)
 - Unique revision_ids: 3022 | Data Shape: 3022 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (3000, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_hywwiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.36386242508888245
 - confusion_matrix_hywwiki.png saved!
 - False Positive Rate is: 0.16043507817811012
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted          2470       472
reverted                 5        53

============ - alswiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (3247, 17)
 - Duplicate rows found and removed: 74
 - Clean data shape: (3173, 17)
 - Unique revision_ids: 3173 | Data Shape: 3173 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (2987, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_alswiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.6351034641265869
 - confusion_matrix_alswiki.png saved!
 - False Positive Rate is: 0.14942528735632185
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted          2146       377
reverted                21       443

============ - fywiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (9246, 17)
 - Duplicate rows found and removed: 3
 - Clean data shape: (9243, 17)
 - Unique revision_ids: 9243 | Data Shape: 9243 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (9212, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_fywiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.5464207530021667
 - confusion_matrix_fywiki.png saved!
 - False Positive Rate is: 0.15011013215859031
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted          7717      1363
reverted                36        96

============ - anwiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (25105, 17)
 - Duplicate rows found and removed: 56
 - Clean data shape: (25049, 17)
 - Unique revision_ids: 25049 | Data Shape: 25049 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (24873, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_anwiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.3275045156478882
 - confusion_matrix_anwiki.png saved!
 - False Positive Rate is: 0.15239760849229267
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted         20840      3747
reverted                19       267

============ - suwiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (4387, 17)
 - Duplicate rows found and removed: 81
 - Clean data shape: (4306, 17)
 - Unique revision_ids: 4306 | Data Shape: 4306 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (4211, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_suwiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.7814511656761169
 - confusion_matrix_suwiki.png saved!
 - False Positive Rate is: 0.149812734082397
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted          3405       600
reverted                91       115

============ - yowiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (4231, 17)
 - Duplicate rows found and removed: 22
 - Clean data shape: (4209, 17)
 - Unique revision_ids: 4209 | Data Shape: 4209 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (4143, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_yowiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.832072913646698
 - confusion_matrix_yowiki.png saved!
 - False Positive Rate is: 0.14874596473801838
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted          3428       599
reverted                33        83

============ - arywiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (22620, 17)
 - Duplicate rows found and removed: 14
 - Clean data shape: (22606, 17)
 - Unique revision_ids: 22606 | Data Shape: 22606 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (22555, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_arywiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.11711876094341278
 - confusion_matrix_arywiki.png saved!
 - False Positive Rate is: 0.15008499597387492
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted         18999      3355
reverted                36       165

============ - sdwiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (15947, 17)
 - Duplicate rows found and removed: 0
 - Clean data shape: (15947, 17)
 - Unique revision_ids: 15947 | Data Shape: 15947 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (15929, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_sdwiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.4532212018966675
 - confusion_matrix_sdwiki.png saved!
 - False Positive Rate is: 0.1492377472596699
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted         13505      2369
reverted                47         8

============ - vecwiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (1778, 17)
 - Duplicate rows found and removed: 111
 - Clean data shape: (1667, 17)
 - Unique revision_ids: 1667 | Data Shape: 1667 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (1566, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_vecwiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.6318691968917847
 - confusion_matrix_vecwiki.png saved!
 - False Positive Rate is: 0.14884979702300405
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted          1258       220
reverted                11        77

============ - pswiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (3929, 17)
 - Duplicate rows found and removed: 43
 - Clean data shape: (3886, 17)
 - Unique revision_ids: 3886 | Data Shape: 3886 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (3822, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_pswiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.5818879008293152
 - confusion_matrix_pswiki.png saved!
 - False Positive Rate is: 0.15012106537530268
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted          3159       558
reverted                34        71

============ - ndswiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (3549, 17)
 - Duplicate rows found and removed: 19
 - Clean data shape: (3530, 17)
 - Unique revision_ids: 3530 | Data Shape: 3530 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (3460, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_ndswiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.7050349712371826
 - confusion_matrix_ndswiki.png saved!
 - False Positive Rate is: 0.1526080476900149
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted          2843       512
reverted                21        84

============ - banwiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (5769, 17)
 - Duplicate rows found and removed: 39
 - Clean data shape: (5730, 17)
 - Unique revision_ids: 5730 | Data Shape: 5730 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (5662, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_banwiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.7351230978965759
 - confusion_matrix_banwiki.png saved!
 - False Positive Rate is: 0.14884055365809815
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted          4735       828
reverted                27        72

============ - sahwiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (1471, 17)
 - Duplicate rows found and removed: 20
 - Clean data shape: (1451, 17)
 - Unique revision_ids: 1451 | Data Shape: 1451 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (1417, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_sahwiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.8460126519203186
 - confusion_matrix_sahwiki.png saved!
 - False Positive Rate is: 0.15631848064280496
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted          1155       214
reverted                15        33

============ - tcywiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (8416, 17)
 - Duplicate rows found and removed: 10
 - Clean data shape: (8406, 17)
 - Unique revision_ids: 8406 | Data Shape: 8406 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (8381, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_tcywiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.27785277366638184
 - confusion_matrix_tcywiki.png saved!
 - False Positive Rate is: 0.15046268477346472
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted          7069      1252
reverted                43        17

============ - lijwiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (4380, 17)
 - Duplicate rows found and removed: 313
 - Clean data shape: (4067, 17)
 - Unique revision_ids: 4067 | Data Shape: 4067 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (4048, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_lijwiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.3695965111255646
 - confusion_matrix_lijwiki.png saved!
 - False Positive Rate is: 0.14054726368159204
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted          3455       565
reverted                 3        25

1650============ - lmowiki - ============
1651 - Snapshot: 2025-06
1652 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
1653 - Raw data shape: (12464, 17)
1654 - Duplicate rows found and removed: 174
1655 - Clean data shape: (12290, 17)
1656 - Unique revision_ids: 12290 | Data Shape: 12290 | Same? : -> True
1657 - Removing edits that are reverts from df | New Shape: (12185, 17)
1658 - Is any revert_risk_score NA? : False
1659 - Is any user_edit_count NA? : False
1660 - Is any time_to_revert NA? : False
1661 - ROC_lmowiki.png saved!
1662 - Optimal threshold for 15.0% FPR is: 0.21587315201759338
1663 - confusion_matrix_lmowiki.png saved!
1664 - False Positive Rate is: 0.1484974958263773
1665 - CONFUSION MATRIX -
1666Predicted not reverted reverted
1667Actual
1668not reverted 10201 1779
1669reverted 28 177

============ - barwiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (2160, 17)
 - Duplicate rows found and removed: 24
 - Clean data shape: (2136, 17)
 - Unique revision_ids: 2136 | Data Shape: 2136 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (2081, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_barwiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.7210271954536438
 - confusion_matrix_barwiki.png saved!
 - False Positive Rate is: 0.1496023856858847
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted          1711       301
reverted                10        59

============ - bclwiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (5677, 17)
 - Duplicate rows found and removed: 30
 - Clean data shape: (5647, 17)
 - Unique revision_ids: 5647 | Data Shape: 5647 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (5556, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_bclwiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.4005255103111267
 - confusion_matrix_bclwiki.png saved!
 - False Positive Rate is: 0.16381057674590013
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted          4538       889
reverted                26       103

============ - cvwiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (4718, 17)
 - Duplicate rows found and removed: 3
 - Clean data shape: (4715, 17)
 - Unique revision_ids: 4715 | Data Shape: 4715 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (4689, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_cvwiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.43576115369796753
 - confusion_matrix_cvwiki.png saved!
 - False Positive Rate is: 0.14930182599355532
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted          3960       695
reverted                 5        29

============ - mtwiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (4454, 17)
 - Duplicate rows found and removed: 7
 - Clean data shape: (4447, 17)
 - Unique revision_ids: 4447 | Data Shape: 4447 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (4430, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_mtwiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.5258171558380127
 - confusion_matrix_mtwiki.png saved!
 - False Positive Rate is: 0.14901960784313725
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted          3689       646
reverted                36        59

============ - iawiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (3487, 17)
 - Duplicate rows found and removed: 303
 - Clean data shape: (3184, 17)
 - Unique revision_ids: 3184 | Data Shape: 3184 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (2648, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_iawiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.5953251719474792
 - confusion_matrix_iawiki.png saved!
 - False Positive Rate is: 0.14973694860380413
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted          2101       370
reverted                48       129

============ - szywiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (852, 17)
 - Duplicate rows found and removed: 2
 - Clean data shape: (850, 17)
 - Unique revision_ids: 850 | Data Shape: 850 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (847, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_szywiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.35294532775878906
 - confusion_matrix_szywiki.png saved!
 - False Positive Rate is: 0.13539192399049882
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted           728       114
reverted                 0         5

============ - pnbwiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (7273, 17)
 - Duplicate rows found and removed: 13
 - Clean data shape: (7260, 17)
 - Unique revision_ids: 7260 | Data Shape: 7260 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (7217, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_pnbwiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.32891952991485596
 - confusion_matrix_pnbwiki.png saved!
 - False Positive Rate is: 0.15066202090592334
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted          6094      1081
reverted                 6        36

============ - scwiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (921, 17)
 - Duplicate rows found and removed: 4
 - Clean data shape: (917, 17)
 - Unique revision_ids: 917 | Data Shape: 917 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (884, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_scwiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.6216806173324585
 - confusion_matrix_scwiki.png saved!
 - False Positive Rate is: 0.15058823529411763
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted           722       128
reverted                 8        26

============ - cewiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (16255, 17)
 - Duplicate rows found and removed: 16
 - Clean data shape: (16239, 17)
 - Unique revision_ids: 16239 | Data Shape: 16239 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (16222, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_cewiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.4421550929546356
 - confusion_matrix_cewiki.png saved!
 - False Positive Rate is: 0.15128284389489954
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted         13728      2447
reverted                 8        39

============ - vowiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (3479, 17)
 - Duplicate rows found and removed: 36
 - Clean data shape: (3443, 17)
 - Unique revision_ids: 3443 | Data Shape: 3443 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (3405, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_vowiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.21234892308712006
 - confusion_matrix_vowiki.png saved!
 - False Positive Rate is: 0.15004439183190293
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted          2872       507
reverted                 2        24

============ - tkwiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (3857, 17)
 - Duplicate rows found and removed: 64
 - Clean data shape: (3793, 17)
 - Unique revision_ids: 3793 | Data Shape: 3793 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (3721, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_tkwiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.8397768139839172
 - confusion_matrix_tkwiki.png saved!
 - False Positive Rate is: 0.151440329218107
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted          3093       552
reverted                24        52

============ - iowiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (7409, 17)
 - Duplicate rows found and removed: 13
 - Clean data shape: (7396, 17)
 - Unique revision_ids: 7396 | Data Shape: 7396 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (7354, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_iowiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.4195224344730377
 - confusion_matrix_iowiki.png saved!
 - False Positive Rate is: 0.15097715386732727
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted          6169      1097
reverted                35        53

============ - mnwwiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (733, 17)
 - Duplicate rows found and removed: 0
 - Clean data shape: (733, 17)
 - Unique revision_ids: 733 | Data Shape: 733 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (733, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_mnwwiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.3484097719192505
 - confusion_matrix_mnwwiki.png saved!
 - False Positive Rate is: 0.12978142076502733
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted           637        95
reverted                 1         0

============ - sawiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (1247, 17)
 - Duplicate rows found and removed: 2
 - Clean data shape: (1245, 17)
 - Unique revision_ids: 1245 | Data Shape: 1245 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (1178, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_sawiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.797757089138031
 - confusion_matrix_sawiki.png saved!
 - False Positive Rate is: 0.14841628959276018
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted           941       164
reverted                27        46

============ - quwiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (1905, 17)
 - Duplicate rows found and removed: 9
 - Clean data shape: (1896, 17)
 - Unique revision_ids: 1896 | Data Shape: 1896 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (1869, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_quwiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.6781440377235413
 - confusion_matrix_quwiki.png saved!
 - False Positive Rate is: 0.16584833606110203
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted          1529       304
reverted                 8        28

============ - crhwiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (1792, 17)
 - Duplicate rows found and removed: 15
 - Clean data shape: (1777, 17)
 - Unique revision_ids: 1777 | Data Shape: 1777 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (1753, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_crhwiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.750942051410675
 - confusion_matrix_crhwiki.png saved!
 - False Positive Rate is: 0.15029239766081873
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted          1453       257
reverted                 6        37

============ - bhwiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (3561, 17)
 - Duplicate rows found and removed: 67
 - Clean data shape: (3494, 17)
 - Unique revision_ids: 3494 | Data Shape: 3494 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (3376, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_bhwiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.3270818591117859
 - confusion_matrix_bhwiki.png saved!
 - False Positive Rate is: 0.1329605467536502
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted          2791       428
reverted                 4       153

============ - lowiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (5248, 17)
 - Duplicate rows found and removed: 38
 - Clean data shape: (5210, 17)
 - Unique revision_ids: 5210 | Data Shape: 5210 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (5150, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_lowiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.5496832728385925
 - confusion_matrix_lowiki.png saved!
 - False Positive Rate is: 0.15178571428571427
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted          4275       765
reverted                26        84

============ - maiwiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (2009, 17)
 - Duplicate rows found and removed: 47
 - Clean data shape: (1962, 17)
 - Unique revision_ids: 1962 | Data Shape: 1962 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (1915, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_maiwiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.42570507526397705
 - confusion_matrix_maiwiki.png saved!
 - False Positive Rate is: 0.16657652785289345
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted          1541       308
reverted                 1        65

============ - diqwiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (2705, 17)
 - Duplicate rows found and removed: 35
 - Clean data shape: (2670, 17)
 - Unique revision_ids: 2670 | Data Shape: 2670 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (2575, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_diqwiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.5013527870178223
 - confusion_matrix_diqwiki.png saved!
 - False Positive Rate is: 0.14007308160779536
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted          2118       345
reverted                14        98

============ - liwiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (5884, 17)
 - Duplicate rows found and removed: 8
 - Clean data shape: (5876, 17)
 - Unique revision_ids: 5876 | Data Shape: 5876 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (5833, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_liwiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.5055482983589172
 - confusion_matrix_liwiki.png saved!
 - False Positive Rate is: 0.14946928832434314
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted          4888       859
reverted                36        50

============ - nds_nlwiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (502, 17)
 - Duplicate rows found and removed: 4
 - Clean data shape: (498, 17)
 - Unique revision_ids: 498 | Data Shape: 498 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (486, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_nds_nlwiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.8951061367988586
 - confusion_matrix_nds_nlwiki.png saved!
 - False Positive Rate is: 0.1522248243559719
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted           362        65
reverted                35        24

============ - fowiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (2774, 17)
 - Duplicate rows found and removed: 36
 - Clean data shape: (2738, 17)
 - Unique revision_ids: 2738 | Data Shape: 2738 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (2524, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_fowiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.8035809993743896
 - confusion_matrix_fowiki.png saved!
 - False Positive Rate is: 0.15061224489795919
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted          2081       369
reverted                18        56

============ - iewiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (2922, 17)
 - Duplicate rows found and removed: 5
 - Clean data shape: (2917, 17)
 - Unique revision_ids: 2917 | Data Shape: 2917 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (2893, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_iewiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.5032959580421448
 - confusion_matrix_iewiki.png saved!
 - False Positive Rate is: 0.12857142857142856
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted          2501       369
reverted                 5        18

============ - kwwiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (2258, 17)
 - Duplicate rows found and removed: 22
 - Clean data shape: (2236, 17)
 - Unique revision_ids: 2236 | Data Shape: 2236 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (2216, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_kwwiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.5901607871055603
 - confusion_matrix_kwwiki.png saved!
 - False Positive Rate is: 0.1548974943052392
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted          1855       340
reverted                 7        14

============ - htwiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (9527, 17)
 - Duplicate rows found and removed: 56
 - Clean data shape: (9471, 17)
 - Unique revision_ids: 9471 | Data Shape: 9471 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (9423, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_htwiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.4185045063495636
 - confusion_matrix_htwiki.png saved!
 - False Positive Rate is: 0.14988742360887744
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted          7929      1398
reverted                20        76

============ - oswiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (9620, 17)
 - Duplicate rows found and removed: 12
 - Clean data shape: (9608, 17)
 - Unique revision_ids: 9608 | Data Shape: 9608 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (9584, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_oswiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.3149448037147522
 - confusion_matrix_oswiki.png saved!
 - False Positive Rate is: 0.14967985724782198
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted          8101      1426
reverted                30        27

============ - igwiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (12511, 17)
 - Duplicate rows found and removed: 23
 - Clean data shape: (12488, 17)
 - Unique revision_ids: 12488 | Data Shape: 12488 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (12402, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_igwiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.3341909348964691
 - confusion_matrix_igwiki.png saved!
 - False Positive Rate is: 0.159463850528026
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted         10347      1963
reverted                52        40

============ - pmswiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (1613, 17)
 - Duplicate rows found and removed: 20
 - Clean data shape: (1593, 17)
 - Unique revision_ids: 1593 | Data Shape: 1593 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (1529, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_pmswiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.8976967334747314
 - confusion_matrix_pmswiki.png saved!
 - False Positive Rate is: 0.15279672578444747
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted          1242       224
reverted                45        18

============ - myvwiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (327, 17)
 - Duplicate rows found and removed: 0
 - Clean data shape: (327, 17)
 - Unique revision_ids: 327 | Data Shape: 327 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (325, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_myvwiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.4775513708591461
 - confusion_matrix_myvwiki.png saved!
 - False Positive Rate is: 0.2006172839506173
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted           259        65
reverted                 1         0
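
For myvwiki the achieved false positive rate (about 0.20) sits well above the 15% target, and the sample holds a single reverted edit. A small check along the following lines could flag such wikis for a manual look before their thresholds are used. The function name and both cut-offs (`max_gap`, `min_reverted`) are hypothetical and are not part of the analysis that produced this log:

```python
def needs_review(achieved_fpr, reverted_count, target_fpr=0.15,
                 max_gap=0.02, min_reverted=10):
    """Return True when a wiki's threshold looks unreliable.

    max_gap and min_reverted are illustrative cut-offs, not project policy.
    """
    off_target = abs(achieved_fpr - target_fpr) > max_gap
    too_few_positives = reverted_count < min_reverted
    return off_target or too_few_positives


# myvwiki: logged FPR and 1 reverted edit (1 + 0 in the "reverted" row)
print(needs_review(0.2006172839506173, reverted_count=1))
# tcywiki: logged FPR and 60 reverted edits (43 + 17 in the "reverted" row)
print(needs_review(0.15046268477346472, reverted_count=60))
```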

============ - acewiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (652, 17)
 - Duplicate rows found and removed: 30
 - Clean data shape: (622, 17)
 - Unique revision_ids: 622 | Data Shape: 622 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (521, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_acewiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.8583618402481079
 - confusion_matrix_acewiki.png saved!
 - False Positive Rate is: 0.1543778801843318
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted           367        67
reverted                29        58

============ - abwiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (2116, 17)
 - Duplicate rows found and removed: 38
 - Clean data shape: (2078, 17)
 - Unique revision_ids: 2078 | Data Shape: 2078 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (2004, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_abwiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.6387686729431152
 - confusion_matrix_abwiki.png saved!
 - False Positive Rate is: 0.14984059511158343
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted          1600       282
reverted                17       105

============ - tyvwiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (958, 17)
 - Duplicate rows found and removed: 6
 - Clean data shape: (952, 17)
 - Unique revision_ids: 952 | Data Shape: 952 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (944, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_tyvwiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.7302069067955017
 - confusion_matrix_tyvwiki.png saved!
 - False Positive Rate is: 0.14683815648445875
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted           796       137
reverted                 2         9

============ - gdwiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (1116, 17)
 - Duplicate rows found and removed: 6
 - Clean data shape: (1110, 17)
 - Unique revision_ids: 1110 | Data Shape: 1110 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (1094, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_gdwiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.6253749132156372
 - confusion_matrix_gdwiki.png saved!
 - False Positive Rate is: 0.17580340264650285
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted           872       186
reverted                 8        28

============ - mznwiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (41589, 17)
 - Duplicate rows found and removed: 39
 - Clean data shape: (41550, 17)
 - Unique revision_ids: 41550 | Data Shape: 41550 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (41077, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_mznwiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.1651594638824463
 - confusion_matrix_mznwiki.png saved!
 - False Positive Rate is: 0.14594153435227777
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted         34533      5901
reverted                 2       641
2528
2529
2530============ - mgwiki - ============
2531 - Snapshot: 2025-06
2532 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
2533 - Raw data shape: (23484, 17)
2534 - Duplicate rows found and removed: 131
2535 - Clean data shape: (23353, 17)
2536 - Unique revision_ids: 23353 | Data Shape: 23353 | Same? : -> True
2537 - Removing edits that are reverts from df | New Shape: (23219, 17)
2538 - Is any revert_risk_score NA? : False
2539 - Is any user_edit_count NA? : False
2540 - Is any time_to_revert NA? : False
2541 - ROC_mgwiki.png saved!
2542 - Optimal threshold for 15.0% FPR is: 0.8254655599594116
2543 - confusion_matrix_mgwiki.png saved!
2544 - False Positive Rate is: 0.149770089774469
2545 - CONFUSION MATRIX -
2546Predicted not reverted reverted
2547Actual
2548not reverted 19415 3420
2549reverted 209 175
2550
2551
2552============ - cowiki - ============
2553 - Snapshot: 2025-06
2554 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
2555 - Raw data shape: (1921, 17)
2556 - Duplicate rows found and removed: 42
2557 - Clean data shape: (1879, 17)
2558 - Unique revision_ids: 1879 | Data Shape: 1879 | Same? : -> True
2559 - Removing edits that are reverts from df | New Shape: (1781, 17)
2560 - Is any revert_risk_score NA? : False
2561 - Is any user_edit_count NA? : False
2562 - Is any time_to_revert NA? : False
2563 - ROC_cowiki.png saved!
2564 - Optimal threshold for 15.0% FPR is: 0.8178458213806152
2565 - confusion_matrix_cowiki.png saved!
2566 - False Positive Rate is: 0.1487553126897389
2567 - CONFUSION MATRIX -
2568Predicted not reverted reverted
2569Actual
2570not reverted 1402 245
2571reverted 35 99
2572
2573
2574============ - xmfwiki - ============
2575 - Snapshot: 2025-06
2576 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
2577 - Raw data shape: (10891, 17)
2578 - Duplicate rows found and removed: 5
2579 - Clean data shape: (10886, 17)
2580 - Unique revision_ids: 10886 | Data Shape: 10886 | Same? : -> True
2581 - Removing edits that are reverts from df | New Shape: (10846, 17)
2582 - Is any revert_risk_score NA? : False
2583 - Is any user_edit_count NA? : False
2584 - Is any time_to_revert NA? : False
2585 - ROC_xmfwiki.png saved!
2586 - Optimal threshold for 15.0% FPR is: 0.14220458269119263
2587 - confusion_matrix_xmfwiki.png saved!
2588 - False Positive Rate is: 0.14713263314434427
2589 - CONFUSION MATRIX -
2590Predicted not reverted reverted
2591Actual
2592not reverted 9176 1583
2593reverted 40 47
2594
2595
2596============ - wawiki - ============
2597 - Snapshot: 2025-06
2598 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
2599 - Raw data shape: (2550, 17)
2600 - Duplicate rows found and removed: 5
2601 - Clean data shape: (2545, 17)
2602 - Unique revision_ids: 2545 | Data Shape: 2545 | Same? : -> True
2603 - Removing edits that are reverts from df | New Shape: (2526, 17)
2604 - Is any revert_risk_score NA? : False
2605 - Is any user_edit_count NA? : False
2606 - Is any time_to_revert NA? : False
2607 - ROC_wawiki.png saved!
2608 - Optimal threshold for 15.0% FPR is: 0.6133654117584229
2609 - confusion_matrix_wawiki.png saved!
2610 - False Positive Rate is: 0.15860966839792248
2611 - CONFUSION MATRIX -
2612Predicted not reverted reverted
2613Actual
2614not reverted 2106 397
2615reverted 6 17
2616
2617
2618============ - nqowiki - ============
2619 - Snapshot: 2025-06
2620 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
2621 - Raw data shape: (336, 17)
2622 - Duplicate rows found and removed: 4
2623 - Clean data shape: (332, 17)
2624 - Unique revision_ids: 332 | Data Shape: 332 | Same? : -> True
2625 - Removing edits that are reverts from df | New Shape: (317, 17)
2626 - Is any revert_risk_score NA? : False
2627 - Is any user_edit_count NA? : False
2628 - Is any time_to_revert NA? : False
2629 - ROC_nqowiki.png saved!
2630 - Optimal threshold for 15.0% FPR is: 0.7342619299888611
2631 - confusion_matrix_nqowiki.png saved!
2632 - False Positive Rate is: 0.18506493506493507
2633 - CONFUSION MATRIX -
2634Predicted not reverted reverted
2635Actual
2636not reverted 251 57
2637reverted 6 3
2638
2639
2640============ - pcdwiki - ============
2641 - Snapshot: 2025-06
2642 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
2643 - Raw data shape: (1939, 17)
2644 - Duplicate rows found and removed: 50
2645 - Clean data shape: (1889, 17)
2646 - Unique revision_ids: 1889 | Data Shape: 1889 | Same? : -> True
2647 - Removing edits that are reverts from df | New Shape: (1853, 17)
2648 - Is any revert_risk_score NA? : False
2649 - Is any user_edit_count NA? : False
2650 - Is any time_to_revert NA? : False
2651 - ROC_pcdwiki.png saved!
2652 - Optimal threshold for 15.0% FPR is: 0.5277352929115295
2653 - confusion_matrix_pcdwiki.png saved!
2654 - False Positive Rate is: 0.14205186020293123
2655 - CONFUSION MATRIX -
2656Predicted not reverted reverted
2657Actual
2658not reverted 1522 252
2659reverted 10 69
2660
2661
2662============ - amwiki - ============
2663 - Snapshot: 2025-06
2664 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
2665 - Raw data shape: (2892, 17)
2666 - Duplicate rows found and removed: 273
2667 - Clean data shape: (2619, 17)
2668 - Unique revision_ids: 2619 | Data Shape: 2619 | Same? : -> True
2669 - Removing edits that are reverts from df | New Shape: (2296, 17)
2670 - Is any revert_risk_score NA? : False
2671 - Is any user_edit_count NA? : False
2672 - Is any time_to_revert NA? : False
2673 - ROC_amwiki.png saved!
2674 - Optimal threshold for 15.0% FPR is: 0.9134624600410461
2675 - confusion_matrix_amwiki.png saved!
2676 - False Positive Rate is: 0.15059308922124806
2677 - CONFUSION MATRIX -
2678Predicted not reverted reverted
2679Actual
2680not reverted 1647 292
2681reverted 160 197
2682
2683
2684============ - emlwiki - ============
2685 - Snapshot: 2025-06
2686 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
2687 - Raw data shape: (1211, 17)
2688 - Duplicate rows found and removed: 5
2689 - Clean data shape: (1206, 17)
2690 - Unique revision_ids: 1206 | Data Shape: 1206 | Same? : -> True
2691 - Removing edits that are reverts from df | New Shape: (1184, 17)
2692 - Is any revert_risk_score NA? : False
2693 - Is any user_edit_count NA? : False
2694 - Is any time_to_revert NA? : False
2695 - ROC_emlwiki.png saved!
2696 - Optimal threshold for 15.0% FPR is: 0.6133654117584229
2697 - confusion_matrix_emlwiki.png saved!
2698 - False Positive Rate is: 0.16013925152306355
2699 - CONFUSION MATRIX -
2700Predicted not reverted reverted
2701Actual
2702not reverted 965 184
2703reverted 5 30
2704
2705
2706============ - scnwiki - ============
2707 - Snapshot: 2025-06
2708 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
2709 - Raw data shape: (16480, 17)
2710 - Duplicate rows found and removed: 17
2711 - Clean data shape: (16463, 17)
2712 - Unique revision_ids: 16463 | Data Shape: 16463 | Same? : -> True
2713 - Removing edits that are reverts from df | New Shape: (16393, 17)
2714 - Is any revert_risk_score NA? : False
2715 - Is any user_edit_count NA? : False
2716 - Is any time_to_revert NA? : False
2717 - ROC_scnwiki.png saved!
2718 - Optimal threshold for 15.0% FPR is: 0.4262775778770447
2719 - confusion_matrix_scnwiki.png saved!
2720 - False Positive Rate is: 0.15029080559336716
2721 - CONFUSION MATRIX -
2722Predicted not reverted reverted
2723Actual
2724not reverted 13733 2429
2725reverted 114 117
2726
2727
2728============ - zuwiki - ============
2729 - Snapshot: 2025-06
2730 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
2731 - Raw data shape: (1943, 17)
2732 - Duplicate rows found and removed: 20
2733 - Clean data shape: (1923, 17)
2734 - Unique revision_ids: 1923 | Data Shape: 1923 | Same? : -> True
2735 - Removing edits that are reverts from df | New Shape: (1864, 17)
2736 - Is any revert_risk_score NA? : False
2737 - Is any user_edit_count NA? : False
2738 - Is any time_to_revert NA? : False
2739 - ROC_zuwiki.png saved!
2740 - Optimal threshold for 15.0% FPR is: 0.7980137467384338
2741 - confusion_matrix_zuwiki.png saved!
2742 - False Positive Rate is: 0.1474036850921273
2743 - CONFUSION MATRIX -
2744Predicted not reverted reverted
2745Actual
2746not reverted 1527 264
2747reverted 29 44
2748
2749
2750============ - lldwiki - ============
2751 - Snapshot: 2025-06
2752 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
2753 - Raw data shape: (3069, 17)
2754 - Duplicate rows found and removed: 7
2755 - Clean data shape: (3062, 17)
2756 - Unique revision_ids: 3062 | Data Shape: 3062 | Same? : -> True
2757 - Removing edits that are reverts from df | New Shape: (3020, 17)
2758 - Is any revert_risk_score NA? : False
2759 - Is any user_edit_count NA? : False
2760 - Is any time_to_revert NA? : False
2761 - ROC_lldwiki.png saved!
2762 - Optimal threshold for 15.0% FPR is: 0.4878181219100952
2763 - confusion_matrix_lldwiki.png saved!
2764 - False Positive Rate is: 0.16341627437794218
2765 - CONFUSION MATRIX -
2766Predicted not reverted reverted
2767Actual
2768not reverted 2488 486
2769reverted 11 35
2770
2771
2772============ - bjnwiki - ============
2773 - Snapshot: 2025-06
2774 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
2775 - Raw data shape: (1523, 17)
2776 - Duplicate rows found and removed: 7
2777 - Clean data shape: (1516, 17)
2778 - Unique revision_ids: 1516 | Data Shape: 1516 | Same? : -> True
2779 - Removing edits that are reverts from df | New Shape: (1479, 17)
2780 - Is any revert_risk_score NA? : False
2781 - Is any user_edit_count NA? : False
2782 - Is any time_to_revert NA? : False
2783 - ROC_bjnwiki.png saved!
2784 - Optimal threshold for 15.0% FPR is: 0.6133654117584229
2785 - confusion_matrix_bjnwiki.png saved!
2786 - False Positive Rate is: 0.12642045454545456
2787 - CONFUSION MATRIX -
2788Predicted not reverted reverted
2789Actual
2790not reverted 1230 178
2791reverted 10 61
2792
2793
2794============ - frrwiki - ============
2795 - Snapshot: 2025-06
2796 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
2797 - Raw data shape: (1132, 17)
2798 - Duplicate rows found and removed: 2
2799 - Clean data shape: (1130, 17)
2800 - Unique revision_ids: 1130 | Data Shape: 1130 | Same? : -> True
2801 - Removing edits that are reverts from df | New Shape: (1120, 17)
2802 - Is any revert_risk_score NA? : False
2803 - Is any user_edit_count NA? : False
2804 - Is any time_to_revert NA? : False
2805 - ROC_frrwiki.png saved!
2806 - Optimal threshold for 15.0% FPR is: 0.65794438123703
2807 - confusion_matrix_frrwiki.png saved!
2808 - False Positive Rate is: 0.15170278637770898
2809 - CONFUSION MATRIX -
2810Predicted not reverted reverted
2811Actual
2812not reverted 822 147
2813reverted 123 28
2814
2815
2816============ - bat_smgwiki - ============
2817 - Snapshot: 2025-06
2818 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
2819 - Raw data shape: (378, 17)
2820 - Duplicate rows found and removed: 7
2821 - Clean data shape: (371, 17)
2822 - Unique revision_ids: 371 | Data Shape: 371 | Same? : -> True
2823 - Removing edits that are reverts from df | New Shape: (343, 17)
2824 - Is any revert_risk_score NA? : False
2825 - Is any user_edit_count NA? : False
2826 - Is any time_to_revert NA? : False
2827 - ROC_bat_smgwiki.png saved!
2828 - Optimal threshold for 15.0% FPR is: 0.7381485104560852
2829 - confusion_matrix_bat_smgwiki.png saved!
2830 - False Positive Rate is: 0.1501597444089457
2831 - CONFUSION MATRIX -
2832Predicted not reverted reverted
2833Actual
2834not reverted 266 47
2835reverted 7 23
2836
2837
2838============ - sewiki - ============
2839 - Snapshot: 2025-06
2840 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
2841 - Raw data shape: (195, 17)
2842 - Duplicate rows found and removed: 2
2843 - Clean data shape: (193, 17)
2844 - Unique revision_ids: 193 | Data Shape: 193 | Same? : -> True
2845 - Removing edits that are reverts from df | New Shape: (180, 17)
2846 - Is any revert_risk_score NA? : False
2847 - Is any user_edit_count NA? : False
2848 - Is any time_to_revert NA? : False
2849 - ROC_sewiki.png saved!
2850 - Optimal threshold for 15.0% FPR is: 0.8541494011878967
2851 - confusion_matrix_sewiki.png saved!
2852 - False Positive Rate is: 0.1532258064516129
2853 - CONFUSION MATRIX -
2854Predicted not reverted reverted
2855Actual
2856not reverted 105 19
2857reverted 39 17
2858
2859
2860============ - lfnwiki - ============
2861 - Snapshot: 2025-06
2862 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
2863 - Raw data shape: (320, 17)
2864 - Duplicate rows found and removed: 3
2865 - Clean data shape: (317, 17)
2866 - Unique revision_ids: 317 | Data Shape: 317 | Same? : -> True
2867 - Removing edits that are reverts from df | New Shape: (300, 17)
2868 - Is any revert_risk_score NA? : False
2869 - Is any user_edit_count NA? : False
2870 - Is any time_to_revert NA? : False
2871 - ROC_lfnwiki.png saved!
2872 - Optimal threshold for 15.0% FPR is: 0.804692268371582
2873 - confusion_matrix_lfnwiki.png saved!
2874 - False Positive Rate is: 0.14736842105263157
2875 - CONFUSION MATRIX -
2876Predicted not reverted reverted
2877Actual
2878not reverted 243 42
2879reverted 4 11
2880
2881
2882============ - vepwiki - ============
2883 - Snapshot: 2025-06
2884 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
2885 - Raw data shape: (4682, 17)
2886 - Duplicate rows found and removed: 0
2887 - Clean data shape: (4682, 17)
2888 - Unique revision_ids: 4682 | Data Shape: 4682 | Same? : -> True
2889 - Removing edits that are reverts from df | New Shape: (4669, 17)
2890 - Is any revert_risk_score NA? : False
2891 - Is any user_edit_count NA? : False
2892 - Is any time_to_revert NA? : False
2893 - ROC_vepwiki.png saved!
2894 - Optimal threshold for 15.0% FPR is: 0.284596711397171
2895 - confusion_matrix_vepwiki.png saved!
2896 - False Positive Rate is: 0.15042918454935622
2897 - CONFUSION MATRIX -
2898Predicted not reverted reverted
2899Actual
2900not reverted 3959 701
2901reverted 3 6
2902
2903
2904============ - kabwiki - ============
2905 - Snapshot: 2025-06
2906 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
2907 - Raw data shape: (845, 17)
2908 - Duplicate rows found and removed: 182
2909 - Clean data shape: (663, 17)
2910 - Unique revision_ids: 663 | Data Shape: 663 | Same? : -> True
2911 - Removing edits that are reverts from df | New Shape: (649, 17)
2912 - Is any revert_risk_score NA? : False
2913 - Is any user_edit_count NA? : False
2914 - Is any time_to_revert NA? : False
2915 - ROC_kabwiki.png saved!
2916 - Optimal threshold for 15.0% FPR is: 0.874995231628418
2917 - confusion_matrix_kabwiki.png saved!
2918 - False Positive Rate is: 0.11305732484076433
2919 - CONFUSION MATRIX -
2920Predicted not reverted reverted
2921Actual
2922not reverted 557 71
2923reverted 9 12
2924
2925
2926============ - ruewiki - ============
2927 - Snapshot: 2025-06
2928 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
2929 - Raw data shape: (4294, 17)
2930 - Duplicate rows found and removed: 31
2931 - Clean data shape: (4263, 17)
2932 - Unique revision_ids: 4263 | Data Shape: 4263 | Same? : -> True
2933 - Removing edits that are reverts from df | New Shape: (4172, 17)
2934 - Is any revert_risk_score NA? : False
2935 - Is any user_edit_count NA? : False
2936 - Is any time_to_revert NA? : False
2937 - ROC_ruewiki.png saved!
2938 - Optimal threshold for 15.0% FPR is: 0.47483348846435547
2939 - confusion_matrix_ruewiki.png saved!
2940 - False Positive Rate is: 0.15271914576607898
2941 - CONFUSION MATRIX -
2942Predicted not reverted reverted
2943Actual
2944not reverted 3412 615
2945reverted 22 123
2946
2947
2948============ - ugwiki - ============
2949 - Snapshot: 2025-06
2950 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
2951 - Raw data shape: (7340, 17)
2952 - Duplicate rows found and removed: 0
2953 - Clean data shape: (7340, 17)
2954 - Unique revision_ids: 7340 | Data Shape: 7340 | Same? : -> True
2955 - Removing edits that are reverts from df | New Shape: (7331, 17)
2956 - Is any revert_risk_score NA? : False
2957 - Is any user_edit_count NA? : False
2958 - Is any time_to_revert NA? : False
2959 - ROC_ugwiki.png saved!
2960 - Optimal threshold for 15.0% FPR is: 0.33279338479042053
2961 - confusion_matrix_ugwiki.png saved!
2962 - False Positive Rate is: 0.15025269771889085
2963 - CONFUSION MATRIX -
2964Predicted not reverted reverted
2965Actual
2966not reverted 6221 1100
2967reverted 4 6
2968
2969
2970============ - lezwiki - ============
2971 - Snapshot: 2025-06
2972 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
2973 - Raw data shape: (338, 17)
2974 - Duplicate rows found and removed: 0
2975 - Clean data shape: (338, 17)
2976 - Unique revision_ids: 338 | Data Shape: 338 | Same? : -> True
2977 - Removing edits that are reverts from df | New Shape: (338, 17)
2978 - Is any revert_risk_score NA? : False
2979 - Is any user_edit_count NA? : False
2980 - Is any time_to_revert NA? : False
2981 - ROC_lezwiki.png saved!
2982 - Optimal threshold for 15.0% FPR is: 0.882215678691864
2983 - confusion_matrix_lezwiki.png saved!
2984 - False Positive Rate is: 0.15
2985 - CONFUSION MATRIX -
2986Predicted not reverted reverted
2987Actual
2988not reverted 272 48
2989reverted 1 17
2990
2991
2992============ - szlwiki - ============
2993 - Snapshot: 2025-06
2994 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
2995 - Raw data shape: (1874, 17)
2996 - Duplicate rows found and removed: 35
2997 - Clean data shape: (1839, 17)
2998 - Unique revision_ids: 1839 | Data Shape: 1839 | Same? : -> True
2999 - Removing edits that are reverts from df | New Shape: (1728, 17)
3000 - Is any revert_risk_score NA? : False
3001 - Is any user_edit_count NA? : False
3002 - Is any time_to_revert NA? : False
3003 - ROC_szlwiki.png saved!
3004 - Optimal threshold for 15.0% FPR is: 0.6092071533203125
3005 - confusion_matrix_szlwiki.png saved!
3006 - False Positive Rate is: 0.14799025578562727
3007 - CONFUSION MATRIX -
3008Predicted not reverted reverted
3009Actual
3010not reverted 1399 243
3011reverted 11 75
3012
3013
3014============ - frpwiki - ============
3015 - Snapshot: 2025-06
3016 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
3017 - Raw data shape: (287, 17)
3018 - Duplicate rows found and removed: 7
3019 - Clean data shape: (280, 17)
3020 - Unique revision_ids: 280 | Data Shape: 280 | Same? : -> True
3021 - Removing edits that are reverts from df | New Shape: (270, 17)
3022 - Is any revert_risk_score NA? : False
3023 - Is any user_edit_count NA? : False
3024 - Is any time_to_revert NA? : False
3025 - ROC_frpwiki.png saved!
3026 - Optimal threshold for 15.0% FPR is: 0.8332036137580872
3027 - confusion_matrix_frpwiki.png saved!
3028 - False Positive Rate is: 0.1646586345381526
3029 - CONFUSION MATRIX -
3030Predicted not reverted reverted
3031Actual
3032not reverted 208 41
3033reverted 9 12
3034
3035============ - olowiki - ============
3036 - Snapshot: 2025-06
3037 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
3038 - Raw data shape: (492, 17)
3039 - Duplicate rows found and removed: 1
3040 - Clean data shape: (491, 17)
3041 - Unique revision_ids: 491 | Data Shape: 491 | Same? : -> True
3042 - Removing edits that are reverts from df | New Shape: (481, 17)
3043 - Is any revert_risk_score NA? : False
3044 - Is any user_edit_count NA? : False
3045 - Is any time_to_revert NA? : False
3046 - ROC_olowiki.png saved!
3047 - Optimal threshold for 15.0% FPR is: 0.615082323551178
3048 - confusion_matrix_olowiki.png saved!
3049 - False Positive Rate is: 0.1670235546038544
3050 - CONFUSION MATRIX -
3051Predicted not reverted reverted
3052Actual
3053not reverted 389 78
3054reverted 3 11
3055
3056
3057============ - bpywiki - ============
3058 - Snapshot: 2025-06
3059 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
3060 - Raw data shape: (644, 17)
3061 - Duplicate rows found and removed: 32
3062 - Clean data shape: (612, 17)
3063 - Unique revision_ids: 612 | Data Shape: 612 | Same? : -> True
3064 - Removing edits that are reverts from df | New Shape: (567, 17)
3065 - Is any revert_risk_score NA? : False
3066 - Is any user_edit_count NA? : False
3067 - Is any time_to_revert NA? : False
3068 - ROC_bpywiki.png saved!
3069 - Optimal threshold for 15.0% FPR is: 0.9044924974441528
3070 - confusion_matrix_bpywiki.png saved!
3071 - False Positive Rate is: 0.1461864406779661
3072 - CONFUSION MATRIX -
3073Predicted not reverted reverted
3074Actual
3075not reverted 403 69
3076reverted 30 65
3077
3078
3079============ - rwwiki - ============
3080 - Snapshot: 2025-06
3081 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
3082 - Raw data shape: (11713, 17)
3083 - Duplicate rows found and removed: 23
3084 - Clean data shape: (11690, 17)
3085 - Unique revision_ids: 11690 | Data Shape: 11690 | Same? : -> True
3086 - Removing edits that are reverts from df | New Shape: (11592, 17)
3087 - Is any revert_risk_score NA? : False
3088 - Is any user_edit_count NA? : False
3089 - Is any time_to_revert NA? : False
3090 - ROC_rwwiki.png saved!
3091 - Optimal threshold for 15.0% FPR is: 0.6023024320602417
3092 - confusion_matrix_rwwiki.png saved!
3093 - False Positive Rate is: 0.15309842041312272
3094 - CONFUSION MATRIX -
3095Predicted not reverted reverted
3096Actual
3097not reverted 9758 1764
3098reverted 24 46
3099
3100
3101============ - mhrwiki - ============
3102 - Snapshot: 2025-06
3103 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
3104 - Raw data shape: (1037, 17)
3105 - Duplicate rows found and removed: 16
3106 - Clean data shape: (1021, 17)
3107 - Unique revision_ids: 1021 | Data Shape: 1021 | Same? : -> True
3108 - Removing edits that are reverts from df | New Shape: (999, 17)
3109 - Is any revert_risk_score NA? : False
3110 - Is any user_edit_count NA? : False
3111 - Is any time_to_revert NA? : False
3112 - ROC_mhrwiki.png saved!
3113 - Optimal threshold for 15.0% FPR is: 0.8115577101707458
3114 - confusion_matrix_mhrwiki.png saved!
3115 - False Positive Rate is: 0.14681724845995894
3116 - CONFUSION MATRIX -
3117Predicted not reverted reverted
3118Actual
3119not reverted 831 143
3120reverted 6 19
3121
3122
3123============ - gorwiki - ============
3124 - Snapshot: 2025-06
3125 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
3126 - Raw data shape: (1652, 17)
3127 - Duplicate rows found and removed: 66
3128 - Clean data shape: (1586, 17)
3129 - Unique revision_ids: 1586 | Data Shape: 1586 | Same? : -> True
3130 - Removing edits that are reverts from df | New Shape: (1509, 17)
3131 - Is any revert_risk_score NA? : False
3132 - Is any user_edit_count NA? : False
3133 - Is any time_to_revert NA? : False
3134 - ROC_gorwiki.png saved!
3135 - Optimal threshold for 15.0% FPR is: 0.6311735510826111
3136 - confusion_matrix_gorwiki.png saved!
3137 - False Positive Rate is: 0.153954802259887
3138 - CONFUSION MATRIX -
3139Predicted not reverted reverted
3140Actual
3141not reverted 1198 218
3142reverted 15 78
3143
3144
3145============ - dsbwiki - ============
3146 - Snapshot: 2025-06
3147 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
3148 - Raw data shape: (708, 17)
3149 - Duplicate rows found and removed: 5
3150 - Clean data shape: (703, 17)
3151 - Unique revision_ids: 703 | Data Shape: 703 | Same? : -> True
3152 - Removing edits that are reverts from df | New Shape: (688, 17)
3153 - Is any revert_risk_score NA? : False
3154 - Is any user_edit_count NA? : False
3155 - Is any time_to_revert NA? : False
3156 - ROC_dsbwiki.png saved!
3157 - Optimal threshold for 15.0% FPR is: 0.8436957001686096
3158 - confusion_matrix_dsbwiki.png saved!
3159 - False Positive Rate is: 0.14754098360655737
3160 - CONFUSION MATRIX -
3161Predicted not reverted reverted
3162Actual
3163not reverted 572 99
3164reverted 8 9
3165
3166
3167============ - rmwiki - ============
3168 - Snapshot: 2025-06
3169 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
3170 - Raw data shape: (890, 17)
3171 - Duplicate rows found and removed: 0
3172 - Clean data shape: (890, 17)
3173 - Unique revision_ids: 890 | Data Shape: 890 | Same? : -> True
3174 - Removing edits that are reverts from df | New Shape: (861, 17)
3175 - Is any revert_risk_score NA? : False
3176 - Is any user_edit_count NA? : False
3177 - Is any time_to_revert NA? : False
3178 - ROC_rmwiki.png saved!
3179 - Optimal threshold for 15.0% FPR is: 0.7736589908599854
3180 - confusion_matrix_rmwiki.png saved!
3181 - False Positive Rate is: 0.15222772277227722
3182 - CONFUSION MATRIX -
3183Predicted not reverted reverted
3184Actual
3185not reverted 685 123
3186reverted 31 22
3187
3188
3189============ - glkwiki - ============
3190 - Snapshot: 2025-06
3191 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
3192 - Raw data shape: (41120, 17)
3193 - Duplicate rows found and removed: 179
3194 - Clean data shape: (40941, 17)
3195 - Unique revision_ids: 40941 | Data Shape: 40941 | Same? : -> True
3196 - Removing edits that are reverts from df | New Shape: (40369, 17)
3197 - Is any revert_risk_score NA? : False
3198 - Is any user_edit_count NA? : False
3199 - Is any time_to_revert NA? : False
3200 - ROC_glkwiki.png saved!
3201 - Optimal threshold for 15.0% FPR is: 0.21582037210464478
3202 - confusion_matrix_glkwiki.png saved!
3203 - False Positive Rate is: 0.1461611232822147
3204 - CONFUSION MATRIX -
3205Predicted not reverted reverted
3206Actual
3207not reverted 34297 5871
3208reverted 12 189
3209
3210
3211============ - napwiki - ============
3212 - Snapshot: 2025-06
3213 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
3214 - Raw data shape: (837, 17)
3215 - Duplicate rows found and removed: 6
3216 - Clean data shape: (831, 17)
3217 - Unique revision_ids: 831 | Data Shape: 831 | Same? : -> True
3218 - Removing edits that are reverts from df | New Shape: (807, 17)
3219 - Is any revert_risk_score NA? : False
3220 - Is any user_edit_count NA? : False
3221 - Is any time_to_revert NA? : False
3222 - ROC_napwiki.png saved!
3223 - Optimal threshold for 15.0% FPR is: 0.9015378952026367
3224 - confusion_matrix_napwiki.png saved!
3225 - False Positive Rate is: 0.14657534246575343
3226 - CONFUSION MATRIX -
3227Predicted not reverted reverted
3228Actual
3229not reverted 623 107
3230reverted 36 41
3231
3232
3233============ - gnwiki - ============
3234 - Snapshot: 2025-06
3235 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
3236 - Raw data shape: (1683, 17)
3237 - Duplicate rows found and removed: 4
3238 - Clean data shape: (1679, 17)
3239 - Unique revision_ids: 1679 | Data Shape: 1679 | Same? : -> True
3240 - Removing edits that are reverts from df | New Shape: (1645, 17)
3241 - Is any revert_risk_score NA? : False
3242 - Is any user_edit_count NA? : False
3243 - Is any time_to_revert NA? : False
3244 - ROC_gnwiki.png saved!
3245 - Optimal threshold for 15.0% FPR is: 0.6133654117584229
3246 - confusion_matrix_gnwiki.png saved!
3247 - False Positive Rate is: 0.15346225826575172
3248 - CONFUSION MATRIX -
3249Predicted not reverted reverted
3250Actual
3251not reverted 1357 246
3252reverted 4 38
3253
3254
3255============ - fiu_vrowiki - ============
3256 - Snapshot: 2025-06
3257 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
3258 - Raw data shape: (440, 17)
3259 - Duplicate rows found and removed: 2
3260 - Clean data shape: (438, 17)
3261 - Unique revision_ids: 438 | Data Shape: 438 | Same? : -> True
3262 - Removing edits that are reverts from df | New Shape: (420, 17)
3263 - Is any revert_risk_score NA? : False
3264 - Is any user_edit_count NA? : False
3265 - Is any time_to_revert NA? : False
3266 - ROC_fiu_vrowiki.png saved!
3267 - Optimal threshold for 15.0% FPR is: 0.6267386078834534
3268 - confusion_matrix_fiu_vrowiki.png saved!
3269 - False Positive Rate is: 0.14356435643564355
3270 - CONFUSION MATRIX -
3271Predicted not reverted reverted
3272Actual
3273not reverted 346 58
3274reverted 7 9
3275
3276
3277============ - snwiki - ============
3278 - Snapshot: 2025-06
3279 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
3280 - Raw data shape: (713, 17)
3281 - Duplicate rows found and removed: 24
3282 - Clean data shape: (689, 17)
3283 - Unique revision_ids: 689 | Data Shape: 689 | Same? : -> True
3284 - Removing edits that are reverts from df | New Shape: (636, 17)
3285 - Is any revert_risk_score NA? : False
3286 - Is any user_edit_count NA? : False
3287 - Is any time_to_revert NA? : False
3288 - ROC_snwiki.png saved!
3289 - Optimal threshold for 15.0% FPR is: 0.8449905514717102
3290 - confusion_matrix_snwiki.png saved!
3291 - False Positive Rate is: 0.1558219178082192
3292 - CONFUSION MATRIX -
3293Predicted not reverted reverted
3294Actual
3295not reverted 493 91
3296reverted 13 39
3297
3298
3299============ - hawwiki - ============
3300 - Snapshot: 2025-06
3301 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
3302 - Raw data shape: (648, 17)
3303 - Duplicate rows found and removed: 25
3304 - Clean data shape: (623, 17)
3305 - Unique revision_ids: 623 | Data Shape: 623 | Same? : -> True
3306 - Removing edits that are reverts from df | New Shape: (577, 17)
3307 - Is any revert_risk_score NA? : False
3308 - Is any user_edit_count NA? : False
3309 - Is any time_to_revert NA? : False
3310 - ROC_hawwiki.png saved!
3311 - Optimal threshold for 15.0% FPR is: 0.7269960045814514
3312 - confusion_matrix_hawwiki.png saved!
3313 - False Positive Rate is: 0.15180265654648956
3314 - CONFUSION MATRIX -
3315Predicted not reverted reverted
3316Actual
3317not reverted 447 80
3318reverted 8 42
3319
3320
3321============ - gomwiki - ============
3322 - Snapshot: 2025-06
3323 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
3324 - Raw data shape: (524, 17)
3325 - Duplicate rows found and removed: 2
3326 - Clean data shape: (522, 17)
3327 - Unique revision_ids: 522 | Data Shape: 522 | Same? : -> True
3328 - Removing edits that are reverts from df | New Shape: (507, 17)
3329 - Is any revert_risk_score NA? : False
3330 - Is any user_edit_count NA? : False
3331 - Is any time_to_revert NA? : False
3332 - ROC_gomwiki.png saved!
3333 - Optimal threshold for 15.0% FPR is: 0.791193425655365
3334 - confusion_matrix_gomwiki.png saved!
3335 - False Positive Rate is: 0.16359918200409
3336 - CONFUSION MATRIX -
3337Predicted not reverted reverted
3338Actual
3339not reverted 409 80
3340reverted 7 11
3341
3342
3343============ - atjwiki - ============
3344 - Snapshot: 2025-06
3345 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
3346 - Raw data shape: (231, 17)
3347 - Duplicate rows found and removed: 1
3348 - Clean data shape: (230, 17)
3349 - Unique revision_ids: 230 | Data Shape: 230 | Same? : -> True
3350 - Removing edits that are reverts from df | New Shape: (220, 17)
3351 - Is any revert_risk_score NA? : False
3352 - Is any user_edit_count NA? : False
3353 - Is any time_to_revert NA? : False
3354 - ROC_atjwiki.png saved!
3355 - Optimal threshold for 15.0% FPR is: 0.6775605082511902
3356 - confusion_matrix_atjwiki.png saved!
3357 - False Positive Rate is: 0.11650485436893204
3358 - CONFUSION MATRIX -
3359Predicted not reverted reverted
3360Actual
3361not reverted 182 24
3362reverted 6 8
3363
3364
3365============ - awawiki - ============
3366 - Snapshot: 2025-06
3367 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
3368 - Raw data shape: (985, 17)
3369 - Duplicate rows found and removed: 20
3370 - Clean data shape: (965, 17)
3371 - Unique revision_ids: 965 | Data Shape: 965 | Same? : -> True
3372 - Removing edits that are reverts from df | New Shape: (914, 17)
3373 - Is any revert_risk_score NA? : False
3374 - Is any user_edit_count NA? : False
3375 - Is any time_to_revert NA? : False
3376 - ROC_awawiki.png saved!
3377 - Optimal threshold for 15.0% FPR is: 0.7764677405357361
3378 - confusion_matrix_awawiki.png saved!
3379 - False Positive Rate is: 0.14801864801864803
3380 - CONFUSION MATRIX -
3381 Predicted     not reverted  reverted
3382 Actual
3383 not reverted           731       127
3384 reverted                24        32
3385
3386
3387============ - hifwiki - ============
3388 - Snapshot: 2025-06
3389 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
3390 - Raw data shape: (6520, 17)
3391 - Duplicate rows found and removed: 61
3392 - Clean data shape: (6459, 17)
3393 - Unique revision_ids: 6459 | Data Shape: 6459 | Same? : -> True
3394 - Removing edits that are reverts from df | New Shape: (5989, 17)
3395 - Is any revert_risk_score NA? : False
3396 - Is any user_edit_count NA? : False
3397 - Is any time_to_revert NA? : False
3398 - ROC_hifwiki.png saved!
3399 - Optimal threshold for 15.0% FPR is: 0.5998679995536804
3400 - confusion_matrix_hifwiki.png saved!
3401 - False Positive Rate is: 0.1495664739884393
3402 - CONFUSION MATRIX -
3403 Predicted     not reverted  reverted
3404 Actual
3405 not reverted          4708       828
3406 reverted                43       410
3407
3408
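Note on the "Optimal threshold for 15.0% FPR" lines: the log does not show how the threshold is chosen from the ROC curve. The sketch below is one plausible rule, stated as an assumption: among the observed scores, pick the threshold whose FPR is closest to the target. That would be consistent with the realised FPR landing slightly above or below 15% on small wikis. The function name and sample data are hypothetical.

```python
def threshold_for_target_fpr(scores, reverted, target_fpr=0.15):
    """Return (threshold, fpr) with FPR closest to target_fpr.

    An edit is predicted "reverted" when its score >= threshold.
    Assumption: this selection rule is illustrative, not the script's own.
    """
    negatives = [s for s, r in zip(scores, reverted) if not r]
    if not negatives:
        raise ValueError("need at least one edit that was not reverted")
    best = None
    for t in sorted(set(scores)):
        fpr = sum(s >= t for s in negatives) / len(negatives)
        gap = abs(fpr - target_fpr)
        if best is None or gap < best[0]:
            best = (gap, t, fpr)
    return best[1], best[2]


# Toy data: 20 non-reverted edits scored 0.00, 0.05, ..., 0.95 and 2 reverted edits.
scores = [i * 5 / 100 for i in range(20)] + [0.9, 0.97]
reverted = [False] * 20 + [True, True]
threshold, fpr = threshold_for_target_fpr(scores, reverted)
print(threshold, fpr)
```

With few non-reverted edits the achievable FPR values are coarse (steps of 1/N), so the realised rate cannot always hit 15% exactly.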
3409============ - vlswiki - ============
3410 - Snapshot: 2025-06
3411 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
3412 - Raw data shape: (2929, 17)
3413 - Duplicate rows found and removed: 3
3414 - Clean data shape: (2926, 17)
3415 - Unique revision_ids: 2926 | Data Shape: 2926 | Same? : -> True
3416 - Removing edits that are reverts from df | New Shape: (2905, 17)
3417 - Is any revert_risk_score NA? : False
3418 - Is any user_edit_count NA? : False
3419 - Is any time_to_revert NA? : False
3420 - ROC_vlswiki.png saved!
3421 - Optimal threshold for 15.0% FPR is: 0.7379829287528992
3422 - confusion_matrix_vlswiki.png saved!
3423 - False Positive Rate is: 0.1499644633972992
3424 - CONFUSION MATRIX -
3425 Predicted     not reverted  reverted
3426 Actual
3427 not reverted          2392       422
3428 reverted                31        60
3429
3430
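Note on the per-wiki preamble (raw shape, duplicates removed, unique revision_ids, reverts removed, NA checks): the sketch below mirrors those logged steps in plain Python on a toy sample. It is an illustration under assumptions, not the analysis script: the column names revision_id, revert_risk_score, user_edit_count and time_to_revert appear in the log, while "is_revert" and the helper names are hypothetical.

```python
NA_COLUMNS = ("revert_risk_score", "user_edit_count", "time_to_revert")


def clean_edits(rows):
    """Drop exact duplicates, check revision_id uniqueness, drop reverts, check NAs."""
    seen, unique = set(), []
    for row in rows:
        key = tuple(sorted(row.items()))  # whole-row identity, i.e. an exact duplicate
        if key not in seen:
            seen.add(key)
            unique.append(row)
    report = {
        "duplicates_removed": len(rows) - len(unique),
        "revision_ids_unique": len({r["revision_id"] for r in unique}) == len(unique),
    }
    # "is_revert" is a hypothetical flag; the real column name is not shown in the log.
    kept = [r for r in unique if not r["is_revert"]]
    report["any_na"] = {c: any(r[c] is None for r in kept) for c in NA_COLUMNS}
    return kept, report


def make_row(rev_id, score, is_revert):
    """Build one toy edit record with the columns used above."""
    return {"revision_id": rev_id, "revert_risk_score": score,
            "user_edit_count": 10, "time_to_revert": 0.0, "is_revert": is_revert}


sample = [make_row(1, 0.2, False), make_row(1, 0.2, False),
          make_row(2, 0.9, True), make_row(3, 0.7, False)]
kept, report = clean_edits(sample)
print(len(kept), report)
```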
3431============ - hsbwiki - ============
3432 - Snapshot: 2025-06
3433 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
3434 - Raw data shape: (1658, 17)
3435 - Duplicate rows found and removed: 23
3436 - Clean data shape: (1635, 17)
3437 - Unique revision_ids: 1635 | Data Shape: 1635 | Same? : -> True
3438 - Removing edits that are reverts from df | New Shape: (1594, 17)
3439 - Is any revert_risk_score NA? : False
3440 - Is any user_edit_count NA? : False
3441 - Is any time_to_revert NA? : False
3442 - ROC_hsbwiki.png saved!
3443 - Optimal threshold for 15.0% FPR is: 0.6192979216575623
3444 - confusion_matrix_hsbwiki.png saved!
3445 - False Positive Rate is: 0.14868421052631578
3446 - CONFUSION MATRIX -
3447 Predicted     not reverted  reverted
3448 Actual
3449 not reverted          1294       226
3450 reverted                 9        65
3451
3452
3453============ - papwiki - ============
3454 - Snapshot: 2025-06
3455 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
3456 - Raw data shape: (15067, 17)
3457 - Duplicate rows found and removed: 9
3458 - Clean data shape: (15058, 17)
3459 - Unique revision_ids: 15058 | Data Shape: 15058 | Same? : -> True
3460 - Removing edits that are reverts from df | New Shape: (15015, 17)
3461 - Is any revert_risk_score NA? : False
3462 - Is any user_edit_count NA? : False
3463 - Is any time_to_revert NA? : False
3464 - ROC_papwiki.png saved!
3465 - Optimal threshold for 15.0% FPR is: 0.3176918029785156
3466 - confusion_matrix_papwiki.png saved!
3467 - False Positive Rate is: 0.14998307952622675
3468 - CONFUSION MATRIX -
3469 Predicted     not reverted  reverted
3470 Actual
3471 not reverted         12559      2216
3472 reverted                80       160
3473
3474
3475============ - ilowiki - ============
3476 - Snapshot: 2025-06
3477 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
3478 - Raw data shape: (1793, 17)
3479 - Duplicate rows found and removed: 102
3480 - Clean data shape: (1691, 17)
3481 - Unique revision_ids: 1691 | Data Shape: 1691 | Same? : -> True
3482 - Removing edits that are reverts from df | New Shape: (1593, 17)
3483 - Is any revert_risk_score NA? : False
3484 - Is any user_edit_count NA? : False
3485 - Is any time_to_revert NA? : False
3486 - ROC_ilowiki.png saved!
3487 - Optimal threshold for 15.0% FPR is: 0.8618624210357666
3488 - confusion_matrix_ilowiki.png saved!
3489 - False Positive Rate is: 0.15263518138261464
3490 - CONFUSION MATRIX -
3491 Predicted     not reverted  reverted
3492 Actual
3493 not reverted          1238       223
3494 reverted                24       108
3495
3496
3497============ - angwiki - ============
3498 - Snapshot: 2025-06
3499 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
3500 - Raw data shape: (2230, 17)
3501 - Duplicate rows found and removed: 37
3502 - Clean data shape: (2193, 17)
3503 - Unique revision_ids: 2193 | Data Shape: 2193 | Same? : -> True
3504 - Removing edits that are reverts from df | New Shape: (2089, 17)
3505 - Is any revert_risk_score NA? : False
3506 - Is any user_edit_count NA? : False
3507 - Is any time_to_revert NA? : False
3508 - ROC_angwiki.png saved!
3509 - Optimal threshold for 15.0% FPR is: 0.6236856579780579
3510 - confusion_matrix_angwiki.png saved!
3511 - False Positive Rate is: 0.15792103948025987
3512 - CONFUSION MATRIX -
3513 Predicted     not reverted  reverted
3514 Actual
3515 not reverted          1685       316
3516 reverted                 6        82
3517
3518
3519============ - udmwiki - ============
3520 - Snapshot: 2025-06
3521 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
3522 - Raw data shape: (263, 17)
3523 - Duplicate rows found and removed: 3
3524 - Clean data shape: (260, 17)
3525 - Unique revision_ids: 260 | Data Shape: 260 | Same? : -> True
3526 - Removing edits that are reverts from df | New Shape: (242, 17)
3527 - Is any revert_risk_score NA? : False
3528 - Is any user_edit_count NA? : False
3529 - Is any time_to_revert NA? : False
3530 - ROC_udmwiki.png saved!
3531 - Optimal threshold for 15.0% FPR is: 0.9283071756362915
3532 - confusion_matrix_udmwiki.png saved!
3533 - False Positive Rate is: 0.15486725663716813
3534 - CONFUSION MATRIX -
3535 Predicted     not reverted  reverted
3536 Actual
3537 not reverted           191        35
3538 reverted                12         4
3539
3540
3541============ - inhwiki - ============
3542 - Snapshot: 2025-06
3543 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
3544 - Raw data shape: (1889, 17)
3545 - Duplicate rows found and removed: 4
3546 - Clean data shape: (1885, 17)
3547 - Unique revision_ids: 1885 | Data Shape: 1885 | Same? : -> True
3548 - Removing edits that are reverts from df | New Shape: (1845, 17)
3549 - Is any revert_risk_score NA? : False
3550 - Is any user_edit_count NA? : False
3551 - Is any time_to_revert NA? : False
3552 - ROC_inhwiki.png saved!
3553 - Optimal threshold for 15.0% FPR is: 0.5125144124031067
3554 - confusion_matrix_inhwiki.png saved!
3555 - False Positive Rate is: 0.12583148558758314
3556 - CONFUSION MATRIX -
3557 Predicted     not reverted  reverted
3558 Actual
3559 not reverted          1577       227
3560 reverted                11        30
3561
3562
3563============ - shnwiki - ============
3564 - Snapshot: 2025-06
3565 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
3566 - Raw data shape: (38321, 17)
3567 - Duplicate rows found and removed: 2
3568 - Clean data shape: (38319, 17)
3569 - Unique revision_ids: 38319 | Data Shape: 38319 | Same? : -> True
3570 - Removing edits that are reverts from df | New Shape: (38297, 17)
3571 - Is any revert_risk_score NA? : False
3572 - Is any user_edit_count NA? : False
3573 - Is any time_to_revert NA? : False
3574 - ROC_shnwiki.png saved!
3575 - Optimal threshold for 15.0% FPR is: 0.39125075936317444
3576 - confusion_matrix_shnwiki.png saved!
3577 - False Positive Rate is: 0.14907627583683922
3578 - CONFUSION MATRIX -
3579 Predicted     not reverted  reverted
3580 Actual
3581 not reverted         32564      5705
3582 reverted                 9        19
3583
3584
3585============ - roa_tarawiki - ============
3586 - Snapshot: 2025-06
3587 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
3588 - Raw data shape: (395, 17)
3589 - Duplicate rows found and removed: 0
3590 - Clean data shape: (395, 17)
3591 - Unique revision_ids: 395 | Data Shape: 395 | Same? : -> True
3592 - Removing edits that are reverts from df | New Shape: (392, 17)
3593 - Is any revert_risk_score NA? : False
3594 - Is any user_edit_count NA? : False
3595 - Is any time_to_revert NA? : False
3596 - ROC_roa_tarawiki.png saved!
3597 - Optimal threshold for 15.0% FPR is: 0.8557033538818359
3598 - confusion_matrix_roa_tarawiki.png saved!
3599 - False Positive Rate is: 0.14397905759162305
3600 - CONFUSION MATRIX -
3601 Predicted     not reverted  reverted
3602 Actual
3603 not reverted           327        55
3604 reverted                 8         2
3605
3606
3607============ - pamwiki - ============
3608 - Snapshot: 2025-06
3609 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
3610 - Raw data shape: (1718, 17)
3611 - Duplicate rows found and removed: 108
3612 - Clean data shape: (1610, 17)
3613 - Unique revision_ids: 1610 | Data Shape: 1610 | Same? : -> True
3614 - Removing edits that are reverts from df | New Shape: (1567, 17)
3615 - Is any revert_risk_score NA? : False
3616 - Is any user_edit_count NA? : False
3617 - Is any time_to_revert NA? : False
3618 - ROC_pamwiki.png saved!
3619 - Optimal threshold for 15.0% FPR is: 0.6135830283164978
3620 - confusion_matrix_pamwiki.png saved!
3621 - False Positive Rate is: 0.15053763440860216
3622 - CONFUSION MATRIX -
3623 Predicted     not reverted  reverted
3624 Actual
3625 not reverted          1185       210
3626 reverted                 8       164
3627
3628
3629============ - hakwiki - ============
3630 - Snapshot: 2025-06
3631 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
3632 - Raw data shape: (1724, 17)
3633 - Duplicate rows found and removed: 11
3634 - Clean data shape: (1713, 17)
3635 - Unique revision_ids: 1713 | Data Shape: 1713 | Same? : -> True
3636 - Removing edits that are reverts from df | New Shape: (1686, 17)
3637 - Is any revert_risk_score NA? : False
3638 - Is any user_edit_count NA? : False
3639 - Is any time_to_revert NA? : False
3640 - ROC_hakwiki.png saved!
3641 - Optimal threshold for 15.0% FPR is: 0.872566819190979
3642 - confusion_matrix_hakwiki.png saved!
3643 - False Positive Rate is: 0.14906457453228728
3644 - CONFUSION MATRIX -
3645 Predicted     not reverted  reverted
3646 Actual
3647 not reverted          1410       247
3648 reverted                 9        20
3649
3650
3651============ - xhwiki - ============
3652 - Snapshot: 2025-06
3653 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
3654 - Raw data shape: (989, 17)
3655 - Duplicate rows found and removed: 7
3656 - Clean data shape: (982, 17)
3657 - Unique revision_ids: 982 | Data Shape: 982 | Same? : -> True
3658 - Removing edits that are reverts from df | New Shape: (955, 17)
3659 - Is any revert_risk_score NA? : False
3660 - Is any user_edit_count NA? : False
3661 - Is any time_to_revert NA? : False
3662 - ROC_xhwiki.png saved!
3663 - Optimal threshold for 15.0% FPR is: 0.8263096809387207
3664 - confusion_matrix_xhwiki.png saved!
3665 - False Positive Rate is: 0.15570934256055363
3666 - CONFUSION MATRIX -
3667 Predicted     not reverted  reverted
3668 Actual
3669 not reverted           732       135
3670 reverted                61        27
3671
3672
3673============ - cdowiki - ============
3674 - Snapshot: 2025-06
3675 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
3676 - Raw data shape: (675, 17)
3677 - Duplicate rows found and removed: 5
3678 - Clean data shape: (670, 17)
3679 - Unique revision_ids: 670 | Data Shape: 670 | Same? : -> True
3680 - Removing edits that are reverts from df | New Shape: (615, 17)
3681 - Is any revert_risk_score NA? : False
3682 - Is any user_edit_count NA? : False
3683 - Is any time_to_revert NA? : False
3684 - ROC_cdowiki.png saved!
3685 - Optimal threshold for 15.0% FPR is: 0.7494366765022278
3686 - confusion_matrix_cdowiki.png saved!
3687 - False Positive Rate is: 0.157439446366782
3688 - CONFUSION MATRIX -
3689 Predicted     not reverted  reverted
3690 Actual
3691 not reverted           487        91
3692 reverted                16        21
3693
3694
3695============ - crwiki - ============
3696 - Snapshot: 2025-06
3697 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
3698 - Raw data shape: (354, 17)
3699 - Duplicate rows found and removed: 73
3700 - Clean data shape: (281, 17)
3701 - Unique revision_ids: 281 | Data Shape: 281 | Same? : -> True
3702 - Removing edits that are reverts from df | New Shape: (192, 17)
3703 - Is any revert_risk_score NA? : False
3704 - Is any user_edit_count NA? : False
3705 - Is any time_to_revert NA? : False
3706 - ROC_crwiki.png saved!
3707 - Optimal threshold for 15.0% FPR is: 0.9600421190261841
3708 - confusion_matrix_crwiki.png saved!
3709 - False Positive Rate is: 0.13924050632911392
3710 - CONFUSION MATRIX -
3711 Predicted     not reverted  reverted
3712 Actual
3713 not reverted            68        11
3714 reverted                61        52
3715
3716
3717============ - bowiki - ============
3718 - Snapshot: 2025-06
3719 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
3720 - Raw data shape: (2967, 17)
3721 - Duplicate rows found and removed: 13
3722 - Clean data shape: (2954, 17)
3723 - Unique revision_ids: 2954 | Data Shape: 2954 | Same? : -> True
3724 - Removing edits that are reverts from df | New Shape: (2914, 17)
3725 - Is any revert_risk_score NA? : False
3726 - Is any user_edit_count NA? : False
3727 - Is any time_to_revert NA? : False
3728 - ROC_bowiki.png saved!
3729 - Optimal threshold for 15.0% FPR is: 0.6828720569610596
3730 - confusion_matrix_bowiki.png saved!
3731 - False Positive Rate is: 0.1438721136767318
3732 - CONFUSION MATRIX -
3733 Predicted     not reverted  reverted
3734 Actual
3735 not reverted          2410       405
3736 reverted                61        38
3737
3738
3739============ - mwlwiki - ============
3740 - Snapshot: 2025-06
3741 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
3742 - Raw data shape: (1392, 17)
3743 - Duplicate rows found and removed: 15
3744 - Clean data shape: (1377, 17)
3745 - Unique revision_ids: 1377 | Data Shape: 1377 | Same? : -> True
3746 - Removing edits that are reverts from df | New Shape: (1341, 17)
3747 - Is any revert_risk_score NA? : False
3748 - Is any user_edit_count NA? : False
3749 - Is any time_to_revert NA? : False
3750 - ROC_mwlwiki.png saved!
3751 - Optimal threshold for 15.0% FPR is: 0.7521858215332031
3752 - confusion_matrix_mwlwiki.png saved!
3753 - False Positive Rate is: 0.15007656967840735
3754 - CONFUSION MATRIX -
3755 Predicted     not reverted  reverted
3756 Actual
3757 not reverted          1110       196
3758 reverted                 0        35
3759
3760
3761============ - kvwiki - ============
3762 - Snapshot: 2025-06
3763 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
3764 - Raw data shape: (1342, 17)
3765 - Duplicate rows found and removed: 18
3766 - Clean data shape: (1324, 17)
3767 - Unique revision_ids: 1324 | Data Shape: 1324 | Same? : -> True
3768 - Removing edits that are reverts from df | New Shape: (1284, 17)
3769 - Is any revert_risk_score NA? : False
3770 - Is any user_edit_count NA? : False
3771 - Is any time_to_revert NA? : False
3772 - ROC_kvwiki.png saved!
3773 - Optimal threshold for 15.0% FPR is: 0.8378894925117493
3774 - confusion_matrix_kvwiki.png saved!
3775 - False Positive Rate is: 0.15024232633279483
3776 - CONFUSION MATRIX -
3777 Predicted     not reverted  reverted
3778 Actual
3779 not reverted          1052       186
3780 reverted                17        29
3781
3782
3783============ - nvwiki - ============
3784 - Snapshot: 2025-06
3785 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
3786 - Raw data shape: (531, 17)
3787 - Duplicate rows found and removed: 25
3788 - Clean data shape: (506, 17)
3789 - Unique revision_ids: 506 | Data Shape: 506 | Same? : -> True
3790 - Removing edits that are reverts from df | New Shape: (366, 17)
3791 - Is any revert_risk_score NA? : False
3792 - Is any user_edit_count NA? : False
3793 - Is any time_to_revert NA? : False
3794 - ROC_nvwiki.png saved!
3795 - Optimal threshold for 15.0% FPR is: 0.38992777466773987
3796 - confusion_matrix_nvwiki.png saved!
3797 - False Positive Rate is: 0.14798206278026907
3798 - CONFUSION MATRIX -
3799 Predicted     not reverted  reverted
3800 Actual
3801 not reverted           190        33
3802 reverted                24       119
3803
3804
3805============ - tiwiki - ============
3806 - Snapshot: 2025-06
3807 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
3808 - Raw data shape: (305, 17)
3809 - Duplicate rows found and removed: 35
3810 - Clean data shape: (270, 17)
3811 - Unique revision_ids: 270 | Data Shape: 270 | Same? : -> True
3812 - Removing edits that are reverts from df | New Shape: (260, 17)
3813 - Is any revert_risk_score NA? : False
3814 - Is any user_edit_count NA? : False
3815 - Is any time_to_revert NA? : False
3816 - ROC_tiwiki.png saved!
3817 - Optimal threshold for 15.0% FPR is: 0.935514509677887
3818 - confusion_matrix_tiwiki.png saved!
3819 - False Positive Rate is: 0.13524590163934427
3820 - CONFUSION MATRIX -
3821 Predicted     not reverted  reverted
3822 Actual
3823 not reverted           211        33
3824 reverted                14         2
3825
3826
3827============ - lnwiki - ============
3828 - Snapshot: 2025-06
3829 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
3830 - Raw data shape: (1980, 17)
3831 - Duplicate rows found and removed: 3
3832 - Clean data shape: (1977, 17)
3833 - Unique revision_ids: 1977 | Data Shape: 1977 | Same? : -> True
3834 - Removing edits that are reverts from df | New Shape: (1952, 17)
3835 - Is any revert_risk_score NA? : False
3836 - Is any user_edit_count NA? : False
3837 - Is any time_to_revert NA? : False
3838 - ROC_lnwiki.png saved!
3839 - Optimal threshold for 15.0% FPR is: 0.613621175289154
3840 - confusion_matrix_lnwiki.png saved!
3841 - False Positive Rate is: 0.1809623430962343
3842 - CONFUSION MATRIX -
3843 Predicted     not reverted  reverted
3844 Actual
3845 not reverted          1566       346
3846 reverted                11        29
3847
3848
3849============ - dinwiki - ============
3850 - Snapshot: 2025-06
3851 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
3852 - Raw data shape: (43, 17)
3853 - Duplicate rows found and removed: 0
3854 - Clean data shape: (43, 17)
3855 - Unique revision_ids: 43 | Data Shape: 43 | Same? : -> True
3856 - Removing edits that are reverts from df | New Shape: (40, 17)
3857 - Is any revert_risk_score NA? : False
3858 - Is any user_edit_count NA? : False
3859 - Is any time_to_revert NA? : False
3860 - ROC_dinwiki.png saved!
3861 - Optimal threshold for 15.0% FPR is: 0.9743212461471558
3862 - confusion_matrix_dinwiki.png saved!
3863 - False Positive Rate is: 0.025
3864 - CONFUSION MATRIX -
3865 Predicted     not reverted  reverted
3866 Actual
3867 not reverted            39         1
3868
3869
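Note on the dinwiki matrix above: it has a single row, which fits a sample in which none of the 40 remaining edits was actually reverted (39 + 1 = 40), so the "reverted" row never appears in the tabulation. The sketch below shows one way, offered as an assumption rather than as the script's behaviour, to tabulate with a fixed label set so the matrix always stays 2x2 and empty classes show up as zeros.

```python
from collections import Counter

LABELS = ("not reverted", "reverted")


def confusion_counts(actual, predicted):
    """Nested dict matrix[actual_label][predicted_label] with every label present."""
    counts = Counter(zip(actual, predicted))
    # Counter returns 0 for pairs that never occurred, so absent classes become zeros.
    return {a: {p: counts[(a, p)] for p in LABELS} for a in LABELS}


# dinwiki-shaped toy input: 40 edits, none actually reverted, one flagged by the model.
actual = ["not reverted"] * 40
predicted = ["not reverted"] * 39 + ["reverted"]
matrix = confusion_counts(actual, predicted)
print(matrix)
```

With no reverted edits in the sample, the true positive rate is undefined, so a threshold derived from such a wiki may deserve a manual check.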
3870============ - pdcwiki - ============
3871 - Snapshot: 2025-06
3872 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
3873 - Raw data shape: (494, 17)
3874 - Duplicate rows found and removed: 30
3875 - Clean data shape: (464, 17)
3876 - Unique revision_ids: 464 | Data Shape: 464 | Same? : -> True
3877 - Removing edits that are reverts from df | New Shape: (426, 17)
3878 - Is any revert_risk_score NA? : False
3879 - Is any user_edit_count NA? : False
3880 - Is any time_to_revert NA? : False
3881 - ROC_pdcwiki.png saved!
3882 - Optimal threshold for 15.0% FPR is: 0.8773342967033386
3883 - confusion_matrix_pdcwiki.png saved!
3884 - False Positive Rate is: 0.1184573002754821
3885 - CONFUSION MATRIX -
3886 Predicted     not reverted  reverted
3887 Actual
3888 not reverted           320        43
3889 reverted                11        52
3890
3891
3892============ - wowiki - ============
3893 - Snapshot: 2025-06
3894 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
3895 - Raw data shape: (510, 17)
3896 - Duplicate rows found and removed: 0
3897 - Clean data shape: (510, 17)
3898 - Unique revision_ids: 510 | Data Shape: 510 | Same? : -> True
3899 - Removing edits that are reverts from df | New Shape: (503, 17)
3900 - Is any revert_risk_score NA? : False
3901 - Is any user_edit_count NA? : False
3902 - Is any time_to_revert NA? : False
3903 - ROC_wowiki.png saved!
3904 - Optimal threshold for 15.0% FPR is: 0.9093973636627197
3905 - confusion_matrix_wowiki.png saved!
3906 - False Positive Rate is: 0.12627291242362526
3907 - CONFUSION MATRIX -
3908 Predicted     not reverted  reverted
3909 Actual
3910 not reverted           429        62
3911 reverted                 9         3
3912
3913
3914============ - ladwiki - ============
3915 - Snapshot: 2025-06
3916 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
3917 - Raw data shape: (1057, 17)
3918 - Duplicate rows found and removed: 8
3919 - Clean data shape: (1049, 17)
3920 - Unique revision_ids: 1049 | Data Shape: 1049 | Same? : -> True
3921 - Removing edits that are reverts from df | New Shape: (1023, 17)
3922 - Is any revert_risk_score NA? : False
3923 - Is any user_edit_count NA? : False
3924 - Is any time_to_revert NA? : False
3925 - ROC_ladwiki.png saved!
3926 - Optimal threshold for 15.0% FPR is: 0.6133739948272705
3927 - confusion_matrix_ladwiki.png saved!
3928 - False Positive Rate is: 0.15747241725175526
3929 - CONFUSION MATRIX -
3930 Predicted     not reverted  reverted
3931 Actual
3932 not reverted           840       157
3933 reverted                 3        23
3934
3935
3936============ - kaawiki - ============
3937 - Snapshot: 2025-06
3938 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
3939 - Raw data shape: (25760, 17)
3940 - Duplicate rows found and removed: 40
3941 - Clean data shape: (25720, 17)
3942 - Unique revision_ids: 25720 | Data Shape: 25720 | Same? : -> True
3943 - Removing edits that are reverts from df | New Shape: (25684, 17)
3944 - Is any revert_risk_score NA? : False
3945 - Is any user_edit_count NA? : False
3946 - Is any time_to_revert NA? : False
3947 - ROC_kaawiki.png saved!
3948 - Optimal threshold for 15.0% FPR is: 0.36563432216644287
3949 - confusion_matrix_kaawiki.png saved!
3950 - False Positive Rate is: 0.15022368730868849
3951 - CONFUSION MATRIX -
3952 Predicted     not reverted  reverted
3953 Actual
3954 not reverted         21654      3828
3955 reverted               121        81
3956
3957
3958============ - avwiki - ============
3959 - Snapshot: 2025-06
3960 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
3961 - Raw data shape: (1262, 17)
3962 - Duplicate rows found and removed: 0
3963 - Clean data shape: (1262, 17)
3964 - Unique revision_ids: 1262 | Data Shape: 1262 | Same? : -> True
3965 - Removing edits that are reverts from df | New Shape: (1243, 17)
3966 - Is any revert_risk_score NA? : False
3967 - Is any user_edit_count NA? : False
3968 - Is any time_to_revert NA? : False
3969 - ROC_avwiki.png saved!
3970 - Optimal threshold for 15.0% FPR is: 0.6423503756523132
3971 - confusion_matrix_avwiki.png saved!
3972 - False Positive Rate is: 0.14775510204081632
3973 - CONFUSION MATRIX -
3974 Predicted     not reverted  reverted
3975 Actual
3976 not reverted          1044       181
3977 reverted                 4        14
3978
3979
3980============ - arcwiki - ============
3981 - Snapshot: 2025-06
3982 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
3983 - Raw data shape: (198, 17)
3984 - Duplicate rows found and removed: 5
3985 - Clean data shape: (193, 17)
3986 - Unique revision_ids: 193 | Data Shape: 193 | Same? : -> True
3987 - Removing edits that are reverts from df | New Shape: (172, 17)
3988 - Is any revert_risk_score NA? : False
3989 - Is any user_edit_count NA? : False
3990 - Is any time_to_revert NA? : False
3991 - ROC_arcwiki.png saved!
3992 - Optimal threshold for 15.0% FPR is: 0.8438354730606079
3993 - confusion_matrix_arcwiki.png saved!
3994 - False Positive Rate is: 0.14965986394557823
3995 - CONFUSION MATRIX -
3996 Predicted     not reverted  reverted
3997 Actual
3998 not reverted           125        22
3999 reverted                 9        16
4000
4001
4002============ - nywiki - ============
4003 - Snapshot: 2025-06
4004 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
4005 - Raw data shape: (179, 17)
4006 - Duplicate rows found and removed: 3
4007 - Clean data shape: (176, 17)
4008 - Unique revision_ids: 176 | Data Shape: 176 | Same? : -> True
4009 - Removing edits that are reverts from df | New Shape: (171, 17)
4010 - Is any revert_risk_score NA? : False
4011 - Is any user_edit_count NA? : False
4012 - Is any time_to_revert NA? : False
4013 - ROC_nywiki.png saved!
4014 - Optimal threshold for 15.0% FPR is: 0.8020690679550171
4015 - confusion_matrix_nywiki.png saved!
4016 - False Positive Rate is: 0.15757575757575756
4017 - CONFUSION MATRIX -
4018 Predicted     not reverted  reverted
4019 Actual
4020 not reverted           139        26
4021 reverted                 6         0
4022
4023
4024============ - cuwiki - ============
4025 - Snapshot: 2025-06
4026 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
4027 - Raw data shape: (831, 17)
4028 - Duplicate rows found and removed: 16
4029 - Clean data shape: (815, 17)
4030 - Unique revision_ids: 815 | Data Shape: 815 | Same? : -> True
4031 - Removing edits that are reverts from df | New Shape: (775, 17)
4032 - Is any revert_risk_score NA? : False
4033 - Is any user_edit_count NA? : False
4034 - Is any time_to_revert NA? : False
4035 - ROC_cuwiki.png saved!
4036 - Optimal threshold for 15.0% FPR is: 0.828187108039856
4037 - confusion_matrix_cuwiki.png saved!
4038 - False Positive Rate is: 0.14421768707482993
4039 - CONFUSION MATRIX -
4040 Predicted     not reverted  reverted
4041 Actual
4042 not reverted           629       106
4043 reverted                21        19
4044
4045
4046============ - pflwiki - ============
4047 - Snapshot: 2025-06
4048 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
4049 - Raw data shape: (785, 17)
4050 - Duplicate rows found and removed: 13
4051 - Clean data shape: (772, 17)
4052 - Unique revision_ids: 772 | Data Shape: 772 | Same? : -> True
4053 - Removing edits that are reverts from df | New Shape: (756, 17)
4054 - Is any revert_risk_score NA? : False
4055 - Is any user_edit_count NA? : False
4056 - Is any time_to_revert NA? : False
4057 - ROC_pflwiki.png saved!
4058 - Optimal threshold for 15.0% FPR is: 0.6133654117584229
4059 - confusion_matrix_pflwiki.png saved!
4060 - False Positive Rate is: 0.07327001356852103
4061 - CONFUSION MATRIX -
4062 Predicted     not reverted  reverted
4063 Actual
4064 not reverted           683        54
4065 reverted                 0        19
4066
4067
4068============ - csbwiki - ============
4069 - Snapshot: 2025-06
4070 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
4071 - Raw data shape: (686, 17)
4072 - Duplicate rows found and removed: 6
4073 - Clean data shape: (680, 17)
4074 - Unique revision_ids: 680 | Data Shape: 680 | Same? : -> True
4075 - Removing edits that are reverts from df | New Shape: (631, 17)
4076 - Is any revert_risk_score NA? : False
4077 - Is any user_edit_count NA? : False
4078 - Is any time_to_revert NA? : False
4079 - ROC_csbwiki.png saved!
4080 - Optimal threshold for 15.0% FPR is: 0.8540676236152649
4081 - confusion_matrix_csbwiki.png saved!
4082 - False Positive Rate is: 0.15198618307426598
4083 - CONFUSION MATRIX -
4084 Predicted     not reverted  reverted
4085 Actual
4086 not reverted           491        88
4087 reverted                21        31
4088
4089
4090============ - extwiki - ============
4091 - Snapshot: 2025-06
4092 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
4093 - Raw data shape: (4192, 17)
4094 - Duplicate rows found and removed: 5
4095 - Clean data shape: (4187, 17)
4096 - Unique revision_ids: 4187 | Data Shape: 4187 | Same? : -> True
4097 - Removing edits that are reverts from df | New Shape: (4138, 17)
4098 - Is any revert_risk_score NA? : False
4099 - Is any user_edit_count NA? : False
4100 - Is any time_to_revert NA? : False
4101 - ROC_extwiki.png saved!
4102 - Optimal threshold for 15.0% FPR is: 0.5438408255577087
4103 - confusion_matrix_extwiki.png saved!
4104 - False Positive Rate is: 0.1488633585920313
4105 - CONFUSION MATRIX -
4106 Predicted     not reverted  reverted
4107 Actual
4108 not reverted          3482       609
4109 reverted                11        36
4110
4111
4112============ - miwiki - ============
4113 - Snapshot: 2025-06
4114 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
4115 - Raw data shape: (745, 17)
4116 - Duplicate rows found and removed: 3
4117 - Clean data shape: (742, 17)
4118 - Unique revision_ids: 742 | Data Shape: 742 | Same? : -> True
4119 - Removing edits that are reverts from df | New Shape: (730, 17)
4120 - Is any revert_risk_score NA? : False
4121 - Is any user_edit_count NA? : False
4122 - Is any time_to_revert NA? : False
4123 - ROC_miwiki.png saved!
4124 - Optimal threshold for 15.0% FPR is: 0.8816435933113098
4125 - confusion_matrix_miwiki.png saved!
4126 - False Positive Rate is: 0.1511627906976744
4127 - CONFUSION MATRIX -
4128 Predicted     not reverted  reverted
4129 Actual
4130 not reverted           584       104
4131 reverted                22        20
4132
4133
4134============ - aywiki - ============
4135 - Snapshot: 2025-06
4136 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
4137 - Raw data shape: (889, 17)
4138 - Duplicate rows found and removed: 9
4139 - Clean data shape: (880, 17)
4140 - Unique revision_ids: 880 | Data Shape: 880 | Same? : -> True
4141 - Removing edits that are reverts from df | New Shape: (820, 17)
4142 - Is any revert_risk_score NA? : False
4143 - Is any user_edit_count NA? : False
4144 - Is any time_to_revert NA? : False
4145 - ROC_aywiki.png saved!
4146 - Optimal threshold for 15.0% FPR is: 0.7209958434104919
4147 - confusion_matrix_aywiki.png saved!
4148 - False Positive Rate is: 0.14915693904020752
4149 - CONFUSION MATRIX -
4150 Predicted     not reverted  reverted
4151 Actual
4152 not reverted           656       115
4153 reverted                12        37
4154
4155
4156============ - nrmwiki - ============
4157 - Snapshot: 2025-06
4158 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
4159 - Raw data shape: (175, 17)
4160 - Duplicate rows found and removed: 0
4161 - Clean data shape: (175, 17)
4162 - Unique revision_ids: 175 | Data Shape: 175 | Same? : -> True
4163 - Removing edits that are reverts from df | New Shape: (169, 17)
4164 - Is any revert_risk_score NA? : False
4165 - Is any user_edit_count NA? : False
4166 - Is any time_to_revert NA? : False
4167 - ROC_nrmwiki.png saved!
4168 - Optimal threshold for 15.0% FPR is: 0.8681466579437256
4169 - confusion_matrix_nrmwiki.png saved!
4170 - False Positive Rate is: 0.15432098765432098
4171 - CONFUSION MATRIX -
4172 Predicted     not reverted  reverted
4173 Actual
4174 not reverted           137        25
4175 reverted                 2         5
4176
4177
4178============ - furwiki - ============
4179 - Snapshot: 2025-06
4180 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
4181 - Raw data shape: (2744, 17)
4182 - Duplicate rows found and removed: 7
4183 - Clean data shape: (2737, 17)
4184 - Unique revision_ids: 2737 | Data Shape: 2737 | Same? : -> True
4185 - Removing edits that are reverts from df | New Shape: (2707, 17)
4186 - Is any revert_risk_score NA? : False
4187 - Is any user_edit_count NA? : False
4188 - Is any time_to_revert NA? : False
4189 - ROC_furwiki.png saved!
4190 - Optimal threshold for 15.0% FPR is: 0.9329590201377869
4191 - confusion_matrix_furwiki.png saved!
4192 - False Positive Rate is: 0.14930944382232175
4193 - CONFUSION MATRIX -
4194 Predicted     not reverted  reverted
4195 Actual
4196 not reverted          2279       400
4197 reverted                16        12
4198
4199
4200============ - cbk_zamwiki - ============
4201 - Snapshot: 2025-06
4202 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
4203 - Raw data shape: (411, 17)
4204 - Duplicate rows found and removed: 13
4205 - Clean data shape: (398, 17)
4206 - Unique revision_ids: 398 | Data Shape: 398 | Same? : -> True
4207 - Removing edits that are reverts from df | New Shape: (370, 17)
4208 - Is any revert_risk_score NA? : False
4209 - Is any user_edit_count NA? : False
4210 - Is any time_to_revert NA? : False
4211 - ROC_cbk_zamwiki.png saved!
4212 - Optimal threshold for 15.0% FPR is: 0.7877727746963501
4213 - confusion_matrix_cbk_zamwiki.png saved!
4214 - False Positive Rate is: 0.1402439024390244
4215 - CONFUSION MATRIX -
4216 Predicted     not reverted  reverted
4217 Actual
4218 not reverted           282        46
4219 reverted                12        30

============ - newwiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (2457, 17)
 - Duplicate rows found and removed: 61
 - Clean data shape: (2396, 17)
 - Unique revision_ids: 2396 | Data Shape: 2396 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (2360, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_newwiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.5211036801338196
 - confusion_matrix_newwiki.png saved!
 - False Positive Rate is: 0.15593952483801296
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted          1954       361
reverted                 7        38

============ - nahwiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (172, 17)
 - Duplicate rows found and removed: 6
 - Clean data shape: (166, 17)
 - Unique revision_ids: 166 | Data Shape: 166 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (157, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_nahwiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.9201942086219788
 - confusion_matrix_nahwiki.png saved!
 - False Positive Rate is: 0.10596026490066225
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted           135        16
reverted                 5         1

============ - gvwiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (6286, 17)
 - Duplicate rows found and removed: 25
 - Clean data shape: (6261, 17)
 - Unique revision_ids: 6261 | Data Shape: 6261 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (6218, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_gvwiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.3498202860355377
 - confusion_matrix_gvwiki.png saved!
 - False Positive Rate is: 0.15668727627725348
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted          5183       963
reverted                 4        68

============ - omwiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (891, 17)
 - Duplicate rows found and removed: 7
 - Clean data shape: (884, 17)
 - Unique revision_ids: 884 | Data Shape: 884 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (834, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_omwiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.8401511311531067
 - confusion_matrix_omwiki.png saved!
 - False Positive Rate is: 0.17884130982367757
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted           652       142
reverted                19        21

============ - klwiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (598, 17)
 - Duplicate rows found and removed: 26
 - Clean data shape: (572, 17)
 - Unique revision_ids: 572 | Data Shape: 572 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (382, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_klwiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.8161092400550842
 - confusion_matrix_klwiki.png saved!
 - False Positive Rate is: 0.14285714285714285
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted           162        27
reverted                30       163
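
The row counts in each block can be reconciled against one another. The sketch below does this for klwiki using only figures copied from the log above; the variable names are illustrative, and the assumption that the confusion matrix covers exactly the rows left after dropping reverts is inferred from the numbers rather than stated in the log.

```python
# klwiki figures copied from the log block above.
raw_rows = 598
duplicates_removed = 26
clean_rows = 572
rows_after_dropping_reverts = 382

# Assumed layout: rows = actual, columns = predicted.
confusion_matrix = [
    [162, 27],   # actual: not reverted
    [30, 163],   # actual: reverted
]

# Deduplication accounting: raw rows minus duplicates gives the clean shape.
assert raw_rows - duplicates_removed == clean_rows

# The confusion matrix should account for every row that was scored.
scored_rows = sum(sum(row) for row in confusion_matrix)
assert scored_rows == rows_after_dropping_reverts

# Edits that were themselves reverts and were dropped before scoring.
reverts_dropped = clean_rows - rows_after_dropping_reverts
print(reverts_dropped)  # 190

# Share of the remaining edits that were actually reverted.
reverted_share = sum(confusion_matrix[1]) / scored_rows
print(f"{reverted_share:.1%}")  # 50.5%
```

For klwiki about half of the scored edits were reverted, which is much higher than for most other wikis in this run, so the threshold for this wiki may deserve a closer look before it is added to config.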

============ - zeawiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (11034, 17)
 - Duplicate rows found and removed: 20
 - Clean data shape: (11014, 17)
 - Unique revision_ids: 11014 | Data Shape: 11014 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (10935, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_zeawiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.633648157119751
 - confusion_matrix_zeawiki.png saved!
 - False Positive Rate is: 0.1496655518394649
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted          9153      1611
reverted               103        68

============ - smwiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (664, 17)
 - Duplicate rows found and removed: 0
 - Clean data shape: (664, 17)
 - Unique revision_ids: 664 | Data Shape: 664 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (658, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_smwiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.7539634108543396
 - confusion_matrix_smwiki.png saved!
 - False Positive Rate is: 0.16411042944785276
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted           545       107
reverted                 2         4

============ - roa_rupwiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (384, 17)
 - Duplicate rows found and removed: 13
 - Clean data shape: (371, 17)
 - Unique revision_ids: 371 | Data Shape: 371 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (333, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_roa_rupwiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.6913049817085266
 - confusion_matrix_roa_rupwiki.png saved!
 - False Positive Rate is: 0.15960912052117263
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted           258        49
reverted                 6        20

============ - map_bmswiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (623, 17)
 - Duplicate rows found and removed: 88
 - Clean data shape: (535, 17)
 - Unique revision_ids: 535 | Data Shape: 535 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (456, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_map_bmswiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.9081456065177917
 - confusion_matrix_map_bmswiki.png saved!
 - False Positive Rate is: 0.1488673139158576
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted           263        46
reverted                80        67

============ - stwiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (1955, 17)
 - Duplicate rows found and removed: 1
 - Clean data shape: (1954, 17)
 - Unique revision_ids: 1954 | Data Shape: 1954 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (1943, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_stwiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.4744786024093628
 - confusion_matrix_stwiki.png saved!
 - False Positive Rate is: 0.14105925537493444
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted          1638       269
reverted                28         8

============ - kswiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (8284, 17)
 - Duplicate rows found and removed: 1
 - Clean data shape: (8283, 17)
 - Unique revision_ids: 8283 | Data Shape: 8283 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (8270, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_kswiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.25549614429473877
 - confusion_matrix_kswiki.png saved!
 - False Positive Rate is: 0.14953838678328474
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted          7001      1231
reverted                23        15

============ - bxrwiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (707, 17)
 - Duplicate rows found and removed: 0
 - Clean data shape: (707, 17)
 - Unique revision_ids: 707 | Data Shape: 707 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (698, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_bxrwiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.842378556728363
 - confusion_matrix_bxrwiki.png saved!
 - False Positive Rate is: 0.1577424023154848
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted           582       109
reverted                 5         2

============ - kbpwiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (169, 17)
 - Duplicate rows found and removed: 7
 - Clean data shape: (162, 17)
 - Unique revision_ids: 162 | Data Shape: 162 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (148, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_kbpwiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.8215250372886658
 - confusion_matrix_kbpwiki.png saved!
 - False Positive Rate is: 0.16911764705882354
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted           113        23
reverted                 6         6

============ - fjwiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (661, 17)
 - Duplicate rows found and removed: 5
 - Clean data shape: (656, 17)
 - Unique revision_ids: 656 | Data Shape: 656 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (643, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_fjwiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.4832654595375061
 - confusion_matrix_fjwiki.png saved!
 - False Positive Rate is: 0.09968354430379747
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted           569        63
reverted                 0        11

============ - ltgwiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (176, 17)
 - Duplicate rows found and removed: 2
 - Clean data shape: (174, 17)
 - Unique revision_ids: 174 | Data Shape: 174 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (161, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_ltgwiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.8122517466545105
 - confusion_matrix_ltgwiki.png saved!
 - False Positive Rate is: 0.13815789473684212
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted           131        21
reverted                 5         4

============ - gotwiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (514, 17)
 - Duplicate rows found and removed: 8
 - Clean data shape: (506, 17)
 - Unique revision_ids: 506 | Data Shape: 506 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (483, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_gotwiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.7810843586921692
 - confusion_matrix_gotwiki.png saved!
 - False Positive Rate is: 0.16375545851528384
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted           383        75
reverted                 3        22

============ - ganwiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (330, 17)
 - Duplicate rows found and removed: 29
 - Clean data shape: (301, 17)
 - Unique revision_ids: 301 | Data Shape: 301 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (258, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_ganwiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.8062781095504761
 - confusion_matrix_ganwiki.png saved!
 - False Positive Rate is: 0.15126050420168066
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted           202        36
reverted                 9        11

============ - pagwiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (1110, 17)
 - Duplicate rows found and removed: 208
 - Clean data shape: (902, 17)
 - Unique revision_ids: 902 | Data Shape: 902 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (734, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_pagwiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.9537625908851624
 - confusion_matrix_pagwiki.png saved!
 - False Positive Rate is: 0.14937759336099585
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted           410        72
reverted               141       111

============ - gagwiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (1700, 17)
 - Duplicate rows found and removed: 15
 - Clean data shape: (1685, 17)
 - Unique revision_ids: 1685 | Data Shape: 1685 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (1657, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_gagwiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.5600930452346802
 - confusion_matrix_gagwiki.png saved!
 - False Positive Rate is: 0.1434729064039409
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted          1391       233
reverted                 4        29

============ - sswiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (839, 17)
 - Duplicate rows found and removed: 2
 - Clean data shape: (837, 17)
 - Unique revision_ids: 837 | Data Shape: 837 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (832, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_sswiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.6542070508003235
 - confusion_matrix_sswiki.png saved!
 - False Positive Rate is: 0.15158924205378974
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted           694       124
reverted                 6         8

============ - rmywiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (257, 17)
 - Duplicate rows found and removed: 5
 - Clean data shape: (252, 17)
 - Unique revision_ids: 252 | Data Shape: 252 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (230, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_rmywiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.9060903191566467
 - confusion_matrix_rmywiki.png saved!
 - False Positive Rate is: 0.16097560975609757
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted           172        33
reverted                14        11

============ - ffwiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (36311, 17)
 - Duplicate rows found and removed: 22
 - Clean data shape: (36289, 17)
 - Unique revision_ids: 36289 | Data Shape: 36289 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (36230, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_ffwiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.3907754123210907
 - confusion_matrix_ffwiki.png saved!
 - False Positive Rate is: 0.14912814835316912
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted         30742      5388
reverted                62        38

============ - nsowiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (697, 17)
 - Duplicate rows found and removed: 29
 - Clean data shape: (668, 17)
 - Unique revision_ids: 668 | Data Shape: 668 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (632, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_nsowiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.8198308944702148
 - confusion_matrix_nsowiki.png saved!
 - False Positive Rate is: 0.1445993031358885
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted           491        83
reverted                27        31

============ - jbowiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (519, 17)
 - Duplicate rows found and removed: 9
 - Clean data shape: (510, 17)
 - Unique revision_ids: 510 | Data Shape: 510 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (325, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_jbowiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.9296131730079651
 - confusion_matrix_jbowiki.png saved!
 - False Positive Rate is: 0.15023474178403756
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted           181        32
reverted                78        34

============ - chrwiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (953, 17)
 - Duplicate rows found and removed: 228
 - Clean data shape: (725, 17)
 - Unique revision_ids: 725 | Data Shape: 725 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (604, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_chrwiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.9158294200897217
 - confusion_matrix_chrwiki.png saved!
 - False Positive Rate is: 0.14814814814814814
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted           322        56
reverted               111       115

============ - adywiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (160, 17)
 - Duplicate rows found and removed: 10
 - Clean data shape: (150, 17)
 - Unique revision_ids: 150 | Data Shape: 150 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (136, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_adywiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.8431930541992188
 - confusion_matrix_adywiki.png saved!
 - False Positive Rate is: 0.14285714285714285
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted            96        16
reverted                 9        15

============ - stqwiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (215, 17)
 - Duplicate rows found and removed: 0
 - Clean data shape: (215, 17)
 - Unique revision_ids: 215 | Data Shape: 215 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (193, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_stqwiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.8420207500457764
 - confusion_matrix_stqwiki.png saved!
 - False Positive Rate is: 0.14444444444444443
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted           154        26
reverted                 7         6

============ - tetwiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (978, 17)
 - Duplicate rows found and removed: 3
 - Clean data shape: (975, 17)
 - Unique revision_ids: 975 | Data Shape: 975 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (959, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_tetwiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.8080796599388123
 - confusion_matrix_tetwiki.png saved!
 - False Positive Rate is: 0.1554845580404686
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted           793       146
reverted                 8        12

============ - tnwiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (2544, 17)
 - Duplicate rows found and removed: 1
 - Clean data shape: (2543, 17)
 - Unique revision_ids: 2543 | Data Shape: 2543 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (2535, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_tnwiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.5295730829238892
 - confusion_matrix_tnwiki.png saved!
 - False Positive Rate is: 0.15087579617834396
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted          2133       379
reverted                 9        14

============ - lgwiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (2099, 17)
 - Duplicate rows found and removed: 38
 - Clean data shape: (2061, 17)
 - Unique revision_ids: 2061 | Data Shape: 2061 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (2001, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_lgwiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.5962941646575928
 - confusion_matrix_lgwiki.png saved!
 - False Positive Rate is: 0.14926133469179828
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted          1670       293
reverted                21        17

============ - dvwiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (818, 17)
 - Duplicate rows found and removed: 25
 - Clean data shape: (793, 17)
 - Unique revision_ids: 793 | Data Shape: 793 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (772, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_dvwiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.9355606436729431
 - confusion_matrix_dvwiki.png saved!
 - False Positive Rate is: 0.15902964959568733
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted           624       118
reverted                23         7

============ - gcrwiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (128, 17)
 - Duplicate rows found and removed: 3
 - Clean data shape: (125, 17)
 - Unique revision_ids: 125 | Data Shape: 125 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (108, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_gcrwiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.8253213167190552
 - confusion_matrix_gcrwiki.png saved!
 - False Positive Rate is: 0.16304347826086957
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted            77        15
reverted                 6        10

============ - tswiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (827, 17)
 - Duplicate rows found and removed: 3
 - Clean data shape: (824, 17)
 - Unique revision_ids: 824 | Data Shape: 824 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (815, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_tswiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.7430192828178406
 - confusion_matrix_tswiki.png saved!
 - False Positive Rate is: 0.1545338441890166
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted           662       121
reverted                22        10

============ - kbdwiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (223, 17)
 - Duplicate rows found and removed: 2
 - Clean data shape: (221, 17)
 - Unique revision_ids: 221 | Data Shape: 221 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (208, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_kbdwiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.9053165316581726
 - confusion_matrix_kbdwiki.png saved!
 - False Positive Rate is: 0.10050251256281408
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted           179        20
reverted                 1         8

============ - novwiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (526, 17)
 - Duplicate rows found and removed: 1
 - Clean data shape: (525, 17)
 - Unique revision_ids: 525 | Data Shape: 525 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (514, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_novwiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.8247998356819153
 - confusion_matrix_novwiki.png saved!
 - False Positive Rate is: 0.15169660678642716
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted           425        76
reverted                 5         8

============ - twwiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (3435, 17)
 - Duplicate rows found and removed: 3
 - Clean data shape: (3432, 17)
 - Unique revision_ids: 3432 | Data Shape: 3432 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (3413, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_twwiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.5095222592353821
 - confusion_matrix_twwiki.png saved!
 - False Positive Rate is: 0.11578323956174119
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted          2986       391
reverted                 9        27

============ - srnwiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (161, 17)
 - Duplicate rows found and removed: 1
 - Clean data shape: (160, 17)
 - Unique revision_ids: 160 | Data Shape: 160 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (149, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_srnwiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.8315097689628601
 - confusion_matrix_srnwiki.png saved!
 - False Positive Rate is: 0.17482517482517482
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted           118        25
reverted                 3         3

============ - mdfwiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (8683, 17)
 - Duplicate rows found and removed: 0
 - Clean data shape: (8683, 17)
 - Unique revision_ids: 8683 | Data Shape: 8683 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (8660, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_mdfwiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.2540217638015747
 - confusion_matrix_mdfwiki.png saved!
 - False Positive Rate is: 0.15540384170330943
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted          7299      1343
reverted                 3        15

============ - kshwiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (218, 17)
 - Duplicate rows found and removed: 1
 - Clean data shape: (217, 17)
 - Unique revision_ids: 217 | Data Shape: 217 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (206, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_kshwiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.7654833197593689
 - confusion_matrix_kshwiki.png saved!
 - False Positive Rate is: 0.1306532663316583
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted           173        26
reverted                 2         5

============ - tpiwiki - ============
 - Snapshot: 2025-06
 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
 - Raw data shape: (202, 17)
 - Duplicate rows found and removed: 0
 - Clean data shape: (202, 17)
 - Unique revision_ids: 202 | Data Shape: 202 | Same? : -> True
 - Removing edits that are reverts from df | New Shape: (196, 17)
 - Is any revert_risk_score NA? : False
 - Is any user_edit_count NA? : False
 - Is any time_to_revert NA? : False
 - ROC_tpiwiki.png saved!
 - Optimal threshold for 15.0% FPR is: 0.9687272906303406
 - confusion_matrix_tpiwiki.png saved!
 - False Positive Rate is: 0.021164021164021163
 - CONFUSION MATRIX -
Predicted     not reverted  reverted
Actual
not reverted           185         4
reverted                 6         1
5100
5101
5102============ - pihwiki - ============
5103 - Snapshot: 2025-06
5104 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
5105 - Raw data shape: (110, 17)
5106 - Duplicate rows found and removed: 1
5107 - Clean data shape: (109, 17)
5108 - Unique revision_ids: 109 | Data Shape: 109 | Same? : -> True
5109 - Removing edits that are reverts from df | New Shape: (91, 17)
5110 - Is any revert_risk_score NA? : False
5111 - Is any user_edit_count NA? : False
5112 - Is any time_to_revert NA? : False
5113 - ROC_pihwiki.png saved!
5114 - Optimal threshold for 15.0% FPR is: 0.8746289014816284
5115 - confusion_matrix_pihwiki.png saved!
5116 - False Positive Rate is: 0.19480519480519481
5117 - CONFUSION MATRIX -
5118Predicted not reverted reverted
5119Actual
5120not reverted 62 15
5121reverted 8 6
5122
5123
5124============ - biwiki - ============
5125 - Snapshot: 2025-06
5126 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
5127 - Raw data shape: (310, 17)
5128 - Duplicate rows found and removed: 28
5129 - Clean data shape: (282, 17)
5130 - Unique revision_ids: 282 | Data Shape: 282 | Same? : -> True
5131 - Removing edits that are reverts from df | New Shape: (249, 17)
5132 - Is any revert_risk_score NA? : False
5133 - Is any user_edit_count NA? : False
5134 - Is any time_to_revert NA? : False
5135 - ROC_biwiki.png saved!
5136 - Optimal threshold for 15.0% FPR is: 0.798578143119812
5137 - confusion_matrix_biwiki.png saved!
5138 - False Positive Rate is: 0.13777777777777778
5139 - CONFUSION MATRIX -
5140Predicted not reverted reverted
5141Actual
5142not reverted 194 31
5143reverted 13 11
5144
5145
5146============ - iuwiki - ============
5147 - Snapshot: 2025-06
5148 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
5149 - Raw data shape: (423, 17)
5150 - Duplicate rows found and removed: 13
5151 - Clean data shape: (410, 17)
5152 - Unique revision_ids: 410 | Data Shape: 410 | Same? : -> True
5153 - Removing edits that are reverts from df | New Shape: (364, 17)
5154 - Is any revert_risk_score NA? : False
5155 - Is any user_edit_count NA? : False
5156 - Is any time_to_revert NA? : False
5157 - ROC_iuwiki.png saved!
5158 - Optimal threshold for 15.0% FPR is: 0.9290987849235535
5159 - confusion_matrix_iuwiki.png saved!
5160 - False Positive Rate is: 0.14779874213836477
5161 - CONFUSION MATRIX -
5162Predicted not reverted reverted
5163Actual
5164not reverted 271 47
5165reverted 25 21
5166
5167
5168============ - bugwiki - ============
5169 - Snapshot: 2025-06
5170 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
5171 - Raw data shape: (829, 17)
5172 - Duplicate rows found and removed: 6
5173 - Clean data shape: (823, 17)
5174 - Unique revision_ids: 823 | Data Shape: 823 | Same? : -> True
5175 - Removing edits that are reverts from df | New Shape: (805, 17)
5176 - Is any revert_risk_score NA? : False
5177 - Is any user_edit_count NA? : False
5178 - Is any time_to_revert NA? : False
5179 - ROC_bugwiki.png saved!
5180 - Optimal threshold for 15.0% FPR is: 0.8452925682067871
5181 - confusion_matrix_bugwiki.png saved!
5182 - False Positive Rate is: 0.14863102998696218
5183 - CONFUSION MATRIX -
5184Predicted not reverted reverted
5185Actual
5186not reverted 653 114
5187reverted 27 11
5188
5189
5190============ - kgwiki - ============
5191 - Snapshot: 2025-06
5192 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
5193 - Raw data shape: (1633, 17)
5194 - Duplicate rows found and removed: 4
5195 - Clean data shape: (1629, 17)
5196 - Unique revision_ids: 1629 | Data Shape: 1629 | Same? : -> True
5197 - Removing edits that are reverts from df | New Shape: (1605, 17)
5198 - Is any revert_risk_score NA? : False
5199 - Is any user_edit_count NA? : False
5200 - Is any time_to_revert NA? : False
5201 - ROC_kgwiki.png saved!
5202 - Optimal threshold for 15.0% FPR is: 0.4633360207080841
5203 - confusion_matrix_kgwiki.png saved!
5204 - False Positive Rate is: 0.1471147748890298
5205 - CONFUSION MATRIX -
5206Predicted not reverted reverted
5207Actual
5208not reverted 1345 232
5209reverted 8 20
5210
5211
5212============ - vewiki - ============
5213 - Snapshot: 2025-06
5214 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
5215 - Raw data shape: (243, 17)
5216 - Duplicate rows found and removed: 6
5217 - Clean data shape: (237, 17)
5218 - Unique revision_ids: 237 | Data Shape: 237 | Same? : -> True
5219 - Removing edits that are reverts from df | New Shape: (217, 17)
5220 - Is any revert_risk_score NA? : False
5221 - Is any user_edit_count NA? : False
5222 - Is any time_to_revert NA? : False
5223 - ROC_vewiki.png saved!
5224 - Optimal threshold for 15.0% FPR is: 0.8505914807319641
5225 - confusion_matrix_vewiki.png saved!
5226 - False Positive Rate is: 0.14432989690721648
5227 - CONFUSION MATRIX -
5228Predicted not reverted reverted
5229Actual
5230not reverted 166 28
5231reverted 15 8
5232
5233
5234============ - piwiki - ============
5235 - Snapshot: 2025-06
5236 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
5237 - Raw data shape: (96, 17)
5238 - Duplicate rows found and removed: 0
5239 - Clean data shape: (96, 17)
5240 - Unique revision_ids: 96 | Data Shape: 96 | Same? : -> True
5241 - Removing edits that are reverts from df | New Shape: (92, 17)
5242 - Is any revert_risk_score NA? : False
5243 - Is any user_edit_count NA? : False
5244 - Is any time_to_revert NA? : False
5245 - ROC_piwiki.png saved!
5246 - Optimal threshold for 15.0% FPR is: 0.9070486426353455
5247 - confusion_matrix_piwiki.png saved!
5248 - False Positive Rate is: 0.12359550561797752
5249 - CONFUSION MATRIX -
5250Predicted not reverted reverted
5251Actual
5252not reverted 78 11
5253reverted 3 0
5254
5255
5256============ - krcwiki - ============
5257 - Snapshot: 2025-06
5258 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
5259 - Raw data shape: (4382, 17)
5260 - Duplicate rows found and removed: 22
5261 - Clean data shape: (4360, 17)
5262 - Unique revision_ids: 4360 | Data Shape: 4360 | Same? : -> True
5263 - Removing edits that are reverts from df | New Shape: (4352, 17)
5264 - Is any revert_risk_score NA? : False
5265 - Is any user_edit_count NA? : False
5266 - Is any time_to_revert NA? : False
5267 - ROC_krcwiki.png saved!
5268 - Optimal threshold for 15.0% FPR is: 0.28637173771858215
5269 - confusion_matrix_krcwiki.png saved!
5270 - False Positive Rate is: 0.15777262180974477
5271 - CONFUSION MATRIX -
5272Predicted not reverted reverted
5273Actual
5274not reverted 3630 680
5275reverted 24 18
5276
5277
5278============ - jamwiki - ============
5279 - Snapshot: 2025-06
5280 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
5281 - Raw data shape: (315, 17)
5282 - Duplicate rows found and removed: 15
5283 - Clean data shape: (300, 17)
5284 - Unique revision_ids: 300 | Data Shape: 300 | Same? : -> True
5285 - Removing edits that are reverts from df | New Shape: (220, 17)
5286 - Is any revert_risk_score NA? : False
5287 - Is any user_edit_count NA? : False
5288 - Is any time_to_revert NA? : False
5289 - ROC_jamwiki.png saved!
5290 - Optimal threshold for 15.0% FPR is: 0.8582241535186768
5291 - confusion_matrix_jamwiki.png saved!
5292 - False Positive Rate is: 0.1518987341772152
5293 - CONFUSION MATRIX -
5294Predicted not reverted reverted
5295Actual
5296not reverted 134 24
5297reverted 24 38
5298
5299
5300============ - xalwiki - ============
5301 - Snapshot: 2025-06
5302 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
5303 - Raw data shape: (895, 17)
5304 - Duplicate rows found and removed: 36
5305 - Clean data shape: (859, 17)
5306 - Unique revision_ids: 859 | Data Shape: 859 | Same? : -> True
5307 - Removing edits that are reverts from df | New Shape: (478, 17)
5308 - Is any revert_risk_score NA? : False
5309 - Is any user_edit_count NA? : False
5310 - Is any time_to_revert NA? : False
5311 - ROC_xalwiki.png saved!
5312 - Optimal threshold for 15.0% FPR is: 0.9615033268928528
5313 - confusion_matrix_xalwiki.png saved!
5314 - False Positive Rate is: 0.14915254237288136
5315 - CONFUSION MATRIX -
5316Predicted not reverted reverted
5317Actual
5318not reverted 251 44
5319reverted 60 123
5320
5321
5322============ - pntwiki - ============
5323 - Snapshot: 2025-06
5324 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
5325 - Raw data shape: (372, 17)
5326 - Duplicate rows found and removed: 24
5327 - Clean data shape: (348, 17)
5328 - Unique revision_ids: 348 | Data Shape: 348 | Same? : -> True
5329 - Removing edits that are reverts from df | New Shape: (338, 17)
5330 - Is any revert_risk_score NA? : False
5331 - Is any user_edit_count NA? : False
5332 - Is any time_to_revert NA? : False
5333 - ROC_pntwiki.png saved!
5334 - Optimal threshold for 15.0% FPR is: 0.8978685140609741
5335 - confusion_matrix_pntwiki.png saved!
5336 - False Positive Rate is: 0.15873015873015872
5337 - CONFUSION MATRIX -
5338Predicted not reverted reverted
5339Actual
5340not reverted 265 50
5341reverted 6 17
5342
5343
5344============ - towiki - ============
5345 - Snapshot: 2025-06
5346 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
5347 - Raw data shape: (222, 17)
5348 - Duplicate rows found and removed: 6
5349 - Clean data shape: (216, 17)
5350 - Unique revision_ids: 216 | Data Shape: 216 | Same? : -> True
5351 - Removing edits that are reverts from df | New Shape: (120, 17)
5352 - Is any revert_risk_score NA? : False
5353 - Is any user_edit_count NA? : False
5354 - Is any time_to_revert NA? : False
5355 - ROC_towiki.png saved!
5356 - Optimal threshold for 15.0% FPR is: 0.7774780988693237
5357 - confusion_matrix_towiki.png saved!
5358 - False Positive Rate is: 0.16363636363636364
5359 - CONFUSION MATRIX -
5360Predicted not reverted reverted
5361Actual
5362not reverted 92 18
5363reverted 7 3
5364
5365
5366============ - tumwiki - ============
5367 - Snapshot: 2025-06
5368 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
5369 - Raw data shape: (525, 17)
5370 - Duplicate rows found and removed: 0
5371 - Clean data shape: (525, 17)
5372 - Unique revision_ids: 525 | Data Shape: 525 | Same? : -> True
5373 - Removing edits that are reverts from df | New Shape: (518, 17)
5374 - Is any revert_risk_score NA? : False
5375 - Is any user_edit_count NA? : False
5376 - Is any time_to_revert NA? : False
5377 - ROC_tumwiki.png saved!
5378 - Optimal threshold for 15.0% FPR is: 0.6807906031608582
5379 - confusion_matrix_tumwiki.png saved!
5380 - False Positive Rate is: 0.17221135029354206
5381 - CONFUSION MATRIX -
5382Predicted not reverted reverted
5383Actual
5384not reverted 423 88
5385reverted 1 6
5386
5387
5388============ - dzwiki - ============
5389 - Snapshot: 2025-06
5390 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
5391 - Raw data shape: (949, 17)
5392 - Duplicate rows found and removed: 7
5393 - Clean data shape: (942, 17)
5394 - Unique revision_ids: 942 | Data Shape: 942 | Same? : -> True
5395 - Removing edits that are reverts from df | New Shape: (923, 17)
5396 - Is any revert_risk_score NA? : False
5397 - Is any user_edit_count NA? : False
5398 - Is any time_to_revert NA? : False
5399 - ROC_dzwiki.png saved!
5400 - Optimal threshold for 15.0% FPR is: 0.661038339138031
5401 - confusion_matrix_dzwiki.png saved!
5402 - False Positive Rate is: 0.14733178654292342
5403 - CONFUSION MATRIX -
5404Predicted not reverted reverted
5405Actual
5406not reverted 735 127
5407reverted 43 18
5408
5409
5410============ - chywiki - ============
5411 - Snapshot: 2025-06
5412 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
5413 - Raw data shape: (155, 17)
5414 - Duplicate rows found and removed: 1
5415 - Clean data shape: (154, 17)
5416 - Unique revision_ids: 154 | Data Shape: 154 | Same? : -> True
5417 - Removing edits that are reverts from df | New Shape: (126, 17)
5418 - Is any revert_risk_score NA? : False
5419 - Is any user_edit_count NA? : False
5420 - Is any time_to_revert NA? : False
5421 - ROC_chywiki.png saved!
5422 - Optimal threshold for 15.0% FPR is: 0.8710290193557739
5423 - confusion_matrix_chywiki.png saved!
5424 - False Positive Rate is: 0.14953271028037382
5425 - CONFUSION MATRIX -
5426Predicted not reverted reverted
5427Actual
5428not reverted 91 16
5429reverted 12 7
5430
5431
5432============ - ikwiki - ============
5433 - Snapshot: 2025-06
5434 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
5435 - Raw data shape: (227, 17)
5436 - Duplicate rows found and removed: 4
5437 - Clean data shape: (223, 17)
5438 - Unique revision_ids: 223 | Data Shape: 223 | Same? : -> True
5439 - Removing edits that are reverts from df | New Shape: (200, 17)
5440 - Is any revert_risk_score NA? : False
5441 - Is any user_edit_count NA? : False
5442 - Is any time_to_revert NA? : False
5443 - ROC_ikwiki.png saved!
5444 - Optimal threshold for 15.0% FPR is: 0.6205543875694275
5445 - confusion_matrix_ikwiki.png saved!
5446 - False Positive Rate is: 0.15819209039548024
5447 - CONFUSION MATRIX -
5448Predicted not reverted reverted
5449Actual
5450not reverted 149 28
5451reverted 1 22
5452
5453
5454============ - koiwiki - ============
5455 - Snapshot: 2025-06
5456 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
5457 - Raw data shape: (216, 17)
5458 - Duplicate rows found and removed: 2
5459 - Clean data shape: (214, 17)
5460 - Unique revision_ids: 214 | Data Shape: 214 | Same? : -> True
5461 - Removing edits that are reverts from df | New Shape: (209, 17)
5462 - Is any revert_risk_score NA? : False
5463 - Is any user_edit_count NA? : False
5464 - Is any time_to_revert NA? : False
5465 - ROC_koiwiki.png saved!
5466 - Optimal threshold for 15.0% FPR is: 0.8761064410209656
5467 - confusion_matrix_koiwiki.png saved!
5468 - False Positive Rate is: 0.06829268292682927
5469 - CONFUSION MATRIX -
5470Predicted not reverted reverted
5471Actual
5472not reverted 191 14
5473reverted 3 1
5474
5475
5476============ - bmwiki - ============
5477 - Snapshot: 2025-06
5478 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
5479 - Raw data shape: (194, 17)
5480 - Duplicate rows found and removed: 4
5481 - Clean data shape: (190, 17)
5482 - Unique revision_ids: 190 | Data Shape: 190 | Same? : -> True
5483 - Removing edits that are reverts from df | New Shape: (155, 17)
5484 - Is any revert_risk_score NA? : False
5485 - Is any user_edit_count NA? : False
5486 - Is any time_to_revert NA? : False
5487 - ROC_bmwiki.png saved!
5488 - Optimal threshold for 15.0% FPR is: 0.9002848267555237
5489 - confusion_matrix_bmwiki.png saved!
5490 - False Positive Rate is: 0.14814814814814814
5491 - CONFUSION MATRIX -
5492Predicted not reverted reverted
5493Actual
5494not reverted 115 20
5495reverted 11 9
5496
5497
5498============ - eewiki - ============
5499 - Snapshot: 2025-06
5500 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
5501 - Raw data shape: (1284, 17)
5502 - Duplicate rows found and removed: 7
5503 - Clean data shape: (1277, 17)
5504 - Unique revision_ids: 1277 | Data Shape: 1277 | Same? : -> True
5505 - Removing edits that are reverts from df | New Shape: (1266, 17)
5506 - Is any revert_risk_score NA? : False
5507 - Is any user_edit_count NA? : False
5508 - Is any time_to_revert NA? : False
5509 - ROC_eewiki.png saved!
5510 - Optimal threshold for 15.0% FPR is: 0.6407523155212402
5511 - confusion_matrix_eewiki.png saved!
5512 - False Positive Rate is: 0.14297124600638977
5513 - CONFUSION MATRIX -
5514Predicted not reverted reverted
5515Actual
5516not reverted 1073 179
5517reverted 4 10
5518
5519
5520============ - rnwiki - ============
5521 - Snapshot: 2025-06
5522 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
5523 - Raw data shape: (217, 17)
5524 - Duplicate rows found and removed: 7
5525 - Clean data shape: (210, 17)
5526 - Unique revision_ids: 210 | Data Shape: 210 | Same? : -> True
5527 - Removing edits that are reverts from df | New Shape: (199, 17)
5528 - Is any revert_risk_score NA? : False
5529 - Is any user_edit_count NA? : False
5530 - Is any time_to_revert NA? : False
5531 - ROC_rnwiki.png saved!
5532 - Optimal threshold for 15.0% FPR is: 0.805659830570221
5533 - confusion_matrix_rnwiki.png saved!
5534 - False Positive Rate is: 0.14754098360655737
5535 - CONFUSION MATRIX -
5536Predicted not reverted reverted
5537Actual
5538not reverted 156 27
5539reverted 7 9
5540
5541
5542============ - lbewiki - ============
5543 - Snapshot: 2025-06
5544 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
5545 - Raw data shape: (694, 17)
5546 - Duplicate rows found and removed: 6
5547 - Clean data shape: (688, 17)
5548 - Unique revision_ids: 688 | Data Shape: 688 | Same? : -> True
5549 - Removing edits that are reverts from df | New Shape: (651, 17)
5550 - Is any revert_risk_score NA? : False
5551 - Is any user_edit_count NA? : False
5552 - Is any time_to_revert NA? : False
5553 - ROC_lbewiki.png saved!
5554 - Optimal threshold for 15.0% FPR is: 0.6905577182769775
5555 - confusion_matrix_lbewiki.png saved!
5556 - False Positive Rate is: 0.15522875816993464
5557 - CONFUSION MATRIX -
5558Predicted not reverted reverted
5559Actual
5560not reverted 517 95
5561reverted 3 36
5562
5563
5564============ - zawiki - ============
5565 - Snapshot: 2025-06
5566 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
5567 - Raw data shape: (206, 17)
5568 - Duplicate rows found and removed: 3
5569 - Clean data shape: (203, 17)
5570 - Unique revision_ids: 203 | Data Shape: 203 | Same? : -> True
5571 - Removing edits that are reverts from df | New Shape: (172, 17)
5572 - Is any revert_risk_score NA? : False
5573 - Is any user_edit_count NA? : False
5574 - Is any time_to_revert NA? : False
5575 - ROC_zawiki.png saved!
5576 - Optimal threshold for 15.0% FPR is: 0.7917940020561218
5577 - confusion_matrix_zawiki.png saved!
5578 - False Positive Rate is: 0.18
5579 - CONFUSION MATRIX -
5580Predicted not reverted reverted
5581Actual
5582not reverted 123 27
5583reverted 4 18
5584
5585
5586============ - kiwiki - ============
5587 - Snapshot: 2025-06
5588 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
5589 - Raw data shape: (190, 17)
5590 - Duplicate rows found and removed: 2
5591 - Clean data shape: (188, 17)
5592 - Unique revision_ids: 188 | Data Shape: 188 | Same? : -> True
5593 - Removing edits that are reverts from df | New Shape: (165, 17)
5594 - Is any revert_risk_score NA? : False
5595 - Is any user_edit_count NA? : False
5596 - Is any time_to_revert NA? : False
5597 - ROC_kiwiki.png saved!
5598 - Optimal threshold for 15.0% FPR is: 0.8098335266113281
5599 - confusion_matrix_kiwiki.png saved!
5600 - False Positive Rate is: 0.11409395973154363
5601 - CONFUSION MATRIX -
5602Predicted not reverted reverted
5603Actual
5604not reverted 132 17
5605reverted 2 14
5606
5607
5608============ - sgwiki - ============
5609 - Snapshot: 2025-06
5610 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
5611 - Raw data shape: (146, 17)
5612 - Duplicate rows found and removed: 0
5613 - Clean data shape: (146, 17)
5614 - Unique revision_ids: 146 | Data Shape: 146 | Same? : -> True
5615 - Removing edits that are reverts from df | New Shape: (139, 17)
5616 - Is any revert_risk_score NA? : False
5617 - Is any user_edit_count NA? : False
5618 - Is any time_to_revert NA? : False
5619 - ROC_sgwiki.png saved!
5620 - Optimal threshold for 15.0% FPR is: 0.8118735551834106
5621 - confusion_matrix_sgwiki.png saved!
5622 - False Positive Rate is: 0.09848484848484848
5623 - CONFUSION MATRIX -
5624Predicted not reverted reverted
5625Actual
5626not reverted 119 13
5627reverted 5 2
5628
5629
5630============ - chwiki - ============
5631 - Snapshot: 2025-06
5632 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
5633 - Raw data shape: (79, 17)
5634 - Duplicate rows found and removed: 1
5635 - Clean data shape: (78, 17)
5636 - Unique revision_ids: 78 | Data Shape: 78 | Same? : -> True
5637 - Removing edits that are reverts from df | New Shape: (71, 17)
5638 - Is any revert_risk_score NA? : False
5639 - Is any user_edit_count NA? : False
5640 - Is any time_to_revert NA? : False
5641 - ROC_chwiki.png saved!
5642 - Optimal threshold for 15.0% FPR is: 0.9022724032402039
5643 - confusion_matrix_chwiki.png saved!
5644 - False Positive Rate is: 0.15254237288135594
5645 - CONFUSION MATRIX -
5646Predicted not reverted reverted
5647Actual
5648not reverted 50 9
5649reverted 7 5
5650
5651
5652============ - tywiki - ============
5653 - Snapshot: 2025-06
5654 - Date Window: 2024-08-01 00:00:00 - 2025-09-05 00:00:00
5655 - Raw data shape: (92, 17)
5656 - Duplicate rows found and removed: 5
5657 - Clean data shape: (87, 17)
5658 - Unique revision_ids: 87 | Data Shape: 87 | Same? : -> True
5659 - Removing edits that are reverts from df | New Shape: (83, 17)
5660 - Is any revert_risk_score NA? : False
5661 - Is any user_edit_count NA? : False
5662 - Is any time_to_revert NA? : False
5663 - ROC_tywiki.png saved!
5664 - Optimal threshold for 15.0% FPR is: 0.8348962664604187
5665 - confusion_matrix_tywiki.png saved!
5666 - False Positive Rate is: 0.2857142857142857
5667 - CONFUSION MATRIX -
5668Predicted not reverted reverted
5669Actual
5670not reverted 55 22
5671reverted 6 0
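
For reference, the "False Positive Rate" values reported in the log match FP / (FP + TN) computed from each confusion matrix, for example 1343 / (7299 + 1343) for mdfwiki. The sketch below reproduces that arithmetic and shows one plausible way to pick a threshold for a target FPR. The function names and the "closest FPR" selection rule are assumptions made for illustration, and are not necessarily what the analysis script does.

```python
from typing import List, Sequence, Tuple


def false_positive_rate(tn: int, fp: int) -> float:
    """FPR = FP / (FP + TN), computed over edits that were not reverted."""
    return fp / (fp + tn)


def threshold_closest_to_fpr(
    scores: Sequence[float],
    reverted: Sequence[bool],
    target_fpr: float = 0.15,
) -> Tuple[float, float]:
    """Return (threshold, fpr) for the candidate whose FPR is closest to target_fpr.

    Assumption: an edit is predicted "reverted" when score >= threshold, and
    every observed score is tried as a candidate threshold.
    """
    negatives: List[float] = [s for s, r in zip(scores, reverted) if not r]
    if not negatives:
        raise ValueError("need at least one non-reverted edit")
    best = None
    for candidate in sorted(set(scores)):
        fp = sum(1 for s in negatives if s >= candidate)
        fpr = fp / len(negatives)
        if best is None or abs(fpr - target_fpr) < abs(best[1] - target_fpr):
            best = (candidate, fpr)
    return best


# Confusion-matrix counts copied from the log above.
print("mdfwiki FPR:", false_positive_rate(tn=7299, fp=1343))
print("tywiki FPR:", false_positive_rate(tn=55, fp=22))
```

On small wikis each non-reverted edit moves the FPR by a large step (1/77 for tywiki), which is one likely reason the achieved FPR can land far from the 15% target.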

Optimal Thresholds Table

Wiki Threshold
dewiki 0.560
jawiki 0.780
viwiki 0.672
thwiki 0.708
nowiki 0.568
elwiki 0.807
hywiki 0.665
hiwiki 0.878
bgwiki 0.867
dawiki 0.740
hrwiki 0.520
skwiki 0.889
mswiki 0.942
euwiki 0.450
slwiki 0.865
ltwiki 0.351
tawiki 0.635
zh_yuewiki 0.857
kawiki 0.460
eowiki 0.3475
glwiki 0.269
urwiki 0.2603
sqwiki 0.583
mywiki 0.871
ckbwiki 0.279
knwiki 0.4486
shwiki 0.117
uzwiki 0.534
cebwiki 0.0102
be_x_oldwiki 0.613
aswiki 0.250
newiki 0.824
gawiki 0.408
kuwiki 0.342
scowiki 0.836
arzwiki 0.072
bawiki 0.659
ttwiki 0.388
astwiki 0.149
jvwiki 0.387
ocwiki 0.866
lbwiki 0.431
satwiki 0.2335
mnwiki 0.75
azbwiki 0.621
guwiki 0.324
brwiki 0.565
warwiki 0.137
siwiki 0.643
minwiki 0.170
wuuwiki 0.401
sowiki 0.933
orwiki 0.277
tgwiki 0.708
yiwiki 0.912
avkwiki 0.660
kywiki 0.473
zh_min_nanwiki 0.859
kmwiki 0.893
zh_classicalwiki 0.815
hywwiki 0.363
alswiki 0.635
fywiki 0.546
anwiki 0.327
suwiki 0.781
yowiki 0.83
arywiki 0.117
sdwiki 0.453
vecwiki 0.631
pswiki 0.581
ndswiki 0.705
banwiki 0.735
sahwiki 0.846
tcywiki 0.277
lijwiki 0.369
lmowiki 0.215
barwiki 0.721
bclwiki 0.400
cvwiki 0.435
mtwiki 0.525
iawiki 0.595
szywiki 0.352
pnbwiki 0.328
scwiki 0.621
cewiki 0.442
vowiki 0.212
tkwiki 0.839
iowiki 0.419
mnwwiki 0.348
sawiki 0.798
quwiki 0.678
crhwiki 0.751
bhwiki 0.327
lowiki 0.549
maiwiki 0.425
diqwiki 0.501
liwiki 0.505
nds_nlwiki 0.895
fowiki 0.803
iewiki 0.503
kwwiki 0.590
htwiki 0.418
oswiki 0.314
igwiki 0.334
pmswiki 0.897
myvwiki 0.477
acewiki 0.858
abwiki 0.638
tyvwiki 0.730
gdwiki 0.625
mznwiki 0.165
mgwiki 0.825
cowiki 0.817
xmfwiki 0.142
wawiki 0.613
nqowiki 0.734
pcdwiki 0.527
amwiki 0.913
emlwiki 0.613
scnwiki 0.426
zuwiki 0.798
lldwiki 0.487
bjnwiki 0.613
frrwiki 0.657
bat_smgwiki 0.738
sewiki 0.854
lfnwiki 0.805
vepwiki 0.284
kabwiki 0.875
ruewiki 0.475
ugwiki 0.333
lezwiki 0.882
szlwiki 0.609
frpwiki 0.833
olowiki 0.615
bpywiki 0.904
rwwiki 0.602
mhrwiki 0.811
gorwiki 0.631
dsbwiki 0.843
rmwiki 0.773
glkwiki 0.2158
napwiki 0.901
gnwiki 0.613
fiu_vrowiki 0.626
snwiki 0.844
hawwiki 0.726
gomwiki 0.791
atjwiki 0.677
awawiki 0.776
hifwiki 0.599
vlswiki 0.737
hsbwiki 0.619
papwiki 0.317
ilowiki 0.861
angwiki 0.623
udmwiki 0.928
inhwiki 0.512
shnwiki 0.3912
roa_tarawiki 0.855
pamwiki 0.613
hakwiki 0.872
xhwiki 0.826
cdowiki 0.749
crwiki 0.960
bowiki 0.682
mwlwiki 0.752
kvwiki 0.837
nvwiki 0.3899
tiwiki 0.934
lnwiki 0.613
dinwiki 0.974
pdcwiki 0.877
wowiki 0.909
ladwiki 0.613
kaawiki 0.3656
avwiki 0.642
arcwiki 0.843
nywiki 0.802
cuwiki 0.828
pflwiki 0.613
csbwiki 0.854
extwiki 0.543
miwiki 0.881
aywiki 0.720
nrmwiki 0.868
furwiki 0.932
cbk_zamwiki 0.787
newwiki 0.521
nahwiki 0.920
gvwiki 0.349
omwiki 0.840
klwiki 0.816
zeawiki 0.633
smwiki 0.753
roa_rupwiki 0.691
map_bmswiki 0.908
stwiki 0.474
kswiki 0.2554
bxrwiki 0.842
kbpwiki 0.821
fjwiki 0.483
ltgwiki 0.812
gotwiki 0.781
ganwiki 0.806
pagwiki 0.953
gagwiki 0.560
sswiki 0.654
rmywiki 0.906
ffwiki 0.390
nsowiki 0.819
jbowiki 0.929
chrwiki 0.915
adywiki 0.843
stqwiki 0.842
tetwiki 0.808
tnwiki 0.529
lgwiki 0.596
dvwiki 0.935
gcrwiki 0.825
tswiki 0.743
kbdwiki 0.905
novwiki 0.824
twwiki 0.509
srnwiki 0.831
mdfwiki 0.254
kshwiki 0.765
tpiwiki 0.968
pihwiki 0.874
biwiki 0.799
iuwiki 0.929
bugwiki 0.845
kgwiki 0.463
vewiki 0.850
piwiki 0.907
krcwiki 0.286
jamwiki 0.858
xalwiki 0.961
pntwiki 0.897
towiki 0.777
tumwiki 0.680
dzwiki 0.661
chywiki 0.871
ikwiki 0.620
koiwiki 0.876
bmwiki 0.900
eewiki 0.640
rnwiki 0.805
lbewiki 0.690
zawiki 0.791
kiwiki 0.809
sgwiki 0.811
chwiki 0.902
tywiki 0.834

@Samwalton9-WMF @OTichonova

Which of these wikis do we want to prioritize first for the dashboard?

thwiki is the additional wiki we're considering for the MVP deployment that doesn't currently have Revert Risk.

Perfect, I went ahead and created a ticket for deploying to thwiki (T409438: Enable revertrisk filters in thwiki) and put it on the board for this sprint, since I assume we'll want to get started on this ASAP. Feel free to move it as needed.

Change #1212081 had a related patch set uploaded (by Gkyziridis; author: Gkyziridis):

[operations/mediawiki-config@master] ores-extension: Enable revertrisk filters for multiple wikis.

https://gerrit.wikimedia.org/r/1212081

Change #1212086 had a related patch set uploaded (by Gkyziridis; author: Gkyziridis):

[operations/mediawiki-config@master] ores-extension: Enable revertrisklanguageagnostic on multiple wikis.

https://gerrit.wikimedia.org/r/1212086

Change #1212081 abandoned by Gkyziridis:

[operations/mediawiki-config@master] ores-extension: Enable revertrisk filters for multiple wikis.

Reason:

Abandon because of tests. New clean patch: 1212086

https://gerrit.wikimedia.org/r/1212081

Update

I configured the Revert Risk (RR) thresholds for all the wikis and enabled the model for all of them in this patch: https://gerrit.wikimedia.org/r/c/operations/mediawiki-config/+/1212086 .
I excluded thwiki from the patch above, since you are using it for the MVP here: https://phabricator.wikimedia.org/T409438
I also did not run composer manage-dblist add {wiki_name} ores for any of the wikis, so we will need to run it for each wiki when we deploy them.

When we start the actual deployment:
Because a large number of wikis need to be deployed, I suggest doing it in batches. Right now the patch above only sets the thresholds for each wiki, which means that merging and deploying it will not change anything. In the next iterations, when we start deploying the wikis, we need to enable the ORES model and the UI as well. Only then will the thresholds configured in the patch take effect. So I suggest enabling the ORES model in batches, e.g. 4-5 wikis per batch. It will take some time to finish all the batches, but this way we can more easily handle any issues that occur during the backport deployments.
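
As a rough illustration of the batching idea, the sketch below splits a list of wikis into groups of at most five and prints the manage-dblist command quoted above for each wiki. The `batches` helper, the batch size default, and the sample wiki list are assumptions made for this example; the real rollout order and tooling may differ.

```python
from typing import Iterator, List


def batches(wikis: List[str], size: int = 5) -> Iterator[List[str]]:
    """Yield consecutive groups of at most `size` wikis, preserving order."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(wikis), size):
        yield wikis[start:start + size]


# Illustrative subset only; the full list of wikis lives in the patch.
sample_wikis = ["mdfwiki", "kshwiki", "tpiwiki", "pihwiki", "biwiki", "iuwiki", "bugwiki"]

for number, batch in enumerate(batches(sample_wikis, size=5), start=1):
    print(f"Batch {number}:")
    for wiki in batch:
        # Command text taken from the comment above; it is only printed here, not executed.
        print(f"  composer manage-dblist add {wiki} ores")
```

Keeping each batch small means a problem found during one backport window only affects a handful of wikis.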