Page MenuHomePhabricator

Revisions with rev_page = 0 in production databases
Open, Needs TriagePublic

Description

There are hundreds of rows in the revision table in production databases that have rev_page set to 0. They cannot be accessed (at least not in the normal ways), as they aren’t connected to any page, but they can cause production errors like T309151. It should be investigated how they came to be (it’s especially interesting how unrelated the number of broken revisions compared to the wiki size is), they should be cleaned up, and maybe a foreign key constraint should be introduced to prevent them from being inserted in the first place.

tacsipacsi@tools-sgebastion-10:~$ for wiki in enwiki frwiki cebwiki metawiki commonswiki mediawikiwiki wikidatawiki ; do sql $wiki "SELECT COUNT(*) AS '$wiki' FROM revision WHERE rev_page = 0"; done
+--------+
| enwiki |
+--------+
|     21 |
+--------+
+--------+
| frwiki |
+--------+
|     23 |
+--------+
+---------+
| cebwiki |
+---------+
|       0 |
+---------+
+----------+
| metawiki |
+----------+
|      403 |
+----------+
+-------------+
| commonswiki |
+-------------+
|          39 |
+-------------+
+---------------+
| mediawikiwiki |
+---------------+
|           320 |
+---------------+
+--------------+
| wikidatawiki |
+--------------+
|            3 |
+--------------+

(This is not an exhaustive list of affected wikis, I just queried some wikis that came into my mind.)

This may be related to T112282, but I didn’t find any broken rows on mgwiktionary.

Event Timeline

Marostegui subscribed.

I think this is more MW side of things than the database storage itself.

200 wikis have this issue:

1acewiki: 1
2alswiki: 1
3arwiki: 16
4arwikiversity: 20
5arwiktionary: 67
6arzwiki: 158
7azbwiki: 164
8azwiki: 5
9barwiki: 4
10bawiki: 2
11betawikiversity: 36
12bewiki: 1
13bgwiki: 1
14bhwiki: 5
15bjnwiki: 2
16brwikimedia: 3
17bswiki: 1
18cawiki: 1
19chywiki: 1025
20commonswiki: 39
21cowikimedia: 1
22crwiki: 3
23cswiki: 8
24cswikisource: 75
25cywikisource: 7
26dewiki: 2105
27dewikiversity: 706
28dewikivoyage: 23
29dewiktionary: 1
30dsbwiki: 3
31elwiki: 255
32enwiki: 21
33enwikibooks: 3308
34enwikinews: 5
35enwikisource: 4
36enwikiversity: 64
37enwikivoyage: 9
38enwiktionary: 6
39eswiki: 31
40eswikibooks: 1
41eswikinews: 1
42eswikivoyage: 5
43etwiki: 1
44euwiki: 1
45extwiki: 100
46fawiki: 24
47fawikinews: 1
48fawikivoyage: 36
49fiwiki: 3
50foundationwiki: 24
51frrwiki: 225
52frwiki: 23
53frwikibooks: 45
54frwikinews: 1
55frwikiquote: 3
56frwikisource: 2
57frwiktionary: 184
58glwiktionary: 4
59gomwiki: 29
60guwiki: 1308
61hewiki: 1
62hewikiquote: 165
63hewikisource: 34
64hewikivoyage: 290
65hifwiki: 5
66hiwiki: 314
67huwiki: 16
68huwikinews: 2
69idwiki: 9
70idwiktionary: 5
71incubatorwiki: 787
72itwiki: 64
73itwikibooks: 2
74itwikisource: 420
75itwikiversity: 53
76itwikivoyage: 1673
77itwiktionary: 23
78jawiki: 9
79jawikiversity: 7
80kaawiki: 1
81kawiki: 1
82kbdwiki: 33
83kkwiki: 4
84kmwiki: 13
85knwiki: 1300
86knwikisource: 1
87koiwiki: 8
88kowikiversity: 962
89krcwiki: 1
90kshwiki: 1
91ladwiki: 1
92liwiktionary: 1
93ltgwiki: 29
94ltwiki: 1
95ltwiktionary: 1
96maiwiki: 516
97mdfwiki: 1
98mediawikiwiki: 320
99metawiki: 403
100mhrwiki: 7
101mkwiki: 1
102mlwiki: 2503
103mlwiktionary: 3
104mrjwiki: 1
105mrwiki: 1
106mwlwiki: 5
107mznwiki: 1766
108napwiki: 1
109nlwiki: 16
110nlwikisource: 2
111nlwiktionary: 1
112nowiki: 412
113nsowiki: 4
114nycwikimedia: 372
115officewiki: 4
116orwiki: 89
117outreachwiki: 200
118pflwiki: 2
119plwikibooks: 1
120plwikisource: 1
121pnbwiki: 16
122pnbwiktionary: 5
123pntwiki: 6
124ptwiki: 33
125ptwikinews: 1
126ptwikisource: 1
127ptwikiversity: 12
128quwiki: 1
129rowiki: 4
130ruewiki: 4
131ruwiki: 81
132ruwikimedia: 17
133ruwikisource: 3
134ruwikiversity: 5
135sahwiki: 5
136sawikiquote: 3
137sawiktionary: 17
138sewiki: 299
139sewikimedia: 1
140simplewiki: 128
141simplewiktionary: 10
142skwiki: 1
143skwikisource: 67
144slwiki: 1
145sourceswiki: 245
146sqwikinews: 5
147svwiki: 5
148svwikiversity: 11
149szlwiki: 2
150tawiki: 1
151test2wiki: 1
152testwiki: 28
153tetwiki: 13
154tewiki: 99
155thwiki: 2
156tlwiki: 2
157tpiwiki: 68
158trwiki: 7
159ukwiki: 2
160urwiki: 3
161vecwikisource: 276
162vepwiki: 5
163viwiki: 503
164viwikibooks: 106
165viwikivoyage: 26
166wikidatawiki: 3
167wikimania2014wiki: 129
168wuuwiki: 1
169yiwikisource: 1
170zhwikisource: 94
171zhwikivoyage: 32
172zhwiktionary: 1

It should be investigated how they came to be

For what it’s worth, it apparently hasn’t happened to new revisions in years:

lucaswerkmeister-wmde@stat1007:~$ for wiki in enwiki frwiki cebwiki metawiki commonswiki mediawikiwiki wikidatawiki ; do sudo -u analytics-wmde analytics-mysql $wiki <<< "SELECT SUBSTR(rev_timestamp, 1, 4) AS year, COUNT(*) AS '$wiki' FROM revision WHERE rev_page = 0 GROUP BY SUBSTR(rev_timestamp, 1, 4)"; printf '\n'; done
year	enwiki
2012	21

year	frwiki
2008	5
2009	8
2010	5
2011	5


year	metawiki
2005	1
2007	2
2008	33
2009	114
2010	2
2011	84
2012	161
2014	6

year	commonswiki
2005	3
2006	12
2008	3
2009	1
2010	15
2011	3
2012	2

year	mediawikiwiki
2007	62
2008	44
2009	108
2010	73
2011	24
2012	9

year	wikidatawiki
2004	1
2006	2

Though apparently revision rows like this can still be created via import (wikidatawiki didn’t exist in 2004/2006, so those must be imported revisions).

lucaswerkmeister-wmde@stat1007:~$ curl -sL https://phabricator.wikimedia.org/paste/raw/28505/ | cut -d: -f1 | while IFS= read -r wiki; do year=$(sudo -u analytics-wmde analytics-mysql "$wiki" -N <<< "SELECT SUBSTR(MAX(rev_timestamp), 1, 4) FROM revision WHERE rev_page = 0"); printf '%d\t%20s\n' "$year" "$wiki"; done | sort -rn
2015	            testwiki
2015	              orwiki
2015	              mlwiki
2015	             maiwiki
2015	              knwiki
2015	              guwiki
2015	        enwikivoyage
2015	             azbwiki
2014	          officewiki
2014	            metawiki
2014	        itwikisource
2014	        idwiktionary
2014	        hewikivoyage
2014	             gomwiki
2014	        fawikivoyage
2013	        zhwikivoyage
2013	        viwikivoyage
2013	              tewiki
2013	          simplewiki
2013	              sewiki
2013	              nowiki
2013	       kowikiversity
2013	       incubatorwiki
2013	        glwiktionary
2013	         frwikiquote
2013	        enwiktionary
2013	       enwikiversity
2013	        enwikisource
2013	         enwikibooks
2013	              dewiki
2012	         viwikibooks
2012	              viwiki
2012	              trwiki
2012	             tpiwiki
2012	           test2wiki
2012	              slwiki
2012	        sawiktionary
2012	              ruwiki
2012	              ptwiki
2012	       pnbwiktionary
2012	        nycwikimedia
2012	              nlwiki
2012	             mznwiki
2012	              mkwiki
2012	       mediawikiwiki
2012	              kmwiki
2012	              jawiki
2012	        itwikivoyage
2012	              itwiki
2012	        hewikisource
2012	        eswikivoyage
2012	              eswiki
2012	              enwiki
2012	              elwiki
2012	       dewikiversity
2012	         commonswiki
2012	             chywiki
2012	              bswiki
2011	        zhwiktionary
2011	             vepwiki
2011	             tetwiki
2011	              svwiki
2011	         sawikiquote
2011	        ruwikisource
2011	             ruewiki
2011	              rowiki
2011	        outreachwiki
2011	             nsowiki
2011	             ltgwiki
2011	              kkwiki
2011	             kbdwiki
2011	              idwiki
2011	              huwiki
2011	              hiwiki
2011	        frwikisource
2011	              frwiki
2011	      foundationwiki
2011	              fiwiki
2011	              fawiki
2011	          eswikinews
2011	          enwikinews
2011	        dewikivoyage
2011	              bhwiki
2011	              bewiki
2011	              bawiki
2011	              azwiki
2011	       arwikiversity
2011	              arwiki
2010	       vecwikisource
2010	              ukwiki
2010	              thwiki
2010	             szlwiki
2010	       svwikiversity
2010	          sqwikinews
2010	              skwiki
2010	              quwiki
2010	        nlwikisource
2010	             mrjwiki
2010	        ltwiktionary
2010	             krcwiki
2010	             koiwiki
2010	        knwikisource
2010	              kawiki
2010	        frwiktionary
2010	             frrwiki
2010	              crwiki
2010	              cawiki
2010	             bjnwiki
2009	        yiwikisource
2009	             wuuwiki
2009	   wikimania2014wiki
2009	              urwiki
2009	              tlwiki
2009	              tawiki
2009	         sewikimedia
2009	         ruwikimedia
2009	          ptwikinews
2009	             pntwiki
2009	             pnbwiki
2009	             mwlwiki
2009	              mrwiki
2009	        mlwiktionary
2009	             mhrwiki
2009	              ltwiki
2009	             ladwiki
2009	       itwikiversity
2009	         hewikiquote
2009	              hewiki
2009	          frwikinews
2009	          fawikinews
2009	              etwiki
2009	         eswikibooks
2009	              cswiki
2009	         brwikimedia
2009	              bgwiki
2009	             arzwiki
2009	        arwiktionary
2009	             alswiki
2009	             acewiki
2008	        zhwikisource
2008	         sourceswiki
2008	    simplewiktionary
2008	             sahwiki
2008	       ruwikiversity
2008	       ptwikiversity
2008	        plwikisource
2008	         plwikibooks
2008	             pflwiki
2008	        nlwiktionary
2008	             mdfwiki
2008	             kaawiki
2008	         itwikibooks
2008	             hifwiki
2008	         frwikibooks
2008	              euwiki
2008	     betawikiversity
2007	        liwiktionary
2007	       jawikiversity
2007	        itwiktionary
2007	          huwikinews
2007	             extwiki
2006	        wikidatawiki
2006	        skwikisource
2006	             dsbwiki
2006	        cswikisource
2005	        ptwikisource
2005	             napwiki
2005	             kshwiki
2005	        dewiktionary
2005	        cywikisource
2005	             barwiki
2004	         cowikimedia

and maybe a foreign key constraint should be introduced to prevent them from being inserted in the first place.

MediaWiki does not use foreign keys

MediaWiki does not use foreign keys

Why? Foreign keys are a key feature of relational databases, which help maintaining the integrity of the data even when the business logic is buggy (which is/was probably the case here as well), in case of a network or power outage etc. I know introducing foreign keys to MediaWiki is definitely out of scope for this ticket, but I’d like to understand why they aren’t currently used.

Foreign keys would work in a small setup but it quickly becomes a maintenance nightmare in large setups. e.g. Depooling, gradual roll out of schema changes, etc. In setups as big as ours we actually need to tolerate some level of inconsistency in data or trade it off with availability (CAP)