Page MenuHomePhabricator

Inconsistencies between labs replicas
Closed, ResolvedPublic

Description

This query:

use fawiki_p;
SELECT page_title 
FROM categorylinks 
JOIN page 
ON cl_from = page_id 
LEFT JOIN imagelinks 
ON il_to = page_title 
WHERE cl_to = 'محتویات_غیر_آزاد' and page_namespace = 6 and il_from is null;

returns different (i.e. incorrect) results when run against labsdb1003, while 001 and 002 are fine.

For instance, the result should NOT contain Image:Taylor_Swift_-_Wildest_Dreams_(Official_Single_Cover).png but it does for labsdb003.

Event Timeline

Huji raised the priority of this task from to Needs Triage.
Huji updated the task description. (Show Details)
Huji added a project: DBA.
Huji subscribed.
Se4598 added projects: Cloud-Services, Cloud-VPS.
Se4598 set Security to None.
jcrespo triaged this task as Medium priority.
jcrespo moved this task from Triage to In progress on the DBA board.

This seems to be an issue with labsdb1003 only, as not only the other hosts are ok, the original production host where filtering happened is ok, too.

root@iron:~$ mysql -h db1069 -P3317 fawiki -e "SELECT count(*) FROM imagelinks"
+----------+
| count(*) |
+----------+
|  6007034 |
+----------+
root@iron:~$ mysql -h labsdb1001 fawiki -e "SELECT count(*) FROM imagelinks"
+----------+
| count(*) |
+----------+
|  6007024 |
+----------+
root@iron:~$ mysql -h labsdb1002 fawiki -e "SELECT count(*) FROM imagelinks"
+----------+
| count(*) |
+----------+
|  6007034 |
+----------+
root@iron:~$ mysql -h labsdb1003 fawiki -e "SELECT count(*) FROM imagelinks"
+----------+
| count(*) |
+----------+
|  6005939 |
+----------+
root@iron:~$ mysql -h s7-master fawiki -e "SELECT count(*) FROM imagelinks"
+----------+
| count(*) |
+----------+
|  6007034 |
+----------+

Imagelinks has been reimported:

root@iron:~$ mysql -h s7-master fawiki -e "SELECT count(*) FROM imagelinks"
+----------+
| count(*) |
+----------+
|  6007121 |
+----------+
root@iron:~$ mysql -h labsdb1001 fawiki -e "SELECT count(*) FROM imagelinks"
+----------+
| count(*) |
+----------+
|  6007121 |
+----------+
root@iron:~$ mysql -h labsdb1002 fawiki -e "SELECT count(*) FROM imagelinks"
+----------+
| count(*) |
+----------+
|  6007121 |
+----------+
root@iron:~$ mysql -h labsdb1003 fawiki -e "SELECT count(*) FROM imagelinks"
+----------+
| count(*) |
+----------+
|  6007121 |
+----------+

I have synced labs with production. Replication to labs cannot be 100% reliable due to the filtering performed of private information; additionally if a queued job fails for any reason, image/link/etc. usage can become inconsistent until the page is purged, as these tables are not updated in real time. If you see in the future more inconsistencies, please report them and I will try to fix them as soon as possible. Thanks for the report.

root@iron:~$ mysql -h s7-master fawiki -e "SELECT page_title FROM categorylinks JOIN page ON cl_from = page_id LEFT JOIN imagelinks ON il_to = page_title WHERE cl_to = 'محتویات_غیر_آزاد' and page_namespace = 6 and il_from is null;"
+-------------------------------+
| page_title                    |
+-------------------------------+
| SanatNaftAbadanClub.png       |
| Superstarmovies.jpg           |
| Maskan.svg                    |
| ChipsPoster.JPG               |
| Mkxshinnokrender324.png       |
| NcXobAgpi.jpeg                |
| A-FONT.png                    |
| A-02_0.png                    |
| N-9.jpg                       |
| Shredder_TMNT.jpg             |
| Irib-channel-3-logo.png       |
| Newsted_heavy_metal_music.jpg |
| DSC_0280.jpg                  |
+-------------------------------+
root@iron:~$ mysql -h labsdb1001 fawiki -e "SELECT page_title FROM categorylinks JOIN page ON cl_from = page_id LEFT JOIN imagelinks ON il_to = page_title WHERE cl_to = 'محتویات_غیر_آزاد' and page_namespace = 6 and il_from is null;"
+-------------------------------+
| page_title                    |
+-------------------------------+
| SanatNaftAbadanClub.png       |
| Superstarmovies.jpg           |
| Maskan.svg                    |
| ChipsPoster.JPG               |
| Mkxshinnokrender324.png       |
| NcXobAgpi.jpeg                |
| A-FONT.png                    |
| A-02_0.png                    |
| N-9.jpg                       |
| Shredder_TMNT.jpg             |
| Irib-channel-3-logo.png       |
| Newsted_heavy_metal_music.jpg |
| DSC_0280.jpg                  |
+-------------------------------+
root@iron:~$ mysql -h labsdb1002 fawiki -e "SELECT page_title FROM categorylinks JOIN page ON cl_from = page_id LEFT JOIN imagelinks ON il_to = page_title WHERE cl_to = 'محتویات_غیر_آزاد' and page_namespace = 6 and il_from is null;"
+-------------------------------+
| page_title                    |
+-------------------------------+
| SanatNaftAbadanClub.png       |
| Superstarmovies.jpg           |
| Maskan.svg                    |
| ChipsPoster.JPG               |
| Mkxshinnokrender324.png       |
| NcXobAgpi.jpeg                |
| A-FONT.png                    |
| A-02_0.png                    |
| N-9.jpg                       |
| Shredder_TMNT.jpg             |
| Irib-channel-3-logo.png       |
| Newsted_heavy_metal_music.jpg |
| DSC_0280.jpg                  |
+-------------------------------+
root@iron:~$ mysql -h labsdb1003 fawiki -e "SELECT page_title FROM categorylinks JOIN page ON cl_from = page_id LEFT JOIN imagelinks ON il_to = page_title WHERE cl_to = 'محتویات_غیر_آزاد' and page_namespace = 6 and il_from is null;"
+-------------------------------+
| page_title                    |
+-------------------------------+
| A-02_0.png                    |
| A-FONT.png                    |
| ChipsPoster.JPG               |
| DSC_0280.jpg                  |
| Irib-channel-3-logo.png       |
| Maskan.svg                    |
| Mkxshinnokrender324.png       |
| N-9.jpg                       |
| NcXobAgpi.jpeg                |
| Newsted_heavy_metal_music.jpg |
| SanatNaftAbadanClub.png       |
| Shredder_TMNT.jpg             |
| Superstarmovies.jpg           |
+-------------------------------+