
[Bug] All language links not showing up, manual purge required
Closed, ResolvedPublic

Description

http://www.wikidata.org/wiki/Q2393031
The interwiki links do not show up in kowiki and nlwiki even though they are in the item and do show up in the other linked articles.

It seems those pages are missing all interwiki links - not just a subset.

Details

Reference
bz45839

Event Timeline

bzimport raised the priority of this task from to High. Nov 22 2014, 1:26 AM
bzimport set Reference to bz45839.
bzimport added a subscriber: Unknown Object (MLST).

I had to purge both pages and then the links appear.

With our notifications to clients via job queue, purging is supposed to automatically happen.

The associated Wikidata item was last edited on Jan 29

http://www.wikidata.org/w/index.php?title=Q2393031&action=history

so, there would have been no purge.

I don't know about retroactively going through and purging pages in the clients that were connected before the client deployments (i.e. pages that never got an automatic purge).

If a purge fixes it and this is nothing that'll regularly happen I think we can close it.

We're getting more reports about this. Needs to be fixed.

Bug 46055 has been marked as a duplicate of this bug.

Currently a purge is necessary on some occasions to see the links, e.g.
http://en.wikipedia.org/wiki/Wickiana?action=purge

In the last couple of days, I have encountered some English articles that are linked to Wikidata, but the "Edit links" link is missing. For example, http://en.wikipedia.org/wiki/Dyrehavsbakken now had 9 interlanguage links, but there was no "Edit links" text. All local links were removed from the article a month ago.

Is this another bug or a new symptom of the same bug? Purging seems to help also in this case.

(In reply to comment #6)

In the last couple of days, I have encountered some English articles that are linked to Wikidata, but the "Edit links" link is missing. For example, http://en.wikipedia.org/wiki/Dyrehavsbakken now had 9 interlanguage links, but there was no "Edit links" text. All local links were removed from the article a month ago.

Is this another bug or a new symptom of the same bug? Purging seems to help also in this case.

I think this is the same issue, yes.

I have not gotten any more reports about this so I assume it is fixed. Please reopen if not.

Verified in Wikidata demo sprint 22-5

FriedhelmW subscribed.

Happened again on https://de.wikipedia.org/wiki/Lautsprecher. Links were visible only when logged in. Purge fixed it.

Restricted Application added a subscriber: Aklapper.

Reproduced.

Can you please let us know how you reproduced this?

I couldn't find anything obvious regarding this in the logs, so I guess this is just something unrelated failing (like an exception happening during the page parse, the page props update, …).

This occurs randomly. I don't find a pattern.

This occurs randomly. I don't find a pattern.

I need more articles with this problem in order to debug this (preferably where the problem still exists). Can you give me some links, please?

Thank you... I poked at this today and searched through various logs but couldn't find anything relevant yet. If you have more examples, please post them here.

thiemowmde lowered the priority of this task from High to Medium. Aug 13 2015, 3:22 PM
thiemowmde subscribed.
Lydia_Pintscher renamed this task from Not all language links showing up, manual purge required to [Bug] Not all language links showing up, manual purge required. Aug 17 2015, 3:18 PM
Lydia_Pintscher moved this task from incoming to ready to go on the Wikidata board.

Thanks a lot for the examples. Please keep them coming. We're investigating.

Do you know if any of these pages had sitelinks shown before that then vanished? Or were they just never shown?

Thank you!
Katie just suggested adding more debug output around the LangLinkHandler to get to the bottom of this.

Hint from dewp: It seems the link to Wikidata is also missing in these cases.

20:48:02 <Josve05a> Help, something is broken... https://de.wikipedia.org/wiki/Referendariat linked to https://en.wikipedia.org/wiki/Referendary, but not vice-versa. (Q465288) I removed the enwp link from the item (which added 1,038 byte(?)) and then re-added the link, and now it shows up on enwp again. What is broken here?
20:48:02 <Josve05a> https://www.wikidata.org/w/index.php?title=Q465288&action=history

And yet another one: https://he.wikipedia.org/wiki/%D7%A0%D7%AA%D7%A0%D7%99%D7%94

This is an article about Netanya, one of the largest cities in Israel and one of the oldest in the Hebrew Wikipedia, which is a bit odd to me, because I was under the impression that this happens in new or obscure articles.

matmarex renamed this task from [Bug] Not all language links showing up, manual purge required to [Bug] All language links not showing up, manual purge required. Dec 30 2015, 9:33 PM
matmarex added subscribers: StudiesWorld, Glaisher, Schnark and 2 others.

I am looking at a few pages that I found on English Wikipedia. These are all related topics and share some categories and templates, but not all pages in the category are affected.

they are missing sidebar links and the "Data item" link, as well as the wgWikibaseItemId js config variable.

the parser output (from cache) has entity usage of only X + T, but wbc_entity_usage table now has X + T + S for the page.

however, I also see:

Number of Wikibase entities loaded: 1

and the page is in Category:Coordinates on Wikidata

this indicates that the lookup of item id in the wb_items_per_site failed.

the relevant Wikidata item was edited on January 12, and the Wikipedia page was last edited in September. The page table entry for the page has page_links_updated on January 18. I first noticed the page later in the day on January 19, so I am not convinced that my viewing the page triggered anything.

the item definitely has an entry for the page in the wb_items_per_site table, and the page also has X + T + S entries in the wbc_entity_usage table, as well as an entry in the page_props table for "wikibase_item".

other things noticed:

  • also notice that page_touched of the enwiki page is 18 seconds later than page_touched timestamp of the wikidata item.
  • SiteLinkTable::getItemIdForLink uses a slave connection
  • we wrap the SiteLinkLookup (db) in a CachingSiteLinkLookup which we probably still need to do, for other reasons
  • at least for me, it appears that WikibaseLuaBindings hit the SiteLinkLookup (db lookup) first and then the LangLinkHandler uses SiteLinkLookup (cached) and then ClientParserOutputDataUpdater->updateItemIdProperty (cached). Seems odd if lua was able to get an id in order to do entity lookup and stuff (and cache it), and the subsequent stuff didn't get an id from the caching lookup.
  • CachingSiteLinkLookup doesn't normalize the page title for the cache key. SiteLinkTable at least normalizes to convert underscores to spaces, since spaces is the form used there. Though, this might not be much or any problem in practice.
  • We are inconsistent in the client: sometimes we use $title->getPrefixedText() to build a SiteLink (e.g. for lookup) and sometimes we use $title->getFullText(), which may contain a fragment. In practice, this might also not be a problem, but I can't be entirely sure.
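
To make the last two points concrete, here is a minimal sketch of how un-normalized titles could lead to differing cache keys. The helper names normalizePageTitle and makeSiteLinkCacheKey, the key format, and the "#History" fragment are made up for illustration and are not Wikibase code; the only behavior taken from this thread is that underscores get converted to spaces and that getFullText() may carry a fragment.

```php
<?php
// Hypothetical helpers for illustration only, not part of Wikibase.

/**
 * Bring a title into one canonical form before it is used as a lookup
 * or cache key: drop any "#fragment" and convert underscores to spaces.
 */
function normalizePageTitle( string $title ): string {
    $hashPos = strpos( $title, '#' );
    if ( $hashPos !== false ) {
        // A fragment is not part of the page name, so cut it off.
        $title = substr( $title, 0, $hashPos );
    }
    return trim( str_replace( '_', ' ', $title ) );
}

/**
 * Build a cache key from a site id and a title (key format is assumed).
 */
function makeSiteLinkCacheKey( string $siteId, string $title ): string {
    return 'sitelink:' . $siteId . ':' . normalizePageTitle( $title );
}

// Two spellings of the same page end up with the same key.
$keyFromUrlForm = makeSiteLinkCacheKey( 'enwiki', 'Bangor,_Gwynedd' );
$keyFromFullText = makeSiteLinkCacheKey( 'enwiki', 'Bangor, Gwynedd#History' );
var_dump( $keyFromUrlForm === $keyFromFullText );
```

Without the normalization step, the two inputs would produce two different keys, and a value cached under one would be missed under the other.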

possible ideas:

  • could be just a database query error (e.g. timeout) when parsing the enwiki page
  • could be a race condition that the site link was missing in the site link table (on the slave) at the moment lookup was done. (e.g. the site links get deleted and readded in process of parsing on wikidata? or were previously missing?) Below is what is done when saving site links:
public function saveLinksOfItem( Item $item ) {
    //First check whether there's anything to update
    $newLinks = $item->getSiteLinkList()->toArray();
    $oldLinks = $this->getSiteLinksForItem( $item->getId() );

    $linksToInsert = array_udiff( $newLinks, $oldLinks, array( $this, 'compareSiteLinks' ) );
    $linksToDelete = array_udiff( $oldLinks, $newLinks, array( $this, 'compareSiteLinks' ) );

    if ( !$linksToInsert && !$linksToDelete ) {
        wfDebugLog( __CLASS__, __FUNCTION__ . ": links did not change, returning." );
        return true;
    }

    $ok = true;
    $dbw = $this->getConnection( DB_MASTER );

    //TODO: consider doing delete and insert in the same callback, so they share a transaction.

    if ( $ok && $linksToDelete ) {
        wfDebugLog( __CLASS__, __FUNCTION__ . ": " . count( $linksToDelete ) . " links to delete." );
        $ok = $dbw->deadlockLoop( array( $this, 'deleteLinksInternal' ), $item, $linksToDelete, $dbw );
    }

    if ( $ok && $linksToInsert ) {
        wfDebugLog( __CLASS__, __FUNCTION__ . ": " . count( $linksToInsert ) . " links to insert." );
        $ok = $dbw->deadlockLoop( array( $this, 'insertLinksInternal' ), $item, $linksToInsert, $dbw );
    }

    $this->releaseConnection( $dbw );

    return $ok;
}

suggestions:

  • maybe would help to instead use a master connection for site link lookup when parsing is happening.
  • and the way sitelinks get updated seems suboptimal and could be improved. (at minimum, implement the // TODO suggested in the code)
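
As a rough illustration of the second suggestion, the sketch below does the delete and the insert inside one transaction, so a reader should never observe the intermediate state in which the old row is gone and the new one is not yet there. It uses plain PDO with an in-memory SQLite database instead of MediaWiki's database layer, and the function name replaceSiteLinks, the table items_per_site, its columns, and the sample rows are all assumptions made for this example.

```php
<?php
// Sketch only: PDO + in-memory SQLite stand in for the real database layer.
// Table name, columns and sample data are simplified assumptions.

/**
 * Delete and insert site links of one item in a single transaction.
 *
 * @param PDO $db
 * @param int $itemId
 * @param string[] $siteIdsToDelete site ids whose link should be removed
 * @param array<string,string> $linksToInsert site id => page title
 * @return bool true if everything was committed, false if rolled back
 */
function replaceSiteLinks( PDO $db, int $itemId, array $siteIdsToDelete, array $linksToInsert ): bool {
    $db->beginTransaction();
    try {
        $delete = $db->prepare( 'DELETE FROM items_per_site WHERE item_id = ? AND site_id = ?' );
        foreach ( $siteIdsToDelete as $siteId ) {
            $delete->execute( [ $itemId, $siteId ] );
        }

        $insert = $db->prepare( 'INSERT INTO items_per_site (item_id, site_id, page) VALUES (?, ?, ?)' );
        foreach ( $linksToInsert as $siteId => $page ) {
            $insert->execute( [ $itemId, $siteId, $page ] );
        }

        // Both steps become visible together.
        $db->commit();
        return true;
    } catch ( Throwable $e ) {
        // Any failure undoes the delete as well, so no link goes missing.
        $db->rollBack();
        return false;
    }
}

$db = new PDO( 'sqlite::memory:' );
$db->setAttribute( PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION );
$db->exec( 'CREATE TABLE items_per_site (item_id INTEGER, site_id TEXT, page TEXT, UNIQUE (site_id, page))' );
$db->exec( "INSERT INTO items_per_site VALUES (1, 'enwiki', 'Old title')" );

// A page move on the client: the old link is replaced by the new one atomically.
$ok = replaceSiteLinks( $db, 1, [ 'enwiki' ], [ 'enwiki' => 'New title' ] );
```

If the insert fails, for example because another item already holds the same site and page, the rollback restores the deleted row instead of leaving the item without a link.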

maybe would help to instead use a master connection for site link lookup when parsing is happening.

Maybe something that used to use master and that the feature relied on was switched to slave? There were a lot of changes as part of T92357: Fix database master queries from HTTP GET/HEAD before active-active multi-dc.

Another example: https://en.wikipedia.org/wiki/Bangor,_Gwynedd currently has no interwiki links or Wikidata link, last edited 3 days ago. The Wikidata item is https://www.wikidata.org/wiki/Q234178

the issues I mentioned regarding using a slave connection sometimes for the wb_items_per_site table might also be a problem for T44325

Michael claimed this task.
Michael subscribed.

I think this was resolved by some of the work we did in this area at some point in the last 8 years, so I'm tentatively closing this. Please reopen it if this problem persists.