Wiktionary Cognate Dashboard not updated
Closed, Resolved (Public)

Description

New occurrence of this issue reported on July 24th: the current timestamp on https://wiktionary-analytics.wmcloud.org/Wiktionary_CognateDashboard/ reads "last updated on: 2021-06-30", which means that the dashboard has not been updated for about three weeks.

Event Timeline

GoranSMilovanovic renamed this task from Wiktionary Cognate Dashboard not update to Wiktionary Cognate Dashboard not updated. May 4 2020, 5:27 PM
GoranSMilovanovic moved this task from Wiktionary to Prioritized on the User-GoranSMilovanovic board.

Here it is:

Error in `row.names<-.data.frame`(`*tmp*`, value = value) : 
  duplicate 'row.names' are not allowed
Calls: rownames<- ... row.names<-.tbl_df -> NextMethod -> row.names<-.data.frame
In addition: Warning messages:
1: Setting row names on a tibble is deprecated. 
2: non-unique value when setting 'row.names': ‘character(0)’ 
Execution halted

Inspecting now.

Probable cause of the update failure: Wiktionary_CognateDashboard_UpdateProduction.R cannot find Wiktionaries encoded as

-210957740613563972

and

4182753792216835591

in the cgpa_title field of the cognate_wiktionary.cognate_pages table when looking them up in the cognate_wiktionary.cognate_sites table.
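A minimal sketch of how such unmatched keys could be detected before they reach the step that sets row names. The production script is written in R; this Python version only illustrates the check. The function name find_orphan_site_keys and the sample "known" keys are hypothetical, and only the two orphan keys are taken from this ticket.

```python
def find_orphan_site_keys(page_site_keys, known_site_keys):
    """Return, sorted, the keys referenced by pages that the sites table lacks."""
    known = set(known_site_keys)
    return sorted({key for key in page_site_keys if key not in known})


# Hypothetical sample data: 1001 and 1002 stand in for keys that do exist
# in the sites table; the two long keys are the ones reported above.
page_keys = [1001, -210957740613563972, 1002, 4182753792216835591, 1001]
site_keys = [1001, 1002]

orphans = find_orphan_site_keys(page_keys, site_keys)
if orphans:
    # Assumption: dropping (or logging) these keys lets the rest of the
    # update proceed instead of halting on an empty lookup.
    print("Keys with no matching site:", orphans)
```

Running a check like this first would turn a halted update into a logged warning; whether to drop or investigate the orphan keys is a separate decision.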

@Addshore You might be interested in taking a look at this.

Intermediary fix applied; running the update cycle manually now; monitoring.

Update Mon May 4 19:52:42 UTC 2020:

  • Manual back-end update completed;
  • waiting for the Dashboard update to fetch new data;
  • monitoring.

Update Tue May 5 08:22:31 UTC 2020:

  • dashboard slow to pick up the changes;
  • change public path to: /srv/published/datasets/...;
  • monitoring.

Update Tue May 5 15:07 CET 2020:

  • the dashboard update stamp is not updated yet: Last updated on: 2020-04-08 07:26:18 UTC;
  • the public datasets timestamp, however, is updated;
  • known problems with {curl} from R; inspecting now;
  • the update cycle is stable, checked.
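The stamp comparison described in these updates could be automated. The following is a sketch only, assuming the dashboard text contains a stamp in the "Last updated on: YYYY-MM-DD HH:MM:SS UTC" form quoted above; the names parse_stamp and is_stale and the 36-hour threshold are hypothetical choices, not part of the dashboard.

```python
import re
from datetime import datetime, timedelta, timezone

# Matches stamps such as "Last updated on: 2020-04-08 07:26:18 UTC".
STAMP_RE = re.compile(
    r"last updated on:\s*(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) UTC",
    re.IGNORECASE,
)


def parse_stamp(text):
    """Extract the update stamp from page text as a timezone-aware datetime."""
    match = STAMP_RE.search(text)
    if match is None:
        raise ValueError("no 'Last updated on' stamp found")
    parsed = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
    return parsed.replace(tzinfo=timezone.utc)


def is_stale(text, now, max_age=timedelta(hours=36)):
    """Return True if the stamp in `text` is older than `max_age` at `now`."""
    return now - parse_stamp(text) > max_age


# Usage: with a daily update cycle, a stamp older than about a day and a
# half suggests that a run was missed.
page_text = "Last updated on: 2020-04-08 07:26:18 UTC"
print(is_stale(page_text, now=datetime.now(timezone.utc)))
```

Passing `now` explicitly keeps the check easy to test against the timestamps recorded in this ticket.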

Update Tue May 5 20:47 CET 2020:

  • current dashboard update timestamp is now matched correctly: Last updated on: 2020-05-05 07:27:02 UTC
  • Q: Why does it take so much time for our curl calls from CloudVPS to grab the updated datasets and the update timestamp?

Anyway, the updates are back. Thanks @Lea_Lacroix_WMDE and the anonymous volunteer for notifying me about this.

Status: monitoring the dashboard updates for a day or two, closing the ticket if no problems occur in the meantime.

GoranSMilovanovic lowered the priority of this task from High to Medium. May 5 2020, 6:50 PM
GoranSMilovanovic claimed this task.

Conclusion:

  • the dashboard update procedure is fixed to guard against the possible inconsistencies between cognate_wiktionary.cognate_pages and cognate_wiktionary.cognate_sites tables;
  • the data acquisition works and is delivered daily on a regular schedule from stat1007;
  • the dashboard itself, hosted in CloudVPS, is somewhat slow (i.e. a matter of hours) to pick up the latest daily update; this is a separate problem and needs to be handled on its own.

Closing the ticket.

Lea_Lacroix_WMDE updated the task description.
Lea_Lacroix_WMDE added a subscriber: Otourly.

@Lea_Lacroix_WMDE Status:

  • The Wiktionary Cognate Dashboard update was restarted manually from the CloudVPS instance.
  • The updated data should be available from the dashboard in an hour or so, maybe earlier.
  • Monitoring.

@Lea_Lacroix_WMDE

You are welcome!

The current update is now in place.

I will keep the ticket open until I am sure that the updating procedure is running smoothly.

The updates are now all in place.

Otourly updated the task description.

The issue is most probably related to some R internal memory allocation problems/constraints on the stat1007 analytics client.

  • Running a manual update now;
  • Monitoring.

Possible action: migrate the update engine to stat1005 or stat1008 (more resourceful than stat1007).

  • Migrating the update engine to stat1005 definitely now (only 64 GB RAM on stat1007; processes killed).
  • Manual update from stat1008 completed;
  • The dashboard should be able to pick up the results in the following hour or so; monitoring;
  • next step: installing crontab from stat1008; removing the update engine from stat1007.

Unfortunately, the problem is not related merely to the update engine; the following was run inside the Wiktionary Cognate Dashboard's running Docker container (entered via sudo docker-compose exec wiktionarycognate sh):

# cat nohup.out

Attaching package: ‘curl’

The following object is masked from ‘package:httr’:

    handle_reset

Error in curl_fetch_memory(URL, handle = h) : 
  Could not resolve host: analytics.wikimedia.org
Execution halted

So, for whatever reason, our https://analytics.wikimedia.org/published/datasets/wmde-analytics-engineering/Wiktionary/ does not seem to resolve? Strange.
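To separate a name-resolution failure from a problem in {curl} itself, a quick check can be run from inside the container. The dashboard's daemon is written in R; this Python sketch only illustrates the idea using the standard socket module. The helper names can_resolve and wait_for_dns, the retry count, and the delay are hypothetical.

```python
import socket
import time


def can_resolve(host, port=443, resolver=socket.getaddrinfo):
    """Return True if `host` resolves, False on a name-resolution error."""
    try:
        resolver(host, port)
        return True
    except socket.gaierror:
        return False


def wait_for_dns(host, attempts=3, delay=2.0,
                 resolver=socket.getaddrinfo, sleep=time.sleep):
    """Try to resolve `host` up to `attempts` times.

    Returns the 1-based attempt number that succeeded, or None if all failed.
    """
    for attempt in range(1, attempts + 1):
        if can_resolve(host, resolver=resolver):
            return attempt
        if attempt < attempts:
            sleep(delay)
    return None


if __name__ == "__main__":
    result = wait_for_dns("analytics.wikimedia.org")
    if result is None:
        print("host did not resolve; likely a DNS problem in the container")
    else:
        print("resolved on attempt", result)
```

If the host resolves here but the R daemon still fails, the problem is more likely in the daemon's environment than in DNS; if it fails here as well, the container's resolver configuration is the first thing to inspect.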

Inspecting the issue; contacting the CloudVPS team in case this proves to be too mysterious. The dashboard-side update (a daemon run hourly from the dashboard's container) had been running smoothly for years before this unlikely occurrence.

Until this issue is resolved, the dashboard will not be updated and will continue to fall back to its test datasets.

@Otourly Thanks for checking this out and catching the issue in the first place.

I don't know what exactly to think of the curl/Docker related problem.
Let's please keep this ticket open for a while and I will monitor the dashboard to see if the problem reappears.

Unfortunately, the dashboard has not updated since Last updated on: 2021-07-26 21:48:24 UTC:

  • inspecting the issue now,
  • in case the curl-related problem from CloudVPS persists, I will get in touch with the relevant team (following my consultations with the WMDE Devops Guild on Mattermost, who said: it is simply strange).

Edit: the problem seems, however, to be related to an erroneous crontab setup on stat1008. Fixed; let's wait and see what happens in the next daily update, which is scheduled for tomorrow at 06:00 UTC.

@Otourly

  • The dashboard is updated now;
  • I cannot see a reason why it should not update daily in the future, as expected;
  • please re-open this ticket if this happens again (it is quite a complex update, so I expect that something might go wrong in the future, but not too frequently),
  • and once again, thank you for catching this!