Content Translation analytics dashboard shows errors for some graphs that were working before
Closed, Resolved, Public

Description

The analytics dashboard for Content Translation key metrics has recently started to show errors for some graphs that were rendered before:

  • Monthly translations year over year: Unexpected error
  • Monthly translations by user edit count: Apache Druid Error
  • Monthly rate of deleted translations: Unexpected error
  • Monthly translations by wiki: Apache Druid Error

The "Users last calendar month" graph shows a "Timeout error", but that is a known issue, and other tasks will address the performance of that graph.

A screenshot below to illustrate the issue:

superset.wikimedia.org_superset_dashboard_119__native_filters_key=li5w_ZRPZyIO7J23vHUPZEO2E79JPxtrRE63q5D7HCFOnH1Rq80dhAx4UmwagivV(Wiki Tablet).png (3×980 px, 323 KB)

Event Timeline

Pginer-WMF moved this task from Needs Triage to Bugs on the ContentTranslation board.
Pginer-WMF moved this task from Backlog to Priority Backlog on the Language-analytics board.

Some additional information on the error that's appearing for all line charts:

Reported error:
java.lang.RuntimeException: net.jpountz.lz4.LZ4Exception: Error decoding offset 53533 of input buffer

Other information:

  • These charts use a virtual dataset (daily_translations_byactivity_incl_deletions) that queries data from the druid.edits_hourly dataset. This error appeared after the edits_hourly dataset was updated with the 2023-08-01 snapshot. There were no issues loading this chart prior to that update.
  • The error appears to occur whenever the time range filter includes any dates from 2021-06-12 until 2021-07-01. For example, the charts load correctly when the time range filter is set to either of the following:
    • 2019-01-01 <= col < 2021-01-01
    • 2022-01-01 <= col < 2023-09-01

But the charts fail to load when the filter is set to either of the following:

  • 2019-01-01 <= col < 2022-01-01, or even
  • 2021-01-01 <= col < 2023-09-01
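
The pattern in the four ranges above is consistent with a simple interval-overlap check. The following is a minimal sketch, not a diagnosis: it assumes the suspect window is the half-open interval from 2021-06-12 to 2021-07-01, and the function name `overlaps_suspect_window` is hypothetical.

```python
from datetime import date

# Suspect window as reported above; treating it as half-open is an assumption.
SUSPECT_START = date(2021, 6, 12)
SUSPECT_END = date(2021, 7, 1)


def overlaps_suspect_window(start: date, end: date) -> bool:
    """Return True if the half-open range [start, end) intersects the suspect window."""
    return start < SUSPECT_END and SUSPECT_START < end


# The four filter ranges listed in this comment.
ranges = [
    (date(2019, 1, 1), date(2021, 1, 1)),
    (date(2022, 1, 1), date(2023, 9, 1)),
    (date(2019, 1, 1), date(2022, 1, 1)),
    (date(2021, 1, 1), date(2023, 9, 1)),
]

for start, end in ranges:
    outcome = "overlaps" if overlaps_suspect_window(start, end) else "does not overlap"
    print(f"{start} <= col < {end}: {outcome} the suspect window")
```

Under that assumption, the two ranges that load correctly do not overlap the window, and the two that fail do.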

As a temporary fix, I've adjusted the default time filter to 2022-01-01 <= col < 2023-09-01. This allows all the time series charts to load correctly and still shows recent trends. However, as mentioned in T346636#9179412, the error will reappear for these charts if the filter is extended to cover a larger time range.

@KCVelaga_WMF assigning to you to investigate further but let me know if any additional details would be helpful.

Thank you @JAllemandou!

@Pginer-WMF I just checked the dashboard and everything is working fine. I also set the default time range filter back to start from 2019-01-01. Please check on your side; if it looks good, we can resolve this ticket.

Confirmed. Graphs are in place. Thanks!

superset.wikimedia.org_superset_dashboard_119__native_filters_key=4x5lCmpQqZ16lvF3wqPTAyr-JNNIElfYsVXkO94zLQ5VXMxpIWpulZVNMkl5jwsx(Wiki Tablet).png (3×980 px, 500 KB)

Regarding the default time range adjustment, I noticed a small glitch. The end of the time range is defined as "Beginning of this month", but it shows as "2023-09-30" (Sept 30). Given that today is Sept 22 and the data for September is not available yet, this causes some graphs to show partial data for September. For example, the "Users last calendar month" graph shows what seems to be an artificial drop at the end, which may be caused by this.

superset.wikimedia.org_superset_dashboard_119__native_filters_key=4x5lCmpQqZ16lvF3wqPTAyr-JNNIElfYsVXkO94zLQ5VXMxpIWpulZVNMkl5jwsx(Wiki Tablet) (1).png (740×1 px, 120 KB)

This is not urgent and can be dealt with in a separate task, but I wanted to mention it since the adjustment of the default period was a recent change.
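
For reference, the value one would expect from a "Beginning of this month" boundary on Sept 22 is 2023-09-01. The sketch below only illustrates that expected value with the Python standard library. It does not describe how Superset evaluates the filter, and the function name `beginning_of_month` is hypothetical.

```python
from datetime import date


def beginning_of_month(today: date) -> date:
    """Return the first day of the month that contains `today`."""
    return today.replace(day=1)


# Date of the comment above, used only as an example input.
today = date(2023, 9, 22)
end_of_range = beginning_of_month(today)
print(end_of_range.isoformat())  # prints 2023-09-01
```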

@Pginer-WMF The filter's behaviour is strange. The "beginning of this month" filter is not working well for September, but is working fine for August and even October. For now, I added a manual filter to end the data at 2023-09-01. As we are close to October, I will check this again early next week to see if the issue still persists. We can create a separate task then if needed.

Thanks for the update. From October onwards it is OK to leave the filter set to "beginning of this month" even if there are some glitches. I can adjust the date manually if I need to export an image.