Productionize the WMDE Analytics Front-End
Closed, Resolved (Public)

Description

The WMDE analytics front-end, whether related to Wikidata or not, needs serious re-engineering before all of its Shiny dashboards and R Markdown reports can be put into production properly. The whole WMDE analytics front-end currently runs on Open Source Shiny Server in the CloudVPS Wmde-dashboards project. A Docker container encompassing the whole WMDE analytics front-end (or almost all of it) is currently running on a test server. Very soon, the front-end will be containerized within the same CloudVPS project where it runs now. However, more thorough systems engineering will be needed to keep everything in order, manageable, and served more efficiently (in particular, Open Source Shiny Server does not help us overcome the single-threaded nature of R, while some of our dashboards are quite demanding on resources).

In the first step, as soon as we receive an increased quota on our CloudVPS analytics project (see: T261743) so that we can spin up three XL instances there, a separation is planned:

  • Instance A: WDCM (see below), Wikidata Analytics, and Wikidata Structural Systems
  • Instance B: Wiktionary Cognate Dashboard
  • Instance C: WMDE New Editors Team

On all three instances we will switch from Open Source Shiny Server to ShinyProxy, so that each new user connection will spin up its own Docker container serving the desired product.
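To make the difference between the two serving models concrete, here is a minimal toy sketch in Python. It is not ShinyProxy's or Shiny Server's actual implementation; the class names, the `r-process-0` label, and the `wikidata-analytics` image name are all hypothetical and exist only to illustrate the idea that a shared server funnels every session into one worker, while a container-per-connection proxy gives each session its own isolated worker.

```python
import itertools


class SharedProcessServer:
    """Toy model (hypothetical): all sessions share a single R process."""

    def __init__(self):
        self.sessions = []

    def connect(self, user):
        # Every user lands on the same single-threaded worker.
        self.sessions.append(user)
        return "r-process-0"


class ContainerPerSessionProxy:
    """Toy model (hypothetical): each new connection gets its own container."""

    def __init__(self, image):
        self.image = image            # hypothetical image name
        self._ids = itertools.count(1)
        self.containers = {}          # container id -> user

    def connect(self, user):
        # Start a fresh container dedicated to this connection.
        container_id = f"{self.image}-{next(self._ids)}"
        self.containers[container_id] = user
        return container_id

    def disconnect(self, container_id):
        # The container is discarded when its session ends.
        self.containers.pop(container_id, None)


shared = SharedProcessServer()
proxy = ContainerPerSessionProxy("wikidata-analytics")

shared_workers = {shared.connect(u) for u in ("alice", "bob")}
proxy_workers = {proxy.connect(u) for u in ("alice", "bob")}
print(len(shared_workers), len(proxy_workers))
```

In the toy model, a heavy computation in one user's container cannot block another user's session, which is the property the switch is meant to buy; the cost is one container's worth of memory per open connection.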

The following analytics RStudio Shiny dashboards and standardized R Markdown notebooks have been developed, or are in development, at WMDE and run from the CloudVPS Wmde-dashboards project; all of them will be redistributed across the three new virtual instances:

Wikidata Concepts Monitor (WDCM) dashboards - Instance A

Wikidata Analytics - Instance A

Wikidata Structural Systems - Instance A

Qurator Projects - Instance A

Wiktionary - Instance B

WMDE New Editors Team - Instance C

  • New Editors dashboard (development is currently postponed)
  • many R Markdown WMDE Banner Campaign reports.
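The planned placement listed above can also be written down as data, which makes it easy to check what each instance will host. The sketch below is only an illustration: the `PLACEMENT` dictionary and the `by_instance` helper are hypothetical names, and the entries simply restate the product groups and instance letters from the list.

```python
from collections import defaultdict

# Product groups and their target instances, restating the list above.
PLACEMENT = {
    "Wikidata Concepts Monitor (WDCM) dashboards": "A",
    "Wikidata Analytics": "A",
    "Wikidata Structural Systems": "A",
    "Qurator Projects": "A",
    "Wiktionary": "B",
    "WMDE New Editors Team": "C",
}


def by_instance(placement):
    """Group product groups by the instance they are assigned to."""
    groups = defaultdict(list)
    for product, instance in placement.items():
        groups[instance].append(product)
    return dict(groups)


for instance, products in sorted(by_instance(PLACEMENT).items()):
    print(f"Instance {instance}: {', '.join(products)}")
```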

With twenty data products currently served or under development, it is obvious that we cannot rely on a single instance of Open Source Shiny Server and manage all dependencies manually. When there was only WDCM, the situation was much simpler, and running one instance of Shiny Server on one VM was acceptable; that is no longer the case.

The {golem} framework will be used across all listed (and all future) data products to ensure their robustness and to deal with production and reproducibility issues.

Proposed timeline: Q4; completion expected by the end of 2020.

Event Timeline

2020/09/15, status:

A new Wikidata Analytics Portal is now available on the test server: http://datakolektiv.org/app/WikidataAnalytics

@Lydia_Pintscher @WMDE-leszek @Lea_Lacroix_WMDE

We are in production:

https://wikidata-analytics.wmcloud.org/app/WikidataAnalytics

The old Wikidata Analytics URLs (all dashboards and reports) are still operational, and the services there will not be discontinued for a while. I have sent out an email explaining which changes the community should be informed of.

LEFTOVERS

  • Include in the Reports section
    • Report on Monthly Wikidata Editor Activity (an occasional, ad hoc request from Jan)
    • Report on our WDQS response optimization study
  • Gerrit repo: Wikidata Analytics
    • Include back-end engines (ETL, ML)
    • Once formed, remove all existing old Wikidata Analytics Gerrit repos

@Lea_Lacroix_WMDE @Lydia_Pintscher @WMDE-leszek

The URL issue - where the URL did not change from that of the landing page of the new Wikidata Analytics services - is resolved.
@Lea_Lacroix_WMDE This means that our users will be able to bookmark any dashboard directly.

Remaining to do:

I will get in touch via email on this.

@Lea_Lacroix_WMDE @Lydia_Pintscher @WMDE-leszek

Everything in relation to T261905#6688736 is completed.
I guess we could inform the community about the new Wikidata Analytics and begin the transition.

The transition Phab ticket: T272192 - Migrate to new Wikidata Analytics.

Closing the ticket as resolved.