Pageviews Analysis does not support Lombard Wiktionary
Closed, Resolved | Public | Bug Report

Description

Pageviews Analysis (https://pageviews.toolforge.org/) does not support Lombard Wiktionary.

Event Timeline

It appears the tool only supports sites listed in its own site map: https://github.com/MusikAnimal/pageviews/blob/master/javascripts/shared/site_map.js.

@MusikAnimal I'm wondering, would it make sense to automatically load that from meta_p.wiki? Otherwise, this is (yet another) thing we'd need to take care of whenever a wiki is created, which unfortunately doesn't scale. All the data appears to be there:

MariaDB [meta_p]> select * from wiki where dbname='lmowiktionary';
+---------------+------+------------+------------+----------------------------+------+-----------+-----------+----------+-----------------+------------------+--------------+--------------+
| dbname        | lang | name       | family     | url                        | size | slice     | is_closed | has_echo | has_flaggedrevs | has_visualeditor | has_wikidata | is_sensitive |
+---------------+------+------------+------------+----------------------------+------+-----------+-----------+----------+-----------------+------------------+--------------+--------------+
| lmowiktionary | lmo  | Wiktionary | wiktionary | https://lmo.wiktionary.org |    1 | s5.labsdb |         0 |        1 |               0 |                1 |            1 |            1 |
+---------------+------+------------+------------+----------------------------+------+-----------+-----------+----------+-----------------+------------------+--------------+--------------+
1 row in set (0.01 sec)

MariaDB [meta_p]>
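
For illustration, a row like the one above could be turned into a site map entry roughly as follows. This is only a sketch: it assumes the site map is a plain object keyed by database name with the bare domain as the value, and `siteMapEntryFromRow` is a hypothetical helper, not something the tool provides.

```javascript
// Hypothetical helper: turn one meta_p.wiki row into a [dbname, domain] pair.
// Assumes the row carries the `dbname` and `url` columns shown in the query output.
function siteMapEntryFromRow(row) {
  // URL#hostname drops the scheme: "https://lmo.wiktionary.org" -> "lmo.wiktionary.org"
  const domain = new URL(row.url).hostname;
  return [row.dbname, domain];
}

// Values copied from the query output above.
const row = { dbname: 'lmowiktionary', url: 'https://lmo.wiktionary.org', is_closed: 0 };

// Assumed site map shape: { dbname: domain }.
const siteMap = Object.fromEntries([siteMapEntryFromRow(row)]);
console.log(siteMap); // { lmowiktionary: 'lmo.wiktionary.org' }
```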

Not all projects in meta_p have pageviews, and not all projects that have pageviews are in meta_p, so that won't work :( Instead we have to go by the pageviews allow list and verify the projects are on the Toolforge replicas (since we also query for edit data). This is something I've had to do manually all this time. I have a git branch from 2017 that was an attempt to automate the process, but I never finished it.

Anyway, I will get lmowiktionary added, as well as any other projects added since the last sync.

Additionally, projects in meta_p are not necessarily replicated, either. For instance, as I'm doing this sync, I found that ami.wikipedia was added to the pageviews allow list on October 4, but amiwiki_p still doesn't exist on the replicas, even though it is in meta_p.wiki.
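
The manual check described above boils down to a couple of lookups per project. A minimal sketch, assuming the three inputs have already been fetched somehow (the allow list, the current site map, and the list of replicated databases); `planSync` and the data shapes are made up for illustration, and the sample data only reflects what is stated in this thread.

```javascript
// Hypothetical sync check. All input shapes are assumptions:
//   allowList  - array of { dbname, domain } for projects on the pageviews allow list
//   siteMap    - object { dbname: domain } as currently shipped with the tool
//   replicated - array of database names present on the replicas, e.g. 'enwiki_p'
function planSync(allowList, siteMap, replicated) {
  const replicatedSet = new Set(replicated);
  const toAdd = [];           // has pageviews but is missing from the site map
  const withoutEditData = []; // subset of toAdd with no replica database yet
  for (const { dbname, domain } of allowList) {
    if (Object.prototype.hasOwnProperty.call(siteMap, dbname)) continue;
    toAdd.push({ dbname, domain });
    if (!replicatedSet.has(`${dbname}_p`)) withoutEditData.push(dbname);
  }
  return { toAdd, withoutEditData };
}

const plan = planSync(
  [
    { dbname: 'enwiki', domain: 'en.wikipedia.org' },
    { dbname: 'amiwiki', domain: 'ami.wikipedia.org' },
  ],
  { enwiki: 'en.wikipedia.org' },
  ['enwiki_p'] // amiwiki_p is absent, as noted above
);
console.log(plan.toAdd.map((p) => p.dbname)); // [ 'amiwiki' ]
console.log(plan.withoutEditData);            // [ 'amiwiki' ]
```

Starting from the allow list rather than meta_p means a project is only ever added when pageview data exists for it; the replica check then only decides whether edit data can be expected.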

Unfortunately, the lmowiktionary database is also not yet available on the Toolforge replicas. But I've decided to ignore this and add it to Pageviews Analysis' site map anyway. So you now have pageviews (e.g. https://pageviews.toolforge.org/?project=lmo.wiktionary.org&pages=Pagina_principala ), but you will get NaN wherever edit data is supposed to be. When I have time I'll change it to say "Unavailable" rather than the confusing "NaN", which is due to a JavaScript data type mismatch. In any case, if and when lmowiktionary becomes replicated, the edit data should start showing up without my intervention.

I've also added amiwiki, bjnwikibooks, jvwikisource, and pwnwiki, some of which won't have edit data either.
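
As a rough idea of what the "Unavailable" fallback could look like: the sketch below assumes the edit count arrives as a missing or non-numeric value when the replica query returns nothing, and is then passed through number formatting. `formatEditCount` is a hypothetical name, not the tool's actual code.

```javascript
// Hypothetical formatter for an edit count that may be missing.
function formatEditCount(value) {
  const n = Number(value); // Number(undefined) is NaN; Number('42') is 42
  // Number(null) and Number('') are 0, so those are treated as missing explicitly.
  if (value === null || value === '' || Number.isNaN(n)) {
    return 'Unavailable';
  }
  return n.toLocaleString('en-US');
}

console.log(formatEditCount(undefined)); // Unavailable
console.log(formatEditCount(1234));      // 1,234
```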

lmowiktionary should be available (per T291404) and the database itself seems to be accessible: P18263. It's just the DNS that is missing.

Mentioned in SAL (#wikimedia-cloud) [2021-12-24T22:51:32Z] <majavah> ran the wikireplica dns script on s5 T298303

bjnwikibooks doesn't seem to exist (only as part of Incubator)? Or am I missing something?

Indeed! I missed that when browsing to the domain. It was added to the pageviews allow list with 829a3639, and it does have some pageview data for the domain itself. The main Pageviews Analysis tool won't function for it, and although in most cases it fails gracefully, I should probably still remove it from the local site map. Thanks for pointing this out.

Thanks Majavah :)