Add the truthy nt dump to dcat-AP
Closed, Resolved (Public)

Description

Event Timeline

hoo created this task. Apr 19 2017, 2:23 PM
Lydia_Pintscher triaged this task as Low priority. May 5 2017, 1:48 PM
Lydia_Pintscher moved this task from incoming to ready to go on the Wikidata board.

What's happening here, any decision?

I believe the decision was to remove the DCAT-AP dumps from the dumping system, as they are not well integrated and were not deemed critical enough to be worth the effort of integrating them properly. @hoo?

Any conclusion on this?

From the mailing lists https://lists.wikimedia.org/pipermail/wikidata/2017-October/011303.html
and https://lists.wikimedia.org/pipermail/wikidata/2017-November/011407.html there seems to be some support for keeping dcat-ap.

Adding another format shouldn't be hard, I only need to check that it doesn't filter out all dumps marked as BETA.

Lokal_Profil added a comment (edited). Nov 9 2017, 8:51 PM

Adding another format shouldn't be hard, I only need to check that it doesn't filter out all dumps marked as BETA.

So currently the dcat-ap setup excludes all files marked as BETA, since they don't fit the expected naming scheme (in scanDump()). The same mechanism would also not allow "-truthy-" to be part of the file name.

Both could be supported by changing the dump-info config from

"mediatype": {
    "json": "application/json",
    "ttl": "text/turtle"
},

to something like

"mediatype": {
    "json": {
        "contentType": "application/json",
        "prefix": ""
    },
    "ttl": {
        "contentType": "text/turtle",
        "prefix": "-BETA"
    }
},

Mainly this is down to whether we want to expose the BETA labelled dumps.
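For illustration only, here is a minimal Python sketch of how a per-type "prefix" could drive the file-name check. It is not the code in operations/dumps/dcat. The file-name layout (`wikidata-<date>-all<prefix>.<extension>.<compression>`), the compression list, and the names `build_pattern` and `classify` are all assumptions made for this example.

```python
import re

# Proposed config shape from the comment above.
MEDIATYPE = {
    "json": {"contentType": "application/json", "prefix": ""},
    "ttl": {"contentType": "text/turtle", "prefix": "-BETA"},
}


def build_pattern(extension, prefix):
    """Build a regex for one dump type.

    Assumed layout: wikidata-<YYYYMMDD>-all<prefix>.<extension>.<gz|bz2>
    """
    return re.compile(
        r"^wikidata-(?P<date>\d{8})-all"
        + re.escape(prefix)
        + r"\."
        + re.escape(extension)
        + r"\.(?P<compression>gz|bz2)$"
    )


def classify(filename, mediatype=MEDIATYPE):
    """Return details for a recognised dump file, or None if it is skipped."""
    for extension, settings in mediatype.items():
        match = build_pattern(extension, settings["prefix"]).match(filename)
        if match:
            return {
                "format": extension,
                "contentType": settings["contentType"],
                "date": match.group("date"),
                "compression": match.group("compression"),
            }
    return None
```

With this shape, a BETA-labelled Turtle dump is accepted only because its type declares the "-BETA" prefix; a file whose name carries a label that its type does not declare is still skipped.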

Related to this: is T154914 (Add .nt to DCAT-AP for Wikidata dumps) still relevant, or have those been replaced by the truthy ones?

Change 390312 had a related patch set uploaded (by Lokal Profil; owner: Lokal Profil):
[operations/dumps/dcat@master] [WIP]Support prefixed dump types

https://gerrit.wikimedia.org/r/390312

@hoo Would the above suggestion and patch work for you as a solution?

Change 424291 had a related patch set uploaded (by Lokal Profil; owner: Lokal Profil):
[operations/puppet@production] Support prefixed dump types

https://gerrit.wikimedia.org/r/424291

@hoo ping on this one again

Change 390312 merged by jenkins-bot:
[operations/dumps/dcat@master] Support prefixed dump types

https://gerrit.wikimedia.org/r/390312

hoo added a comment. Apr 9 2018, 12:34 PM

Related to this: is T154914 (Add .nt to DCAT-AP for Wikidata dumps) still relevant, or have those been replaced by the truthy ones?

It's still relevant. We might not do it anytime soon, but we might eventually.

Change 425038 had a related patch set uploaded (by Hoo man; owner: Hoo man):
[operations/dumps/dcat@master] Make DCAT backwards compatible to old config

https://gerrit.wikimedia.org/r/425038

Change 425065 had a related patch set uploaded (by Lokal Profil; owner: Lokal Profil):
[operations/dumps/dcat@master] [WIP]Add versioning for config and validate it

https://gerrit.wikimedia.org/r/425065

Change 425038 merged by jenkins-bot:
[operations/dumps/dcat@master] Make DCAT backwards compatible to old config

https://gerrit.wikimedia.org/r/425038
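As a rough illustration of what backwards compatibility with the old config could look like, the Python sketch below accepts both shapes of the "mediatype" block: the old one, where each value is a plain content-type string, and the new one, where each value is an object with "contentType" and "prefix". This is an assumption about the approach, not the merged patch; the function name `normalise_mediatype` is hypothetical.

```python
def normalise_mediatype(mediatype):
    """Return the mediatype mapping in the new, object-valued shape."""
    normalised = {}
    for extension, value in mediatype.items():
        if isinstance(value, str):
            # Old shape: the value is the content type itself.
            normalised[extension] = {"contentType": value, "prefix": ""}
        else:
            # New shape: copy it and default a missing prefix to "".
            entry = dict(value)
            entry.setdefault("prefix", "")
            normalised[extension] = entry
    return normalised
```

Normalising once at load time would let the rest of the code assume a single shape, whichever version of the config is deployed.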

Change 425987 had a related patch set uploaded (by Lokal Profil; owner: Lokal Profil):
[operations/dumps/dcat@master] Allow prefix to override "all"

https://gerrit.wikimedia.org/r/425987

I realised the truthy dump is named differently (it does not start with -all), so I had to do another patch.
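One possible reading of "override", sketched in Python: a configured prefix replaces the default "-all" segment of the file name entirely, and a type with no prefix falls back to "-all". The merged patch may work differently. The example file names, the default value, and the names `dump_segment` and `expected_name` are assumptions for illustration.

```python
DEFAULT_SEGMENT = "-all"  # assumed default for types that configure no prefix


def dump_segment(prefix):
    """Return the name segment that sits between the date and the extension."""
    return prefix if prefix else DEFAULT_SEGMENT


def expected_name(date, extension, prefix="", compression="gz"):
    """Build the file name expected for one dump type (assumed layout)."""
    return f"wikidata-{date}{dump_segment(prefix)}.{extension}.{compression}"


print(expected_name("20180402", "json"))
print(expected_name("20180402", "ttl", prefix="-all-BETA"))
print(expected_name("20180402", "nt", prefix="-truthy-BETA"))
```

Under this reading, the truthy dump needs no special case: its type simply declares a prefix that does not begin with "-all".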

Change 425987 merged by jenkins-bot:
[operations/dumps/dcat@master] Allow prefix to override "all"

https://gerrit.wikimedia.org/r/425987

Change 424291 merged by ArielGlenn:
[operations/puppet@production] Support prefixed dump types

https://gerrit.wikimedia.org/r/424291

Lokal_Profil closed this task as Resolved (edited). Jul 3 2018, 6:27 AM
Lokal_Profil moved this task from Non-WMSE to Done on the User-LokalProfil board.

Other than https://gerrit.wikimedia.org/r/425065, everything has been merged and nt-truthy is now supported. The final patch is a WIP about versioning the config for the future. While the idea sprang from this task, it is not directly related to it.

Lokal_Profil moved this task from Backlog to Done on the Dumps-Generation board. Jul 3 2018, 6:27 AM