Page MenuHomePhabricator

Make meta.dt required on all schemas that declare it
Closed, ResolvedPublic

Description

From current data at event.rc1_mediawiki_page_content_change we notice that both dt and meta.dt are never NULL:

spark-sql (default)> SELECT count(1)
                   > FROM rc1_mediawiki_page_content_change
                   > WHERE dt IS NULL;
count(1)
0
Time taken: 69.411 seconds, Fetched 1 row(s)


spark-sql (default)> SELECT count(1)
                   > FROM rc1_mediawiki_page_content_change
                   > WHERE meta.dt IS NULL;
count(1)
0
Time taken: 27.15 seconds, Fetched 1 row(s)

But from the page change schema, we notice that meta.dt is not marked as required.

This seems incorrect, as, if there was no meta.dt, the current partitioning strategy of event.rc1_mediawiki_page_content_change would fail.

The ask: make meta.dt required on all schemas that declare it.

Related: Docs suggests it is required: https://wikitech.wikimedia.org/wiki/Event_Platform/Schemas/Guidelines#Required_fields

Event Timeline

After discussions with @Ottomata, we speculated that the fact that meta.dt is not marked as required in the page change schema must have been an historical consequence of keeping schemas backwards compatible.

However, the common fragment does indeed have meta.dt set as required since version 1.0.0: https://github.com/wikimedia/schemas-event-primary/blob/master/jsonschema/fragment/common/1.0.0.yaml#L15C11-L15C11

So I am now quite confused. Is the meta field in page change schema a composition from common, or is it a different thing?

We removed meta.dt requiredness in common/2.0.0.

I honestly cannot remember why we did this. Some context in {T240460#6614767}, but there is no info about making it un-required.

Maybe it was because we wanted fewer required fields in meta in general? I am just guessing here, I'm really not sure.

xcollazo claimed this task.

meta.dt, in practice, is always set.

I'll take this as good enough since there is no practical consequences of having the schema not have it required, other than a bit of confusion. Hopefully next person that considers this can find this ticket.

Closing.