In T240460#6614767 we decided the following:
- dt is always the client-side timestamp, i.e. the event time.
- meta.dt is always the server-side receive timestamp.
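To make the distinction concrete, here is a minimal sketch in Python. The event values and the helper names (parse_dt, receive_lag_seconds) are made up for illustration and are not part of any Event Platform library. It assumes both fields are ISO-8601 UTC strings.

```python
from datetime import datetime


def parse_dt(value: str) -> datetime:
    # datetime.fromisoformat() only accepts a trailing "Z" on Python 3.11+,
    # so normalize it to an explicit UTC offset first.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


# Hypothetical event: created on a client, received by the server 90 seconds later.
event = {
    "dt": "2020-11-10T14:00:00Z",            # event time, set by the client
    "meta": {"dt": "2020-11-10T14:01:30Z"},  # receive time, set server side
}


def receive_lag_seconds(event: dict) -> float:
    """Return how long after the event time the server received the event."""
    return (parse_dt(event["meta"]["dt"]) - parse_dt(event["dt"])).total_seconds()
```

Keeping the two fields separate means the delay between event time and receive time can always be computed, which would not be possible if one field were overloaded with both meanings.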
To accomplish this:
- All schemas should be updated to have both a meta.dt and a dt field. dt should be required.
- EventBus should be modified to set dt to event time, but not set meta.dt (allowing EventGate to fill it in).
- All EventGate instances should use meta.dt as the Kafka timestamp.
- All Gobblin ingestion jobs should use meta.dt as the partitioning timestamp.
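The steps above can be sketched end to end as follows. This is an illustration in Python only, not the actual EventBus, EventGate, or Gobblin code: every function name is hypothetical, the "fill meta.dt only if absent" behaviour is an assumption based on the wording above, and the year=/month=/day=/hour= partition layout is assumed for the example.

```python
from datetime import datetime, timezone

ISO = "%Y-%m-%dT%H:%M:%SZ"


def to_iso(moment: datetime) -> str:
    return moment.astimezone(timezone.utc).strftime(ISO)


def from_iso(value: str) -> datetime:
    return datetime.strptime(value, ISO).replace(tzinfo=timezone.utc)


def build_event(event_time: datetime) -> dict:
    """Producer side (hypothetical): set dt to event time, leave meta.dt unset."""
    return {"dt": to_iso(event_time), "meta": {}}


def receive(event: dict, received_at: datetime) -> dict:
    """Intake side (hypothetical): require dt, fill in meta.dt if it is absent."""
    if "dt" not in event:
        raise ValueError("dt is required")
    event.setdefault("meta", {}).setdefault("dt", to_iso(received_at))
    return event


def kafka_timestamp_ms(event: dict) -> int:
    """Kafka record timestamps are epoch milliseconds; derive them from meta.dt."""
    return int(from_iso(event["meta"]["dt"]).timestamp() * 1000)


def hourly_partition(event: dict) -> str:
    """Ingestion side (hypothetical layout): partition by receive time, not event time."""
    return from_iso(event["meta"]["dt"]).strftime("year=%Y/month=%m/day=%d/hour=%H")
```

With this arrangement, an event created at 13:59 but received at 14:00:30 lands in the hour=14 partition, so late-arriving events never need to be written into partitions that have already been ingested.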
Ideally, any clients that produce directly to Kafka (not via EventGate) should use a maintained Event Platform producer library where these conventions are automatically handled (like wikimedia-event-utilities).
Downstream Kafka topics and/or Hive tables can use whatever timestamp field is appropriate. For example, compacted Kafka topics will likely want to use the event time dt as the Kafka timestamp. Downstream Hive tables (updatable Iceberg tables?) that select from event tables will likely want to use the event time dt as their own partition timestamp.
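One way a downstream job might make this choice explicit is to take the timestamp field as a parameter. The sketch below is a hypothetical Python helper (get_path and timestamp_ms are illustrative names, not an existing API) that accepts a dotted field path such as "dt" or "meta.dt".

```python
from datetime import datetime


def get_path(event: dict, path: str):
    """Look up a dotted path such as "meta.dt" in a nested dict."""
    value = event
    for key in path.split("."):
        value = value[key]
    return value


def timestamp_ms(event: dict, field: str) -> int:
    """Return the chosen timestamp field as epoch milliseconds."""
    raw = get_path(event, field)
    # Normalize a trailing "Z" for Python versions before 3.11.
    parsed = datetime.fromisoformat(raw.replace("Z", "+00:00"))
    return int(parsed.timestamp() * 1000)
```

An ingestion job would call timestamp_ms(event, "meta.dt"), while a job writing to a compacted topic would call timestamp_ms(event, "dt").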