It would be amazing if all Data Lake tables used a consistent format for their time columns. The first step is picking a standard.
= Datetime or timestamp? =
EventBus and EventLogging call the time column a datetime (`dt`), but `mediawiki_history` calls it a timestamp (`xxxxxx_timestamp`). `webrequest`, for some reason, has both a `ts` and a `dt` field with different formats.
= Which format? =
There are actually four different formats in use!
| format | tables using | comment
| ------- | ----- | -----
| `YYYY-mm-ddTHH:MM:SSZ` | EventLogging tables (e.g. `event.editattemptstep` and `event.editorjourney`), `mediawiki_wikitext_history` | "normal" ISO 8601 standard
| `YYYY-mm-ddTHH:MM:SS+00:00` | EventBus tables (e.g. `event.mediawiki_revision_create` and `event.mediawiki_page_create`) | "alternative" ISO 8601 standard
| `YYYY-mm-dd HH:MM:SS.0` | `mediawiki_history` and siblings, webrequest's `ts` field | [Hive/JDBC standard](https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Types#LanguageManualTypes-timestamp); alternatives can be configured starting with Hive 1.2.0. Nicely human-readable.
| `YYYY-mm-ddTHH:MM:SS` | webrequest's `dt` field | weird combo format 😛: ISO 8601 `T` separator, but no time zone designator
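
To get a feel for what converging on one format might involve, here is a minimal Python sketch that parses all four layouts from the table and re-emits them in a single form. The helper name `normalize_time` is hypothetical, the `Z`-suffixed output is an arbitrary choice for illustration (not a decision on the standard), and the code assumes that values without a zone designator are in UTC, which would need to be checked per table.

```python
from datetime import datetime, timezone

# strptime patterns mirroring the four layouts in the table above.
_LAYOUTS = (
    "%Y-%m-%dT%H:%M:%SZ",    # EventLogging style, literal "Z" suffix
    "%Y-%m-%dT%H:%M:%S%z",   # EventBus style, numeric offset such as +00:00
    "%Y-%m-%d %H:%M:%S.%f",  # Hive/JDBC style, fractional seconds
    "%Y-%m-%dT%H:%M:%S",     # webrequest dt style, no zone designator
)


def normalize_time(value: str) -> str:
    """Hypothetical helper: convert any of the four layouts to YYYY-mm-ddTHH:MM:SSZ."""
    for layout in _LAYOUTS:
        try:
            parsed = datetime.strptime(value, layout)
        except ValueError:
            continue  # not this layout, try the next one
        if parsed.tzinfo is None:
            # Assumption: values without an explicit offset are already UTC.
            parsed = parsed.replace(tzinfo=timezone.utc)
        # Fractional seconds are dropped in the output.
        return parsed.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    raise ValueError(f"unrecognized time format: {value!r}")


if __name__ == "__main__":
    samples = (
        "2019-03-01T12:34:56Z",
        "2019-03-01T12:34:56+00:00",
        "2019-03-01 12:34:56.0",
        "2019-03-01T12:34:56",
    )
    for sample in samples:
        print(sample, "->", normalize_time(sample))
```

Trying each pattern in turn keeps the sketch short, but it is only practical because the four layouts are mutually exclusive. In a real pipeline the source table is known, so the matching pattern could be applied directly.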