
[HiveToDruid] Add support for ingesting subfields of map columns
Open, Medium · Public


It would be nice to be able to ingest subfields of Map type columns, like geocoded_data.
Also, the geocoded_data capsule field has another issue: its name contains an underscore, which confuses the code that distinguishes fields from subfields.

Event Timeline

mforns created this task. Nov 2 2018, 2:14 PM
Restricted Application added a subscriber: Aklapper. Nov 2 2018, 2:14 PM
fdans triaged this task as High priority. Nov 5 2018, 5:29 PM
fdans moved this task from Incoming to Smart Tools for Better Data on the Analytics board.
fdans added subscribers: Nuria, elukey, Ottomata.
Nuria assigned this task to mforns. Dec 12 2018, 8:14 PM

I think this can be resolved with your latest changes; let me know otherwise.

Hm! Good point...
I think part of it has been solved by the recent changes in T210099.
Namely, there was a bug in accessing capsule fields that had underscores in them, like geocoded_data. This is solved.
However, some additions are still needed:

To add a dimension on a subfield of a struct field, you can now do:


And the code will "flatten" it to event_namespace_id (and do everything this implies in the ingestion spec).
To do the same with a map field, you would do:


But the code is not yet able to flatten that syntax into geocoded_data_country.

So we should change that. However, I think it will be easier now.
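
As a rough illustration of the flattening described above, here is a minimal Python sketch. It is not the HiveToDruid code. The input syntaxes are assumptions made only for this example: a struct subfield written as `event.namespace_id` and a map subfield written as `geocoded_data['country']`. The function name `flatten_field_name` is hypothetical as well. Only the flattened outputs (event_namespace_id, geocoded_data_country) come from this task.

```python
import re

# ASSUMPTION: map subfields are written as column['key'], for example
# geocoded_data['country']. The real syntax may differ.
_MAP_ACCESS = re.compile(r"^(\w+)\['([^']+)'\]$")


def flatten_field_name(expr: str) -> str:
    """Flatten a (hypothetical) subfield expression into a flat column name."""
    match = _MAP_ACCESS.match(expr)
    if match:
        column, key = match.groups()
        # Map access: join column name and key with an underscore.
        return f"{column}_{key}"
    # ASSUMPTION: struct subfields use dot notation, e.g. event.namespace_id.
    # Underscores already present in a name are left untouched.
    return expr.replace(".", "_")


print(flatten_field_name("event.namespace_id"))
print(flatten_field_name("geocoded_data['country']"))
```

Keeping the map case as a separate pattern means an underscore inside the column name (as in geocoded_data) never has to be interpreted as a field/subfield separator.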

Milimetric lowered the priority of this task from High to Medium. Jan 7 2019, 5:16 PM
Ottomata claimed this task. Jan 16 2020, 6:22 PM

I'm going to find some time to work on this.

Ottomata moved this task from Next Up to In Progress on the Analytics-Kanban board.
Ottomata renamed this task from [EventLoggingToDruid] Add support for ingesting subfields of map columns to [HiveToDruid] Add support for ingesting subfields of map columns. Apr 21 2020, 2:24 PM
Ottomata removed Ottomata as the assignee of this task.
Ottomata updated the task description.