[HiveToDruid] Add support for ingesting subfields of map columns
Open, Medium, Public

Description

It would be nice to be able to ingest subfields of Map type columns, like geocoded_data.
Also, the geocoded_data capsule field has another issue: its name contains an underscore, which confuses the code that distinguishes fields from subfields.

Event Timeline

I think this can be resolved with your latest changes; let me know otherwise.

Hm! good point...
I think part of it has been solved by the recent changes in T210099.
Namely, there was a bug in accessing capsule fields that had underscores in them, like geocoded_data. This is solved.
However, there are still some additions needed:

To add a dimension on a subfield of a struct field, you can now write:

event.namespace_id

And the code will "flatten" it to event_namespace_id (and do everything this implies in the ingestion spec).
To do the same with a map field, you would do:

geocoded_data['country']

But the code is not yet able to flatten that syntax into geocoded_data_country.

So we should change that. However, I think it will be easier now.
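
For illustration, here is a minimal sketch of the flattening described above, covering both the struct syntax and the map syntax. It is not the HiveToDruid code: the function name flatten_field_name and the MAP_ACCESS pattern are hypothetical, and it assumes a map access is a single column name followed by one quoted key in brackets.

```python
import re

# Hypothetical pattern (an assumption, not taken from HiveToDruid):
# a column name followed by one quoted map key in brackets,
# e.g. geocoded_data['country'] or geocoded_data["country"].
MAP_ACCESS = re.compile(r"""^(\w+)\[(['"])(.+?)\2\]$""")


def flatten_field_name(expr: str) -> str:
    """Turn a field expression into a flat, underscore-joined column name."""
    match = MAP_ACCESS.match(expr)
    if match:
        # Map syntax: join the column name and the key with an underscore.
        column, _quote, key = match.groups()
        return f"{column}_{key}"
    # Struct syntax: replace the dot separators with underscores.
    # Underscores already present in a name (geocoded_data) are left alone.
    return expr.replace(".", "_")


print(flatten_field_name("event.namespace_id"))
print(flatten_field_name("geocoded_data['country']"))
```

Matching the bracket syntax explicitly, instead of splitting on underscores, keeps a column like geocoded_data from being mistaken for a field plus subfield.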

Milimetric lowered the priority of this task from High to Medium. Jan 7 2019, 5:16 PM

I'm going to find some time to work on this.

Ottomata renamed this task from [EventLoggingToDruid] Add support for ingesting subfields of map columns to [HiveToDruid] Add support for ingesting subfields of map columns. Apr 21 2020, 2:24 PM
Ottomata removed Ottomata as the assignee of this task.
Ottomata updated the task description.