It would be nice to be able to ingest subfields of Map type columns, like geocoded_data.
Also, the geocoded_data capsule field has another issue: its name contains an underscore, which confuses the code that distinguishes fields from subfields.
Description
Status | Subtype | Assigned | Task
---|---|---|---
Resolved | | elukey | T203669 Return to real time banner impressions in Druid
Open | | None | T208589 [HiveToDruid] Add support for ingesting subfields of map columns
Declined | | None | T218347 Ingest cirrussearchrequest data into druid
Event Timeline
Hm! good point...
I think part of it has been solved by the recent changes in T210099.
Namely there was a bug in accessing capsule fields that had underscores in them, like geocoded_data. This is solved.
However, there are still some additions needed:
To add a dimension on a subfield of a struct field, you can now write:
event.namespace_id
And the code will "flatten" it to event_namespace_id (and do everything that implies in the ingestion spec).
To do the same with a map field, you would do:
geocoded_data['country']
But the code is not yet able to flatten that syntax into geocoded_data_country.
So we should change that. However, I think it will be easier now.
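As a rough illustration of the flattening being asked for, here is a minimal Python sketch. It is not the HiveToDruid code: the function name flatten_field_name and the regular expression are hypothetical, and it assumes map keys are written as quoted string literals inside square brackets, as in the example above.

```python
import re

# Hypothetical helper for illustration only; not the actual HiveToDruid code.
# Matches a quoted map key access such as ['country'] or ["country"].
_MAP_ACCESS = re.compile(r"\[\s*['\"]([^'\"\]]+)['\"]\s*\]")


def flatten_field_name(expr: str) -> str:
    """Flatten a struct or map access expression into a single column name."""
    # Rewrite map access (geocoded_data['country']) as dotted access
    # (geocoded_data.country) so both syntaxes share one code path.
    dotted = _MAP_ACCESS.sub(r".\1", expr)
    # Join the path segments with underscores. Underscores already present
    # in a segment (as in geocoded_data) are left untouched.
    return dotted.replace(".", "_")


print(flatten_field_name("event.namespace_id"))
print(flatten_field_name("geocoded_data['country']"))
```

Under these assumptions, both the struct syntax and the map syntax produce a flat name of the form described in this comment.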