
[EventLoggingToDruid] Add support for ingesting subfields of map columns
Open, Medium, Public

Description

It would be nice to be able to ingest subfields of Map type columns, like geocoded_data.
Also, the geocoded_data capsule field has another issue: its name contains an underscore, which breaks the code that distinguishes fields from subfields.

Event Timeline

mforns created this task. Nov 2 2018, 2:14 PM
Restricted Application added a subscriber: Aklapper. Nov 2 2018, 2:14 PM
fdans triaged this task as High priority. Nov 5 2018, 5:29 PM
fdans moved this task from Incoming to Smart Tools for Better Data on the Analytics board.
fdans added subscribers: Nuria, elukey, Ottomata.
Nuria assigned this task to mforns. Dec 12 2018, 8:14 PM

I think this can be resolved with your latest changes; let me know otherwise.

Hm! good point...
I think part of it has been solved by the recent changes in T210099.
Namely, there was a bug in accessing capsule fields that had underscores in them, like geocoded_data. This is solved.
However, there are still some additions needed:

To add a dimension on a subfield of a struct field, you can now write:

event.namespace_id

And the code will "flatten" it to event_namespace_id (and do everything this implies in the ingestion spec).
To do the same with a map field, you would write:

geocoded_data['country']

But the code is not yet able to flatten that syntax into geocoded_data_country.

So we should change that. However, I think it will be easier now.
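
For illustration only, here is a minimal sketch of the flattening described above, written in Python rather than in the job's own language. The helper name flatten_field_name and the regular expression are hypothetical and are not taken from the EventLoggingToDruid code; the sketch assumes map keys are quoted string literals and does not cover the other changes needed in the ingestion spec.

```python
import re

# Hypothetical pattern: matches a map access such as ['country'] or ["country"].
_MAP_ACCESS = re.compile(r"\[\s*['\"]([^'\"]+)['\"]\s*\]")


def flatten_field_name(expr: str) -> str:
    """Flatten a struct or map accessor into a single column name.

    Hypothetical helper for illustration, not the actual implementation.
    """
    # Rewrite map access as dotted access: geocoded_data['country'] -> geocoded_data.country
    dotted = _MAP_ACCESS.sub(r".\1", expr)
    # Join all path segments with underscores: event.namespace_id -> event_namespace_id
    return dotted.replace(".", "_")


if __name__ == "__main__":
    print(flatten_field_name("event.namespace_id"))
    print(flatten_field_name("geocoded_data['country']"))
```

Treating the map key as one more path segment means struct and map subfields go through the same flattening step, so an underscore in the parent field name (as in geocoded_data) needs no special handling here.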

Milimetric lowered the priority of this task from High to Medium. Jan 7 2019, 5:16 PM