In output-sql generateInsertActorQueries, we build a representation of the data for each IP address to be stored in the actor_data table:
    const actorObj = {
      actor_data: {
        ip: actor.ip,
        org: actor.organization,
        client_count: actor.client.count || 0,
        types: actor.client.types ? getActorTypes(actor.client.types) : actorTypes.UNKNOWN,
        conc_city: actor.client.concentration && actor.client.concentration.city ? actor.client.concentration.city : '',
        conc_state: actor.client.concentration && actor.client.concentration.state ? actor.client.concentration.state : '',
        conc_country: actor.client.concentration && actor.client.concentration.country ? actor.client.concentration.country : '',
        countries: actor.client.countries || 0,
        location_country: actor.location.country || '',
        risks: actor.risks ? getActorRisks(actor.risks) : riskTypes.UNKNOWN
      },
      behaviors: actor.client.behaviors || [],
      proxies: actor.client.proxies || [],
      tunnels: actor.tunnels && actor.tunnels.length ? getTunnels(actor.tunnels) : false
    };
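How we handle empty fields is inconsistent. For behaviors and proxies we fall back to an empty array and store nothing, for tunnels we fall back to false, for string fields to '', and for counts to 0, but for types and risks we store an explicit UNKNOWN constant (actorTypes.UNKNOWN, riskTypes.UNKNOWN).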
- What should we store for missing fields (which might be legitimately empty)?
- What should we store for unexpected data (e.g. an unrecognized behavior)?
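
As a starting point for discussion, not a decision, here is a minimal sketch of one possible convention: treat only null/undefined as "missing" and store null for it, let legitimately empty values (0, '', []) pass through unchanged, and map unrecognized items to an explicit UNKNOWN marker instead of dropping them. The names orDefault, normalizeList, UNKNOWN, and knownBehaviors, and the behavior values, are hypothetical and do not exist in output-sql; the real actorTypes and riskTypes constants may look different.

```javascript
// Hypothetical sentinel; stands in for actorTypes.UNKNOWN / riskTypes.UNKNOWN.
const UNKNOWN = 'UNKNOWN';

// Returns the value if present, otherwise the fallback.
// Only null and undefined count as missing, so legitimate 0 and ''
// are preserved (unlike `value || fallback`, which replaces them).
function orDefault(value, fallback) {
  return value ?? fallback;
}

// Normalizes a list field against a set of recognized values.
// - missing list  -> null (distinguishable from an empty list)
// - empty list    -> [] (legitimately empty)
// - unknown items -> UNKNOWN (kept visible rather than silently dropped)
function normalizeList(items, known) {
  if (items == null) return null;
  return items.map((item) => (known.has(item) ? item : UNKNOWN));
}

// Hypothetical set of recognized behaviors, for illustration only.
const knownBehaviors = new Set(['behavior_a', 'behavior_b']);

const missing = normalizeList(undefined, knownBehaviors);
const empty = normalizeList([], knownBehaviors);
const mixed = normalizeList(['behavior_a', 'something_new'], knownBehaviors);
const zeroCount = orDefault(0, null);
const noCount = orDefault(undefined, null);
```

With this convention, a missing field, an empty field, and an unrecognized value each end up with a distinct stored representation, which is what the two questions above are trying to pin down. Whether null is acceptable depends on the column constraints of the actual tables, which are not shown here.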