
"Data too long for column 'behavior' at row 1"
Closed, Declined | Public | 1 Estimated Story Points | BUG REPORT

Description

What is the problem?

For certain input data, the import fails with this exception:

/srv/service/node_modules/mariadb/lib/misc/errors.js:61
  return new SqlError(msg, sql, fatal, info, sqlState, errno, additionalStack, addHeader);
         ^

SqlError: (conn=212, no: 1406, SQLState: 22001) Data too long for column 'behavior' at row 1
sql: INSERT INTO behaviors (behavior) VALUES (?); - parameters:['~É|񽐇ï
jñ÷Õ󤿷򊣞üzHÁ󋇁ùE񯧖򮞠s񧏳􏺦©񏿲𴆛ì']
    at Object.module.exports.createError (/srv/service/node_modules/mariadb/lib/misc/errors.js:61:10)
    at PacketNodeEncoded.readError (/srv/service/node_modules/mariadb/lib/io/packet.js:511:19)
    at Query.readResponsePacket (/srv/service/node_modules/mariadb/lib/cmd/resultset.js:46:28)
    at PacketInputStream.receivePacketBasic (/srv/service/node_modules/mariadb/lib/io/packet-input-stream.js:104:9)
    at PacketInputStream.onData (/srv/service/node_modules/mariadb/lib/io/packet-input-stream.js:169:20)
    at Socket.emit (node:events:513:28)
    at addChunk (node:internal/streams/readable:315:12)
    at readableAddChunk (node:internal/streams/readable:289:9)
    at Socket.Readable.push (node:internal/streams/readable:228:10)
    at TCP.onStreamRead (node:internal/stream_base_commons:190:23) {
  text: "Data too long for column 'behavior' at row 1",
  sql: "INSERT INTO behaviors (behavior) VALUES (?); - parameters:['~É|񽐇ï\n" +
    "jñ÷Õ󤿷򊣞üzHÁ󋇁ùE񯧖򮞠s񧏳􏺦©񏿲𴆛ì']",
  fatal: false,
  errno: 1406,
  sqlState: '22001',
  code: 'ER_DATA_TOO_LONG'
}
Steps to reproduce problem
  1. Save the JSON from "Reproduction data" below as a .gz file (e.g. reprod.json.gz) into the ipoid/tmp directory
  2. If necessary, start up docker in the ipoid directory (e.g. docker compose up -d)
  3. Initialise the database: docker compose exec web node init-db.js
  4. Run this command: docker compose exec web node import-data.js ./tmp/reprod.json.gz
Environment

ipoid commit ede94c5172e61f0c629893f8639f85ebc9590bcd

Reproduction data
{"as": {"Organization": null, "number": true}, "client": {"behaviors": [false, null, true, "~\u00c9|\ud9b5\udc07\u00ef\nj\u00f1\u00f7\u00d5\udb53\udff7\ud9ea\udcde\u00fczH\u00c1\udaec\uddc1\u00f9E\ud97e\uddd6\uda79\udfa0s\ud95c\udff3\udbff\udea6\u00a9\ud8ff\udff2\ud890\udd9b\u00ec", "\u009b", null, "\"", "\udb2f\udca9", "\u00a1", false, false, "\u00d9", "\u00fe", 17646, "\u00c1", "\ud961\udf3b", null, true, null], "concentration": {"country": "\u00ec\u00b0\ueffau\ud91f\ude67\u00f9Ah\u00f3}", "skew": true}, "count": 3516, "countries": 12374, "proxies": [false, "w", true, "\uda8d\udc66", 104868539997107551, "\u00daG"], "spread": null, "types": ["MOBILE", "MOBILE", "IOT", "HEADLESS", "HEADLESS", "MOBILE", "DESKTOP", "IOT", "HEADLESS", "DESKTOP", "MOBILE", "HEADLESS", "IOT", "MOBILE", "HEADLESS", "DESKTOP"]}, "infrastructure": false, "location": {"city": "m,\u00c5\ud969\ude37?F\u00b0\u00e6\u00ffM\uda22\uddf8", "state": null}, "organization": 22891, "risks": ["TUNNEL", "TUNNEL", "CALLBACK_PROXY"], "services": [null, false, null, ")\f\u0080\ud9b0\udfef", "j", "\u00b7", true, false, "3", "\u001d\u00ab", -78, true, 85, "\u00ce", -4513, null, null, null], "tunnels": [{"anonymous": null, "entries": ["\u00fd\n", true, -27984, 22266, null, null, null, 89], "operator": 13, "type": "\u00d9\ud977\ude2d\ud96d\udd35\udaa8\ude8b^9\uda8f\udd76\u008b\u008b{"}, {"anonymous": null, "entries": [], "operator": null, "type": "\n"}, {"anonymous": "i\ud992\udc9f\u007f\n", "entries": ["\u00187\u00c7\u035d\ud8b5\udc23", 2560376972599902232, "\u0091", true, -120, 15703], "operator": -26986, "type": true}], "ip": "8cc2d1e5-4d95-466c-bbeb-d23cd2bd80e7"}

Event Timeline

The documentation describes behavior as an array of behavior tags for an IP Address.

All possible values of behavior are defined as an enum here, taking one of the following forms:

FILE_SHARING
<SERVICE_TAG>_USER

All possible values for <SERVICE_TAG> are listed under service tags here.
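
As a rough illustration of the two documented forms, here is a minimal sketch of a shape check. The helper name `looksLikeBehaviorTag` is hypothetical and not part of ipoid, and the character set allowed in a service tag (uppercase letters, digits, underscores) is an assumption rather than something the documentation states.

```javascript
// Hypothetical helper, not part of ipoid: checks whether a value has the
// documented shape of a behavior tag (FILE_SHARING or <SERVICE_TAG>_USER).
// Assumption: service tags use only uppercase letters, digits and underscores.
const SERVICE_USER_PATTERN = /^[A-Z0-9_]+_USER$/;

function looksLikeBehaviorTag(value) {
  return typeof value === 'string' &&
    (value === 'FILE_SHARING' || SERVICE_USER_PATTERN.test(value));
}

// A few entries taken from client.behaviors in the reproduction data.
const reproBehaviors = [false, null, true, '\u009b', '"', '\u00a1', 17646];
const accepted = reproBehaviors.filter(looksLikeBehaviorTag);

// Number of sampled entries that match either documented form.
console.log(accepted.length);
```

None of the sampled reproduction entries match either form, while `FILE_SHARING` does.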

Currently behaviors.behavior is a VARBINARY(64) column in the schema, and this should be sufficient to store all possible values listed under service tags.

I think we can close/decline this unless anyone has a different opinion.

cc: @STran

I quickly skimmed the list of tags, and the longest service tag I saw was around 30 characters, well under the 64-byte limit of the column. I think it's fine to close this out. The example given in the reproduction doesn't appear to be one of the possible enum values.
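
For reference, a small sketch of why the reproduction string is rejected even though it looks short. It assumes the value reaches the VARBINARY(64) column as UTF-8 bytes; the variable names are illustrative only.

```javascript
// The string that failed to insert, copied from the reproduction data.
const behavior =
  '~\u00c9|\ud9b5\udc07\u00ef\nj\u00f1\u00f7\u00d5\udb53\udff7\ud9ea\udcde' +
  '\u00fczH\u00c1\udaec\uddc1\u00f9E\ud97e\uddd6\uda79\udfa0s\ud95c\udff3' +
  '\udbff\udea6\u00a9\ud8ff\udff2\ud890\udd9b\u00ec';

// VARBINARY(64) is sized in bytes, not characters.
const COLUMN_BYTES = 64;

// Count Unicode code points (the string iterator yields one per surrogate pair).
const codePoints = [...behavior].length;

// Count bytes under the assumed UTF-8 encoding.
const utf8Bytes = Buffer.byteLength(behavior, 'utf8');

console.log({ codePoints, utf8Bytes, fits: utf8Bytes <= COLUMN_BYTES });
```

The string is only 28 code points long, but ten of them take four bytes each and ten take two, so it comes to 68 bytes under this assumption and exceeds the column size. The documented tags are plain ASCII, where one character is one byte.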

AGueyte set the point value for this task to 1. Jul 24 2023, 1:24 PM