What is the problem?
I am seeing cases where ipoid reports "Duplicate entry found for IP..." errors even though each affected IP appears only once in the feed.
Steps to reproduce problem
Assuming docker:
- Set up the docker environment (docker compose up -d; docker compose exec web mkdir /tmp/ipoid; docker compose exec web node -e "require('./create-users.js')();")
- Download the files in Reproduction data and save them to tmp/
- Copy them to the ipoid docker container (docker compose exec web cp tmp/spur_random_data_valid_1_100_10_46dd3514-dc1a-4610-aa97-6fc75d2f88d0_20240103.sorted.json.gz /tmp/ipoid/20240101.json.gz; docker compose exec web cp tmp/spur_random_data_valid_1_100_10_46dd3514-dc1a-4610-aa97-6fc75d2f88d0_20240104.sorted.json.gz /tmp/ipoid/20240102.json.gz)
- docker compose exec web ./main.sh --init true --today 20240101 --debug true
- docker compose exec web ./main.sh --yesterday 20240101 --today 20240102 --debug true --batchsize 10 (note: I cannot reproduce this with the default batchsize)
Expected behaviour: Update happens without any errors or warnings.
Observed behaviour: The update logs the 4 duplicate-entry warnings shown below. I have checked, and each of those IPs appears only once in today's feed and once in yesterday's feed.
...
{"log.level":"info","@timestamp":"2024-03-07T10:24:16.163Z","process.pid":109592,"host.hostname":"2a75629c3145","ecs.version":"8.10.0","message":"Importing /tmp/ipoid/sub/query_split_aaaaj.sql...","trace.id":""}
{"@timestamp":"2024-03-07T10:24:16.462Z","code":"ER_DUP_ENTRY","ecs.version":"8.10.0","errno":1062,"fatal":false,"log.level":"info","message":"Duplicate entry found for IP, attempting to delete and re-insert","name":"SqlError","sql":"INSERT INTO actor_data (ip,org,client_count,types,conc_city,conc_state,conc_country,countries,location_country,risks) VALUES ('2001:1:0:0:0:0:0:2',NULL,0,1,'','','',0,'VIYIoDDy',1); - parameters:[]","sqlMessage":"Duplicate entry '2001:1:0:0:0:0:0:2' for key 'ip'","sqlState":"23000","stack":"SqlError: Duplicate entry found for IP, attempting to delete and re-insert\n at Object.module.exports.createError (/srv/service/node_modules/mariadb/lib/misc/errors.js:64:10)\n at PacketNodeEncoded.readError (/srv/service/node_modules/mariadb/lib/io/packet.js:582:19)\n at Query.readResponsePacket (/srv/service/node_modules/mariadb/lib/cmd/parser.js:58:28)\n at PacketInputStream.receivePacketBasic (/srv/service/node_modules/mariadb/lib/io/packet-input-stream.js:85:9)\n at PacketInputStream.onData (/srv/service/node_modules/mariadb/lib/io/packet-input-stream.js:135:20)\n at Socket.emit (node:events:513:28)\n at addChunk (node:internal/streams/readable:315:12)\n at readableAddChunk (node:internal/streams/readable:289:9)\n at Socket.Readable.push (node:internal/streams/readable:228:10)\n at TCP.onStreamRead (node:internal/stream_base_commons:190:23)","trace.id":""}
{"@timestamp":"2024-03-07T10:24:16.466Z","code":"ER_DUP_ENTRY","ecs.version":"8.10.0","errno":1062,"fatal":false,"log.level":"info","message":"Duplicate entry found for IP, attempting to delete and re-insert","name":"SqlError","sql":"INSERT INTO actor_data (ip,org,client_count,types,conc_city,conc_state,conc_country,countries,location_country,risks) VALUES ('2001:1:0:0:0:0:0:1',NULL,248593,10,'','','WK',675990,'mvCJNzlVf',1); - parameters:[]","sqlMessage":"Duplicate entry '2001:1:0:0:0:0:0:1' for key 'ip'","sqlState":"23000","stack":"SqlError: Duplicate entry found for IP, attempting to delete and re-insert\n at Object.module.exports.createError (/srv/service/node_modules/mariadb/lib/misc/errors.js:64:10)\n at PacketNodeEncoded.readError (/srv/service/node_modules/mariadb/lib/io/packet.js:582:19)\n at Query.readResponsePacket (/srv/service/node_modules/mariadb/lib/cmd/parser.js:58:28)\n at PacketInputStream.receivePacketBasic (/srv/service/node_modules/mariadb/lib/io/packet-input-stream.js:85:9)\n at PacketInputStream.onData (/srv/service/node_modules/mariadb/lib/io/packet-input-stream.js:135:20)\n at Socket.emit (node:events:513:28)\n at addChunk (node:internal/streams/readable:315:12)\n at readableAddChunk (node:internal/streams/readable:289:9)\n at Socket.Readable.push (node:internal/streams/readable:228:10)\n at TCP.onStreamRead (node:internal/stream_base_commons:190:23)","trace.id":""}
{"@timestamp":"2024-03-07T10:24:16.485Z","ecs.version":"8.10.0","log.level":"info","message":"All updates complete","trace.id":""}
{"log.level":"info","@timestamp":"2024-03-07T10:24:17.505Z","process.pid":109592,"host.hostname":"2a75629c3145","ecs.version":"8.10.0","message":"Importing /tmp/ipoid/sub/query_split_aaaak.sql...","trace.id":""}
{"@timestamp":"2024-03-07T10:24:17.789Z","code":"ER_DUP_ENTRY","ecs.version":"8.10.0","errno":1062,"fatal":false,"log.level":"info","message":"Duplicate entry found for IP, attempting to delete and re-insert","name":"SqlError","sql":"INSERT INTO actor_data (ip,org,client_count,types,conc_city,conc_state,conc_country,countries,location_country,risks) VALUES ('192.0.0.171','pL',676513,1,'','','',798744,'DWn',2); - parameters:[]","sqlMessage":"Duplicate entry '192.0.0.171' for key 'ip'","sqlState":"23000","stack":"SqlError: Duplicate entry found for IP, attempting to delete and re-insert\n at Object.module.exports.createError (/srv/service/node_modules/mariadb/lib/misc/errors.js:64:10)\n at PacketNodeEncoded.readError (/srv/service/node_modules/mariadb/lib/io/packet.js:582:19)\n at Query.readResponsePacket (/srv/service/node_modules/mariadb/lib/cmd/parser.js:58:28)\n at PacketInputStream.receivePacketBasic (/srv/service/node_modules/mariadb/lib/io/packet-input-stream.js:85:9)\n at PacketInputStream.onData (/srv/service/node_modules/mariadb/lib/io/packet-input-stream.js:135:20)\n at Socket.emit (node:events:513:28)\n at addChunk (node:internal/streams/readable:315:12)\n at readableAddChunk (node:internal/streams/readable:289:9)\n at Socket.Readable.push (node:internal/streams/readable:228:10)\n at TCP.onStreamRead (node:internal/stream_base_commons:190:23)","trace.id":""}
{"@timestamp":"2024-03-07T10:24:17.796Z","ecs.version":"8.10.0","log.level":"info","message":"All updates complete","trace.id":""}
{"log.level":"info","@timestamp":"2024-03-07T10:24:18.817Z","process.pid":109592,"host.hostname":"2a75629c3145","ecs.version":"8.10.0","message":"Importing /tmp/ipoid/sub/query_split_aaaai.sql...","trace.id":""}
{"@timestamp":"2024-03-07T10:24:19.100Z","code":"ER_DUP_ENTRY","ecs.version":"8.10.0","errno":1062,"fatal":false,"log.level":"info","message":"Duplicate entry found for IP, attempting to delete and re-insert","name":"SqlError","sql":"INSERT INTO actor_data (ip,org,client_count,types,conc_city,conc_state,conc_country,countries,location_country,risks) VALUES ('192.0.0.170',NULL,75008,20,'xU','TCw','B',0,'',1); - parameters:[]","sqlMessage":"Duplicate entry '192.0.0.170' for key 'ip'","sqlState":"23000","stack":"SqlError: Duplicate entry found for IP, attempting to delete and re-insert\n at Object.module.exports.createError (/srv/service/node_modules/mariadb/lib/misc/errors.js:64:10)\n at PacketNodeEncoded.readError (/srv/service/node_modules/mariadb/lib/io/packet.js:582:19)\n at Query.readResponsePacket (/srv/service/node_modules/mariadb/lib/cmd/parser.js:58:28)\n at PacketInputStream.receivePacketBasic (/srv/service/node_modules/mariadb/lib/io/packet-input-stream.js:85:9)\n at PacketInputStream.onData (/srv/service/node_modules/mariadb/lib/io/packet-input-stream.js:135:20)\n at Socket.emit (node:events:513:28)\n at addChunk (node:internal/streams/readable:315:12)\n at readableAddChunk (node:internal/streams/readable:289:9)\n at Socket.Readable.push (node:internal/streams/readable:228:10)\n at TCP.onStreamRead (node:internal/stream_base_commons:190:23)","trace.id":""}
...
Further observations:
- The output of step 5 also shows { changed: 1, removed: 77, inserted: 75 }. This does not seem right: at least 4 IPs (the ones that trigger the duplicate errors) appear in both today's and yesterday's feeds, so presumably they should not be counted as removed and then inserted again.
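For anyone who wants to re-check the claims above independently of ipoid, here is a minimal sketch of the check. It assumes the feed files are gzipped, newline-delimited JSON with one record per line and an "ip" field in each record. That layout is an assumption about the reproduction data, and the function names (load_ips, compare_feeds) are made up for this example and are not part of ipoid. IPs are compared as raw strings, with no normalisation.

```python
import gzip
import json
from collections import Counter


def load_ips(path):
    """Count how often each IP occurs in a gzipped, newline-delimited JSON feed.

    Assumes every non-empty line is a JSON object with an "ip" key.
    """
    counts = Counter()
    with gzip.open(path, "rt", encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if not line:
                continue
            counts[json.loads(line)["ip"]] += 1
    return counts


def compare_feeds(yesterday_path, today_path):
    """Summarise duplicates within each feed and the overlap between them."""
    yesterday = load_ips(yesterday_path)
    today = load_ips(today_path)
    return {
        # IPs listed more than once inside a single feed
        "duplicates_yesterday": sorted(ip for ip, n in yesterday.items() if n > 1),
        "duplicates_today": sorted(ip for ip, n in today.items() if n > 1),
        # IPs present in both feeds (raw string comparison)
        "in_both": sorted(yesterday.keys() & today.keys()),
        # Number of IPs present in only one of the two feeds
        "only_yesterday": len(yesterday.keys() - today.keys()),
        "only_today": len(today.keys() - yesterday.keys()),
    }
```

Calling compare_feeds() on the two downloaded files in tmp/ (yesterday's feed first) should give empty duplicate lists if the feeds really contain each IP once. The in_both list and the two only_* counts can then be compared with the changed/removed/inserted numbers that ipoid prints.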
Environment
ipoid commit 3fc04e78cb82ac031188446aed0aa1210d1200f0
Reproduction data
20240101.json.gz
20240102.json.gz