
Assess Wikidata dump import hardware
Open, In Progress, Medium, Public

Description

THIS IS A DRAFT

The purpose of this task is to document observed Wikidata dump import characteristics on different hardware, using the graph database Blazegraph that is employed by the Wikidata Query Service. It is also to understand how this performance stacks up against an import into the managed Amazon Web Services ("AWS") offering Amazon Neptune. Systap, the company behind Blazegraph, was acquired by Amazon several years ago.

SUMMARY

  • A current state of the art cloud compute instance approaches the performance characteristics of a 2018 gaming desktop and 2019 Intel-based MacBook Pro.
  • NVMes confer a speed advantage for Blazegraph imports relative to SATA SSDs.
  • CPU clock speed confers a speed advantage for Blazegraph imports.
  • Amazon Neptune is capable of importing a full February 2024 Wikidata dump dramatically faster than a standalone cloud compute instance or a later generation bare metal server in our data center. Instead of 15-25 days (bare metal in one of our data centers), it takes approximately 63 hours (2.625 days).
  • There are cost considerations.

NARRATIVE

As noted in T336443: Investigate performance differences between wdqs2022 and older hosts, full Wikidata imports are slower on newer servers provisioned with some later generation processors, in contrast with some older servers with older generation processors. The latest full import appeared to take 25 days.

It's important to address this speed challenge in order to ensure that newer hardware is capable of repopulating Blazegraph quickly in the event of a system failure. Faster imports also enable us to conduct experiments with graph splitting approaches more quickly - we have been targeting a load time below 10 days, which is achievable for graphs of about 7.6B triples but which we believe will likely be difficult to achieve consistently for graphs of 10B triples or more on the newer nodes. Finally, in the event of a catastrophic system failure, we want to understand whether the current state of the art, for example as exhibited by Amazon Neptune, may still provide adequate import capabilities.

The following prior art is useful for understanding similar analyses.

More to be ported from text files, Etherpad, etc.

Event Timeline

dr0ptp4kt changed the task status from Open to In Progress. Mar 4 2024, 3:28 PM
dr0ptp4kt claimed this task.
dr0ptp4kt triaged this task as Medium priority.
dr0ptp4kt updated the task description.
dr0ptp4kt moved this task from Incoming to In Progress on the Discovery-Search (Current work) board.

Change rOPUP100894305575 had a related patch set uploaded (by Bking; author: Bking):

[operations/puppet@production] partman: configure wdqs1025 partioning

https://gerrit.wikimedia.org/r/1008943

Change rOPUP100894305575 merged by Bking:

[operations/puppet@production] partman: configure wdqs1025 partioning

https://gerrit.wikimedia.org/r/1008943

First, here are some commands that were used for Blazegraph imports on Ubuntu 22.04. I originally tried a good number of EC2 instance types, then went back and focused on just four of them with a sequence of repeatable commands (this wasn't scripted, as I didn't want to spend time automating and also wanted to see the systems' feedback along the way). I forgot to grab RAM frequency as a routine step when running these commands (I recall checking one server during the original checks, and did look at my Alienware), but generally the servers are DDR4 unless the AWS documentation says DDR5 (for my 2018 Alienware and 2019 MacBook Pro they're DDR4, by the way).

# get the specs, get the software, ready the mount
lscpu
free -h
lsblk
sudo fdisk /dev/nvme1n1
 n      # new partition
 p      # primary
 1      # partition number 1
 ENTER  # accept default first sector
 ENTER  # accept default last sector
 w      # write the partition table and exit
lsblk
sudo mkfs.ext4 /dev/nvme1n1p1
mkdir rdf
sudo mount -t auto -v /dev/nvme1n1p1 /home/ubuntu/rdf
sudo chown ubuntu:ubuntu rdf
git clone https://gerrit.wikimedia.org/r/wikidata/query/rdf rdfdownload
cp -r rdfdownload/. rdf
cd rdf
df -h .
sudo apt update
sudo apt install openjdk-8-jdk-headless
./mvnw package -DskipTests

# ready Blazegraph and run a partial import
sudo mkdir /var/log/wdqs
sudo chown ubuntu:ubuntu /var/log/wdqs
touch /var/log/wdqs/wdqs-blazegraph.log
cd /home/ubuntu/rdf/dist/target/
tar xzvf service-0.3.138-SNAPSHOT-dist.tar.gz
cd service-0.3.138-SNAPSHOT/
# using logback.xml like prod:
mv ~/logback.xml .
# using runBlazegraph.sh like prod, 31g heap and pointer to logback.xml:
mv ~/runBlazegraph.sh .
vi runBlazegraph.sh
screen
 ./runBlazegraph.sh
CTRL-a-d to leave screen up
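# loadData.sh flags as used here: -n namespace, -d directory containing the wikidump-*.ttl.gz munged files, -s first file number, -e last file number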
time ./loadData.sh -n wdq -d /home/ubuntu/ -s 1 -e 9
screen -r
CTRL-c to kill Blazegraph
 exit from screen
ls -alh wikidata.jnl

# try it with a ramdisk
rm wikidata.jnl
sudo modprobe brd rd_size=50331648 max_part=1 rd_nr=1  # rd_size is in KiB: 50331648 KiB ≈ 48 GiB ramdisk
sudo mkfs -t ext4 /dev/ram0
mkdir /home/ubuntu/rdfram
sudo mount /dev/ram0 /home/ubuntu/rdfram
sudo chown ubuntu:ubuntu /home/ubuntu/rdfram
cd
cp -r rdf/. rdfram
cd rdfram/dist/target/service-0.3.138-SNAPSHOT/
cp /home/ubuntu/wikidump-* /home/ubuntu/rdfram
df -h ./
screen
 ./runBlazegraph.sh
CTRL-a-d to leave screen up
time ./loadData.sh -n wdq -d /home/ubuntu/rdfram -s 1 -e 9
dr0ptp4kt updated the task description.

Change 1009574 had a related patch set uploaded (by Bking; author: Bking):

[operations/puppet@production] wdqs: make "monitoring_tier" var optional

https://gerrit.wikimedia.org/r/1009574

Change 1009574 merged by Bking:

[operations/puppet@production] wdqs: move monitoring logic into role declaration

https://gerrit.wikimedia.org/r/1009574

@dr0ptp4kt wdqs1025 should be ready for your I/O tests. Let us know how it goes!

Thanks @bking ! It looks like the NVMe in this one is not a higher speed one for writes (based on the reported model from lsblk I think this is a https://i.dell.com/sites/csdocuments/Shared-Content_data-Sheets_Documents/en/Dell-PowerEdge-Express-Flash-NVMe-Mixed-Use-PCIe-SSD.pdf ), and I'm also wondering if perhaps its write performance has degraded with age. I'll paste in the results here, but this was slower than the other servers, ironically (although not surprisingly because of the slower NVMe and slightly slower processor). This slower write speed is atypical of the other NVMes I've encountered. I believe the newer model ones are rated for 6000 MB/s for writes. But, I'm going to ping on task to see if we can get a comparative read of disk throughput from one of the newer and faster cp#### NVMes.

dr0ptp4kt@wdqs1025:/srv/deployment/wdqs/wdqs-cache$ ls /srv/wdqs/
aliases.map  wikidata.jnl               wikidump-000000002.ttl.gz  wikidump-000000004.ttl.gz  wikidump-000000006.ttl.gz  wikidump-000000008.ttl.gz
dumps        wikidump-000000001.ttl.gz  wikidump-000000003.ttl.gz  wikidump-000000005.ttl.gz  wikidump-000000007.ttl.gz  wikidump-000000009.ttl.gz
dr0ptp4kt@wdqs1025:/srv/deployment/wdqs/wdqs-cache$ cd cache
dr0ptp4kt@wdqs1025:/srv/deployment/wdqs/wdqs-cache/cache$ time ./loadData.sh -n wdq -d /srv/wdqs -s 1 -e 9
Processing wikidump-000000001.ttl.gz
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd"><html><head><meta http-equiv="Content-Type" content="text&#47;html;charset=UTF-8"><title>blazegraph&trade; by SYSTAP</title
></head
><body<p>totalElapsed=214282ms, elapsed=214279ms, connFlush=0ms, batchResolve=0, whereClause=0ms, deleteClause=0ms, insertClause=0ms</p
><hr><p>COMMIT: totalElapsed=233942ms, commitTime=1709910647417, mutationCount=22829952</p
></html
>Processing wikidump-000000002.ttl.gz
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd"><html><head><meta http-equiv="Content-Type" content="text&#47;html;charset=UTF-8"><title>blazegraph&trade; by SYSTAP</title
></head
><body<p>totalElapsed=196470ms, elapsed=196469ms, connFlush=0ms, batchResolve=0, whereClause=0ms, deleteClause=0ms, insertClause=0ms</p
><hr><p>COMMIT: totalElapsed=227786ms, commitTime=1709910874952, mutationCount=15807617</p
></html
>Processing wikidump-000000003.ttl.gz
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd"><html><head><meta http-equiv="Content-Type" content="text&#47;html;charset=UTF-8"><title>blazegraph&trade; by SYSTAP</title
></head
><body<p>totalElapsed=183111ms, elapsed=183110ms, connFlush=0ms, batchResolve=0, whereClause=0ms, deleteClause=0ms, insertClause=0ms</p
><hr><p>COMMIT: totalElapsed=213965ms, commitTime=1709911089170, mutationCount=12654001</p
></html
>Processing wikidump-000000004.ttl.gz
^C

real    14m4.855s
user    0m0.084s
sys     0m0.053s
dr0ptp4kt@wdqs1025:/srv/deployment/wdqs/wdqs-cache/cache$ cd /srv
dr0ptp4kt@wdqs1025:/srv$ df .
Filesystem      1K-blocks    Used  Available Use% Mounted on
/dev/nvme0n1   1537157352 9508448 1449491832   1% /srv
dr0ptp4kt@wdqs1025:/srv$ sudo sync; sudo dd if=/dev/zero of=tempfile bs=25M count=1024; sudo sync
1024+0 records in
1024+0 records out
26843545600 bytes (27 GB, 25 GiB) copied, 27.1995 s, 987 MB/s
dr0ptp4kt@wdqs1025:/srv$ sudo sync; sudo dd if=/dev/zero of=tempfile bs=25M count=1024; sudo sync
1024+0 records in
1024+0 records out
26843545600 bytes (27 GB, 25 GiB) copied, 37.5448 s, 715 MB/s
dr0ptp4kt@wdqs1025:/srv$ lsblk -o MODEL,SERIAL,SIZE,STATE --nodeps
MODEL                                SERIAL                 SIZE STATE
...
Dell Express Flash PM1725a 1.6TB SFF       S39XNX0JC01060   1.5T
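
(As an aside, one way to check whether drive wear might explain degraded write performance is the NVMe SMART log; a sketch only, assuming nvme-cli is available - this was not run as part of the tests above:)

sudo apt install nvme-cli
sudo nvme smart-log /dev/nvme0n1   # check "percentage_used" and "data_units_written" for wear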

@ssingh would you mind if the following command is run on one of the newer cp#### hosts with a new higher write throughput NVMe? If so, got a recommended node? I don't have access, but I think @bking may.

sudo sync; sudo dd if=/dev/zero of=tempfile bs=25M count=1024; sudo sync

Heads up, I'm out for the rest of the day.

@ssingh @dr0ptp4kt hold up on the testing on your hosts for now...we might be able to get an NVMe into this year's budget, will let you know.

@dr0ptp4kt If you want to run I/O tests on the existing hosts, I recommend the approach detailed in this wikitech page. Brendan Gregg is considered an expert on computer performance and benchmarking; you might try his approach as well.
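
For example, a minimal sequential-write check with fio might look like this (a sketch only; the target path, size, and job parameters are illustrative and not taken from the wikitech page):

fio --name=seqwrite --filename=/srv/fio-testfile --rw=write --bs=1M --size=25G \
    --ioengine=libaio --direct=1 --iodepth=32 --numjobs=1 --group_reporting
rm /srv/fio-testfile   # clean up the test file afterwards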

AWS EC2 servers

After exploring a battery of EC2 servers, four instance types were selected and the commands posted were run.

The configuration most like our wdqs1021-1023 servers (third generation Intel Xeon) is listed first. The fastest option among the four servers was a Graviton3 ARM-based configuration from Amazon.

| Time Disk ➡️ Disk | Time RAMdisk ➡️ RAMdisk | Instance Type | Cost Per Hour | HD Transfer | Processor Comment | RAM Comment |
| --- | --- | --- | --- | --- | --- | --- |
| 26m46.651s | 26m26.923s | m6id.16xlarge | $3.7968 | EBS ➡️ NVMe | 64 vCPU @ "Up to 3.5 GHz 3rd Generation Intel Xeon Scalable processors (Ice Lake 8375C)" | 256 GB @ DDR4 |
| 22m5.442s | 20m31.244s | m5zn.metal | $3.9641 | EBS ➡️ EBS | 48 vCPU @ "2nd Generation Intel Xeon Scalable Processors (Cascade Lake 8252C) with an all-core turbo frequency up to 4.5 GHz" | 192 GiB @ DDR4 |
| 21m40.537s | 20m57.268s | c5d.12xlarge | $2.304 | EBS ➡️ NVMe | 48 vCPU @ "C5 and C5d 12xlarge, 24xlarge, and metal instance sizes feature custom 2nd generation Intel Xeon Scalable Processors (Cascade Lake 8275CL) with a sustained all core Turbo frequency of 3.6GHz and single core turbo frequency of up to 3.9GHz." | 96 GiB @ DDR4 |
| 19m18.825s | 19m23.868s | c7gd.16xlarge | $2.903 | EBS ➡️ NVMe | 64 vCPU @ "Powered by custom-built AWS Graviton3 processors" | 128 GiB @ DDR5 |

2018 gaming desktop

Commands were then run against a gaming-class desktop from 2018. This outperformed the fastest Graviton3 configuration in AWS.

The Blazegraph bufferCapacity configuration variable was tested. Increasing the bufferCapacity from 100000 to 1000000 yielded a sizable performance improvement.

| Time Disk ➡️ Disk | Instance Type | bufferCapacity | HD Transfer | Processor Comment | RAM Comment |
| --- | --- | --- | --- | --- | --- |
| 18m31.647s | Alienware Aurora R7 (upgraded) i7-8700 | 100000 | SATA SSD ➡️ NVMe | 6 CPU @ up to 4.6 GHz (i7-8700 page) | 64 GB @ DDR4 |
| 18m3.798s | Alienware Aurora R7 (upgraded) i7-8700 | 100000 | NVMe ➡️ same NVMe | 6 CPU @ up to 4.6 GHz (i7-8700 page) | 64 GB @ DDR4 |
| 15m1.658s | Alienware Aurora R7 (upgraded) | 10000000 | NVMe ➡️ same NVMe | 6 CPU @ up to 4.6 GHz (i7-8700 page) | 64 GB @ DDR4 |
| 13m28.076s | Alienware Aurora R7 (upgraded) | 1000000 | NVMe ➡️ same NVMe | 6 CPU @ up to 4.6 GHz (i7-8700 page) | 64 GB @ DDR4 |
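
For reference, the setting being compared is this line in RWStore.properties (a sketch; it assumes the property is present uncommented in the properties file shipped with the service):

# value used for the initial runs
com.bigdata.rdf.sail.bufferCapacity=100000
# increased value used for the faster runs
com.bigdata.rdf.sail.bufferCapacity=1000000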

2019 MacBook Pro

The commands were run against a 2019-era Intel-based MacBook Pro. A 16 GB Java heap was used, as this laptop only has 32 GB of memory.

This was a fast arrangement, but not quite as fast as the gaming-class desktop. The 2019 MacBook Pro has a faster processor, but a somewhat slower disk, than the gaming desktop.

| Time Disk ➡️ Disk | Instance Type | bufferCapacity | HD Transfer | Processor Comment | RAM Comment |
| --- | --- | --- | --- | --- | --- |
| 17m00.040s | 2019 MacBook Pro | 100000 | High-end SSD ➡️ Same SSD (Tom's Hardware) | "Configurable to 2.4GHz 8‑core Intel Core i9, Turbo Boost up to 5.0GHz, with 16MB shared L3 cache" | 32 GB @ DDR4 |
| 16m27.390s | 2019 MacBook Pro | 1000000 | High-end SSD ➡️ Same SSD (Tom's Hardware) | "Configurable to 2.4GHz 8‑core Intel Core i9, Turbo Boost up to 5.0GHz, with 16MB shared L3 cache" | 32 GB @ DDR4 |

Gaming desktop with MemStore

The following represents an in-memory configuration (MemStore) of the Blazegraph database instead of a .jnl database journal file. It's fast, but note that the operating system killed the process after completion, apparently due to memory pressure. A higher-memory instance would be required in order to load the full database, and again, there doesn't appear to be an easy way to serialize the in-memory store to disk for later restoration. But this is illustrative of what's possible.

| Time Disk ➡️ Disk | Instance Type | bufferCapacity | HD Transfer | Processor Comment | RAM Comment |
| --- | --- | --- | --- | --- | --- |
| 9m34.808s | Alienware Aurora R7 (upgraded) | 100000 w/ MemStore | NVMe ➡️ same NVMe | 6 core @ up to 4.6 GHz (i7-8700 page) | 64 GB @ DDR4 |
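
For context, switching Blazegraph from the disk-backed journal to the in-memory store is a properties change along these lines (a sketch based on Blazegraph's documented BufferMode options, not the exact configuration used for this run):

# disk-backed RWStore journal (the typical WDQS configuration)
com.bigdata.journal.AbstractJournal.bufferMode=DiskRW
# in-memory store; contents are lost when the process exits
com.bigdata.journal.AbstractJournal.bufferMode=MemStore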

Queue Capacity

As suggested at https://github.com/blazegraph/database/wiki/IOOptimization the queue capacity was set at 4000 for these runs.

com.bigdata.btree.writeRetentionQueue.capacity=4000

This is probably an okay middle-ground value. But, to check the behavior of this setting, the same command was run with the capacity set to 8000 (and a bufferCapacity of 1000000) on the 2019 MacBook Pro. Indeed, it was rather faster, as expected, at 12m29.030s (versus 16m27.390s with the capacity at 4000), showing promise like in a previous analysis.
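
In other words, for that MacBook Pro comparison run the RWStore.properties line was set along these lines (sketch):

com.bigdata.btree.writeRetentionQueue.capacity=8000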

This is likely an area for further examination against the 2018 gaming desktop. Soon to follow is some discussion of what was observed for NVMe versus SSD targets on that 2018 gaming desktop.

More about NVMe versus SSD

Runs were also done to see the effects on 150 munged files (out of a set of 2202 files) from the full Wikidata import, which exercises more of the disk-related pieces. This was tried with both types of target disk - SATA SSD and M.2 NVMe - on the 2018 gaming desktop, with a bufferCapacity of 100000.

The M.2 NVMe target was somewhere between 16% and 19% faster.

Notice the following paths in the commands below:

  • ~/rdf, which is part of a mount on the NVMe
  • /mnt/t, which is a copy of ~/rdf, but on a SATA SSD
  • /mnt/firehose/, yet another SATA SSD, bearing the full set of munged files

Target is NVMe

ubuntu22:~/rdf/dist/target/service-0.3.138-SNAPSHOT$ time ./loadData.sh -n wdq -d /mnt/firehose/munge_on_later_data_set -s 1 -e 150

...

>Processing wikidump-000000150.ttl.gz
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd"><html><head><meta http-equiv="Content-Type" content="text&#47;html;charset=UTF-8"><title>blazegraph&trade; by SYSTAP</title
></head
><body<p>totalElapsed=33999ms, elapsed=33999ms, connFlush=0ms, batchResolve=0, whereClause=0ms, deleteClause=0ms, insertClause=0ms</p
><hr><p>COMMIT: totalElapsed=76005ms, commitTime=1709099819611, mutationCount=3098484</p
></html
>
real    319m50.828s

Target is SATA SSD, run attempt 1

Now, the SATA SSD as the target (as before, the source has been a different SATA SSD).

ubuntu22:/mnt/t/rdf/dist/target/service-0.3.138-SNAPSHOT$ time ./loadData.sh -n wdq -d /mnt/firehose/munge_on_later_data_set -s 1 -e 150

>Processing wikidump-000000150.ttl.gz
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd"><html><head><meta http-equiv="Content-Type" content="text&#47;html;charset=UTF-8"><title>blazegraph&trade; by SYSTAP</title
></head
><body<p>totalElapsed=45665ms, elapsed=45665ms, connFlush=0ms, batchResolve=0, whereClause=0ms, deleteClause=0ms, insertClause=0ms</p
><hr><p>COMMIT: totalElapsed=114606ms, commitTime=1709141576293, mutationCount=3098484</p
></html
>
real    381m19.703s

So, the SATA SSD as target yielded a result about 19% slower (381m19.703s / 319m50.828s ≈ 1.19).

Target is SATA SSD, run attempt 2

The SATA SSD target was tried again from the same directory (as always, first stopping Blazegraph and deleting the journal), just to get a feeling for whether the first result wasn't a fluke on the SATA SSD.

ubuntu22:/mnt/t/rdf/dist/target/service-0.3.138-SNAPSHOT$ time ./loadData.sh -n wdq -d /mnt/firehose/munge_on_later_data_set -s 1 -e 150

><body<p>totalElapsed=46490ms, elapsed=46490ms, connFlush=0ms, batchResolve=0, whereClause=0ms, deleteClause=0ms, insertClause=0ms</p
><hr><p>COMMIT: totalElapsed=120472ms, commitTime=1709169683880, mutationCount=3098484</p
></html
>
real    373m52.079s

Still, some 16.5% slower on the SSD.

More about bufferCapacity

Similarly, a run with 150 munged files was attempted with the bufferCapacity in RWStore.properties increased from 100000 to 1000000, with the NVMe as the target.

com.bigdata.rdf.sail.bufferCapacity=1000000
ubuntu22:~/rdf/dist/target/service-0.3.138-SNAPSHOT$ time ./loadData.sh -n wdq -d /mnt/firehose/munge_on_later_data_set -s 1 -e 150
...
real    240m5.344s

Remember, for nine munged files the difference in performance for NVMe ➡️ same NVMe between a bufferCapacity of 100000 and 1000000 was about 34% (18m3.798s / 13m28.076s ≈ 1.34), and what we see here for 150 munged files is roughly consistent at about 33% (319m50.828s / 240m5.344s ≈ 1.33).

Going for much of the full import

Further import commenced from there with a bufferCapacity of 1000000:

ubuntu22:~/rdf/dist/target/service-0.3.138-SNAPSHOT$ date
Mon Mar  4 06:31:06 PM CST 2024

ubuntu22:~/rdf/dist/target/service-0.3.138-SNAPSHOT$ time ./loadData.sh -n wdq -d /mnt/firehose/munge_on_later_data_set -s 151 -e 2202
Processing wikidump-000000151.ttl.gz

Munged files 151 through 1333 were submitted, but the last file apparently successfully processed was file 1303, at Friday, March 8, 2024 12:07:23 AM CST, as suggested by the following stack trace material.

SPARQL-UPDATE: updateStr=LOAD <file:///mnt/firehose/munge_on_later_data_set/wikidump-000001333.ttl.gz>
java.util.concurrent.ExecutionException: java.util.concurrent.ExecutionException: org.openrdf.query.UpdateExecutionException: java.lang.RuntimeException: Problem with entry at -153323735397432922: lastRootBlock=rootBlock{ rootBlock=1, challisField=1303, version=3, nextOffset=47326115792378284, localTime=1709878050435 [Friday, March 8, 2024 12:07:30 AM CST], firstCommitTime=1709578180406 [Monday, March 4, 2024 12:49:40 PM CST], lastCommitTime=1709878043961 [Friday, March 8, 2024 12:07:23 AM CST], commitCounter=1303, commitRecordAddr={off=NATIVE:-141569206,len=422}, commitRecordIndexAddr={off=NATIVE:-76760975,len=220}, blockSequence=20886, quorumToken=-1, metaBitsAddr=1981067087054125, metaStartAddr=11989126, storeType=RW, uuid=4afe5c11-044b-4d55-a8f4-a4c90f0faa10, offsetBits=42, checksum=1206986935, createTime=1709578179709 [Monday, March 4, 2024 12:49:39 PM CST], closeTime=0}

So, we have about 4 hours for files 1-150, then another 77.6 hours for files 151-1303. This means about 59% of the full dump (1303 of 2202 files) was processed in roughly 3.4 days (about 81.6 hours total).

As noted earlier, there may be an opportunity to set the queue capacity higher and squeeze out even better performance. This will need to wait until I'm physically at the gaming class desktop.

Anecdotally, imports of the N3-style Turtle files slow down more dramatically than imports of simple N-Triples style files as more and more triples are ingested in a bulk load. What we know here is that even with the more complicated N3-style Turtle file format it's possible to import around 9B records in less than four days.

About Amazon Neptune

Amazon Neptune was set to import using the simpler N-Triples file format with a serverless configuration at 128 NCUs (about 256 GB of RAM with some attendant CPU). We don't use N-Triples files in our existing import process (we use N3-style Turtle), but it is the sort of format used in the graph split imports.

The first import was just with lexemes to make sure things were working on a 32 NCU configuration.

curl -v -X POST \
    -H 'Content-Type: application/json' \
    https://db-neptune-1.cluster-cnim20k6c0mh.us-west-2.neptune.amazonaws.com:8182/loader -d '
    {
      "source" : "s3://blazegraphdump/latest-lexemes.nt.bz2",
      "format" : "ntriples",
      "iamRoleArn" : "arn:aws:iam::ACCOUNTID:role/NeptuneLoadFromS3",
      "region" : "us-west-2",
      "failOnError" : "FALSE",
      "parallelism" : "HIGH",
      "updateSingleCardinalityProperties" : "FALSE",
      "queueRequest" : "TRUE"
    }'

...

$ curl -G 'https://db-neptune-1.cluster-cnim20k6c0mh.us-west-2.neptune.amazonaws.com:8182/loader/b29f432b-233e-406d-a36d-e3bd0a4ec5bb'
{
    "status" : "200 OK",
    "payload" : {
        "feedCount" : [
            {
                "LOAD_COMPLETED" : 1
            }
        ],
        "overallStatus" : {
            "fullUri" : "s3://bucket/latest-lexemes.nt.bz2",
            "runNumber" : 1,
            "retryNumber" : 3,
            "status" : "LOAD_COMPLETED",
            "totalTimeSpent" : 7472,
            "startTime" : 1708643480,
            "totalRecords" : 161015491,
            "totalDuplicates" : 137948,
            "parsingErrors" : 0,
            "datatypeMismatchErrors" : 0,
            "insertErrors" : 0
        }
    }

This required a bunch of grants, and I had to make my personal bucket hosting the file listable and readable, as well as the objects listable and readable within it (it's possible to do chained IAM grants, but it is a bit of work and requires somewhat complicated STSes). It appeared that it was also necessary to create a VPC endpoint as described in the documentation.
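
For anyone replicating, the S3 gateway endpoint can be created roughly like this (a sketch; the VPC and route table IDs are placeholders, and the Neptune documentation remains the authoritative reference):

aws ec2 create-vpc-endpoint \
    --vpc-id vpc-0123456789abcdef0 \
    --vpc-endpoint-type Gateway \
    --service-name com.amazonaws.us-west-2.s3 \
    --route-table-ids rtb-0123456789abcdef0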

This was started at 1:30 PM CT on Monday, February 26, 2024. Note that this is the lexemes dump.

Next, I went to verify that with 128 NCUs it goes faster than with 32 NCUs, because if it did, that would be useful for the bigger dump.

curl -v -X POST \
    -H 'Content-Type: application/json' \
    https://db-neptune-1-instance-1.cwnhpfsf87ne.us-west-2.neptune.amazonaws.com:8182/loader -d '
    {
      "source" : "s3://blazegraphdump/latest-lexemes.nt.bz2",
      "format" : "ntriples",
      "iamRoleArn" : "arn:aws:iam::ACCOUNTID:role/NeptuneLoadFromS3Attempt",
      "region" : "us-west-2",
      "failOnError" : "FALSE",
      "parallelism" : "OVERSUBSCRIBE",
      "updateSingleCardinalityProperties" : "FALSE",
      "queueRequest" : "TRUE"
    }'


{
    "status" : "200 OK",
    "payload" : {
        "loadId" : "8ace45ed-2989-4fd4-aa19-d13b9a59e824"
    }

curl -G 'https://db-neptune-1-instance-1.cwnhpfsf87ne.us-west-2.neptune.amazonaws.com:8182/loader/8ace45ed-2989-4fd4-aa19-d13b9a59e824'


{
    "status" : "200 OK",
    "payload" : {
        "feedCount" : [
            {
                "LOAD_COMPLETED" : 1
            }
        ],
        "overallStatus" : {
            "fullUri" : "s3://blazegraphdump/latest-lexemes.nt.bz2",
            "runNumber" : 1,
            "retryNumber" : 0,
            "status" : "LOAD_COMPLETED",
            "totalTimeSpent" : 2142,
            "startTime" : 1708975752,
            "totalRecords" : 163715491,
            "totalDuplicates" : 141148,
            "parsingErrors" : 0,
            "datatypeMismatchErrors" : 0,
            "insertErrors" : 0
        }
    }
}

It was faster, but that may have had something to do with the earlier 32 NCU configuration having to re-run (successfully) and the 128 NCU configuration not having to automatically re-run, as well as the use of the "parallelism" : "OVERSUBSCRIBE" setting for the 128 NCU use case. OVERSUBSCRIBE is okay and even beneficial for an ingestion like this.

Now, for the full Wikidata load. This was started at about 2:20 PM CT on Monday, February 26, 2024.

curl -v -X POST \
    -H 'Content-Type: application/json' \
    https://db-neptune-1-instance-1.cwnhpfsf87ne.us-west-2.neptune.amazonaws.com:8182/loader -d '
    {
      "source" : "s3://blazegraphdump/latest-all.nt.bz2",
      "format" : "ntriples",
      "iamRoleArn" : "arn:aws:iam::ACCOUNTID:role/NeptuneLoadFromS3Attempt",
      "region" : "us-west-2",
      "failOnError" : "FALSE",
      "parallelism" : "OVERSUBSCRIBE",
      "updateSingleCardinalityProperties" : "FALSE",
      "queueRequest" : "TRUE"
    }'

{
    "status" : "200 OK",
    "payload" : {
        "loadId" : "54dc9f5a-6e3c-428d-8897-180e10c96dbf"
    }

As a frame of reference, over 9B records imported in a bit over 26 hours. This is in the ballpark of what's discussed in https://hal.science/hal-03132794/document , and remember this is serverless and is probably using lower clock speed vCPUs and possibly lower rated RAM. A higher powered provisioned instance is likely to go faster.

curl -G 'https://db-neptune-1-instance-1.cwnhpfsf87ne.us-west-2.neptune.amazonaws.com:8182/loader/54dc9f5a-6e3c-428d-8897-180e10c96dbf'

{
    "status" : "200 OK",
    "payload" : {
        "feedCount" : [
            {
                "LOAD_IN_PROGRESS" : 1
            }
        ],
        "overallStatus" : {
            "fullUri" : "s3://blazegraphdump/latest-all.nt.bz2",
            "runNumber" : 1,
            "retryNumber" : 0,
            "status" : "LOAD_IN_PROGRESS",
            "totalTimeSpent" : 101809,
            "startTime" : 1708978871,
            "totalRecords" : 9312600000,
            "totalDuplicates" : 145923094,
            "parsingErrors" : 0,
            "datatypeMismatchErrors" : 0,
            "insertErrors" : 0
        }
    }
}

I let this continue running. Following was the final outcome - 63 hours for this very full dump.

ubuntu@ip-172-31-23-68:~$ curl 'https://db-neptune-1-instance-1.cwnhpfsf87ne.us-west-2.neptune.amazonaws.com:8182/loader/54dc9f5a-6e3c-428d-8897-180e10c96dbf'
{
    "status" : "200 OK",
    "payload" : {
        "feedCount" : [
            {
                "LOAD_COMPLETED" : 1
            }
        ],
        "overallStatus" : {
            "fullUri" : "s3://blazegraphdump/latest-all.nt.bz2",
            "runNumber" : 1,
            "retryNumber" : 0,
            "status" : "LOAD_COMPLETED",
            "totalTimeSpent" : 227384,
            "startTime" : 1708978871,
            "totalRecords" : 19558358339,
            "totalDuplicates" : 341740320,
            "parsingErrors" : 0,
            "datatypeMismatchErrors" : 0,
            "insertErrors" : 0
        }
    }

This style of query worked fine, although query caching and pre-warmup would be advisable through other means if ever running such a service in the real world.

curl -X POST --data-binary 'query=select ?s ?p ?o where {?s ?p ?o} limit 10' 'https://db-neptune-1-instance-1.cwnhpfsf87ne.us-west-2.neptune.amazonaws.com:8182/sparql'

Next, we tried a provisioned instance using a high-powered db.x2iedn.24xlarge - 128 vCPU and 4,096 GB of RAM according to its dropdown in the Create Database dialog (n.b., its info panel said the following - Instance class: db.x2iedn.24xlarge, Current generation, vCPU: 96, Memory: 3072 GiB; indeed, this matches what the db.x2iedn.24xlarge is said to be at https://aws.amazon.com/ec2/instance-types/ ). This is not the sort of machine we'd run for more than an import, as the costs would be prohibitive, but it is worth exploring. https://docs.aws.amazon.com/neptune/latest/userguide/best-practices-general-basic.html#best-practices-loader-tempinstance describes how to go about using a more powerful import machine, which is mostly about ingesting faster (it's unclear if there's much of a cost difference).

After deleting the serverless-based graph database data, the run with the provisioned instance (non-serverless, but still a managed service) was started at about 2:39 PM on February 29, 2024 (note the same VPC endpoint, but keep in mind it's a different server / cluster):

curl -v -X POST \
    -H 'Content-Type: application/json' \
    https://db-neptune-2.cluster-cwnhpfsf87ne.us-west-2.neptune.amazonaws.com:8182/loader -d '
    {
      "source" : "s3://blazegraphdump/latest-all.nt.bz2",
      "format" : "ntriples",
      "iamRoleArn" : "arn:aws:iam::ACCOUNTID:role/NeptuneLoadFromS3Attempt",
      "region" : "us-west-2",
      "failOnError" : "FALSE",
      "parallelism" : "OVERSUBSCRIBE",
      "updateSingleCardinalityProperties" : "FALSE",
      "queueRequest" : "TRUE"
    }'


{
    "status" : "200 OK",
    "payload" : {
        "loadId" : "75b910c5-25d5-4257-8e9f-8a91f8c2df88"
    }


watch -n 60 curl -G 'https://db-neptune-2.cluster-cwnhpfsf87ne.us-west-2.neptune.amazonaws.com:8182/loader/75b910c5-25d5-4257-8e9f-8a91f8c2df88'

After running for a while, we just stopped it. At this pace it would take about 14 hours to import 9 billion triples. That would be faster than serverless, but not so much so as to justify continuing with this particular import (it may be worth it if there's time pressure). CPU utilization when using this powerful provisioned node was hovering around 20%, which is lower than expected. Keep in mind that serverless was using 50%-70+% once it got going.

Here was the point where we stopped with this powerful managed server approach.

ubuntu@ip-172-31-23-68:~$ curl 'https://db-neptune-2.cluster-cwnhpfsf87ne.us-west-2.neptune.amazonaws.com:8182/loader/75b910c5-25d5-4257-8e9f-8a91f8c2df88'
{
    "status" : "200 OK",
    "payload" : {
        "feedCount" : [
            {
                "LOAD_IN_PROGRESS" : 1
            }
        ],
        "overallStatus" : {
            "fullUri" : "s3://blazegraphdump/latest-all.nt.bz2",
            "runNumber" : 1,
            "retryNumber" : 0,
            "status" : "LOAD_IN_PROGRESS",
            "totalTimeSpent" : 7720,
            "startTime" : 1709239199,
            "totalRecords" : 1311570000,
            "totalDuplicates" : 8806377,
            "parsingErrors" : 0,
            "datatypeMismatchErrors" : 0,
            "insertErrors" : 0
        }
    }

Although there was some level of storage cost, there was also an allowance for Amazon Neptune serverless - up to 750 free compute hours are available for an account in the first month of use of Amazon Neptune serverless (this appeared to be fanned out across multiple servers' NCUs somehow).

The policies for this sort of usage may be subject to change, so any attempt to replicate this should pay close attention to hourly billing, and budgeting for the use case should be considered (e.g., up to $500/day for the import).

Post-import needs could vary dramatically from use case to use case, ranging from comparatively affordable for such a big database to comparatively more expensive.

It is possible to migrate workloads and adjust configuration after import to better contain costs.

Possible approaches for loading may involve

  1. Use of the technique in https://docs.aws.amazon.com/neptune/latest/userguide/feature-overview-lookup-cache.html for a managed server (this may generalize across other NVMe-connected instance types) if needing to support relatively warm lookup cache facilities immediately.
  2. Alternatively, serverless import could be used and then the installation could be switched over to a managed server that supports the lookup cache, letting the cache gradually warm up.

Use of configuration parameters as described in https://docs.aws.amazon.com/neptune/latest/userguide/parameters.html would be useful for things like the lookup cache, as well as autoscaling configuration, query engine, and so on.

When attempting a run with a queue capacity of 8000, a buffer of 1000000, and a heap size of 16g on the gaming-class desktop (to mimic the MacBook Pro), things were slower than with a queue capacity of 4000, a buffer of 1000000, and a heap size of 31g on the same machine.

ubuntu22:~/rdf/dist/target/service-0.3.138-SNAPSHOT$ time ./loadData.sh -n wdq -d /mnt/firehose/munge_on_later_data_set -s 1 -e 150
...
real    280m46.264s

A run is in progress to verify if there's anything noticeable when the heap size is set to 31g but the queue capacity is at 8000 and the buffer is at 1000000 when processing the first 150 files on the gaming-class desktop.

The run to check with heap size of 31g, queue capacity of 8000, and buffer at 1000000 stalled at file 107.

By the way, I'm attempting a run for the first 1332 munged files (one shy of the 1333 where the previous run terminated) with the buffer at 1000000, heap size at 31g, and queue capacity at 4000 on the gaming-class desktop, to see whether this imports smoothly and whether performance gains are noticeable.

ubuntu22:~/rdf/dist/target/service-0.3.138-SNAPSHOT$ date
Wed Mar 20 02:36:59 PM CDT 2024
ubuntu22:~/rdf/dist/target/service-0.3.138-SNAPSHOT$ time ./loadData.sh -n wdq -d /mnt/firehose/munge_on_later_data_set -s 1 -e 1332

...screen'ing in to check:

Processing wikidump-000000505.ttl.gz
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd"><html><head><meta http-equiv="Content-Type" content="text&#47;html;charset=UTF-8"><title>blazegraph&trade; by SYSTAP</title
></head
><body<p>totalElapsed=13452ms, elapsed=13452ms, connFlush=0ms, batchResolve=0, whereClause=0ms, deleteClause=0ms, insertClause=0ms</p
><hr><p>COMMIT: totalElapsed=167329ms, commitTime=1711041930967, mutationCount=4566497</p
></html
>Thu Mar 21 12:25:35 PM CDT 2024
Processing wikidump-000000506.ttl.gz
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd"><html><head><meta http-equiv="Content-Type" content="text&#47;html;charset=UTF-8"><title>blazegraph&trade; by SYSTAP</title
></head
><body<p>totalElapsed=15405ms, elapsed=15405ms, connFlush=0ms, batchResolve=0, whereClause=0ms, deleteClause=0ms, insertClause=0ms</p
><hr><p>COMMIT: totalElapsed=203202ms, commitTime=1711042135111, mutationCount=5262167</p
></html
>Thu Mar 21 12:28:58 PM CDT 2024
Processing wikidump-000000507.ttl.gz
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd"><html><head><meta http-equiv="Content-Type" content="text&#47;html;charset=UTF-8"><title>blazegraph&trade; by SYSTAP</title
></head
><body<p>totalElapsed=14701ms, elapsed=14700ms, connFlush=0ms, batchResolve=0, whereClause=0ms, deleteClause=0ms, insertClause=0ms</p
><hr><p>COMMIT: totalElapsed=178754ms, commitTime=1711042314114, mutationCount=5005853</p
></html
>Thu Mar 21 12:31:57 PM CDT 2024

The run with the buffer at 1000000, heap size at 31g, and queue capacity at 4000 on the gaming-class desktop completed.

Processing wikidump-000001332.ttl.gz
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd"><html><head><meta http-equiv="Content-Type" content="text&#47;html;charset=UTF-8"><title>blazegraph&trade; by SYSTAP</title
></head
><body<p>totalElapsed=13580ms, elapsed=13580ms, connFlush=0ms, batchResolve=0, whereClause=0ms, deleteClause=0ms, insertClause=0ms</p
><hr><p>COMMIT: totalElapsed=266483ms, commitTime=1711304860167, mutationCount=4772590</p
></html
>Sun Mar 24 01:27:45 PM CDT 2024

real    5690m30.371s

... which is 3.95 days. I'm trying again, but going back to a buffer capacity of 100000 instead of 1000000 for one last comparison with these runs on this subset of munged data, and without any larger pause between batches of files. (Remember, the previous run with buffer capacity at 100000, 31g heap, and queue capacity at 4000 was done by first running files 1-150, then, after coming back to the terminal sometime later, resuming from file 151; but in the real world we usually hope to just let this thing run one file after another without any pause. In practice it could be that the gap between files 150 and 151 gave the JVM more time to heal itself and created some artificial speed gains when adding the two time intervals together, but we'll see.)

Starting on Friday, March 29, 2024 at 1:40 PM CT...

ubuntu22:~/rdf/dist/target/service-0.3.138-SNAPSHOT$ time ./loadData.sh -n wdq -d /mnt/firehose/munge_on_later_data_set -s 1 -e 1332
Processing wikidump-000000001.ttl.gz

I'll update when it's done. It should complete presumably sometime in the next 24 hours.

Now this is interesting: we're now past 4 days (about 4 days and 1 hour) of this running, and with buffer capacity at 100000 instead of 1000000 (but this time without any gap between the batches of files), there's still a good way to go yet.

Processing wikidump-000001177.ttl.gz
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd"><html><head><meta http-equiv="Content-Type" content="text&#47;html;charset=UTF-8"><title>blazegraph&trade; by SYSTAP</title
></head
><body<p>totalElapsed=612796ms, elapsed=612796ms, connFlush=0ms, batchResolve=0, whereClause=0ms, deleteClause=0ms, insertClause=0ms</p
><hr><p>COMMIT: totalElapsed=689208ms, commitTime=1712085811545, mutationCount=12297407</p
></html
>Tue Apr  2 02:23:35 PM CDT 2024
Processing wikidump-000001178.ttl.gz
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd"><html><head><meta http-equiv="Content-Type" content="text&#47;html;charset=UTF-8"><title>blazegraph&trade; by SYSTAP</title
></head
><body<p>totalElapsed=850122ms, elapsed=850121ms, connFlush=0ms, batchResolve=0, whereClause=0ms, deleteClause=0ms, insertClause=0ms</p
><hr><p>COMMIT: totalElapsed=950693ms, commitTime=1712086762086, mutationCount=16659867</p
></html
>Tue Apr  2 02:39:26 PM CDT 2024
Processing wikidump-000001179.ttl.gz

It's possible this means that a higher buffer capacity actually makes a difference. I will let this run complete so we can see what the percentage difference is.

After this I will check if this sort of behavior is reproducible, and to what extent, with one side of the graph split when using these two different buffer sizes.

This morning, April 3, around 6:25 AM, I SSH'd in to check progress, and it was working, but going slowly, similar to the day before. It was on a file number in the 1200s, but I didn't write down the number or copy terminal output; I do remember seeing around 796 seconds for one of the files at that time. Looking at the previous comment, you'll see those were going slowly; not surprising, as we know imports of these munged files get slower as more data has been imported.

I checked several hours later in the middle of a meeting, and it had gone into a bad spiral.

I've been able to use screen backscrolling to obtain much of the stack trace, but could not scroll back far enough to tell for sure which file was the last to import successfully without a stack trace. What we can say is that the last somewhat stable commit was probably on file 1302 at about 7:24 AM, and that probably file 1303, and definitely 1304 and 1305, have been failing badly and taking a really long time to do so; this would probably continue indefinitely from here without killing the process. Here is just a slice of the paste to give an idea of things (notice lastCommitTime and commitCounter in the stack trace).

Wed Apr  3 02:05:26 PM CDT 2024
Processing wikidump-000001305.ttl.gz
SPARQL-UPDATE: updateStr=LOAD <file:///mnt/firehose/munge_on_later_data_set/wikidump-000001305.ttl.gz>
java.util.concurrent.ExecutionException: java.util.concurrent.ExecutionException: org.openrdf.query.UpdateExecutionException: java.lang.RuntimeException: Problem with entry at -83289912769511002: lastRootBlock=rootBlock{ rootBlock=0, challisField=1302, version=3, nextOffset=47806576684846562, localTime=1712147044389 [Wednesday, April 3, 2024 7:24:04 AM CDT], firstCommitTime=1711737574896 [Friday, March 29, 2024 1:39:34 PM CDT], lastCommitTime=1712147041973 [Wednesday, April 3, 2024 7:24:01 AM CDT], commitCounter=1302, commitRecordAddr={off=NATIVE:-140859033,len=422}, commitRecordIndexAddr={off=NATIVE:-93467508,len=220}, blockSequence=34555, quorumToken=-1, metaBitsAddr=26754033649714513, metaStartAddr=11989126, storeType=RW, uuid=f993598d-497c-46a7-8434-d25c8859a0b8, offsetBits=42, checksum=1600335692, createTime=1711737574192 [Friday, March 29, 2024 1:39:34 PM CDT], closeTime=0}

Unfortunately jstack seems to hiccup.

ubuntu22:~$ sudo jstack -m 987870
[sudo] password: 
Attaching to process ID 987870, please wait...
Debugger attached successfully.
Server compiler detected.
JVM version is 25.402-b06
Exception in thread "main" java.lang.reflect.InvocationTargetException
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at sun.tools.jstack.JStack.runJStackTool(JStack.java:140)
	at sun.tools.jstack.JStack.main(JStack.java:106)
Caused by: java.lang.RuntimeException: Unable to deduce type of thread from address 0x00007fecb400b800 (expected type JavaThread, CompilerThread, ServiceThread, JvmtiAgentThread, or SurrogateLockerThread)
	at sun.jvm.hotspot.runtime.Threads.createJavaThreadWrapper(Threads.java:169)
	at sun.jvm.hotspot.runtime.Threads.first(Threads.java:153)
	at sun.jvm.hotspot.tools.PStack.initJFrameCache(PStack.java:200)
	at sun.jvm.hotspot.tools.PStack.run(PStack.java:71)
	at sun.jvm.hotspot.tools.PStack.run(PStack.java:58)
	at sun.jvm.hotspot.tools.PStack.run(PStack.java:53)
	at sun.jvm.hotspot.tools.JStack.run(JStack.java:66)
	at sun.jvm.hotspot.tools.Tool.startInternal(Tool.java:260)
	at sun.jvm.hotspot.tools.Tool.start(Tool.java:223)
	at sun.jvm.hotspot.tools.Tool.execute(Tool.java:118)
	at sun.jvm.hotspot.tools.JStack.main(JStack.java:92)
	... 6 more
Caused by: sun.jvm.hotspot.types.WrongTypeException: No suitable match for type of address 0x00007fecb400b800
	at sun.jvm.hotspot.runtime.InstanceConstructor.newWrongTypeException(InstanceConstructor.java:62)
	at sun.jvm.hotspot.runtime.VirtualConstructor.instantiateWrapperFor(VirtualConstructor.java:80)
	at sun.jvm.hotspot.runtime.Threads.createJavaThreadWrapper(Threads.java:165)
	... 16 more
ubuntu22:~$ sudo jstack -Flm 987870
Usage:
    jstack [-l] <pid>
        (to connect to running process)
    jstack -F [-m] [-l] <pid>
        (to connect to a hung process)
    jstack [-m] [-l] <executable> <core>
        (to connect to a core file)
    jstack [-m] [-l] [server_id@]<remote server IP or hostname>
        (to connect to a remote debug server)

Options:
    -F  to force a thread dump. Use when jstack <pid> does not respond (process is hung)
    -m  to print both java and native frames (mixed mode)
    -l  long listing. Prints additional information about locks
    -h or -help to print this help message
ubuntu22:~$ sudo jstack -F -l -m 987870
Attaching to process ID 987870, please wait...
Debugger attached successfully.
Server compiler detected.
JVM version is 25.402-b06
Exception in thread "main" java.lang.reflect.InvocationTargetException
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at sun.tools.jstack.JStack.runJStackTool(JStack.java:140)
	at sun.tools.jstack.JStack.main(JStack.java:106)
Caused by: java.lang.RuntimeException: Unable to deduce type of thread from address 0x00007fecb400b800 (expected type JavaThread, CompilerThread, ServiceThread, JvmtiAgentThread, or SurrogateLockerThread)
	at sun.jvm.hotspot.runtime.Threads.createJavaThreadWrapper(Threads.java:169)
	at sun.jvm.hotspot.runtime.Threads.first(Threads.java:153)
	at sun.jvm.hotspot.tools.PStack.initJFrameCache(PStack.java:200)
	at sun.jvm.hotspot.tools.PStack.run(PStack.java:71)
	at sun.jvm.hotspot.tools.PStack.run(PStack.java:58)
	at sun.jvm.hotspot.tools.PStack.run(PStack.java:53)
	at sun.jvm.hotspot.tools.JStack.run(JStack.java:66)
	at sun.jvm.hotspot.tools.Tool.startInternal(Tool.java:260)
	at sun.jvm.hotspot.tools.Tool.start(Tool.java:223)
	at sun.jvm.hotspot.tools.Tool.execute(Tool.java:118)
	at sun.jvm.hotspot.tools.JStack.main(JStack.java:92)
	... 6 more
Caused by: sun.jvm.hotspot.types.WrongTypeException: No suitable match for type of address 0x00007fecb400b800
	at sun.jvm.hotspot.runtime.InstanceConstructor.newWrongTypeException(InstanceConstructor.java:62)
	at sun.jvm.hotspot.runtime.VirtualConstructor.instantiateWrapperFor(VirtualConstructor.java:80)
	at sun.jvm.hotspot.runtime.Threads.createJavaThreadWrapper(Threads.java:165)
	... 16 more

The number of files imported between the previous comment at 2:39 PM yesterday and 7:24 AM today is about what we might expect if we were for example assuming about 10 minutes or less per file in keeping with the trajectory observed.

(1302-1179+1)*10/60

That is to say, we would have expected an additional 20.6 hours up to file 1302 at ten minutes per file, but it was actually about an additional 16.9 hours.

Blazegraph was still running, so I took this opportunity to check how many triples were in place.

select (count(*) as ?ct)
where { ?s ?p ?o }

produced a result of 9082617267.

For the sake of extrapolation, let's suppose it would take about 8.1 minutes per remaining file and that things would have kept working all the way up to 1332 files (with 1302 being the last good one). 30 files times 8.1 minutes per file = 243 minutes, or 4 hours and 3 minutes. If this had been the case, this would put us at about 11:27 AM today for completion. Remember, this run was started around Friday, March 29, 2024 at 1:40 PM CT, so we're really quite close to a projection of about 4.90 days with this lower buffer capacity setting. Given that the other approach took about 3.95 days, we might say this run would be about 24% longer (1.24 minus 1.00) than the other approach.

This sort of instability is unfortunately proving to be common (indeed, it's part of the reason for the graph split to have fewer triples and to reduce risk), but we may have stumbled onto something with the buffer capacity that helps us get both greater stability and greater speed, at least up to a certain number of munged files and at least on this machine without any pauses between files.

As I said earlier, I'm going to try comparing the buffer capacity settings against graph split files and see how things stack up. Graph-split-based files are ultimately our target, after all.

Knowing that, at least on this machine, the larger buffer capacity seems likely to be associated with greater stability (we cannot say this without qualification, given an issue in an earlier run), we should probably also try a full reload on another server with the munged N3-style Turtle files and the larger buffer capacity, and see what this does for a full graph import. Or at least be aware that this is an option in case we need to do a full import (e.g., we might try it on one node with the higher buffer capacity, in parallel with any other nodes attempting to recover with the currently established buffer capacity).

Here's the paste of what stack trace material I could screen backscroll:

 at com.bigdata.rdf.store.LocalTripleStore.commit(LocalTripleStore.java:99)
 at com.bigdata.rdf.sail.BigdataSail$BigdataSailConnection.commit2(BigdataSail.java:3932)
 at com.bigdata.rdf.sail.BigdataSailRepositoryConnection.commit2(BigdataSailRepositoryConnection.java:330)
 at com.bigdata.rdf.sparql.ast.eval.AST2BOpUpdate.convertCommit(AST2BOpUpdate.java:375)
 at com.bigdata.rdf.sparql.ast.eval.AST2BOpUpdate.convertUpdate(AST2BOpUpdate.java:321)
 at com.bigdata.rdf.sparql.ast.eval.ASTEvalHelper.executeUpdate(ASTEvalHelper.java:1072)
 ... 9 more
Caused by: java.lang.RuntimeException: Problem with entry at -83289912769511002
 at com.bigdata.rwstore.RWStore.freeDeferrals(RWStore.java:5425)
 at com.bigdata.rwstore.RWStore.checkDeferredFrees(RWStore.java:3896)
 at com.bigdata.journal.RWStrategy.checkDeferredFrees(RWStrategy.java:780)
 at com.bigdata.journal.AbstractJournal$CommitState.writeCommitRecord(AbstractJournal.java:3500)
 at com.bigdata.journal.AbstractJournal$CommitState.access$2800(AbstractJournal.java:3301)
 at com.bigdata.journal.AbstractJournal.commitNow(AbstractJournal.java:4111)
 at com.bigdata.journal.AbstractJournal.commit(AbstractJournal.java:3132)
 ... 15 more
Caused by: java.lang.RuntimeException: addr=-151834637 : cause=java.lang.IllegalStateException: Error reading from WriteCache addr: 704140099584 length: 12, writeCacheDebug: No WriteCache debug info
 at com.bigdata.rwstore.RWStore.getData(RWStore.java:2399)
 at com.bigdata.rwstore.RWStore.getData(RWStore.java:2120)
 at com.bigdata.rwstore.RWStore.freeImmediateBlob(RWStore.java:2685)
 at com.bigdata.rwstore.RWStore.immediateFree(RWStore.java:2725)
 at com.bigdata.rwstore.RWStore.immediateFree(RWStore.java:2712)
 at com.bigdata.rwstore.RWStore.freeImmediateBlob(RWStore.java:2700)
 at com.bigdata.rwstore.RWStore.immediateFree(RWStore.java:2725)
 at com.bigdata.rwstore.RWStore.immediateFree(RWStore.java:2712)
 at com.bigdata.rwstore.RWStore.freeDeferrals(RWStore.java:5341)
 at com.bigdata.rwstore.RWStore.freeDeferrals(RWStore.java:5411)
 ... 21 more
Caused by: java.lang.IllegalStateException: Error reading from WriteCache addr: 704140099584 length: 12, writeCacheDebug: No WriteCache debug info
 at com.bigdata.rwstore.RWStore.getData(RWStore.java:2321)
 ... 30 more
Caused by: com.bigdata.util.ChecksumError: offset=704140099584,nbytes=16,expected=-154356869,actual=543360410
 at com.bigdata.io.writecache.WriteCacheService._readFromLocalDiskIntoNewHeapByteBuffer(WriteCacheService.java:3783)
 at com.bigdata.io.writecache.WriteCacheService._getRecord(WriteCacheService.java:3598)
 at com.bigdata.io.writecache.WriteCacheService.access$2500(WriteCacheService.java:200)
 at com.bigdata.io.writecache.WriteCacheService$1.compute(WriteCacheService.java:3435)
 at com.bigdata.io.writecache.WriteCacheService$1.compute(WriteCacheService.java:3419)
 at com.bigdata.util.concurrent.Memoizer$1.call(Memoizer.java:77)
 at java.util.concurrent.FutureTask.run(FutureTask.java:266)
 at com.bigdata.util.concurrent.Memoizer.compute(Memoizer.java:92)
 at com.bigdata.io.writecache.WriteCacheService.loadRecord(WriteCacheService.java:3540)
 at com.bigdata.io.writecache.WriteCacheService.read(WriteCacheService.java:3259)
 at com.bigdata.rwstore.RWStore.getData(RWStore.java:2315)
 ... 30 more
SPARQL-UPDATE: updateStr=LOAD <file:///mnt/firehose/munge_on_later_data_set/wikidump-000001304.ttl.gz>
java.util.concurrent.ExecutionException: java.util.concurrent.ExecutionException: org.openrdf.query.UpdateExecutionException: java.lang.RuntimeException: Problem with entry at -83289912769511002: lastRootBlock=rootBlock{ rootBlock=0, challisField=1302, version=3, nextOffset=47806576684846562, localTime=1712147044389 [Wednesday, April 3, 2024 7:24:04 AM CDT], firstCommitTime=1711737574896 [Friday, March 29, 2024 1:39:34 PM CDT], lastCommitTime=1712147041973 [Wednesday, April 3, 2024 7:24:01 AM CDT], commitCounter=1302, commitRecordAddr={off=NATIVE:-140859033,len=422}, commitRecordIndexAddr={off=NATIVE:-93467508,len=220}, blockSequence=34555, quorumToken=-1, metaBitsAddr=26754033649714513, metaStartAddr=11989126, storeType=RW, uuid=f993598d-497c-46a7-8434-d25c8859a0b8, offsetBits=42, checksum=1600335692, createTime=1711737574192 [Friday, March 29, 2024 1:39:34 PM CDT], closeTime=0}
 at java.util.concurrent.FutureTask.report(FutureTask.java:122)
 at java.util.concurrent.FutureTask.get(FutureTask.java:206)
 at com.bigdata.rdf.sail.webapp.BigdataServlet.submitApiTask(BigdataServlet.java:292)
 at com.bigdata.rdf.sail.webapp.QueryServlet.doSparqlUpdate(QueryServlet.java:460)
 at com.bigdata.rdf.sail.webapp.QueryServlet.doPost(QueryServlet.java:241)
 at com.bigdata.rdf.sail.webapp.RESTServlet.doPost(RESTServlet.java:269)
 at com.bigdata.rdf.sail.webapp.MultiTenancyServlet.doPost(MultiTenancyServlet.java:195)
 at javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
 at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
 at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:865)
 at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1655)
 at org.wikidata.query.rdf.blazegraph.throttling.ThrottlingFilter.doFilter(ThrottlingFilter.java:320)
 at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1642)
 at org.wikidata.query.rdf.blazegraph.throttling.SystemOverloadFilter.doFilter(SystemOverloadFilter.java:82)
 at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1642)
 at ch.qos.logback.classic.helpers.MDCInsertingServletFilter.doFilter(MDCInsertingServletFilter.java:50)
 at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1642)
 at org.wikidata.query.rdf.blazegraph.filters.QueryEventSenderFilter.doFilter(QueryEventSenderFilter.java:110)
 at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1642)
 at org.wikidata.query.rdf.blazegraph.filters.ClientIPFilter.doFilter(ClientIPFilter.java:43)
 at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1642)
 at org.wikidata.query.rdf.blazegraph.filters.JWTIdentityFilter.doFilter(JWTIdentityFilter.java:66)
 at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1642)
 at org.wikidata.query.rdf.blazegraph.filters.RealAgentFilter.doFilter(RealAgentFilter.java:33)
 at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1642)
 at org.wikidata.query.rdf.blazegraph.filters.RequestConcurrencyFilter.doFilter(RequestConcurrencyFilter.java:50)
 at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1634)
 at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:533)
 at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:146)
 at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
 at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
 at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:257)
 at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1595)
 at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:255)
 at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1340)
 at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:203)
 at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:473)
 at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1564)
 at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:201)
 at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1242)
 at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:144)
 at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:220)
 at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:126)
 at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
 at org.eclipse.jetty.server.Server.handle(Server.java:503)
 at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:364)
 at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:260)
 at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:305)
 at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:103)
 at org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:118)
 at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:333)
 at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:310)
 at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:168)
 at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:126)
 at org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:366)
 at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:765)
 at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:683)
 at java.lang.Thread.run(Thread.java:750)
Caused by: java.util.concurrent.ExecutionException: org.openrdf.query.UpdateExecutionException: java.lang.RuntimeException: Problem with entry at -83289912769511002: lastRootBlock=rootBlock{ rootBlock=0, challisField=1302, version=3, nextOffset=47806576684846562, localTime=1712147044389 [Wednesday, April 3, 2024 7:24:04 AM CDT], firstCommitTime=1711737574896 [Friday, March 29, 2024 1:39:34 PM CDT], lastCommitTime=1712147041973 [Wednesday, April 3, 2024 7:24:01 AM CDT], commitCounter=1302, commitRecordAddr={off=NATIVE:-140859033,len=422}, commitRecordIndexAddr={off=NATIVE:-93467508,len=220}, blockSequence=34555, quorumToken=-1, metaBitsAddr=26754033649714513, metaStartAddr=11989126, storeType=RW, uuid=f993598d-497c-46a7-8434-d25c8859a0b8, offsetBits=42, checksum=1600335692, createTime=1711737574192 [Friday, March 29, 2024 1:39:34 PM CDT], closeTime=0}
 at java.util.concurrent.FutureTask.report(FutureTask.java:122)
 at java.util.concurrent.FutureTask.get(FutureTask.java:192)
 at com.bigdata.rdf.sail.webapp.QueryServlet$SparqlUpdateTask.call(QueryServlet.java:571)
 at com.bigdata.rdf.sail.webapp.QueryServlet$SparqlUpdateTask.call(QueryServlet.java:472)
 at com.bigdata.rdf.task.ApiTaskForIndexManager.call(ApiTaskForIndexManager.java:68)
 at java.util.concurrent.FutureTask.run(FutureTask.java:266)
 at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
 ... 1 more
Caused by: org.openrdf.query.UpdateExecutionException: java.lang.RuntimeException: Problem with entry at -83289912769511002: lastRootBlock=rootBlock{ rootBlock=0, challisField=1302, version=3, nextOffset=47806576684846562, localTime=1712147044389 [Wednesday, April 3, 2024 7:24:04 AM CDT], firstCommitTime=1711737574896 [Friday, March 29, 2024 1:39:34 PM CDT], lastCommitTime=1712147041973 [Wednesday, April 3, 2024 7:24:01 AM CDT], commitCounter=1302, commitRecordAddr={off=NATIVE:-140859033,len=422}, commitRecordIndexAddr={off=NATIVE:-93467508,len=220}, blockSequence=34555, quorumToken=-1, metaBitsAddr=26754033649714513, metaStartAddr=11989126, storeType=RW, uuid=f993598d-497c-46a7-8434-d25c8859a0b8, offsetBits=42, checksum=1600335692, createTime=1711737574192 [Friday, March 29, 2024 1:39:34 PM CDT], closeTime=0}
 at com.bigdata.rdf.sparql.ast.eval.ASTEvalHelper.executeUpdate(ASTEvalHelper.java:1080)
 at com.bigdata.rdf.sail.BigdataSailUpdate.execute2(BigdataSailUpdate.java:152)
133 at com.bigdata.rdf.sail.BigdataSailUpdate.execute2(BigdataSailUpdate.java:152)
134 at com.bigdata.rdf.sail.webapp.BigdataRDFContext$UpdateTask.doQuery(BigdataRDFContext.java:2003)
135 at com.bigdata.rdf.sail.webapp.BigdataRDFContext$AbstractQueryTask.innerCall(BigdataRDFContext.java:1579)
136 at com.bigdata.rdf.sail.webapp.BigdataRDFContext$AbstractQueryTask.call(BigdataRDFContext.java:1544)
137 at com.bigdata.rdf.sail.webapp.BigdataRDFContext$AbstractQueryTask.call(BigdataRDFContext.java:757)
138 ... 4 more
139Caused by: java.lang.RuntimeException: Problem with entry at -83289912769511002: lastRootBlock=rootBlock{ rootBlock=0, challisField=1302, version=3, nextOffset=47
140806576684846562, localTime=1712147044389 [Wednesday, April 3, 2024 7:24:04 AM CDT], firstCommitTime=1711737574896 [Friday, March 29, 2024 1:39:34 PM CDT], lastCom
141mitTime=1712147041973 [Wednesday, April 3, 2024 7:24:01 AM CDT], commitCounter=1302, commitRecordAddr={off=NATIVE:-140859033,len=422}, commitRecordIndexAddr={off=
142NATIVE:-93467508,len=220}, blockSequence=34555, quorumToken=-1, metaBitsAddr=26754033649714513, metaStartAddr=11989126, storeType=RW, uuid=f993598d-497c-46a7-8434
143-d25c8859a0b8, offsetBits=42, checksum=1600335692, createTime=1711737574192 [Friday, March 29, 2024 1:39:34 PM CDT], closeTime=0}
144 at com.bigdata.journal.AbstractJournal.commit(AbstractJournal.java:3134)
145 at com.bigdata.rdf.store.LocalTripleStore.commit(LocalTripleStore.java:99)
146 at com.bigdata.rdf.sail.BigdataSail$BigdataSailConnection.commit2(BigdataSail.java:3932)
147 at com.bigdata.rdf.sail.BigdataSailRepositoryConnection.commit2(BigdataSailRepositoryConnection.java:330)
148 at com.bigdata.rdf.sparql.ast.eval.AST2BOpUpdate.convertCommit(AST2BOpUpdate.java:375)
149 at com.bigdata.rdf.sparql.ast.eval.AST2BOpUpdate.convertUpdate(AST2BOpUpdate.java:321)
150 at com.bigdata.rdf.sparql.ast.eval.ASTEvalHelper.executeUpdate(ASTEvalHelper.java:1072)
151 ... 9 more
152Caused by: java.lang.RuntimeException: Problem with entry at -83289912769511002
153 at com.bigdata.rwstore.RWStore.freeDeferrals(RWStore.java:5425)
154 at com.bigdata.rwstore.RWStore.checkDeferredFrees(RWStore.java:3896)
155 at com.bigdata.journal.RWStrategy.checkDeferredFrees(RWStrategy.java:780)
156 at com.bigdata.journal.AbstractJournal$CommitState.writeCommitRecord(AbstractJournal.java:3500)
157 at com.bigdata.journal.AbstractJournal$CommitState.access$2800(AbstractJournal.java:3301)
158 at com.bigdata.journal.AbstractJournal.commitNow(AbstractJournal.java:4111)
159 at com.bigdata.journal.AbstractJournal.commit(AbstractJournal.java:3132)
160 ... 15 more
161Caused by: java.lang.RuntimeException: addr=-151834637 : cause=java.lang.IllegalStateException: Error reading from WriteCache addr: 704140099584 length: 12, write
162CacheDebug: No WriteCache debug info
163 at com.bigdata.rwstore.RWStore.getData(RWStore.java:2399)
164 at com.bigdata.rwstore.RWStore.getData(RWStore.java:2120)
165 at com.bigdata.rwstore.RWStore.freeImmediateBlob(RWStore.java:2685)
166 at com.bigdata.rwstore.RWStore.immediateFree(RWStore.java:2725)
167 at com.bigdata.rwstore.RWStore.immediateFree(RWStore.java:2712)
168 at com.bigdata.rwstore.RWStore.freeImmediateBlob(RWStore.java:2700)
169 at com.bigdata.rwstore.RWStore.immediateFree(RWStore.java:2725)
170 at com.bigdata.rwstore.RWStore.immediateFree(RWStore.java:2712)
171 at com.bigdata.rwstore.RWStore.freeDeferrals(RWStore.java:5341)
172 at com.bigdata.rwstore.RWStore.freeDeferrals(RWStore.java:5411)
173 ... 21 more
174Caused by: java.lang.IllegalStateException: Error reading from WriteCache addr: 704140099584 length: 12, writeCacheDebug: No WriteCache debug info
175 at com.bigdata.rwstore.RWStore.getData(RWStore.java:2321)
176 ... 30 more
177Caused by: com.bigdata.util.ChecksumError: offset=704140099584,nbytes=16,expected=-154356869,actual=543360410
178 at com.bigdata.io.writecache.WriteCacheService._readFromLocalDiskIntoNewHeapByteBuffer(WriteCacheService.java:3783)
179 at com.bigdata.io.writecache.WriteCacheService._getRecord(WriteCacheService.java:3598)
180 at com.bigdata.io.writecache.WriteCacheService.access$2500(WriteCacheService.java:200)
181 at com.bigdata.io.writecache.WriteCacheService$1.compute(WriteCacheService.java:3435)
182 at com.bigdata.io.writecache.WriteCacheService$1.compute(WriteCacheService.java:3419)
183 at com.bigdata.util.concurrent.Memoizer$1.call(Memoizer.java:77)
184 at java.util.concurrent.FutureTask.run(FutureTask.java:266)
185 at com.bigdata.util.concurrent.Memoizer.compute(Memoizer.java:92)
186 at com.bigdata.io.writecache.WriteCacheService.loadRecord(WriteCacheService.java:3540)
187 at com.bigdata.io.writecache.WriteCacheService.read(WriteCacheService.java:3259)
188 at com.bigdata.rwstore.RWStore.getData(RWStore.java:2315)
189 ... 30 more
Wed Apr  3 02:05:26 PM CDT 2024
Processing wikidump-000001305.ttl.gz
SPARQL-UPDATE: updateStr=LOAD <file:///mnt/firehose/munge_on_later_data_set/wikidump-000001305.ttl.gz>
java.util.concurrent.ExecutionException: java.util.concurrent.ExecutionException: org.openrdf.query.UpdateExecutionException: java.lang.RuntimeException: Problem with entry at -83289912769511002: lastRootBlock=rootBlock{ rootBlock=0, challisField=1302, version=3, nextOffset=47806576684846562, localTime=1712147044389 [Wednesday, April 3, 2024 7:24:04 AM CDT], firstCommitTime=1711737574896 [Friday, March 29, 2024 1:39:34 PM CDT], lastCommitTime=1712147041973 [Wednesday, April 3, 2024 7:24:01 AM CDT], commitCounter=1302, commitRecordAddr={off=NATIVE:-140859033,len=422}, commitRecordIndexAddr={off=NATIVE:-93467508,len=220}, blockSequence=34555, quorumToken=-1, metaBitsAddr=26754033649714513, metaStartAddr=11989126, storeType=RW, uuid=f993598d-497c-46a7-8434-d25c8859a0b8, offsetBits=42, checksum=1600335692, createTime=1711737574192 [Friday, March 29, 2024 1:39:34 PM CDT], closeTime=0}

Following roughly the procedure in P54284 to rename the Spark-produced graph files (updating loadData.sh with FORMAT=part-%05d-46f26ac6-0b21-4832-be79-d7c8709f33fb-c000.ttl.gz and keeping a date call after each curl in it), I kicked off an import of the scholarly article entity graph as follows, to see how it goes with a buffer capacity of 100000:

ubuntu22:~/rdf/dist/target/service-0.3.138-SNAPSHOT$ date; time ./loadData.sh -n wdq -d /mnt/firehose/split_0/nt_wd_schol -s 0 -e 0 2>&1 | tee loadData.log; time ./loadData.sh -n wdq -d /mnt/firehose/split_0/nt_wd_schol 2>&1 | tee -a loadData.log
Wed Apr  3 09:32:54 PM CDT 2024
Processing part-00000-46f26ac6-0b21-4832-be79-d7c8709f33fb-c000.ttl.gz
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd"><html><head><meta http-equiv="Content-Type" content="text&#47;html;charset=UTF-8"><title>blazegraph&trade; by SYSTAP</title
></head
><body<p>totalElapsed=55629ms, elapsed=55584ms, connFlush=0ms, batchResolve=0, whereClause=0ms, deleteClause=0ms, insertClause=0ms</p
><hr><p>COMMIT: totalElapsed=61598ms, commitTime=1712198035155, mutationCount=7349689</p
></html
>Wed Apr  3 09:33:56 PM CDT 2024

real    1m1.702s
user    0m0.004s
sys     0m0.006s
Processing part-00001-46f26ac6-0b21-4832-be79-d7c8709f33fb-c000.ttl.gz
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd"><html><head><meta http-equiv="Content-Type" content="text&#47;html;charset=UTF-8"><title>blazegraph&trade; by SYSTAP</title
></head
><body<p>totalElapsed=61251ms, elapsed=61251ms, connFlush=0ms, batchResolve=0, whereClause=0ms, deleteClause=0ms, insertClause=0ms</p
><hr><p>COMMIT: totalElapsed=71925ms, commitTime=1712198106800, mutationCount=7774048</p
></html
>Wed Apr  3 09:35:08 PM CDT 2024
Processing part-00002-46f26ac6-0b21-4832-be79-d7c8709f33fb-c000.ttl.gz

This is with the following values in RWStore.properties

com.bigdata.btree.writeRetentionQueue.capacity=4000
com.bigdata.rdf.sail.bufferCapacity=100000

and the following variable in loadData.sh

HEAP_SIZE=${HEAP_SIZE:-"31g"}
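
For anyone reproducing this, the essence of what loadData.sh does per file is roughly the following - a simplified sketch for illustration, not the actual script from the rdf repo. The endpoint URL is an assumption; the namespace, directory, and FORMAT values are the ones used above.

#!/bin/bash
# Simplified sketch of the loadData.sh loop, for illustration only.
NAMESPACE=wdq
DATA_DIR=/mnt/firehose/split_0/nt_wd_schol
FORMAT='part-%05d-46f26ac6-0b21-4832-be79-d7c8709f33fb-c000.ttl.gz'
# Assumed local Blazegraph SPARQL endpoint; adjust to the actual deployment.
ENDPOINT="http://localhost:9999/bigdata/namespace/${NAMESPACE}/sparql"

i=0
while true; do
  f=$(printf "$FORMAT" "$i")
  if [ ! -f "${DATA_DIR}/${f}" ]; then
    echo "File ${DATA_DIR}/${f} not found, terminating"
    break
  fi
  echo "Processing ${f}"
  # Each file is loaded via a SPARQL UPDATE pointing at the local gzipped Turtle file.
  curl -s -X POST "${ENDPOINT}" --data-urlencode "update=LOAD <file://${DATA_DIR}/${f}>"
  date   # the date call after each curl mentioned above
  i=$((i + 1))
done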

Just updating on how far along this run is: file 550 of the scholarly article entity side of the graph is being processed. There are files 0 through 1023 for this side of the graph. Note that I did think to tee the output this time around, so there's hopefully more info available for reviewing output, stack traces (although hopefully there are none), and so on, should it be needed.

Processing part-00549-46f26ac6-0b21-4832-be79-d7c8709f33fb-c000.ttl.gz
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd"><html><head><meta http-equiv="Content-Type" content="text&#47;html;charset=UTF-8"><title>blazegraph&trade; by SYSTAP</title
></head
><body<p>totalElapsed=299675ms, elapsed=299675ms, connFlush=0ms, batchResolve=0, whereClause=0ms, deleteClause=0ms, insertClause=0ms</p
><hr><p>COMMIT: totalElapsed=392531ms, commitTime=1712329890306, mutationCount=7032172</p
></html
>Fri Apr  5 10:11:32 AM CDT 2024
Processing part-00550-46f26ac6-0b21-4832-be79-d7c8709f33fb-c000.ttl.gz
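
Since the output is tee'd to loadData.log, rough progress can be pulled out of the COMMIT lines. A couple of example one-liners for that, assuming the response format shown above (with mutationCount= and COMMIT: totalElapsed= fields):

# Number of files committed so far and the running total of mutations (roughly the triples loaded).
grep -o 'mutationCount=[0-9]\+' loadData.log | cut -d= -f2 \
  | awk '{ n++; total += $1 } END { printf "files committed: %d, mutations: %d\n", n, total }'

# Wall-clock commit time per file, in seconds.
grep -o 'COMMIT: totalElapsed=[0-9]\+' loadData.log | grep -o '[0-9]\+' \
  | awk '{ printf "file %d: %.1f s\n", NR - 1, $1 / 1000 }'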

Sidebar: the "non"-scholarly article entity graph also has files 0-1023 and is similarly sized in terms of triples, but the way its nodes are interconnected naturally differs because of the types of entities involved, the kind of data those entities carry, and so on.

Update. On the gaming-class machine it took about 3.25 days to import the scholarly article entity graph with a buffer capacity of 100000 (compare this with 5.875 days on wdqs1024). This resulted in 7_643_858_078 triples, as expected. Next up will be a run with a buffer capacity of 1000000 to see if there is any obvious difference in import time.

>Sun Apr  7 03:34:59 AM CDT 2024
Processing part-01023-46f26ac6-0b21-4832-be79-d7c8709f33fb-c000.ttl.gz
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd"><html><head><meta http-equiv="Content-Type" content="text&#47;html;charset=UTF-8"><title>blazegraph&trade; by SYSTAP</title
></head
><body<p>totalElapsed=181901ms, elapsed=181901ms, connFlush=0ms, batchResolve=0, whereClause=0ms, deleteClause=0ms, insertClause=0ms</p
><hr><p>COMMIT: totalElapsed=226511ms, commitTime=1712479122009, mutationCount=7946575</p
></html
>Sun Apr  7 03:38:46 AM CDT 2024
File /mnt/firehose/split_0/nt_wd_schol/part-01024-46f26ac6-0b21-4832-be79-d7c8709f33fb-c000.ttl.gz not found, terminating

real    4684m49.905s
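
As a sanity check on counts like the 7_643_858_078 figure above, the triple total can also be queried directly from the local endpoint (endpoint URL assumed as in the sketch earlier; a full count like this is slow on a graph of this size):

curl -s "http://localhost:9999/bigdata/namespace/wdq/sparql" \
  --data-urlencode 'query=SELECT (COUNT(*) AS ?triples) WHERE { ?s ?p ?o }' \
  -H 'Accept: text/csv'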

With bufferCapacity at 1000000, kicked it off again with the scholarly article entity graph files:

ubuntu22:~/rdf/dist/target/service-0.3.138-SNAPSHOT$ date | tee loadData.log; time ./loadData.sh -n wdq -d /mnt/firehose/split_0/nt_wd_schol -s 0 -e 0 2>&1 | tee -a loadData.log; time ./loadData.sh -n wdq -d /mnt/firehose/split_0/nt_wd_schol 2>&1 | tee -a loadData.log
Sun Apr  7 12:03:19 PM CDT 2024
Processing part-00000-46f26ac6-0b21-4832-be79-d7c8709f33fb-c000.ttl.gz

Update: with the buffer capacity at 1000000, file number 550 of the scholarly graph had been imported as of Mon Apr 8 03:22:08 PM CDT 2024. So, under 28 hours so far (with the buffer capacity at 100000, it was more than 36 hours).

Processing part-00550-46f26ac6-0b21-4832-be79-d7c8709f33fb-c000.ttl.gz
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd"><html><head><meta http-equiv="Content-Type" content="text&#47;html;charset=UTF-8"><title>blazegraph&trade; by SYSTAP</title
></head
><body<p>totalElapsed=51018ms, elapsed=51018ms, connFlush=0ms, batchResolve=0, whereClause=0ms, deleteClause=0ms, insertClause=0ms</p
><hr><p>COMMIT: totalElapsed=245278ms, commitTime=1712607725882, mutationCount=7414497</p
></html
>Mon Apr  8 03:22:08 PM CDT 2024

Will update when it completes.

Good news. With the N-triples style scholarly entity graph files, a buffer capacity of 1000000, a write retention queue capacity of 4000, and a heap size of 31g, the import took about 2.40 days on the gaming-class desktop. Recall that with a buffer capacity of 100000 it took about 3.25 days on this desktop (and, again, 5.875 days on wdqs1024). So, there was about a 35% speed increase (3.25 / 2.40 = 1.35; 1.35 minus 1 = 0.35) with the higher buffer capacity here on this gaming-class desktop.

It appears, then, that the combination of a faster CPU, an NVMe, and a higher buffer capacity is somewhere around 144% faster (5.875 / 2.40 = 2.44; 2.44 minus 1 = 1.44) than what we observed on a target data center machine.
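
Spelling out that arithmetic (just a quick check of the two ratios, using the measured day counts above):

# 2.40 days (desktop, bufferCapacity=1000000), 3.25 days (desktop, bufferCapacity=100000), 5.875 days (wdqs1024).
awk 'BEGIN {
  printf "vs wdqs1024:                 %.1f%% faster (5.875 / 2.40 = %.2f)\n", (5.875 / 2.40 - 1) * 100, 5.875 / 2.40;
  printf "vs bufferCapacity of 100000: %.1f%% faster (3.25 / 2.40 = %.2f)\n",  (3.25  / 2.40 - 1) * 100, 3.25  / 2.40;
}'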

It will likely be somewhat less dramatic at 10B triples, if the previous munged-file runs are any clue. I'm going to think about how to check this notion - it could be done by loading the scholarly graph plus a portion of the main graph, which would probably be close enough for our purposes.

A high-speed NVMe is being acquired so that we can verify on wdqs2024 the level of speedup achievable on a server similar to the ones used for the graph split tests. At present, wdqs2024 has a hardware profile similar to wdqs1024.
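
For the record, one way to confirm whether a host is actually running on NVMe versus SATA SSD once the drive is in (standard lsblk and nvme-cli usage; nothing specific to our hosts is assumed):

# TRAN shows the transport (nvme vs sata); ROTA=0 means non-rotational (SSD/NVMe).
lsblk -d -o NAME,MODEL,SIZE,TRAN,ROTA

# NVMe devices also show up under the nvme CLI, if installed.
sudo nvme list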

Some stuff from the terminal from the import on the gaming-class desktop:

ubuntu22:~$ head -9 ~/rdf/dist/target/service-0.3.138-SNAPSHOT/loadData.log
Sun Apr  7 12:03:19 PM CDT 2024
Processing part-00000-46f26ac6-0b21-4832-be79-d7c8709f33fb-c000.ttl.gz
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd"><html><head><meta http-equiv="Content-Type" content="text&#47;html;charset=UTF-8"><title>blazegraph&trade; by SYSTAP</title
></head
><body<p>totalElapsed=64069ms, elapsed=64024ms, connFlush=0ms, batchResolve=0, whereClause=0ms, deleteClause=0ms, insertClause=0ms</p
><hr><p>COMMIT: totalElapsed=71897ms, commitTime=1712509470732, mutationCount=7349689</p
></html
>Sun Apr  7 12:04:31 PM CDT 2024
Processing part-00001-46f26ac6-0b21-4832-be79-d7c8709f33fb-c000.ttl.gz

# screen output at the end:

Processing part-01023-46f26ac6-0b21-4832-be79-d7c8709f33fb-c000.ttl.gz
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd"><html><head><meta http-equiv="Content-Type" content="text&#47;html;charset=UTF-8"><title>blazegraph&trade; by SYSTAP</title
></head
><body<p>totalElapsed=51703ms, elapsed=51703ms, connFlush=0ms, batchResolve=0, whereClause=0ms, deleteClause=0ms, insertClause=0ms</p
><hr><p>COMMIT: totalElapsed=181013ms, commitTime=1712716306763, mutationCount=7946575</p
></html
>Tue Apr  9 09:31:50 PM CDT 2024
File /mnt/firehose/split_0/nt_wd_schol/part-01024-46f26ac6-0b21-4832-be79-d7c8709f33fb-c000.ttl.gz not found, terminating

real    3447m18.542s

In T362920: Benchmark Blazegraph import with increased buffer capacity (and other factors) we saw that this took about 3702 minutes, or about 2.57 days, for the scholarly article entity graph with the CPU governor change (described in T336443#9726600) alone on wdqs2023.

And for the second run in T362920: Benchmark Blazegraph import with increased buffer capacity (and other factors) we saw that this took about 3089 minutes, or about 2.15 days, for the scholarly article entity graph with the CPU governor change (described in T336443#9726600) plus the bufferCapacity at 1000000 on wdqs2023.

On the gaming-class 2018 desktop, although the bufferCapacity value of 1000000 sped things up as described earlier in this ticket, applying the CPU governor change did not seem to provide any additional benefit (that run took 2.47 days, compared to the previous record of 2.44). It's possible that the desktop's existing BIOS configuration (which was already set to a high performance mode) was already squeezing out optimal performance, or that something about the processor architecture's interaction with the rest of the hardware and operating system is simply different from the data center server. In any case, it's nice to see that the data center server is faster!
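
For reference, the usual way to check and set the governor on these machines looks like the following (generic cpufreq usage; the exact change applied in T336443#9726600 is described there and may differ in detail):

# Show the current governor for each core.
cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor | sort | uniq -c

# Switch all cores to the performance governor for the current boot.
echo performance | sudo tee /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor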

One of my theories is that the gaming-class desktop's 64GB of total RAM may play some role, but the hardware provider has indicated that although more memory can be physically installed, the machine will only run with 64GB and can't jump to 128GB. Another is that the default memory swappiness (60) on the gaming-class desktop could play a role. However, I find this less likely, as memory spikes haven't seemed to be a problem on this machine while loading data, and the drive is an NVMe, so paging is less likely to manifest problematically anyway. Maybe something to check another day, as we generally use a swappiness of 0 in the data center (including on the WDQS hosts).
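
Checking or pinning swappiness, should we want to rule it out on the desktop, is standard sysctl usage:

# Current value (Ubuntu defaults to 60; the data center hosts use 0).
cat /proc/sys/vm/swappiness

# Set it to 0 for the current boot.
sudo sysctl vm.swappiness=0

# Persist across reboots.
echo 'vm.swappiness=0' | sudo tee /etc/sysctl.d/99-swappiness.conf
sudo sysctl --system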

I remembered I should post this:

For the "main" graph load on the desktop gaming-class 2018 desktop with bufferCapacity at 1000000:

Start time: Tue May 28 04:36:30 PM CDT 2024
Finish time: Fri May 31 03:43:25 AM CDT 2024

The first file (ends with 0000) was reported with the following time:

totalElapsed=71424ms

The remaining files (from 00001 to the end - part-01023-c261bb68-4091-4613-ae52-88ce97d22c14-c000.ttl.gz) were timed as follows:

real 3545m44.021s

So, 2.463 days.

The command used was

date | tee loadDataMAIN.log; time ./loadData.sh -n wdq -d /mnt/firehose/split_0/nt_wd_main -s 0 -e 0 2>&1 | tee -a loadDataMAIN.log; time ./loadData.sh -n wdq -d /mnt/firehose/split_0/nt_wd_main 2>&1 | tee -a loadDataMAIN.log