
Investigate: Alternatives or improvements to data import method
Closed, ResolvedPublic13 Estimated Story Points

Description

init-db.js expects a gzipped file at the FEED_PATH and reads it as a stream. Every line is transformed and then written into the db.

  • As far as I know, this is the best-practice way of reading the file, which uncompressed is about 4GB and cannot be held in memory in its entirety.
  • It's being done asynchronously, which hits a known problem with line reader: the stream closes when it finishes pushing lines through the pipe, not when all the pipes are done resolving. This is currently solved/hacked around via the closeConnectionWhenDone function.
  • Regardless, at around 100k lines the process kills itself without any additional warning. I'm assuming this is because it's running out of memory (quick research suggests possibly a backpressure problem).

Please investigate:

  • If there's a better way to batch import this data. There are expected to be millions of lines.
  • If the stream is the best practice, then how can the implementation be improved so it doesn't fail?
  • Is there a better way to implement the stream?
  • How will the import deal with errors and retries?

And as a stretch goal:

  • How long will the entire import take? I think this is important to know because it seems weird to be importing data as we're deprecating it if importing takes a very long time. The estimated completion time might have an impact on our scheduling frequency.

Event Timeline

Restricted Application added a subscriber: Aklapper. · View Herald Transcript · Dec 20 2022, 1:38 PM
Niharika set the point value for this task to 13.Dec 20 2022, 5:39 PM
Niharika subscribed.

Flagging that there is a possible risk that we can't find a good solution. Node should be able to handle this, ideally.

Some notes while I work through it:

  • As is, at around 500k lines the process kills itself. I thought it was the stream OOMing, but watching the db insert rows, I think there are too many connections to it. After kicking off 100k writes, it takes time to write all 100k to the db.
  • Streaming only the lines without taking action started at :20 and ended at :25, failing at 7M lines when MariaDB closed the connection despite no queries being sent.
  • Streaming only the lines without taking action and without connecting to the db started at :25 and finished at :36, running through 17M lines.
  • Roughly batching it 100k at a time, 2.5M lines took 45 minutes, and it died at 3M when MariaDB seemingly closed the connection (Error: read ECONNRESET).

Seems like the issue is MariaDB's connection; I'll investigate that next.
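A hedged sketch of what "roughly batching it 100k at a time" could look like: accumulate parsed rows and flush them in one multi-value INSERT per batch instead of one query per line. The table and column names here are invented for illustration, not the real schema:

```javascript
// Illustrative batcher: buffer rows and flush them as one
// multi-value INSERT. Table/column names are made up.
const BATCH_SIZE = 100000;

function makeBatcher(conn, batchSize = BATCH_SIZE) {
  let rows = [];
  return {
    async add(row) {
      rows.push(row);
      if (rows.length >= batchSize) await this.flush();
    },
    async flush() {
      if (rows.length === 0) return;
      // one placeholder group per buffered row
      const placeholders = rows.map(() => '(?, ?)').join(', ');
      const values = rows.flatMap(r => [r.ip, r.risk]);
      await conn.query(
        `INSERT INTO actor_data (ip, risk) VALUES ${placeholders}`,
        values
      );
      rows = [];
    },
  };
}
```

A final `flush()` call after the stream ends is needed to write the last partial batch.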

> Some notes while I work through it:
>
>   • as is, around 500k the process kills itself. I thought it was the stream ooming but watching the db insert rows, I think there are too many connections to it. After kicking off 100k writes, it takes time to write all 100k to the db.
>   • Streaming only the lines without taking action, started at :20 and ended at :25 with a fail at 7M lines when mariaDB closed the connection despite not sending any queries
>   • Streaming only the lines without taking action and without connecting to the db, started at :25, finished at :36, ran through 17M lines
>   • Roughly batching it 100k at a time, 2.5M lines takes 45 minutes and died at 3M when it seems like mariaDB closed the connection (Error: read ECONNRESET)
>
> Seems like the issue is mariaDB's connection and I'll investigate that next.

@STran, are these observations relating to the existing init-db.js script, or do you have a work-in-progress patch that implements the batching logic? Are you using the docker-compose.yml environment for the application and MariaDB?

In T305114#8841656, @Marostegui suggests batching deletes in groups of 1K, maybe we need to experiment with smaller batches for INSERT. Are you using multi-value inserts?

One thing I was wondering about for the overall workflow, would it work if we did something like:

  • download the latest feed data and decompress
  • SELECT all items in the database (~178 million) and iterate over each row, check to see if the DB entry exists in the feed data file.
    • If the database entry exists in the feed data file, UPDATE the database entry based on the relevant line in the feed data file, and delete the line from the feed data file
    • If the database entry does not exist in the feed data file, DELETE the database entry, and delete the line from the feed data file
  • Any remaining items in the feed data file (90% should have been handled by UPDATE/DELETE) are the INSERT operations

With this approach, we would not need a separate script that deletes records based on examining the last_updated column. That's nice in case the import script fails for some reason, but the delete script continues, resulting in an empty database.

Change 921035 had a related patch set uploaded (by STran; author: STran):

[wikimedia/security/security-api@master] Refactor init-db

https://gerrit.wikimedia.org/r/921035

Change 921036 had a related patch set uploaded (by STran; author: STran):

[wikimedia/security/security-api@master] Better accomodate range of possible data

https://gerrit.wikimedia.org/r/921036

Change 921037 had a related patch set uploaded (by STran; author: STran):

[wikimedia/security/security-api@master] [WIP] Process data in stream

https://gerrit.wikimedia.org/r/921037

Solved a node problem; currently stuck on a connection problem. The node-side problem happened because of an implementation error: I took the data out of the pipe to write it to the db when I should have kept it as part of the pipe. Since writes were slower than the data being pushed to them, this caused a backpressure problem, eating up memory and eventually killing the process. Moving the write back into the pipe allows node to manage data throughput (as it should) and is probably best practice anyway.

I let it run and the good news is that the memory problems seem to have been solved. Usage didn't go above ~300MB while I was watching it and the pace of writes kept up with the data being pushed to the db. However, it eventually reached 9M lines before hanging for no immediately obvious reason. The problem didn't seem to be node-side and I gathered a few of these logs from the db:

2023-05-18  7:44:23 3 [Warning] Aborted connection 3 to db: 'test' user: 'root' host: '172.30.0.1' (Got an error reading communication packets)

Subsequent attempts to reconnect to the database fail. Trying to connect and ping fails:

SqlError: (conn=3, no: 45042, SQLState: 0A000) Ping timeout
    at Object.module.exports.createError (/srv/service/node_modules/mariadb/lib/misc/errors.js:61:10)
    at Timeout._onTimeout (/srv/service/node_modules/mariadb/lib/connection.js:277:20)
    at listOnTimeout (node:internal/timers:559:17)
    at processTimers (node:internal/timers:502:7) {
  text: 'Ping timeout',
  sql: null,
  fatal: true,
  errno: 45042,
  sqlState: '0A000',
  code: 'ER_PING_TIMEOUT'
}

I've restarted the service and the server and still can't connect. I checked the size of the database and it's only about 1.5GB, which doesn't seem like an unreasonable size, and I can connect to the database from the command line just fine.

Nothing...seems like it's choking? Here are some logs that may help:

MariaDB [test]> show status where `variable_name` = 'Threads_connected';
+-------------------+-------+
| Variable_name     | Value |
+-------------------+-------+
| Threads_connected | 1     |
+-------------------+-------+
1 row in set (0.001 sec)
MariaDB [test]> SHOW PROCESSLIST;
+----+------+-----------+------+---------+------+----------+------------------+----------+
| Id | User | Host      | db   | Command | Time | State    | Info             | Progress |
+----+------+-----------+------+---------+------+----------+------------------+----------+
| 13 | root | localhost | test | Query   |    0 | starting | SHOW PROCESSLIST |    0.000 |
+----+------+-----------+------+---------+------+----------+------------------+----------+
1 row in set (0.000 sec)
MariaDB [test]> SHOW ENGINE INNODB STATUS;
=====================================
2023-05-18 08:03:18 0x7fcf24090640 INNODB MONITOR OUTPUT
=====================================
Per second averages calculated from the last 7 seconds
-----------------
BACKGROUND THREAD
-----------------
srv_master_thread loops: 0 srv_active, 0 srv_shutdown, 1146 srv_idle
srv_master_thread log flush and writes: 1146
----------
SEMAPHORES
----------
------------
TRANSACTIONS
------------
Trx id counter 176703396
Purge done for trx's n:o < 176703396 undo n:o < 0 state: running but idle
History list length 0
LIST OF TRANSACTIONS FOR EACH SESSION:
---TRANSACTION (0x7fcf24972b80), not started
0 lock struct(s), heap size 1128, 0 row lock(s)
--------
FILE I/O
--------
Pending flushes (fsync): 0
36226 OS file reads, 31096 OS file writes, 6760 OS fsyncs
0.00 reads/s, 0 avg bytes/read, 0.00 writes/s, 0.00 fsyncs/s
-------------------------------------
INSERT BUFFER AND ADAPTIVE HASH INDEX
-------------------------------------
Ibuf: size 1, free list len 0, seg size 2, 0 merges
merged operations:
 insert 0, delete mark 0, delete 0
discarded operations:
 insert 0, delete mark 0, delete 0
0.00 hash searches/s, 0.00 non-hash searches/s
---
LOG
---
Log sequence number 24495165053
Log flushed up to   24495165053
Pages flushed up to 24430187298
Last checkpoint at  24430187298
----------------------
BUFFER POOL AND MEMORY
----------------------
Total large memory allocated 167772160
Dictionary memory allocated 871192
Buffer pool size   8064
Free buffers       1
Database pages     8063
Old database pages 2956
Modified db pages  3133
Percent of dirty pages(LRU & free pages): 38.847
Max dirty pages percent: 90.000
Pending reads 0
Pending writes: LRU 0, flush list 0
Pages made young 5254, not young 24012147
0.00 youngs/s, 0.00 non-youngs/s
Pages read 36204, created 11942, written 25012
0.00 reads/s, 0.00 creates/s, 0.00 writes/s
No buffer pool page gets since the last printout
Pages read ahead 0.00/s, evicted without access 0.00/s, Random read ahead 0.00/s
LRU len: 8063, unzip_LRU len: 0
I/O sum[0]:cur[0], unzip sum[0]:cur[0]
--------------
ROW OPERATIONS
--------------
0 read views open inside InnoDB
Process ID=0, Main thread ID=0, state: sleeping
Number of rows inserted 5112, updated 0, deleted 4699492, read 4699492
0.00 inserts/s, 0.00 updates/s, 0.00 deletes/s, 0.00 reads/s
Number of system rows inserted 0, updated 0, deleted 0, read 0
0.00 inserts/s, 0.00 updates/s, 0.00 deletes/s, 0.00 reads/s
----------------------------
END OF INNODB MONITOR OUTPUT
============================
1 row in set (0.000 sec)

I'm currently testing with a pool instead of a long-running single connection and seeing if that improves the issue.
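The pool experiment could look something like the following. The wrapper is written against an injected pool object so it can be tested; in the real script the pool would come from the mariadb driver's `createPool`, with placeholder connection details:

```javascript
// Sketch: acquire a connection from a pool for each unit of work and
// always release it, instead of holding one long-running connection.
function makeWithConnection(pool) {
  return async function withConnection(fn) {
    const conn = await pool.getConnection();
    try {
      return await fn(conn);
    } finally {
      conn.release(); // always return the connection to the pool
    }
  };
}

// Usage (sketch, assuming the mariadb driver; host/user/db are placeholders):
// const mariadb = require('mariadb');
// const pool = mariadb.createPool({
//   host: '127.0.0.1', user: 'root', database: 'test',
//   connectionLimit: 5, // bounded concurrency
// });
// const withConnection = makeWithConnection(pool);
// await withConnection(conn => conn.query('SELECT 1'));
```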

@kostajh

> @STran, are these observations relating to the existing init-db.js script, or do you have a work-in-progress patch that implements the batching logic? Are you using the docker-compose.yml environment for the application and MariaDB?

  • Existing, with me working on improvements
  • Yes, using docker-compose.yml

> In T305114#8841656, @Marostegui suggests batching deletes in groups of 1K, maybe we need to experiment with smaller batches for INSERT. Are you using multi-value inserts?

No, because each row requires 4 queries:

  1. insert actor (return id)
  2. insert proxies (with actor id as an informal pk)
  3. insert behaviors (with actor id as an informal pk)
  4. insert tunnels (with actor id as an informal pk)
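The four dependent queries above could look roughly like this (table/column names are illustrative, not the real schema). The actor insert returns an id that the other three inserts reference, which is why one feed line can't be folded into a single multi-value INSERT; a transaction keeps the row atomic:

```javascript
// Sketch: one feed row = four dependent inserts inside a transaction.
async function insertFeedRow(conn, row) {
  await conn.beginTransaction();
  try {
    const res = await conn.query(
      'INSERT INTO actor (ip) VALUES (?)', [row.ip]
    );
    const actorId = res.insertId; // informal pk for the dependent tables
    await conn.query(
      'INSERT INTO proxies (actor_id, proxy) VALUES (?, ?)',
      [actorId, row.proxy]
    );
    await conn.query(
      'INSERT INTO behaviors (actor_id, behavior) VALUES (?, ?)',
      [actorId, row.behavior]
    );
    await conn.query(
      'INSERT INTO tunnels (actor_id, tunnel) VALUES (?, ?)',
      [actorId, row.tunnel]
    );
    await conn.commit();
  } catch (err) {
    await conn.rollback(); // a row is either fully imported or not at all
    throw err;
  }
}
```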

> One thing I was wondering about for the overall workflow, would it work if we did something like:
>
>   • download the latest feed data and decompress
>   • SELECT all items in the database (~178 million) and iterate over each row, check to see if the DB entry exists in the feed data file.
>     • If the database entry exists in the feed data file, UPDATE the database entry based on the relevant line in the feed data file, and delete the line from the feed data file
>     • If the database entry does not exist in the feed data file, DELETE the database entry, and delete the line from the feed data file
>   • Any remaining items in the feed data file (90% should have been handled by UPDATE/DELETE) are the INSERT operations
>
> With this approach, we would not need a separate script that deletes records based on examining the last_updated column. That's nice in case the import script fails for some reason, but the delete script continues, resulting in an empty database.

I personally wouldn't, but could be convinced otherwise.

  1. I wouldn't edit a file while it's being read given that it's the source of truth. I could see writing to a new file to work off of.
  2. How would you propose checking against the feed file? By grepping? I'm not sure I want to do command line things from node.
  3. It sounds like at the end you'd still have to loop over a file large enough to cause at-scale problems to insert/update, so I think we might as well do it in one loop instead of multiple loops.

If the concern is that a delete script running on an independent schedule from the import script could delete the database, maybe we could instead initiate the delete portion on successful completion of the import script. Streams have ways to emit finished events. That way, the delete script would only run after an import successfully finished.

> If the concern is that a delete script running on an independent schedule than the import script could delete the database, maybe we could instead initiate the delete portion on successful completion of the import script. Streams have ways to emit finished events. That way, the delete script would only run after an import successfully finished.

Kicking off the delete script after the import one finishes sounds like a good idea.

> Solved a node problem, currently stuck on a connection problem. The node-side problem happened bc of an implementation error. I took the data out of the pipe to write it to the db when I should have kept it as part of the pipe. Since writes were slower than the data being pushed to it, it was causing a backpressure problem, eating up the memory, and eventually killing the process. Moving it back to the pipe allows node to manage data throughput (as it should) and is probably best practice anyway.
>
> I let it run and the good news is that the memory problems seem to have been solved. Usage didn't go above ~300MB while I was watching it and the pace of writes kept up with the data being pushed to the db. However, it eventually reached 9M lines before hanging for no immediately obvious reason. The problem didn't seem to be node-side and I gathered a few of these logs from the db:
>
> 2023-05-18  7:44:23 3 [Warning] Aborted connection 3 to db: 'test' user: 'root' host: '172.30.0.1' (Got an error reading communication packets)
>
> Subsequent attempts to reconnect to the database fail. Trying to connect and ping fails:
>
> SqlError: (conn=3, no: 45042, SQLState: 0A000) Ping timeout
>     at Object.module.exports.createError (/srv/service/node_modules/mariadb/lib/misc/errors.js:61:10)
>     at Timeout._onTimeout (/srv/service/node_modules/mariadb/lib/connection.js:277:20)
>     at listOnTimeout (node:internal/timers:559:17)
>     at processTimers (node:internal/timers:502:7) {
>   text: 'Ping timeout',
>   sql: null,
>   fatal: true,
>   errno: 45042,
>   sqlState: '0A000',
>   code: 'ER_PING_TIMEOUT'
> }
>
> I've restarted the service and the server and still can't connect regardless. I checked the size of the database and it's only about 1.5GB which doesn't seem like an unreasonable size and I can connect to the database from the command line just fine.
>
> Nothing...seems like it's choking? Here are some logs that may help:
> I'm currently testing with a pool instead of a long-running single connection and seeing if that improves the issue.

@STran I suspect this may have something to do with Docker settings and/or MySQL configuration provided by the mariadb image.

Locally, on my host system mariadb, I'm able to run the init-db.js script. I'm at 23 million entries (and counting); it looks like there are 24,688,684 in the new data file.

Update: the import worked, it took ~2.25 hours:

MariaDB [ipoid]> select count(*) from actor_data;
+----------+
| count(*) |
+----------+
| 24688684 |
+----------+
1 row in set (5.099 sec)

but init-db.js is still "running" even though it's not outputting anything anymore.

➜ ps aux | grep init-db
kostajh          56193   0.0  0.2 410224224 134304 s003  S+    2:42PM  43:07.83 node init-db.js

Not sure why that's the case.

> but init-db.js is still "running" even though it's not outputting anything anymore.
>
> ➜ ps aux | grep init-db
> kostajh          56193   0.0  0.2 410224224 134304 s003  S+    2:42PM  43:07.83 node init-db.js
>
> Not sure why that's the case.

I think this is a bug that happened to be introduced by https://gerrit.wikimedia.org/r/921037, since the same happens when running it on the tiny test file.

Yes, the hang is on me, which is why I marked the patch as WIP. I wanted to share the pipeline fix without blocking on why I broke the end poller (fixing that is the next step toward getting the patch ready for review). I'm glad to hear that it's actually working! 🎉 Is my understanding correct that we'll be using these images for production? If so, it sounds like why the image is potentially causing problems needs to be investigated as well.

Change 921499 had a related patch set uploaded (by STran; author: STran):

[mediawiki/services/ipoid@master] Refactor init-db

https://gerrit.wikimedia.org/r/921499

Change 921500 had a related patch set uploaded (by STran; author: STran):

[mediawiki/services/ipoid@master] Better accomodate range of possible data

https://gerrit.wikimedia.org/r/921500

Change 921501 had a related patch set uploaded (by STran; author: STran):

[mediawiki/services/ipoid@master] [WIP] Process data in stream

https://gerrit.wikimedia.org/r/921501

Change 921035 abandoned by STran:

[wikimedia/security/security-api@master] Refactor init-db

Reason:

moved to Icfb6aa344796c088283bbdc386d642dd8a02109d

https://gerrit.wikimedia.org/r/921035

Change 921036 abandoned by STran:

[wikimedia/security/security-api@master] Better accomodate range of possible data

Reason:

moved to Ie8168a9f48e374a2a12357e0ed380a0e3e209432

https://gerrit.wikimedia.org/r/921036

Change 921037 abandoned by STran:

[wikimedia/security/security-api@master] [WIP] Process data in stream

Reason:

moved to I2ce3aba9bde866f707b82e1c48922c403d3504c5

https://gerrit.wikimedia.org/r/921037

> Is my understanding correct in that we'll be using these images for production? So it sounds like why the image is potentially causing problems needs to be investigated as well?

Does this answer the question? From T305114#7822043:

> We also can run our own MariaDB in a container, but for production purposes, it seems better to use a production ready cluster.

I know you're not doing it, but I want to emphasize that this is not a good idea. Containers will cause a lot of issues due to their ephemeral nature and by design are not suitable for stateful services, especially databases (of course, a local dev environment is a completely different story, and containers are actually good for bootstrapping MariaDB/MySQL there).

Change 921499 merged by jenkins-bot:

[mediawiki/services/ipoid@master] Refactor init-db

https://gerrit.wikimedia.org/r/921499

Change 921500 merged by jenkins-bot:

[mediawiki/services/ipoid@master] Better accommodate range of possible data

https://gerrit.wikimedia.org/r/921500

Change 921501 abandoned by Tchanders:

[mediawiki/services/ipoid@master] Process data in stream

Reason:

> pushed to GitLab as https://gitlab.wikimedia.org/repos/mediawiki/services/ipoid/-/merge_requests/2

https://gerrit.wikimedia.org/r/921501

Reading the documentation of the feeds, it appears that we can request a diff against the previous day's data (or a specified date), which would simplify the process of getting updates into the db. I think this is worth pursuing instead of trying to decompress and import such a large amount of data.

> Reading the documentation of the feeds, it appears that it is possible for us to request a diff for the previous day's data (or a specified date), which would simplify the process of getting updates in the db. I think it is something worth pursuing instead of trying to decompress and import such a large amount of data

We discussed this a bit in Slack. We can indeed access a diff endpoint but as @sbassett notes, it doesn't include important residential proxy data, so it's unsuitable for our current needs.

Since serviceops is done with T336163, we must consider how we are going to do the initial import in production. This is our suggestion:

  • Download the latest dump on deploy1002
  • Introduce flags to the application (or have a separate application/script) to instruct it to make the data import and exit
  • Introduce a flag to specify which file to read from
  • Run the import as a standalone kubernetes Job (one-off)
  • Have the ability to restart the job, or continue from where it left off in case of an error (eg the node it was running died)

Given that we can provide the pod with as many resources as it needs, we could even make it possible to load the whole dump into memory, if that would help with our current challenges.

> Reading the documentation of the feeds, it appears that it is possible for us to request a diff for the previous day's data (or a specified date), which would simplify the process of getting updates in the db. I think it is something worth pursuing instead of trying to decompress and import such a large amount of data
>
> We discussed this a bit in Slack. We can indeed access a diff endpoint but as @sbassett notes, it doesn't include important residential proxy data, so it's unsuitable for our current needs.

Do we have an estimate of what percentage this is? Put differently, could we, e.g.:

  1. Import the entirety of the dataset once.
  2. Have daily diff imported, using the diff endpoint discussed above, for everything not marked residential proxy data
  3. Have a weekly/monthly/every-2-days/you-name-it re-import of JUST the residential proxy data?

The above might sound naive, but I see that the amount of data to be imported is causing issues during development; it's probable that it will cause issues during the lifetime of the service too. Some form of divide and conquer, even if not the one I naively described above, would probably help address them.

> Download the latest dump on deploy1002

Can someone do this manually? Or do you want a programmatic way of doing it? For the latter, T325630: Implement call to data vendor is not done yet.

> Introduce flags to the application (or have a separate application/script) to instruct it to make the data import and exit

import-data.js does this already, looking for the file from a source specified as an environment variable.

> Introduce a flag to specify from which file to read from

Is the environment variable alright?

> Run the import as a standalone kubernetes Job (one-off)

I don't know what a Job is in Kubernetes' understanding of the word, but this could be done manually by running node ./import-data.js with the feed where the script expects it to be.

> Have the ability to restart the job, or continue from where it left off in case of an error (eg the node it was running died)

This is not a feature at the moment. Is it a blocker?

@akosiaris There's an ongoing conversation that overlaps with this one at T305724: Investigate database data invalidation questions and chunked/timed API to MySQL/MariaDB ETL. Should we continue this conversation there?

> @akosiaris There's an ongoing conversation that overlaps with this one at T305724: Investigate database data invalidation questions and chunked/timed API to MySQL/MariaDB ETL. Should we continue this conversation there?

I was made aware of that, yes, we should continue that conversation there.

This ticket was created at a time when the script could not be finished on local instances at all, which has since been resolved "good enough" for development (on sufficiently powerful computers with bare-metal instances). It's a little less concerned with being production-ready, which the other ticket very much is, so the discussion should be centralized there and this ticket closed out. For the questions @jijiki asked, I've moved those to a new ticket specifically focused on whether an initial import needs additional development/management.

> This ticket was created at a time when the script could not be finished on local instances at all and which has been somewhat resolved (for sufficiently powered computers with bare metal instances) "good enough" for development. It's a little less concerned with being production-ready, which the other ticket very much is and as such the discussion should be centralized there and this ticket closed out. For the questions @jijiki asked, I've moved those to a new ticket specifically focused on if an initial import needs additional development/management.

@STran, it seems I can't find the new task. It would be lovely if you could mention it here or add it as a parent task, so we can keep track of where we are.