puppetdb-api microservice doesn't work well with large queries
Closed, ResolvedPublic

Assigned To

Authored By

	jbond
	Jul 21 2023, 6:08 PM

Description

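The PuppetDB microservice puppetdb-api.discovery.wmnet does not handle large queries well. This can be demonstrated by the following two queries: the first, sent to localhost:8080, completes after about 7 minutes, while the second, sent through the microservice, fails with a 504 after 60 seconds.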

$ time curl -X POST http://localhost:8080/pdb/query/v4/resources --data  '{"query":  ["=", "type", "File"]}' -H 'Content-Type: application/json' &> /dev/null 
curl -X POST http://localhost:8080/pdb/query/v4/resources --data  -H  &>   2.03s user 2.85s system 1% cpu 7:17.45 total
$ time curl -X POST https://puppetdb-api.discovery.wmnet:8090/pdb/query/v4/resources --data  '{"query": ["=", "type", "File"]}' -H 'Content-Type: application/json'            
<html>
<head><title>504 Gateway Time-out</title></head>
<body bgcolor="white">
<center><h1>504 Gateway Time-out</h1></center>
<hr><center>nginx/1.14.2</center>
</body>
</html>
curl -X POST https://puppetdb-api.discovery.wmnet:8090/pdb/query/v4/resources  0.02s user 0.02s system 0% cpu 1:00.06 total

The issue is that the Flask service reads the whole response into memory, then iterates over it and modifies it before sending it to the client, which causes a crash, probably from OOM. It would be better to stream the data directly to the client and modify it on the fly.
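As a rough illustration of the "modify it on the fly" idea, here is a minimal sketch that redacts records one at a time and emits a JSON array in chunks, so the full result never has to sit in memory. The names `SENSITIVE_KEYS`, `redact`, and `stream_json_array` are hypothetical, and the redaction rule is an assumption; the real service's rules are not shown in this task.

```python
import json
from typing import Any, Dict, Iterable, Iterator

# Assumption: parameter names to redact. Purely illustrative.
SENSITIVE_KEYS = {"content", "password"}


def redact(resource: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of one resource with sensitive parameters masked."""
    params = resource.get("parameters", {})
    clean = {
        key: ("[REDACTED]" if key in SENSITIVE_KEYS else value)
        for key, value in params.items()
    }
    return {**resource, "parameters": clean}


def stream_json_array(resources: Iterable[Dict[str, Any]]) -> Iterator[str]:
    """Yield a valid JSON array piece by piece, one resource at a time."""
    yield "["
    first = True
    for resource in resources:
        if not first:
            yield ","
        first = False
        # Only the current resource is serialized and held in memory.
        yield json.dumps(redact(resource))
    yield "]"
```

Flask accepts a generator as a response body, so something like this could in principle be passed to `flask.Response(..., mimetype="application/json")`. Note that this only covers the output side: the upstream PuppetDB response would also have to be parsed incrementally for the memory saving to hold end to end.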

The main effect is that cuminunpriv is unable to look up very generic resources or classes that are used in a lot of places. I also suspect that general queries will be slower on cuminunpriv than on cumin (which goes directly to PuppetDB).

Details

Subject	Repo	Branch	Lines +/-
puppetdb-api-microservice: need to convert current query to json	operations/puppet	production	+3 -1
puppetdb-api-microservice: redact one the puppetdb side	operations/puppet	production	+12 -11
(WIP) puppetdb-microservice: update puppetdb micro service so it streams data	operations/puppet	production	+30 -8


Event Timeline

jbond triaged this task as Medium priority. Jul 21 2023, 6:08 PM

jbond created this task.

Restricted Application added a project: Infrastructure-Foundations. Jul 21 2023, 6:08 PM

Restricted Application added a subscriber: Aklapper.

jbond updated the task description. Jul 21 2023, 6:17 PM

Change 940403 had a related patch set uploaded (by Jbond; author: jbond):

[operations/puppet@production] (WIP) puppetdb-microservice: update puppetdb micro service so it streams data

https://gerrit.wikimedia.org/r/940403

gerritbot added a project: Patch-For-Review. Jul 21 2023, 7:09 PM

jbond updated the task description. Jul 21 2023, 7:38 PM

SLyngshede-WMF claimed this task. Aug 23 2023, 1:35 PM

Plan of action:

Rewrite to use the pypuppetdb library (https://github.com/voxpupuli/pypuppetdb)
Use PQL to do the queries
Return only the certname to reduce the amount of data

Example: replace resources { title = 'foobar' and type = 'file' } with resources[certname] { title = 'foobar' and type = 'file' }

The script lives at modules/profile/files/puppetdb/puppetdb-microservice.py in the Puppet repo and gets deployed to PuppetDB hosts.

We likely also want to add the group by param, e.g.:

resources[certname] { type = 'File' group by certname }

It may also be easier to modify the incoming request to simply add the AST extract and group_by modifiers, rather than trying to translate the incoming AST query into a PQL query.
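A minimal sketch of that request rewrite, assuming the incoming body has the same shape as the curl examples in the description ({"query": [...]}) and using PuppetDB's AST `extract` and `group_by` operators. The function names are hypothetical, and this is not necessarily what the merged change does.

```python
import json
from typing import Any, List


def restrict_to_certname(query: List[Any]) -> List[Any]:
    """Wrap an AST query so only distinct certnames are returned."""
    return ["extract", ["certname"], query, ["group_by", "certname"]]


def rewrite_request_body(body: str) -> str:
    """Rewrite a JSON request body in place of the client's original query.

    Assumes the body is a JSON object with a top-level "query" key.
    """
    payload = json.loads(body)
    payload["query"] = restrict_to_certname(payload["query"])
    return json.dumps(payload)
```

For the query from the description, `["=", "type", "File"]` becomes `["extract", ["certname"], ["=", "type", "File"], ["group_by", "certname"]]`, which should be the AST equivalent of the PQL example above.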

Change 951965 had a related patch set uploaded (by Jbond; author: jbond):

[operations/puppet@production] puppetdb-api-microservice: redact one the puppetdb side

https://gerrit.wikimedia.org/r/951965

Change 940403 abandoned by Jbond:

[operations/puppet@production] (WIP) puppetdb-microservice: update puppetdb micro service so it streams data

Reason:

https://gerrit.wikimedia.org/r/c/operations/puppet/+/951965

https://gerrit.wikimedia.org/r/940403

Change 951965 merged by Jbond:

[operations/puppet@production] puppetdb-api-microservice: redact one the puppetdb side

https://gerrit.wikimedia.org/r/951965

Change 952216 had a related patch set uploaded (by Jbond; author: jbond):

[operations/puppet@production] puppetdb-api-microservice: need to convert current query to json

https://gerrit.wikimedia.org/r/952216

Change 952216 merged by Jbond:

[operations/puppet@production] puppetdb-api-microservice: need to convert current query to json

https://gerrit.wikimedia.org/r/952216

Maintenance_bot removed a project: Patch-For-Review. Aug 24 2023, 2:10 PM

It seems like John fixed this.
