Page MenuHomePhabricator

Add API to export toolinfo v1.2 compatible data
Closed, ResolvedPublicFeature

Description

Add a /tools/toolinfo/v1.2/toolinfo.json endpoint to expose toolinfo data managed with Striker as toolinfo v1.2.2 compatible data.

Event Timeline

bd808 changed the task status from Open to In Progress.Sep 3 2022, 11:59 PM
bd808 claimed this task.
bd808 triaged this task as Medium priority.
bd808 created this task.
bd808 changed the subtype of this task from "Task" to "Feature Request".

Change 829296 had a related patch set uploaded (by BryanDavis; author: Bryan Davis):

[labs/striker@master] feat: Add tools/toolinfo/v1.2/toolinfo.json endpoint

https://gerrit.wikimedia.org/r/829296

Change 829296 merged by jenkins-bot:

[labs/striker@master] feat: Add tools/toolinfo/v1.2/toolinfo.json endpoint

https://gerrit.wikimedia.org/r/829296

Change 830252 had a related patch set uploaded (by BryanDavis; author: Bryan Davis):

[operations/puppet@production] striker: Bump deployed version to 2022-09-04-055313-production

https://gerrit.wikimedia.org/r/830252

Change 830252 merged by Andrew Bogott:

[operations/puppet@production] striker: Bump deployed version to 2022-09-04-055313-production

https://gerrit.wikimedia.org/r/830252

Prod usage error:

Traceback (most recent call last):
  File "/opt/lib/python/site-packages/django/core/handlers/exception.py", line 34, in inner
    response = get_response(request)
  File "/opt/lib/python/site-packages/django/core/handlers/base.py", line 115, in _get_response
    response = self.process_exception_by_middleware(e, request)
  File "/opt/lib/python/site-packages/django/core/handlers/base.py", line 113, in _get_response
    response = wrapped_callback(request, *callback_args, **callback_kwargs)
  File "/srv/app/striker/tools/views/toolinfo.py", line 402, in json_v1_2
    for info in ToolInfo.objects.all().order_by('name')
  File "/srv/app/striker/tools/views/toolinfo.py", line 402, in <listcomp>
    for info in ToolInfo.objects.all().order_by('name')
  File "/srv/app/striker/tools/models.py", line 515, in toolinfo_v1_2
    self.tool,
  File "/opt/lib/python/site-packages/django/db/models/manager.py", line 82, in manager_method
    return getattr(self.get_queryset(), name)(*args, **kwargs)
  File "/opt/lib/python/site-packages/django/db/models/query.py", line 408, in get
    self.model._meta.object_name
striker.tools.models.Tool.DoesNotExist: Tool matching query does not exist.

Prod usage error:

[...snip...]
striker.tools.models.Tool.DoesNotExist: Tool matching query does not exist.

ToolInfo models do not have a foreign key relationship to Tool models because Tool models are actually LDAP records. We have a number of orphaned ToolInfo models in the prod database as a result of T170355: Figure out process for deleting an unused tool making it possible to delete tools.

find-orphan-toolinfo.py
from django.test import RequestFactory
request_factory = RequestFactory(**{
    "SERVER_NAME": "localhost",
    "wsgi.url_scheme":"https",
})
from striker.tools.models import ToolInfo
for info in ToolInfo.objects.all():
    try:
        _ = info.toolinfo_v1_2(req)
    except Exception as e:
        print(info.pk, info.tool, info.name, e)
69 my_first_tool toolforge.my-first-tool Tool matching query does not exist.
72 commons_describer toolforge.commons_describer Tool matching query does not exist.
129 mediaplaycounts toolforge.mediaplaycounts Tool matching query does not exist.
130 projanalysis toolforge.[Obsolete] projanalysis Tool matching query does not exist.
131 wsm toolforge.[Obsolete] WSM Tool matching query does not exist.
132 reportsbot toolforge.reportsbot Tool matching query does not exist.
155 andrewtesttool toolforge.andrewtesttool Tool matching query does not exist.
242 harejtest toolforge.harejtest Tool matching query does not exist.
246 test-t188680 toolforge.test-t188680 Tool matching query does not exist.
500 speed-patrolling toolforge.speed-patrolling Tool matching query does not exist.
1208 welcomer toolforge.welcomer Tool matching query does not exist.
1363 andrewtesttooltwo toolforge.andrewtesttooltwo Tool matching query does not exist.
1364 andrewtesttoolthree toolforge.andrewtesttoolthree Tool matching query does not exist.
1466 sunny00217bot toolforge.sunny00217bot Tool matching query does not exist.
1552 kiranbot3 toolforge.kiranbot3 Tool matching query does not exist.

We should at least guard against orphans causing fatals when generating the toolinfo collection. Deleting records that have no associated Tool is possible as well, but I'm a bit worried about what could happen as the result of an LDAP outage if that is implemented poorly.

Change 830323 had a related patch set uploaded (by BryanDavis; author: Bryan Davis):

[labs/striker@master] fix: Handle orphan ToolInfo when generating v1.2 export

https://gerrit.wikimedia.org/r/830323

Change 830323 merged by jenkins-bot:

[labs/striker@master] fix: Handle orphan ToolInfo when generating v1.2 export

https://gerrit.wikimedia.org/r/830323

Change 830688 had a related patch set uploaded (by BryanDavis; author: Bryan Davis):

[operations/puppet@production] striker: bump container version to 2022-09-07-191738-production

https://gerrit.wikimedia.org/r/830688

Change 830688 merged by Andrew Bogott:

[operations/puppet@production] striker: bump container version to 2022-09-07-191738-production

https://gerrit.wikimedia.org/r/830688

This seems to need some performance work. Attempting to load https://toolsadmin.wikimedia.org/tools/toolinfo/v1.2/toolinfo.json now that I have fixed the orphan ToolInfo record problem ends with a "504 Gateway Timeout" coming from envoy on the cloudweb* host handling the request. This is almost guaranteed to be caused by the lookup of all LabsUser objects from LDAP. Reporting the expanded user information is neat, but maybe just too slow to accomplish.

Change 831125 had a related patch set uploaded (by BryanDavis; author: Bryan Davis):

[labs/striker@master] fix: remove maintainer details from toolinfo v1.2 output

https://gerrit.wikimedia.org/r/831125

Change 831125 merged by jenkins-bot:

[labs/striker@master] fix: remove maintainer details from toolinfo v1.2 output

https://gerrit.wikimedia.org/r/831125

Change 842533 had a related patch set uploaded (by BryanDavis; author: Bryan Davis):

[operations/puppet@production] striker: Bump container version to 2022-10-03-154059-production

https://gerrit.wikimedia.org/r/842533

Change 842533 merged by Andrew Bogott:

[operations/puppet@production] striker: Bump container version to 2022-10-03-154059-production

https://gerrit.wikimedia.org/r/842533

Even with the maintainer details removed this is too slow to run at production scale. The issue seems to be that there is still an LDAP query per ToolInfo model (ToolInfo.get_tool()) and making a lot of LDAP queries in a tight loop is relatively slow. The current get_tool() call is really a filter to ensure that the Tool that owns the ToolInfo data still exists. One possible fix would be to fetch the names of all tools and use that set for filtering instead. Something like list(Tool.objects.values_list("cn", flat=True)) seems to actually be very fast in production.

Change 843577 had a related patch set uploaded (by BryanDavis; author: Bryan Davis):

[labs/striker@master] fix: speed up Tool existence check for toolinfo generation

https://gerrit.wikimedia.org/r/843577

Change 843577 merged by jenkins-bot:

[labs/striker@master] fix: speed up Tool existence check for toolinfo generation

https://gerrit.wikimedia.org/r/843577

Change 844063 had a related patch set uploaded (by BryanDavis; author: Bryan Davis):

[operations/puppet@production] striker: Bump container version to 2022-10-18-161910-production

https://gerrit.wikimedia.org/r/844063

Change 844063 merged by Andrew Bogott:

[operations/puppet@production] striker: Bump container version to 2022-10-18-161910-production

https://gerrit.wikimedia.org/r/844063

I have changed the URLs used when importing to both https://toolhub-demo.wmcloud.org and https://toolhub.wikimedia.org to use the new v1.2 format export.