Generation of "demo" Dockerfile from Toolhub's blubber.yaml broken after refactoring
Closed, Duplicate · Public · BUG REPORT

Description

In https://gerrit.wikimedia.org/r/c/wikimedia/toolhub/+/707878/2/.pipeline/blubber.yaml I did a bit of refactoring to the blubber.yaml to add a "production" variant and make the previous "demo" variant inherit from "production".

As seen in https://integration.wikimedia.org/ci/blue/organizations/jenkins/wikimedia-toolhub-pipeline-publish/detail/service-pipeline-test-and-publish/3/pipeline/ and local testing with make .pipeline/demo.Dockerfile this has managed to confuse Blubber.

Here's what the git diff of a fresh run of make .pipeline/demo.Dockerfile against the prior output looks like:

diff --git i/.pipeline/demo.Dockerfile w/.pipeline/demo.Dockerfile
index a89e018..d24e6b0 100644
--- i/.pipeline/demo.Dockerfile
+++ w/.pipeline/demo.Dockerfile
@@ -1,47 +1,5 @@
 # Dockerfile for *local development*.
 # Generated by Blubber from .pipeline/blubber.yaml
-FROM docker-registry.wikimedia.org/nodejs10-devel AS prep-nodejs
-USER 0
-ENV HOME="/root"
-RUN (getent group "65533" || groupadd -o -g "65533" -r "somebody") && (getent passwd "65533" || useradd -l -o -m -d "/home/somebody" -r -g "somebody" -u "65533" "somebody") && mkdir -p "/srv/app" && chown "65533":"65533" "/srv/app" && mkdir -p "/opt/lib" && chown "65533":"65533" "/opt/lib"
-RUN (getent group "900" || groupadd -o -g "900" -r "runuser") && (getent passwd "900" || useradd -l -o -m -d "/home/runuser" -r -g "runuser" -u "900" "runuser")
-USER 65533
-ENV HOME="/home/somebody"
-WORKDIR "/srv/app"
-ENV DJANGO_SETTINGS_MODULE="toolhub.settings" PIP_DISABLE_PIP_VERSION_CHECK="on" PIP_NO_CACHE_DIR="off" PYTHONBUFFERED="1" PYTHONDONTWRITEBYTECODE="1"
-COPY --chown=65533:65533 ["package.json", "package-lock.json", "./"]
-RUN npm install
-COPY --chown=65533:65533 ["vue.config.js", "./"]
-COPY --chown=65533:65533 ["vue/", "vue/"]
-COPY --chown=65533:65533 [".git/", ".git/"]
-RUN /bin/bash "-c" "ls -alh && npm run-script build:vue"
-USER 900
-ENV HOME="/home/runuser"
-ENV NODE_ENV="development"
-
-FROM docker-registry.wikimedia.org/python3-buster:latest AS prep
-USER 0
-ENV HOME="/root"
-ENV DEBIAN_FRONTEND="noninteractive"
-RUN apt-get update && apt-get install -y "build-essential" "default-libmysqlclient-dev" "gettext" "git" "python3-dev" "python3-venv" && rm -rf /var/lib/apt/lists/*
-RUN python3 "-m" "easy_install" "pip" && python3 "-m" "pip" "install" "-U" "setuptools" "wheel" "tox" "pip"
-ENV POETRY_VIRTUALENVS_PATH="/opt/lib/poetry"
-RUN python3 "-m" "pip" "install" "-U" "poetry==1.1.7"
-RUN (getent group "65533" || groupadd -o -g "65533" -r "somebody") && (getent passwd "65533" || useradd -l -o -m -d "/home/somebody" -r -g "somebody" -u "65533" "somebody") && mkdir -p "/srv/app" && chown "65533":"65533" "/srv/app" && mkdir -p "/opt/lib" && chown "65533":"65533" "/opt/lib"
-RUN (getent group "900" || groupadd -o -g "900" -r "runuser") && (getent passwd "900" || useradd -l -o -m -d "/home/runuser" -r -g "runuser" -u "900" "runuser")
-USER 65533
-ENV HOME="/home/somebody"
-WORKDIR "/srv/app"
-ENV DJANGO_SECRET_KEY="FAKE_SECRET_FOR_PREP_BUILD" DJANGO_SETTINGS_MODULE="toolhub.settings" PIP_DISABLE_PIP_VERSION_CHECK="on" PIP_NO_CACHE_DIR="off" PYTHONBUFFERED="1" PYTHONDONTWRITEBYTECODE="1" WIKIMEDIA_OAUTH2_KEY="FAKE_KEY_FOR_PREP_BUILD" WIKIMEDIA_OAUTH2_SECRET="FAKE_TOKEN_FOR_PREP_BUILD"
-COPY --chown=65533:65533 ["pyproject.toml", "poetry.lock", "./"]
-RUN mkdir -p "/opt/lib/poetry"
-RUN poetry "install" "--no-root" "--no-dev"
-COPY --chown=65533:65533 ["./", "./"]
-COPY --chown=65533:65533 --from=prep-nodejs ["/srv/app/vue/dist", "vue/dist/"]
-RUN /bin/bash "-c" "ls -alh && poetry run ./manage.py collectstatic -c --no-input && poetry run python3 -mjson.tool staticfiles/staticfiles.json > /tmp/staticfiles.json && mv /tmp/staticfiles.json staticfiles/staticfiles.json && poetry run ./manage.py compilemessages"
-USER 900
-ENV HOME="/home/runuser"
-
 FROM docker-registry.wikimedia.org/python3-buster:latest AS demo
 USER 0
 ENV HOME="/root"

The production variant seems to generate a reasonable Dockerfile output:

# Dockerfile for *local development*.
# Generated by Blubber from .pipeline/blubber.yaml
FROM docker-registry.wikimedia.org/nodejs10-devel AS prep-nodejs
USER 0
ENV HOME="/root"
RUN (getent group "65533" || groupadd -o -g "65533" -r "somebody") && (getent passwd "65533" || useradd -l -o -m -d "/home/somebody" -r -g "somebody" -u "65533" "somebody") && mkdir -p "/srv/app" && chown "65533":"65533" "/srv/app" && mkdir -p "/opt/lib" && chown "65533":"65533" "/opt/lib"
RUN (getent group "900" || groupadd -o -g "900" -r "runuser") && (getent passwd "900" || useradd -l -o -m -d "/home/runuser" -r -g "runuser" -u "900" "runuser")
USER 65533
ENV HOME="/home/somebody"
WORKDIR "/srv/app"
ENV DJANGO_SETTINGS_MODULE="toolhub.settings" PIP_DISABLE_PIP_VERSION_CHECK="on" PIP_NO_CACHE_DIR="off" PYTHONBUFFERED="1" PYTHONDONTWRITEBYTECODE="1"
COPY --chown=65533:65533 ["package.json", "package-lock.json", "./"]
RUN npm install
COPY --chown=65533:65533 ["vue.config.js", "./"]
COPY --chown=65533:65533 ["vue/", "vue/"]
COPY --chown=65533:65533 [".git/", ".git/"]
RUN /bin/bash "-c" "ls -alh && npm run-script build:vue"
USER 900
ENV HOME="/home/runuser"
ENV NODE_ENV="development"

FROM docker-registry.wikimedia.org/python3-buster:latest AS prep
USER 0
ENV HOME="/root"
ENV DEBIAN_FRONTEND="noninteractive"
RUN apt-get update && apt-get install -y "build-essential" "default-libmysqlclient-dev" "gettext" "git" "python3-dev" "python3-venv" && rm -rf /var/lib/apt/lists/*
RUN python3 "-m" "easy_install" "pip" && python3 "-m" "pip" "install" "-U" "setuptools" "wheel" "tox" "pip"
ENV POETRY_VIRTUALENVS_PATH="/opt/lib/poetry"
RUN python3 "-m" "pip" "install" "-U" "poetry==1.1.7"
RUN (getent group "65533" || groupadd -o -g "65533" -r "somebody") && (getent passwd "65533" || useradd -l -o -m -d "/home/somebody" -r -g "somebody" -u "65533" "somebody") && mkdir -p "/srv/app" && chown "65533":"65533" "/srv/app" && mkdir -p "/opt/lib" && chown "65533":"65533" "/opt/lib"
RUN (getent group "900" || groupadd -o -g "900" -r "runuser") && (getent passwd "900" || useradd -l -o -m -d "/home/runuser" -r -g "runuser" -u "900" "runuser")
USER 65533
ENV HOME="/home/somebody"
WORKDIR "/srv/app"
ENV DJANGO_SECRET_KEY="FAKE_SECRET_FOR_PREP_BUILD" DJANGO_SETTINGS_MODULE="toolhub.settings" PIP_DISABLE_PIP_VERSION_CHECK="on" PIP_NO_CACHE_DIR="off" PYTHONBUFFERED="1" PYTHONDONTWRITEBYTECODE="1" WIKIMEDIA_OAUTH2_KEY="FAKE_KEY_FOR_PREP_BUILD" WIKIMEDIA_OAUTH2_SECRET="FAKE_TOKEN_FOR_PREP_BUILD"
COPY --chown=65533:65533 ["pyproject.toml", "poetry.lock", "./"]
RUN mkdir -p "/opt/lib/poetry"
RUN poetry "install" "--no-root" "--no-dev"
COPY --chown=65533:65533 ["./", "./"]
COPY --chown=65533:65533 --from=prep-nodejs ["/srv/app/vue/dist", "vue/dist/"]
RUN /bin/bash "-c" "ls -alh && poetry run ./manage.py collectstatic -c --no-input && poetry run python3 -mjson.tool staticfiles/staticfiles.json > /tmp/staticfiles.json && mv /tmp/staticfiles.json staticfiles/staticfiles.json && poetry run ./manage.py compilemessages"
USER 900
ENV HOME="/home/runuser"

FROM docker-registry.wikimedia.org/python3-buster:latest AS production
USER 0
ENV HOME="/root"
ENV DEBIAN_FRONTEND="noninteractive"
RUN apt-get update && apt-get install -y "build-essential" "default-libmysqlclient-dev" "gettext" "git" "python3-dev" "python3-venv" && rm -rf /var/lib/apt/lists/*
RUN python3 "-m" "easy_install" "pip" && python3 "-m" "pip" "install" "-U" "setuptools" "wheel" "tox" "pip"
ENV POETRY_VIRTUALENVS_PATH="/opt/lib/poetry"
RUN python3 "-m" "pip" "install" "-U" "poetry==1.1.7"
RUN (getent group "65533" || groupadd -o -g "65533" -r "somebody") && (getent passwd "65533" || useradd -l -o -m -d "/home/somebody" -r -g "somebody" -u "65533" "somebody") && mkdir -p "/srv/app" && chown "65533":"65533" "/srv/app" && mkdir -p "/opt/lib" && chown "65533":"65533" "/opt/lib"
RUN (getent group "900" || groupadd -o -g "900" -r "runuser") && (getent passwd "900" || useradd -l -o -m -d "/home/runuser" -r -g "runuser" -u "900" "runuser")
USER 65533
ENV HOME="/home/somebody"
WORKDIR "/srv/app"
ENV DJANGO_SETTINGS_MODULE="toolhub.settings" PIP_DISABLE_PIP_VERSION_CHECK="on" PIP_NO_CACHE_DIR="off" PYTHONBUFFERED="1" PYTHONDONTWRITEBYTECODE="1"
COPY --chown=65533:65533 ["pyproject.toml", "poetry.lock", "./"]
RUN mkdir -p "/opt/lib/poetry"
RUN poetry "install" "--no-root" "--no-dev"
COPY --chown=65533:65533 --from=prep ["/srv/app", "."]
USER 900
ENV HOME="/home/runuser"
ENTRYPOINT ["/usr/local/bin/poetry", "run", "python3", "manage.py", "runserver", "--noreload", "--nostatic", "0.0.0.0:8000"]

LABEL blubber.variant="production" blubber.version="0.8.0+459234d"

Event Timeline

Is there a place to look for crash logs from blubberoid? That demo output looks like it just died in the middle of doing things. I'll see if I can run locally and find any debugging output that gives a better clue of what is happening.

I see this dashboard https://logstash.wikimedia.org/app/dashboards#/view/7f883390-fe76-11ea-b848-090a7444f26c?_g=h@42b0d52&_a=h@4506114
but there don't seem to be any logs for blubberoid in the last hour. I also did an ssh into contint1001 to get some logs via kubectl, and didn't see any on the pods I checked, but there are quite a few pods and I'm not sure which cluster the requests are going to.

Using a local build of blubber, the demo Dockerfile output is "complete" but still not right:

$ ./blubber ~/projects/wmf/wikimedia/toolhub/.pipeline/blubber.yaml demo
FROM docker-registry.wikimedia.org/python3-buster:latest AS demo
USER 0
ENV HOME="/root"
ENV DEBIAN_FRONTEND="noninteractive"
RUN apt-get update && apt-get install -y "build-essential" "default-libmysqlclient-dev" "gettext" "git" "python3-dev" "python3-venv" && rm -rf /var/lib/apt/lists/*
RUN python3 "-m" "easy_install" "pip" && python3 "-m" "pip" "install" "-U" "setuptools" "wheel" "tox" "pip"
ENV POETRY_VIRTUALENVS_PATH="/opt/lib/poetry"
RUN python3 "-m" "pip" "install" "-U" "poetry==1.1.7"
RUN (getent group "65533" || groupadd -o -g "65533" -r "somebody") && (getent passwd "65533" || useradd -l -o -m -d "/home/somebody" -r -g "somebody" -u "65533" "somebody") && mkdir -p "/srv/app" && chown "65533":"65533" "/srv/app" && mkdir -p "/opt/lib" && chown "65533":"65533" "/opt/lib"
RUN (getent group "900" || groupadd -o -g "900" -r "runuser") && (getent passwd "900" || useradd -l -o -m -d "/home/runuser" -r -g "runuser" -u "900" "runuser")
USER 65533
ENV HOME="/home/somebody"
WORKDIR "/srv/app"
ENV DB_NAME="/dev/shm/toolhub.sqlite3" DJANGO_SETTINGS_MODULE="toolhub.settings" PIP_DISABLE_PIP_VERSION_CHECK="on" PIP_NO_CACHE_DIR="off" PYTHONBUFFERED="1" PYTHONDONTWRITEBYTECODE="1"
COPY --chown=65533:65533 ["pyproject.toml", "poetry.lock", "./"]
RUN mkdir -p "/opt/lib/poetry"
RUN poetry "install" "--no-root" "--no-dev"
COPY --chown=65533:65533 --from=prep ["/srv/app", "."]
USER 900
ENV HOME="/home/runuser"
ENTRYPOINT ["/bin/bash", "-c", "poetry run python3 manage.py migrate && poetry run python3 manage.py createinitialrevisions && poetry run python3 manage.py loaddata toolhub/fixtures/demo.yaml && poetry run python3 manage.py crawl && poetry run python3 manage.py runserver --noreload --nostatic 0.0.0.0:8000"]

LABEL blubber.variant="demo" blubber.version="0.8.0+459234d"

I get identical output from curl -s -H 'content-type: application/yaml' --data-binary @/Users/bd808/projects/wmf/wikimedia/toolhub/.pipeline/blubber.yaml https://blubberoid.wikimedia.org/v1/demo, so whatever is going wrong with the make .pipeline/demo.Dockerfile that leaves a truncated file is a different problem.

The make .pipeline/demo.Dockerfile output matches this too. I was confused by the git diff output I pasted in the task description. Fundamentally it looks like the dep graph solver in blubber is not liking the input file. There really isn't any debug logging to turn on that I can see, so debugging may actually involve using some kind of stepping debugger or adding logging.
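
What is wrong with the demo output above can be read from the Dockerfile itself: the demo stage runs COPY --from=prep, but no prep (or prep-nodejs) stage is emitted before it. A minimal sketch of a check for that symptom follows. It assumes a simplified line-based parse (no FROM flags such as --platform, no numeric stage indexes), and the helper name undefined_stages is hypothetical, not part of Blubber:

```python
import re

# Matches "FROM <image>" with an optional "AS <stage>" suffix.
FROM_RE = re.compile(r"^FROM\s+\S+(?:\s+AS\s+(\S+))?", re.IGNORECASE)
# Matches the stage (or image) named by a COPY --from=... flag.
COPY_FROM_RE = re.compile(r"--from=(\S+)")


def undefined_stages(dockerfile_text):
    """Return names used in COPY --from= that no earlier FROM ... AS defines."""
    defined = set()
    missing = []
    for line in dockerfile_text.splitlines():
        line = line.strip()
        match = FROM_RE.match(line)
        if match:
            if match.group(1):
                defined.add(match.group(1))
            continue
        if line.upper().startswith("COPY"):
            for ref in COPY_FROM_RE.findall(line):
                if ref not in defined and ref not in missing:
                    missing.append(ref)
    return missing


# Two lines taken from the truncated demo output in this task.
broken = """\
FROM docker-registry.wikimedia.org/python3-buster:latest AS demo
COPY --chown=65533:65533 --from=prep ["/srv/app", "."]
"""
print(undefined_stages(broken))  # ['prep']
```

Note that --from= may also name an external image, so a non-empty result is a hint to look closer rather than proof of a broken build.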

Change 708355 had a related patch set uploaded (by BryanDavis; author: Bryan Davis):

[wikimedia/toolhub@main] build: disable "demo" variant publish step

https://gerrit.wikimedia.org/r/708355

Change 708355 merged by jenkins-bot:

[wikimedia/toolhub@main] build: disable "demo" variant publish step

https://gerrit.wikimedia.org/r/708355

I am going to switch the dependency order of this task and T278503: Build and push Docker containers on merge. The more I think about it, the "demo" variant is fairly low value. The only things it changes from the "production" variant are environment variables and entrypoint commands, and those can easily be set by deployment tooling (docker, docker-compose, kubernetes).

I believe this was fixed earlier in the week. It was indeed an issue with the dependency graph: it did not include "copies" dependencies that a variant inherited through "includes".
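
To illustrate the kind of bug described here, the following is a minimal sketch and not Blubber's actual implementation (Blubber is written in Go). The VARIANTS data is a hypothetical, simplified stand-in for the Toolhub blubber.yaml, and the function names are made up for this example. It shows how following only the copies written directly on a variant yields a lone demo stage, while also following copies inherited through includes yields the prep stages first:

```python
# Hypothetical, simplified model of a variants config: each variant may
# inherit from others ("includes") and copy files from others ("copies").
VARIANTS = {
    "prep-nodejs": {"includes": [], "copies": []},
    "prep": {"includes": [], "copies": ["prep-nodejs"]},
    "production": {"includes": [], "copies": ["prep"]},
    "demo": {"includes": ["production"], "copies": []},
}


def effective_copies(variants, name):
    """Copies declared on a variant plus those inherited through includes."""
    result = []
    for parent in variants[name]["includes"]:
        for dep in effective_copies(variants, parent):
            if dep not in result:
                result.append(dep)
    for dep in variants[name]["copies"]:
        if dep not in result:
            result.append(dep)
    return result


def build_order(variants, target, inherit=True):
    """List the stages to emit for target, dependencies first.

    With inherit=False only copies written directly on each variant are
    followed, which models the symptom seen in this task. There is no
    cycle detection; the input is assumed to be acyclic.
    """
    order = []

    def visit(name):
        if name in order:
            return
        if inherit:
            deps = effective_copies(variants, name)
        else:
            deps = variants[name]["copies"]
        for dep in deps:
            visit(dep)
        order.append(name)

    visit(target)
    return order


print(build_order(VARIANTS, "demo", inherit=False))  # ['demo']
print(build_order(VARIANTS, "demo"))  # ['prep-nodejs', 'prep', 'demo']
```

The second result matches the stage order of the older, working demo.Dockerfile shown in the task description.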