Run non-interactive commands on Toolforge kubernetes webservices
Closed, Resolved · Public

Authored By

	Tarrow
	Jul 5 2017, 7:56 AM

Description

It would be good to be able to run arbitrary non-interactive commands on kubernetes webservices.

Currently you need to use the webservice --backend=kubernetes <type> shell command, but this is always interactive.

For example, to update the dependencies of a python webservice running on kubernetes you need to do something like this:

webservice --backend=kubernetes python shell

<wait for interactive shell on kubernetes controlled instance>

webservice-python-bootstrap

It would be nice to have something like:
webservice --backend=kubernetes python shell -- webservice-python-bootstrap
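As an illustration of the requested behaviour, here is a minimal sketch of splitting an argument list at a bare `--`, so that everything after it is treated as the command to run. The function name `split_at_separator` is hypothetical and this is not taken from the webservice source; it only shows one way such a convention could be parsed.

```python
def split_at_separator(argv):
    """Split argv at the first bare "--".

    Returns (tool_args, command). command is empty when no separator
    is present, which a caller could treat as "open an interactive shell".
    """
    if "--" in argv:
        idx = argv.index("--")
        return list(argv[:idx]), list(argv[idx + 1:])
    return list(argv), []


# Hypothetical usage with the command line proposed in the description.
tool_args, command = split_at_separator(
    ["--backend=kubernetes", "python", "shell", "--", "webservice-python-bootstrap"]
)
print(tool_args)
print(command)
```

Only the first `--` is treated as the separator, so a later `--` would be passed through to the command unchanged.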

Details

	Subject	Repo	Branch	Lines +/-
	Make `webservice shell` scriptable	operations/software/tools-webservice	master	+55 -31

Related Objects

Mentioned In:
T358999: [webservice,toolforge-cli] `toolforge webservice TYPE shell -- something` does not pass extra cli arguments like `webservice TYPE shell -- something` does
T174769: Make it less cumbersome to bootstrap and update python webservices
T190696: [Gsoc 2018] Proposal for Toolforge webservice command Improvement
T190638: GSoC 2018 proposal for Improvements for the Toolforge 'webservice' command
T177603: Proposal: Improvements for the Toolforge 'webservice' command
T175768: Improvements for the Toolforge 'webservice' command

Event Timeline

Tarrow created this task. · Jul 5 2017, 7:56 AM

Restricted Application edited projects, added Cloud-Services; removed Toolforge. · Jul 5 2017, 7:56 AM

Restricted Application added a subscriber: Aklapper.

@Tarrow for now we are in a holding pattern on kubernetes features we can support but I think the idea is solid when we start gaining more traction.

bd808 edited projects, added Toolforge; removed Cloud-Services. · Sep 14 2017, 5:18 AM

bd808 mentioned this in T175768: Improvements for the Toolforge 'webservice' command.

Sowjanyavemuri mentioned this in T177603: Proposal: Improvements for the Toolforge 'webservice' command. · Oct 15 2017, 9:05 PM

Nehajha mentioned this in T190638: GSoC 2018 proposal for Improvements for the Toolforge 'webservice' command. · Mar 25 2018, 7:33 AM

djff mentioned this in T190696: [Gsoc 2018] Proposal for Toolforge webservice command Improvement. · Mar 27 2018, 5:41 AM

Legoktm renamed this task from Run non-interactive commands on labs kubernetes webservices to Run non-interactive commands on Toolforge kubernetes webservices. · May 31 2018, 5:45 PM

Legoktm mentioned this in T174769: Make it less cumbersome to bootstrap and update python webservices.

Legoktm updated the task description.

JeanFred subscribed. · Feb 11 2019, 8:53 PM

LucasWerkmeister subscribed. · Apr 7 2019, 12:19 PM

Jakob_WMDE subscribed. · Jun 7 2019, 3:44 PM

Michael subscribed. · Jul 1 2019, 9:24 AM

Proof of concept using kubectl directly against the 2020 Kubernetes cluster

$ /usr/bin/kubectl run interactive \
  --image=docker-registry.tools.wmflabs.org/toolforge-python37-sssd-base:latest \
  --restart=Never --command=true --env=HOME=$HOME --labels='toolforge=tool' \
  --rm=true --stdin=true --tty=true \
  -- ls /etc
adduser.conf            gss            mke2fs.conf        resolv.conf
alternatives            gtk-3.0        modules-load.d     rmt
apt                     host.conf      motd               securetty
bash.bashrc             hostname       mtab               security
bash_completion.d       hosts          mysql              selinux
bindresvport.blacklist  ImageMagick-6  nanorc             shadow
binfmt.d                init.d         novaobserver.yaml  shadow-
ca-certificates         inputrc        nsswitch.conf      shells
ca-certificates.conf    issue          opt                skel
cron.daily              issue.net      os-release         ssl
dbus-1                  kernel         pam.conf           subgid
debconf.conf            ldap           pam.d              subuid
debian_version          ldap.conf      passwd             sysctl.d
default                 ldap.yaml      passwd-            systemd
deluser.conf            ld.so.cache    perl               terminfo
dhcp                    ld.so.conf     profile            timezone
dictionaries-common     ld.so.conf.d   profile.d          tmpfiles.d
dpkg                    libaudit.conf  python3            ucf.conf
emacs                   locale.alias   python3.7          update-motd.d
environment             locale.gen     rc0.d              vim
fonts                   localtime      rc1.d              wmcs-project
fstab                   login.defs     rc2.d              X11
gai.conf                logrotate.d    rc3.d              xattr.conf
group                   machine-id     rc4.d              xdg
group-                  mailcap        rc5.d
gshadow                 mailcap.order  rc6.d
gshadow-                mime.types     rcS.d
pod "interactive" deleted
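The proof of concept above could be wrapped in a script roughly as follows. This is a sketch under assumptions, not the code from the eventual patch: the helper name `build_kubectl_run` is hypothetical, the flags are copied from the command shown above, and the idea of dropping `--tty=true` when no terminal is attached is an illustration of how a scripted (non-interactive) call might differ from an interactive one.

```python
import shlex
import sys


def build_kubectl_run(image, command, home, interactive=None):
    """Build an argv list mirroring the kubectl proof of concept.

    interactive=None means "decide from whether stdin is a terminal".
    """
    if interactive is None:
        interactive = sys.stdin.isatty()
    argv = [
        "/usr/bin/kubectl", "run", "interactive",
        "--image={}".format(image),
        "--restart=Never",
        "--command=true",
        "--env=HOME={}".format(home),
        "--labels=toolforge=tool",
        "--rm=true",
        "--stdin=true",
    ]
    if interactive:
        # Assumption: only request a TTY when a terminal is attached.
        argv.append("--tty=true")
    argv.append("--")
    argv.extend(command)
    return argv


# Print the command instead of running it, so the sketch has no side effects.
argv = build_kubectl_run(
    "docker-registry.tools.wmflabs.org/toolforge-python37-sssd-base:latest",
    ["ls", "/etc"],
    home="/data/project/example",  # hypothetical tool home directory
    interactive=False,
)
print(" ".join(shlex.quote(a) for a in argv))
```

Passing the resulting list to `subprocess.call` would run the command; building a list rather than a shell string avoids quoting problems with the user's command.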

Change 621776 had a related patch set uploaded (by BryanDavis; owner: Bryan Davis):
[operations/software/tools-webservice@master] Make webservice shell scriptable

https://gerrit.wikimedia.org/r/621776

gerritbot added a project: Patch-For-Review. · Aug 21 2020, 5:45 PM

Stashbot added a comment. · Sep 17 2020, 9:55 PM

This comment was removed by bd808.

In T169695#6403030, @gerritbot wrote:

Change 621776 had a related patch set uploaded (by BryanDavis; owner: Bryan Davis):
[operations/software/tools-webservice@master] Make webservice shell scriptable

https://gerrit.wikimedia.org/r/621776

This change works, but it is a bit less stable than the current method. When the interactive shell is left open for a long period of time (think hours, not minutes), kubectl can get disconnected from the running pod's container. When this happens, the pod is "leaked" in the tool's Kubernetes namespace and must be manually killed (kubectl delete pod ...). This could be confusing for folks who are not really aware of how webservice and kubectl interact yet as with a few leaks their namespace's quota for running new pods would be exhausted.
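To make the manual cleanup described above more concrete, here is a small sketch that lists candidate shell pods from the JSON that `kubectl get pods -o json` prints. The `shell-` name prefix is an assumption based on pod names that appear in transcripts elsewhere in this task, and `find_shell_pods` is a hypothetical helper. A shell that is still in use would match as well, so the result is only a list of candidates to inspect before deleting anything.

```python
import json


def find_shell_pods(pod_list_json, prefix="shell-"):
    """Return names of pods whose name starts with prefix.

    pod_list_json is the text output of `kubectl get pods -o json`.
    """
    pods = json.loads(pod_list_json)
    names = []
    for item in pods.get("items", []):
        name = item["metadata"]["name"]
        if name.startswith(prefix):
            names.append(name)
    return names


# Made-up sample data in the shape of a Kubernetes pod list.
sample = json.dumps({"items": [
    {"metadata": {"name": "shell-1600000000"}, "status": {"phase": "Running"}},
    {"metadata": {"name": "example-5d8f7c9b4-abcde"}, "status": {"phase": "Running"}},
]})

for name in find_shell_pods(sample):
    # Print the cleanup command rather than running it.
    print("kubectl delete pod {}".format(name))
```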

I'm not sure what balance to aim for between making it possible to script webservice shell ... commands from the bastions (which the patch does) and making the webservice ... experience as intuitive as possible. @Tarrow, @Legoktm, @zhuyifei1999, @Bstorm: what are your thoughts?

Change 621776 merged by jenkins-bot:

[operations/software/tools-webservice@master] Make `webservice shell` scriptable

https://gerrit.wikimedia.org/r/621776

Maintenance_bot removed a project: Patch-For-Review. · Sep 26 2021, 3:10 PM

tools.ldap@tools-sgebastion-08:~$ webservice python3.9 shell -- python3 --version
Python 3.9.2
Exception ignored in: <function WeakValueDictionary.__init__.<locals>.remove at 0x7f1d0e918ae8>
Traceback (most recent call last):
  File "/usr/lib/python3.5/weakref.py", line 117, in remove
TypeError: 'NoneType' object is not callable

The error appears to be coming from webservice itself.

In T169695#7466295, @Legoktm wrote:

tools.ldap@tools-sgebastion-08:~$ webservice python3.9 shell -- python3 --version
Python 3.9.2
Exception ignored in: <function WeakValueDictionary.__init__.<locals>.remove at 0x7f1d0e918ae8>
Traceback (most recent call last):
  File "/usr/lib/python3.5/weakref.py", line 117, in remove
TypeError: 'NoneType' object is not callable

The error appears to be coming from webservice itself.

My current untested guess is that this error comes from the Popen.wait() call, and it may very specifically be a Python 3.5 bug: https://github.com/python/cpython/commit/9cd7e17640a49635d1c1f8c2989578a8fc2c1de6
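For readers unfamiliar with the call named in that guess, this is a minimal sketch of the general pattern: a wrapper starts a child process with `subprocess.Popen`, blocks in `wait()` until it finishes, and passes the child's exit status on. It assumes nothing about how webservice is actually structured, `run_and_wait` is a hypothetical name, and it does not attempt to reproduce the Python 3.5 message.

```python
import subprocess
import sys


def run_and_wait(argv):
    """Start argv as a child process and return its exit status."""
    proc = subprocess.Popen(argv)
    # wait() blocks until the child exits and returns its return code.
    return proc.wait()


if __name__ == "__main__":
    # Use the current interpreter as a stand-in for the real child command.
    code = run_and_wait([sys.executable, "-c", "import sys; sys.exit(3)"])
    print(code)
```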

Just noting here that I'm seeing this WeakValueDictionary as of late whenever I exit a webservice shell. It's not deterministic, but happens most times.

home$ ssh tools-login

krinkle at tools-sgebastion-07.tools.eqiad.wmflabs in ~$ become intuition

# kubernetes, php7.3
tools.intuition at tools-sgebastion-07.tools.eqiad.wmflabs in ~$ webservice shell

tools.intuition at shell-1641484967$ echo 1
1
# up arrow, ctrl-C, then type exit
tools.intuition at shell-1641484967$ echo 1^C
tools.intuition at shell-1641484967$ exit
logout
pod tool-intuition/shell-1641484841 terminated (Error)
Exception ignored in: <function WeakValueDictionary.__init__.<locals>.remove at 0x7f911920fc80>
Traceback (most recent call last):
  File "/usr/lib/python3.5/weakref.py", line 117, in remove
TypeError: 'NoneType' object is not callable

Other times, it prints a line about a pod being terminated:

[16:03 UTC] tools.intuition at shell-1641484999 in ~
$ exit
logout
pod tool-intuition/shell-1641484999 terminated (Error)

[16:03 UTC] tools.intuition at tools-sgebastion-07.tools.eqiad.wmflabs in ~
38s 130 $

And rarely, it just exits quietly, which is what I'd expect the other times as well:

tools.intuition at tools-sgebastion-07.tools.eqiad.wmflabs in ~$ webservice shell
$
tools.intuition at shell-1641485197 in ~$ exit
logout
tools.intuition at tools-sgebastion-07.tools.eqiad.wmflabs in ~$

In T169695#7601882, @Krinkle wrote:

Just noting here that I'm seeing this WeakValueDictionary as of late whenever I exit a webservice shell. It's not deterministic, but happens most times.

It is a python bug in the version of python shipped with Buster. It's harmless. See T169695#7466946 for details.

bd808 mentioned this in T358999: [webservice,toolforge-cli] `toolforge webservice TYPE shell -- something` does not pass extra cli arguments like `webservice TYPE shell -- something` does. · Mar 3 2024, 4:45 PM

Albertoleoncio subscribed. · Mon, Apr 15, 10:20 PM

The Jobs framework does this work now: https://wikitech.wikimedia.org/wiki/Help:Toolforge/Jobs_framework

Restricted Application added a project: User-bd808. · Wed, Apr 17, 10:06 PM
