Installed python modules not persisted across restarts of a user's PAWS server
Closed, Invalid (Public)

Description

I am using the colorama module for Python 3 along with pywikibot, and I install it with the command pip install colorama. After a few hours of shutdown, when I return, colorama is no longer installed and the terminal throws "ModuleNotFoundError". How can I fix this issue?

Sample code:

# Import the standard-library and pywikibot modules the script needs
import csv, time, subprocess, pywikibot, re
# Import the color helpers (requires: pip install colorama)
from colorama import init, Back, Style
#init(autoreset=True)

# Wait time
WAIT_TIME = 5

# Read the CSV file as input; fields are separated by "~"
with open('0-123-wikt.csv', 'r') as csvfile:
    reader = csv.reader(csvfile, delimiter="~")
    for row in reader:
        # Remove the first line of the CSV (disabled)
        #subprocess.call("sed -i `` 1d 0-123-wikt.csv",shell=True)
        # Strip surrounding whitespace from the wiktHeader
        wiktHeader1 = row[0].strip()#.decode('utf-8')
        print('\n' + 'step-1: ta.wiktHeader = ' + Back.BLUE + wiktHeader1 + Style.RESET_ALL)

Event Timeline

bd808 added a subscriber: bd808. (Jul 13 2018, 6:31 PM)

I was able to reproduce with this process:

  1. go to https://paws.wmflabs.org/
  2. open a new Terminal session
  3. pip install colorama
  4. python -c 'import colorama; print(dir(colorama))'
  5. close Terminal session
  6. click "Stop My Server" button # If I did not do this, things kept working in the next terminal session
  7. click "Start My Server" button
  8. python -c 'import colorama; print(dir(colorama))'
  9. ModuleNotFoundError: No module named 'colorama'

The explicit stop/start just saved me from having to wait for PAWS to shut down my idle pod to get the reproduction.

I'm kind of wondering if this "clean slate" that happens when a new pod is started is an intentional design choice, however. If the same virtualenv were persisted to long-term storage, it would grow over time to contain arbitrary libraries that the user might forget they had installed. This would in turn make their environment increasingly unique and make it harder for them to share their process and results with others.

If you were working in a notebook rather than an interactive shell, you would typically add something like this to the start of your work to make sure that colorama was available:

# Install colorama in the current Jupyter kernel
import sys
!{sys.executable} -m pip install colorama

This will either install the package or print something like Requirement already satisfied: colorama in /srv/paws/lib/python3.6/site-packages. I think the same workflow should reasonably be expected for terminal sessions. Every time you start your terminal you should install the libraries you will need (probably from a requirements.txt file or something similar). This will either install the libraries or exit gracefully because they are already installed.
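For a plain Python script run from the terminal, the same "install if missing" idea could be sketched roughly as below. This is only an illustration, not something PAWS provides: the helper names missing_modules and ensure_modules are made up here, and the sketch assumes each import name matches its pip package name (true for colorama, not for every package).

```python
import importlib.util
import subprocess
import sys


def missing_modules(module_names):
    """Return the names from module_names that cannot be imported right now."""
    return [name for name in module_names
            if importlib.util.find_spec(name) is None]


def ensure_modules(module_names):
    """Install any missing modules with pip, using the running interpreter.

    Assumption: each import name is also the package name to install,
    which holds for colorama but not for every package.
    """
    missing = missing_modules(module_names)
    if missing:
        # Same form as the notebook command: <this python> -m pip install ...
        subprocess.check_call([sys.executable, "-m", "pip", "install", *missing])
    return missing
```

Calling ensure_modules(["colorama"]) at the top of a script, before the from colorama import ... line, would install the module after a server restart and do nothing when it is already present.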

bd808 renamed this task from "PAWS needs repeated installation to avoid ModuleNotFoundError for colorama module in python3" to "Installed python modules not persisted across restarts of a user's PAWS server". (Jul 13 2018, 6:32 PM)
Chicocvenancio closed this task as Invalid. (Jul 18 2018, 8:20 PM)
Chicocvenancio added a subscriber: Chicocvenancio.

@Info-farmer that is expected behavior. Any restart of the user server means a return to a fresh state. I understand this creates some inconvenience, and because PAWS ran for some time without idle server restarts, this change did break some users' workflows. This is a necessary step to have PAWS working reliably, however.

Besides bd808's suggestion for notebooks, you can also use venv to create a Python virtual environment and activate it, so that you do not have to go through the package install each time. See https://docs.python.org/3/library/venv.html for details.
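As a rough sketch of the venv approach, the standard-library venv module can also create the environment from Python. The helper names create_env and env_python are hypothetical. The sketch also assumes the environment is created somewhere that survives a server restart, such as the user's home directory, which is not confirmed above.

```python
import sys
import venv
from pathlib import Path


def env_python(env_dir):
    """Return the path of the Python interpreter inside a venv directory."""
    env_dir = Path(env_dir)
    if sys.platform == "win32":
        return env_dir / "Scripts" / "python.exe"
    return env_dir / "bin" / "python"


def create_env(env_dir, with_pip=True):
    """Create a virtual environment at env_dir unless one already exists."""
    env_dir = Path(env_dir)
    # pyvenv.cfg is written by venv, so its presence marks an existing env
    if not (env_dir / "pyvenv.cfg").exists():
        venv.create(str(env_dir), with_pip=with_pip)
    return env_python(env_dir)
```

For example, create_env(Path.home() / "my-venv") would return the path of the environment's interpreter (my-venv is a placeholder name). Packages installed with that interpreter's -m pip install colorama then live inside the environment directory. In a terminal the equivalent is python3 -m venv followed by sourcing the environment's bin/activate script.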