Running run-pipeline.sh fails with an import error in utils.py:
from src.MySqlDict import MySqlDict
ModuleNotFoundError: No module named 'src'
This happens, for example, when generate_backtesting_eval.py imports functions from utils.py:
from utils import getLinks
from utils import process_page
The reason is that the Python script is run from src/scripts/, so the top-level package src is not on Python's module search path and the import of src.MySqlDict in utils.py fails.
This was introduced in this patch: https://gerrit.wikimedia.org/r/c/research/mwaddlink/+/677835
A quick fix would be to append the repository root to Python's sys.path in utils.py before the import:
import sys
sys.path.append("../../")
from src.MySqlDict import MySqlDict
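Note that "../../" is resolved against the current working directory, so the quick fix only works when the script is started from src/scripts/. A minimal sketch of a variant that does not depend on the working directory is given here. It assumes utils.py sits two directories below the repository root (src/scripts/utils.py); the helper name add_repo_root_to_path and its levels_up parameter are hypothetical and not part of the mwaddlink code base.

```python
import sys
from pathlib import Path


def add_repo_root_to_path(file_path, levels_up=2):
    """Put an ancestor directory of file_path on sys.path and return it.

    levels_up counts directories above the one containing file_path.
    For <root>/src/scripts/utils.py, levels_up=2 yields <root>
    (assumed layout, see the note above).
    """
    root = Path(file_path).resolve().parents[levels_up]
    root_str = str(root)
    # Avoid adding the same entry twice when several modules call this.
    if root_str not in sys.path:
        sys.path.insert(0, root_str)
    return root


# Intended use at the top of utils.py (hypothetical):
#   add_repo_root_to_path(__file__)
#   from src.MySqlDict import MySqlDict
```

Because the root is derived from the file's own location, the import should behave the same whether the script is started from the repository root, from src/scripts/, or from elsewhere.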
However, I think there should be a better fix to make sure we define the python-paths properly for importing functions.
@Tgr suggested adding something like the following to the scripts:
DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" &> /dev/null && pwd )"
export PYTHONPATH=$(dirname $(dirname "$DIR"))