The dataset_metrics notebook should run as a Python kernel
Closed, Resolved · Public

Assigned To
	sdkim

Authored By
	gmodena
	Mar 26 2021, 12:03 PM

Description

Currently this notebook uses a PySpark kernel to initialise a SparkSession and pre-populate the global notebook state with a spark object. This approach is being deprecated in favour of manual SparkSession initialisation (e.g. via wmfdata).

We should update the notebook so that it no longer depends on the PySpark kernel, which also future-proofs it against the deprecation.
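A minimal sketch of what manual initialisation could look like in a notebook cell, using plain PySpark rather than wmfdata. The helper names (spark_settings, get_spark_session), the app name and the local[2] master are assumptions made for illustration; they are not taken from the notebook or from algorithm.ipynb, and a cluster run would need different settings.

```python
from typing import Dict, Optional


def spark_settings(app_name: str, master: str = "local[2]",
                   extra: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Collect Spark settings as plain key/value pairs (hypothetical helper)."""
    settings = {"spark.app.name": app_name, "spark.master": master}
    settings.update(extra or {})
    return settings


def get_spark_session(settings: Dict[str, str]):
    """Create (or reuse) a SparkSession from the given settings."""
    # Imported lazily so the settings helper works without pyspark installed.
    from pyspark.sql import SparkSession

    builder = SparkSession.builder
    for key, value in settings.items():
        builder = builder.config(key, value)
    return builder.getOrCreate()


# Assumed values; adjust the app name and master for the real environment.
settings = spark_settings("dataset_metrics")

# In the notebook, this line would replace the `spark` object that the
# PySpark kernel used to inject into the global state:
# spark = get_spark_session(settings)
```

Keeping the settings in a plain dictionary makes them easy to inspect or override in a later cell before the session is created.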

Acceptance criteria
The Jupyter kernel is set to Python.
The SparkSession is manually initialised in a notebook cell.
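The first criterion comes down to the kernelspec entry in the notebook's JSON metadata. The sketch below shows one way to switch it to a Python kernel programmatically; the set_python_kernel helper, the "pyspark" kernel name and the example notebook are hypothetical, and the same change can be made from the Jupyter kernel menu.

```python
import json


def set_python_kernel(notebook: dict) -> dict:
    """Return a copy of the notebook JSON with a Python 3 kernelspec."""
    updated = json.loads(json.dumps(notebook))  # deep copy via JSON round trip
    metadata = updated.setdefault("metadata", {})
    metadata["kernelspec"] = {
        "name": "python3",
        "display_name": "Python 3",
        "language": "python",
    }
    return updated


# Hypothetical minimal notebook that still points at a PySpark kernel.
example = {
    "cells": [],
    "metadata": {
        "kernelspec": {
            "name": "pyspark",
            "display_name": "PySpark",
            "language": "python",
        }
    },
    "nbformat": 4,
    "nbformat_minor": 4,
}

fixed = set_python_kernel(example)
print(fixed["metadata"]["kernelspec"]["name"])
```

For a real file, the dictionary would be loaded with json.load from the .ipynb file and written back with json.dump after the change.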

Notes
We already do this in algorithm.ipynb, which can serve as an example.

Event Timeline

gmodena created this task. (Mar 26 2021, 12:03 PM)

Restricted Application added a subscriber: Aklapper. (Mar 26 2021, 12:03 PM)

gmodena moved this task from Backlog to Done on the Platform Team Workboards (Image Suggestion API) board. (Apr 6 2021, 6:12 PM)

sdkim closed this task as Resolved. (Jun 1 2021, 5:59 PM)

sdkim claimed this task.
