Identify:
- Data transformations that have been copied and pasted across notebooks
- Input data that either exist in a production DAG or should exist in one
- Code that exists in another pipeline or a different Python package and could be shared for risk observatory
Document the findings and review them with Pablo to verify correctness.
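
One possible way to get a first pass at the copy/pasted transformations is to hash the code cells of every notebook and report cells that appear in more than one file. The sketch below is only a starting point under some assumptions: the notebooks are standard `.ipynb` JSON files under a single directory, exact duplicates (after stripping whitespace and comment-only lines) are what matters, and the function names and the `min_lines` threshold are hypothetical choices, not part of any existing tooling.

```python
import hashlib
import json
from collections import defaultdict
from pathlib import Path


def normalize(source):
    """Strip whitespace and drop blank or comment-only lines so trivial edits do not hide a copy."""
    lines = (line.strip() for line in source.splitlines())
    return "\n".join(line for line in lines if line and not line.startswith("#"))


def find_duplicate_cells(root, min_lines=3):
    """Return {hash: [(notebook path, cell index), ...]} for code cells found in more than one notebook."""
    seen = defaultdict(list)
    for path in sorted(Path(root).rglob("*.ipynb")):
        # Skip Jupyter autosave copies, which would otherwise look like duplicates.
        if ".ipynb_checkpoints" in path.parts:
            continue
        notebook = json.loads(path.read_text(encoding="utf-8"))
        for index, cell in enumerate(notebook.get("cells", [])):
            if cell.get("cell_type") != "code":
                continue
            source = cell.get("source", "")
            # The notebook format stores source as either a string or a list of lines.
            if isinstance(source, list):
                source = "".join(source)
            body = normalize(source)
            # Ignore very short cells (assumed threshold) to avoid noise such as bare imports.
            if body.count("\n") + 1 < min_lines:
                continue
            digest = hashlib.sha256(body.encode("utf-8")).hexdigest()
            seen[digest].append((str(path), index))
    # Keep only cells that occur in at least two different notebooks.
    return {
        digest: locations
        for digest, locations in seen.items()
        if len({path for path, _ in locations}) > 1
    }
```

Calling `find_duplicate_cells("notebooks/")` (the directory name is a placeholder) returns the locations of each repeated cell, which can be pasted into the findings document. It will miss near-duplicates where variable names or parameters were changed, so the output is a list of candidates to review by hand, not a complete inventory.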