
Extract parsing functions into a testable module
Closed, Resolved, Public

Description

The logic for extracting data from WDumper HTML pages is currently embedded in notebook cells alongside unrelated code, making it untestable and hard to reuse. This task moves those functions into a dedicated module with tests that verify their behaviour independently of any HTTP calls, and updates the notebook to use them.

Acceptance Criteria:

  • src/wdumps_scraper/parsing.py contains extract_last_id(html: str) -> int, extract_name(html: str) -> str, and extract_filters(html: str) -> dict
  • Each function has at least one passing test in tests/test_parsing.py using a plain HTML string — no network calls, no global state
  • extract_filters returns an empty dict (not "") when no filter data is present
  • The notebook calls these functions instead of containing the extraction logic inline, and produces the same output as before
  • ruff check src/ tests/ and pytest both pass
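
A minimal sketch of how the module and its tests might look. The three function signatures come from the criteria above; everything else is an assumption made for illustration. In particular, the markup patterns (dump links of the form /dump/<id>, the name in an <h1>, the filter spec as JSON inside <pre id="spec">) are hypothetical, since this task does not describe the real WDumper page structure, and raising ValueError when no id is found is a design choice, not a stated requirement.

```python
import json
import re
from html import unescape

# ASSUMPTION: these patterns describe a hypothetical page layout.
# Replace them with whatever the real WDumper HTML actually contains.
_DUMP_LINK = re.compile(r'href="/dump/(\d+)"')
_TITLE = re.compile(r"<h1[^>]*>(.*?)</h1>", re.DOTALL)
_SPEC = re.compile(r'<pre id="spec">(.*?)</pre>', re.DOTALL)


def extract_last_id(html: str) -> int:
    """Return the highest dump id linked from the page."""
    ids = [int(raw) for raw in _DUMP_LINK.findall(html)]
    if not ids:
        # Hypothetical choice: fail loudly rather than return a sentinel.
        raise ValueError("no dump links found in HTML")
    return max(ids)


def extract_name(html: str) -> str:
    """Return the dump name, or an empty string if no heading is present."""
    match = _TITLE.search(html)
    return match.group(1).strip() if match else ""


def extract_filters(html: str) -> dict:
    """Return the filter spec as a dict; an empty dict when none is present."""
    match = _SPEC.search(html)
    if match is None:
        return {}
    try:
        data = json.loads(unescape(match.group(1)))
    except json.JSONDecodeError:
        return {}
    # Guard against valid JSON that is not an object (e.g. a list or "").
    return data if isinstance(data, dict) else {}


# Tests in the style of tests/test_parsing.py: plain HTML strings only,
# no network calls and no global state beyond this constant fixture.
SAMPLE = (
    "<h1> Humans </h1>"
    '<a href="/dump/12">a</a><a href="/dump/7">b</a>'
    '<pre id="spec">{&quot;entities&quot;: []}</pre>'
)


def test_extract_last_id():
    assert extract_last_id(SAMPLE) == 12


def test_extract_name():
    assert extract_name(SAMPLE) == "Humans"


def test_extract_filters_present():
    assert extract_filters(SAMPLE) == {"entities": []}


def test_extract_filters_absent_returns_empty_dict():
    assert extract_filters("<html><body></body></html>") == {}
```

Because each function takes the HTML as a string argument, the notebook can keep doing the HTTP fetch itself and pass the response text in, which is what lets the tests run without any network access.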
