Page MenuHomePhabricator

Use type hints to validate user configuration variables
Open, Stalled, MediumPublic

Description

Refactor _check_user_config_types to infer allowed types from type hints instead of runtime types. Remove hardcoded special cases.

Idea:

>>> typehints = getattr(config, '__annotations__', {})
>>> typehints['putthrottle']
'int | float'  # <-- to be parsed

Event Timeline

Xqt triaged this task as Medium priority.Feb 2 2026, 10:06 AM

Change #1235838 had a related patch set uploaded (by Xqt; author: Xqt):

[pywikibot/core@master] Tests: Improvements or workarounds for tests

https://gerrit.wikimedia.org/r/1235838

Change #1235838 merged by jenkins-bot:

[pywikibot/core@master] Tests: Improvements or workarounds for tests

https://gerrit.wikimedia.org/r/1235838

@Xqt I have started analyzing the codebase for this refactor.

Since from future import annotations is imported at the top of config.py, annotations will store types as strings (e.g., 'int | float') rather than actual type objects.

My proposed plan:

Standardize: Explicitly add the type hint to socket_timeout (currently tuple | int | float) so we can remove its hardcoded exception block.

Resolve: In _check_user_config_types, I will iterate through globals()['__annotations__'].

Parse: I plan to implement a lightweight resolution logic to convert the annotation strings back to types:

    Split Unions by |.

    Strip generic subscripts (e.g., treat dict[str, str] as just dict for runtime checking) to avoid complex parsing of nested types.

    Resolve the names using globals() (for imported types) and builtins (for int, str, etc.).

Does this approach regarding stripping generics and manual string resolution align with your expectations, or is there an existing internal utility I should use instead?

You plan looks good. Another idea would be to remove the from __future__ import annotations line. In that case, you get the classes directly without parsing. For example:

>>> hint = config.__annotations__['put_throttle']
>>> hint.__args__
(<class 'int'>, <class 'float'>)
>>> config.__annotations__['authenticate']
>>> hint.__origin__
<class 'dict'>

You should test __origin__ first. This will fail with AttributeError for UnionTypes. In which case __args__ is the correct choice.

Unfortunately, we still support Python 3.9 (until the end of this year) and PEP604 was implemented in Python 3.10. Therefore, for Python 3.9 compatibility, all unions must be written as Union[] or Optional[].

The import is per file and there are only 11 occurrences to fix in config.py (you can check with pre-commit run mypy -a for this check) . It's up to you which approach is the best : either use the annotations directly or implement your own parser. I propose using the first approach, because deferred evaluation (without anotations import) is the default in Python 3.14+, and storing annotation as strings will be eventually be removed [1].

That makes perfect sense. I will proceed with the first approach (removing the future import).

My Plan:

Remove from __future__ import annotations.

Update all type hints in config.py to use Union[...] and Optional[...] instead of | to maintain Python 3.9 compatibility.

Implement _check_user_config_types using __origin__ and __args__ introspection as suggested.

I'll get started on the refactor now.

There is another thing when removing the future import:

if TYPE_CHECKING:
    _DabComDict = collections.defaultdict[str, dict[str, str]]

leads to an error. But you can remove the TYPE_CHECKING condition then.

Use pre-commit to check your changes. pre-commit run -a for all hooks and pre-commit run mypy -a for type checking only.

Implement _check_user_config_types using __origin__ and __args__ introspection as suggested.

Maybe you should keep the old behaviour as fallback.

Change #1237281 had a related patch set uploaded (by Anotida Expected; author: Anotida Expected):

[pywikibot/core@master] Refactor: Use runtime introspection for config type checking

https://gerrit.wikimedia.org/r/1237281

@Xqt can you give advice which issue to work on next , seems no good-first-issue open in pywikibot.

Hi @Xqt, thanks for the detailed guidance.
I see that T416145 is already in progress.
Please let me know which task you’d recommend I pick up next, and I’ll proceed step by step.

Please let me know which task you’d recommend I pick up next, and I’ll proceed step by step.

Your T415976 is not completed. Are you still working on it?

@Xqt can you give advice which issue to work on next , seems no good-first-issue open in pywikibot.

https://phabricator.wikimedia.org/maniphest/query/TEgpKROYaSfD/#R

Hi @Xqt,

Thanks for checking.

I reverted my local changes for T415976 and ran the addwikis tool as suggested:
python pwb.py addwikis -family:wikisource ur

The tool reports that 'ur' is already present in the wikisource family and that
there are no wikis to add, so it does not generate any changes.

I am not actively working on T415976 anymore. Please let me know if you would
like me to take any further action there, or if the task can be considered
complete from my side.

@Xqt, Patch set 4 for T416145 is ready for review! It was a bit of a challenge, but your guidance made the process much smoother, and I really enjoyed the learning curve. I’ll keep an eye on the workboard for my next task, but please let me know if there are any specific priorities you’d like me to tackle next.

JJMC89 subscribed.

I'm not a fan of using type annotations at runtime. That said, this needs tests that cover all branches of _check_user_config_types if you're going to implement it.

Postponed until Pywikibot 12 is released and Python 3.9 is dropped