
Sockpuppet API: Ensure data consistency across service instances
Closed, Declined | Public | 1 Estimated Story Points

Description

In the current data model, global data structures are used to hold state (input datasets).

These are mutated at run time, for example when new edit data becomes available. This means that these structures serve both as storage
and as a cache for hot changes. To the best of my knowledge, these caches are never flushed and are regenerated only when the application restarts.

The use of globals in Flask is neither thread-safe nor process-safe. Generally speaking, WSGI (or WSGI-like) servers spawn multiple worker processes, and state held in globals is not shared between them. We should ensure that, once the backend moves to a database, these updates are recorded and shared across the service hosts.
We should also make sure that re-runs of the ETL pipeline produce datasets consistent with the application caches (arguably, we should make the service stateless).
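
To illustrate the stateless direction suggested above, here is a minimal sketch in which new edit data is written to a database instead of being appended to a module-level global. It is not taken from the service's code: the `edits` table, its columns, and the functions `record_edit` and `pages_edited_by` are hypothetical names, and SQLite (from the Python standard library) is only a stand-in for whichever database the backend moves to. A file-based SQLite database is visible to every worker process on one host; sharing across hosts would need a networked database.

```python
import sqlite3

# Hypothetical schema, for illustration only: one row per recorded edit.
SCHEMA = """
CREATE TABLE IF NOT EXISTS edits (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    username TEXT NOT NULL,
    page TEXT NOT NULL
)
"""


def connect(db_path):
    """Open a connection and make sure the (hypothetical) table exists."""
    conn = sqlite3.connect(db_path)
    conn.execute(SCHEMA)
    conn.commit()
    return conn


def record_edit(db_path, username, page):
    """Persist a new edit instead of mutating a module-level global."""
    conn = connect(db_path)
    try:
        with conn:  # commits on success, rolls back on an exception
            conn.execute(
                "INSERT INTO edits (username, page) VALUES (?, ?)",
                (username, page),
            )
    finally:
        conn.close()


def pages_edited_by(db_path, username):
    """Read state from the database on each call; nothing is kept in process memory."""
    conn = connect(db_path)
    try:
        rows = conn.execute(
            "SELECT page FROM edits WHERE username = ? ORDER BY id",
            (username,),
        ).fetchall()
    finally:
        conn.close()
    return [page for (page,) in rows]
```

Because every request handler opens its own connection and reads from the database, any worker process sees updates written by any other, and a restart loses nothing. How such a table would be reconciled with re-runs of the ETL pipeline is left open here.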

Event Timeline

Naike set the point value for this task to 1. (Dec 10 2020, 2:18 PM)
CCicalese_WMF renamed this task from Ensure data consistency across service instances to Sockpuppet API: Ensure data consistency across service instances. (Feb 24 2021, 4:27 PM)
Aklapper added a subscriber: hnowlan.

@hnowlan: Removing task assignee as this open task has been assigned for more than two years - See the email sent to task assignee on February 22nd, 2023.
Please assign this task to yourself again if you still realistically [plan to] work on this task - it would be welcome! :)
If this task has been resolved in the meantime, or should not be worked on by anybody ("declined"), please update its task status via "Add Action… 🡒 Change Status".
Also see https://www.mediawiki.org/wiki/Bug_management/Assignee_cleanup for tips on how to best manage your individual work in Phabricator. Thanks!