This question has arisen a few times and been discussed on other tasks: T334623, T353953.
On those tasks, we've discussed separate solutions for each feature, since CheckUser and AbuseFilter save user information and/or create logs using their own tables. However, some features log via core's `ManualLogEntry` (e.g. SpamBlacklist - T358806), so a central solution is also needed.
==== Background
With temp accounts enabled, a logged-out person can do something on the wiki that creates a log entry with them as the performer, but that does not create a temporary account. Who should be logged as the performer?
We can't log the IP (the current status quo), because we can't save a new IP actor in MediaWiki (ActorStore throws an error).
We can't log a temporary account because it hasn't been made yet.
==== Possible solutions
**Log the temporary user as the performer**
Options 2 and 3 from T334623#9653885:
>>! In T334623#9653885, @kostajh wrote:
> **2. Create account at beginning of `internalAttemptSave` and don't worry about detached user accounts**
>
> - Create account at beginning of `internalAttemptSave`, before any hooks/constraints
>
> Advantages:
>
> - Creates a user for an edit attempt which allows for easier logging constraint checks
>
> Disadvantages:
>
> - Something on the order of 55k accounts created per month, of which some percentage won't be used at all
> - Poor UX for users who failed to edit (but are actually logged-in). They'll see some form of CentralAuth / session errors, unless we implement some fix around it. They'll also not be able to visibly see that they're logged in on receiving an error (unless we do something about that too)
> - Inability for these users to be seen as the same user when visiting other wikis
>
> **3. Create account at beginning of `internalAttemptSave` and find way to perform top-level login on failure**
>
> - Create account at beginning of `internalAttemptSave`, before any hooks/constraints
> - Perform a top-level login on failed edits
>
> Advantages:
>
> - Same as 2, but we also ensure the user is attached to loginwiki, and that there is an improved UX in failure modes
>
> Disadvantages:
>
> - Something on the order of 55k accounts created per month, of which some percentage won't be used at all
> - Wikitext editors lose their changes on failed attempts (due to top level redirect)
> - Modifying EditPage.php to support parsing error messages into query params for top-level login will be painful and clunky. Same goes for API driven editing.
> - Would need to modify wikitext editor UX and API driven editing (VisualEditor/DiscussionTools/Wikibase) UX to support the top-level redirect workflows on edit failures
>
> ---
>
> It may be the case in the future that CentralAuth's SSO changes (T345249, T348388), in which case option 3 becomes more palatable, and would require less refactoring.
**Log the IP actor as the performer**
This would involve saving the IP address to core's actor table. I don't think this is a good idea, but I'm including it here for completeness.
The actor table would need to be regularly purged to remove the names. MediaWiki would go back to saving IP addresses by default (whereas without this, you'd only save IP addresses in special extensions like CheckUser and AbuseFilter). Anything that joins on the actor table would need to be updated to handle a missing name.
**Define a new type of actor**
This new type of actor would have a row in the actor table, but not in the user table (like IP actors and system actors). This idea is explored in T334623#9656689.
Conceptually, we do have a new type of actor: one that represents a real person (so can't be a system user[1]), who does something loggable (so must have a row in `actor`), but can't be an IP actor (because the IP can only be kept temporarily), and can't be a temp user (because they didn't do a `$wgAutoCreateTempUser['actions']` action).
Actor/user "types" in MediaWiki are currently defined by checking their usernames (until T336176) - either by matching them against a regex (IP, temp) or against config (system). In practice, a new actor type might therefore mean recognizing a new username pattern, or solving T336176 first.
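To make the name-based approach concrete, here is a rough sketch (in Python rather than PHP, purely for illustration) of how classifying an actor by username might look if a new pattern were added. Everything specific in it is an assumption: the temp account pattern, the list of system users, and especially the `~unsaved-` prefix for the new actor type are hypothetical and not taken from MediaWiki or from any of the linked tasks.

```python
import ipaddress
import re
from enum import Enum


class ActorType(Enum):
    IP = "ip"
    TEMP = "temp"
    SYSTEM = "system"
    NAMED = "named"
    # Hypothetical new type: a real person who did something loggable
    # but never triggered temp account creation.
    LOGGED_OUT_PERFORMER = "logged-out performer"


# Assumed patterns and config, for illustration only.
TEMP_PATTERN = re.compile(r"~\d{4}-\d+(-\d+)*")
NEW_TYPE_PATTERN = re.compile(r"~unsaved-\d+")  # hypothetical naming scheme
SYSTEM_USERS = {"Maintenance script", "MediaWiki default"}


def is_ip(name: str) -> bool:
    """Return True if the name parses as an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(name)
        return True
    except ValueError:
        return False


def classify_actor(name: str) -> ActorType:
    """Classify an actor purely from its name, checking each type in turn."""
    if is_ip(name):
        return ActorType.IP
    if name in SYSTEM_USERS:
        return ActorType.SYSTEM
    if NEW_TYPE_PATTERN.fullmatch(name):
        return ActorType.LOGGED_OUT_PERFORMER
    if TEMP_PATTERN.fullmatch(name):
        return ActorType.TEMP
    return ActorType.NAMED
```

The sketch also shows the main cost of this approach: every caller that distinguishes user types by name would need to learn the new pattern, which is part of why solving T336176 first may be preferable.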
----
[1] We explored representing these actions using a new system user, but decided that it would be confusing to have a single system user represent a very large number of actions performed by lots of real users - e.g. CheckUser checks would reveal many unrelated IP addresses for what looked like one single user, and analytics would find that one user did a high proportion of actions.