Find out additional datapoints on deleted wikis
Closed, Resolved, Public

Description

To make a final decision on whether we should implement the feedback feature upon instance deletion, one last datapoint is needed, as Tom suggested in the parent ticket.

We thought we should ask you (@Charlie_WMDE), but maybe it's also meaningful to split by the usernames who did the deleting. For example, more than 95% of the Jan 2023 deletion spike was due to a single user.

I would like to see the deleted wikis split by username, from 10-2023 up until now.

Event Timeline

I'm not entirely sure what exactly the desired outcome here is:

  • Would this mean that when a user deleted 12 wikis in a month, we only count one wiki?
  • Does this mean we keep the old counting, but additionally provide the number of users for the deleted wikis?
  • Something else?

Add the number of users, but only for the inactive deletions.

Here's the updated script:

use App\Wiki;
use Carbon\Carbon;
use Carbon\CarbonPeriod;
use Illuminate\Support\Facades\Http;

$startDate = Wiki::select('created_at')->get()->first()->created_at;

$result = [];
$startDate->setDay(1);

foreach (CarbonPeriod::create($startDate, '1 month', Carbon::today()) as $month) {
    $id = $month->format('m-Y');
    $result[$id] = [];
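    // Wikis deleted in this month that have at least one manager whose email
    // does not contain "wikimedia".
    // Note: toDateString() drops the time, so the upper bound is likely compared
    // as midnight at the start of the last day of the month.
    $wikisDeletedInMonth = Wiki::withTrashed()
        ->with('wikiManagers')
        ->whereRelation('wikiManagers', 'email', 'not like', '%wikimedia%')
        ->where([
            ['deleted_at', '<>', null],
            ['deleted_at', '>=', $month->startOfMonth()->toDateString()],
            ['deleted_at', '<=', $month->endOfMonth()->toDateString()],
        ])
        ->get();
    $result[$id]['count'] = $wikisDeletedInMonth->count();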

    $wikisWhereAllManagersInactive = [];
    $wikisWhereManagersStillActive = [];
    foreach ($wikisDeletedInMonth as $wiki) {
        foreach ($wiki->wikiManagers()->get() as $user) {
            // Check every (non-deleted) wiki that this manager also manages.
            $matches = Wiki::whereRelation('wikiManagers', 'email', '=', $user->email)->get();
            foreach ($matches as $match) {
                try {
                    $res = Http::get('https://'.$match->domain.'/w/api.php?action=query&list=recentchanges&format=json');
                    $lastEdited = data_get($res->json(), 'query.recentchanges.0.timestamp');
                    // A recent change within the last 30 days marks the manager as still active.
                    if ($lastEdited && Carbon::now()->subDays(30) < Carbon::parse($lastEdited)) {
                        $wikisWhereManagersStillActive[] = $wiki;
                        continue 3; // move on to the next deleted wiki
                    }
                } catch (\Exception $ex) {
                    // pass - wiki probably does not resolve
                }
            }
        }
        // No manager of this wiki showed recent activity on any of their wikis.
        $wikisWhereAllManagersInactive[] = $wiki;
    }

    $inactiveManagers = [];
    foreach ($wikisWhereAllManagersInactive as $wiki) {
        foreach ($wiki->wikiManagers()->get() as $manager) {
            $inactiveManagers[] = $manager->email;
        }
    }

    $result[$id]['active_managers'] = count($wikisWhereManagersStillActive);
    $result[$id]['inactive_managers'] = count($wikisWhereAllManagersInactive);
    $result[$id]['unique_inactive_users'] = count(array_unique($inactiveManagers));
}

$output = ["month,deletions,with_active_managers,with_inactive_managers,unique_inactive_managers"];
foreach ($result as $month => $stats) {
    $output[] = implode(
        ',',
        [
            $month,
            $stats['count'],
            $stats['active_managers'],
            $stats['inactive_managers'],
            $stats['unique_inactive_users'],
        ]
    );
}
echo implode(PHP_EOL, $output).PHP_EOL;

which returns the following data:

month,deletions,with_active_managers,with_inactive_managers,unique_inactive_managers
02-2022,0,0,0,0
03-2022,0,0,0,0
04-2022,0,0,0,0
05-2022,0,0,0,0
06-2022,17,11,6,2
07-2022,3,3,0,0
08-2022,3,2,1,1
09-2022,4,2,2,2
10-2022,5,4,1,1
11-2022,5,1,4,2
12-2022,5,4,1,1
01-2023,71,70,1,1
02-2023,3,0,3,3
03-2023,10,5,5,4
04-2023,3,1,2,1
05-2023,1,0,1,1
06-2023,5,3,2,2
07-2023,3,1,2,2
08-2023,7,3,4,3
09-2023,3,0,3,2
10-2023,13,1,12,9
11-2023,8,0,8,8
12-2023,15,0,15,7
01-2024,11,2,9,5
02-2024,13,6,7,3

image.png (806×1 px, 37 KB)

@Charlie_WMDE I augmented the statistics to also include the number of "unique managers of the deleted wikis with inactive managers". That is, if the green bar is as high as the yellow one, it means "many users deleted one wiki each"; if the green bar is significantly lower, it means "one user deleted plenty of wikis".
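
The same comparison can be read off the CSV directly, without the chart. The snippet below is only a rough sketch: it divides the number of deleted wikis with inactive managers by the number of unique inactive managers, for the months from 10-2023 onwards. The rows are copied from the output above; the helper name wikisPerManager is made up for this illustration and is not part of the script.

```php
<?php
// Rows copied from the CSV output above:
// [month, with_inactive_managers, unique_inactive_managers]
$rows = [
    ['10-2023', 12, 9],
    ['11-2023', 8, 8],
    ['12-2023', 15, 7],
    ['01-2024', 9, 5],
    ['02-2024', 7, 3],
];

// Hypothetical helper (not in the original script): average number of
// deleted wikis per unique inactive manager, rounded to two decimals.
function wikisPerManager(int $wikis, int $managers): ?float
{
    // A month without inactive managers has no meaningful ratio.
    return $managers === 0 ? null : round($wikis / $managers, 2);
}

foreach ($rows as [$month, $wikis, $managers]) {
    $ratio = wikisPerManager($wikis, $managers);
    echo $month . ',' . ($ratio ?? 'n/a') . PHP_EOL;
}
```

A ratio of 1 (as in 11-2023, with 8 wikis and 8 managers) corresponds to "many users deleted one wiki each", while a ratio above 2 (as in 12-2023, with 15 wikis and 7 managers) points towards fewer users deleting several wikis.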

awesome, thank you so much! that's exactly what i was looking for