As shown by T149049 and T149021, it is surprisingly difficult to get accurate stats about article creation rates or who creates articles. There are several reasons for this:
- Page creations are not recorded in the logging table.
- The page table doesn't include information about when an article was created or by whom.
- The revision and recentchanges tables don't include deleted revisions.
To get around these limitations, it is usually necessary to run expensive queries across several large tables, or even to aggregate data from several different queries. Given the importance of this information, it would make sense to start logging article creation events through EventLogging.
Every time a new content-namespace page is created, the logged event should record the following information:
- Page ID
- Initial page title (including namespace if present)
- Username of the page creator
- Edit count of the page creator
- Age of the page creator account in days
- Whether the page creator has the autopatrol right
- Whether the page is a redirect
- Initial size of the page
This should make future research about article creation much easier.
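
To make the proposed fields concrete, here is a rough sketch of what one such event might look like as a data structure. It is only an illustration: the class name, the field names, and the helper functions are all hypothetical and do not correspond to an existing EventLogging schema, and the assumption that page size is measured in bytes is mine rather than part of the proposal.

```python
import json
from dataclasses import dataclass, asdict
from datetime import datetime, timezone


@dataclass(frozen=True)
class PageCreationEvent:
    """Hypothetical payload mirroring the fields listed above."""
    page_id: int
    page_title: str              # initial title, including namespace if present
    user_name: str               # username of the page creator
    user_edit_count: int         # edit count of the page creator
    user_age_days: int           # account age in days at creation time
    user_is_autopatrolled: bool  # creator has the autopatrol right
    is_redirect: bool
    page_size: int               # initial size (bytes is an assumption)


def account_age_days(registered: datetime, created: datetime) -> int:
    """Whole days between account registration and page creation."""
    return max((created - registered).days, 0)


def build_event(page_id, page_title, user_name, user_edit_count,
                user_registered, created, user_is_autopatrolled,
                is_redirect, page_size) -> PageCreationEvent:
    """Assemble one event, deriving the account age from two timestamps."""
    return PageCreationEvent(
        page_id=page_id,
        page_title=page_title,
        user_name=user_name,
        user_edit_count=user_edit_count,
        user_age_days=account_age_days(user_registered, created),
        user_is_autopatrolled=user_is_autopatrolled,
        is_redirect=is_redirect,
        page_size=page_size,
    )


def to_json(event: PageCreationEvent) -> str:
    """Serialize the event as a JSON object with stable key order."""
    return json.dumps(asdict(event), sort_keys=True)


if __name__ == "__main__":
    # Made-up values, purely for illustration.
    example = build_event(
        page_id=12345,
        page_title="Example article",
        user_name="ExampleUser",
        user_edit_count=42,
        user_registered=datetime(2016, 1, 1, tzinfo=timezone.utc),
        created=datetime(2016, 10, 25, tzinfo=timezone.utc),
        user_is_autopatrolled=False,
        is_redirect=False,
        page_size=2048,
    )
    print(to_json(example))
```

With one flat record like this per creation, questions such as "how many non-redirect articles were created by accounts younger than 30 days" would become a simple filter over a single table rather than a join across several.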