
move imageuncat's uploadedYesterday to pagegenerators
Closed, ResolvedPublic

Description

imageuncat has an argument -yesterday that fetches uploads from the previous day. This functionality should be provided by a pagegenerators command line argument, so that any script can use it.

pagegenerators supports -logevents:upload to fetch uploads. This command line argument allows selecting by username, and/or a maximum number to fetch, but there is no way to select a time period.

(Note the maximum number to fetch should be deprecated: T128981: -logevents command line argument syntax allows specifying limit, however -limit exists for that purpose)

Somehow, we need to allow the user to specify a time period of log events they are interested in.

Event Timeline

bzimport raised the priority of this task to Needs Triage. Nov 22 2014, 3:19 AM
bzimport set Reference to bz65192.
bzimport added a subscriber: Unknown Object (????).

Agree. This was my longer term idea too.

jayvdb subscribed.

Once the LogpagesPageGenerator page generator (T76555) is ported from compat, there is probably no need for a specialised generator, and imageuncat can be revised to use the generic generator with custom arguments.

So, LogeventsPageGenerator is now in pagegenerators; however, it does not have start and end date parameters, which means it can't be used instead of uploadedYesterday without enhancement.

So, start and end date parameters need to be added to LogeventsPageGenerator, and then it can be used to achieve the same result as uploadedYesterday.
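As a rough illustration of the window that such start and end parameters would have to express for the uploadedYesterday case, here is a minimal sketch using only the standard library. The helper name yesterday_window is hypothetical and not part of pywikibot, and the sketch assumes log timestamps are compared in UTC.

```python
from datetime import datetime, timedelta, timezone


def yesterday_window(now=None):
    """Return (start, end) datetimes covering the previous UTC day.

    start is inclusive (yesterday 00:00:00) and end is exclusive
    (today 00:00:00). Hypothetical helper, not a pywikibot function.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    today_midnight = now.replace(hour=0, minute=0, second=0, microsecond=0)
    return today_midnight - timedelta(days=1), today_midnight


# Example: a run at 15:30 UTC on 7 March 2016 selects all of 6 March.
start, end = yesterday_window(datetime(2016, 3, 7, 15, 30, tzinfo=timezone.utc))
print(start.isoformat(), end.isoformat())
```

How these two values would be passed to the generator depends on the parameter names and ordering chosen when the enhancement is implemented.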

Also, RecentChangesPageGenerator has changetype=log, start and end date parameters, and can be restricted to namespace=6 (File). It would be useful to investigate whether it can be used to produce results similar to uploadedYesterday, i.e. whether 'imageuncat -yesterday' is a subset of 'imageuncat -recentchanges'.

If new parameters are added to LogeventsPageGenerator, the reverse parameter should be added as well.

Also, should the deprecated LogpagesPageGenerator be updated as well?

> If new parameters are added to LogeventsPageGenerator, the reverse parameter should be added as well.

feel free to add more than is necessary ;-)

> Also, should the deprecated LogpagesPageGenerator be updated as well?

no need. It only exists to keep old code chugging along.

Should recentChanges be moved as well or just uploadedYesterday?

Also, should uploadedYesterday be renamed to UploadedYesterdayPageGenerator or something more generic that would allow the user to specify a range?

Well, ideally neither is 'moved' as such, but -yesterday and -recentchanges are replaced with standard pagegenerator arguments where possible to achieve the same result, e.g. "-uploadlog", and generic pagegenerators / filters are used where not possible.

To provide backwards compatibility for '-yesterday', you can do something like:

if arg == '-yesterday':
    gen.handleArg('-uploadlog')

Then you need to restrict the date range of the upload log to 'yesterday'. To restrict the date, add an EdittimeFilterPageGenerator after calling getCombinedGenerator.

However, EdittimeFilterPageGenerator currently continues to consume all entries outside of the date range (which could take forever); we need an option to tell it to stop when it first encounters a date outside of the range.
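The "stop at the first out-of-range date" behaviour could look roughly like the sketch below. This is not the EdittimeFilterPageGenerator implementation: the function name, the (title, timestamp) tuples and the fake log are all hypothetical, and the sketch assumes entries arrive newest first, as a log listing normally does.

```python
from datetime import datetime
from itertools import dropwhile, takewhile


def filter_until_out_of_range(entries, start, end):
    """Yield (title, timestamp) entries with start <= timestamp < end.

    Assumes entries are ordered newest first. Entries newer than the
    range are skipped, and iteration stops at the first entry older
    than start instead of consuming the rest of the log.
    """
    from_range_onwards = dropwhile(lambda entry: entry[1] >= end, entries)
    return takewhile(lambda entry: entry[1] >= start, from_range_onwards)


def fake_upload_log():
    """Hypothetical stand-in for an upload log generator, newest first."""
    yield ("File:C.jpg", datetime(2016, 3, 7, 9, 0))
    yield ("File:B.jpg", datetime(2016, 3, 6, 18, 0))
    yield ("File:A.jpg", datetime(2016, 3, 6, 2, 0))
    yield ("File:Old.jpg", datetime(2016, 3, 5, 23, 0))
    # Reached only if the filter keeps reading after File:Old.jpg.
    raise AssertionError("consumed past the first out-of-range entry")


in_range = list(filter_until_out_of_range(
    fake_upload_log(), datetime(2016, 3, 6), datetime(2016, 3, 7)))
print([title for title, _ in in_range])
```

The early stop is only safe when the source is sorted by time; for an unsorted generator the filter would have to keep reading every entry, as it does today.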

Replacing the custom -recentchanges might be a bit harder to do, so tackle that after you've done -yesterday, or I was planning to create it as another task. We could replace it with standard arguments "-recentchanges -ns:6", but the delay=120 means the current implementation is not fetching a lot of records from the start of the recentchanges log.

(and give it a decent module docstring)

> To restrict the date, add an EdittimeFilterPageGenerator after calling getCombinedGenerator.

If start and end were added to LogeventsPageGenerator, wouldn't using those parameters cost fewer calls to the API than EdittimeFilterPageGenerator?

>> To restrict the date, add an EdittimeFilterPageGenerator after calling getCombinedGenerator.
>
> If start and end were added to LogeventsPageGenerator, wouldn't using those parameters cost fewer calls to the API than EdittimeFilterPageGenerator?

yes ;-)

Change 275204 had a related patch set uploaded (by AbdealiJK):
imageuncat: Use LogpagesPageGenerator

https://gerrit.wikimedia.org/r/275204

Change 275204 merged by jenkins-bot:
imageuncat: Deprecate -yesterday for -logevents

https://gerrit.wikimedia.org/r/275204