Integrate PetScan functionality into Cat-a-lot for generating image lists
Open, Needs Triage · Public · Feature

Description

PetScan is a tool that lets users generate targeted lists from Wikipedia and related Wikimedia projects based on specific criteria. As a user of Cat-a-lot, I would like to integrate PetScan queries directly into Cat-a-lot so that Wikimedia Commons images can be managed and categorized more efficiently. The user should be able to:

  • Enter a PetScan image list query when Cat-a-lot is enabled from preferences
  • Fetch and display the resulting image list from PetScan as thumbnails.
  • Select images from the thumbnail grid for categorization using the existing Cat-a-lot interface.

Step 1: Minimal UI

  • Using a userscript or the Cat-a-lot script, create a form on the page Special:BlankPage/Cat-a-lot that includes:
    • A text input field for the PetScan query ID (example: 33373583)
    • A submit button
  • Upon submission:
    • Add a link to the prefilled PetScan query
    • Display the retrieved images as a thumbnail grid on the page Special:BlankPage/Cat-a-lot
      • Default limit: 100 images
      • Allow users to increase this limit
  • Users select images via Cat-a-lot and perform the normal categorization actions on them.
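
The fetch step in Step 1 could be sketched roughly as below. This is only a sketch under assumptions, not the script's actual code: it assumes PetScan returns JSON when format=json is passed, that the page list sits under the "*" → 0 → "a" → "*" keys of the response, and that the limit parameter is output_limit (the name found later in this thread). All helper names are hypothetical.

```javascript
// Hypothetical helpers; the response layout used below is an assumption.
const PETSCAN_BASE = 'https://petscan.wmcloud.org/';

// Build the request URL for a stored PetScan query (PSID).
function buildPetScanUrl(psid, limit = 100) {
  const params = new URLSearchParams({
    psid: String(psid),
    format: 'json',
    output_limit: String(limit)
  });
  return PETSCAN_BASE + '?' + params.toString();
}

// Pull file titles out of a PetScan JSON response, tolerating missing keys.
function extractFileTitles(json) {
  const pages = json?.['*']?.[0]?.a?.['*'] ?? [];
  return pages
    .filter((page) => page.namespace === 6) // 6 = File namespace
    .map((page) => page.title);
}

// Fetch the list with the native fetch API (browser or Node 18+).
async function fetchPetScanFiles(psid, limit) {
  const response = await fetch(buildPetScanUrl(psid, limit));
  if (!response.ok) {
    throw new Error('PetScan request failed: ' + response.status);
  }
  return extractFileTitles(await response.json());
}
```

Keeping the URL builder and the response parser separate from the network call makes both easy to check without contacting PetScan.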

For Cat-a-lot's selection to work, the following is needed:

  1. The image grid should have an HTML structure similar to the existing pages (tms gallery, category)
  2. Set the correct CAL.searchmode value when the page is loaded (Line 1804)
  3. Update the findAllLabels() function (Line 432)
  4. Check that getMarkedLabels() works and update it if needed (Line 484)
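
For point 1, the grid markup could be generated along these lines. This is a sketch under assumptions: it uses the ul.gallery / li.gallerybox / div.gallerytext class names of a MediaWiki gallery and loads thumbnails through Special:FilePath with a width parameter. Cat-a-lot may expect further wrapper elements or link classes, so the output should be compared with a real category page. The function names are hypothetical.

```javascript
// Escape text for safe use in HTML text nodes and attribute values.
function escapeHtml(text) {
  return String(text).replace(/[&<>"']/g, (ch) => ({
    '&': '&amp;', '<': '&lt;', '>': '&gt;', '"': '&quot;', "'": '&#39;'
  }[ch]));
}

// Render one gallery item for a file title such as "Example_image.jpg".
function renderGalleryBox(title, width = 120) {
  const name = title.replace(/ /g, '_');
  const pageHref = '/wiki/File:' + encodeURIComponent(name);
  const thumbSrc = '/wiki/Special:FilePath/' + encodeURIComponent(name) +
    '?width=' + width;
  return '<li class="gallerybox">' +
    '<div class="thumb"><a href="' + escapeHtml(pageHref) + '">' +
    '<img src="' + escapeHtml(thumbSrc) + '" alt=""></a></div>' +
    '<div class="gallerytext"><a href="' + escapeHtml(pageHref) + '">' +
    escapeHtml(title.replace(/_/g, ' ')) + '</a></div></li>';
}

// Render the whole grid as one gallery list.
function renderGallery(titles, width) {
  return '<ul class="gallery">' +
    titles.map((title) => renderGalleryBox(title, width)).join('') + '</ul>';
}
```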

Step 2: Develop a simple UI for PetScan parameters:

  • Allow defining category intersections (GET parameter name categories, format string, one category name per line)
  • Allow defining negative categories (GET parameter name negcats, format string, one category name per line)
  • Allow defining the default category query depth (GET parameter name depth, format integer)
  • PetScan query parameters from the UI will override the parameters in the query ID.
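
A possible way to combine a stored query ID with the UI fields is sketched below. It assumes that PetScan gives explicitly passed parameters precedence over the ones stored under the PSID, so only the fields the user actually filled in are sent. The one-name-per-line fields are joined with CRLF, which is the separator confirmed later in this thread. The helper name is hypothetical.

```javascript
// Hypothetical helper: combine a stored query ID with UI overrides.
function buildOverrideParams(psid, ui = {}) {
  const params = new URLSearchParams({ psid: String(psid) });
  // Multi-line fields: one category name per line, joined with CRLF.
  for (const key of ['categories', 'negcats']) {
    const lines = (ui[key] || []).map((s) => s.trim()).filter(Boolean);
    if (lines.length > 0) {
      params.set(key, lines.join('\r\n'));
    }
  }
  // depth is an integer; skip it when the user left the field empty.
  if (Number.isInteger(ui.depth)) {
    params.set('depth', String(ui.depth));
  }
  return params;
}
```

URLSearchParams percent-encodes the CRLF as %0D%0A when the parameters are serialized, so no manual encoding is needed.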

The PetScan query parameter names can be found by submitting a query in PetScan and then checking the links in the text of the result form:

Link to a pre-filled form for the query you just ran with and without auto-run. PSID is 33374035.

This task would start the work towards answering T195575.

Event Timeline

Zache updated the task description.

For documentation: if new page types are added and the pages are simple (i.e. normal wiki pages, not dynamically generated using JavaScript), the following places need to be updated:

  1. Setting the correct CAL.searchmode variable when the page is loaded (Line 1804)
  2. Setting CSS classes in the correct places in the findAllLabels() function (Line 432)
  3. Getting the selected items using getMarkedLabels() (Line 484)

About implementation: It could also be done by creating a separate script responsible for toggling between the Special:Search list and image grid and for applying filters, accompanied by smaller modifications to MediaWiki:Cat-a-lot.js to support the new page type. The relationship between the two scripts would be that Cat-a-lot loads the new script as well, although the new script could also be used independently.

About how to identify the current page via JavaScript:

Page-specific variables are defined on every page and can be viewed in the <head> section of the page's source code. These variables can be accessed using mw.config.get. For example:

mw.config.get('wgNamespaceNumber');

Some relevant variable examples:

"wgCanonicalNamespace": "Special",
"wgCanonicalSpecialPageName": "Search",
"wgNamespaceNumber": -1,
"wgPageName": "Special:Search",
"wgTitle": "Search"

Documentation for variables: https://www.mediawiki.org/wiki/Manual:Interface/JavaScript
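
As an illustration, a page-type check built on these variables might look like the sketch below. The returned strings are hypothetical labels and not the actual CAL.searchmode values; getConfig stands in for mw.config.get so that the function can also be tried outside MediaWiki.

```javascript
// Hypothetical page-type check; `getConfig` stands in for mw.config.get.
function detectPageType(getConfig) {
  const ns = getConfig('wgNamespaceNumber');
  if (ns === 14) {
    return 'category'; // 14 = Category namespace
  }
  if (ns === -1) { // -1 = Special namespace
    if (getConfig('wgCanonicalSpecialPageName') === 'Search') {
      return 'search';
    }
    if (getConfig('wgPageName') === 'Special:BlankPage/Cat-a-lot') {
      return 'petscan';
    }
  }
  return 'other';
}

// In a userscript: detectPageType((key) => mw.config.get(key));
```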

Tacsipacsi subscribed.

About implementation: It could also be done by creating a separate script responsible for toggling between the Special:Search list/image grid and applying filters

I guess you meant to write this on T389735, not here?

Hi @Zache
This sounds like an exciting task! I'd love to contribute by integrating PetScan queries into Cat-a-lot for more efficient image categorization. I’ll start by working on the minimal UI, fetching images via PetScan, and ensuring compatibility with Cat-a-lot’s selection system. Looking forward to feedback and collaboration!

Hello @Zache @Tacsipacsi I am currently working on this task and this is my report so far

Screenshot 2025-03-29 at 11.27.31.png (1×3 px, 560 KB)

PetScan Cat-a-lot Integration: Completion Report

I was able to complete the PetScan and Cat-a-lot integration. The userscript now allows users to enter PetScan query IDs on the Special:BlankPage/Cat-a-lot page, load images from PetScan, and categorize them using Cat-a-lot.

The script includes all requested features:

  • Input field for PetScan query IDs
  • Advanced options for categories, negative categories, and depth
  • Image limit control
  • Thumbnail grid display
  • Link to prefilled PetScan query

You can test the script here:
https://commons.wikimedia.org/wiki/User:Nenyee/PetScan-Cat-a-lot-Integration.js

To use it, simply add these lines to your common.js page:

mw.loader.load('//commons.wikimedia.org/w/index.php?title=User:Nenyee/PetScan-Cat-a-lot-Integration.js&action=raw&ctype=text/javascript');
mw.loader.load('//commons.wikimedia.org/w/index.php?title=User:Nenyee/cat-a-lot-patch.js&action=raw&ctype=text/javascript');

Then visit https://commons.wikimedia.org/wiki/Special:BlankPage/Cat-a-lot to try it out.

I would appreciate it if you could review the script and provide feedback on its functionality and any potential improvements.

Neyehh attached a referenced file: Unknown Object (File). Mar 28 2025, 10:44 AM

@Neyehh, thanks for this. A couple of notes regarding testing:

GM_xmlhttpRequest
Currently it requires GM_xmlhttpRequest to be working. It should be possible to get it working directly with native browser functions (I tested with Safari and Chrome, and there are no CORS limitations).

mw.loader.load
In your description the loading line is:

mw.loader.load('https://commons.wikimedia.org/wiki/User:Nenyee/PetScan-Cat-a-lot-Integration.js');

However, it also requires the action=raw and ctype (content type) parameters:

mw.loader.load('//commons.wikimedia.org/w/index.php?title=User:Nenyee/PetScan-Cat-a-lot-Integration.js&action=raw&ctype=text/javascript');

I see that now. Will correct this accordingly.

@Zache I have made the necessary alterations.
I removed the GM_xmlhttpRequest dependency, so the script now uses the native browser fetch API for all requests, and I added the required parameters. I believe it should work well now. Looking forward to your review when you get the chance.

Thanks, it looks pretty cool! Just some notes

Interface loading
Currently the PetScan interface loads itself on every page. It should load itself only on Special:BlankPage/Cat-a-lot. You can test for the page like this:

if (mw.config.get("wgPageName") !== "Special:BlankPage/Cat-a-lot") { return; }

Basic tab
The Create new PetScan Query link doesn't work (i.e. incorrect parameter values). You can use this URL to open the PetScan query UI with Wikimedia Commons preselected:
https://petscan.wmcloud.org/?project=wikimedia&language=commons

Categories tab
1.) Currently Categories (one per line) breaks if there is more than one category. The reason is that it uses %0D%0A as the separator and not %7C (%0D = carriage return; %0A = line feed, per W3Schools). I just checked the correct syntax from PetScan-generated URLs.

2.) Category Combination
The Category Combination selector could be moved between the categories and negative categories boxes, as it only affects categories. Negative categories are always "AND". It could also use the same labels as PetScan so that we can directly use PetScan's translations if needed (i.e. Combination, Intersection, Union).

3.) The default depth could be decreased to 2 (it is too easy to get slow queries with 4).

Templates&links
1.) The template selector parameter names are incorrect. The correct URL parameter names are:

  • Has all of these templates = templates_yes
  • Has any of these templates = templates_any
  • Has none of these templates = templates_no
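
The mapping above could be kept in one lookup table, roughly as sketched here. The helper name and the shape of the fields object are hypothetical. Template names are joined with CRLF, on the assumption that templates use the same separator that is settled on for categories later in this thread.

```javascript
// UI label -> PetScan URL parameter, as listed above.
const TEMPLATE_PARAMS = {
  'Has all of these templates': 'templates_yes',
  'Has any of these templates': 'templates_any',
  'Has none of these templates': 'templates_no'
};

// Hypothetical helper: add the filled-in template fields to a parameter set.
// `fields` maps a UI label to an array of template names.
function addTemplateParams(params, fields) {
  for (const [label, names] of Object.entries(fields)) {
    const key = TEMPLATE_PARAMS[label];
    if (key && names.length > 0) {
      params.set(key, names.join('\r\n'));
    }
  }
  return params;
}
```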

2.) Filter names. We could again use the same labels as PetScan so that we can use PetScan's translations if needed.

3.) Property Filters (P number and value): this would be good if it worked reliably in PetScan for Wikimedia Commons SDC, but it doesn't, so let's remove this.

Saved queries
It saves something to localStorage, but it doesn't seem to load all parameters correctly. For example it didn't load or save the templates tab values.
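
One way to make sure every tab's values survive a save and load round trip is to serialize the complete parameter set rather than individual fields. A minimal sketch, assuming all UI state can be expressed as URL parameters; the storage key and function names are hypothetical, and storage is passed in so that window.localStorage can be swapped for a stub.

```javascript
const STORAGE_KEY = 'petscan-catalot-saved-queries'; // hypothetical key name

// Save every parameter of a query under a user-chosen name.
// Note: repeated keys collapse to their last value.
function saveQuery(storage, name, params) {
  const all = JSON.parse(storage.getItem(STORAGE_KEY) || '{}');
  all[name] = Object.fromEntries(params); // includes the template fields too
  storage.setItem(STORAGE_KEY, JSON.stringify(all));
}

// Load a saved query back into a URLSearchParams, or null if unknown.
function loadQuery(storage, name) {
  const all = JSON.parse(storage.getItem(STORAGE_KEY) || '{}');
  return Object.prototype.hasOwnProperty.call(all, name)
    ? new URLSearchParams(all[name])
    : null;
}

// In the browser: saveQuery(window.localStorage, 'my query', params);
```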

Settings
If the thumbnail size is changed, the container size in the result list doesn't increase and the image overflows.

Result list
The Cat-a-lot selection doesn't work because the selector in Cat-a-lot.js findAllLabels() is incorrect. It should work if you change it like this (example diff):

this.labels = this.labels.add( $( 'ul.gallery li.gallerybox' ) );

The list doesn't need extra Select all / Deselect all buttons, as Cat-a-lot provides them already.

Also, the Cat-a-lot main UI is broken (it overflows because there are too many categories). I hope that this is already fixed in T388118, so it can be handled there.

@Zache Thank you for this. I was quite stuck trying to figure some of these out, and I think this is helpful. I will make the updates accordingly.

No problem. Just FYI, I had incorrect link in the text

https://petscan.wmcloud.org/?psid=33373583&categories=Helsinki%7CSuomenlinna&depth=4&limit=100

It should be

https://petscan.wmcloud.org/?psid=33373583&categories=Helsinki%0D%0ASuomenlinna&depth=4&limit=100

@Zache Noted!
I have made the necessary adjustments to the PetScan Cat-a-lot Integration script based on your feedback.

  1. Page-specific loading: Added a proper check at the beginning to only run on Special:BlankPage/Cat-a-lot
  2. Fixed PetScan query link: The link now correctly points to https://petscan.wmcloud.org/?project=wikimedia&language=commons
  3. Category separator fix: I am now using pipe (|) instead of line breaks (%0D%0A) as separators
  4. Category combination moved: The combination selector is now positioned between the categories and negative categories sections
  5. Default depth reduced: Changed from 4 to 2 to prevent slow queries
  6. Template parameter names fixed: Using the correct parameter names (templates_yes, templates_any, templates_no)
  7. Property filters removed: The P number and value filters have been removed
  8. Saved queries improved: The script now properly saves and loads all parameters
  9. Thumbnail container sizing: The container size is now adjusted based on thumbnail size
  10. Cat-a-lot selector fixed: The selector in findAllLabels() now uses ul.gallery li.gallerybox

I believe it works well now. Thanks a lot for the help, please review the changes I have made

Tested, and some notes:

  • Hmm, it seems that the limit parameter name is output_limit (e.g. output_limit=200) and not limit
  • The Cat-a-lot selector is broken (it was fixed, but the fix is not in use in the latest version?)
  • The underscores in the filenames shown under the images could be changed to spaces so that the filenames wrap.
  • The category separator fix is still broken; it was the other way around. The correct one is %0D%0A and not | (same for templates)
  • Autoloading the last query doesn't work for some reason (i.e. it gets stuck on loading).
  • There are some visual glitches in Chrome (an empty line under the tabs, and the PetScan ID field is too wide for the box)

Both the last-state autoload and the UI glitches are visible in the screenshot:

Screenshot 2025-04-01 at 17.56.46.png (1×2 px, 369 KB)

Thanks for this!

  • I made a temporary change to Cat-a-lot and forgot to undo it, hence the selector stopped working correctly. I have fixed that
  • I've changed the separator from pipe (|) to %0D%0A
  • I've changed limit to output_limit as requested
  • I have replaced the underscores with spaces, and the filenames now wrap properly
  • I have also fixed the autoload issue
  • For the empty line under the tabs, I changed the margin-bottom of .petscan-tab-buttons from 15px to 0
  • I added a max-width of 250px to limit the width of the PetScan ID field
  • I also removed the line of code that shows the state/status message of the autoload

Please let me know if these fixes were effective. I appreciate the feedback!

@Neyehh, there are some UI bugs with Cat-a-lot (i.e. it moves off the screen; I hope that this is fixed by T388118).

Another issue is that when Cat-a-lot is already enabled and then the list is refreshed (i.e., new Petscan search), selecting images doesn't work anymore. This is a similar problem to the one in T389735#10729697, and fixing it would require adding an API function to the Cat-a-lot core which would trigger refreshing Cat-a-lot CSS classes and event handlers after the screen content has been changed.

Beyond that, this is actually a pretty awesome proof-of-concept script. If you want to improve it further, I have some ideas such as parsing Petscan parameters from URL, but generally I am more than happy with the current version also. (This is clearly useful for testing further ideas.) Thank you.

@Zache Thanks for the feedback! Working on it. I would love to hear the ideas for this too.

I think a future direction could be to make this an independent gadget that works as a "category intersection" filtering tool. For this I would add the following features:

1 Add support for all Petscan url parameters

First, add passthrough support for Petscan URL parameters so that the script would read them from the URL. The use case for this is that the page could be linked from category pages.

Example:

And then pass them to petscan url

So the idea is first that it would just blindly read and forward parameters to Petscan if they are defined.
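
The blind passthrough could be sketched as follows. The list of parameters treated as MediaWiki's own is an assumption and probably incomplete, format=json is assumed to make PetScan return JSON, and the function name is hypothetical.

```javascript
// Parameters that belong to MediaWiki itself rather than PetScan (assumed list).
const MEDIAWIKI_PARAMS = new Set(['title', 'action', 'uselang', 'debug']);

// Copy every other parameter from the page URL to a PetScan request URL.
function forwardToPetScan(search) {
  const incoming = new URLSearchParams(search);
  const outgoing = new URLSearchParams();
  for (const [key, value] of incoming) {
    if (!MEDIAWIKI_PARAMS.has(key)) {
      outgoing.append(key, value);
    }
  }
  outgoing.set('format', 'json'); // always request JSON for the script
  return 'https://petscan.wmcloud.org/?' + outgoing.toString();
}

// In a userscript: forwardToPetScan(window.location.search);
```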

2 Add UI for showing the parameters
Just add the possibility in the UI to see what parameters are defined.

3 Make UI for editing parameters
This could work so that there is an "advanced search" style UI where users can add and edit filters using + (similarly to how the templates & links are currently working, but users could also choose filter type when adding it).

4 Add link to tools menu
Add Special:BlankPage/Cat-a-lot link with prefilled url-parameters to tools menu if user is in the category page.
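
A minimal sketch of the tools menu link, assuming the toolbox portlet ID p-tb and the categories parameter from Step 2. The link label, the element ID, and the helper name are hypothetical.

```javascript
// Build the link target for a category title such as "Helsinki".
function buildCatALotHref(categoryTitle) {
  const params = new URLSearchParams({ categories: categoryTitle });
  return '/wiki/Special:BlankPage/Cat-a-lot?' + params.toString();
}

// Only runs inside MediaWiki, where the global `mw` object exists,
// and only on category pages (namespace 14).
if (typeof mw !== 'undefined' && mw.config.get('wgNamespaceNumber') === 14) {
  mw.loader.using('mediawiki.util').then(() => {
    mw.util.addPortletLink(
      'p-tb',
      buildCatALotHref(mw.config.get('wgTitle')),
      'PetScan in Cat-a-lot', // hypothetical link label
      't-petscan-catalot'     // hypothetical element id
    );
  });
}
```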

@Zache These are actually some much better ideas. Thank you. I will be working on it this week.

@Zache I've enhanced the PetScan Cat-a-lot integration script with these key improvements:

  1. Added support for all PetScan URL parameters through a new dedicated tab
  2. Created a UI for viewing and editing parameters with descriptions and suggestions
  3. Added a link to the tools menu on category pages that pre-fills the category name

The script now allows users to paste any PetScan URL and parse all its parameters, add/edit individual parameters, and save queries for later use. It maintains full compatibility with the existing Cat-a-lot functionality while making it much easier to work with complex PetScan queries.

Am I on track with what you needed, or would you prefer a simpler approach to the integration based on the screenshot?

Screenshot 2025-04-19 at 15.13.09.png (1×2 px, 448 KB)

I changed the assignee to @adiba_anjum, as we are currently continuing this as a follow-up to T397849. One substantial difference at the moment is that the UI is now built with Codex. The latest version can be tested by adding these lines to common.js:

mw.loader.load('//commons.wikimedia.org/w/index.php?title=User:Adiba_anjum3/PetScan-Cat-a-lot-T389739.js&action=raw&ctype=text/javascript');
mw.loader.load('//commons.wikimedia.org/w/index.php?title=User:Adiba_anjum3/cat-a-lot-petscanUI.js&action=raw&ctype=text/javascript');

And UI opens at https://commons.wikimedia.org/wiki/Special:BlankPage/Cat-a-lot

I will create subtickets for the followup development.