Page MenuHomePhabricator

Modify script to take image from database and upload images directly to PhotoDNA
Closed, InvalidPublic3 Estimated Story Points

Description

Right now, we only upload URLs to PhotoDNA. This makes it impossible to debug locally ( PhotoDNA can't access a local URL ) and generally makes the service more flaky. To do this we need to:

[x ] Create an additional parameter to the maintenance script (--send-image-contents, default to `true) that allows the user to specify uploading as images instead of URLs
All the user has to do to have an image within their database, and give the name of the image with the --file-name parameter, which directs the program to use just that one file set in --file-name
[ x] Update the handleMedia function to allow processing an image
[x ] modify the request to accept an image OR a url and prepare the request body appropriately

This is done in T339988

Event Timeline

Question: should we also modify the email to include the image instead of the URL?

I don't think so, because if there's problematic content, we wouldn't want it to propagate further (e.g. into people's inboxes).

additionally it would be illegal, as these are forbidden to be distributed to anyone for any reason

Are we trying to turn this into something it is not? This is not going to be used for anything other than checking items in Wikimedia Foundation (WMF) database to verify content is AOK. It should be run within a cron, or whatever WMF uses instead. I think this is moving beyond what we are trying to do. I just want to confirm that this is what we want to spend time on, as there is already a way for a developer to get a test image into their database. And, I thought, that the reason that was done was so a developer could test locally.

Are we trying to turn this into something it is not? This is not going to be used for anything other than checking items in Wikimedia Foundation (WMF) database to verify content is AOK. It should be run within a cron, or whatever WMF uses instead. I think this is moving beyond what we are trying to do. I just want to confirm that this is what we want to spend time on, as there is already a way for a developer to get a test image into their database. And, I thought, that the reason that was done was so a developer could test locally.

@ERayfield I'm not sure if I follow this, could you please rephrase it? Do you mean, we should not also support sending the image URL, and instead only allow sending the image contents? If so, I think there's value in continuing to support both use cases, in case we need to move back to sending URLs for as-yet-unknown reasons.

maintenance\includes\ModerateExistingFilesHelper

public function processSingle(
		string $fileName,
		IDatabase $db,
		bool $old ): bool {
		$retBool = false;
		// ON HOLD FOR NOW T336576
		// TODO (T336576) https://phabricator.wikimedia.org/T336576
		// TODO OF PROJECT OF JUST GETTING THE THING TO RUN PROPERLY
		// if there is an image located via url, but not in the database
		// this is where we would check and verify the image
		// is valid and not malicious
		// if ( str_starts_with( $fileName, 'http' ) ) {
		// $img = file_get_contents($fileName);
		// echo $img;
		// } else {

Create an additional parameter to the maintenance script (--send-image-contents, default to `true) that allows the user to specify uploading as images instead of URLs
I do not think adding this parameter would be of use. This is not a public-facing code, it is a script to verify images are free of particular content, and notify TPTB (the powers that be) of images that PhotoDNA flagged. If a complete URI is used (and is located in the database), yes the content may be processed if it is in a proper format. There is no need to specify what to send to PhotoDNA, IMHO.

JKieserman renamed this task from Modify script to allow for uploading images directly to PhotoDNA to Modify script to take image from database and upload images directly to PhotoDNA.Jul 24 2023, 2:18 PM
This comment has been deleted.

modify the email to include the prefixed Title name - what title? title of image? title of person being emailed? needs clarification

ERayfield updated the task description. (Show Details)

This is done in T339988

Change 949091 had a related patch set uploaded (by Jsn.sherman; author: Jsn.sherman):

[mediawiki/extensions/MediaModeration@master] [DNM] gerrit dependent patch example - parent

https://gerrit.wikimedia.org/r/949091

Change 949091 abandoned by Jsn.sherman:

[mediawiki/extensions/MediaModeration@master] [DNM] gerrit dependent patch example - parent

Reason:

https://gerrit.wikimedia.org/r/949091