[XL] Estimate coverage of image suggestions at different confidence levels
Closed, Resolved · Public

Assigned To

Authored By

	Cparle
	May 27 2021, 7:34 PM

Description

We want to get estimates of how many total unillustrated articles on each of the relevant wikis will have an image recommended by the new pipeline, for different levels of likelihood-that-an-image-is-good in the recommendation. This is necessary for us to make a decision about which confidence score cutoff to use in making the recommendations. In general, we want the highest confidence score possible, but if there aren't enough recommendations at a high score, we will consider using a lower score.

The wikis are:
pt
ru
id

The likelihood-that-an-image-is-good levels we want to measure are 0.9, 0.8, and 0.7.

Acceptance criteria:

Document the number of suggestions for unillustrated articles in the above wikis at the 0.9 confidence level
Work with product management to evaluate whether that number is sufficient
If not, measure again at the 0.8 level, etc.
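
The criteria above amount to a top-down search: start at the strictest level and step down only if coverage falls short. A minimal Python sketch of that procedure follows. It assumes a single numeric floor (`minimum_pages`, hypothetical) applied to every wiki, whereas this task leaves the sufficiency call to product management; the function name and input shape are likewise illustrative.

```python
def pick_confidence_level(pages_by_level, minimum_pages):
    """Return the highest confidence level at which every wiki has at least
    `minimum_pages` pages with a suggestion, or None if no level qualifies.

    pages_by_level maps a level (e.g. 0.9) to {wiki: pages_with_suggestions}.
    `minimum_pages` is a hypothetical stand-in for the product decision.
    """
    for level in sorted(pages_by_level, reverse=True):  # strictest level first
        counts = pages_by_level[level]
        if counts and all(n >= minimum_pages for n in counts.values()):
            return level
    return None
```

In practice the floor would probably differ per wiki and per use case, so this is best read as a restatement of the acceptance criteria rather than a decision rule.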

Related Objects

Status	Assigned	Task
Resolved	CBogen	T299781 [EPIC] Image suggestions backend
Resolved	CBogen	T281582 [EPIC] Develop a confidence score for MediaSearch results
Resolved	CBogen	T292142 [EPIC] Image Suggestions Notifications for More Experienced Contributors
Resolved	CBogen	T283865 [XL] Estimate coverage of image suggestions at different confidence levels

Event Timeline

Cparle created this task. · May 27 2021, 7:34 PM

Restricted Application added a subscriber: Aklapper. · May 27 2021, 7:34 PM

CBogen edited projects, added Structured-Data-Backlog (Current Work); removed Structured-Data-Backlog. · Jun 14 2021, 4:50 PM

CBogen moved this task from Incoming to Ready for Estimation on the Structured-Data-Backlog (Current Work) board. · Jun 14 2021, 5:04 PM

CBogen renamed this task from Estimate mediasearch coverage of image recommendations at different confidence levels to [XL] Estimate mediasearch coverage of image recommendations at different confidence levels. · Jun 30 2021, 4:40 PM

CBogen moved this task from Ready for Estimation to Ready for Development on the Structured-Data-Backlog (Current Work) board.

CBogen added a parent task: T281582: [EPIC] Develop a confidence score for MediaSearch results. · Aug 5 2021, 5:37 PM

We're probably not going to be suggesting images only for unillustrated articles (at least for structured data), and because our target for images-added has changed, this information is probably no longer very useful for us.

CBogen reopened this task as Open. · Oct 1 2021, 5:08 PM

CBogen updated the task description.

CBogen added a parent task: T292142: [EPIC] Image Suggestions Notifications for More Experienced Contributors.

CBogen edited projects, added Structured-Data-Backlog; removed Structured-Data-Backlog (Current Work).

CBogen mentioned this in T292147: [L] Send Image Suggestions notifications to experienced users. · Oct 1 2021, 7:04 PM

CBogen updated the task description. · Oct 4 2021, 4:53 PM

CBogen updated the task description.

CBogen edited projects, added Structured-Data-Backlog (Current Work); removed Structured-Data-Backlog. · Oct 4 2021, 4:56 PM

Blocked on T283869

CBogen renamed this task from [XL] Estimate mediasearch coverage of image recommendations at different confidence levels to [XL] Estimate coverage of image recommendations at different confidence levels. · Mar 14 2022, 4:21 PM

CBogen updated the task description.

CBogen moved this task from Blocked to Ready for Development on the Structured-Data-Backlog (Current Work) board.

CBogen renamed this task from [XL] Estimate coverage of image recommendations at different confidence levels to [XL] Estimate coverage of image suggestions at different confidence levels. · Mar 14 2022, 4:24 PM

CBogen updated the task description.

Cparle updated the task description. · Mar 14 2022, 5:33 PM

We'll need at least a preliminary dataset from to do this work

Moved this into blocked - it should be quite easy to do once we have T299789 done, so there's no point in wasting effort doing it before then

Cparle moved this task from Ready for Development to Blocked on the Structured-Data-Backlog (Current Work) board. · Mar 21 2022, 10:29 AM

CBogen updated the task description. · Mar 23 2022, 5:33 PM

CBogen moved this task from Blocked to Ready for Development on the Structured-Data-Backlog (Current Work) board. · May 2 2022, 4:33 PM

Confidence >= 90%

wiki	pages_with_suggestions
ptwiki	7274
idwiki	4589
ruwiki	2562

Confidence >= 80%

wiki	pages_with_suggestions
ptwiki	126607
idwiki	64413
ruwiki	101126

Confidence >= 70%

wiki	pages_with_suggestions
ptwiki	129440
idwiki	66243
ruwiki	104690
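
To make the step between levels easier to compare, the counts above can be turned into per-wiki differences. The sketch below is illustrative only: the dictionary simply re-enters the three tables, and the helper name `added_pages` is hypothetical.

```python
# pages_with_suggestions per wiki, copied from the three tables above
PAGES = {
    0.9: {"ptwiki": 7274, "idwiki": 4589, "ruwiki": 2562},
    0.8: {"ptwiki": 126607, "idwiki": 64413, "ruwiki": 101126},
    0.7: {"ptwiki": 129440, "idwiki": 66243, "ruwiki": 104690},
}

def added_pages(stricter, looser):
    """Extra pages per wiki gained by relaxing the cutoff from `stricter` to `looser`."""
    return {
        wiki: PAGES[looser][wiki] - PAGES[stricter][wiki]
        for wiki in PAGES[stricter]
    }

for stricter, looser in [(0.9, 0.8), (0.8, 0.7)]:
    for wiki, extra in added_pages(stricter, looser).items():
        ratio = PAGES[looser][wiki] / PAGES[stricter][wiki]
        print(f"{wiki}: {stricter} -> {looser}: +{extra} pages ({ratio:.2f}x)")
```

On these numbers, relaxing the cutoff from 0.9 to 0.8 adds roughly 60,000 to 119,000 pages per wiki, while relaxing it further to 0.7 adds only about 1,800 to 3,600.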

Ok to resolve this @CBogen ?

Cparle moved this task from Ready for Development to Doing on the Structured-Data-Backlog (Current Work) board. · May 4 2022, 10:13 AM

In T283865#7902547, @Cparle wrote:

Ok to resolve this @CBogen ?

@Cparle which confidence level are we using in the current iteration of the data pipeline?

Also just tagging @SWakiyama so she's aware.

Once you answer this, we can close the ticket, thanks!

@Cparle which confidence level are we using in the current iteration of the data pipeline?

We're writing suggestions at each confidence level (0.7, 0.8, 0.9), and leaving the decision on which to use up to the client. Does that answer your question?
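
As a rough illustration of what leaving the decision to the client could look like, a client might filter on the stored score. This is a sketch under assumptions: the record fields (`page`, `image`, `confidence`) and the sample values are hypothetical and do not reflect the pipeline's actual schema.

```python
def suggestions_at_or_above(suggestions, min_confidence):
    """Keep suggestions whose confidence meets the client's chosen cutoff."""
    return [s for s in suggestions if s["confidence"] >= min_confidence]

# Hypothetical records; the real dataset's field names may differ.
sample = [
    {"page": "Example_A", "image": "Example_A.jpg", "confidence": 0.9},
    {"page": "Example_B", "image": "Example_B.jpg", "confidence": 0.8},
    {"page": "Example_C", "image": "Example_C.jpg", "confidence": 0.7},
]

kept = suggestions_at_or_above(sample, 0.8)
print([s["page"] for s in kept])
```

Because every level is written, a client can tighten or relax its cutoff later without the pipeline being re-run.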

It does, thanks! I think @SWakiyama then needs to decide, based on the information gathered for this ticket, which confidence level to use in T292147 (AC #3). I'll follow up with her offline and close this ticket.

Thanks Cormac! We'll use >= 80% when making suggestions.
