Implement prioritization of request processing
Closed, Declined · Public

Assigned To

None

Authored By

	Halfak
	Oct 18 2016, 10:36 PM

Description

Worker Queue = WQ

WQ = 0-10 -- Score all requests
WQ = 10-25 -- Score white list and requests with an email in the user-agent
WQ > 25 -- Score white list
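
The tiers above can be read as a simple admission check. A minimal sketch in Python, assuming WQ means the number of pending tasks in the worker queue and that the boundary values 10 and 25 fall into the lower tier (the description leaves both open). The function and parameter names are hypothetical and are not part of ORES:

```python
def should_score(queue_length, white_listed, has_email):
    """Decide whether to score a request given the current worker queue length.

    Assumption: boundary values (10 and 25) belong to the lower tier.
    """
    if queue_length <= 10:
        # Light load: score all requests.
        return True
    if queue_length <= 25:
        # Moderate load: white list, or an email in the user-agent.
        return white_listed or has_email
    # Heavy load: white list only.
    return white_listed
```

For example, an anonymous request with no email in its user-agent would be scored at a queue length of 5 but rejected at 15.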

Related Objects

Mentioned In: T148347: [Discuss] DOS attacks on ORES. What to do?

Event Timeline

Halfak created this task. Oct 18 2016, 10:36 PM

Restricted Application added a subscriber: Aklapper. Oct 18 2016, 10:36 PM

I'm thinking of extending the scoring_system config with a block for matching user-agents.

queue_maxsize: 100 # pending tasks
queue_thresholds:
  everyone:
    n: 25
    user-agent: ''
  email_in_user-agent:
    n: 25
    user-agent: '\S+@\S+'
  white-list:
    n: 100
    user-agent: 'precached|ChangePropagation|MW_api.php'

I don't like the idea that a scoring system should be able to take a user-agent. I think we'll want an external prioritization system that passes priorities into the score() method.

OK. New idea. I think that ORES should independently flag the priority of requests and that a scoring system could then use those priorities to decide when it is too overloaded. E.g.:

ores:
  priority:
    everyone:
      user-agent: ''
    email_in_user-agent:
      user-agent: '\S+@\S+'
    white-list:
      user-agent: 'precached|ChangePropagation|MW_api.php'

...

scoring_systems:
  celery_queue:
    queue_maxsize: 100
    queue_thresholds:
      everyone: 10
      email_in_user-agent: 25
      white-list: 100
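
To make the split concrete, here is a rough Python sketch of how the two config blocks above might interact: one function flags priorities from the user-agent, and a separate check compares the queue size against the thresholds. The patterns and numbers are copied from the config. Everything else is an assumption for illustration: the function names, the rule that the most permissive matching threshold wins, and the strict `<` comparison are not from ORES.

```python
import re

# Patterns copied verbatim from the `ores.priority` block above
# (including the unescaped dot in 'MW_api.php').
PRIORITY_PATTERNS = {
    "everyone": re.compile(r""),
    "email_in_user-agent": re.compile(r"\S+@\S+"),
    "white-list": re.compile(r"precached|ChangePropagation|MW_api.php"),
}

# Thresholds copied from the `scoring_systems.celery_queue` block above.
QUEUE_THRESHOLDS = {"everyone": 10, "email_in_user-agent": 25, "white-list": 100}


def flag_priorities(user_agent):
    """Return every priority label whose pattern matches the user-agent.

    The empty 'everyone' pattern matches any string, so the result is
    never empty.
    """
    return [name for name, pattern in PRIORITY_PATTERNS.items()
            if pattern.search(user_agent)]


def accepts(priorities, queue_size):
    """Hypothetical scoring-system check.

    Assumption: a request is accepted while the queue is below the most
    permissive threshold among its priority labels.
    """
    limit = max(QUEUE_THRESHOLDS[p] for p in priorities)
    return queue_size < limit
```

With this split, the scoring system never sees the user-agent; it only receives the list of priority labels.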

Halfak mentioned this in T148347: [Discuss] DOS attacks on ORES. What to do? Oct 24 2016, 5:01 PM

Halfak removed Halfak as the assignee of this task. Oct 27 2016, 2:39 PM

Halfak edited projects, added Machine-Learning-Team; removed Machine-Learning-Team (Active Tasks).

Halfak triaged this task as Medium priority. Oct 27 2016, 2:42 PM

Halfak moved this task from Unsorted to Maintenance/cleanup on the Machine-Learning-Team board.

As a fun side project, I'm studying distributed systems theory, and it seems that for services, using several queues is strongly encouraged over prioritizing within a single queue. I think that's why Celery has virtually no support for prioritization and why people everywhere say "use several queues instead". An example is out-of-band data.

I hereby suggest declining this. We don't have capacity problems anymore and if needed we should dedicate fast lanes (queues) for important jobs and not prioritize requests. If there's no objection by next Monday, I will close this.
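
For illustration, the "fast lanes" idea could look roughly like the sketch below, which uses Python's standard `queue` module rather than Celery. The lane names, sizes, and helper functions are hypothetical. The white-list pattern is the one from the config earlier in this task.

```python
import queue
import re

# White-list pattern taken from the config discussed earlier in this task.
WHITE_LIST = re.compile(r"precached|ChangePropagation|MW_api.php")

# Hypothetical lanes: each has its own bounded queue, so a flood on the
# default lane cannot fill the fast lane.
LANES = {
    "fast": queue.Queue(maxsize=100),
    "default": queue.Queue(maxsize=25),
}


def route(user_agent):
    """Pick a lane for a request based on its user-agent."""
    return "fast" if WHITE_LIST.search(user_agent) else "default"


def submit(user_agent, task):
    """Enqueue a task on its lane; return False if that lane is full."""
    try:
        LANES[route(user_agent)].put_nowait(task)
        return True
    except queue.Full:
        return False
```

Each lane would be drained by its own workers. With Celery, the rough equivalent would be routing tasks to separate named queues and starting dedicated workers for each one.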

Ladsgroup closed this task as Declined. Jul 31 2018, 7:16 AM
