Page MenuHomePhabricator

Set a timeout for regex parsing in the Eventlogging processors
Closed, DeclinedPublic


As a follow up for T200630, the best and most flexible solution seems to be to add a maximum execution time limit while parsing (applying various regexes) to the UA string.

An initial investigation from @Milimetric looks promising:

Event Timeline

elukey created this task.

Using a contextmanager decorator and the built-in TimeoutError exception (if Py3 only) the code could be much simpler, something like:

import signal

from contextlib import contextmanager

def timeout_handler(signum, frame):
    raise TimeoutError

def timeout(seconds):
    prev_handler = signal.signal(signal.SIGALRM, timeout_handler)  # Store the previous signal handler
    signal.alarm(seconds)  # Set the timeout

        yield  # Yield to the context

        signal.alarm(0)  # Reset the alarm
        signal.signal(signal.SIGALRM, prev_handler)  # Restore the previous handler

### Usage
    with timeout(3):
        # call the ua parser
except TimeoutError:
    # handle the failed case

Thanks @Volans, didn't know about the contextmanager, that's nicer. And @elukey, as for concerns about the use of signal complicating the code, after reading more about it I think it's fine. Signals are received on the main thread as you said in the email, and the kafka-python threads are started in another part of the code. So all's good.

Closing because we haven't seen the issue resurface. Our temporary fix seems fine until we transition off of EventLogging, hopefully sooner than later.

Rather than closing I am going to move this one to "deprioritized" for now so as not to loose track of it completely.

Nuria lowered the priority of this task from High to Low.
Nuria raised the priority of this task from Low to Needs Triage.
Nuria moved this task from Operational Excellence to Deprioritized on the Analytics board.
Nuria triaged this task as Low priority.