Maniphest T20147

Faster logic for Abuse Filter parser
Closed, Resolved (Public)

Authored By

	Dragons_flight
	Mar 25 2009, 7:29 AM

Description

The attached patch improves the execution speed of AbuseFilterParser::nextToken through a series of small changes.

The most significant impact comes from modifying the application of regex to focus on the immediate offset and not look downstream unnecessarily. It also stops radixRegex from giving empty string matches.
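The empty-match issue can be illustrated with a small sketch. The patterns below are hypothetical stand-ins, not the actual radixRegex (which is not reproduced in this report); they only show how a pattern whose parts are all optional succeeds with an empty match, and how requiring at least one digit avoids that.

```php
<?php
// Hypothetical input and patterns, for illustration only.
$code = '+ 12';

// Every part of this pattern is optional, so it "matches" the empty
// string at position 0 even though no number is present there.
$loose = preg_match('/^\d*(?:\.\d+)?/', $code, $m1);

// Requiring at least one digit makes the pattern fail cleanly instead.
$strict = preg_match('/^\d+(?:\.\d+)?/', $code, $m2);

var_dump($loose, $m1[0]); // int(1) and string(0) ""
var_dump($strict);        // int(0)
```

A tokenizer that treats any successful match as a token would have to special-case the empty result in the first form, which is presumably the kind of wasted work the patch removes.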

This patch preserves all current behavior and is transparent to the user.

In benchmarks run with function evaluation and variable lookup disabled, rule parsing was ~20% faster after applying this patch.

Version: unspecified
Severity: enhancement

Details

Reference: bz18147

Event Timeline

• bzimport raised the priority of this task from to Medium. Nov 21 2014, 10:36 PM

• bzimport added a project: AbuseFilter.

• bzimport set Reference to bz18147.

Dragons_flight created this task. Mar 25 2009, 7:29 AM

Faster logic for parser

Retrying to upload patch...

Attached:

faster_nextToken.patch (4 KB)

Using substr( $code, $offset ) is slow. Instead, we should be using the /A modifier (which I thought I already was; maybe I didn't commit it properly).

When preg_match is called with an offset, the beginning-of-string anchor (^) only matches if the offset is actually set to 0. This is annoying behavior, but I think the only way to force a beginning-of-string match from preg_match is to send it a truncated string.
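A minimal sketch of the behavior described in this comment, using a made-up subject string and pattern rather than anything from the parser itself:

```php
<?php
// Hypothetical input and pattern, for illustration only.
$code   = 'foo bar';
$offset = 4; // position where "bar" starts

// With a non-zero offset, ^ still refers to the true start of the
// subject, so this fails even though "bar" begins exactly at $offset.
$anchored = preg_match('/^bar/', $code, $m1, 0, $offset);

// The truncation workaround: cut the subject so that ^ lines up with
// the position of interest. This copies the tail of the string on
// every call, which is the cost being discussed here.
$truncated = preg_match('/^bar/', substr($code, $offset), $m2);

var_dump($anchored);  // int(0)
var_dump($truncated); // int(1)
```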

(In reply to comment #3)

Calling preg_match with an offset only matches the beginning of the string flag
if the offset is actually set to 0. This is annoying behavior, but I think the
only way to force a beginning of string match from preg_match is actually to
send it a truncated string.

I know that, but as I said in my previous comment, you can use the /A modifier to do what you want.

http://au2.php.net/manual/en/reference.pcre.pattern.modifiers.php
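As a rough sketch of what the /A modifier changes (again with a made-up subject and patterns, not the parser's own): an unanchored pattern scans forward from the offset, while /A only allows a match that starts exactly at the offset, with no copy of the subject needed.

```php
<?php
// Hypothetical input and patterns, for illustration only.
$code   = 'foo bar baz';
$offset = 4; // position where "bar" starts

// Without /A the engine looks downstream and finds "baz" at offset 8.
$scan = preg_match('/baz/', $code, $m1, PREG_OFFSET_CAPTURE, $offset);

// With /A the match must begin exactly at $offset, so "baz" is rejected...
$miss = preg_match('/baz/A', $code, $m2, 0, $offset);

// ...while "bar", which does start at $offset, is accepted.
$hit = preg_match('/bar/A', $code, $m3, 0, $offset);

var_dump($scan, $m1[0][1]); // int(1) and int(8)
var_dump($miss);            // int(0)
var_dump($hit, $m3[0]);     // int(1) and string(3) "bar"
```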

Oh, neat. I learned regex in Python, and I'm pretty confident Python doesn't have that flag. By all means, that looks even better.

Done with a similar, but independent patch in r48806.

	F5660: faster_nextToken.patch
	Nov 21 2014, 10:36 PM
