
Deploy log tailing/filtering
ClosedPublic

Authored by dduvall on Oct 15 2015, 12:15 AM.
Tags
None
Subscribers
None

Details

Maniphest Tasks
T115137: (Re)implement logging output for scap targets
Reviewers
thcipriani
mmodell
mobrovac
demon
Commits
rMWTS2b3bc25baab0: Deploy log tailing/filtering
rMSCA2b3bc25baab0: Deploy log tailing/filtering
Patch without arc
git checkout -b D18 && curl -L https://phabricator.wikimedia.org/D18?download=true | git apply
Summary

The main deploy application now sends all structured log output to a
file under scap/log/{git-tag}.log, which the new deploy-log utility
can tail and filter using a given free-form expression. By default,
deploy-log periodically scans the scap/log directory for new files and
immediately begins tailing them. It can also be given an explicit log
file to parse via the --file option, or the latest log file via
--latest; in either case, it simply filters the entire file for
matching records and exits.
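
As a rough sketch of the command-line surface described above (only the --file/-f and --latest/-l names and the free-form expression argument come from this revision; everything else is illustrative):

import argparse

# Hypothetical sketch, not the actual bin/deploy-log option parser.
parser = argparse.ArgumentParser(prog='deploy-log')
parser.add_argument('-f', '--file', help='filter the given log file and exit')
parser.add_argument('-l', '--latest', action='store_true',
                    help='filter the latest log file under scap/log and exit')
parser.add_argument('expr', nargs='?', default='',
                    help='free-form filter expression')
args = parser.parse_args(['-l', 'levelno > WARNING'])
print(args.latest, args.expr)  # True levelno > WARNING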

Free-form log.Filter expressions are hopefully simple enough for most
users to grok and general enough to be used with later implementations
of iscap log multiplexing. See deploy-log -h or log.Filter for
expression examples.

Introduced a base main.DeployApplication class which sets up some of
the most commonly referenced attributes of the derived deploy classes,
including the root deploy directory, scap directory, and log directory.
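
A minimal sketch of what such a base class might provide (the use of the current working directory and the method name here are assumptions; only the three directories come from the summary above):

import os

class DeployApplication(object):
    # Hypothetical sketch: derive the commonly referenced directories
    # from the root of the deployment checkout.
    def _setup_dirs(self, root_dir=None):
        self.root_dir = root_dir or os.getcwd()
        self.scap_dir = os.path.join(self.root_dir, 'scap')
        # Structured log output lands here, e.g. scap/log/{git-tag}.log
        self.log_dir = os.path.join(self.scap_dir, 'log')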

Depends on D15. Resolves T115137.

Test Plan

Tail behavior

  1. Run deploy-log {expr}.
  2. Run deploy in a separate terminal.
  3. Verify that deploy-log in the first terminal starts reading the new log file. It should say -- Opening log file: {file}.
  4. Verify that only log messages matching the given expression are output.

Latest log file behavior

  1. Run deploy.
  2. Run deploy-log -l {expr}.
  3. Verify that only log messages from the latest log file matching the given expression are output.

Single log file behavior

  1. Run deploy a couple of times.
  2. Run deploy-log -f {log-file} {expr}.
  3. Verify that only log messages from the given log file matching the given expression are output.
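
The tail behavior exercised in the first subsection (periodically scanning scap/log and announcing new files) could be approximated by a polling loop along these lines; this is a sketch only, not the utility's actual internals:

import os
import time

def watch_log_dir(log_dir, interval=5):
    # Hypothetical sketch: poll log_dir and report files that appear.
    seen = set(os.listdir(log_dir))
    while True:
        time.sleep(interval)
        for name in sorted(set(os.listdir(log_dir)) - seen):
            seen.add(name)
            print('-- Opening log file: %s' % os.path.join(log_dir, name))
            # ...start tailing the new file and filtering its records...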

Diff Detail

Repository
rMSCA Scap
Branch
T115137-follow-up
Lint
Lint Errors
Severity  Location           Code  Message
Error     bin/deploy-log:16  E402  PEP8 E402
Error     scap/utils.py:693  E731  PEP8 E731
Unit
No Test Coverage
Build Status
Buildable 134
Build 136: test harbormaster with jenkins
Build 135: test harbormaster with jenkins

Event Timeline

dduvall retitled this revision to Deploy log tailing/filtering.
dduvall updated this object.
dduvall edited the test plan for this revision. (Show Details)
dduvall added reviewers: thcipriani, mmodell, demon.

Removed redundant setting of self.root_dir in main.Deploy

Let me reiterate what I said to Dan in IRC, "holy shit deploy-log is really awesome."

scap/log.py
361

This seems to blow up, since the partial returned for the expression levelno > WARNING would be: lambda v: getattr(30, 'gt')(v)
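
For illustration of the operand problem (30 being logging.WARNING; the actual filter code may build its comparisons differently):

import operator

WARNING = 30  # logging.WARNING

# The shape described above fails: an int has no 'gt' attribute.
#   broken = lambda v: getattr(WARNING, 'gt')(v)

# A working shape keeps the record value as the left-hand operand:
matches = lambda v: operator.gt(v, WARNING)

print(matches(40))  # True  -- ERROR (40) is above WARNING
print(matches(20))  # False -- INFO (20) is not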

scap/main.py
1183

"taret"

1200

This currently must be an absolute path, which was unexpected from a usage perspective. Might be better to do:

os.path.join(os.getcwd(), self.arguments.file)
dduvall edited edge metadata.

Reimplemented log.Filter expression parsing using shlex instead of a regex (thanks, @thcipriani!) and fixed comparison operators (>, <, >=, <=).
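
A rough sketch of what shlex-based parsing of these expressions might look like (the field handling, value coercion, and operator table here are assumptions, not the reviewed log.Filter code):

import operator
import shlex

OPERATORS = {'==': operator.eq, '!=': operator.ne,
             '>': operator.gt, '>=': operator.ge,
             '<': operator.lt, '<=': operator.le}

def parse_criteria(expression):
    # e.g. 'levelno > 30 name == target' ->
    #      [('levelno', operator.gt, 30), ('name', operator.eq, 'target')]
    tokens = shlex.split(expression)
    criteria = []
    for i in range(0, len(tokens), 3):
        field, op, value = tokens[i], OPERATORS[tokens[i + 1]], tokens[i + 2]
        if value.isdigit():
            value = int(value)  # numeric comparisons need numbers
        criteria.append((field, op, value))
    return criteria

def matches(record, criteria):
    # record: a dict of log record attributes
    return all(op(record.get(field), value) for field, op, value in criteria)

print(matches({'levelno': 40, 'name': 'target'},
              parse_criteria('levelno > 30 name == target')))  # True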

dduvall marked an inline comment as done.

Fixed typo

scap/main.py
1200

ignore me, had the paths all messed up.

Fixed another typo, this time in the test plan

mmodell edited edge metadata.

Very cool stuff. Great work Dan!

scap/main.py
1221

I like this algorithm. Good thinking.

This revision is now accepted and ready to land. Oct 15 2015, 11:26 PM
mmodell requested changes to this revision. Oct 16 2015, 8:18 PM
mmodell edited edge metadata.

Found problems that were actually caused by D15, but I'm providing feedback here because the problems came up during testing of this patch.

scap/log.py
124

This fails when there is an exception passed from the remote deploy-local via JSON, because the exception can't be serialized in a satisfactory format. We probably need to create our own stack trace (as a string or whatever...) and avoid passing it in the exc_info argument slot.
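
One way to pre-format the stack trace as a plain string before the record crosses the JSON boundary, as suggested above (a sketch of the approach, not the code under review):

import traceback

def stringify_exc_info(record):
    # Hypothetical: render the traceback as text and drop the
    # unserializable exc_info tuple before serialization.
    if record.exc_info:
        record.exc_text = ''.join(traceback.format_exception(*record.exc_info))
        record.exc_info = None
    return record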

140

This fails when there is an exception in the record. Exceptions aren't serializable. You can pass a callable via the default= argument to the dumps method to override the default handling of non-serializable objects; however, when I do that, there is another problem on the other end when we reconstruct the log message from the serialized JSON (see above, line #119)

This revision now requires changes to proceed. Oct 16 2015, 8:18 PM

I spent a lot of time on this over the weekend, and simplified it some, but I still haven't managed to get the stack traces to carry through from deploy-local all the way through to deploy-log.

dduvall added inline comments.
scap/log.py
124

Pre-formatting exc_info makes sense to me. Per our IRC discussion, we need to figure out how to properly deserialize/propagate this info on the consuming end.

dduvall edited edge metadata.

Added a failing test for handling of exc_info

dduvall edited edge metadata.

Added failing test for deserialization of exc_text

scap/log.py
124

Just added a failing test for this.

140

Added a failing test for this as well.

Refactored log.JSONFormatter serialization to handle exceptions correctly by first formatting them as text.

Note: I'm working on revising this.

scap/log.py
127

If we add this, it will ensure that unserializable objects won't throw exceptions here:

def serialize_obj(obj):
    # Fallback for values json.dumps can't serialize natively.
    if hasattr(obj, "getvalue"):
        return obj.getvalue()
    elif hasattr(obj, "__dict__"):
        return obj.__dict__
    return None

rec = record.__dict__.copy()
return json.dumps(rec, default=serialize_obj, skipkeys=True)
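
Putting that suggestion together with the format-exceptions-as-text refactor mentioned above, a combined formatter might look roughly like this (a sketch under those assumptions, not the reviewed JSONFormatter):

import json
import logging

class JSONFormatter(logging.Formatter):
    def format(self, record):
        fields = record.__dict__.copy()
        if record.exc_info:
            # Render the exception as text so the result stays serializable.
            fields['exc_text'] = self.formatException(record.exc_info)
            del fields['exc_info']
        return json.dumps(fields, default=repr, skipkeys=True)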
scap/log.py
117

A LogRecord can actually contain arbitrary fields, and this doesn't account for any we decide to add. Maybe not a huge deal, but @thcipriani and I were talking about surfacing important errors by including an "important" tag field in the log messages on the target machine, which would cause them to pass the filter on the main scap deploy (cli.application) console.

scap/log.py
111

Why is args marked unserializable? It needs to be serialized for parameterized log messages to work.

Now all the log entries print out just the format string without the data, for example:

19:48:39 [debian-jessie] %s on %s returned [%d]: %s
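
For context on why args matters (standard library logging behavior; the record values below are made up, though the format string is the one shown above): the final message only exists once args is interpolated into the format string, so dropping args during serialization leaves exactly the bare template.

import logging

record = logging.LogRecord(
    name='target', level=logging.DEBUG, pathname='example.py', lineno=1,
    msg='%s on %s returned [%d]: %s',
    args=('deploy_local', 'debian-jessie', 0, 'ok'), exc_info=None)

# With args intact, getMessage() interpolates them into the format string.
print(record.getMessage())  # deploy_local on debian-jessie returned [0]: ok

# With args stripped during serialization, only the raw template remains.
record.args = None
print(record.getMessage())  # %s on %s returned [%d]: %s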
mmodell edited reviewers, added: dduvall; removed: mmodell.
mmodell marked 2 inline comments as done.
  • Pass a default= callback to handle unserializable objects passed to json.dumps.
  • Do not exclude args from log message serialization; args is important!

Now tests pass and exceptions propagate! :)

Your changes look good to me, but my commits seem to have been clobbered by the new diff. Also, arc patch D18 is now super confused and tries to cherry-pick D15: Normalize structured logging data, which results in an empty commit.

scap/log.py
100–101

Arbitrary fields should still be included. See the _isvalid(field) method.

100–101

I was a bit worried about running into circular references there, but it looks like the JSON encoder actually takes care of that.

dduvall edited reviewers, added: mmodell; removed: dduvall.
mmodell edited edge metadata.

awesome

This revision is now accepted and ready to land. Oct 21 2015, 4:40 PM