Page MenuHomePhabricator

Eventlogging-service handles null incorrectly
Closed, ResolvedPublic

Description

Today I've discovered a couple of errors in the logs related to EventBus, which I could narrow down to 2 issues with test cases:

Problem 1:
If the field is optional, but it's value is specified as null, eventlogging-service rejects the event with None is not of type 'string' message.
Example:

curl -XPOST -H 'content-type: application/json'  localhost:8085/v1/events --data '{"user_blocked": "201.10.143.37", "reason": null, "meta": {"domain": "pt.wikipedia.org", "uri": "/user_block/uri", "topic": "mediawiki.user_block", "request_id": "9b0ff34e-402c-421b-8bb1-840462cdae0b", "schema_uri": "user_block/1", "dt": "2016-05-14T16:42:02+00:00", "id": "c8b10681-19f2-11e6-aab2-90b11c2793de"}, "user_id": 1053698, "user_text": "\u00d6nni", "blocks": {"account_create": true, "user_talk": false, "name": false, "email": false}, "expiry": "20160514174202"}'

Note the "reason": null part, it's the core of the problem.

Problem 2.
If I fix up the previous issue by providing a value to the service, it successfully emits an event, but for some reason responds with 500 and <html><title>500: Internal Server Error</title><body>500: Internal Server Error</body></html> message. I'm using the service from vagrant installation.
Example:

curl -i -XPOST -H 'content-type: application/json'  localhost:8085/v1/events --data '{"user_blocked": "201.10.143.37", "meta": {"domain": "pt.wikipedia.org", "uri": "/user_block/uri", "topic": "mediawiki.user_block", "request_id": "9b0ff34e-402c-421b-8bb1-840462cdae0b", "schema_uri": "user_block/1", "dt": "2016-05-14T16:42:02+00:00", "id": "c8b10681-19f2-11e6-aab2-90b11c2793de"}, "user_id": 1053698, "user_text": "\u00d6nni", "blocks": {"account_create": true, "user_talk": false, "name": false, "email": false}, "expiry": "20160514174202"}'

Event Timeline

Restricted Application added subscribers: Zppix, Aklapper. · View Herald TranscriptMay 19 2016, 12:24 PM
Pchelolo updated the task description. (Show Details)May 19 2016, 12:25 PM

Just chatted in IRC. Problem 1 is expected, null is not a valid string, so the schema validation fails. I can't reproduce Problem 2.

As for problem 2, here's the log:

2016-05-19 15:35:36,756 (MainThread) Writer kafka:///localhost:9092?async=False&topic=datacenter1.{meta[topic]} has stopped: .  Attempting to restart.
2016-05-19 15:35:36,802 (MainThread) Error loading topic metadata for datacenter1.mediawiki.user_block: <class 'kafka.common.LeaderNotAvailableError'> (5)
2016-05-19 15:35:37,342 (MainThread) Error loading topic metadata for datacenter1.mediawiki.user_block: <class 'kafka.common.LeaderNotAvailableError'> (5)
2016-05-19 15:35:37,843 (MainThread) Failed to ensure Kafka topic datacenter1.mediawiki.user_block exists in 1.000000 seconds when producing event. (Unable to create topic datacenter1.mediawiki.user_bloc$
2016-05-19 15:35:37,850 (MainThread) Uncaught exception POST /v1/events (10.0.2.2)
HTTPServerRequest(protocol='http', host='localhost:8085', method='POST', uri='/v1/events', version='HTTP/1.1', remote_ip='10.0.2.2', headers={'Host': 'localhost:8085', 'Content-Type': 'application/json',$
Traceback (most recent call last):
  File "/vagrant/srv/eventlogging/virtualenv/local/lib/python2.7/site-packages/tornado/web.py", line 1445, in _execute
    result = yield result
  File "/vagrant/srv/eventlogging/virtualenv/local/lib/python2.7/site-packages/tornado/gen.py", line 1008, in run
    value = future.result()
  File "/vagrant/srv/eventlogging/virtualenv/local/lib/python2.7/site-packages/tornado/concurrent.py", line 232, in result
    raise_exc_info(self._exc_info)
  File "/vagrant/srv/eventlogging/virtualenv/local/lib/python2.7/site-packages/tornado/gen.py", line 1014, in run
    yielded = self.gen.throw(*exc_info)
  File "/vagrant/srv/eventlogging/eventlogging/service.py", line 292, in post
    self.application.handle_events, events
  File "/vagrant/srv/eventlogging/virtualenv/local/lib/python2.7/site-packages/tornado/gen.py", line 1008, in run
    value = future.result()
  File "/vagrant/srv/eventlogging/virtualenv/local/lib/python2.7/site-packages/tornado/concurrent.py", line 232, in result
    raise_exc_info(self._exc_info)
  File "/vagrant/srv/eventlogging/virtualenv/local/lib/python2.7/site-packages/tornado/gen.py", line 584, in Task
    func(*args, callback=_argument_adapter(set_result), **kwargs)
  File "/vagrant/srv/eventlogging/eventlogging/service.py", line 192, in handle_events
    self.process_event(event)
  File "/vagrant/srv/eventlogging/eventlogging/service.py", line 175, in process_event
    self.send(event)
  File "/vagrant/srv/eventlogging/eventlogging/service.py", line 138, in send
    w.send(event)
  File "/vagrant/srv/eventlogging/eventlogging/handlers.py", line 241, in kafka_writer
    raise e
KafkaTimeoutError: Unable to create topic datacenter1.mediawiki.user_block
2016-05-19 15:35:37,851 (MainThread) 500 POST /v1/events (10.0.2.2) 1097.02ms

Looks like something is wrong with my vagrant setup as the service cannot write to kafka

Change 289680 had a related patch set uploaded (by Ppchelko):
Do not include the null block reason

https://gerrit.wikimedia.org/r/289680

Looks like something is wrong with my vagrant setup as the service cannot write to kafka

I had the same problem recently on MW-Vagrant. Stopping both Kafka and ZK, clearing ZK's dir, starting them up again and running ./bin/ensure-kafka-topics-exist /--topic-config vagrant/srv/event-schemas/config/eventbus.yaml --prefixes datacenter1 did the trick for me.

Change 289680 merged by Mobrovac:
Do not include the null block reason

https://gerrit.wikimedia.org/r/289680

Pchelolo closed this task as Resolved.May 20 2016, 9:24 AM

Ok, null issue was fixed up, and vagrant problem resolved by @mobrovac suggestion