
clientIP needs to be collected as part of the schema or ...
Closed, Resolved (Public)

Description

As discussed in the comments of T165364, clientIp is not being collected for events (since T128407). We need clientIp to be able to connect the survey data to webrequest logs.

Event Timeline

From the email thread:

Here’s a more fleshed-out query for extracting the relevant pieces of the event while retaining the webrequest data. The event can then be compared against the MySQL EventLogging data to check its validity.

@leila, does this satisfy the requirements for getting the client IP?

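-- Decode the URL-encoded JSON payload of the beacon, then extract the survey session token.
select
  *,
  get_json_object(json_event, '$.event.surveySessionToken') as survey_session_token
from (
  select
    *,
    -- substr(uri_query, 2) drops the leading '?' before URL-decoding
    reflect("java.net.URLDecoder", "decode", substr(uri_query, 2)) as json_event
  from webrequest
  where
    uri_path like '%beacon/event'
    and uri_query like '%QuickSurvey%'
    and year = 2017
    and month = 5
    and day = 15
    and hour = 15
  limit 1
) q1
;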
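
As a rough illustration of what the query does to each row, the Python sketch below mimics its two steps: dropping the leading `?` from `uri_query` and URL-decoding it (the `reflect` call), then reading `event.surveySessionToken` from the resulting JSON (the `get_json_object` call). The function name, the sample payload, and the schema name `QuickSurveyExample` are made up for illustration and are not taken from real logs.

```python
import json
from urllib.parse import quote, unquote_plus


def extract_survey_session_token(uri_query):
    """Return event.surveySessionToken from a beacon uri_query, or None."""
    # Mirrors substr(uri_query, 2): drop the leading '?'.
    encoded = uri_query[1:]
    # Mirrors java.net.URLDecoder.decode, which also maps '+' to a space.
    json_event = unquote_plus(encoded)
    try:
        payload = json.loads(json_event)
    except ValueError:
        # get_json_object yields NULL for invalid JSON; None plays that role here.
        return None
    if not isinstance(payload, dict):
        return None
    event = payload.get("event")
    if not isinstance(event, dict):
        return None
    return event.get("surveySessionToken")


# Hypothetical beacon payload, for illustration only.
sample_event = {
    "schema": "QuickSurveyExample",
    "event": {"surveySessionToken": "abc123"},
}
sample_uri_query = "?" + quote(json.dumps(sample_event))
token = extract_survey_session_token(sample_uri_query)
```

Returning `None` for malformed or incomplete payloads keeps the behavior close to the query, where such rows would simply carry a NULL token rather than fail.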

@schana per our follow-up conversation on IRC and your suggestion: this may be a more accurate way to link the two datasets (EL and webrequest logs) anyway, since we don't have to rely on approximate matching based on IP+UA.
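
To make the linking idea concrete, here is a minimal sketch of an exact join on the survey session token, in contrast to an approximate match on IP and user agent. Everything in it is an assumption for illustration: the function name `link_by_token`, the field names `survey_session_token`, `client_ip`, and `response`, and the sample rows do not come from the actual tables.

```python
def link_by_token(el_rows, webrequest_rows):
    """Pair each EL row with the webrequest rows that share its token."""
    # Index webrequest rows by token so each EL row is matched exactly.
    by_token = {}
    for row in webrequest_rows:
        by_token.setdefault(row["survey_session_token"], []).append(row)

    linked = []
    for el in el_rows:
        for wr in by_token.get(el["surveySessionToken"], []):
            # Attach the client IP from the webrequest side to the EL record.
            linked.append({**el, "client_ip": wr["client_ip"]})
    return linked


# Hypothetical rows, for illustration only.
el_rows = [
    {"surveySessionToken": "abc123", "response": "yes"},
    {"surveySessionToken": "zzz999", "response": "no"},
]
webrequest_rows = [
    {"survey_session_token": "abc123", "client_ip": "192.0.2.10"},
]
linked = link_by_token(el_rows, webrequest_rows)
```

EL rows without a matching webrequest row (here the `zzz999` token) are dropped, which corresponds to an inner join.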

@flemmerich can you look into updating the code based on this new information and confirm whether this approach allows us to link EL and webrequest logs? If yes, we don't have a blocker.

@leila I'm moving this to done based on the latest emails. Let me know if anything further is required.

leila lowered the priority of this task from Unbreak Now! to High. May 25 2017, 8:48 PM

@schana closing this per your comment, and because we now know that we can link the data without this information.