Maniphest T208550

bothersome output in hive when querying events database
Closed, Resolved · Public · 3 Estimated Story Points
Assigned To

	• JAllemandou

Authored By

	• Nuria
	Nov 1 2018, 9:37 PM

Description

Hive prints a lot of noisy output when querying the events database. How can we make it disappear?

Can't load log handler "java.util.logging.FileHandler"
java.io.FileNotFoundException: /tmp/hive-parquet-logs/parquet-0.log (Permission denied)
java.io.FileNotFoundException: /tmp/hive-parquet-logs/parquet-0.log (Permission denied)
at java.io.FileOutputStream.open0(Native Method)
at java.io.FileOutputStream.open(FileOutputStream.java:270)
at java.io.FileOutputStream.<init>(FileOutputStream.java:213)
at java.io.FileOutputStream.<init>(FileOutputStream.java:133)
at java.util.logging.FileHandler.open(FileHandler.java:228)
at java.util.logging.FileHandler.rotate(FileHandler.java:680)
at java.util.logging.FileHandler.openFiles(FileHandler.java:557)
at java.util.logging.FileHandler.<init>(FileHandler.java:281)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstruc

Nov 1, 2018 9:35:31 PM INFO: parquet.hadoop.InternalParquetRecordReader: RecordReader initialized will read a total of 5161 records.
Nov 1, 2018 9:35:31 PM INFO: parquet.hadoop.InternalParquetRecordReader: at row 0. reading next block
Nov 1, 2018 9:35:31 PM INFO: parquet.hadoop.InternalParquetRecordReader: block read in memory in 21 ms. row count = 5161
Nov 1, 2018 9:35:31 PM WARNING: parquet.hadoop.ParquetRecordReader: Can not initialize counter due to context is not a instance of TaskInputOutputContext, but is org.apache.hadoop.mapreduce.task.TaskAttemptContextImpl

Note: this output is printed when the query does not return any results.

Details

	Subject	Repo	Branch	Lines +/-
	Update hive parquet log destination	operations/puppet/cdh	master	+5 -25

Related Objects

Mentioned In: T297734: Hive query failure in Jupyter notebook on stat1005

Event Timeline

• Nuria created this task. Nov 1 2018, 9:37 PM

Restricted Application added a subscriber: Aklapper. Nov 1 2018, 9:37 PM

• Nuria updated the task description. Nov 1 2018, 9:37 PM

• Nuria updated the task description. Nov 1 2018, 9:47 PM

@Nuria: This is related to the fix we deployed to try to prevent the logging lines (https://gerrit.wikimedia.org/r/c/operations/puppet/cdh/+/469499).
Can you specify which machine you were using when you got those logs?

I was using 1007.

The only reason I can think of for this issue is that you and someone else each had a hive query writing parquet logs at the same time.
I'd love to give the log files better names (embedding the username at least), but the java-logging configuration doesn't easily allow that by default.
Maybe there are other ways to do this?
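
To make the naming idea concrete, here is a minimal sketch of a per-user log file set up in code rather than through the logging configuration. It is not what the puppet patch does. The class `PerUserParquetLog`, its helpers, the `hive-parquet-logs-demo` directory, and the logger name `parquet` are all assumptions made for illustration. Only `java.util.logging.FileHandler` and its `%u` pattern placeholder come from the JDK.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.logging.FileHandler;
import java.util.logging.Logger;
import java.util.logging.SimpleFormatter;

public class PerUserParquetLog {

    // Hypothetical helper: builds a FileHandler pattern with the user name in the path.
    // "%u" is FileHandler's own placeholder for a unique number used to resolve conflicts.
    static String patternFor(String baseDir, String user) {
        return baseDir + "/" + user + "/parquet-%u.log";
    }

    // Hypothetical helper: creates the per-user directory, then opens the handler in append mode.
    static FileHandler openHandler(String baseDir, String user) throws IOException {
        Files.createDirectories(Paths.get(baseDir, user));
        FileHandler handler = new FileHandler(patternFor(baseDir, user), true);
        handler.setFormatter(new SimpleFormatter());
        return handler;
    }

    public static void main(String[] args) throws IOException {
        String user = System.getProperty("user.name");
        // Demo directory under the JVM temp dir (assumed name, not the real cluster path).
        String baseDir = Paths.get(System.getProperty("java.io.tmpdir"),
                                   "hive-parquet-logs-demo").toString();

        FileHandler handler = openHandler(baseDir, user);
        Logger logger = Logger.getLogger("parquet"); // logger name is an assumption
        logger.addHandler(handler);
        logger.info("parquet logging now goes to a per-user file");
        handler.close();
    }
}
```

With this layout each user writes under a directory they own, so two users running queries at the same time would not contend for the same `parquet-0.log`. A properties-only variant might lean on the `%h` placeholder, which `FileHandler` expands to the `user.home` system property, but whether that fits the Hive setup here is untested.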

• fdans assigned this task to JAllemandou. Nov 5 2018, 5:32 PM

• fdans triaged this task as High priority.

• fdans moved this task from Incoming to Operational Excellence on the Analytics board.

• fdans added a project: Analytics-Kanban.

Change 471928 had a related patch set uploaded (by Joal; owner: Joal):
[operations/puppet/cdh@master] Update hive parquet log to HiverServer2 only

https://gerrit.wikimedia.org/r/471928

gerritbot added a project: Patch-For-Review. Nov 6 2018, 9:55 AM

JAllemandou moved this task from Next Up to In Code Review on the Analytics-Kanban board. Nov 6 2018, 9:55 AM

Change 471928 merged by Elukey:
[operations/puppet/cdh@master] Update hive parquet log destination

https://gerrit.wikimedia.org/r/471928

JAllemandou moved this task from In Code Review to Done on the Analytics-Kanban board. Nov 8 2018, 5:01 PM

• Nuria set the point value for this task to 3. Nov 12 2018, 4:00 PM

This worked great and bogus output is no longer there.

• Nuria closed this task as Resolved. Nov 12 2018, 8:20 PM

BTullis mentioned this in T297734: Hive query failure in Jupyter notebook on stat1005. Jan 4 2022, 4:44 PM

Maintenance_bot removed a project: Patch-For-Review. Jan 4 2022, 5:10 PM
