
Extra tab is prepended to quoted fields in TSV output format
Closed, Resolved · Public


For example, when I download the results of this query:

I get:

"	rev_id"	"	rev_timestamp"	"	page_id"	"	page_title"	"	user_id"	"	actor_name"	"	user_registration"	"	ug_group"	"	archived"
3051580	"	20160917041145"	152080	"	$"	36077	"	Koavf"	"	20121111063553"		0
3151740	"	20170217015742"	154584	"	'''Swiss_German'''"	74811	"	Andrewssi2"

Note that a tab appears in the beginning of quoted values. This whitespace should not be there.

I would expect something that looks like this:

"rev_id"	"rev_timestamp"	"page_id"	"page_title"	"user_id"	"actor_name"	"user_registration"	"ug_group"	"archived"
3051580	"20160917041145"	152080	"$"	36077	"Koavf"	"20121111063553"		0
3151740	"20170217015742"	154584	"'''Swiss_German'''"	74811	"Andrewssi2"
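To illustrate how this affects a consumer of the file, here is a minimal sketch using Python's standard `csv` module on one (shortened) row from the output above. The variable names are only for illustration, and stripping the leading tab is an assumed client-side workaround, not a fix; it would also remove a tab that legitimately starts a value.

```python
import csv
import io

# One data row from the report above, shortened; tabs are written as \t.
sample = '3051580\t"\t20160917041145"\t152080\t"\t$"\t36077\t"\tKoavf"\n'

reader = csv.reader(io.StringIO(sample), delimiter="\t", quotechar='"')
row = next(reader)

# The tab inside the quotes is parsed as part of the field value.
polluted = [cell for cell in row if cell.startswith("\t")]

# Assumed workaround: drop leading tabs from every cell after parsing.
cleaned = [cell.lstrip("\t") for cell in row]

print(polluted)
print(cleaned)
```

Unquoted numeric fields such as `rev_id` are unaffected, which matches the output shown above.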

Event Timeline

Halfak created this task. May 24 2019, 3:54 PM
Restricted Application added a subscriber: Aklapper. May 24 2019, 3:54 PM
Halfak renamed this task from "Extra whitespace is added to quoted fields in TSV output format" to "Extra tab is prepended to quoted fields in TSV output format". May 24 2019, 3:57 PM
Halfak updated the task description.

It looks like maybe this is to blame?

I'm honestly not sure why prepending a tab ever makes sense.

Looks like this affects the CSV writer too.

This is from T209226: Quarry can be affected by CSV Injection. It's not supposed to hit every line. I'm looking into it.
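For context, a minimal sketch of what a selective escape could look like, assuming the intent is to prefix a tab only to string cells that a spreadsheet might interpret as a formula (those starting with `=`, `+`, `-` or `@`). This is not the actual `_inner_csv_injection_escape` code; `FORMULA_PREFIXES`, `escape_cell` and `escape_row` are hypothetical names used for illustration.

```python
# Hypothetical sketch; not the actual Quarry implementation.
# Assumption: only these leading characters trigger formula evaluation.
FORMULA_PREFIXES = ("=", "+", "-", "@")


def escape_cell(value):
    """Prefix a tab only when a string cell could be read as a formula."""
    if isinstance(value, str) and value.startswith(FORMULA_PREFIXES):
        return "\t" + value
    # Non-string values and harmless strings pass through unchanged.
    return value


def escape_row(row):
    """Apply the escape to every cell of one output row."""
    return [escape_cell(value) for value in row]


print(escape_row(["rev_id", "Koavf", "=SUM(A1)", 152080]))
```

With a check like this, ordinary values such as `rev_id` or `Koavf` would be written without the extra tab, and only cells like `=SUM(A1)` would be escaped.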

Change 512420 had a related patch set uploaded (by Zhuyifei1999; owner: Zhuyifei1999):
[analytics/quarry/web@master] Fix logic error in _inner_csv_injection_escape

Change 512420 merged by jenkins-bot:
[analytics/quarry/web@master] Fix logic error in _inner_csv_injection_escape

Mentioned in SAL (#wikimedia-cloud) [2019-05-25T12:22:00Z] <wm-bot> framawiki: Deployed cc0c0a7 on -web-01 T224300

Framawiki closed this task as Resolved. May 25 2019, 12:23 PM
Framawiki assigned this task to zhuyifei1999.
Framawiki added a subscriber: Framawiki.

Thanks for the report.