raw-paste-data.txt

Authored By
klausman
Feb 22 2021, 4:02 PM

>>> rdd = sc.sequenceFile("/wmf/data/raw/atskafka_test_webrequest_text/atskafka_test_webrequest_text/hourly/2021/02/21/12/")
>>> webrequest_schema = spark.table("wmf_raw.webrequest").schema
>>> df = spark.read.schema(webrequest_schema).json(rdd)
>>> df.createOrReplaceGlobalTempView("requests")
21/02/22 16:01:12 WARN Utils: Truncated the string representation of a plan since it was too large. This behavior can be adjusted by setting 'spark.debug.maxToStringFields' in SparkEnv.conf.
>>> df2 = spark.sql("select uri_path from global_temp.requests limit 10")
>>> df2.show()
+--------+
|uri_path|
+--------+
|    null|
|    null|
|    null|
|    null|
|    null|
|    null|
|    null|
|    null|
|    null|
|    null|
+--------+
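
A possible explanation for the all-null column, offered as an assumption and not something the paste confirms: sc.sequenceFile() yields (key, value) pairs, while spark.read.json() expects each RDD element to be one JSON string. If the whole pair is turned into text before parsing, the result is not valid JSON, and a permissive reader with a fixed schema would fill every column with null. The sketch below imitates that in plain Python without Spark. The sample records and the helper uri_path_or_none are hypothetical, made up for illustration.

```python
import json

# Hypothetical stand-ins for what sc.sequenceFile() yields: (key, value) pairs
# whose value is the JSON text of one request. The sample data is invented.
pairs = [
    (None, '{"uri_path": "/wiki/Main_Page"}'),
    (None, '{"uri_path": "/w/index.php"}'),
]


def uri_path_or_none(line):
    """Permissive parse: return uri_path, or None if line is not a JSON object."""
    try:
        record = json.loads(line)
    except ValueError:
        return None
    return record.get("uri_path") if isinstance(record, dict) else None


# Assumption: non-string elements get stringified as a whole, producing text
# like "(None, '{...}')", which is not valid JSON.
from_pairs = [uri_path_or_none(str(pair)) for pair in pairs]

# Keeping only the value of each pair leaves parseable JSON.
# In the session above, the analogous change would be passing rdd.values().
from_values = [uri_path_or_none(value) for _key, value in pairs]

print(from_pairs)
print(from_values)
```

If this is indeed the cause, passing rdd.values() (or rdd.map(lambda kv: kv[1])) to spark.read.schema(webrequest_schema).json(...) would be the first thing to try. Whether it is enough depends on the actual key and value types stored in the sequence files.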
