1. Querying data with SQL in Spark, step by step:
1) Type spark in spark-shell to confirm that a SparkSession has already been created for you:
spark
res3: org.apache.spark.sql.SparkSession = org.apache.spark.sql.SparkSessio
2) To read data, type spark.read and then press the Tab key; spark-shell will list all the file formats Spark can read. Let's try it:
spark.read.
csv format jdbc json load option options orc parquet schema table text textFile
It supports not only csv, jdbc, and json, but also parquet, orc, textFile, table, and more (some of which I haven't tried myself). Next, let's try read.json.
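The shortcut methods (csv, json, parquet, ...) are sugar over the generic format(...).load(...) reader. A minimal self-contained sketch of both forms (the CSV file is written to a temp path here purely for illustration; the column names are made up):

```scala
import java.nio.file.Files
import org.apache.spark.sql.SparkSession

object ReadFormatsDemo {
  def run(): Long = {
    // Local session just for this sketch; in spark-shell `spark` already exists.
    val spark = SparkSession.builder()
      .appName("ReadFormatsDemo")
      .master("local[*]")
      .getOrCreate()

    // Write a tiny CSV file so the example does not depend on external data.
    val path = Files.createTempFile("people", ".csv")
    Files.write(path, "age,name\n20,zhangsan\n20,lisi\n".getBytes("UTF-8"))

    // Shortcut method...
    val df1 = spark.read.option("header", "true").csv(path.toString)
    // ...and the equivalent generic form.
    val df2 = spark.read.format("csv").option("header", "true").load(path.toString)

    val total = df1.count() + df2.count()
    spark.stop()
    total
  }
}
```

Both readers produce the same DataFrame; format(...).load(...) is handy when the format name comes from configuration.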
3) Read the JSON file; Spark infers the schema and returns a DataFrame:
spark.read.json("file:///opt/module/data/input/2.json")
res4: org.apache.spark.sql.DataFrame = [age: bigint, name: string]
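Note that spark.read.json expects JSON Lines: one JSON object per line, not a single JSON array. Given the inferred schema and the rows shown in the query output at the end of this walkthrough, 2.json presumably looks something like this (a reconstruction, not the actual file):

```json
{"age": 20, "name": "zhangsan"}
{"age": 20, "name": "lisi"}
{"age": 20, "name": "wangwu"}
```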
4) Bind the result to a DataFrame named df:
val df = spark.read.json("file:///opt/module/data/input/2.json")
df: org.apache.spark.sql.DataFrame = [age: bigint, name: string]
5) Create a global temporary view; the key point is to give it a view name:
df.createGlobalTempView("student")
6) Query the view with SQL. A global temp view is registered under the reserved global_temp database, so you must qualify its name with the global_temp. prefix:
spark.sql("select * from global_temp.student").show()
+---+--------+
|age| name|
+---+--------+
| 20|zhangsan|
| 20| lisi|
| 20| wangwu|
+---+--------+
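Putting the steps above together: a self-contained sketch that builds the same rows from an in-memory Seq instead of reading 2.json, so it runs without the file. It also contrasts a global temp view with createOrReplaceTempView, which is scoped to a single SparkSession and needs no prefix:

```scala
import org.apache.spark.sql.SparkSession

object GlobalTempViewDemo {
  def run(): Seq[String] = {
    // Local session for illustration; in spark-shell `spark` already exists.
    val spark = SparkSession.builder()
      .appName("GlobalTempViewDemo")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Same rows as the walkthrough, built in memory so the sketch is self-contained.
    val df = Seq((20L, "zhangsan"), (20L, "lisi"), (20L, "wangwu")).toDF("age", "name")

    // Global temp views live in the reserved `global_temp` database and are
    // visible to every SparkSession in the same application.
    df.createGlobalTempView("student")
    val names = spark.sql("select name from global_temp.student")
      .collect().map(_.getString(0)).toSeq.sorted

    // A session-scoped view, by contrast, is queried without any prefix
    // and disappears when its SparkSession ends.
    df.createOrReplaceTempView("student_local")
    spark.sql("select * from student_local").show()

    spark.stop()
    names
  }
}
```

A global temp view lives as long as the application, so it is the right choice when several sessions need to share the same view; otherwise the session-scoped variant is simpler.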