1. Source data
hive> select * from word;
OK
1 MSN
10 QQ
100 Gtalk
1000 Skype
2. Create a table stored in Parquet format
hive> CREATE TABLE parquet_table(id INT, name STRING) STORED AS PARQUET;
3. Describe the table
hive> describe parquet_table;
OK
id int
name string
Time taken: 0.099 seconds, Fetched: 2 row(s)
4. Insert data
hive> INSERT OVERWRITE TABLE parquet_table SELECT * FROM word;
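SELECT * maps columns by position, so the statement above only works when word has the same column order as parquet_table. A minimal sketch that names the columns explicitly, assuming word has columns called id and name (its schema is not shown here, so these names are hypothetical):

```sql
-- Assumption: word has columns named id and name; adjust to the real schema.
-- INSERT OVERWRITE replaces any rows already in parquet_table.
INSERT OVERWRITE TABLE parquet_table
SELECT id, name FROM word;
```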
5. Query
hive> select * from parquet_table;
OK
1 MSN
10 QQ
100 Gtalk
1000 Skype
6. File contents on HDFS (Parquet binary format)
7. References
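One way to locate the files behind the table from the Hive CLI is sketched below. The warehouse path is an assumption (a common default); the actual directory is the Location value reported by DESCRIBE FORMATTED.

```sql
-- Show table metadata; the Location field gives the table's HDFS directory.
DESCRIBE FORMATTED parquet_table;

-- List the data files written by the INSERT.
-- Assumption: the table lives under the default warehouse directory.
dfs -ls /user/hive/warehouse/parquet_table;
```

The files are binary and not readable as text. A valid Parquet file begins and ends with the 4-byte magic string PAR1.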
https://cwiki.apache.org/confluence/display/Hive/Parquet#Parquet-HiveQLSyntax