HBase Phoenix download: using HBase and Phoenix side by side

Reposted

云端小梦 2023-11-17 14:37:37


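Following on from the previous post on the database application lab, this post is mainly about getting to grips with Phoenix + HBase. When I first met these two names I had no idea what they were for; only after searching through a pile of explainer videos on Bilibili did I get a rough picture.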

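I. What Phoenix + HBase is

HBase is a NoSQL (Not Only SQL) database that can read and write large volumes of data in real time; a single table can reach the scale of billions of rows by millions of columns. Phoenix is a SQL layer built on top of HBase that lets us work with HBase data through standard JDBC instead of the native client API. It is written entirely in Java and acts as an embedded JDBC driver for HBase. The Phoenix query engine turns a SQL query into one or more HBase scans and orchestrates their execution to produce a standard JDBC result set. In short, Phoenix operates HBase: it compiles SQL into native HBase scans, works out the best start and stop row keys for each scan, and runs the scans in parallel to produce the result.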
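To make "working out the best start and stop keys for a scan" more concrete, here is a toy sketch in plain Python. It is not Phoenix code: the sorted list only stands in for HBase's row-key-ordered storage, and `scan` and `prefix_to_range` are hypothetical helper names invented for this illustration.

```python
from bisect import bisect_left

# Toy stand-in for an HBase table: row keys kept in sorted (lexicographic) order.
ROWS = sorted(["10001", "10002", "20001", "20002", "20003", "30001"])


def scan(start_row, stop_row):
    """Return the row keys in [start_row, stop_row), like a bounded range scan."""
    lo = bisect_left(ROWS, start_row)
    hi = bisect_left(ROWS, stop_row)
    return ROWS[lo:hi]


def prefix_to_range(prefix):
    """Turn a predicate such as  id LIKE '2%'  into a (start, stop) key pair."""
    # The stop key is the prefix with its last character bumped by one.
    stop = prefix[:-1] + chr(ord(prefix[-1]) + 1)
    return prefix, stop


start, stop = prefix_to_range("2")
print(start, stop)
print(scan(start, stop))
```

Because the keys are sorted, only the slice between the two bounds is touched; rows starting with "1" or "3" are never read. That is the benefit of bounding a scan rather than filtering a full table scan.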

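[Figure: HBase architecture diagram]

The figure above is a screenshot of the HBase architecture diagram from an introduction to HBase that I watched on Bilibili; it describes the role of each process.

Logical structure

[Figure: HBase logical structure]

Physical structure

[Figure: HBase physical structure]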

Problems I ran into:

1. How to switch to another host:

ssh master, then enter the password

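2. Uploading a file to HDFS:

hadoop fs -put /home/zkpk/experiment/t1.csv /

The command ends with a space followed by / (the destination directory on HDFS). I left out the space and spent ages getting nothing but errors.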
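The missing-space mistake is easier to see once the command is split into arguments. The sketch below uses Python's `shlex` module, which approximates how a POSIX shell tokenizes a command line; it does not run Hadoop, it only shows what `-put` would receive in each case.

```python
import shlex

good = "hadoop fs -put /home/zkpk/experiment/t1.csv /"
bad = "hadoop fs -put /home/zkpk/experiment/t1.csv/"

good_args = shlex.split(good)
bad_args = shlex.split(bad)

# With the space, -put is followed by two arguments: source and destination.
print(good_args[3:])
# Without the space, source and destination fuse into a single argument.
print(bad_args[3:])
```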

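3. Importing the file into HBase

First run the upload command above, then run the following command:

hbase org.apache.hadoop.hbase.mapreduce.ImportTsv -Dimporttsv.separator="," -Dimporttsv.columns=HBASE_ROW_KEY,cf1:province,cf1:market,cf1:class,cf1:name,cf1:lowprice,cf1:highprice,cf1:avgprice,cf1:time t1 /t1.csv

Here t1 is the target table and /t1.csv is the path of the file on HDFS.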
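The `-Dimporttsv.columns` list maps each field of a line, by position, to either the row key or a `family:qualifier` column. The following plain-Python sketch mimics that mapping for one line. It is only an illustration: `map_line` is a hypothetical helper, the sample line is made up, and the real tool's handling of malformed lines may differ from the `ValueError` raised here.

```python
COLUMNS = (
    "HBASE_ROW_KEY,cf1:province,cf1:market,cf1:class,cf1:name,"
    "cf1:lowprice,cf1:highprice,cf1:avgprice,cf1:time"
).split(",")


def map_line(line, separator=","):
    """Split one line and pair each field with its column from COLUMNS."""
    fields = line.rstrip("\n").split(separator)
    if len(fields) != len(COLUMNS):
        raise ValueError("expected %d fields, got %d" % (len(COLUMNS), len(fields)))
    row_key = None
    cells = {}
    for column, value in zip(COLUMNS, fields):
        if column == "HBASE_ROW_KEY":
            row_key = value          # this field becomes the row key
        else:
            cells[column] = value    # every other field becomes one cell
    return row_key, cells


# Made-up sample line, only to show the shape of the mapping.
sample = "1,Shanxi,MarketA,vegetable,potato,1.0,1.4,1.2,2014/1/1"
row_key, cells = map_line(sample)
print(row_key)
print(cells["cf1:name"], cells["cf1:avgprice"])
```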

4. Hadoop must be started before HBase is used

start-all.sh    starts the Hadoop cluster

jps    lists the running Java processes

ssh <node name>    logs in to another node

5. ZooKeeper must also be started before HBase is used

zkServer.sh start    starts ZooKeeper

ZooKeeper has to be started on every node.

zkServer.sh status    shows the ZooKeeper status

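6. Starting HBase

start-hbase.sh    starts HBase

Run jps on the master node and check for an HMaster process. If it is there, HBase has started; otherwise startup failed.

Run jps on each worker node and check for an HRegionServer process. If it is there, startup succeeded; otherwise it failed.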
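The jps check can be automated with a few lines of Python. This is a minimal sketch under stated assumptions: `has_process` is a hypothetical helper, and the sample outputs (including the process IDs) are made up; in practice the text would come from running `jps` on each node.

```python
def has_process(jps_output, name):
    """Return True if some line of `jps` output names exactly this process."""
    for line in jps_output.splitlines():
        parts = line.split()
        # Each jps line has the form "<pid> <main class name>".
        if len(parts) >= 2 and parts[1] == name:
            return True
    return False


# Made-up sample outputs for a master node and a worker node.
master_jps = """2481 NameNode
2903 QuorumPeerMain
3120 HMaster
3377 Jps"""

worker_jps = """1804 DataNode
2011 QuorumPeerMain
2240 HRegionServer
2391 Jps"""

print(has_process(master_jps, "HMaster"))
print(has_process(worker_jps, "HRegionServer"))
print(has_process(worker_jps, "HMaster"))
```

Matching on the whole second field avoids false positives from names that merely contain the text being searched for.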

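II. Native HBase shell commands

hbase shell    start the shell

create 'table_name','family1','family2'    create a table

disable 'table_name'    take a table offline

enable 'table_name'    bring a table online

drop 'table_name'    delete a table (it must be disabled first)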

list_namespace    list the namespaces

create_namespace 'namespace_name'    create a namespace

drop_namespace 'namespace_name'    delete a namespace

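put 'table_name','row','family:column','value'    insert or update a value

scan 'table_name'    scan the whole table

get 'table_name','row','family:column'    read one column of a row

get 'table_name','row','family'    read one column family of a row

get 'table_name','row'    read a whole row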

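delete 'table_name','row','family'    delete a column family of a row

delete 'table_name','row','family:column'    delete one column of a row

deleteall 'table_name','row'    delete a whole row

truncate 'table_name'    remove all data from the table (the table itself is kept)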
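The data commands above all address the same nested structure: table, then row key, then 'family:column', then value. The toy model below imitates put, get, delete and deleteall with plain Python dictionaries. It is only a mental model, not HBase: versions, timestamps and column-family definitions are ignored, and all function names are local to this sketch.

```python
from collections import defaultdict

# One toy "table": row key -> {"family:column": value}
table = defaultdict(dict)


def put(row, column, value):
    table[row][column] = value          # insert, or overwrite an existing cell


def get(row, column=None):
    cells = table.get(row, {})
    return dict(cells) if column is None else cells.get(column)


def delete(row, column):
    table.get(row, {}).pop(column, None)   # remove one cell of a row


def deleteall(row):
    table.pop(row, None)                   # remove the whole row


put("r1", "cf1:name", "a")
put("r1", "cf1:age", "24")
put("r1", "cf1:name", "b")   # same row and column: acts as an update
print(get("r1", "cf1:name"))
print(get("r1"))
```

The second put on 'cf1:name' shows why the shell has no separate update command: writing to an existing cell simply replaces what a read returns.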

III. HBase Java API

package org.zkpk.hbase.api;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Delete;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;



public class HbaseT1 {
	 static int count;
	    static Configuration conf;
	    static Connection connection;
	    static {
	        try {
	            conf = HBaseConfiguration.create();
	            conf.set("hbase.zookeeper.quorum", "master:2181");
	            connection = ConnectionFactory.createConnection(conf);
	        } catch (IOException e) {
	            // TODO Auto-generated catch block
	            e.printStackTrace();
	        }
	    }
	    
	    
	    // Insert a single cell
	    public static void putRow(String tableName, String rowkey, String cfName, String qualifier, String data) {
	        try {
	            Table table = connection.getTable(TableName.valueOf(tableName));
	            Put put = new Put(Bytes.toBytes(rowkey));
	            put.addColumn(Bytes.toBytes(cfName), Bytes.toBytes(qualifier), Bytes.toBytes(data));
	            table.put(put);
	            // Report success only if the put did not throw
	            System.out.println("Inserted one row successfully!");
	        } catch (Exception e) {
	            e.printStackTrace();
	        }
	    }
	    
	    // Insert multiple rows
	    public static void putRows(String tableName, List<Put> puts) {
	        try {
	            Table table = connection.getTable(TableName.valueOf(tableName));
	            count += puts.size();
	            table.put(puts);
	        } catch (Exception e) {
	            e.printStackTrace();
	        }
	    } 
	    
	    
	    // Read one row
	    public static void getRow(String tableName, String rowkey) {
	        try {
	            Table table = connection.getTable(TableName.valueOf(tableName));
	            Get get = new Get(Bytes.toBytes(rowkey));
	            Result result = table.get(get);
	            System.out.println("rowkey == " + Bytes.toString(result.getRow()));
	            System.out.println(
	                    "province == " + Bytes.toString(result.getValue(Bytes.toBytes("cf1"), Bytes.toBytes("province"))));
	            System.out.println(
	                    "market == " + Bytes.toString(result.getValue(Bytes.toBytes("cf1"), Bytes.toBytes("market"))));
	            System.out
	                    .println("name == " + Bytes.toString(result.getValue(Bytes.toBytes("cf1"), Bytes.toBytes("name"))));
	            System.out.println(
	                    "avgprice == " + Bytes.toString(result.getValue(Bytes.toBytes("cf1"), Bytes.toBytes("avgprice"))));
	            System.out
	                    .println("time == " + Bytes.toString(result.getValue(Bytes.toBytes("cf1"), Bytes.toBytes("time"))));
	        } catch (Exception e) {
	            e.printStackTrace();
	        }
	    }
	    
	    // Scan the whole table
	    public static void getScanner(String tableName) {
	        try {
	            Table table = connection.getTable(TableName.valueOf(tableName));
	            Scan scan = new Scan();
	            scan.setCaching(1000);
	            ResultScanner results = table.getScanner(scan);
	            results.forEach(result -> {
	                System.out.println("rowkey == " + Bytes.toString(result.getRow()));
	                System.out.println("province == "
	                        + Bytes.toString(result.getValue(Bytes.toBytes("cf1"), Bytes.toBytes("province"))));
	                System.out.println(
	                        "market == " + Bytes.toString(result.getValue(Bytes.toBytes("cf1"), Bytes.toBytes("market"))));
	                System.out.println(
	                        "name == " + Bytes.toString(result.getValue(Bytes.toBytes("cf1"), Bytes.toBytes("name"))));
	                System.out.println("avgprice == "
	                        + Bytes.toString(result.getValue(Bytes.toBytes("cf1"), Bytes.toBytes("avgprice"))));
	                System.out.println(
	                        "time == " + Bytes.toString(result.getValue(Bytes.toBytes("cf1"), Bytes.toBytes("time"))));
	            });
	        } catch (Exception e) {
	            e.printStackTrace();
	        }
	    } 
	    
	    
	    
	    // Delete a row
	    public static void deleteRow(String tableName, String rowkey) {
	        try {
	            Table table = connection.getTable(TableName.valueOf(tableName));
	            Delete delete = new Delete(Bytes.toBytes(rowkey));
	            table.delete(delete);
	            // Report success only if the delete did not throw
	            System.out.println("Deleted rowkey: " + rowkey + " successfully");
	        } catch (Exception e) {
	            e.printStackTrace();
	        }
	    }
	    
	    
	    public static void main(String[] args) {
	        // Insert a single cell
	        putRow("t1", "60000", "cf1", "name", "a");
	        // Read one row
	        getRow("t1", "33333");
	        // Scan the whole table
	        getScanner("t1");
	        // Delete a row
	        deleteRow("t1", "60000");
	        // Batch insert, flushing every 1000 rows
	        List<Put> puts = new ArrayList<Put>();
	        for (int i = 70000; i < 80000; i++) {
	            Put put = new Put(Bytes.toBytes("Batch" + i));
	            put.addColumn(Bytes.toBytes("cf1"), Bytes.toBytes("name"), Bytes.toBytes("banana" + i));
	            puts.add(put);
	            if (puts.size() == 1000) {
	                putRows("t1", puts);
	                puts.clear();
	            }
	        }
	        // Flush any rows left over after the last full batch
	        if (!puts.isEmpty()) {
	            putRows("t1", puts);
	        }
	    }
    
	    
}

When storing and reading data, values have to be converted to bytes, because HBase stores everything as byte arrays.
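What "stored as bytes" means can be tried out without a cluster. The sketch below uses only the Python standard library: the same number 24 is turned into bytes once as text and once as a 4-byte big-endian integer (the layout Java's Bytes.toBytes(int) produces), and a sort shows that byte strings order lexicographically rather than numerically. The values are illustrative only.

```python
import struct

text_value = "24".encode("utf-8")      # the number written as text: 2 bytes
int_value = struct.pack(">i", 24)      # the number as a 4-byte big-endian int

print(text_value, len(text_value))
print(int_value, len(int_value))

# Reading back requires knowing which encoding was used when writing.
print(text_value.decode("utf-8"))
print(struct.unpack(">i", int_value)[0])

# Byte strings sort byte by byte, so "10" and "100" come before "9".
print(sorted([b"9", b"10", b"100"]))
```

This is why the reader and the writer have to agree on the encoding, and why numeric row keys stored as text are often zero-padded to a fixed width.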

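IV. Accessing HBase from Python

HBase Thrift service: HBase natively provides only a Java API client. Non-Java languages such as Python, PHP and C++ generally access HBase through the Thrift proxy.

Start the HBase Thrift service: hbase-daemon.sh start thrift

To check whether it has started, look for its port: netstat -ntpl|grep 9090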

import happybase

conn = happybase.Connection(host='master', port=9090, timeout=None, autoconnect=True, table_prefix_separator=b'_', compat='0.98', transport='buffered', protocol='binary')
table = conn.table('t1')
n = 1  # counter used as the row key
# Each line of the file holds three comma-separated fields: name, age, salary
with open("/home/zkpk/experiment/toHBase.txt", "r") as file:
    for line in file:
        line = line.strip('\n')
        if not line:
            continue  # skip blank lines
        m = line.split(',')
        # Write the three columns of one row in a single put
        table.put(str(n), {'cf1:name': m[0], 'cf1:age': m[1], 'cf1:salary': m[2]})
        n = n + 1
conn.close()

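Meaning of the connection parameters:

host: host name of the Thrift server
port: port
timeout: socket timeout
autoconnect: whether the connection is opened immediately
table_prefix: prefix used to construct table names
table_prefix_separator: separator placed after table_prefix
compat: compatibility mode
transport: Thrift transport mode
protocol: Thrift protocol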
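The two prefix parameters are the least obvious ones in the list, so here is a small sketch of how they combine into the table name that is actually used. `full_table_name` is a hypothetical helper written for this post, not part of happybase, and the prefix 'exp' is a made-up example.

```python
def full_table_name(name, table_prefix=None, table_prefix_separator="_"):
    """Build the effective table name from prefix, separator and table name."""
    if table_prefix is None:
        return name                      # no prefix configured: name is used as-is
    return table_prefix + table_prefix_separator + name


print(full_table_name("t1"))
print(full_table_name("t1", table_prefix="exp"))
print(full_table_name("t1", table_prefix="exp", table_prefix_separator="."))
```

With no prefix, as in the connection code in this post, 't1' refers to the HBase table t1 directly.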

Format of the text data in /home/zkpk/experiment/toHBase.txt:

[Figure: sample contents of toHBase.txt]

For large amounts of data, use batch insertion.

import datetime

import happybase


def getHbaseConnection():
    conn = happybase.Connection(host='master', port=9090, timeout=None, autoconnect=True, table_prefix=None, table_prefix_separator=b'_', compat='0.98', transport='buffered', protocol='binary')
    return conn


if __name__ == "__main__":
    startts = datetime.datetime.now().timestamp()
    print("starting put data into Hbase......")
    conn = getHbaseConnection()
    table = conn.table('t1')
    n = 1  # counter used as the row key
    # Keep ONE batch open around the whole loop. Opening the batch inside the
    # loop would send it after every row, which defeats the point of batching.
    with open('/home/zkpk/experiment/toHBase.txt', 'r') as file, table.batch(batch_size=10) as bat:
        for line in file:
            line = line.strip('\n')
            if not line:
                continue  # skip blank lines
            m = line.split(',')
            bat.put(str(n), {'cf1:name': m[0], 'cf1:age': m[1], 'cf1:salary': m[2]})
            n = n + 1
    # Leaving the with block sends whatever is still buffered
    conn.close()
    endts = datetime.datetime.now().timestamp()
    cost = endts - startts
    print("Insert data cost time is " + str(cost) + "s")
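The idea behind batch insertion can be sketched without a running HBase. `ToyBatch` below is a made-up class, not the happybase implementation: it buffers puts, "sends" them whenever batch_size is reached, and sends the remainder when the with block exits. As a simplification it counts puts, whereas a real client library may count individual cell mutations.

```python
class ToyBatch:
    """Buffers puts and 'sends' them in groups of at most batch_size."""

    def __init__(self, batch_size):
        self.batch_size = batch_size
        self.pending = []   # puts waiting to be sent
        self.sent = []      # each entry represents one simulated round trip

    def put(self, row, data):
        self.pending.append((row, data))
        if len(self.pending) >= self.batch_size:
            self.send()

    def send(self):
        if self.pending:
            self.sent.append(list(self.pending))
            self.pending.clear()

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.send()         # flush the remainder on exit
        return False


with ToyBatch(batch_size=10) as bat:
    for n in range(1, 26):
        bat.put(str(n), {"cf1:name": "name%d" % n})

print([len(group) for group in bat.sent])
```

Twenty-five rows leave in three round trips instead of twenty-five. The with block has to wrap the whole loop for that to happen; entering and leaving it once per row would flush after every single put.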

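V. Basic Phoenix usage

Start: bin/sqlline.py 192.168.31.10:2181 (the node name can be used in place of the IP address)

Create a view: once created, it is mapped onto the table of the same name in HBase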

create view "t1"(
id VARCHAR PRIMARY KEY,
"cf1"."province" VARCHAR,
"cf1"."market" VARCHAR,
"cf1"."class" VARCHAR,
"cf1"."name" VARCHAR,
"cf1"."avgprice" VARCHAR,
"cf1"."time" VARCHAR);

Query: queries are written like ordinary SQL, but the table name must be in double quotes.

select * from "t1" limit 10;

Insert and update use the same statement: the keyword is upsert, not insert

upsert into "test"("ID","name","age") values('1','zhangsan','24');

Drop the view:

drop view "t1";

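When creating a Phoenix mapping for an HBase table, if the HBase table name is lowercase, the name must be written in lowercase inside double quotes in Phoenix, because Phoenix converts unquoted identifiers to uppercase. If the HBase table name is uppercase, the quotes are not needed. Column names in the Phoenix mapping should be double-quoted as well, whatever their case in HBase, otherwise errors can occur.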
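The quoting rule can be summarized in a few lines of Python. `phoenix_identifier` is a hypothetical helper that models only the case rule described above (unquoted names are upper-cased, double-quoted names are kept exactly as written); it is not a SQL parser and does not handle escaped quotes.

```python
def phoenix_identifier(sql_name):
    """Return the name that is looked up for an identifier as written in SQL."""
    if len(sql_name) >= 2 and sql_name.startswith('"') and sql_name.endswith('"'):
        return sql_name[1:-1]      # quoted: case is preserved
    return sql_name.upper()        # unquoted: folded to upper case


print(phoenix_identifier('t1'))
print(phoenix_identifier('"t1"'))
print(phoenix_identifier('"cf1"') + ":" + phoenix_identifier('"name"'))
```

Written without quotes, t1 is looked up as T1, which does not match an HBase table named t1. That is the situation the double quotes avoid.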

Create a global index

CREATE INDEX index_name ON TABLE_NAME(CF.COL1)

create index "test_index" on "test"("cf1"."age");

Drop an index

drop index "test_index" on "test";

Create a local index

create local index index_name ON TABLE_NAME(CF.COL1);

create local index "localindex" on "test"("cf1"."age");
