本文介绍一下MonetDB的漏斗功能. 此功能也是需要基于MonetDB的discovery和remote database.
m-funnel这个功能有点像plproxy, 多了一个queue/funnel的限制, 并且没有plproxy 的路由算法, 一个纯粹的代理.
plproxy接受客户端请求, 客户端的请求是并行提交给数据节点.
而MonetDB multiplex-funnel则是并行的接收客户端请求, 但是请求最终是一个一个提交给后端的所有数据节点的.
返回结果的话则是所有数据节点并行执行的结果返回, 如有4个后端monetdbd, 那么count(*)返回4条记录.
另外需要注意的是m-funnel还限制了单条SQL返回的结果集的大小, 默认是80K.
注意使用m-funnel会打乱事务特性, 例如一个客户端的事务请求, 事务的中间可能被其他客户端的SQL插入.
MonetDB的multiplex-funnel举例 :
使用monetdb创建m-funnel数据库
-m pattern
With the -m flag, instead of creating a database, a multiplex-funnel is created. See section MULTIPLEX-
FUNNEL in monetdbd(1). The pattern argument is not fully the same as a pattern for connecting or dis-
covery. Each parallel target for the multiplex-funnel is given as username+password@pattern sequence,
separated by commas. Here the pattern is an ordinary pattern as would be used for connecting to a
database, and can hence also be just the name of a database.
username+password@pattern, 这里的pattern是数据库, 所以数据库需要在同一个广播域. 并且配置好tag(假设有重名的情况下)
使用举例 :
配置hostname
150.sky-mobi.com
db-172-16-3-221
配置/etc/sysconfig/network
HOSTNAME=db-172-16-3-221
在所有monetdb节点配置DNS或/etc/hosts, 使解析正常.
本例配置/etc/hosts
172.16.3.221 db-172-16-3-221
172.16.3.150 150.sky-mobi.com
配置防火墙, 让monetdbd之间可以相互通讯
iptables -t filter -I INPUT -s 172.16.0.0/16 -j ACCEPT
创建monetdbd storage
monetdbd create /data01/mdb1
配置discovery
monetdbd set discovery=yes /data01/mdb1
可选, 配置forward = proxy (如果这是个边界m-funnel库的话, 默认就是proxy)
monetdbd set forward=proxy /data01/mdb1
启动monetdbd
monetdbd start /data01/mdb1
创建数据库, 解锁数据库
[root@150 postgresql-9.3.5]# monetdb create db1
created database in maintenance mode: db1
[root@150 postgresql-9.3.5]# monetdb create db2
created database in maintenance mode: db2
[root@150 postgresql-9.3.5]# monetdb release db1
taken database out of maintenance mode: db1
[root@150 postgresql-9.3.5]# monetdb release db2
taken database out of maintenance mode: db2
[root@db-172-16-3-221 ~]# monetdb create dbx1
created database in maintenance mode: dbx1
[root@db-172-16-3-221 ~]# monetdb create dbx2
created database in maintenance mode: dbx2
[root@db-172-16-3-221 ~]# monetdb release dbx1
taken database out of maintenance mode: dbx1
[root@db-172-16-3-221 ~]# monetdb release dbx2
taken database out of maintenance mode: dbx2
配置shared, tag(可选, 仅仅当数据库在广播域有重名是需要使用)
[root@150 postgresql-9.3.5]# monetdb set shared=150 db1
[root@150 postgresql-9.3.5]# monetdb set shared=150 db2
[root@db-172-16-3-221 ~]# monetdb set shared=221 dbx1
[root@db-172-16-3-221 ~]# monetdb set shared=221 dbx2
发现
[root@db-172-16-3-221 ~]# monetdb discover
location
mapi:monetdb://150.sky-mobi.com:50000/db1/150
mapi:monetdb://150.sky-mobi.com:50000/db2/150
mapi:monetdb://db-172-16-3-221:50000/dbx1/221
mapi:monetdb://db-172-16-3-221:50000/dbx2/221
[root@150 mdb1]# monetdb discover
location
mapi:monetdb://150.sky-mobi.com:50000/db1/150
mapi:monetdb://150.sky-mobi.com:50000/db2/150
mapi:monetdb://db-172-16-3-221:50000/dbx1/221
mapi:monetdb://db-172-16-3-221:50000/dbx2/221
创建第一个m-funnel, 这里假设dbx这个m-funnel是边界funnel.
首先测试一下库名是否正常.
[root@db-172-16-3-221 ~]# mclient -h 127.0.0.1 dbx1/221
Welcome to mclient, the MonetDB/SQL interactive terminal (Jan2014-SP3)
Database: MonetDB v11.17.21 (Jan2014-SP3), 'mapi:monetdb://db-172-16-3-221:50000/dbx1'
Type \q to quit, \? for a list of available commands
auto commit mode: on
sql>\q
[root@db-172-16-3-221 ~]# mclient -h 127.0.0.1 dbx2/221
Welcome to mclient, the MonetDB/SQL interactive terminal (Jan2014-SP3)
Database: MonetDB v11.17.21 (Jan2014-SP3), 'mapi:monetdb://db-172-16-3-221:50000/dbx2'
Type \q to quit, \? for a list of available commands
auto commit mode: on
使用这两个库名创建m-funnel, 注意用户密码我没改, 用默认的
[root@db-172-16-3-221 ~]# monetdb create -m monetdb+monetdb@dbx1/221,monetdb+monetdb@dbx2/221 dbx
created multiplex-funnel in maintenance mode: dbx
[root@db-172-16-3-221 ~]# monetdb release dbx
taken database out of maintenance mode: dbx
查看状态
[root@db-172-16-3-221 ~]# monetdb get all dbx
name prop source value
dbx name - dbx
dbx type local mfunnel
dbx shared default yes
dbx mfunnel local monetdb+monetdb@dbx1/221,monetdb+monetdb@dbx2/221
测试第一个m-funnel是否正常, 可以正常的像两个底层数据库发出SQL.
[root@db-172-16-3-221 ~]# mclient -h 127.0.0.1 dbx
Welcome to mclient, the MonetDB/SQL interactive terminal (Jan2014-SP3)
Database: MonetDB v11.17.21 (Jan2014-SP3), 'mapi:monetdb://db-172-16-3-221:50000/dbx2'
Type \q to quit, \? for a list of available commands
auto commit mode: on
sql>create table abc(id int);
operation successful (22.611ms)
sql>insert into abc values(1);
2 affected row (7.259ms)
sql>select * from abc;
+------+
| id |
+======+
| 1 |
| 1 |
+------+
2 tuples (1.854ms)
sql>insert into abc values(2);
2 affected row (5.553ms)
sql>select * from abc;
+------+
| id |
+======+
| 1 |
| 2 |
| 1 |
| 2 |
+------+
4 tuples (1.014ms)
已经正常了, 插入和查询都从两个节点正常返回并合并结果.
配置第一个m-funnel的tag(可选, 仅仅当广播域中这个m-funnel的名字有重名才需要)
[root@db-172-16-3-221 ~]# monetdb set shared=221 dbx
[root@db-172-16-3-221 ~]# monetdb get all dbx
name prop source value
dbx name - dbx
dbx type local mfunnel
dbx shared local 221
dbx mfunnel local monetdb+monetdb@dbx1/221,monetdb+monetdb@dbx2/221
创建第二个m-funnel :
先执行discover得到URI, 从而得知database name.(含shared设置的tag)
[root@150 postgresql-9.3.5]# monetdb discover
location
mapi:monetdb://150.sky-mobi.com:50000/db1/150
mapi:monetdb://150.sky-mobi.com:50000/db2/150
mapi:monetdb://db-172-16-3-221:50000/dbx/221
mapi:monetdb://db-172-16-3-221:50000/dbx1/221
mapi:monetdb://db-172-16-3-221:50000/dbx2/221
注意用户密码我没改, 用默认的
这个m-funnel包含3个库, 其中2个是normal 库, 另一个是前面创建的funnel, 所以实际上就是4个normal库, 但是funnel里看到的是3个库.
[root@150 postgresql-9.3.5]# monetdb create -m monetdb+monetdb@db1/150,monetdb+monetdb@db2/150,monetdb+monetdb@dbx/221 db0
created multiplex-funnel in maintenance mode: db0
[root@150 postgresql-9.3.5]# monetdb release db0
taken database out of maintenance mode: db0
查看m-funnel的状态
[root@150 postgresql-9.3.5]# monetdb get all db0
name prop source value
db0 name - db0
db0 type local mfunnel
db0 shared default yes
db0 mfunnel local monetdb+monetdb@db1/150,monetdb+monetdb@db2/150,monetdb+monetdb@dbx/221
测试 :
[root@150 postgresql-9.3.5]# mclient db0
Welcome to mclient, the MonetDB/SQL interactive terminal (Jan2014-SP3)
Database: MonetDB v11.17.21 (Jan2014-SP3), 'mapi:monetdb://db-172-16-3-221:50000/dbx2'
Type \q to quit, \? for a list of available commands
auto commit mode: on
sql>create table abc(id int); -- 前面测试的时候创建了abc表, 所以这里创建会报错.
node dbx/221 failed: node dbx1/221 failed: 42S01!CREATE TABLE: name 'abc' already in use
-- 特别注意, 虽然报错了, 但是实际上只是2个节点报错, 其他2个节点创建成功了.
-- 这个测试用了auto commit, 所以导致部分提交成功, 部分提交失败了.
-- 要全局一致的话, 需要用start transaction
-- 但是用start transaction的话, 会扰乱, 因为并行的事情. 后面有测试
sql>select * from abc;
+------+
| id |
+======+
| 1 |
| 2 |
| 1 |
| 2 |
+------+
4 tuples (1.607ms)
sql>insert into abc values (3); -- 从这个插入我们就能知道, 上面建表的语句在2个节点执行成果了.
4 affected row (5.192ms)
sql>select * from abc;
+------+
| id |
+======+
| 3 |
| 3 |
| 1 |
| 2 |
| 3 |
| 1 |
| 2 |
| 3 |
+------+
8 tuples (1.459ms)
假设我们用了事务会怎么样呢?
可能出现部分提交失败, 部分提交成功或者全部提交成功的情况, 回滚的话则全部节点回滚.
所以鉴于此, 如果报错了, 就立刻回滚, 那么是全部回滚的.
sql>create table abc(id int);
node dbx/221 failed: node dbx1/221 failed: 42S01!CREATE TABLE: name 'abc' already in use
sql>select * from abc;
node dbx/221 failed: node dbx1/221 failed: 25005!current transaction is aborted (please ROLLBACK)
sql>commit; -- 此时commit, 部分节点提交成功, 部分节点提交失败.
node dbx/221 failed: node dbx1/221 failed: 2D000!COMMIT: failed
sql>rollback; -- 事务已经结束了, 所以不能再回滚.
node db1/150 failed: 2DM30!ROLLBACK: not allowed in auto commit mode
sql>select * from abc;
+------+
| id |
+======+
| 1 |
| 2 |
| 1 |
| 2 |
+------+
4 tuples (1.676ms)
sql>insert into abc values (3); -- 因为部分提交成功, 所以DB1和DB2创建abc表完成.
4 affected row (5.538ms)
sql>select * from abc;
+------+
| id |
+======+
| 3 |
| 3 |
| 1 |
| 2 |
| 3 |
| 1 |
| 2 |
| 3 |
+------+
8 tuples (1.395ms)
全部回滚的测试 :
sql>create table abc(id int);
node dbx/221 failed: node dbx1/221 failed: 42S01!CREATE TABLE: name 'abc' already in use
sql>rollback;
auto commit mode: on
sql>\dt
TABLE sys.abc
TABLE sys.abc
部分提交成功和失败的测试 :
sql>create table abc(id int);
node dbx/221 failed: node dbx1/221 failed: 42S01!CREATE TABLE: name 'abc' already in use
sql>\dt
TABLE sys.abc
TABLE sys.abc
TABLE sys.abc
TABLE sys.abc
全部提交成功的测试 :
sql>insert into abc values (10);
4 affected row (5.611ms)
sql>select * from abc where id=10;
+------+
| id |
+======+
| 10 |
| 10 |
| 10 |
| 10 |
+------+
4 tuples (1.906ms)
如果需要全局事务一致的话, 必须使用start trasaction;
使用start transaction带来的困扰, 原因就是漏斗效应, 因为并行转换成了串行来执行, 所以中间会被其他SQL插入, 测试如下.
会话A
sql>start transaction;
auto commit mode: off
会话B
sql>start transaction; -- 因为在m-funnel 中已经启动了事务, 所以再次启动失败
node db1/150 failed: 25001!START TRANSACTION: cannot start a transaction within a transaction
sql>select * from abc;
node db1/150 failed: 25005!current transaction is aborted (please ROLLBACK)
会话A
sql>select * from abc; -- 回退
node db1/150 failed: 25005!current transaction is aborted (please ROLLBACK)
-- 任何一个会话执行了rollback都可以, 就结束这个事务了.
[参考]
1. man monetdbd
MULTIPLEX-FUNNELS
Monetdbd implements multiplex-funnel capabilities. As the name suggests two techniques are combined, the mul-
tiplexer and the funnel.
The funnel capability limits the access to the database to one client at a time. That is, if multiple clients
connect to the funnel, their queries will be serialized such that they are executed one after the other. An
effect of this approach is that clients no longer have an exclusive channel to the database, which means that
individual queries from one client may have been interleaved with queries from others. This most notably makes
SQL transaction blocks unreliable with a funnel. The funnel, hence, is meant to scale down a large amount of
clients that perform short-running (read-only) queries, as typically seen in web-based query loads.
When a funnel is defined to use multiple databases, the funnel adds a multiplexer to its query channel. A mul-
tiplex-funnel sends each query to all of the defined databases. This behavior can be quite confusing at first,
but proves to be useful in typical sharding configurations, where in particular simple selection queries have
to be performed on each of the shards. The multiplexer combines the answers from all defined databases in one
single answer that it sends back to the client. However, this combining is without any smart logic, that is,
the multiplexer does not evaluate the query it is running, but just combines all answers it receives from the
databases. This results in e.g. as many return tuples for a SELECT COUNT(*) query, as there are databases
defined.
Due to the two above mentioned characteristics, a multiplex-funnel has some limitations. As mentioned before,
transactions over multiple queries are likely not to result in the desired behavior. This is due to each query
to the funnel is required to be self-contained. Further, since for each query, the results from multiple
servers have to be combined into one, that query must only return a single response, i.e. multi-statement
queries are most likely causing the funnel to respond with an error, or return garbled results. Last, the size
of each query is limited to currently about 80K. While this size should be sufficient for most queries, it is
likely not enough for e.g. COPY INTO statements. Apart from the data transfer implications, such statements
should not be used with the funnel, as the results will be undefined due to the limited query buffer. Applica-
tions using the funnel should aim for short and single-statement queries that require no transactions.
See the create command in the monetdb(1) man-page for details on how to setup a multiplex-funnel.