本文介绍一下MonetDB的漏斗功能. 此功能也是需要基于MonetDB的discovery和remote database.





m-funnel这个功能有点像plproxy, 多了一个queue/funnel的限制, 并且没有plproxy 的路由算法, 一个纯粹的代理.



plproxy接受客户端请求, 客户端的请求是并行提交给数据节点.



而MonetDB multiplex-funnel则是并行的接收客户端请求, 但是请求最终是一个一个提交给后端的所有数据节点的. 



返回结果的话则是所有数据节点并行执行的结果返回, 如有4个后端monetdbd, 那么count(*)返回4条记录.



另外需要注意的是m-funnel还限制了单条SQL返回的结果集的大小, 默认是80K.



注意使用m-funnel会打乱事务特性, 例如一个客户端的事务请求, 事务的中间可能被其他客户端的SQL插入.






MonetDB的multiplex-funnel举例 : 


使用monetdb创建m-funnel数据库

-m pattern
              With the -m flag, instead of creating a database, a multiplex-funnel is created.  See section MULTIPLEX-
              FUNNEL in monetdbd(1).  The pattern argument is not fully the same as a pattern for connecting  or  dis-
              covery.   Each  parallel target for the multiplex-funnel is given as username+password@pattern sequence,
              separated by commas.  Here the pattern is an ordinary pattern as would  be  used  for  connecting  to  a
              database, and can hence also be just the name of a database.


  username+password@pattern, 这里的pattern是数据库, 所以数据库需要在同一个广播域. 并且配置好tag(假设有重名的情况下)





使用举例 : 


配置hostname

150.sky-mobi.com
db-172-16-3-221


配置/etc/sysconfig/network


HOSTNAME=db-172-16-3-221


在所有monetdb节点配置DNS或/etc/hosts, 使解析正常.


本例配置/etc/hosts

172.16.3.221 db-172-16-3-221
172.16.3.150 150.sky-mobi.com


配置防火墙, 让monetdbd之间可以相互通讯


iptables -t filter -I INPUT -s 172.16.0.0/16 -j ACCEPT


创建monetdbd storage


monetdbd create /data01/mdb1


配置discovery


monetdbd set discovery=yes /data01/mdb1


可选, 配置forward = proxy (如果这是个边界m-funnel库的话, 默认就是proxy)


monetdbd set forward=proxy /data01/mdb1


启动monetdbd


monetdbd start /data01/mdb1


创建数据库, 解锁数据库


[root@150 postgresql-9.3.5]# monetdb create db1
created database in maintenance mode: db1
[root@150 postgresql-9.3.5]# monetdb create db2
created database in maintenance mode: db2
[root@150 postgresql-9.3.5]# monetdb release db1
taken database out of maintenance mode: db1
[root@150 postgresql-9.3.5]# monetdb release db2
taken database out of maintenance mode: db2

[root@db-172-16-3-221 ~]# monetdb create dbx1
created database in maintenance mode: dbx1
[root@db-172-16-3-221 ~]# monetdb create dbx2
created database in maintenance mode: dbx2
[root@db-172-16-3-221 ~]# monetdb release dbx1
taken database out of maintenance mode: dbx1
[root@db-172-16-3-221 ~]# monetdb release dbx2
taken database out of maintenance mode: dbx2


配置shared, tag(可选, 仅仅当数据库在广播域有重名是需要使用)


[root@150 postgresql-9.3.5]# monetdb set shared=150 db1
[root@150 postgresql-9.3.5]# monetdb set shared=150 db2
[root@db-172-16-3-221 ~]# monetdb set shared=221 dbx1
[root@db-172-16-3-221 ~]# monetdb set shared=221 dbx2



发现


[root@db-172-16-3-221 ~]# monetdb discover
                    location
mapi:monetdb://150.sky-mobi.com:50000/db1/150
mapi:monetdb://150.sky-mobi.com:50000/db2/150
mapi:monetdb://db-172-16-3-221:50000/dbx1/221
mapi:monetdb://db-172-16-3-221:50000/dbx2/221

[root@150 mdb1]# monetdb discover
                    location
mapi:monetdb://150.sky-mobi.com:50000/db1/150
mapi:monetdb://150.sky-mobi.com:50000/db2/150
mapi:monetdb://db-172-16-3-221:50000/dbx1/221
mapi:monetdb://db-172-16-3-221:50000/dbx2/221



创建第一个m-funnel, 这里假设dbx这个m-funnel是边界funnel.


首先测试一下库名是否正常.

[root@db-172-16-3-221 ~]# mclient -h 127.0.0.1 dbx1/221
Welcome to mclient, the MonetDB/SQL interactive terminal (Jan2014-SP3)
Database: MonetDB v11.17.21 (Jan2014-SP3), 'mapi:monetdb://db-172-16-3-221:50000/dbx1'
Type \q to quit, \? for a list of available commands
auto commit mode: on
sql>\q
[root@db-172-16-3-221 ~]# mclient -h 127.0.0.1 dbx2/221
Welcome to mclient, the MonetDB/SQL interactive terminal (Jan2014-SP3)
Database: MonetDB v11.17.21 (Jan2014-SP3), 'mapi:monetdb://db-172-16-3-221:50000/dbx2'
Type \q to quit, \? for a list of available commands
auto commit mode: on


使用这两个库名创建m-funnel, 注意用户密码我没改, 用默认的


[root@db-172-16-3-221 ~]# monetdb create -m monetdb+monetdb@dbx1/221,monetdb+monetdb@dbx2/221 dbx
created multiplex-funnel in maintenance mode: dbx
[root@db-172-16-3-221 ~]# monetdb release dbx
taken database out of maintenance mode: dbx


查看状态


[root@db-172-16-3-221 ~]# monetdb get all dbx
     name          prop     source           value
dbx              name      -        dbx
dbx              type      local    mfunnel
dbx              shared    default  yes
dbx              mfunnel   local    monetdb+monetdb@dbx1/221,monetdb+monetdb@dbx2/221


测试第一个m-funnel是否正常, 可以正常的像两个底层数据库发出SQL.


[root@db-172-16-3-221 ~]# mclient -h 127.0.0.1 dbx
Welcome to mclient, the MonetDB/SQL interactive terminal (Jan2014-SP3)
Database: MonetDB v11.17.21 (Jan2014-SP3), 'mapi:monetdb://db-172-16-3-221:50000/dbx2'
Type \q to quit, \? for a list of available commands
auto commit mode: on
sql>create table abc(id int);
operation successful (22.611ms)
sql>insert into abc values(1);
2 affected row (7.259ms)
sql>select * from abc;
+------+
| id   |
+======+
|    1 |
|    1 |
+------+
2 tuples (1.854ms)
sql>insert into abc values(2);
2 affected row (5.553ms)
sql>select * from abc;
+------+
| id   |
+======+
|    1 |
|    2 |
|    1 |
|    2 |
+------+
4 tuples (1.014ms)


已经正常了, 插入和查询都从两个节点正常返回并合并结果.


配置第一个m-funnel的tag(可选, 仅仅当广播域中这个m-funnel的名字有重名才需要)

[root@db-172-16-3-221 ~]# monetdb set shared=221 dbx
[root@db-172-16-3-221 ~]# monetdb get all dbx
     name          prop     source           value
dbx              name      -        dbx
dbx              type      local    mfunnel
dbx              shared    local    221
dbx              mfunnel   local    monetdb+monetdb@dbx1/221,monetdb+monetdb@dbx2/221



创建第二个m-funnel : 


先执行discover得到URI, 从而得知database name.(含shared设置的tag)

[root@150 postgresql-9.3.5]# monetdb discover
                    location
mapi:monetdb://150.sky-mobi.com:50000/db1/150
mapi:monetdb://150.sky-mobi.com:50000/db2/150
mapi:monetdb://db-172-16-3-221:50000/dbx/221
mapi:monetdb://db-172-16-3-221:50000/dbx1/221
mapi:monetdb://db-172-16-3-221:50000/dbx2/221


注意用户密码我没改, 用默认的


这个m-funnel包含3个库, 其中2个是normal 库, 另一个是前面创建的funnel, 所以实际上就是4个normal库, 但是funnel里看到的是3个库.

[root@150 postgresql-9.3.5]# monetdb create -m monetdb+monetdb@db1/150,monetdb+monetdb@db2/150,monetdb+monetdb@dbx/221 db0
created multiplex-funnel in maintenance mode: db0
[root@150 postgresql-9.3.5]# monetdb release db0
taken database out of maintenance mode: db0


查看m-funnel的状态


[root@150 postgresql-9.3.5]# monetdb get all db0
     name          prop     source           value
db0              name      -        db0
db0              type      local    mfunnel
db0              shared    default  yes
db0              mfunnel   local    monetdb+monetdb@db1/150,monetdb+monetdb@db2/150,monetdb+monetdb@dbx/221


测试 :


[root@150 postgresql-9.3.5]# mclient db0
Welcome to mclient, the MonetDB/SQL interactive terminal (Jan2014-SP3)
Database: MonetDB v11.17.21 (Jan2014-SP3), 'mapi:monetdb://db-172-16-3-221:50000/dbx2'
Type \q to quit, \? for a list of available commands
auto commit mode: on
sql>create table abc(id int);  -- 前面测试的时候创建了abc表, 所以这里创建会报错.
node dbx/221 failed: node dbx1/221 failed: 42S01!CREATE TABLE: name 'abc' already in use


-- 特别注意, 虽然报错了, 但是实际上只是2个节点报错, 其他2个节点创建成功了.


-- 这个测试用了auto commit, 所以导致部分提交成功, 部分提交失败了.


-- 要全局一致的话, 需要用start transaction


-- 但是用start transaction的话, 会扰乱, 因为并行的事情. 后面有测试

sql>select * from abc;
+------+
| id   |
+======+
|    1 |
|    2 |
|    1 |
|    2 |
+------+
4 tuples (1.607ms)
sql>insert into abc values (3);  -- 从这个插入我们就能知道, 上面建表的语句在2个节点执行成果了.
4 affected row (5.192ms)
sql>select * from abc;
+------+
| id   |
+======+
|    3 |
|    3 |
|    1 |
|    2 |
|    3 |
|    1 |
|    2 |
|    3 |
+------+
8 tuples (1.459ms)



假设我们用了事务会怎么样呢?


可能出现部分提交失败, 部分提交成功或者全部提交成功的情况, 回滚的话则全部节点回滚.


所以鉴于此, 如果报错了, 就立刻回滚, 那么是全部回滚的.

sql>create table abc(id int);
node dbx/221 failed: node dbx1/221 failed: 42S01!CREATE TABLE: name 'abc' already in use
sql>select * from abc;
node dbx/221 failed: node dbx1/221 failed: 25005!current transaction is aborted (please ROLLBACK)
sql>commit;  -- 此时commit, 部分节点提交成功, 部分节点提交失败.
node dbx/221 failed: node dbx1/221 failed: 2D000!COMMIT: failed
sql>rollback;  -- 事务已经结束了, 所以不能再回滚.
node db1/150 failed: 2DM30!ROLLBACK: not allowed in auto commit mode
sql>select * from abc;
+------+
| id   |
+======+
|    1 |
|    2 |
|    1 |
|    2 |
+------+
4 tuples (1.676ms)
sql>insert into abc values (3);  -- 因为部分提交成功, 所以DB1和DB2创建abc表完成.
4 affected row (5.538ms)
sql>select * from abc;
+------+
| id   |
+======+
|    3 |
|    3 |
|    1 |
|    2 |
|    3 |
|    1 |
|    2 |
|    3 |
+------+
8 tuples (1.395ms)



全部回滚的测试 : 


sql>create table abc(id int);
node dbx/221 failed: node dbx1/221 failed: 42S01!CREATE TABLE: name 'abc' already in use
sql>rollback;
auto commit mode: on
sql>\dt
TABLE  sys.abc
TABLE  sys.abc


部分提交成功和失败的测试 : 


sql>create table abc(id int);
node dbx/221 failed: node dbx1/221 failed: 42S01!CREATE TABLE: name 'abc' already in use
sql>\dt
TABLE  sys.abc
TABLE  sys.abc
TABLE  sys.abc
TABLE  sys.abc


全部提交成功的测试 : 


sql>insert into abc values (10);
4 affected row (5.611ms)
sql>select * from abc where id=10;
+------+
| id   |
+======+
|   10 |
|   10 |
|   10 |
|   10 |
+------+
4 tuples (1.906ms)


如果需要全局事务一致的话, 必须使用start trasaction;



使用start transaction带来的困扰, 原因就是漏斗效应, 因为并行转换成了串行来执行, 所以中间会被其他SQL插入, 测试如下.

会话A
sql>start transaction;
auto commit mode: off

会话B
sql>start transaction;  -- 因为在m-funnel 中已经启动了事务, 所以再次启动失败
node db1/150 failed: 25001!START TRANSACTION: cannot start a transaction within a transaction
sql>select * from abc;
node db1/150 failed: 25005!current transaction is aborted (please ROLLBACK)

会话A
sql>select * from abc;  -- 回退
node db1/150 failed: 25005!current transaction is aborted (please ROLLBACK)

-- 任何一个会话执行了rollback都可以, 就结束这个事务了.



[参考]

1. man monetdbd

MULTIPLEX-FUNNELS
       Monetdbd  implements multiplex-funnel capabilities.  As the name suggests two techniques are combined, the mul-
       tiplexer and the funnel.

       The funnel capability limits the access to the database to one client at a time.  That is, if multiple  clients
       connect  to  the  funnel, their queries will be serialized such that they are executed one after the other.  An
       effect of this approach is that clients no longer have an exclusive channel to the database, which  means  that
       individual queries from one client may have been interleaved with queries from others.  This most notably makes
       SQL transaction blocks unreliable with a funnel.  The funnel, hence, is meant to scale down a large  amount  of
       clients that perform short-running (read-only) queries, as typically seen in web-based query loads.

       When a funnel is defined to use multiple databases, the funnel adds a multiplexer to its query channel.  A mul-
       tiplex-funnel sends each query to all of the defined databases.  This behavior can be quite confusing at first,
       but  proves  to be useful in typical sharding configurations, where in particular simple selection queries have
       to be performed on each of the shards.  The multiplexer combines the answers from all defined databases in  one
       single  answer  that it sends back to the client.  However, this combining is without any smart logic, that is,
       the multiplexer does not evaluate the query it is running, but just combines all answers it receives  from  the
       databases.   This  results  in  e.g.  as many return tuples for a SELECT COUNT(*) query, as there are databases
       defined.

       Due to the two above mentioned characteristics, a multiplex-funnel has some limitations.  As mentioned  before,
       transactions over multiple queries are likely not to result in the desired behavior.  This is due to each query
       to the funnel is required to be self-contained.  Further, since for  each  query,  the  results  from  multiple
       servers  have  to  be  combined  into one, that query must only return a single response, i.e.  multi-statement
       queries are most likely causing the funnel to respond with an error, or return garbled results.  Last, the size
       of  each query is limited to currently about 80K.  While this size should be sufficient for most queries, it is
       likely not enough for e.g. COPY INTO statements.  Apart from the data transfer  implications,  such  statements
       should not be used with the funnel, as the results will be undefined due to the limited query buffer.  Applica-
       tions using the funnel should aim for short and single-statement queries that require no transactions.

       See the create command in the monetdb(1) man-page for details on how to setup a multiplex-funnel.