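1. Overview
Open vSwitch is a virtual switch that supports the OpenFlow protocol. Some hardware switches support OpenFlow as well, so both kinds can be managed by one controller, which is how the networks of physical machines and virtual machines are connected.
Open vSwitch maintains a series of flow tables and uses them to control where a packet goes and how its headers are rewritten.
Under the OpenFlow protocol, each flow table entry contains match fields; when a packet matches them, the entry's actions are executed.
The match fields cover every layer of the TCP/IP stack: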
- Layer 1 – Tunnel ID, In Port, QoS priority, skb mark
- Layer 2 – MAC address, VLAN ID, Ethernet type
- Layer 3 – IPv4/IPv6 fields, ARP
- Layer 4 – TCP/UDP, ICMP, ND
Actions mainly consist of the following operations:
- Output to port (port range, flood, mirror)
- Discard, Resubmit to table x
- Packet Mangling (Push/Pop VLAN header, TOS, ...)
- Send to controller, Learn
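The match-then-act behaviour described above can be sketched in a few lines of Python. This is only a toy model, not how Open vSwitch is implemented: the class names, the field names used as dictionary keys, and the choice to drop a packet on a table miss are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class FlowEntry:
    priority: int
    match: dict    # field name -> required value; fields not listed are wildcards
    actions: list  # e.g. ["output:2"]; plain strings in this toy model


@dataclass
class FlowTable:
    entries: list = field(default_factory=list)

    def add(self, priority, match, actions):
        self.entries.append(FlowEntry(priority, match, actions))

    def lookup(self, packet):
        """Return the actions of the highest-priority entry that matches."""
        for entry in sorted(self.entries, key=lambda e: -e.priority):
            if all(packet.get(k) == v for k, v in entry.match.items()):
                return entry.actions
        return ["drop"]  # assumption: a table miss drops the packet


table = FlowTable()
table.add(100, {"in_port": 1, "dl_vlan": 10}, ["pop_vlan", "output:2"])
table.add(10, {"in_port": 1}, ["flood"])

tagged = {"in_port": 1, "dl_vlan": 10}
untagged = {"in_port": 1}
other_port = {"in_port": 3}

print(table.lookup(tagged))
print(table.lookup(untagged))
print(table.lookup(other_port))
```

The more specific entry wins only because it has the higher priority; with equal priorities the result would depend on insertion order in this sketch.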
Tunnels can be configured.
The following mechanisms are supported for monitoring traffic:
- sFlow
- NetFlow
- Port Mirroring
- SPAN
- RSPAN
- ERSPAN
QoS is supported:
- Uses existing Traffic Control Layer
- Policer (Ingress rate limiter)
- HTB, HFSC (Egress traffic classes)
- Controller (Open Flow) can select Traffic Class
2. Open vSwitch architecture
3. Database structure and ovs-vsctl
# ps aux | grep openvswitch
root 1117 0.0 0.0 21200 1580 ? S< Jun09 0:35 ovsdb-server /etc/openvswitch/conf.db -vconsole:emer -vsyslog:err -vfile:info --remote=punix:/var/run/openvswitch/db.sock --private-key=db:Open_vSwitch,SSL,private_key --certificate=db:Open_vSwitch,SSL,certificate --bootstrap-ca-cert=db:Open_vSwitch,SSL,ca_cert --no-chdir --log-file=/var/log/openvswitch/ovsdb-server.log --pidfile=/var/run/openvswitch/ovsdb-server.pid --detach --monitor
root 1153 0.6 1.1 169508 24016 ? S<Ll Jun09 16:24 ovs-vswitchd unix:/var/run/openvswitch/db.sock -vconsole:emer -vsyslog:err -vfile:info --mlockall --no-chdir --log-file=/var/log/openvswitch/ovs-vswitchd.log --pidfile=/var/run/openvswitch/ovs-vswitchd.pid --detach --monitor
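We find two processes:
- ovsdb-server, which maintains the database /etc/openvswitch/conf.db
- ovs-vswitchd, the core daemon
- The two communicate over the Unix domain socket /var/run/openvswitch/db.sock
ovs-vsctl modifies the database by talking to ovsdb-server.
ovs-vswitchd in turn talks to ovsdb-server and applies the corresponding changes to the virtual devices.
Most ovs-vsctl commands are therefore database operations, so we need to understand the database well.
If we cat /etc/openvswitch/conf.db, we find that it is in JSON format.
The database contents can be printed with ovsdb-client dump: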
# ovsdb-client dump
Bridge table
_uuid controller datapath_id datapath_type external_ids fail_mode flood_vlans flow_tables ipfix mirrors name netflow other_config ports protocols sflow status stp_enable
------------------------------------ ---------- ------------------ ------------- ------------ --------- ----------- ----------- ----- ------- ------ ------- ------------ ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- --------- ----- ------ ----------
929ab1c2-1146-411d-8557-af7498a26444 [] "0000080027dfbff7" "" {} [] [] {} [] [] br-ex [] {} [52c02f9d-db2f-4c50-84ce-7a377530ad3b, 79f74c54-056d-4c60-82d7-46412ceee17e] [] [] {} false
12ebfe38-6dab-402a-8fb5-aa814a5a3f52 [] "00003afeeb122a40" "" {} [] [] {} [] [] br-int [] {} [02049620-d3b4-4ecf-9d9e-f8b40a039f64, 035eed84-ce54-44c4-97d5-92f9cc9d662e, 18eacfea-0982-4c89-b2aa-3e2c7dc2b6d3, d43963b0-be0b-4265-b0bb-4ebc928d3ee0, d564225b-60e4-4786-a5fc-6b1c8febd0fb] [] [] {} false
ccfc8a92-2d29-4a55-ae4b-21f59eeeaed7 [] "0000928afccc554a" "" {} [] [] {} [] [] br-tun [] {} [6d14a97d-28e8-42f8-b5d6-ec655ccc7b91, 8b0946ae-3177-4d5d-bc8e-6d04536701ad] [] [] {} false
Controller table
_uuid connection_mode controller_burst_limit controller_rate_limit enable_async_messages external_ids inactivity_probe is_connected local_gateway local_ip local_netmask max_backoff other_config role status target
----- --------------- ---------------------- --------------------- --------------------- ------------ ---------------- ------------ ------------- -------- ------------- ----------- ------------ ---- ------ ------
Flow_Sample_Collector_Set table
_uuid bridge external_ids id ipfix
----- ------ ------------ -- -----
Flow_Table table
_uuid flow_limit groups name overflow_policy
----- ---------- ------ ---- ---------------
IPFIX table
_uuid cache_active_timeout cache_max_flows external_ids obs_domain_id obs_point_id sampling targets
----- -------------------- --------------- ------------ ------------- ------------ -------- -------
Interface table
_uuid admin_state bfd bfd_status cfm_fault cfm_fault_status cfm_health cfm_mpid cfm_remote_mpids cfm_remote_opstate duplex external_ids ifindex ingress_policing_burst ingress_policing_rate lacp_current link_resets link_speed link_state mac mac_in_use mtu name ofport ofport_request options other_config statistics status type
------------------------------------ ----------- --- ---------- --------- ---------------- ---------- -------- ---------------- ------------------ ------ -------------------------------------------------------------------------------------------------------------------------------------------------------- ------- ---------------------- --------------------- ------------ ----------- ----------- ---------- --- ------------------- ---- ---------------- ------ -------------- ---------------- ------------ --------------------------------------------------------------------------------------------------------------------------------------------------------------------------- --------------------------------------------------------------------------- --------
54ab6189-2611-40cc-884f-5ce20913bc32 down {} {} [] [] [] [] [] [] full {} 3 0 0 [] 0 1000000000 down [] "08:00:27:df:bf:f7" 1500 "eth1" 3 [] {} {} {collisions=0, rx_bytes=0, rx_crc_err=0, rx_dropped=0, rx_errors=0, rx_frame_err=0, rx_over_err=0, rx_packets=0, tx_bytes=558, tx_dropped=0, tx_errors=0, tx_packets=7} {driver_name="e1000", driver_version="7.3.21-k8-NAPI", firmware_version=""} ""
6da8e9cd-98df-451e-b7d9-fcddefe0325e up {} {} [] [] [] [] [] [] [] {} 0 0 0 [] 0 [] up [] "c2:9e:64:de:bd:db" [] patch-int 1 [] {peer=patch-tun} {} {collisions=0, rx_bytes=1340, rx_crc_err=0, rx_dropped=0, rx_errors=0, rx_frame_err=0, rx_over_err=0, rx_packets=18, tx_bytes=90, tx_dropped=0, tx_errors=0, tx_packets=1} {} patch
eda8218d-1d5e-4c52-ad89-24f8d0445ef9 up {} {} [] [] [] [] [] [] [] {} 0 0 0 [] 0 [] up [] "fa:91:44:32:65:f9" [] patch-tun 1 [] {peer=patch-int} {} {collisions=0, rx_bytes=90, rx_crc_err=0, rx_dropped=0, rx_errors=0, rx_frame_err=0, rx_over_err=0, rx_packets=1, tx_bytes=1340, tx_dropped=0, tx_errors=0, tx_packets=18} {} patch
ee54427c-133c-4b2a-a641-c2624732eb66 up {} {} [] [] [] [] [] [] [] {} 5 0 0 [] 0 [] up [] "08:00:27:df:bf:f7" 1500 br-ex 65534 [] {} {} {collisions=0, rx_bytes=648, rx_crc_err=0, rx_dropped=0, rx_errors=0, rx_frame_err=0, rx_over_err=0, rx_packets=8, tx_bytes=0, tx_dropped=0, tx_errors=0, tx_packets=0} {driver_name=openvswitch} internal
a45575fd-5179-459e-8951-3553c51f4aaa up {} {} [] [] [] [] [] [] [] {} 6 0 0 [] 2 [] up [] "3a:fe:eb:12:2a:40" 1500 br-int 65534 [] {} {} {collisions=0, rx_bytes=648, rx_crc_err=0, rx_dropped=0, rx_errors=0, rx_frame_err=0, rx_over_err=0, rx_packets=8, tx_bytes=2078, tx_dropped=0, tx_errors=0, tx_packets=27} {driver_name=openvswitch} internal
801c0dec-44db-4161-b791-08828d542ecf up {} {} [] [] [] [] [] [] [] {} 8 0 0 [] 2 [] up [] "92:8a:fc:cc:55:4a" 1500 br-tun 65534 [] {} {} {collisions=0, rx_bytes=648, rx_crc_err=0, rx_dropped=0, rx_errors=0, rx_frame_err=0, rx_over_err=0, rx_packets=8, tx_bytes=0, tx_dropped=0, tx_errors=0, tx_packets=0} {driver_name=openvswitch} internal
e8193562-c3bb-4b48-a9bf-00bba5f2a213 up {} {} [] [] [] [] [] [] full {attached-mac="fa:16:3e:37:28:e6", iface-id="7d18228b-476f-4b50-804a-18e60e6b0e6f", iface-status=active, vm-uuid="413d9fb4-34e5-4032-95ff-dea80f1f4adc"} 17 0 0 [] 0 10000000000 up [] "66:b8:10:c8:dc:df" 1500 "qvo7d18228b-47" 4 [] {} {} {collisions=0, rx_bytes=530, rx_crc_err=0, rx_dropped=0, rx_errors=0, rx_frame_err=0, rx_over_err=0, rx_packets=7, tx_bytes=0, tx_dropped=0, tx_errors=0, tx_packets=0} {driver_name=veth, driver_version="1.0", firmware_version=""} ""
b33cdfe2-dab1-449e-b2be-ce85e2d0ac79 up {} {} [] [] [] [] [] [] full {attached-mac="fa:16:3e:58:be:c6", iface-id="eea9263a-5e4b-4e5b-9923-3d59ca752082", iface-status=active, vm-uuid="98b592b1-7778-46f1-93df-3c5079650b71"} 14 0 0 [] 0 10000000000 up [] "7a:d4:a4:8d:60:b7" 1500 "qvoeea9263a-5e" 3 [] {} {} {collisions=0, rx_bytes=530, rx_crc_err=0, rx_dropped=0, rx_errors=0, rx_frame_err=0, rx_over_err=0, rx_packets=7, tx_bytes=390, tx_dropped=0, tx_errors=0, tx_packets=5} {driver_name=veth, driver_version="1.0", firmware_version=""} ""
c0a5afe4-b7a8-4230-b3bb-af11a33de14e up {} {} [] [] [] [] [] [] full {attached-mac="fa:16:3e:6f:5e:bf", iface-id="c64bd111-eff3-4c65-b096-53c3b3188a43", iface-status=active, vm-uuid="e2009bf8-de2b-4bff-a96a-627a61caf9e7"} 11 0 0 [] 0 10000000000 up [] "e2:d8:61:c4:1d:01" 1500 "qvoc64bd111-ef" 2 [] {} {} {collisions=0, rx_bytes=280, rx_crc_err=0, rx_dropped=0, rx_errors=0, rx_frame_err=0, rx_over_err=0, rx_packets=4, tx_bytes=250, tx_dropped=0, tx_errors=0, tx_packets=3} {driver_name=veth, driver_version="1.0", firmware_version=""} ""
Manager table
_uuid connection_mode external_ids inactivity_probe is_connected max_backoff other_config status target
----- --------------- ------------ ---------------- ------------ ----------- ------------ ------ ------
Mirror table
_uuid external_ids name output_port output_vlan select_all select_dst_port select_src_port select_vlan statistics
----- ------------ ---- ----------- ----------- ---------- --------------- --------------- ----------- ----------
NetFlow table
_uuid active_timeout add_id_to_interface engine_id engine_type external_ids targets
----- -------------- ------------------- --------- ----------- ------------ -------
Open_vSwitch table
_uuid bridges cur_cfg db_version external_ids manager_options next_cfg other_config ovs_version ssl statistics system_type system_version
------------------------------------ ------------------------------------------------------------------------------------------------------------------ ------- ---------- -------------------------------------------------- --------------- -------- ------------ ----------- --- ---------- ----------- --------------
ab19a0f9-1dc8-44d0-ac00-a9687bb43fdd [12ebfe38-6dab-402a-8fb5-aa814a5a3f52, 929ab1c2-1146-411d-8557-af7498a26444, ccfc8a92-2d29-4a55-ae4b-21f59eeeaed7] 273 "7.3.0" {system-id="6b963f5d-45e7-409f-b5ee-8e30006dcd73"} [] 273 {} "2.0.1" [] {} Ubuntu "14.04-trusty"
Port table
_uuid bond_downdelay bond_fake_iface bond_mode bond_updelay external_ids fake_bridge interfaces lacp mac name other_config qos statistics status tag trunks vlan_mode
------------------------------------ -------------- --------------- --------- ------------ ------------ ----------- -------------------------------------- ---- --- ---------------- ------------ --- ---------- ------ --- ------ ---------
52c02f9d-db2f-4c50-84ce-7a377530ad3b 0 false [] 0 {} false [ee54427c-133c-4b2a-a641-c2624732eb66] [] [] br-ex {} [] {} {} [] [] []
18eacfea-0982-4c89-b2aa-3e2c7dc2b6d3 0 false [] 0 {} false [a45575fd-5179-459e-8951-3553c51f4aaa] [] [] br-int {} [] {} {} [] [] []
8b0946ae-3177-4d5d-bc8e-6d04536701ad 0 false [] 0 {} false [801c0dec-44db-4161-b791-08828d542ecf] [] [] br-tun {} [] {} {} [] [] []
79f74c54-056d-4c60-82d7-46412ceee17e 0 false [] 0 {} false [54ab6189-2611-40cc-884f-5ce20913bc32] [] [] "eth1" {} [] {} {} [] [] []
6d14a97d-28e8-42f8-b5d6-ec655ccc7b91 0 false [] 0 {} false [6da8e9cd-98df-451e-b7d9-fcddefe0325e] [] [] patch-int {} [] {} {} [] [] []
035eed84-ce54-44c4-97d5-92f9cc9d662e 0 false [] 0 {} false [eda8218d-1d5e-4c52-ad89-24f8d0445ef9] [] [] patch-tun {} [] {} {} [] [] []
d564225b-60e4-4786-a5fc-6b1c8febd0fb 0 false [] 0 {} false [e8193562-c3bb-4b48-a9bf-00bba5f2a213] [] [] "qvo7d18228b-47" {} [] {} {} 1 [] []
02049620-d3b4-4ecf-9d9e-f8b40a039f64 0 false [] 0 {} false [c0a5afe4-b7a8-4230-b3bb-af11a33de14e] [] [] "qvoc64bd111-ef" {} [] {} {} 1 [] []
d43963b0-be0b-4265-b0bb-4ebc928d3ee0 0 false [] 0 {} false [b33cdfe2-dab1-449e-b2be-ce85e2d0ac79] [] [] "qvoeea9263a-5e" {} [] {} {} 1 [] []
QoS table
_uuid external_ids other_config queues type
----- ------------ ------------ ------ ----
Queue table
_uuid dscp external_ids other_config
----- ---- ------------ ------------
SSL table
_uuid bootstrap_ca_cert ca_cert certificate external_ids private_key
----- ----------------- ------- ----------- ------------ -----------
sFlow table
_uuid agent external_ids header polling sampling targets
----- ----- ------------ ------ ------- -------- -------
The relationships among the database tables are shown in the figure.
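The UUID lists in the dump are how one table refers to another: a Bridge row lists Port UUIDs, and a Port row lists Interface UUIDs. As a rough illustration, the following Python sketch follows those references for br-ex and br-tun, using rows copied from the dump and reduced to the relevant columns. The dictionary layout and the function name are assumptions made for this example, not an OVSDB API.

```python
# Bridge name -> list of Port UUIDs (from the "ports" column of the dump).
BRIDGE = {
    "br-ex": ["52c02f9d-db2f-4c50-84ce-7a377530ad3b",
              "79f74c54-056d-4c60-82d7-46412ceee17e"],
    "br-tun": ["6d14a97d-28e8-42f8-b5d6-ec655ccc7b91",
               "8b0946ae-3177-4d5d-bc8e-6d04536701ad"],
}

# Port UUID -> (port name, list of Interface UUIDs).
PORT = {
    "52c02f9d-db2f-4c50-84ce-7a377530ad3b":
        ("br-ex", ["ee54427c-133c-4b2a-a641-c2624732eb66"]),
    "79f74c54-056d-4c60-82d7-46412ceee17e":
        ("eth1", ["54ab6189-2611-40cc-884f-5ce20913bc32"]),
    "6d14a97d-28e8-42f8-b5d6-ec655ccc7b91":
        ("patch-int", ["6da8e9cd-98df-451e-b7d9-fcddefe0325e"]),
    "8b0946ae-3177-4d5d-bc8e-6d04536701ad":
        ("br-tun", ["801c0dec-44db-4161-b791-08828d542ecf"]),
}

# Interface UUID -> (interface name, type).
INTERFACE = {
    "ee54427c-133c-4b2a-a641-c2624732eb66": ("br-ex", "internal"),
    "54ab6189-2611-40cc-884f-5ce20913bc32": ("eth1", ""),
    "6da8e9cd-98df-451e-b7d9-fcddefe0325e": ("patch-int", "patch"),
    "801c0dec-44db-4161-b791-08828d542ecf": ("br-tun", "internal"),
}


def interfaces_of(bridge_name):
    """Follow Bridge -> Port -> Interface and return (port, iface, type) tuples."""
    result = []
    for port_uuid in BRIDGE[bridge_name]:
        port_name, iface_uuids = PORT[port_uuid]
        for iface_uuid in iface_uuids:
            iface_name, iface_type = INTERFACE[iface_uuid]
            result.append((port_name, iface_name, iface_type))
    return result


for name in BRIDGE:
    print(name, interfaces_of(name))
```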
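Open_vSwitch is the root table; its structure is as follows:
This table holds the configuration of ovs-vswitchd, which covers the following areas:
- Bridge configuration: bridges points to the Bridge table. The main Open vSwitch features we see are all implemented on bridges and are described in detail under the Bridge table.
- Configuration of the daemon itself:
  - other_config : stats-update-interval : the interval at which statistics are written to the database
  - other_config : flow-restore-wait : used for hot upgrade. When set to true, no packets are processed. The usual procedure is to stop ovs-vswitchd, set this value to true, and start ovs-vswitchd, which at this point processes no packets; then use ovs-ofctl to restore the flow tables to a correct state; and finally set the value to false so that packet processing begins.
  - other_config : flow-limit : the maximum number of flow entries allowed in the flow table
  - other_config : n-handler-threads : the number of threads that handle new flows
  - other_config : n-revalidator-threads : the number of threads that revalidate flows
  - other_config : enable-statistics : whether statistics are collected
  - statistics : cpu : the number of CPUs (processors, threads)
  - statistics : load_average : the system load
  - statistics : memory : total RAM and swap
  - statistics : process_NAME : with NAME replaced by a process name; reports memory size, CPU time, and so on
  - statistics : file_systems : mount point, size, used
- Client request id, that is, cur_cfg and next_cfg: after a client modifies the database it increments next_cfg and then waits for Open vSwitch to apply the changes; once they have been applied, cur_cfg = next_cfg. If we open /etc/openvswitch/conf.db, we can see cur_cfg keep increasing as we configure Open vSwitch.
- SSL configuration: points to the SSL table, which mainly configures the private key, the certificate (which contains the public key), and the CA's certificate.
- ovsdb-server configuration: points to the Manager table. ovs-vswitchd is a client of ovsdb-server, and this configures the options of the database connection.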
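The cur_cfg / next_cfg handshake in the list above can be pictured with a small in-memory model. The sketch below is only a simulation under stated assumptions: the class and method names are hypothetical, and the real exchange happens through database transactions between the client, ovsdb-server, and ovs-vswitchd rather than through direct method calls.

```python
class ConfigDb:
    """Toy model of the next_cfg / cur_cfg sequence numbers."""

    def __init__(self):
        self.cur_cfg = 0
        self.next_cfg = 0
        self.pending = []   # changes written by clients, not yet applied
        self.applied = []   # changes the switch has acted on

    def client_commit(self, change):
        """A client writes a change and bumps next_cfg; returns the value to wait for."""
        self.pending.append(change)
        self.next_cfg += 1
        return self.next_cfg

    def switch_apply(self):
        """The switch applies everything pending, then catches cur_cfg up."""
        self.applied.extend(self.pending)
        self.pending.clear()
        self.cur_cfg = self.next_cfg

    def is_applied(self, seqno):
        """A client polls this until its own change has taken effect."""
        return self.cur_cfg >= seqno


db = ConfigDb()
seq = db.client_commit("add-br helloworld")
before = db.is_applied(seq)   # the switch has not reacted yet
db.switch_apply()
after = db.is_applied(seq)    # now cur_cfg == next_cfg
print(before, after, db.cur_cfg, db.next_cfg)
```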
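4. SSL
The SSL table holds all the settings of a classic SSL connection:
Open vSwitch has its own private key and public key pair. The public key is carried in the certificate, which must be signed by a CA with the CA's private key; the CA thereby vouches that the certificate is legitimate. Verifying the CA's signature requires the CA's public key, which is carried in the CA cert. That certificate must itself be signed, either by a higher-level CA or by the CA itself (self-signed).
bootstrap_ca_cert is a boolean. If it is true, Open vSwitch obtains the CA cert from the controller on its first SSL connection and saves it, instead of requiring the CA cert to be installed beforehand.
If we look closely at the ovsdb-server process, we find these settings in use:
--private-key=db:Open_vSwitch,SSL,private_key --certificate=db:Open_vSwitch,SSL,certificate --bootstrap-ca-cert=db:Open_vSwitch,SSL,ca_cert
We can also have ovs-vswitchd and ovs-controller communicate over SSL, using mutual (two-way) SSL authentication.
Configure ovs-vswitchd with the switch's private key (sc-privkey.pem) and public key (sc-cert.pem), but with the controller's CA cert (cacert.pem):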
cd /etc/openvswitch
sudo ovs-pki req+sign sc switch
sudo ovs-vsctl set-ssl \
/etc/openvswitch/sc-privkey.pem \
/etc/openvswitch/sc-cert.pem \
/var/lib/openvswitch/pki/controllerca/cacert.pem
Configure and start the controller:
cd /etc/openvswitch
sudo ovs-pki req+sign ctl controller
sudo ovs-controller -v pssl:6633 \
-p /etc/openvswitch/ctl-privkey.pem \
-c /etc/openvswitch/ctl-cert.pem \
-C /var/lib/openvswitch/pki/switchca/cacert.pem
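Here the controller is configured with its own private key (ctl-privkey.pem) and public key (ctl-cert.pem), but with the switch's CA cert (cacert.pem).
This follows from how SSL works when a connection is established.
When a client connects to a server, the server sends its certificate to the client, and the client later uses the public key in that certificate to check what the server produced with its private key. Whether the certificate is legitimate has to be verified against a CA, and because the certificate belongs to the server, the client must verify it with the server's CA cert. In the same way, the server requests the client's certificate so that it can check what the client produced with its private key, and it verifies that certificate with the client's CA cert.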
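The "own key and certificate, peer's CA cert" rule is the same one any mutual-TLS setup follows. As an illustration outside Open vSwitch itself, here is a minimal sketch using Python's standard ssl module. The helper name make_context is hypothetical, and the file paths in the comments are the ones from the commands above; they are left commented out because the files exist only on a machine where ovs-pki has been run.

```python
import ssl


def make_context(server_side, cert=None, key=None, peer_ca=None):
    """Build a context that presents our certificate and requires the peer's.

    cert/key are OUR certificate and private key; peer_ca is the CA cert
    that signed the PEER's certificate.
    """
    protocol = ssl.PROTOCOL_TLS_SERVER if server_side else ssl.PROTOCOL_TLS_CLIENT
    ctx = ssl.SSLContext(protocol)
    # Assumption: peers are identified by their CA-signed certificate,
    # not by a DNS host name, so host name checking is turned off.
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_REQUIRED   # both sides must show a valid certificate
    if cert and key:
        ctx.load_cert_chain(certfile=cert, keyfile=key)
    if peer_ca:
        ctx.load_verify_locations(cafile=peer_ca)
    return ctx


# Switch side (client): own key pair, controller's CA cert.
# switch_ctx = make_context(False,
#                           cert="/etc/openvswitch/sc-cert.pem",
#                           key="/etc/openvswitch/sc-privkey.pem",
#                           peer_ca="/var/lib/openvswitch/pki/controllerca/cacert.pem")
# Controller side (server): own key pair, switch's CA cert.
# controller_ctx = make_context(True,
#                               cert="/etc/openvswitch/ctl-cert.pem",
#                               key="/etc/openvswitch/ctl-privkey.pem",
#                               peer_ca="/var/lib/openvswitch/pki/switchca/cacert.pem")

# Without files, the contexts can still be built to inspect their policy.
client_ctx = make_context(False)
server_ctx = make_context(True)
print(client_ctx.verify_mode, server_ctx.verify_mode)
```

Setting verify_mode to CERT_REQUIRED on the server side is what makes the authentication two-way; by default a TLS server does not ask the client for a certificate.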
Once the SSL connection is established, we see the following:
mininet@mininet:~$ sudo ovs-vsctl show
902d6aa3-6a0a-4708-a286-3301c8b36430
Bridge "s1"
Controller "ssl:127.0.0.1:6633"
is_connected: true
fail_mode: secure
Port "s1"
Interface "s1"
type: internal
Port "s1-eth1"
Interface "s1-eth1"
Port "s1-eth2"
Interface "s1-eth2"
ovs_version: "2.0.1"
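A controller connection is made per switch: a particular bridge connects to a particular controller.
A manager connection, by contrast, is made per Open vSwitch daemon: a particular daemon connects to a particular manager.
5. Manager
The Manager table configures ovsdb-server: ovsdb-server uses the entries in manager_options to set up its database connections, for example to listen on a port and wait for clients to connect.
The most important column is target:
- ssl:ip[:port]: ovsdb-server actively connects to port on ip, using SSL
- tcp:ip[:port]: ovsdb-server actively connects to port on ip, using TCP
- pssl:[port][:ip]: ovsdb-server listens on port (optionally only on ip), using SSL
- ptcp:[port][:ip]: ovsdb-server listens on port (optionally only on ip), using TCP
It can be set with the following command: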
ovs-vsctl set-manager…
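To make the four target forms concrete, the following Python sketch splits a target string into its parts. It treats the p-prefixed forms as passive (listening) and the others as active (outgoing), handles IPv4 addresses only, and returns None where the optional part is omitted instead of guessing a default port. The function name and the returned dictionary are assumptions made for illustration, not part of any Open vSwitch library.

```python
def parse_target(target):
    """Split a Manager target such as 'ptcp:8881' or 'tcp:10.0.0.1:6640'."""
    scheme, _, rest = target.partition(":")
    if scheme in ("tcp", "ssl"):
        # Active form: ip comes first, port is optional.
        ip, _, port = rest.partition(":")
        passive = False
    elif scheme in ("ptcp", "pssl"):
        # Passive form: port comes first, ip is optional.
        port, _, ip = rest.partition(":")
        passive = True
    else:
        raise ValueError("unsupported target scheme: %r" % scheme)
    return {
        "passive": passive,
        "secure": scheme.endswith("ssl"),
        "ip": ip or None,
        "port": int(port) if port else None,
    }


print(parse_target("ptcp:8881"))
print(parse_target("tcp:16.158.165.153:8881"))
```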
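From the architecture diagram we can see that ovs-vswitchd is a client of ovsdb-server, and the two communicate over the Unix domain socket /var/run/openvswitch/db.sock.
ovs-vsctl is also a client of ovsdb-server. By default it runs on the same machine as ovsdb-server and likewise communicates through /var/run/openvswitch/db.sock.
ovs-vsctl takes a --db option, which defaults to unix:file but can also be tcp:ip:port or ssl:ip:port.
This lets ovs-vsctl control an ovsdb-server remotely from another machine.
We have two machines. On the first, 16.158.165.153, we set the manager: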
sudo ovs-vsctl set-manager ptcp:8881
$ sudo ovs-vsctl show
ab19a0f9-1dc8-44d0-ac00-a9687bb43fdd
Manager "ptcp:8881"
Bridge br-int
Port br-int
Interface br-int
type: internal
Port patch-tun
Interface patch-tun
type: patch
options: {peer=patch-int}
Port "qvoeea9263a-5e"
tag: 1
Interface "qvoeea9263a-5e"
Port "qvo7d18228b-47"
tag: 1
Interface "qvo7d18228b-47"
Port "qvoc64bd111-ef"
tag: 1
Interface "qvoc64bd111-ef"
Bridge br-ex
Port br-ex
Interface br-ex
type: internal
Port "eth1"
Interface "eth1"
Bridge br-tun
Port patch-int
Interface patch-int
type: patch
options: {peer=patch-tun}
Port br-tun
Interface br-tun
type: internal
ovs_version: "2.0.1"
On the other machine, 16.158.165.102, we run:
root@openstackcliu8:~# ovs-vsctl show
21c19aae-c278-4c65-9c6b-9b2c4d67b6dd
ovs_version: "2.0.1"
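Its own database turns out to be empty. Pointing --db at the first machine shows that machine's configuration instead: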
# ovs-vsctl --db=tcp:16.158.165.153:8881 show
ab19a0f9-1dc8-44d0-ac00-a9687bb43fdd
Manager "ptcp:8881"
Bridge br-int
Port br-int
Interface br-int
type: internal
Port patch-tun
Interface patch-tun
type: patch
options: {peer=patch-int}
Port "qvoeea9263a-5e"
tag: 1
Interface "qvoeea9263a-5e"
Port "qvo7d18228b-47"
tag: 1
Interface "qvo7d18228b-47"
Port "qvoc64bd111-ef"
tag: 1
Interface "qvoc64bd111-ef"
Bridge br-ex
Port br-ex
Interface br-ex
type: internal
Port "eth1"
Interface "eth1"
Bridge br-tun
Port patch-int
Interface patch-int
type: patch
options: {peer=patch-tun}
Port br-tun
Interface br-tun
type: internal
ovs_version: "2.0.1"
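What ovs-vsctl --db=tcp:... does underneath is speak the OVSDB management protocol, which is JSON-RPC over the stream connection. The Python sketch below builds a list_dbs request (a method defined by that protocol) and includes a deliberately simplified sender. The function names are hypothetical, and the reply handling assumes the server sends back exactly one JSON object, which is enough for a quick probe but not for a real client.

```python
import json
import socket


def build_request(method, params, request_id=0):
    """Encode one JSON-RPC request as bytes."""
    message = {"method": method, "params": params, "id": request_id}
    return json.dumps(message).encode("utf-8")


def call(host, port, method, params, timeout=5.0):
    """Send one request and return the first complete JSON reply.

    Simplification: keeps reading until the buffered bytes parse as JSON.
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(build_request(method, params))
        buf = b""
        while True:
            chunk = sock.recv(4096)
            if not chunk:
                raise ConnectionError("connection closed before a full reply arrived")
            buf += chunk
            try:
                return json.loads(buf.decode("utf-8"))
            except ValueError:
                continue  # reply is not complete yet


request = build_request("list_dbs", [])
print(request)

# Against the manager configured above (needs a reachable ovsdb-server):
# reply = call("16.158.165.153", 8881, "list_dbs", [])
# print(reply["result"])
```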
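6. The Bridge table
The Open_vSwitch table points to the Bridge table.
The Bridge table is without doubt the most important table; what we call a virtual switch is, for the most part, implemented by a bridge.
A bridge has the following configuration items:
- Core feature settings
  - name
  - ports: points to the Port table
  - mirrors: points to the Mirror table
  - netflow
  - sflow
  - ipfix
  - flood_vlans: a set of VLAN IDs for which MAC address learning is disabled, so packets on those VLANs are flooded rather than sent to a learned port
- OpenFlow settings: as the architecture diagram shows, an Open vSwitch bridge can be managed by a central controller through the OpenFlow protocol
  - controller
  - flow_tables
  - fail_mode: once a bridge is connected to an OpenFlow controller, its flow tables are managed by the controller; this setting decides what happens if the connection is lost
    - secure: the bridge keeps trying to reconnect to the controller and does not set up flows on its own
    - standalone: if nothing is heard from the controller for three times the inactivity probe interval, the bridge sets up and manages flows on its own
  - datapath_id
- Spanning Tree settings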
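The difference between the two fail_mode values can be written down as a small decision function. This is a sketch of the rule only, assuming the standalone threshold is three times the controller's inactivity probe interval; the function and parameter names are hypothetical, and the probe interval is passed in rather than assumed.

```python
def bridge_manages_own_flows(fail_mode, seconds_since_controller_msg, inactivity_probe_s):
    """Return True if the bridge should take over flow setup by itself."""
    if fail_mode == "secure":
        # Never falls back: keeps retrying the controller.
        return False
    if fail_mode == "standalone":
        # Falls back once the controller has been silent for long enough.
        return seconds_since_controller_msg >= 3 * inactivity_probe_s
    raise ValueError("unknown fail_mode: %r" % fail_mode)


print(bridge_manages_own_flows("secure", 60, 5))
print(bridge_manages_own_flows("standalone", 10, 5))
print(bridge_manages_own_flows("standalone", 15, 5))
```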
Creating a bare bridge is simple: ovs-vsctl add-br helloworld. Several of its settings, however, are comparatively complex.
In the next section, we look at the Controller table.