Contents

  • 1. Microservices: Consul
  • 1. What is Consul
  • 1.1. About HashiCorp
  • 1.2. Consul vs. ZooKeeper, doozerd, etcd
  • 2. Basic Concepts
  • 3. Installation
  • 4. Default Ports
  • 5. Deploying a Cluster
  • 5.1. Installing the Server
  • 5.2. Installing the Client
  • 6. Usage
  • 6.1. List All Nodes
  • 6.2. List All Services
  • 6.3. List Instances of a Service
  • 6.4. Register a Service
  • 6.5. Deregister a Service
  • 7. Web UI
  • 8. Registrator
  • 9. Service Registration
  • 10. Health Checks


1. Microservices: Consul

1. What is Consul

Consul is a distributed service mesh to connect, secure, and configure services across any runtime platform and public or private cloud. Service Mesh Made Easy.

Consul is an open-source service discovery tool from HashiCorp. Some people use etcd or ZooKeeper for similar purposes; the differences between them are covered in the official comparison Consul vs. ZooKeeper, doozerd, etcd.

Consul provides features such as service discovery, health checking, and a KV store, which makes it convenient to build your own service cluster on top of it.

1.1. About HashiCorp

Consistent workflows to provision, secure, connect, and run any infrastructure for any application.

1.2. Consul vs. ZooKeeper, doozerd, etcd

Consul speaks both HTTP and DNS, while etcd only supports HTTP.

ZooKeeper, doozerd, and etcd are all similar in their architecture. All three have server nodes that require a quorum of nodes to operate (usually a simple majority). They are strongly-consistent and expose various primitives that can be used through client libraries within applications to build complex distributed systems.

Consul also uses server nodes within a single datacenter. In each datacenter, Consul servers require a quorum to operate and provide strong consistency. However, Consul has native support for multiple datacenters as well as a more feature-rich gossip system that links server nodes and clients.

All of these systems have roughly the same semantics when providing key/value storage: reads are strongly consistent and availability is sacrificed for consistency in the face of a network partition. However, the differences become more apparent when these systems are used for advanced cases.

The semantics provided by these systems are attractive for building service discovery systems, but it’s important to stress that these features must be built. ZooKeeper et al. provide only a primitive K/V store and require that application developers build their own system to provide service discovery. Consul, by contrast, provides an opinionated framework for service discovery and eliminates the guess-work and development effort. Clients simply register services and then perform discovery using a DNS or HTTP interface. Other systems require a home-rolled solution.

A compelling service discovery framework must incorporate health checking and the possibility of failures as well. It is not useful to know that Node A provides the Foo service if that node has failed or the service crashed. Naive systems make use of heartbeating, using periodic updates and TTLs. These schemes require work linear to the number of nodes and place the demand on a fixed number of servers. Additionally, the failure detection window is at least as long as the TTL.

ZooKeeper provides ephemeral nodes which are K/V entries that are removed when a client disconnects. These are more sophisticated than a heartbeat system but still have inherent scalability issues and add client-side complexity. All clients must maintain active connections to the ZooKeeper servers and perform keep-alives. Additionally, this requires “thick clients” which are difficult to write and often result in debugging challenges.

Consul uses a very different architecture for health checking. Instead of only having server nodes, Consul clients run on every node in the cluster. These clients are part of a gossip pool which serves several functions, including distributed health checking. The gossip protocol implements an efficient failure detector that can scale to clusters of any size without concentrating the work on any select group of servers. The clients also enable a much richer set of health checks to be run locally, whereas ZooKeeper ephemeral nodes are a very primitive check of liveness. With Consul, clients can check that a web server is returning 200 status codes, that memory utilization is not critical, that there is sufficient disk space, etc. The Consul clients expose a simple HTTP interface and avoid exposing the complexity of the system to clients in the same way as ZooKeeper.

Consul provides first-class support for service discovery, health checking, K/V storage, and multiple datacenters. To support anything more than simple K/V storage, all these other systems require additional tools and libraries to be built on top. By using client nodes, Consul provides a simple API that only requires thin clients. Additionally, the API can be avoided entirely by using configuration files and the DNS interface to have a complete service discovery solution with no development at all.

2. Basic Concepts

  • Agent: An agent is the actual running Consul process. It can be started in either server or client mode. Every cluster needs at least one server, and since Consul uses the Raft algorithm, you should run 3 or 5 servers per cluster.
  • Server: The core Consul service. It stores all service registration data, answers queries, handles cross-datacenter communication, and so on.
  • Client: The process that runs on every machine in the cluster and performs service registration / health checks.
  • Cluster: A group of machines that jointly provide a service; an agent must run on every member of the cluster.
  • DataCenter: A datacenter. Consul supports clusters that span multiple datacenters.
  • Node: A machine with an agent installed that has joined the cluster.
  • Service: Your service, i.e. the target of operations such as service registration and service discovery. A service is defined either through a config file or by calling Consul's HTTP API (a config-file sketch follows this list).
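
As a minimal sketch of the config-file approach, a service definition file placed in the agent's config directory (say /consul/config/redis-service.json, where the path, port, and health endpoint are just assumptions for illustration) could look like this:

{
  "service": {
    "name": "redis",
    "port": 6379,
    "tags": ["primary"],
    "check": {
      "http": "http://localhost:5000/health",
      "interval": "10s"
    }
  }
}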

3. Installation

Since this thing is written in Go, a single binary is all you need to run it. Just download the binary for your platform from https://www.consul.io/downloads.html and install it.

Or, to be a bit more "politically correct", use a Docker container:

docker pull consul:latest

4. Default Ports

Consul uses the following ports by default:

  • 8300 (tcp): Server RPC, used by servers to accept requests from other agents
  • 8301 (tcp, udp): Serf LAN, gossip within a datacenter
  • 8302 (tcp, udp): Serf WAN, gossip across datacenters
  • 8400 (tcp): CLI RPC, accepts RPC calls from the command line
  • 8500 (tcp): HTTP API and Web UI
  • 8600 (tcp, udp): DNS server; it can be configured to answer on port 53 to serve DNS queries
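
For example, with an agent running locally, the DNS and HTTP ports can be exercised like this (the redis service name is just an assumed example):

dig @127.0.0.1 -p 8600 redis.service.consul SRV
curl http://127.0.0.1:8500/v1/status/leader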

5. Deploying a Cluster

Containerized deployment is used as the example here. See the documentation on Docker Hub for details.

The main points to pay attention to are:

  • If you are not well versed in networking, it is recommended to run all containerized agents with the network mode set to host
  • Mount a volume at /consul/data inside the container to persist agent state (a sketch of the flags follows this list)
  • Consul configuration lives under /consul/config; the configuration can also be modified by putting a JSON string in the CONSUL_LOCAL_CONFIG environment variable
  • The machines must not share the same hostname, otherwise they cannot form a cluster
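
As a sketch of the volume point above (the host paths are assumptions), bind mounts like these could be added to the docker run commands in 5.1 and 5.2:

-v /opt/consul/data:/consul/data
-v /opt/consul/config:/consul/config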

5.1. Installing the Server

This startup command is basically copied from the official documentation. Configuring everything with command-line flags is not particularly elegant; a config file is preferable. See the Configuration documentation for the full list of options.

$ docker run -d --net=host -e 'CONSUL_LOCAL_CONFIG={"skip_leave_on_interrupt": true}' consul agent -server -bind=<external ip> -retry-join=<root agent ip> -bootstrap-expect=<number of server agents> -data-dir=/consul/data -node=<node_name> -client=<client ip> -ui
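
A concrete variant, assuming three server machines with internal IPs 10.0.0.1–10.0.0.3 and run on the first of them (all addresses and names here are invented):

$ docker run -d --net=host -e 'CONSUL_LOCAL_CONFIG={"skip_leave_on_interrupt": true}' consul agent -server -bind=10.0.0.1 -retry-join=10.0.0.2 -retry-join=10.0.0.3 -bootstrap-expect=3 -data-dir=/consul/data -node=server-1 -client=10.0.0.1 -ui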

5.2. Installing the Client

$ docker run -d --net=host -e 'CONSUL_LOCAL_CONFIG={"skip_leave_on_interrupt": true}' consul agent -bind=<external ip> -retry-join=<root agent ip> -data-dir=/consul/data -node=<node_name> -client=<client ip> -ui

Flag reference:

-bind: the IP the agent binds to (for intra-cluster communication); defaults to 0.0.0.0, but for security it is best to set it to an internal IP
-retry-join: the IP contacted when rejoining the cluster after the agent loses contact with it
-bootstrap-expect: the expected number of servers in the cluster
-data-dir: location of the data directory
-client: the IP on which the client interfaces (HTTP API, DNS, etc.) are served
-ui: enable the Web UI
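
Once the agents are up, cluster membership can be verified with the consul members command from inside any agent container (the container name below is a placeholder):

$ docker exec <consul container name> consul members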

6. Usage

See the official API documentation for more details; only the few endpoints I actually used are listed here. Consul can be operated by calling a client's HTTP API, or through a ready-made SDK such as the official Go API.

6.1. List All Nodes

curl -X GET \
  http://consul.rocks/v1/catalog/nodes

6.2. List All Services

curl -X GET \
  http://consul.rocks/v1/catalog/services

6.3. List Instances of a Service

curl -X GET \
  http://consul.rocks/v1/catalog/service/<service-name>
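
The catalog endpoint above returns every registered instance regardless of health; to get only the instances whose checks are passing, the health endpoint can be used instead:

curl -X GET \
  http://consul.rocks/v1/health/service/<service-name>?passing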

6.4. Register a Service

Register a redis service with an HTTP health check that runs every 10 seconds:

curl -X PUT \
  http://192.168.5.36:8500/v1/agent/service/register \
  -d '{
    "ID": "my-redis", "Name": "redis", "Tags": ["primary", "v1"],
    "Address": "127.0.0.1", "Port": 6379, "Meta": {"redis_version": "4.0"},
    "EnableTagOverride": false,
    "Check": {"DeregisterCriticalServiceAfter": "90m", "HTTP": "http://localhost:5000/health", "Interval": "10s"}
  }'
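
To confirm the registration took effect, the agent's local service and check lists can be queried on the same host:

curl http://192.168.5.36:8500/v1/agent/services
curl http://192.168.5.36:8500/v1/agent/checks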

6.5. Deregister a Service

curl -X PUT \
    https://consul.rocks/v1/agent/service/deregister/<service-id>

7. Web UI

Open the Web UI by visiting port 8500 on any server. The interface looks fairly polished but the functionality is rather basic: you can view the nodes and services in the cluster, set the ACL Token, and create/read/update/delete entries in the KV store, but you cannot modify or delete nodes and services beyond that.

8. Registrator

The most troublesome part of containerized deployment is that a service running inside a container knows very little about its environment and can hardly provide the information (address, port, etc.) needed for service registration, so an external tool with a god's-eye view is required to do this job. Registrator is such a tool; it can register services with etcd, Consul, and SkyDNS 2. See its documentation for usage details.

docker run -d \
    --name=registrator \
    --net=host \
    --volume=/var/run/docker.sock:/tmp/docker.sock \
    gliderlabs/registrator:latest \
      consul://<client ip>:8500

9. Service Registration

Registrator automatically registers the containers started on the current machine into Consul. The service IDs it registers are generated in the following format:

host:container_name:port

For example, the ID of a redis service on a machine whose hostname is ubuntu might be:

ubuntu:my_redis:6379

Different ports exposed by the same container are registered as separate services.

10. Health Checks

Add the following labels or environment variables to a container that needs health checking to enable it, replacing 80 with the service's actual port (a combined example follows the variables):

SERVICE_80_CHECK_HTTP=/health/endpoint/path
SERVICE_80_CHECK_INTERVAL=15s
SERVICE_80_CHECK_TIMEOUT=1s     # optional, Consul default used otherwise
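
Putting the pieces together, a hypothetical web service container started with Registrator-style metadata might look like this (the image name, port, and health path are assumptions):

docker run -d \
    -p 8080:8080 \
    -e SERVICE_NAME=my-web \
    -e SERVICE_8080_CHECK_HTTP=/health \
    -e SERVICE_8080_CHECK_INTERVAL=15s \
    my-web-image:latest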

So the whole pile of things, once deployed, basically looks like this:

Pick 3 machines and install a server on each of them to form a highly available Consul cluster. On every node that needs service registration, install a client and Registrator. Registrator reads container information from Docker and sends it to the client on the same node for service registration; the client forwards the requests to the servers, which perform the actual registration. Health checks are executed by the client and reported to the servers.