Limiting a container's resources:
By default, a container has no resource limits at all: it can consume nearly every resource the host kernel is able to allocate to it. However much the host's scheduler can hand out, the container can use (under heavy load).
Docker provides the following means of limiting memory, CPU, disk IO, and so on.
Memory is a non-compressible resource, while CPU is a compressible one; enforcing these limits relies on some fairly deep Linux kernel machinery (cgroups).
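Disk IO is not demonstrated below, so here is a minimal sketch of the blkio flags (the device path /dev/sda and the centos image are assumptions about this host):
# cap write throughput to /dev/sda at 10MB/s inside this container
docker run -it --rm --device-write-bps /dev/sda:10mb centos sh
# or cap write IOPS instead
docker run -it --rm --device-write-iops /dev/sda:100 centos sh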
memory hogs (processes that gobble up memory)
oom_adj / oom_score_adj (the per-process OOM adjustment knob)
oom_score (the score the kernel computes when choosing an OOM victim)
A truly critical container should have its oom_score_adj tuned at creation time.
Limit the container's resources at docker run or docker create time.
Memory resource controller documentation: https://www.kernel.org/doc/Documentation/cgroup-v1/memory.txt
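A minimal sketch of the common memory flags (the values and the nginx image are arbitrary placeholders):
# hard limit: the container is OOM-killed if it exceeds 256m
docker run -d --name web -m 256m nginx
# --memory-swap is RAM+swap combined, so this is 256m RAM + 256m swap
docker run -d --name web2 -m 256m --memory-swap 512m nginx
# soft limit: only enforced when the host comes under memory pressure
docker run -d --name web3 -m 256m --memory-reservation 128m nginx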
By default, every container may use all of the host's CPUs.
Per-container settings can be applied to limit each container's CPU.
Since Docker 1.13 you can also configure the realtime scheduler; most users use and configure the default CFS scheduler.
Processes come in two flavors: CPU-bound and IO-bound.
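A minimal sketch of the CFS flags (nginx is a placeholder image); --cpus is shorthand for a period/quota pair:
# at most 1.5 cores' worth of CPU time
docker run -d --name c1 --cpus 1.5 nginx
# equivalent long form: quota/period = 150000/100000 = 1.5 cores
docker run -d --name c2 --cpu-period 100000 --cpu-quota 150000 nginx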
OOME
On a Linux host, when the kernel detects that not enough memory remains for important system functions, it raises an OOME (Out Of Memory Exception) and starts killing processes to free memory.
- Once an OOME occurs, any process may be killed, the docker daemon included.
- For that reason Docker deliberately adjusts the docker daemon's OOM priority so it won't be "executed", but containers' priorities are left untouched.
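A minimal sketch of protecting an important container yourself (nginx is a placeholder image); a more negative --oom-score-adj makes the kernel less likely to pick it as a victim:
# push this container toward the end of the OOM kill list
docker run -d --name web --oom-score-adj -500 nginx
# verify the adjustment from inside the container
docker exec web cat /proc/1/oom_score_adj
# or disable the OOM killer for it entirely -- only sensible together with -m
docker run -d --name web2 -m 256m --oom-kill-disable nginx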
lscpu
[root@node1 ~]# lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 1
On-line CPU(s) list: 0
Thread(s) per core: 1
Core(s) per socket: 1
Socket(s): 1
NUMA node(s): 1
Vendor ID: GenuineIntel
CPU family: 6
Model: 42
Model name: Intel(R) Core(TM) i7-2600 CPU @ 3.40GHz
Stepping: 7
CPU MHz: 3801.000
BogoMIPS: 7602.00
Virtualization: VT-x
Hypervisor vendor: VMware
Virtualization type: full
L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 8192K
NUMA node0 CPU(s): 0
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss syscall nx rdtscp lm constant_tsc arch_perfmon nopl xtopology tsc_reliable nonstop_tsc eagerfpu pni pclmulqdq vmx ssse3 cx16 pcid sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx hypervisor lahf_lm tpr_shadow vnmi ept vpid tsc_adjust arat
docker run --help
[root@node1 ~]# docker run --help
Usage: docker run [OPTIONS] IMAGE [COMMAND] [ARG...]
Run a command in a new container
Options:
--add-host list Add a custom host-to-IP mapping (host:ip)
-a, --attach list Attach to STDIN, STDOUT or STDERR
--blkio-weight uint16 Block IO (relative weight), between 10 and 1000, or 0 to disable (default 0)
--blkio-weight-device list Block IO weight (relative device weight) (default [])
--cap-add list Add Linux capabilities
--cap-drop list Drop Linux capabilities
--cgroup-parent string Optional parent cgroup for the container
--cidfile string Write the container ID to the file
--cpu-period int Limit CPU CFS (Completely Fair Scheduler) period
--cpu-quota int Limit CPU CFS (Completely Fair Scheduler) quota
--cpu-rt-period int Limit CPU real-time period in microseconds
--cpu-rt-runtime int Limit CPU real-time runtime in microseconds
-c, --cpu-shares int CPU shares (relative weight)
--cpus decimal Number of CPUs
--cpuset-cpus string CPUs in which to allow execution (0-3, 0,1)
--cpuset-mems string MEMs in which to allow execution (0-3, 0,1)
-d, --detach Run container in background and print container ID
--detach-keys string Override the key sequence for detaching a container
--device list Add a host device to the container
--device-cgroup-rule list Add a rule to the cgroup allowed devices list
--device-read-bps list Limit read rate (bytes per second) from a device (default [])
--device-read-iops list Limit read rate (IO per second) from a device (default [])
--device-write-bps list Limit write rate (bytes per second) to a device (default [])
--device-write-iops list Limit write rate (IO per second) to a device (default [])
--disable-content-trust Skip image verification (default true)
--dns list Set custom DNS servers
--dns-option list Set DNS options
--dns-search list Set custom DNS search domains
--entrypoint string Overwrite the default ENTRYPOINT of the image
-e, --env list Set environment variables
--env-file list Read in a file of environment variables
--expose list Expose a port or a range of ports
--group-add list Add additional groups to join
--health-cmd string Command to run to check health
--health-interval duration Time between running the check (ms|s|m|h) (default 0s)
--health-retries int Consecutive failures needed to report unhealthy
--health-start-period duration Start period for the container to initialize before starting health-retries countdown
(ms|s|m|h) (default 0s)
--health-timeout duration Maximum time to allow one check to run (ms|s|m|h) (default 0s)
--help Print usage
-h, --hostname string Container host name
--init Run an init inside the container that forwards signals and reaps processes
-i, --interactive Keep STDIN open even if not attached
--ip string IPv4 address (e.g., 172.30.100.104)
--ip6 string IPv6 address (e.g., 2001:db8::33)
--ipc string IPC mode to use
--isolation string Container isolation technology
--kernel-memory bytes Kernel memory limit
-l, --label list Set meta data on a container
--label-file list Read in a line delimited file of labels
--link list Add link to another container
--link-local-ip list Container IPv4/IPv6 link-local addresses
--log-driver string Logging driver for the container
--log-opt list Log driver options
--mac-address string Container MAC address (e.g., 92:d0:c6:0a:29:33)
-m, --memory bytes Memory limit
--memory-reservation bytes Memory soft limit
--memory-swap bytes Swap limit equal to memory plus swap: '-1' to enable unlimited swap
--memory-swappiness int Tune container memory swappiness (0 to 100) (default -1)
--mount mount Attach a filesystem mount to the container
--name string Assign a name to the container
--network string Connect a container to a network (default "default")
--network-alias list Add network-scoped alias for the container
--no-healthcheck Disable any container-specified HEALTHCHECK
--oom-kill-disable Disable OOM Killer
--oom-score-adj int Tune host's OOM preferences (-1000 to 1000)
--pid string PID namespace to use
--pids-limit int Tune container pids limit (set -1 for unlimited)
--privileged Give extended privileges to this container
-p, --publish list Publish a container's port(s) to the host
-P, --publish-all Publish all exposed ports to random ports
--read-only Mount the container's root filesystem as read only
--restart string Restart policy to apply when a container exits (default "no")
--rm Automatically remove the container when it exits
--runtime string Runtime to use for this container
--security-opt list Security Options
--shm-size bytes Size of /dev/shm
--sig-proxy Proxy received signals to the process (default true)
--stop-signal string Signal to stop a container (default "SIGTERM")
--stop-timeout int Timeout (in seconds) to stop a container
--storage-opt list Storage driver options for the container
--sysctl map Sysctl options (default map[])
--tmpfs list Mount a tmpfs directory
-t, --tty Allocate a pseudo-TTY
--ulimit ulimit Ulimit options (default [])
-u, --user string Username or UID (format: <name|uid>[:<group|gid>])
--userns string User namespace to use
--uts string UTS namespace to use
-v, --volume list Bind mount a volume
--volume-driver string Optional volume driver for the container
--volumes-from list Mount volumes from the specified container(s)
-w, --workdir string Working directory inside the container
Image for load testing: stress-ng
https://hub.docker.com/r/lorel/docker-stress-ng
[root@node1 ~]# docker pull lorel/docker-stress-ng
[root@node1 ~]# docker run --name stress -it --rm lorel/docker-stress-ng:latest stress --help
stress-ng, version 0.03.11
Usage: stress-ng [OPTION [ARG]]
--h, --help show help
--affinity N start N workers that rapidly change CPU affinity
--affinity-ops N stop when N affinity bogo operations completed
--affinity-rand change affinity randomly rather than sequentially
--aio N start N workers that issue async I/O requests
--aio-ops N stop when N bogo async I/O requests completed
--aio-requests N number of async I/O requests per worker
-a N, --all N start N workers of each stress test
-b N, --backoff N wait of N microseconds before work starts
-B N, --bigheap N start N workers that grow the heap using calloc()
--bigheap-ops N stop when N bogo bigheap operations completed
--bigheap-growth N grow heap by N bytes per iteration
--brk N start N workers performing rapid brk calls
--brk-ops N stop when N brk bogo operations completed
--brk-notouch don't touch (page in) new data segment page
--bsearch start N workers that exercise a binary search
--bsearch-ops stop when N binary search bogo operations completed
--bsearch-size number of 32 bit integers to bsearch
-C N, --cache N start N CPU cache thrashing workers
--cache-ops N stop when N cache bogo operations completed (x86 only)
--cache-flush flush cache after every memory write (x86 only)
--cache-fence serialize stores
--class name specify a class of stressors, use with --sequential
--chmod N start N workers thrashing chmod file mode bits
--chmod-ops N stop chmod workers after N bogo operations
-c N, --cpu N start N workers spinning on sqrt(rand())
--cpu-ops N stop when N cpu bogo operations completed
-l P, --cpu-load P load CPU by P %%, 0=sleep, 100=full load (see -c)
--cpu-method m specify stress cpu method m, default is all
-D N, --dentry N start N dentry thrashing processes
--dentry-ops N stop when N dentry bogo operations completed
--dentry-order O specify dentry unlink order (reverse, forward, stride)
--dentries N create N dentries per iteration
--dir N start N directory thrashing processes
--dir-ops N stop when N directory bogo operations completed
-n, --dry-run do not run
--dup N start N workers exercising dup/close
--dup-ops N stop when N dup/close bogo operations completed
--epoll N start N workers doing epoll handled socket activity
--epoll-ops N stop when N epoll bogo operations completed
--epoll-port P use socket ports P upwards
--epoll-domain D specify socket domain, default is unix
--eventfd N start N workers stressing eventfd read/writes
--eventfd-ops N stop eventfd workers after N bogo operations
--fault N start N workers producing page faults
--fault-ops N stop when N page fault bogo operations completed
--fifo N start N workers exercising fifo I/O
--fifo-ops N stop when N fifo bogo operations completed
--fifo-readers N number of fifo reader processes to start
--flock N start N workers locking a single file
--flock-ops N stop when N flock bogo operations completed
-f N, --fork N start N workers spinning on fork() and exit()
--fork-ops N stop when N fork bogo operations completed
--fork-max P create P processes per iteration, default is 1
--fstat N start N workers exercising fstat on files
--fstat-ops N stop when N fstat bogo operations completed
--fstat-dir path fstat files in the specified directory
--futex N start N workers exercising a fast mutex
--futex-ops N stop when N fast mutex bogo operations completed
--get N start N workers exercising the get*() system calls
--get-ops N stop when N get bogo operations completed
-d N, --hdd N start N workers spinning on write()/unlink()
--hdd-ops N stop when N hdd bogo operations completed
--hdd-bytes N write N bytes per hdd worker (default is 1GB)
--hdd-direct minimize cache effects of the I/O
--hdd-dsync equivalent to a write followed by fdatasync
--hdd-noatime do not update the file last access time
--hdd-sync equivalent to a write followed by fsync
--hdd-write-size N set the default write size to N bytes
--hsearch start N workers that exercise a hash table search
--hsearch-ops stop when N hash search bogo operations completed
--hsearch-size number of integers to insert into hash table
--inotify N start N workers exercising inotify events
--inotify-ops N stop inotify workers after N bogo operations
-i N, --io N start N workers spinning on sync()
--io-ops N stop when N io bogo operations completed
--ionice-class C specify ionice class (idle, besteffort, realtime)
--ionice-level L specify ionice level (0 max, 7 min)
-k, --keep-name keep stress process names to be 'stress-ng'
--kill N start N workers killing with SIGUSR1
--kill-ops N stop when N kill bogo operations completed
--lease N start N workers holding and breaking a lease
--lease-ops N stop when N lease bogo operations completed
--lease-breakers N number of lease breaking processes to start
--link N start N workers creating hard links
--link-ops N stop when N link bogo operations completed
--lsearch start N workers that exercise a linear search
--lsearch-ops stop when N linear search bogo operations completed
--lsearch-size number of 32 bit integers to lsearch
-M, --metrics print pseudo metrics of activity
--metrics-brief enable metrics and only show non-zero results
--memcpy N start N workers performing memory copies
--memcpy-ops N stop when N memcpy bogo operations completed
--mmap N start N workers stressing mmap and munmap
--mmap-ops N stop when N mmap bogo operations completed
--mmap-async using asynchronous msyncs for file based mmap
--mmap-bytes N mmap and munmap N bytes for each stress iteration
--mmap-file mmap onto a file using synchronous msyncs
--mmap-mprotect enable mmap mprotect stressing
--msg N start N workers passing messages using System V messages
--msg-ops N stop msg workers after N bogo messages completed
--mq N start N workers passing messages using POSIX messages
--mq-ops N stop mq workers after N bogo messages completed
--mq-size N specify the size of the POSIX message queue
--nice N start N workers that randomly re-adjust nice levels
--nice-ops N stop when N nice bogo operations completed
--no-madvise don't use random madvise options for each mmap
--null N start N workers writing to /dev/null
--null-ops N stop when N /dev/null bogo write operations completed
-o, --open N start N workers exercising open/close
--open-ops N stop when N open/close bogo operations completed
-p N, --pipe N start N workers exercising pipe I/O
--pipe-ops N stop when N pipe I/O bogo operations completed
-P N, --poll N start N workers exercising zero timeout polling
--poll-ops N stop when N poll bogo operations completed
--procfs N start N workers reading portions of /proc
--procfs-ops N stop procfs workers after N bogo read operations
--pthread N start N workers that create multiple threads
--pthread-ops N stop pthread workers after N bogo threads created
--pthread-max P create P threads at a time by each worker
-Q, --qsort N start N workers exercising qsort on 32 bit random integers
--qsort-ops N stop when N qsort bogo operations completed
--qsort-size N number of 32 bit integers to sort
-q, --quiet quiet output
-r, --random N start N random workers
--rdrand N start N workers exercising rdrand instruction (x86 only)
--rdrand-ops N stop when N rdrand bogo operations completed
-R, --rename N start N workers exercising file renames
--rename-ops N stop when N rename bogo operations completed
--sched type set scheduler type
--sched-prio N set scheduler priority level N
--seek N start N workers performing random seek r/w IO
--seek-ops N stop when N seek bogo operations completed
--seek-size N length of file to do random I/O upon
--sem N start N workers doing semaphore operations
--sem-ops N stop when N semaphore bogo operations completed
--sem-procs N number of processes to start per worker
--sendfile N start N workers exercising sendfile
--sendfile-ops N stop after N bogo sendfile operations
--sendfile-size N size of data to be sent with sendfile
--sequential N run all stressors one by one, invoking N of them
--sigfd N start N workers reading signals via signalfd reads
--sigfd-ops N stop when N bogo signalfd reads completed
--sigfpe N start N workers generating floating point math faults
--sigfpe-ops N stop when N bogo floating point math faults completed
--sigsegv N start N workers generating segmentation faults
--sigsegv-ops N stop when N bogo segmentation faults completed
-S N, --sock N start N workers doing socket activity
--sock-ops N stop when N socket bogo operations completed
--sock-port P use socket ports P to P + number of workers - 1
--sock-domain D specify socket domain, default is ipv4
--stack N start N workers generating stack overflows
--stack-ops N stop when N bogo stack overflows completed
-s N, --switch N start N workers doing rapid context switches
--switch-ops N stop when N context switch bogo operations completed
--symlink N start N workers creating symbolic links
--symlink-ops N stop when N symbolic link bogo operations completed
--sysinfo N start N workers reading system information
--sysinfo-ops N stop when sysinfo bogo operations completed
-t N, --timeout N timeout after N seconds
-T N, --timer N start N workers producing timer events
--timer-ops N stop when N timer bogo events completed
--timer-freq F run timer(s) at F Hz, range 1000 to 1000000000
--tsearch start N workers that exercise a tree search
--tsearch-ops stop when N tree search bogo operations completed
--tsearch-size number of 32 bit integers to tsearch
--times show run time summary at end of the run
-u N, --urandom N start N workers reading /dev/urandom
--urandom-ops N stop when N urandom bogo read operations completed
--utime N start N workers updating file timestamps
--utime-ops N stop after N utime bogo operations completed
--utime-fsync force utime meta data sync to the file system
-v, --verbose verbose output
--verify verify results (not available on all tests)
-V, --version show version
-m N, --vm N start N workers spinning on anonymous mmap
--vm-bytes N allocate N bytes per vm worker (default 256MB)
--vm-hang N sleep N seconds before freeing memory
--vm-keep redirty memory instead of reallocating
--vm-ops N stop when N vm bogo operations completed
--vm-locked lock the pages of the mapped region into memory
--vm-method m specify stress vm method m, default is all
--vm-populate populate (prefault) page tables for a mapping
--wait N start N workers waiting on child being stop/resumed
--wait-ops N stop when N bogo wait operations completed
--zero N start N workers reading /dev/zero
--zero-ops N stop when N /dev/zero bogo read operations completed
Example: stress-ng --cpu 8 --io 4 --vm 2 --vm-bytes 128M --fork 4 --timeout 10s
Note: Sizes can be suffixed with B,K,M,G and times with s,m,h,d,y
Let's play with the stress image for a load test:
-m 256m caps the container at 256m of memory.
--vm 2 starts two memory workers; each allocates 256m by default, so together they want 512m while the container has only been given 256m.
[root@node1 ~]# docker run --name stress -it --rm -m 256m lorel/docker-stress-ng:latest stress --vm 2
stress-ng: info: [1] defaulting to a 86400 second run per stressor
stress-ng: info: [1] dispatching hogs: 2 vm
docker top stress shows what the stress container is consuming; there are 4 child processes.
[root@node1 ~]# docker top stress
UID PID PPID C STIME TTY TIME CMD
root 24334 24320 0 21:49 ? 00:00:00 /usr/bin/stress-ng stress --vm 2
root 24371 24334 0 21:49 ? 00:00:00 /usr/bin/stress-ng stress --vm 2
root 24372 24334 0 21:49 ? 00:00:00 /usr/bin/stress-ng stress --vm 2
root 24374 24372 26 21:49 ? 00:00:00 /usr/bin/stress-ng stress --vm 2
root 24379 24371 51 21:49 ? 00:00:00 /usr/bin/stress-ng stress --vm 2
Monitor container resource consumption with docker stats
The display refreshes in real time.
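For a one-shot snapshot instead of the live view:
# print a single sample and exit
docker stats --no-stream stress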
This confirms that a container's memory really can be capped: it was given only 256m, it wants 512m, and it simply can't have it, haha.
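To keep the workers inside the limit instead, the per-worker allocation can be shrunk with --vm-bytes (an untested variation on the command above):
# 2 workers x 128m = 256m total, which fits the -m 256m cap
docker run --name stress -it --rm -m 256m lorel/docker-stress-ng:latest stress --vm 2 --vm-bytes 128M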
Stress-testing the CPU with stress-ng
--cpus 1 caps the container at one full core.
--cpu 8 starts 8 CPU workers, but since the container is limited to one core's worth of time, total CPU utilization cannot exceed 100%.
[root@docker1 harbor]# docker container run --name stress2 --rm -it --cpus 1 lorel/docker-stress-ng --cpu 8
stress-ng: info: [1] defaulting to a 86400 second run per stressor
stress-ng: info: [1] dispatching hogs: 8 cpu
Check the containers' resource consumption with docker stats.
Without a core limit, the containers' CPU usage adds up to just under 200% (this host has two cores).
[root@docker1 harbor]# docker container run --name stress2 --rm -it lorel/docker-stress-ng --cpu 8
stress-ng: info: [1] defaulting to a 86400 second run per stressor
stress-ng: info: [1] dispatching hogs: 8 cpu
The CPU load is very high:
Pinning a container to specific cores
The docker2 machine has 4 cores, so without a CPU limit a single container can reach nearly 400% CPU.
--cpuset-cpus 0,1 confines the container to cores 0 and 1, so usage naturally tops out around 200%.
[root@docker2 ~]# docker container run --name stress2 --rm -it --cpuset-cpus 0,1 lorel/docker-stress-ng --cpu 8
stress-ng: info: [1] defaulting to a 86400 second run per stressor
stress-ng: info: [1] dispatching hogs: 8 cpu
--cpu-shares int CPU shares (relative weight)
Sets a relative weight for the container's share of CPU time.
Start the first container with a weight of 100 and watch its resource usage:
[root@docker2 ~]# docker container run --name stress2 --rm -it --cpu-shares 100 lorel/docker-stress-ng --cpu 8
stress-ng: info: [1] defaulting to a 86400 second run per stressor
stress-ng: info: [1] dispatching hogs: 8 cpu
Now start a second container with a CPU weight of 50:
[root@docker2 ~]# docker container run --name stress3 --rm -it --cpu-shares 50 lorel/docker-stress-ng --cpu 8
stress-ng: info: [1] defaulting to a 86400 second run per stressor
stress-ng: info: [1] dispatching hogs: 8 cpu
CPU usage settles at roughly a 2:1 ratio.
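A quick sanity check of that ratio (assuming both containers keep all four cores busy): total capacity is 400%, so stress2 gets 100/(100+50) = 2/3 ≈ 267% and stress3 gets 50/150 = 1/3 ≈ 133%. Note that --cpu-shares only takes effect under contention; while stress3 is idle, stress2 may still use the full 400%.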
Then my machine lagged unbearably until I killed the containers.
(to be continued)...