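stress is a small stress-testing tool for Linux.
I have seen people use this tool to reproduce resource-exhaustion scenarios, and others use it in chaos testing. Please keep in mind that it simulates system-level problems, not business-level ones, so whatever you simulate with it will still differ considerably from a real business scenario.
In performance work, people often misuse tools because they do not understand them, so I want to walk through this one here.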
Installing stress
yum install -y stress
stress options
[root@7DGroupT1 ~]# stress
`stress' imposes certain types of compute stress on your system

Usage: stress [OPTION [ARG]] ...
 -?, --help         show this help statement
     --version      show version statement
 -v, --verbose      be verbose
 -q, --quiet        be quiet
 -n, --dry-run      show what would have been done
 -t, --timeout N    timeout after N seconds
     --backoff N    wait factor of N microseconds before work starts
 -c, --cpu N        spawn N workers spinning on sqrt()
 -i, --io N         spawn N workers spinning on sync()
 -m, --vm N         spawn N workers spinning on malloc()/free()
     --vm-bytes B   malloc B bytes per vm worker (default is 256MB)
     --vm-stride B  touch a byte every B bytes (default is 4096)
     --vm-hang N    sleep N secs before free (default none, 0 is inf)
     --vm-keep      redirty memory instead of freeing and reallocating
 -d, --hdd N        spawn N workers spinning on write()/unlink()
     --hdd-bytes B  write B bytes per hdd worker (default is 1GB)

Example: stress --cpu 8 --io 4 --vm 2 --vm-bytes 128M --timeout 10s

Note: Numbers may be suffixed with s,m,h,d,y (time) or B,K,M,G (size).
[root@7DGroupT1 ~]#
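The options are about as simple as they get. A quick look shows that it can simulate consumption of the common, important resources: CPU, IO, memory, and disk.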
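The size options (--vm-bytes, --hdd-bytes) accept values such as 128M or 1G. As a rough illustration of what such a suffix means, here is a minimal sketch of a suffix parser. The function name parse_size is hypothetical and is not taken from the stress source, and the 1024-based multipliers are an assumption about how the suffixes are interpreted.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper, not part of stress: converts text such as "128M"
   or "1G" into a byte count, ASSUMING 1024-based multipliers.
   Returns -1 on malformed input. */
long long
parse_size (const char *text)
{
  char *end;
  long long value;

  assert (text != NULL);
  value = strtoll (text, &end, 10);
  if (end == text || value < 0)
    return -1;                  /* no digits, or a negative number */

  switch (*end)
    {
    case '\0':
      return value;             /* bare number: already bytes */
    case 'B': case 'b':
      break;
    case 'K': case 'k':
      value *= 1024LL;
      break;
    case 'M': case 'm':
      value *= 1024LL * 1024;
      break;
    case 'G': case 'g':
      value *= 1024LL * 1024 * 1024;
      break;
    default:
      return -1;                /* unknown suffix */
    }

  return end[1] == '\0' ? value : -1;   /* reject trailing characters */
}
```

Under these assumptions, parse_size ("128M") yields 134217728, which is the per-worker allocation a command like stress --vm 2 --vm-bytes 128M would request.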
Let's go through them one by one.
Simulating CPU
[root@7DGroupT1 ~]# stress -c 4 -t 100
(top output: uptime, load average, and task counts; four per-CPU lines with us, sy, ni, id, wa, hi, si, st; and the memory and swap summary. The numeric values are not preserved here.)
The option for simulating CPU load is minimal. Let's print the stack of one worker:
[root@s6 ~]# pstack 29253
#0  0x00007f123634761b in random () from /usr/lib64/libc.so.6
#1  0x00007f1236347b39 in rand () from /usr/lib64/libc.so.6
#2  0x0000557e9ea32dbd in hogcpu ()
#3  0x0000557e9ea3180a in main ()
[root@s6 ~]#
The code is very simple: a single hogcpu function. Here is the source:
int
hogcpu (void)
{
  while (1)
    sqrt (rand ());
  return 0;
}
Having seen it, don't you feel you could write one yourself? It's just a while loop, isn't it?
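To make the "write one yourself" point concrete, here is a minimal sketch of a bounded CPU hog that stops after a given time instead of looping forever. The name burn_cpu is hypothetical. It also swaps sqrt for an integer XOR so the sketch links without the math library, so it is not a copy of what stress does.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <time.h>

/* Hypothetical bounded variant of a CPU hog: spins for roughly `seconds`
   of wall-clock time and returns the number of loop iterations. */
unsigned long
burn_cpu (double seconds)
{
  struct timespec start, now;
  volatile unsigned int sink = 0;   /* volatile keeps the work from being optimized away */
  unsigned long iterations = 0;

  assert (seconds >= 0);
  clock_gettime (CLOCK_MONOTONIC, &start);
  do
    {
      sink ^= (unsigned int) rand ();   /* integer work instead of sqrt () */
      iterations++;
      clock_gettime (CLOCK_MONOTONIC, &now);
    }
  while ((now.tv_sec - start.tv_sec)
         + (now.tv_nsec - start.tv_nsec) / 1e9 < seconds);

  return iterations;
}
```

Calling burn_cpu (100.0) from one process per core would be roughly comparable in spirit to stress -c 4 -t 100 on a four-core machine, although the instruction mix differs.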
Simulating memory
[root@7DGroupT1 ~]# stress --vm 30 --vm-bytes 1G --vm-hang 50 --timeout 50s
(Output of vmstat; of sar paging statistics with the columns pgpgin/s, pgpgout/s, fault/s, majflt/s, pgfree/s, pgscank/s, pgscand/s, pgsteal/s, %vmeff; and of sar -r with the columns kbmemfree, kbmemused, %memused, kbbuffers, kbcached, kbcommit, %commit, kbactive, kbinact, kbdirty. The numeric values are not preserved here.)
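The data above shows that a large number of page faults were indeed generated, which is bound to happen when simulating memory consumption. As I have stressed before, to tell whether memory is sufficient, look at these faults.
So how does stress simulate memory consumption? Let's take a look.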
int
hogvm (long long bytes, long long stride, long long hang, int keep)
{
  long long i;
  char *ptr = 0;
  char c;
  int do_malloc = 1;
  while (1)
    {
      if (do_malloc)
        {
          dbg (stdout, "allocating %lli bytes ...\n", bytes);
          if (!(ptr = (char *) malloc (bytes * sizeof (char))))
            {
              err (stderr, "hogvm malloc failed: %s\n", strerror (errno));
              return 1;
            }
          if (keep)
            do_malloc = 0;
        }
      dbg (stdout, "touching bytes in strides of %lli bytes ...\n", stride);
      for (i = 0; i < bytes; i += stride)
        ptr[i] = 'Z';           /* Ensure that COW happens. */
      if (hang == 0)
        {
          dbg (stdout, "sleeping forever with allocated memory\n");
          while (1)
            sleep (1024);
        }
      else if (hang > 0)
        {
          dbg (stdout, "sleeping for %llis with allocated memory\n", hang);
          sleep (hang);
        }
      for (i = 0; i < bytes; i += stride)
        {
          c = ptr[i];
          if (c != 'Z')
            {
              err (stderr, "memory corruption at: %p\n", ptr + i);
              return 1;
            }
        }
      if (do_malloc)
        {
          free (ptr);
          dbg (stdout, "freed %lli bytes\n", bytes);
        }
    }
  return 0;
}
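It is just an infinite loop wrapped around a malloc, plus a write to one byte per stride so that the allocated pages are actually touched.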
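The link between touching memory and the page faults seen in the monitoring data can be sketched in a few lines. The helper below is hypothetical (touch_memory is not a stress function): it performs one allocate-and-touch pass and optionally reports how many minor page faults the process took while doing so, using getrusage. The exact fault count depends on the allocator and on kernel settings such as transparent huge pages, so treat it as an observation aid rather than a precise measure.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/resource.h>

/* Hypothetical helper: allocates `bytes`, writes one byte every `stride`
   bytes, and returns how many bytes were written (or -1 if malloc fails).
   If `faults` is non-NULL it receives the number of minor page faults the
   process took while touching the buffer. */
long
touch_memory (size_t bytes, size_t stride, long *faults)
{
  struct rusage before, after;
  volatile char *ptr;           /* volatile so the stores are not optimized away */
  long touched = 0;
  size_t i;

  assert (stride > 0);
  ptr = malloc (bytes);
  if (ptr == NULL)
    return -1;

  getrusage (RUSAGE_SELF, &before);
  for (i = 0; i < bytes; i += stride)
    {
      ptr[i] = 'Z';             /* the write forces the kernel to back the page */
      touched++;
    }
  getrusage (RUSAGE_SELF, &after);

  if (faults != NULL)
    *faults = after.ru_minflt - before.ru_minflt;

  free ((void *) ptr);
  return touched;
}
```

With a stride of 4096, one byte is written per 4 KiB, so a 1 MiB buffer gets 256 writes. Passing a pointer for faults on a larger buffer gives a rough view of the same fault activity that sar reports system-wide.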
Simulating disk
[root@7DGroupT1 ~]# stress --hdd 5 --hdd-bytes 1G
(Output of top; vmstat; sar -d with the columns DEV, tps, rd_sec/s, wr_sec/s, avgrq-sz, avgqu-sz, await, svctm, %util for dev253 and dev11; iostat -x -d for vda and scd0; and iotop, which lists the five stress --hdd workers first, followed by kernel threads such as kworker and jbd2 and other system processes. The numeric values are not preserved here.)
The disk simulation looks quite effective too. Let's dig through the source.
int
hoghdd (long long bytes)
{
  long long i, j;
  int fd;
  int chunk = (1024 * 1024) - 1;        /* Minimize slow writing. */
  char buff[chunk];
  /* Initialize buffer with some random ASCII data. */
  dbg (stdout, "seeding %d byte buffer with random data\n", chunk);
  for (i = 0; i < chunk - 1; i++)
    {
      j = rand ();
      j = (j < 0) ? -j : j;
      j %= 95;
      j += 32;
      buff[i] = j;
    }
  buff[i] = '\n';
  while (1)
    {
      char name[] = "./stress.XXXXXX";
      if ((fd = mkstemp (name)) == -1)
        {
          err (stderr, "mkstemp failed: %s\n", strerror (errno));
          return 1;
        }
      dbg (stdout, "opened %s for writing %lli bytes\n", name, bytes);
      dbg (stdout, "unlinking %s\n", name);
      if (unlink (name) == -1)
        {
          err (stderr, "unlink of %s failed: %s\n", name, strerror (errno));
          return 1;
        }
      dbg (stdout, "fast writing to %s\n", name);
      for (j = 0; bytes == 0 || j + chunk < bytes; j += chunk)
        {
          if (write (fd, buff, chunk) == -1)
            {
              err (stderr, "write failed: %s\n", strerror (errno));
              return 1;
            }
        }
      dbg (stdout, "slow writing to %s\n", name);
      for (; bytes == 0 || j < bytes - 1; j++)
        {
          if (write (fd, &buff[j % chunk], 1) == -1)
            {
              err (stderr, "write failed: %s\n", strerror (errno));
              return 1;
            }
        }
      if (write (fd, "\n", 1) == -1)
        {
          err (stderr, "write failed: %s\n", strerror (errno));
          return 1;
        }
      ++j;
      dbg (stdout, "closing %s after %lli bytes\n", name, j);
      close (fd);
    }
  return 0;
}
An infinite loop plus for loops that keep calling write. All the call does is write, over and over, which matches the monitoring data we saw above.
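The core of that loop (create a temporary file, unlink it, then keep writing) can be reduced to a single pass. The sketch below is a simplification under stated assumptions, not the stress implementation: write_once is a hypothetical name, the /tmp path is assumed to be writable, and the remainder is written as one partial chunk rather than byte by byte.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define CHUNK (1024 * 1024)

/* Hypothetical single-pass variant of the write loop: creates a temporary
   file, unlinks it immediately, writes `bytes` bytes in chunks, and returns
   the number of bytes written (or -1 on error). */
long long
write_once (long long bytes)
{
  static char buff[CHUNK];      /* static: keeps 1 MiB off the stack */
  char name[] = "/tmp/stress-sketch.XXXXXX";    /* assumed writable location */
  long long written = 0;
  int fd;

  assert (bytes >= 0);
  memset (buff, 'Z', sizeof buff);

  if ((fd = mkstemp (name)) == -1)
    return -1;
  unlink (name);                /* the data stays reachable until fd is closed */

  while (written < bytes)
    {
      size_t want = bytes - written < CHUNK
                    ? (size_t) (bytes - written) : CHUNK;
      ssize_t got = write (fd, buff, want);
      if (got == -1)
        {
          close (fd);
          return -1;
        }
      written += got;           /* a short write simply loops again */
    }

  close (fd);                   /* closing releases the unlinked file's space */
  return written;
}
```

Because the file is unlinked right after creation, nothing is left behind on disk once the descriptor is closed, which is the same trick the original code relies on. Wrapping write_once in an endless loop gives the sustained write load seen in iostat and iotop.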
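To sum up, the point of walking through this source code is a caution about usage: if you simply want to consume system-level resources and then observe how your application behaves with fewer resources available, a tool like this is fine.
But if you want to simulate problems that arise in your business layer, I'd advise you not to use a tool like this.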