Linux CPU Load Measurement Formula
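Isn't a single top command enough? At most, add a few pipes to filter the output. That is what I thought at first, too. But there is more to understand.
The first question: do we want the CPU load at a point in time, or over a period of time?
For a line chart, the x-axis is usually a series of time points, so ideally we would measure the CPU load at each point in time. That is very hard to do. The easier approach is to measure the CPU load between two time points, that is, over an interval. For a benchmark, make the interval small: 1 second or even less. For routine monitoring, widen it to 1 minute or more.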
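The second question: what do we use to judge the CPU load over an interval?
The kernel keeps CPU time in a basic unit called the jiffy. It is a very short span of time, and its exact length depends on the architecture and kernel configuration. That does not matter much here: a load percentage is a ratio of two tick counts, so the unit cancels out, and for my purpose of computing what percentage the load has reached it is good enough.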
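When absolute times are needed rather than percentages, the tick rate has to be known. The following is a minimal sketch, assuming a Unix-like system where Python's `os.sysconf("SC_CLK_TCK")` is available; it returns the clock-tick rate that user space sees (USER_HZ, commonly 100). The helper names `clock_ticks_per_second` and `ticks_to_seconds` are made up for illustration.

```python
import os


def clock_ticks_per_second() -> int:
    """Return the clock-tick rate visible to user space (USER_HZ)."""
    return os.sysconf("SC_CLK_TCK")


def ticks_to_seconds(ticks: int, hz: int) -> float:
    """Convert a tick counter to seconds, given the tick rate."""
    return ticks / hz


if __name__ == "__main__":
    hz = clock_ticks_per_second()
    # 1362393 is just a sample counter value, not a measurement.
    print(f"USER_HZ = {hz}")
    print(f"1362393 ticks = {ticks_to_seconds(1362393, hz):.2f} s")
```

For percentage calculations this conversion can be skipped entirely, since numerator and denominator are in the same unit.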
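The article at http://www.linuxhowtos.org/System/procstat.htm describes the /proc/stat file. The points worth noting are:
1. The values on the first line, cpu, are the sums of the values on the per-CPU lines below it.
2. The meaning of each of the 7 numbers on a line: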
/proc/stat
    kernel/system statistics. Varies with architecture. Common entries include:

    cpu 3357 0 4313 1362393
        The amount of time, measured in units of USER_HZ (1/100ths of a second on most architectures, use sysconf(_SC_CLK_TCK) to obtain the right value), that the system spent in user mode, user mode with low priority (nice), system mode, and the idle task, respectively. The last value should be USER_HZ times the second entry in the uptime pseudo-file.

        In Linux 2.6 this line includes three additional columns: iowait - time waiting for I/O to complete (since 2.5.41); irq - time servicing interrupts (since 2.6.0-test4); softirq - time servicing softirqs (since 2.6.0-test4).

        Since Linux 2.6.11, there is an eighth column, steal - stolen time, which is the time spent in other operating systems when running in a virtualized environment.

        Since Linux 2.6.24, there is a ninth column, guest, which is the time spent running a virtual CPU for guest operating systems under the control of the Linux kernel.
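The 8th number is the time stolen by other operating systems when running in a virtualized environment.
The 9th number applies when the machine is a host: it is the time used by the guest VMs it runs.
This information is useful as well. After all, many servers these days are really just VMs.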
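Putting the pieces together, the load over an interval can be computed from two readings of the cpu line: take the difference of each counter, then divide the non-idle part by the total. The following is a rough sketch, not a definitive implementation. The function names are made up for illustration, and two choices are my own assumptions: iowait is counted as idle time, and guest is left out of the total because current kernels also count guest time in user.

```python
import time

FIELDS = ("user", "nice", "system", "idle", "iowait",
          "irq", "softirq", "steal", "guest")


def parse_cpu_line(line: str) -> dict:
    """Parse one 'cpu ...' line of /proc/stat into named tick counters."""
    values = [int(v) for v in line.split()[1:]]
    # Older kernels print fewer columns: missing ones stay 0.
    # Newer kernels may print more: zip() ignores the extras.
    counters = dict.fromkeys(FIELDS, 0)
    counters.update(zip(FIELDS, values))
    return counters


def busy_percent(before: dict, after: dict) -> float:
    """CPU load in percent over the interval between two samples."""
    def total(c: dict) -> int:
        # Assumption: guest is already included in user, so skip it.
        return sum(c[f] for f in FIELDS if f != "guest")

    def idle(c: dict) -> int:
        # Assumption: time spent waiting for I/O counts as idle.
        return c["idle"] + c["iowait"]

    d_total = total(after) - total(before)
    d_idle = idle(after) - idle(before)
    if d_total <= 0:
        return 0.0  # no ticks elapsed between the samples
    return 100.0 * (d_total - d_idle) / d_total


def read_total_cpu(path: str = "/proc/stat") -> dict:
    """Read the aggregate 'cpu' line (the first line of the file)."""
    with open(path) as f:
        return parse_cpu_line(f.readline())


if __name__ == "__main__":
    first = read_total_cpu()
    time.sleep(1.0)  # 1 s for a benchmark; use 60 s or more for monitoring
    second = read_total_cpu()
    print(f"CPU load: {busy_percent(first, second):.1f}%")
```

On a VM, the steal delta can be reported separately in the same way, which shows how much time the hypervisor gave to other guests during the interval.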