SOS Global Summary
The Global Summary Screen
The sos Global Summary screen provides a summary of activity system-wide:
Product version and collection interval information
Key indicators of performance data
Global statistics
CPU utilization statistics
CPU miscellaneous statistics
Memory and virtual memory statistics
Miscellaneous statistics
Disk statistics
Process statistics
Workload statistics
System performance advice
The Global Summary screen is the first screen to display when you start sos and the usual
starting point for any review of system activity and performance. The screen can be displayed in
either graphical or tabular format.
To access the Global Summary screen from any sos display screen:
Type s from the sos Enter command: prompt to view the Screen Selection Menu screen.
From the Screen Selection Menu screen, enter g (Global Summary). The Global Summary screen will display.
Type t from the Global Summary screen to toggle between the graphical and tabular formats.
Graphical Format
Figure 9.1 shows an example of the Global Summary screen in graphical format.
Figure 9.1 SOS Global Summary screen (graphical format)
This example screen contains the following components:
The SOS banner
The Key Indicators of Performance (KIP) line (optional)
GLOBAL statistics
PROCESS SUMMARY (optional)
WORKLOAD SUMMARY (optional)
SYSTEM PERFORMANCE ADVICE messages (optional)
Tabular Format
To toggle between the graphical and tabular format options, press the t key from the Global
Summary screen.
Figure 9.2 shows an example of the Global Summary screen in tabular format.
Figure 9.2 SOS Global Summary screen (tabular format)
This example screen contains the following components:
The SOS banner
CPU UTILIZATION statistics (including cumulative statistics)
CPU MISC statistics
MEM/VM statistics (optional)
MISC global statistics (optional)
DISK statistics (optional)
PROCESS SUMMARY (optional)
System Performance Advice messages (optional)
Global Summary Screen Display Items
SOS Banner
The SOS banner is always displayed at the top of all SOS data display screens.
Figure 9.3 SOS Global Summary screen: SOS banner
The banner contains information about the SOS program, the host system, the elapsed interval,
and the current interval.
Product Version Number (SOS V.nna)
The first item displayed in the SOS banner (reading left to right) is the product version number
(SOS V.nna). The version number denotes the following about the product:
SOS is the name of the product.
V denotes the major version level.
n denotes the minor version level.
a denotes the fix level.
The SOS version number displayed in the example (refer to Figure 9.3) is B.02c. When contacting
technical support, please provide the product version number of the software installed on your
system.
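If the version string needs to be recorded or compared in a script (for example, before contacting technical support), it can be split into its three parts. The following is a minimal sketch, not part of SOS. It assumes the major level is one uppercase letter, the minor level is numeric, and the fix level is one lowercase letter, as in the B.02c example. The names parse_sos_version and SosVersion are hypothetical.

```python
import re
from typing import NamedTuple


class SosVersion(NamedTuple):
    major: str  # "V" in V.nna (major version level)
    minor: int  # "nn" in V.nna (minor version level)
    fix: str    # "a" in V.nna (fix level)


# Assumed layout, inferred from the example "B.02c":
# one uppercase letter, a dot, digits, one lowercase letter.
_VERSION_RE = re.compile(r"^([A-Z])\.(\d+)([a-z])$")


def parse_sos_version(text: str) -> SosVersion:
    """Split a V.nna version string into major, minor, and fix levels."""
    match = _VERSION_RE.match(text.strip())
    if match is None:
        raise ValueError(f"not a V.nna version string: {text!r}")
    major, minor, fix = match.groups()
    return SosVersion(major, int(minor), fix)


print(parse_sos_version("B.02c"))
```

Because the result is a tuple, two versions with the same major letter can be compared directly, under the assumption that fix levels advance alphabetically.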
System Name
The second item displayed in the SOS banner is the name of the system given during the
installation of the operating system. The name of the system used in the example shown in
Figure 9.3 is "spot."
Current Date and Time (DDD, DD MMM YYYY, HH:MM)
The third item in the SOS banner is the current date and time:
DDD denotes the day of the week.
DD denotes the day of the month.
MMM denotes the month.
YYYY denotes the year.
HH:MM denotes the hour and minutes.
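When banner text is captured to a file, the date and time field can be converted to a timestamp. The sketch below assumes English three-letter day and month abbreviations and exactly the comma-separated layout shown in the heading. The sample string is invented for illustration and is not taken from Figure 9.3.

```python
from datetime import datetime

# DDD, DD MMM YYYY, HH:MM expressed as a strptime pattern.
# Assumes English abbreviations (the default C locale).
BANNER_TIME_FORMAT = "%a, %d %b %Y, %H:%M"


def parse_banner_timestamp(text: str) -> datetime:
    """Convert the banner date and time field to a datetime object."""
    return datetime.strptime(text.strip(), BANNER_TIME_FORMAT)


# Hypothetical banner field, for illustration only.
print(parse_banner_timestamp("Tue, 05 Mar 2002, 14:30"))
```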
Elapsed Time (E: HH:MM:SS)
The fourth item displayed in the SOS banner is the elapsed time (E:HH:MM:SS), which is the time
counted in hours, minutes, and seconds that has passed since you started the current session of
SOS. This elapsed time measurement is especially valuable when viewing cumulative statistics.
For further information, refer to "Display cumulative stats".
To reset the elapsed time to zero, type r from any SOS display screen.
Current Interval (I: MM:SS)
The last item displayed in the SOS banner is the current interval (I: MM:SS). The current interval
is the amount of time in minutes and seconds accumulated since SOS last updated the screen.
The measurements reported on any SOS display screen are valid for the current interval.
By default, the interval refresh rate is 60 seconds. You can adjust this rate from the Main Options
Menu screen. For further information, refer to "Screen refresh interval in seconds".
Assuming the interval refresh rate is 60 seconds, the current interval displayed in the SOS banner
should be I: 01:00. However, if at some point during the measurement interval the program has to
wait for user input, the interval update will be delayed. For example, when the f key is pressed
from an SOS display screen to "freeze" the current interval, the next update is delayed until the
user enters the command to "unfreeze" the interval.
If the current interval displayed is less than the interval refresh rate, the user pressed the u key
from an SOS display screen to update the performance data mid-interval.
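The three cases described above (a normal interval, a delayed update, and a mid-interval update) can be told apart by comparing the banner's I: MM:SS field with the configured refresh rate. This is a minimal sketch. The function names and the returned labels are hypothetical, and the default of 60 seconds is the default refresh rate stated above.

```python
def parse_interval(field: str) -> int:
    """Convert an 'I: MM:SS' banner field to a number of seconds."""
    label, _, value = field.partition(":")
    if label.strip() != "I":
        raise ValueError(f"not a current-interval field: {field!r}")
    minutes, seconds = value.strip().split(":")
    return int(minutes) * 60 + int(seconds)


def describe_interval(field: str, refresh_seconds: int = 60) -> str:
    """Classify the current interval against the refresh rate."""
    length = parse_interval(field)
    if length < refresh_seconds:
        # The user pressed u to update the data mid-interval.
        return "mid-interval update"
    if length > refresh_seconds:
        # The update was delayed, for example by the f (freeze) key.
        return "delayed update"
    return "normal"


print(describe_interval("I: 01:00"))
print(describe_interval("I: 00:35"))
print(describe_interval("I: 02:10"))
```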
Current Interval Metrics vs. Cumulative Averages
The statistical values expressed in the format "nnn.n" represent measurements for the current
interval (I: MM:SS). The values in brackets, [nnn.n], represent cumulative averages for the
elapsed interval (E: HH:MM:SS).
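A script that reads captured screen text can separate the two kinds of values by looking for the brackets. The sketch below assumes that a current-interval value is followed directly by its cumulative average in square brackets. The sample field is invented, and the function name is hypothetical.

```python
import re
from typing import Tuple

# A number, optional spaces, then a bracketed number: nnn.n [nnn.n]
_PAIR_RE = re.compile(r"(\d+(?:\.\d+)?)\s*\[\s*(\d+(?:\.\d+)?)\s*\]")


def parse_current_and_cumulative(field: str) -> Tuple[float, float]:
    """Return (current interval value, cumulative average) from one field."""
    match = _PAIR_RE.search(field)
    if match is None:
        raise ValueError(f"no 'nnn.n [nnn.n]' pair found in {field!r}")
    return float(match.group(1)), float(match.group(2))


# Hypothetical field text, for illustration only.
current, cumulative = parse_current_and_cumulative("12.5[ 10.3]")
print(current, cumulative)
```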
Key Indicators of Performance (KIP) Line
The Key Indicators of Performance (KIP) line can be displayed just below the SOS banner. This
line appears when the Display Key Indicators of Performance option is enabled from the SOS
Main Options Menu screen.
Figure 9.4 SOS Global Summary screen: Key Indicators of Performance (KIP) line
The purpose of the KIP line is to display statistics associated with the primary indicators of
system performance.
Total Busy
The Total Busy value displayed in the KIP line is the percentage of time the CPU spent executing
the following activities instead of being in a pause or idle state:
Processing user and system process code.
Processing interrupts.
Processing context switches.
Managing main memory.
Managing traps.
Avg Pg Res Time
The Avg Pg Res Time value displayed in the KIP line is the average response time in seconds of
the corresponding process during the current interval.
Disk Serv Time
The Disk Serv Time value displayed in the KIP line is the average number of milliseconds an I/O
request takes to be serviced once it begins to be processed by the disk (removed from the disk
queue). This value does not include wait time.
NOTE By editing the soskip text file located in the /etc/opt/lps/cfg directory, you can redefine the variables to display in the KIP line. For information about editing the soskip file, see "SOS soskip File".
GLOBAL
The GLOBAL statistics portion of the Global Summary screen contains a simple bar graph that
summarizes activity levels system-wide.
GLOBAL (Left Column)
CPU%
The CPU% bar graph (the left portion of the GLOBAL statistics) shows the percentage of CPU
time expended during the current measurement interval on various activities.
Figure 9.5 SOS Global Summary screen: GLOBAL (left column)
Each letter-width space on the CPU% bar graph represents approximately 2 percent of the CPU's
time for the current interval. The code letters correspond to the CPU activities described in
Table 9.1. Where a block of spaces on the bar graph is bordered by two instances of one code letter
(e.g., S...S), the corresponding activity (e.g., executing system calls and code) accounts for
the CPU% range bordered by the two letters.
For example, the CPU% bar shown in Figure 9.5 indicates the following:
Approximately 4 percent of CPU time in the current interval was spent executing user code.
Approximately 4 percent of CPU time in the current interval was spent executing system calls and code (in kernel mode).
Approximately 6 percent of CPU time in the current interval was spent waiting for disk I/Os to complete.
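The reading rule for the CPU% bar can be expressed as a short calculation: the width of each code letter's block, measured from its first to its last occurrence, multiplied by 2 percent. The sketch below is illustrative only. The bar string is a hypothetical reconstruction chosen to match the percentages in the example, not a copy of Figure 9.5, and the function assumes that each code letter forms one contiguous block.

```python
def decode_cpu_bar(bar: str, percent_per_column: int = 2) -> dict:
    """Estimate the CPU percentage for each code letter in a CPU% bar.

    A block is measured from the first to the last occurrence of its
    code letter, so "S...S" counts the spaces between the two letters.
    """
    first = {}
    last = {}
    for column, char in enumerate(bar):
        if char.isalpha():
            first.setdefault(char, column)  # remember the left border
            last[char] = column             # keep updating the right border
    return {
        code: (last[code] - first[code] + 1) * percent_per_column
        for code in first
    }


# Hypothetical bar: 2 columns of U, 2 columns of S, 3 columns of W.
print(decode_cpu_bar("UUSSW W"))
```

The results are approximate by nature, because each column stands for roughly 2 percent.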
The code letters used in the CPU% bar graph are described in Table 9.1.
Table 9.1 CPU% states or activities
| Code | Statistic | Description |
|------|-----------|-------------|
| W | Wait | The amount of idle time the CPU spent waiting for a disk I/O to complete. |
| U | User Mode | The percentage of CPU time spent executing user program code with a nice value of 20 and without any special priority. |
| S | System | The percentage of CPU time spent executing system calls and code (in kernel mode). This does not include time spent performing context switches or idle time. |
Pg Res
The Pg Res value represents the average page residence time in milliseconds. This is the
average amount of time a page is able to reside in memory. The default Pg Res Tm value is
600.0, which means that pages are not being forced out of memory. Values less than 600.0 mean
that pages are being forced from memory.
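The rule above reduces to a single comparison against the default value. The following sketch only restates that rule. The constant and function names are hypothetical.

```python
# Default Pg Res value stated in the text: pages are not being forced out.
PG_RES_DEFAULT = 600.0


def pages_forced_out(pg_res: float) -> bool:
    """Return True when the Pg Res value indicates pages are being forced from memory."""
    return pg_res < PG_RES_DEFAULT


print(pages_forced_out(600.0))
print(pages_forced_out(412.7))
```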
IO/s
The IO/s bar represents the disk I/O rate. This is the number of physical reads and writes per
second for each type of physical I/O. As with the CPU% bar (see "CPU%"), specific code
letters in the bar graph tell you how many of each type of physical I/O were accumulated in the
current interval. Each of these code letters is listed and described in Table 9.2.
Table 9.2 Physical I/Os
| Code | Physical I/O | Description |
|------|--------------|-------------|
| R | Physical Reads | The number of physical reads per second. |
| W | Physical Writes | The number of physical writes per second. |
The example screen shown in Figure 9.5 shows that 12 physical writes were accumulated during
the current interval.
GLOBAL (Right Column)
The scale for the next four global statistics ranges from 2 to 20. A value greater than 20 is
represented by a trailing greater-than character (>).
Figure 9.6 SOS Global Summary screen: GLOBAL (right column)
RunQ Len
The RunQ Len bar represents the average number of processes in the CPU run queue during the
current interval.
Swap Out/s
The Swap Out/s bar represents the number of processes swapped out per second.
I/O QLen
The I/O QLen bar represents the average number of disk I/O requests pending for all disks during
the current interval.
PROCESS SUMMARY
After reviewing the general state of global resources, the next logical step in analyzing a system’s
performance is to observe individual processes. It is important to find out which users are running
which programs and what kinds of resources those programs are consuming. The primary
purpose of the PROCESS SUMMARY portion of the Global Summary screen is to help you to
identify key resources consumed by various processes on the system.
To examine the CPU usage, disk I/O usage, and wait state information for a process, open the
Process Detail screen. For further information, see "SOS Process Detail".
PROCESS SUMMARY Display Options
The PROCESS SUMMARY section is included in the Global Summary screen by default when
the SOS program is started. However, this information can be suppressed. For instructions, refer
to "Display process information".
You can configure the PROCESS SUMMARY display in the following ways:
Display or suppress the extended process line.
Display either the total and I/O percentages or the read and write counts.
Display all processes or only the active processes.
Display or suppress attached processes.
Display or suppress detached processes.
Display or suppress system processes.
Display or suppress processes that have died.
Apply a process logon filter.
Apply a process sort option.
Display sorted processes in either ascending or descending order.
Set a maximum number of processes to display.
PROCESS SUMMARY Data Items
Figure 9.7 SOS Global Summary screen: PROCESS SUMMARY
The contents of each PROCESS SUMMARY column (shown in Figure 9.7) are described in this
section.
PID
The PID is the process identification number, which uniquely identifies each process running on
the system.
Name
The name of each process running on the system is listed in the Name column.
User Name
The User Name column provides the name of the user that owns (or creates) each process
running on the system.
TTY
"TTY" is defined in SOS as the special device file of the terminal to which the process is attached.
The TTY column will show three dashes (---) for processes that are not attached to a terminal
(processes such as daemons and batch jobs).
CPU%
The CPU% column shows the percentage of CPU time that was used by each process during the
current interval.
Nice
The Nice column displays the nice value associated with each process. This value, ranging from 0
to 39 (the default is 20), is a determining factor when a process’s priority is recalculated.
A process with a larger nice value will receive a higher priority number (resulting in a lower-priority status).
A process with a smaller nice value will receive a lower priority number (resulting in a higher-priority status).
A process that slows system response time can be "niced" to lower its priority and allow other
processes to be executed more quickly.
Pri
The Pri column shows the most-recent priority that each process was given.
As explained earlier, high priority numbers indicate low-priority status, and vice versa. The priority
numbers between 0 and 127 indicate high-priority status and are reserved for certain system
daemons or real-time processes. The majority of processes are given numbers between 128 and
255, which indicate timeshare-priority status. A typical timeshare process will fluctuate within this
priority range, based on the process’s CPU demands and the system’s load. Processes executing
at nice priorities typically have larger numbers (lower priorities).
The system scheduler dynamically sets the priority by considering several factors, such as CPU
utilization. Because the scheduler tries to allocate CPU time fairly among the processes, it will
lower the scheduling priority of processes that require a lot of CPU time. This means that as a
process’s CPU usage grows, its priority number in the Pri column will increase.
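The priority ranges described in this section can be summarized as a simple lookup. The sketch below uses only the ranges given above (0 to 127 for high-priority status, 128 to 255 for timeshare-priority status). The function names and returned labels are hypothetical.

```python
def classify_priority(pri: int) -> str:
    """Map a Pri column value to the status range described in the text."""
    if not 0 <= pri <= 255:
        raise ValueError(f"priority out of the documented 0-255 range: {pri}")
    if pri <= 127:
        # Reserved for certain system daemons or real-time processes.
        return "high-priority"
    return "timeshare"


def is_more_favored(pri_a: int, pri_b: int) -> bool:
    """Return True when pri_a has higher-priority status than pri_b.

    Lower priority numbers indicate higher-priority status.
    """
    return pri_a < pri_b


print(classify_priority(100))
print(classify_priority(178))
print(is_more_favored(154, 178))
```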
RSS/Size
The RSS/Size column presents two data items for each process running on the system. The RSS
value represents the resident set size, which is the amount of RAM used by the process. The Size
value represents the size in kilobytes of the core image of the process. This includes text, data,
and stack space. In other words, it is the amount of swap or virtual memory the process has reserved.
Performance Tip
Large values in the RSS/Size column indicate that the corresponding process uses a
lot of memory. Processes in this category may need to be checked for memory
usage problems.
#Rd
The #Rd column lists the number of physical reads performed by each process during the current
interval.
#Wr
The #Wr column shows the number of physical writes performed by each process during the
current interval.
Performance Tip
The #Wr values are important because they can point to processes that are
performing excessive disk I/Os. To confirm, check the SYSTEM PERFORMANCE
ADVICE portion of the Global Summary screen for a message that reports the
high I/O process for the current interval. When high #Rd and #Wr values are
evident, determine whether the I/Os are necessary or unnecessary.
State
The State column in the PROCESS SUMMARY portion of the Global Summary screen shows
which wait state the corresponding process was in at the end of the current interval. Each wait
state is described in the appendix, "SOS/SOLARIS Wait States".
Performance Tip
Wait state information is helpful when you want to determine why a process is
"stuck." Keep in mind, however, that the wait state of a process can change
radically in a matter of seconds. If you suspect a problem, check the
information provided for that process in the Process Detail screen.
Extended Process Statistics Lines
The PROCESS SUMMARY portion of the Global Summary screen can be expanded to show the
percentage of time each process spent in one or more wait states during the current interval. This
additional process information is displayed below each corresponding process statistics line in an
extended process line.
The extended process lines together with the extended process headings line can be enabled
from the Process Display Options submenu of the SOS Main Options Menu or by typing the y key
from the Global Summary screen (toggles the extended process lines on and off).
The statistics in the extended process lines correspond with the column headings in the extended
process headings line. Each column heading is described in Table 9.3.
Table 9.3 Extended process column headings
| Heading | Description |
|---------|-------------|
| PRI | The percentage of the process’s time spent in the PRI state during the current interval. See "Wait State Descriptions". |
| TPG | The percentage of the process’s time spent in the TPG state during the current interval. See "Wait State Descriptions". |
| DPG | The percentage of the process’s time spent in the DPG state during the current interval. See "Wait State Descriptions". |
| KPG | The percentage of the process’s time spent in the KPG state during the current interval. See "Wait State Descriptions". |
| ULCK | The percentage of the process’s time spent in the ULCK state during the current interval. See "Wait State Descriptions". |
| JOB | The percentage of the process’s time spent in the JOB state during the current interval. See "Wait State Descriptions". |
| OTH | The percentage of the process’s time spent in the OTH state during the current interval. See "Wait State Descriptions". |
Gr |