
Lund Performance Solutions


SOS Global Summary

The Global Summary Screen

The sos Global Summary screen provides a summary of activity system-wide:
  • Product version and collection interval information
  • Key indicators of performance data
  • Global statistics
  • CPU utilization statistics
  • CPU miscellaneous statistics
  • Memory and virtual memory statistics
  • Miscellaneous statistics
  • Disk statistics
  • Process statistics
  • Workload statistics
  • System performance advice
    The Global Summary screen is the first screen to display when you start sos and the usual starting point for any review of system activity and performance. The screen can be displayed in either graphical or tabular format.
    To access the Global Summary screen from any sos display screen:
  • Type s from the sos Enter command: prompt to view the Screen Selection Menu screen.
  • From the Screen Selection Menu screen, enter g (Global Summary). The Global Summary screen will display.
  • Type t from the Global Summary screen to toggle between the graphical and tabular formats.

    Graphical Format

    Figure 9.1 shows an example of the Global Summary screen in graphical format.


    Figure 9.1 SOS Global Summary screen (graphical format)
    This example screen contains the following components:
  • The SOS banner
  • The Key Indicators of Performance (KIP) line (optional)
  • GLOBAL statistics
  • PROCESS SUMMARY (optional)
  • WORKLOAD SUMMARY (optional)
  • SYSTEM PERFORMANCE ADVICE messages (optional)
    Each of these components is described in "Global Summary Screen Display Items".

    Tabular Format

    To toggle between the graphical and tabular format options, press the t key from the Global Summary screen. Figure 9.2 shows an example of the Global Summary screen in tabular format.


    Figure 9.2 SOS Global Summary screen (tabular format)
    This example screen contains the following components:
  • The SOS banner
  • CPU UTILIZATION statistics (including cumulative statistics)
  • CPU MISC statistics
  • MEM/VM statistics (optional)
  • MISC global statistics (optional)
  • DISK statistics (optional)
  • PROCESS SUMMARY (optional)
  • System Performance Advice messages (optional)
    Each of these components is described in detail in "Global Summary Screen Display Items".

    Global Summary Screen Display Items

    SOS Banner

    The SOS banner is always displayed at the top of all SOS data display screens.


    Figure 9.3 SOS Global Summary screen: SOS banner
    The banner contains information about the SOS program, the host system, the elapsed interval, and the current interval.
    Product Version Number (SOS V.nna)
    The first item displayed in the SOS banner (reading left to right) is the product version number (SOS V.nna). The version number denotes the following about the product:
  • SOS is the name of the product.
  • V denotes the major version level.
  • n denotes the minor version level.
  • a denotes the fix level.
    The SOS version number displayed in the example (refer to Figure 9.3) is B.02c. When contacting technical support, please provide the product version number of the software installed on your system.
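    The V.nna layout described above can be split into its parts with a short script, for example before including the version in a support request. The following is a minimal sketch, not part of SOS itself: the names SosVersion and parse_sos_version are hypothetical, and the pattern assumes one letter, a dot, two digits, and one letter, as in the B.02c example.

```python
import re
from typing import NamedTuple


class SosVersion(NamedTuple):
    major: str  # "V": major version level (a letter such as "B")
    minor: int  # "nn": minor version level (digits such as "02")
    fix: str    # "a": fix level (a letter such as "c")


# Assumed layout, taken from the B.02c example: letter, dot, two digits, letter.
_VERSION_RE = re.compile(r"^([A-Za-z])\.(\d{2})([A-Za-z])$")


def parse_sos_version(text: str) -> SosVersion:
    """Split a version string in V.nna form into major, minor, and fix levels."""
    match = _VERSION_RE.match(text.strip())
    if match is None:
        raise ValueError(f"not in V.nna form: {text!r}")
    major, minor, fix = match.groups()
    return SosVersion(major, int(minor), fix)


version = parse_sos_version("B.02c")
print(version.major, version.minor, version.fix)
```

    A string that does not follow the assumed layout raises ValueError rather than returning a partial result.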
    System Name
    The second item displayed in the SOS banner is the name of the system given during the installation of the operating system. The name of the system used in the example shown in Figure 9.3 is "spot."
    Current Date and Time (DDD, DD MMM YYYY, HH:MM)
    The third item in the SOS banner is the current date and time:
  • DDD denotes the day of the week.
  • DD denotes the day of the month.
  • MMM denotes the month.
  • YYYY denotes the year.
  • HH:MM denotes the hour and minutes.
    Elapsed Time (E: HH:MM:SS)
    The fourth item displayed in the SOS banner is the elapsed time (E: HH:MM:SS), which is the time counted in hours, minutes, and seconds that has passed since you started the current session of SOS. This elapsed time measurement is especially valuable when viewing cumulative statistics. For further information, refer to "Display cumulative stats".

    To reset the elapsed time to zero, type r from any SOS display screen.
    Current Interval (I: MM:SS)
    The last item displayed in the SOS banner is the current interval (I: MM:SS). The current interval is the amount of time in minutes and seconds accumulated since SOS last updated the screen. The measurements reported on any SOS display screen are valid for the current interval.

    By default, the interval refresh rate is 60 seconds. You can adjust this rate from the Main Options Menu screen. For further information, refer to "Screen refresh interval in seconds".

    Assuming the interval refresh rate is 60 seconds, the current interval displayed in the SOS banner should be I: 01:00. However, if at some point during the measurement interval the program has to wait for user input, the interval update will be delayed. For example, when the f key is pressed from an SOS display screen to "freeze" the current interval, the next update is delayed until the user enters the command to "unfreeze" the interval.

    If the current interval displayed is less than the interval refresh rate, this indicates that the user pressed the u key from an SOS display screen to update the performance data mid-interval.
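    The relationship between the displayed interval and the refresh rate can be expressed as a small check. This is an illustrative sketch only: the function names parse_clock and describe_interval are hypothetical, and the input is assumed to be the MM:SS text copied from the banner.

```python
def parse_clock(text: str) -> int:
    """Convert "MM:SS" or "HH:MM:SS" text into a number of seconds."""
    seconds = 0
    for part in text.strip().split(":"):
        seconds = seconds * 60 + int(part)
    return seconds


def describe_interval(interval: str, refresh_seconds: int = 60) -> str:
    """Compare the banner's I: value with the configured refresh rate."""
    actual = parse_clock(interval)
    if actual < refresh_seconds:
        # Shorter interval: data was updated mid-interval (u key).
        return "short: updated mid-interval"
    if actual > refresh_seconds:
        # Longer interval: the update was delayed, e.g. by a freeze (f key).
        return "long: update was delayed"
    return "normal: matches the refresh rate"


print(describe_interval("01:00"))
print(describe_interval("00:25"))
print(describe_interval("02:10"))
```

    The same parse_clock helper also accepts the HH:MM:SS form used for the elapsed time.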
    Current Interval Metrics vs. Cumulative Averages
    The statistical values expressed in the format "nnn.n" represent measurements for the current interval (I: MM:SS). The values in brackets, [nnn.n], represent cumulative averages for the elapsed interval (E: HH:MM:SS).
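    When screen output is captured to a file, the two kinds of values can be separated by their notation. The sketch below assumes a field in which the current value is immediately followed by the bracketed cumulative average; the sample text "12.5[ 10.2]" and the function name split_current_and_cumulative are assumptions made for illustration, not output copied from SOS.

```python
import re
from typing import Tuple

# A plain number (current interval) followed by a bracketed number
# (cumulative average). Spaces inside the brackets are tolerated.
_PAIR_RE = re.compile(r"(\d+(?:\.\d+)?)\s*\[\s*(\d+(?:\.\d+)?)\s*\]")


def split_current_and_cumulative(field: str) -> Tuple[float, float]:
    """Return (current interval value, cumulative average) from one field."""
    match = _PAIR_RE.search(field)
    if match is None:
        raise ValueError(f"no nnn.n[nnn.n] pair found in {field!r}")
    return float(match.group(1)), float(match.group(2))


current, cumulative = split_current_and_cumulative("12.5[ 10.2]")
print(f"current interval: {current}, cumulative average: {cumulative}")
```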

    Key Indicators of Performance (KIP) Line

    The Key Indicators of Performance (KIP) line can be displayed just below the SOS banner. The KIP line appears when the Display Key Indicators of Performance option is enabled from the SOS Main Options Menu screen.

     
     
     
     

    Figure 9.4 SOS Global Summary screen: Key Indicators of Performance (KIP) line
    The purpose of the KIP line is to display statistics associated with the primary indicators of system performance.
    Total Busy
    The Total Busy value displayed in the KIP line is the percentage of time the CPU spent executing the following activities instead of being in a pause or idle state:
  • Processing user and system process code.
  • Processing interrupts.
  • Processing context switches.
  • Managing main memory.
  • Managing traps.
    Avg Pg Res Time
    The Avg Pg Res Time value displayed in the KIP line is the average response time in seconds of the corresponding process during the current interval.
    Disk Serv Time
    The Disk Serv Time value displayed in the KIP line is the average number of milliseconds an I/O request takes to be serviced once it begins to be processed by the disk (removed from the disk queue). This value does not include wait time.

    NOTE By editing the soskip text file located in the /etc/opt/lps/cfg directory, you can redefine the variables to display in the KIP line. For information about editing the soskip file, see "SOS soskip File".

    GLOBAL

    The GLOBAL statistics portion of the Global Summary screen contains a simple bar graph that summarizes activity levels system-wide.

    GLOBAL (Left Column)

    CPU%
    The CPU% bar graph (the left portion of the GLOBAL statistics) shows the percentage of CPU time expended during the current measurement interval on various activities.
    Figure 9.5 SOS Global Summary screen: GLOBAL (left column)
    Each letter-width space on the CPU% bar graph represents approximately 2 percent of the CPU's time for the current interval. The code letters correspond to the CPU activities described in Table 9.1. Where a block of spaces on the bar graph is bordered by two instances of one code letter (e.g., S...S), the corresponding activity (e.g., executing system calls and code) accounts for the CPU% range bordered by the two letters.
    For example, the CPU% bar shown in Figure 9.5 indicates the following:
  • Approximately 4 percent of CPU time in the current interval was spent executing user code.
  • Approximately 4 percent of CPU time in the current interval was spent executing system calls and code (in kernel mode).
  • Approximately 6 percent of CPU time in the current interval was spent waiting for disk I/Os to complete.
    The code letters used in the CPU% bar graph are described in Table 9.1.
    Table 9.1 CPU% states or activities
    Code | Statistic | Description
    W | Wait | The amount of idle time the CPU spent waiting for a disk I/O to complete.
    U | User Mode | The percentage of CPU time spent executing user program code with a nice value of 20 and without any special priority.
    S | System | The percentage of CPU time spent executing system calls and code (in kernel mode). This does not include time spent performing context switches or idle time.
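    The reading rule for the CPU% bar (about 2 percent per letter-width space, with a pair of identical letters bordering the range for one activity) can be sketched as follows. The sample bar text "UUSSW.W" and the "." filler character are assumptions chosen to match the percentages in the example above; they are not copied from an SOS screen, and decode_cpu_bar is a hypothetical helper. Only the code letters listed in Table 9.1 are handled.

```python
# Code letters and statistic names from Table 9.1.
CPU_CODES = {"U": "User Mode", "S": "System", "W": "Wait"}

# Each letter-width space stands for approximately 2 percent of CPU time.
PERCENT_PER_COLUMN = 2


def decode_cpu_bar(bar: str) -> dict:
    """Estimate the CPU% per activity from the text of a CPU% bar.

    The span from the first to the last occurrence of a code letter is
    treated as belonging to that activity, so "S...S" counts as 5 columns.
    """
    usage = {}
    for code, name in CPU_CODES.items():
        first = bar.find(code)
        if first == -1:
            continue  # this activity does not appear in the bar
        last = bar.rfind(code)
        usage[name] = (last - first + 1) * PERCENT_PER_COLUMN
    return usage


print(decode_cpu_bar("UUSSW.W"))
```

    Because each column is only an approximation, the results are estimates with a granularity of 2 percent.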
    Pg Res
    The Pg Res value represents the average page residence time in milliseconds. This is the average amount of time a page is able to reside in memory. The default Pg Res Tm value is 600.0, which means that pages are not being forced out of memory. Values less than 600.0 mean that pages are being forced from memory.
    IO/s
    The IO/s bar represents the disk I/O rate. This is the number of physical reads and writes per second for each type of physical I/O. As with the CPU% bar (see "CPU%"), specific code letters in the bar graph tell you how many of each type of physical I/O were accumulated in the current interval. Each of these code letters is listed and described in Table 9.2.
    Table 9.2 Physical I/Os
    Code | Physical I/O | Description
    R | Physical Reads | The number of physical reads per second.
    W | Physical Writes | The number of physical writes per second.
    The example screen shown in Figure 9.5 shows that 12 physical writes were accumulated during the current interval.

    GLOBAL (Right Column)

    The scale for the next four global statistics ranges from 2 to 20. A value greater than 20 is represented by a trailing greater-than character (>).


    Figure 9.6 SOS Global Summary screen: GLOBAL (right column)
    RunQ Len
    The RunQ Len bar represents the average number of processes in the CPU run queue during the current interval.
    Swap Out/s
    The Swap Out/s bar represents the number of processes swapped out per second.
    I/O QLen
    The I/O QLen bar represents the average number of disk I/O requests pending for all disks during the current interval.
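    The scale rule for the right-column bars (a range up to 20, with a trailing > for larger values) can be imitated in a few lines. This is a rough sketch under stated assumptions: one column per 2 units and the "#" fill character are guesses made for illustration, and render_scaled_bar is a hypothetical name, not an SOS function.

```python
def render_scaled_bar(value: float, maximum: int = 20, step: int = 2,
                      fill: str = "#") -> str:
    """Draw a text bar capped at `maximum`, adding ">" when value exceeds it."""
    if value < 0:
        raise ValueError("value must not be negative")
    capped = min(value, maximum)
    columns = int(capped // step)  # assumed: one column per `step` units
    bar = fill * columns
    if value > maximum:
        bar += ">"  # the value is off the scale
    return bar


for sample in (7, 20, 25):
    print(f"{sample:>3} {render_scaled_bar(sample)}")
```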

    PROCESS SUMMARY

    After reviewing the general state of global resources, the next logical step in analyzing a system’s performance is to observe individual processes. It is important to find out which users are running which programs and what kinds of resources those programs are consuming. The primary purpose of the PROCESS SUMMARY portion of the Global Summary screen is to help you to identify key resources consumed by various processes on the system.
    To examine the CPU usage, disk I/O usage, and wait state information for a process, open the Process Detail screen. For further information, see "SOS Process Detail".

    PROCESS SUMMARY Display Options

    The PROCESS SUMMARY section is included in the Global Summary screen by default when the SOS program is started. However, this information can be suppressed. For instructions, refer to "Display process information".
    You can configure the PROCESS SUMMARY display in the following ways:
  • Display or suppress the extended process line.
  • Display either the total and I/O percentages or the read and write counts.
  • Display all processes or only the active processes.
  • Display or suppress attached processes.
  • Display or suppress detached processes.
  • Display or suppress system processes.
  • Display or suppress processes that have died.
  • Apply a process logon filter.
  • Apply a process sort option.
  • Display sorted processes in either ascending or descending order.
  • Set a maximum number of processes to display.
    For information about these options, please refer to "Process Display Options".

    PROCESS SUMMARY Data Items

    Figure 9.7 SOS Global Summary screen: PROCESS SUMMARY
    The contents of each PROCESS SUMMARY column (shown in Figure 9.7) are described in this section.
    PID
    The PID is the process identification number, which uniquely identifies each process running on the system.
    Name
    The name of each process running on the system is listed in the Name column.
    User Name
    The User Name column provides the name of the user that owns (or creates) each process running on the system.
    TTY
    "TTY" is defined in SOS as the special device file of the terminal to which the process is attached. The TTY column will show three dashes (---) for processes that are not attached to a terminal (processes such as daemons and batch jobs).
    CPU%
    The CPU% column shows the percentage of CPU time that was used by each process during the current interval.
    Nice
    The Nice column displays the nice value associated with each process. This value, ranging from 0 to 39 (the default is 20), is a determining factor when a process’s priority is recalculated.
  • A process with a larger nice value will receive a higher priority number (resulting in a lower-priority status).
  • A process with a smaller nice value will receive a lower priority number (resulting in a higher-priority status).
    A process that slows system response time can be "niced" to lower its priority and allow other processes to be executed more quickly.
    Pri
    The Pri column shows the most-recent priority that each process was given.
    As explained earlier, high priority numbers indicate low-priority status, and vice versa. The priority numbers between 0 and 127 indicate high-priority status and are reserved for certain system daemons or real-time processes. The majority of processes are given numbers between 128 and 255, which indicate timeshare-priority status. A typical timeshare process will fluctuate within this priority range, based on the process’s CPU demands and the system’s load. Processes executing at nice priorities typically have larger numbers (lower priorities).
    The system scheduler dynamically sets the priority by considering several factors, such as CPU utilization. Because the scheduler tries to allocate CPU time fairly among the processes, it will lower the scheduling priority of processes that require a lot of CPU time. This means that as a process’s CPU usage grows, its priority number in the Pri column will increase.
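    The priority ranges described above can be applied to values read from the Pri column. The sketch below is illustrative only: classify_priority is a hypothetical helper, and the process names and numbers in the sample list are invented, not taken from a real system.

```python
def classify_priority(pri: int) -> str:
    """Map a Pri column value to the status range described in the text."""
    if not 0 <= pri <= 255:
        raise ValueError(f"priority number out of range: {pri}")
    # 0-127: high-priority status (certain system daemons, real-time processes)
    # 128-255: timeshare-priority status (the majority of processes)
    return "high-priority" if pri <= 127 else "timeshare"


# Hypothetical (name, Pri) pairs for illustration.
processes = [("backup", 178), ("sched_daemon", 60), ("editor", 154)]

# A lower priority number means a higher-priority status, so sorting in
# ascending order lists the most-favored process first.
by_status = sorted(processes, key=lambda item: item[1])

for name, pri in by_status:
    print(f"{name:<14} {pri:>3}  {classify_priority(pri)}")
```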
    RSS/Size
    The RSS/Size column presents two data items for each process running on the system. The RSS value represents the resident set size, which is the amount of RAM used by the process. The Size value represents the size in kilobytes of the core image of the process, which includes text, data, and stack space. In other words, Size is the amount of swap or virtual memory the process has reserved.
    Performance Tip
    Large values in the RSS/Size column indicate that the corresponding process uses a lot of memory. Processes in this category may need to be checked for memory usage problems.
    #Rd
    The #Rd column lists the number of physical reads performed by each process during the current interval.
    #Wr
    The #Wr column shows the number of physical writes performed by each process during the current interval.
    Performance Tip
    The #Wr values are important because they can point to processes that are performing excessive disk I/Os. To confirm, check the SYSTEM PERFORMANCE ADVICE portion of the Global Summary screen for a message that reports the high I/O process for the current interval. When high #Rd and #Wr values are evident, determine whether the I/Os are necessary or unnecessary.
    State
    The State column in the PROCESS SUMMARY portion of the Global Summary screen shows which wait state the corresponding process was in at the end of the current interval. Each wait state is described in the appendix, "SOS/SOLARIS Wait States".
    Performance Tip
    Wait state information is helpful when you want to determine why a process is "stuck." Keep in mind, however, that the wait state of a process can change radically in a matter of seconds. If you suspect a problem, check the information provided for that process in the Process Detail screen.

    Extended Process Statistics Lines

    The PROCESS SUMMARY portion of the Global Summary screen can be expanded to show the percentage of time each process spent in one or more wait states during the current interval. This additional process information is displayed below each corresponding process statistics line in an extended process line.
    The extended process lines together with the extended process headings line can be enabled from the Process Display Options submenu of the SOS Main Options Menu or by typing the y key from the Global Summary screen (toggles the extended process lines on and off).
    The statistics in the extended process lines correspond with the column headings in the extended process headings line. Each column heading is described in Table 9.3.
    Table 9.3 Extended process column headings
    Heading | Description
    PRI | The percentage of the process’s time spent in the PRI state during the current interval. See "Wait State Descriptions".
    TPG | The percentage of the process’s time spent in the TPG state during the current interval. See "Wait State Descriptions".
    DPG | The percentage of the process’s time spent in the DPG state during the current interval. See "Wait State Descriptions".
    KPG | The percentage of the process’s time spent in the KPG state during the current interval. See "Wait State Descriptions".
    ULCK | The percentage of the process’s time spent in the ULCK state during the current interval. See "Wait State Descriptions".
    JOB | The percentage of the process’s time spent in the JOB state during the current interval. See "Wait State Descriptions".
    OTH | The percentage of the process’s time spent in the OTH state during the current interval. See "Wait State Descriptions".
    Gr