Lund Performance Solutions

SOS CPU Detail
The information displayed in the CPU Detail screen reveals the general state of the CPU. This information is similar to that included on the Global screen, except that it handles multiple CPUs.
To access the CPU Detail screen from the Global Summary screen:
  • Type S from the SOS Enter command: prompt to view the Screen Selection Menu screen.
  • From the Screen Selection Menu screen, enter C (CPU Detail Screen). The CPU Detail Screen will display.
  • Figure 13.1 shows an example of the CPU Detail screen.

    CPU Detail Screen Keys

    The CPU Detail Screen keys are listed and explained in Table 13.1.
    Table 13.1 CPU Detail Screen keys
    Refresh screen
    Return to Global Summary screen
    Toggle Screen Freeze
    Jump to new screen
    Help System
    Print Hardcopy
    Jump to SOS Screen Selection menu
    Exit SOS
    Zero Cumulative Totals
    Execute Shell Commands
    Execute Shell Commands
    Help System
    CTRL T    Toggle Timer Status

    CPU Detail Screen Display Items

    Figure 13.1 SOS CPU Detail screen
    For each CPU you will see the following performance measurements:
    AQ, BQ, CQ, DQ, EQ nn.n[nn]
    These statistics indicate how much CPU time is spent executing user and system program code on behalf of processes running in the respective scheduling queues. For the current interval, this is the time the CPU works constructively on our behalf, as opposed to performing overhead tasks (described later). MPE/iX system process time is usually measured within the AQ and BQ counters. Some user processes (such as SOS/3000) might run in the B queue. The C queue is usually where interactive processes run. The D and E queues are typically where batch jobs run.
    Performance Tip
    If the sum of these percentages (particularly AQ, BQ, and CQ) is very large and little or no time is spent in the other active or paused states, it is possible that one or more processes are hung. Perhaps a looping condition exists. Identify the offending process(es) by finding the highest CPU user (use the HOG PROC ZOOM key for this). If the sum of these numbers is very low and the other active or paused statistics are very high, then one or more overhead tasks are consuming the CPU's attention and should be researched further. A low number in these process state counters (when the other busy and paused counters are also low) means that there is plenty of CPU capacity available for more processing (batch or interactive). It is important to note the spread of CPU across the various queues. The AQ and BQ should show very low CPU utilization, except for brief spikes. Ideally, CQ, DQ, and EQ receive the majority of CPU time, because the other areas usually represent overhead and are therefore unproductive.
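The arithmetic behind this tip can be sketched in a few lines of Python. This is only an illustration: the function name and the returned field names are hypothetical, and the five percentages are assumed to have been read by hand from the CPU Detail screen. SOS does not provide this function.

```python
def summarize_queue_cpu(aq, bq, cq, dq, eq):
    """Summarize the per-queue CPU percentages from one interval.

    Hypothetical helper for illustration; the inputs are the AQ..EQ
    values (in percent) as shown on the CPU Detail screen.
    """
    total = aq + bq + cq + dq + eq      # CPU spent on process work
    aq_bq = aq + bq                     # mostly system process time
    # Share of the process-work CPU that went to AQ and BQ.
    aq_bq_share = aq_bq / total if total > 0 else 0.0
    return {"total_pct": total, "aq_bq_pct": aq_bq, "aq_bq_share": aq_bq_share}


summary = summarize_queue_cpu(1.0, 2.0, 40.0, 10.0, 2.0)
print(summary["total_pct"], summary["aq_bq_pct"])
```

A consistently large AQ plus BQ share, rather than a brief spike, is the pattern the tip above suggests investigating.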
    Mem nn.n[nn]
    This statistic represents how much CPU time is spent handling memory page activity. This counter includes time spent on memory allocations for user processes that cannot be launched (obtain the CPU) until necessary segments are present in memory.
    Performance Tip
    A slight memory load is indicated by a figure of 5-8% in this state, a moderate load by 8-12%, and a heavy load by anything greater than 12%. Remember, these are rough guidelines. A “shades of gray” principle applies here. A memory shortage may exist if this number is consistently greater than 5-8% and if other memory shortage indicators are present. See "SOS Memory Detail" for more on memory shortage diagnosis. This number tends to be a more reliable indicator of memory shortages on MPE/iX systems than it is on MPE V systems.
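As a rough illustration, the guideline bands above can be written as a small Python function. The function name is hypothetical, and two details are assumptions not stated in the text: a value of exactly 8% is treated as moderate (the stated ranges overlap at 8%), and anything below 5% is reported as no load indicated.

```python
def classify_memory_load(mem_pct):
    """Map the Mem percentage to the rough load bands from the tip.

    Hypothetical helper; boundary handling at 5% and 8% is an assumption.
    """
    if mem_pct > 12:
        return "heavy"
    if mem_pct >= 8:
        return "moderate"
    if mem_pct >= 5:
        return "slight"
    return "none indicated"


for value in (3.0, 6.5, 9.0, 15.0):
    print(value, classify_memory_load(value))
```

Because these are rough guidelines, a single reading in a higher band matters less than a value that stays there across many intervals.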
    Disp nn.n[nn]
    This statistic represents the amount of time the CPU spends on scheduling and dispatching processes.
    Performance Tip
    If this value rises above 8%, it can mean that MPE/iX is spending an inordinate amount of time dealing with process launch and process stop activity. Look at Launch/s (this section), Individual Process Stop Detail (Extended Process or Detailed Process displays), and Global Stops Detail to gain more insight as to why this is happening. This indicator is worth watching. If it becomes excessive, response times can increase.
    ICS/OH nn.n[nn]
    This statistic represents the time the CPU spends dealing with external device activity. Pressing RETURN to get an MPE/iX prompt is one such interrupt. Time spent handling disk I/O completions is included here. Interrupt Control Stack (ICS) activity requires service time by the CPU.
    Performance Tip
    If this value rises above 8%, it can mean that MPE/iX is spending an inordinate amount of time on the DT subsystem, disk, or other datacomm interrupt activity. It helps to locate processes that perform excessive terminal reads (DTC activity) or large numbers of disk I/Os. A small value is desirable here.
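The 8% guideline given for both Disp and ICS/OH can be checked with a short Python sketch. The function name, the constant name, and the message wording are hypothetical, and the values are assumed to be copied from the CPU Detail screen.

```python
OVERHEAD_THRESHOLD_PCT = 8.0  # guideline from the Disp and ICS/OH tips


def overhead_warnings(disp_pct, ics_oh_pct, threshold=OVERHEAD_THRESHOLD_PCT):
    """Return the names of overhead counters that exceed the guideline.

    Hypothetical helper for illustration only.
    """
    flagged = []
    if disp_pct > threshold:
        flagged.append("Disp")      # heavy process launch/stop activity
    if ics_oh_pct > threshold:
        flagged.append("ICS/OH")    # heavy interrupt activity
    return flagged


print(overhead_warnings(disp_pct=9.5, ics_oh_pct=3.0))
```

An empty list means neither counter is above the guideline for that interval.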
    Pause&Idle nn.n[nn]
    This statistic reveals the percentage of time the CPU spends waiting for disk I/Os to complete. This event is essentially a roadblock to further activity. No other functions can occur during this waiting period. This number represents time in which processes could have had work performed on their behalf, but did not because of the relative slowness of the disk drives in completing I/O.
    Performance Tip
    The number indicated by this counter gives a good view of the state of the I/O system. A large number here indicates that the CPU could have been busier, but was not because I/O requests were not serviced rapidly. Big is bad. Small is good! If this number is above 10%, an I/O bottleneck may exist. A shortage of main memory can also induce an excessive amount of disk activity, so it is best to look at some of the memory adequacy indicators to verify whether memory is the culprit. Also be sure to identify the high disk I/O user (Advice Module and Process Display). A large amount of disk write activity can induce an excessive value here. We have seen a number of cases where this number has skyrocketed, virtually paralyzing the majority of the CPU. As a result, a Series 948 may operate only at the level of a Series 920 because of excessive CPU pause time for disk activity.
    This statistic also includes the percentage of time the CPU is not actively working on processes and not waiting for any disk I/Os to complete. Simply stated, this is the amount of processing capacity you have “in the bank.”
    Performance Tip
    If there is consistently a large amount of idle time on your system, your CPU is on vacation most of the time. Although it is not desirable to swamp the processor, it should earn its keep by performing to capacity. Ample idle time indicates spare processor capacity. If idle time is zero (or close to it) most of the time, and a significant amount of the CPU’s processing is due to batch job activity, then you can sustain some growth in interactive transaction volume. If the lack of idle time is primarily due to session activity, then the system may be overloaded. Either reduce processing or obtain more CPU horsepower via an upgrade. It is helpful to observe idle time values for a system over entire days. You may have plenty of idle time at noon, but no idle time between 3:00 and 4:00 p.m. Shifting workloads (batch scheduling, user work hours) will help bring the peak period utilization down.
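One way to look at idle time across a whole day is to average it per hour and list the hours that fall short. The Python sketch below assumes you have collected your own (hour, idle percent) samples; the function name, the sample layout, and the 5% floor are all hypothetical, since the text above gives no numeric threshold for idle time.

```python
from collections import defaultdict


def hours_short_on_idle(samples, idle_floor_pct=5.0):
    """Return the hours whose average idle % is below idle_floor_pct.

    samples: iterable of (hour_of_day, idle_pct) pairs. Both the layout
    and the default floor are assumptions made for this illustration.
    """
    by_hour = defaultdict(list)
    for hour, idle_pct in samples:
        by_hour[hour].append(idle_pct)
    averages = {hour: sum(v) / len(v) for hour, v in by_hour.items()}
    return sorted(hour for hour, avg in averages.items() if avg < idle_floor_pct)


samples = [(12, 40.0), (12, 50.0), (15, 2.0), (15, 0.0), (16, 30.0)]
print(hours_short_on_idle(samples))
```

Hours that appear in the result are candidates for shifting batch work to a quieter part of the day.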
