Lund Performance Solutions

Fragmentation Concepts
There are four types of fragmentation found on HP e3000 MPE/iX systems:

File Fragmentation

Central to disk fragmentation is the concept of a file extent. File fragmentation occurs when a file's extents become physically discontiguous on disk. Its impact depends on the severity of the fragmentation, the adequacy of main memory, the speed of the disk devices, and the efficiency of MPE/iX's built-in prefetch mechanism.

Contiguous Extents

A file on an MPE/iX system can be broken up into pieces known as "extents." These extents do not need to be next to each other on a disk, nor do they need to be on the same disk drive. This allows the file system to support very large files (up to four gigabytes) without requiring that all the space be physically contiguous on one disk drive. It also allows us to build files that exceed the capacity of the largest disk drive available. A contiguous file is one whose extents are all "next door neighbors" on a particular disk device. When a file's extents are spread out over the disk devices, there is a performance cost, and it can be significant.

For example, let's assume you build a file using the following BUILD command:

BUILD FOO1;REC=-80,,F,ASCII;DISC=100000,1,1

This will build the file "FOO1" with space for 100000 records in one physical extent and allocate all that disk space immediately. The advantage of this is that all of the records that will ultimately go into this file are guaranteed to be physically contiguous on disk. The disadvantages are that you must have 100000 * 80 bytes of available, contiguous disk space on a single disk device, and you give up the availability of all of that disk space up front, perhaps long before it is needed.

Non-contiguous Extents

Now consider the following file build:

BUILD FOO2;REC=-80,,F,ASCII;DISC=100000,32,1

This will build the same file with a few differences. By specifying the "32" in the DISC parameter, we are telling the operating system to break this file up into multiple extents. (The exact number is not controlled on MPE/iX.) The ",1" in the DISC parameter tells the operating system that we only want a portion of the file allocated up front. In general, MPE will allocate about 1/32 of the file at that time. The advantage of this is that the file system doesn't have to obtain 100000*80 bytes of available, contiguous disk space on a single disk drive. Rather, it initially has to obtain only (100000*80)/32 bytes of available, contiguous disk space. Additionally, you give up the availability of only (100000*80)/32 bytes of disk space at build time. You will acquire the rest of the disk space only as your application adds records to the file.
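The space arithmetic for FOO1 and FOO2 can be checked with a short Python sketch. It assumes that extents are of equal size and that the third DISC value is the number of extents allocated at build time. Since MPE/iX does not control the exact extent count, the FOO2 figure is an approximation. The function and variable names are illustrative only and are not part of MPE/iX.

```python
RECORD_BYTES = 80        # REC=-80: fixed-length 80-byte records
RECORD_LIMIT = 100_000   # first DISC value: maximum number of records


def upfront_bytes(record_limit, record_bytes, max_extents, initial_extents):
    """Approximate bytes allocated at build time, assuming equal-size extents."""
    total = record_limit * record_bytes
    extent_bytes = -(-total // max_extents)  # ceiling division
    return min(total, extent_bytes * initial_extents)


foo1 = upfront_bytes(RECORD_LIMIT, RECORD_BYTES, max_extents=1, initial_extents=1)
foo2 = upfront_bytes(RECORD_LIMIT, RECORD_BYTES, max_extents=32, initial_extents=1)

print(foo1)  # 8000000 bytes, all contiguous, all allocated at build time
print(foo2)  # 250000 bytes, roughly 1/32 of the file
```

Under these assumptions, FOO2 needs one thirty-second of the contiguous space that FOO1 needs at build time.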

As you can see, these are significant advantages. MPE/iX system users cannot do without them, especially in these days of true mainframe-equivalent HP e3000 environments.

However, the disadvantage of building FOO2 in this manner can become apparent when subsequent extents are allocated. If the extents are allocated contiguously, there will be no performance degradation. There is some slight overhead associated with multiple extents, resulting from more label table activity, but its performance impact is negligible. If, however, the extents are allocated non-contiguously, there can be a significant performance impact.

Let's examine these ramifications. Taking FOO2 as our example, we'll look at the worst-case extent allocation scenario. If the worst possible allocation occurs, the FOO2 file will be located in 32 separate, non-contiguous areas of disk on one disk device. This means that a serial read of this file could require 32 times as many physical disk accesses as it would if all extents were contiguous or if the file consisted of a single extent.
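One rough way to picture the worst case is to count how many physically separate runs a file occupies, since each new run forces a serial read to reposition. The Python sketch below is a simplified model only: the extent tuples, LDEV number, and sector values are hypothetical, and the model ignores prefetching and other MPE/iX internals.

```python
def count_contiguous_runs(extents):
    """Count physically separate runs in a file.

    extents: list of (ldev, start_sector, length_in_sectors) in file order.
    An extent continues the current run only if it starts exactly where
    the previous extent ended on the same device.
    """
    runs = 0
    prev_end = None
    for ldev, start, length in extents:
        if prev_end != (ldev, start):
            runs += 1
        prev_end = (ldev, start + length)
    return runs


EXTENT_SECTORS = 1_000  # hypothetical extent size

# 32 extents laid end to end on LDEV 2
contiguous = [(2, i * EXTENT_SECTORS, EXTENT_SECTORS) for i in range(32)]
# 32 extents on LDEV 2 with gaps between every pair
scattered = [(2, i * 10 * EXTENT_SECTORS, EXTENT_SECTORS) for i in range(32)]

print(count_contiguous_runs(contiguous))  # 1
print(count_contiguous_runs(scattered))   # 32
```

In this model the scattered layout produces 32 runs against 1 for the contiguous layout, which corresponds to the 32-to-1 ratio described above.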

If, however, these extents were located on 32 different disk devices, then the impact of the non-contiguous nature of the file distribution would be somewhat decreased by the fact that much of the disk access could be performed simultaneously over the 32 spindles.

Be aware that the FOO1 and FOO2 examples represent a simplified view of the internal operation of MPE/iX systems. There are other issues, particularly multi-page prefetching, that play a role in the overall efficiency of disk I/O on MPE/iX systems.

Disk Fragmentation

Disk fragmentation is best defined as the process by which logically-related data become physically disassociated on disk. Disk fragmentation can be considered a measurable current state of your data, as well as the dynamic process by which such fragmentation gradually occurs.

Disk fragmentation occurs when the free space on a disk drive is spread throughout the disk in many small pieces. We list this as a separate category because the more fragmented a particular disk device is, the greater the impact on overall system performance. Additionally, the more fragmented a disk device is, the harder it is for certain important system functions to be performed on that device.

For example, let's say that LDEV 1 (the system disk) has 1,000,000 sectors of free space, but is so fragmented that the largest free chunk is 13,000 sectors. In the event of an operating system update, MPE/iX requires a certain amount of contiguous free space on LDEV 1. If the required amount of contiguous disk space is not available, you will be unable to perform the update without intervention. That intervention may now take the form of a simple execution of the TRIM and CONDENSE commands. If you still do not have enough free space for the update, you can use the MAKEROOM command to make as much room as necessary.

NOTE The amount needed may change from release to release of the operating system.
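The LDEV 1 example comes down to comparing the largest free chunk, not the total free space, against the contiguous requirement. The Python sketch below illustrates that comparison. The list of free chunks is invented to match the totals in the example, and the 60,000-sector requirement is a hypothetical figure, since the real amount depends on the operating system release.

```python
def summarize_free_space(free_chunks):
    """Return (total_free_sectors, largest_free_chunk) for one disk device."""
    total = sum(free_chunks)
    largest = max(free_chunks, default=0)
    return total, largest


def has_contiguous_room(free_chunks, required_sectors):
    """True if at least one free chunk can satisfy the contiguous requirement."""
    _, largest = summarize_free_space(free_chunks)
    return largest >= required_sectors


# Hypothetical free-space map for LDEV 1: one 13,000-sector chunk
# plus 987 chunks of 1,000 sectors each (1,000,000 sectors in total).
ldev1_free = [13_000] + [1_000] * 987

REQUIRED_SECTORS = 60_000  # hypothetical; varies by release

total, largest = summarize_free_space(ldev1_free)
print(total)    # 1000000
print(largest)  # 13000
print(has_contiguous_room(ldev1_free, REQUIRED_SECTORS))  # False
```

Even with 1,000,000 free sectors, the check fails because no single chunk is large enough.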

Also, it is commonly thought that a lack of free space throughout a system can cause performance problems. Fragmentation plays a role in this same scenario. In other words, there is a fundamental level of contiguous free space required on your system, below which you must not go.

System Fragmentation

System fragmentation is yet another extension of file fragmentation. It is the perspective from which you consider your whole system, and it includes a file-level perspective, a disk-level perspective, a volume set-level perspective, and a complete system-level perspective.

Database Internal Fragmentation

Database fragmentation is fragmentation within the skeletal structure of a DBMS such as Turbo Image. While specific internal DBMS fragmentation is outside the scope of De-Frag/X, all DBMSs exist on top of the MPE/iX file system. This means that defragmenting Turbo Image data sets at the file level still has a significant impact. In other words, given the database TRXDB1, you should still defragment the individual files TRXDB101, TRXDB102, TRXDB103, etc.
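To build the list of data set files to defragment for a database, the naming pattern in the TRXDB1 example can be sketched in Python. This assumes that each data set file is named by appending a two-digit number to the database name, as the example names suggest. The function name and the data set count are hypothetical.

```python
def dataset_file_names(database_name, dataset_count):
    """List data set file names, assuming a two-digit numeric suffix."""
    return [f"{database_name}{n:02d}" for n in range(1, dataset_count + 1)]


# Hypothetical database with three data sets
names = dataset_file_names("TRXDB1", 3)
print(names)  # ['TRXDB101', 'TRXDB102', 'TRXDB103']
```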

That said, the internal fragmentation of data within a DBMS remains a critical performance issue. Tools such as DBLOADNG (from the INTEREX Contributed Library) or HOWMESSY (from Robelle) were created to measure the internal fragmentation of Turbo Image databases. A copy of the DBLOADNG program is included with any Lund Performance Solutions product, courtesy of INTEREX. Additionally, a product such as Adager, from Adager Corporation, can actually fix DBMS-level fragmentation on Turbo Image databases via its DETPACK command. Information about this product can be obtained from a Lund Performance Solutions representative (see "Product Support") or directly from Adager at 1-800-533-7346.

For more information regarding Turbo Image database performance issues, refer to Taming the HP3000 - Volume 2 by Robert Lund, available from Lund Performance Solutions.

Lund Performance Solutions
Voice: (541) 812-7600
Fax: (541) 812-7611