ReorgWizard™

[tab:ReorgWizard™]

ReorgWizard™ is a powerful utility for the IBM AS/400 (a.k.a. iSeries, System i, or IBM i). It “reorganizes” a physical file in place, without locking users out of the file while the records are being moved: active records are moved to the front of the file and deleted records are moved to the back. After the records are reorganized, the deleted record space can be recovered quickly with a brief exclusive file lock, usually measured in seconds or perhaps minutes, as contrasted with the hours or days often associated with RGZPFM. Finally, you can reorganize files while the users are active.

Optimizing Space

Files that contain deleted records should be ‘reorganized’ periodically. This frees up wasted space, but more importantly it makes database processing more efficient. Physical I/Os to the database bring ‘pages’ of data to a program. A file that has been reorganized will contain more ‘active’ records in each page, so fewer physical database I/Os will be required. File scans and searches will be more efficient, since they don’t have to waste time looking at (or scanning past) deleted records.
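To put rough numbers on the page-density argument above, here is a minimal Python sketch. The record counts and the figure of 40 record slots per page are made-up assumptions for illustration, not measurements from any real file, and the function name is hypothetical.

```python
import math

def pages_to_scan(active_records: int, deleted_records: int, slots_per_page: int) -> int:
    """Pages a full file scan must read while deleted slots still occupy space."""
    total_slots = active_records + deleted_records
    return math.ceil(total_slots / slots_per_page)

# Hypothetical file: 6 million active records, 4 million deleted records,
# and an assumed 40 record slots per page.
before = pages_to_scan(6_000_000, 4_000_000, 40)  # deleted slots still fill pages
after = pages_to_scan(6_000_000, 0, 40)           # after reorg and space recovery

print(before, after, f"{1 - after / before:.0%} fewer pages")
```

Under these assumptions a full scan reads 250,000 pages before the reorg and 150,000 afterward, so the saving tracks the share of the file that was taken up by deleted records.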

Typically, files are reorganized with IBM’s RGZPFM command. This command gets rid of the deleted records and rebuilds file access paths. The problem is that the file is locked for the entire process. Depending on the size of the file, this reorg could take anywhere from a few minutes to many hours or even several days! Further complicating the problem is the fact that once the RGZPFM is started, it can’t be stopped without potentially serious consequences. If an RGZPFM were stopped mid-stream, the file would very likely have to be restored from backup and all the access paths would need to be either restored or rebuilt. If the access paths were not backed up, you might as well let the RGZPFM finish, because you’ll be rebuilding the access paths anyway. Because of these harsh realities, many companies don’t run reorgs, or don’t run them as often as they should. They may not have the time, or they may not want to risk kicking off a reorg and locking the file for an extended period of time.

[tab:Features]

With ReorgWizard™, you won’t need to use the RGZPFM command anymore. Reorgs can be run almost any time (as long as some other process doesn’t need an exclusive lock on the file).

Even though ReorgWizard™ doesn’t require an exclusive lock, you may have other processes that do – such as the weekly full backup. Whereas RGZPFM can’t be stopped once it’s started, ReorgWizard™ can be run in increments.

ReorgWizard™ can be run for a subset of the data, or up until a predetermined date and time. There is no need to reorg the entire file in one pass. If your file has 100 million records, you can reorg the first 10 million today, the next 15 million next week, and so on. Or, kick off the ReorgWizard™ on Friday afternoon and tell it to stop by Sunday at 4 p.m. If it’s not finished, resubmit it next weekend. When you resubmit a reorg job, it will pick up where the last one left off. Chip away at the reorg task as you see fit. Once the file is completely reorganized, the deleted record space can be recovered with only a brief exclusive file allocation.
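The idea of reorganizing in increments and resuming later can be illustrated with a small Python sketch. This is not ReorgWizard™'s actual implementation, which is not described here. It is a toy model that assumes the file is an in-memory list where `None` marks a deleted record, that nothing else changes the list between runs, and that record order does not need to be preserved. The `Checkpoint` class and `reorg_increment` function are hypothetical names.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Checkpoint:
    front: int          # next slot to examine from the start of the file
    back: int           # next slot to examine from the end of the file
    done: bool = False  # True once active and deleted records are fully separated

def reorg_increment(slots: List[Optional[str]],
                    cp: Optional[Checkpoint],
                    max_moves: int) -> Checkpoint:
    """Move up to max_moves active records from the back into deleted slots at the front."""
    if cp is None:
        cp = Checkpoint(front=0, back=len(slots) - 1)
    front, back, moves = cp.front, cp.back, 0
    while moves < max_moves:
        while front < back and slots[front] is not None:
            front += 1   # skip active records that are already at the front
        while front < back and slots[back] is None:
            back -= 1    # skip deleted slots that are already at the back
        if front >= back:
            return Checkpoint(front, back, done=True)
        slots[front], slots[back] = slots[back], None  # relocate one record
        moves += 1
    return Checkpoint(front, back, done=False)

# Toy file: letters are active records, None is a deleted record.
file_slots = ["A", None, "B", None, None, "C", "D", None, "E"]
cp = reorg_increment(file_slots, None, max_moves=1)  # first run: a small slice
cp = reorg_increment(file_slots, cp, max_moves=10)   # later run: resumes from cp
print(file_slots, cp.done)
```

The checkpoint is what lets a resubmitted run pick up where the previous one stopped. A stop-by date and time could be modeled the same way, by checking the clock next to the move budget.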

Should a process need exclusive file access after the ReorgWizard™ has started running, the reorg can be ended gracefully from a menu option. The reorg can then be resubmitted later to pick up where it left off.

IBM’s RGZPFM rebuilds access paths from scratch. This is one of the reasons that it requires an exclusive lock on the file, and one of the reasons why it runs for so long. ReorgWizard™ performs its access path maintenance while the job is running. The actual movement of the records is a fairly simple task, but left unchecked, the overhead associated with access path maintenance can consume an excessive amount of system resources and adversely affect other work on the system.

The ‘throttle’ function in the ReorgWizard™ allows the reorg job to be slowed down. This might seem counterintuitive, but it’s actually very important. Any reorg function will involve a tremendous amount of access path maintenance. This overhead can take over the system – even if the job run priority is changed to try to slow it down. The throttle feature allows the reorg to be slowed down so that other jobs on the system will not feel the effect of the reorg in progress. The throttle can be set anywhere from one record per second all the way up to ‘full speed’. So, ReorgWizard™ can proceed at a pace ranging from a few thousand records per hour all the way up to potentially many millions of records per hour. The reorg will take longer to process if the throttle is used to slow the job, but since the file isn’t locked, that is no longer a critical path issue.
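As a rough illustration of how a throttle of this kind can work, here is a minimal Python sketch of a rate limiter that paces a stream of records. It is an assumption-laden model, not the product's code: the `throttled` and `records_per_hour` names, and the injectable `sleep` and `clock` parameters, are hypothetical.

```python
import time
from typing import Callable, Iterable, Iterator, Optional

def records_per_hour(records_per_second: float) -> float:
    """Convert a throttle setting into an hourly pace."""
    return records_per_second * 3600

def throttled(records: Iterable,
              max_per_second: Optional[float],
              sleep: Callable[[float], None] = time.sleep,
              clock: Callable[[], float] = time.monotonic) -> Iterator:
    """Yield records no faster than max_per_second; None means full speed."""
    if max_per_second is None:
        yield from records
        return
    interval = 1.0 / max_per_second
    next_allowed = clock()
    for record in records:
        wait = next_allowed - clock()
        if wait > 0:
            sleep(wait)  # idle so other jobs get the system
        yield record
        # Schedule the next record one interval after this one (no catch-up bursts).
        next_allowed = max(next_allowed, clock()) + interval

# Demo: pace five records at an assumed 100 records per second.
moved = [rec for rec in throttled(range(5), max_per_second=100)]
print(moved, records_per_hour(1))
```

At the slowest setting mentioned above, one record per second, the pace works out to 3,600 records per hour, which matches the "few thousand records per hour" lower bound.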

Finally

Reorgs that can be run without RGZPFM and without the risk of files being unavailable on Monday morning. Finally – reorgs that can be run whenever they’re needed. Finally – reorgs that can be run without locking out the users and without adversely affecting other work on the system.

[tab:END]