
The 24x365 Availability Challenge
by Craig S. Mullins

Talk to the database administration group in any major corporation today and you'll hear about an atmosphere of controlled chaos. DBAs are scrambling to address a variety of needs, ranging from designing new applications to keeping business-critical systems operational. As more businesses demand full-time system availability, and as the cost of downtime increases geometrically, the time available for optimizing the performance of business-critical systems and software is shrinking.

On the other hand, if routine maintenance procedures are ignored, performance suffers. IT is forced to perform a delicate balancing act between the mandate for 24x365 availability and the consequences of deferred system maintenance. The stakes are high and IT is caught between seemingly contradictory objectives.

Data Growth and the Shrinking Maintenance Window

All growing businesses accumulate enormous amounts of data. In fact, industry analysts estimate that the average database grew tenfold in size between 1995 and 2000. The largest databases in production today approach one petabyte in size. At the same time, 24x365 system availability is now the rule rather than the exception. IT must be increasingly creative to find time to perform routine system maintenance. High-transaction databases need periodic maintenance and reorganization. With constant use, databases become fragmented, data paths become inefficient and performance degrades. Data must be put back in an orderly sequence, and the gaps created by deletions must be reclaimed.
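To make the reorganization trigger concrete, here is a minimal sketch, in Python, of the kind of threshold check a DBA or monitoring tool might apply. The statistics, names and thresholds are hypothetical; real products read comparable figures from the DBMS catalog.

```python
# Sketch: a threshold check for when a table needs reorganizing.
# The statistics are hypothetical inputs; real products read comparable
# figures from the DBMS catalog.
from dataclasses import dataclass

@dataclass
class TableStats:
    name: str
    total_rows: int
    deleted_gaps: int       # space left behind by deletions
    out_of_sequence: int    # rows no longer in clustering order

def needs_reorg(s, gap_pct=10.0, disorg_pct=5.0):
    """Flag a table when wasted space or disorganization passes a threshold."""
    gap_ratio = 100.0 * s.deleted_gaps / max(s.total_rows, 1)
    disorg_ratio = 100.0 * s.out_of_sequence / max(s.total_rows, 1)
    return gap_ratio >= gap_pct or disorg_ratio >= disorg_pct

stats = TableStats("PURCHASES", total_rows=50_000_000,
                   deleted_gaps=9_000_000, out_of_sequence=1_200_000)
print(needs_reorg(stats))   # True: ~18% of the space is deletion gaps
```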

Decision Support

More and more companies are finding new ways to use core business data for decision support. For example, credit card companies maintain a basic body of information that they use to list purchases and prepare monthly statements. This same information can be used to analyze consumer spending patterns and design promotions that target specific demographic groups and, ultimately, individual consumers. This means that core business data must be replicated across multiple database environments and made available to users in user-friendly formats. The requirements of decision support users can therefore reduce the availability of operational data: while data is being bulk unloaded for replication, large portions of it are unavailable for update processing.
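A small, self-contained illustration of that contention, using SQLite purely as a stand-in for an operational DBMS: a long-running read transaction plays the role of a bulk unload and keeps a concurrent update from committing until the unload finishes.

```python
# Sketch: a bulk "unload" (long read transaction) blocking an update.
# SQLite's default rollback-journal locking stands in for the lock
# contention a bulk unload can cause on an operational database.
import sqlite3

setup = sqlite3.connect("ops.db")
setup.executescript(
    "CREATE TABLE IF NOT EXISTS purchases (id INTEGER PRIMARY KEY, amount REAL);"
    "INSERT INTO purchases (amount) VALUES (19.99);"
)
setup.close()

reader = sqlite3.connect("ops.db", isolation_level=None, timeout=1)
writer = sqlite3.connect("ops.db", timeout=1)

reader.execute("BEGIN")                      # the "unload" starts...
rows = reader.execute("SELECT * FROM purchases").fetchall()

try:
    # An OLTP-style update cannot commit while the unload's read
    # transaction holds its lock; it times out instead.
    writer.execute("UPDATE purchases SET amount = amount * 1.1")
    writer.commit()
except sqlite3.OperationalError as exc:
    print("update blocked during unload:", exc)

reader.execute("COMMIT")                     # unload done; updates resume
```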

Data Warehousing

Just as decision support has expanded the use of operational data, data warehousing has driven overall database growth. Typical data warehouses require the replication of data for use by specific departments or business units. One of the major sources of operational data in global 2000 corporations today is legacy databases. The unloading and loading of external data to operational data stores, and then on to data marts, has increased the number of utility functions that must be administered. The time taken to propagate data has adversely affected the overall availability window of both the data sources and the data targets during unload and load processing. The growth of data warehouses will continue unfettered into the foreseeable future, fed by the informational needs of knowledge workers and the falling cost of storage media.

Full-Time Availability

Just when the latest hardware and software technologies have finally brought 24x365 availability within reach, the mandates of the global economy have forced IT departments to reevaluate the situation. Now the buzz phrase is 24x24 availability, as businesses conduct operations in all time zones and data must be available to a new spectrum of users, not all of whom work in the same time zone as the operational DBMS.

Airline reservation systems, credit card approval functions, telephone company applications–all must be up and running all day, every day. International finance is among the best examples of the need for full-time availability. Money never sleeps and the daily flow of Deutsche marks, dollars, pounds and yen goes on with the inevitability of the lunar orbit. So does the global information exchange on which brokers base their buy-and-sell decisions. Billions are on the line each minute and downtime is simply not optional. IT decision-makers need tools that accomplish maintenance and backup tasks in small fractions of the time normally allotted to these procedures.

Growing IT Complexity

Any single-vendor system should be clean, precise and predictable. But today, it is hard to find a company of any size that does not operate in a heterogeneous environment. At least some basic business functions run on mainframes as well as midranges and desktop systems in a client/server infrastructure. As these systems expand in size and functionality, IT staffs must find ways to accommodate operational tuning across a complex, heterogeneous IT environment. This is rarely the seamless process portrayed by the hardware manufacturers. And the DBMS software itself can add complexity, with new releases and features being delivered at breakneck speed.

Complexity stems from human factors as well. Downsizing has forced former IT specialists to become generalists. As a result, tasks such as database reorganization, which used to be simple and straightforward for expert DBAs, are now complex and lengthy for generalists. Of course, IT is not immune to corporate downsizing; there are now fewer personnel to handle day-to-day computer issues than there were just a few years ago. Finally, mergers and acquisitions force IT staffs to consolidate incompatible systems and data structures.

The Effect of New Parallel Technologies

More and more work is being done in parallel. For example, IBM's Sysplex multiprocessor line splits tasks among parallel processors, eliminating some processor availability limitations. Individually, the processors are less powerful than their bipolar predecessors, but combined they crunch data faster by assigning work to whichever processors are free rather than requiring users to wait for cycles on a single processor. Unfortunately, standard bipolar maintenance software does not run very efficiently on the newer systems. To reorganize databases and handle backup and recovery functions for parallel processing environments, IT departments need maintenance utilities written specifically to take advantage of parallel processors.
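The principle is easy to sketch. The fragment below splits a set of per-partition maintenance checks across worker processes instead of running them serially; the per-partition work is a hypothetical stand-in for real reorganization, copy or integrity-check logic.

```python
# Sketch: running maintenance on a partitioned object in parallel.
# check_partition is a placeholder for real per-partition utility work.
from multiprocessing import Pool

def check_partition(part):
    # ... reorg/copy/check logic for one partition would go here ...
    return part, "ok"

if __name__ == "__main__":
    partitions = range(1, 17)        # e.g., a 16-partition tablespace
    with Pool(processes=4) as pool:  # 4 workers pull partitions as they free up
        for part, status in pool.imap_unordered(check_partition, partitions):
            print(f"partition {part}: {status}")
```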

The Challenge: Toward the Goal of 24x365 Availability

Life is clearly challenging for today's IT managers and DBAs. The sections that follow offer pointers for alleviating some of these problems and realizing the objective of 24x365 availability. Faced with shrinking budgets and resources, and an ever-increasing volume of data to manage, IT must evaluate its critical needs and implement a series of key strategic steps. Among them:

• Perform routine maintenance while systems remain operational

• Automate backup and recovery functions

• Exploit parallel processing technology

Perform Routine Maintenance while Systems Remain Operational

To optimize performance while getting the most from smaller IT staffs and budgets, IT needs products that simplify and automate maintenance functions. The right tools reduce maintenance time from hours to minutes, or require no maintenance window at all, while allowing users continued access to the data they need to do their jobs. A few tools already perform these functions without taking systems offline, but IT must make these capabilities a requirement. Tools that work in conjunction with modern storage devices to minimize or eliminate downtime are also quite useful for maintaining databases while they remain online and operational.
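One common approach behind such online maintenance tools is the shadow reorganization: rebuild the data into an ordered copy while the original remains available, then apply the changes that arrived during the rebuild and switch over in one brief step. The sketch below models the idea with in-memory stand-ins for the table and the DBMS log; it illustrates the technique, not any particular product's implementation.

```python
# Sketch of a shadow reorganization: the live table stays available
# while an ordered copy is built; writes made meanwhile are captured
# (here in a list standing in for the DBMS log) and applied before a
# brief final switch.
table = {3: "c", 1: "a", 7: "g", 2: "b"}   # live table, out of order
change_log = []                             # changes captured during reorg

def apply_update(key, value):
    table[key] = value
    change_log.append((key, value))         # the DBMS logs this anyway

# Phase 1: build the ordered shadow copy; the live table stays in use.
shadow = {k: table[k] for k in sorted(table)}
apply_update(5, "e")                        # a write lands mid-reorg

# Phase 2: catch up from the log, then switch in one short, final step.
for key, value in change_log:
    shadow[key] = value
table, change_log = shadow, []              # the only "outage" is this swap
print(table)                                # {1:'a', 2:'b', 3:'c', 7:'g', 5:'e'}
```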

Automate Backup and Recovery Functions

To ensure that a company can get its data back as quickly as possible after an outage or disaster, preplanning is necessary. Taking a proactive approach to backup and recovery management can mean the difference between a minimal outage with no data loss and a situation from which a business can never recover. Few software products assist in managing the backup strategy or provide functions that enable proactive recovery planning. IT needs products that allow for frequent backups that exert minimal impact on the online system. IT also needs backup and recovery software that can recover data extremely quickly, in a sequence driven by the criticality of the business applications the data supports, as well as products that help automate recovery in a crisis.
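As a minimal sketch of criticality-driven recovery sequencing, the fragment below orders a set of hypothetical database objects by business tier and restores them in that order; the object names, tiers and recover() stub are illustrative, not taken from any particular product.

```python
# Sketch: recover the most business-critical objects first. The plan is
# decided in advance, not improvised during the outage.
RECOVERY_PLAN = [
    # (database object, criticality tier: 1 = revenue-critical, first)
    ("ORDERS_TS",   1),
    ("PAYMENTS_TS", 1),
    ("CUSTOMER_TS", 2),
    ("HISTORY_TS",  3),
]

def recover(obj):
    # Stand-in for restoring an image copy and applying log records.
    print(f"recovering {obj} from latest backup plus log ...")

for obj, tier in sorted(RECOVERY_PLAN, key=lambda item: item[1]):
    recover(obj)
```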

Exploit Parallel Processing Technology

New parallel technology reduces overall computing costs. To realize the rewards, however, IT needs tools that exploit parallel processing. Otherwise, products that run slowly and inefficiently because they were built for a different hardware environment will negate the benefits of parallel technology.

Non-Disruptive Utilities

RDBMS technology has matured over the years from a slow behemoth into a fast, sophisticated solution for business-critical applications. The native database utilities, however, have not kept pace with the explosive growth in database size. Independent software vendors (ISVs) have taken the lead in this area by providing faster, more functionally rich utilities to replace the default utilities. With the arrival of the new millennium, the expectation has been raised from simply reducing the window needed to perform basic database utility functions to providing zero disruption.

A non-disruptive utility is one that provides both update and read access to a database while the utility executes, and does so without any loss of data integrity. Additional considerations for non-disruptive utilities are the number and types of resources needed to perform non-disruptive operations. Historically, the native utilities have used considerably more CPU and I/O resources than ISV utility solutions. The areas where non-disruptive utilities are needed most include:

• Database reorganization for maintaining performance

• Database backup, to ensure data is available for recovery in the event of application or hardware failure, in addition to disaster recovery preparedness

• Recovery solutions that can apply recovered data without requiring an outage

• Unloading and loading of source data and operational data stores for decision support systems and data warehouses

• Checking for both referential integrity and structural data integrity (see the sketch following this list)
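To make the last item concrete, here is a minimal sketch of the referential-integrity half of the check: scan child rows for foreign keys with no matching parent. The in-memory tables are hypothetical stand-ins; a real utility scans the underlying data sets, ideally without taking them offline.

```python
# Sketch: find child rows whose foreign key has no matching parent.
parents = {101, 102, 103}                    # e.g., customer keys
children = [(1, 101), (2, 104), (3, 102)]    # (order id, customer key)

orphans = [(oid, fk) for oid, fk in children if fk not in parents]
print("orphaned rows:", orphans)             # [(2, 104)]
```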

As DBMS architecture evolves to improve transaction throughput, the importance of caching data grows. Native database utilities rely on database buffers, which can negatively impact transaction performance due to contention. This is why some high-speed utilities make use of resources outside the DBMS to increase performance. However, be aware of utility options that provide a small performance gain at the expense of data integrity. The best solution provides data integrity, along with high performance and non-disruptive operation. 

Conclusion

Organizations have to find a balance between the seemingly incompatible needs for 24x365 uptime and periodic maintenance. A poorly maintained database is a business inhibitor and will be nearly impossible to restore in the event of a crisis. There are alternatives to the native database utilities that can deliver maintenance and backup functionality while providing continuous availability of the database and associated applications. In many instances, critical applications directly affect revenue. Thus, IT should implement a maintenance and backup strategy that provides optimum availability. 

Prudent DBA practices and 24x365 availability need not be mutually exclusive. It just takes the right tools and a little planning.

From Database Trends, February 2001.

© 2001 Craig S. Mullins. All rights reserved.