Quality busters: Customizing applications

Approaches to managing application configuration

Level: Introductory

Michael Russell (MikeRussell@VickiFox.com), Application Architect, Vicki Fox Productions

08 Sep 2004

To customize applications and program products for a specific operational environment, you must modify one or more configuration objects. These configuration objects can take many forms, such as text files, XML files, system registries, or a separate service. Managing the operational environment becomes more complex as the number of configuration objects increases.

I continue the Quality busters series, which looks at common influences on application quality from the enterprise view of the operational environment and non-functional requirements. Addressing these influences is a matter of making tradeoffs, with no single solution solving all the problems. This month I'll discuss the complexities of application configuration management.

Put what value where?

The operations team notifies the SHEEP Web application team that they are moving several servers to a new data center. Part of this move requires changing the host name for two systems. In keeping with data center naming standards, the WebSphere® MQ queue manager name on these systems will also change. The SHEEP team replies that the change should be easy; they just have to update the configuration objects on the two systems.

Plans are made, tasks assigned, and the day of the system move arrives. The SHEEP team member assisting with the move of the systems updates the known configuration files. The day after the move, the VP of Finance complains that his sales status report from the Data Warehouse system is not updating.

Operations, the SHEEP team, and the Data Warehouse (DW) team research the cause. They discover that one of the DW programmers took advantage of the SHEEP application's sales status request and reply messages. Instead of using the SHEEP application's configuration objects, the DW programmer created his own configuration object. The teams overlooked this configuration object during the system move. Once the DW programmer updates the configuration to reflect the new location and name of the request queue, the DW system updates again.



General attributes

You can customize nearly every application by setting one or more configuration values. Configuration objects store these configuration values. Configuration values identify the system environment to the program; for example, queue and queue manager names, remote system identity, user logins and passwords, locale settings, timeout intervals, and more. Configuration values also identify user settings; for example, which features are enabled or disabled, default values for screen or processing elements, default user identification, personalization, and more.



Formats

You'll find configuration objects in many formats. The following are the more popular ones.

Text file

The classic configuration object is a text file that contains key-value information. Usually, each line in the text file corresponds to one key-value pair. The key appears on the left, followed by a separator symbol (commonly an equals sign, a colon, or a space), and then the value. Sometimes a special separator line, often called a section heading, is included.

Examples of this format include the INI file found in DOS/Windows and the properties file (java.util.Properties class) in Java.


Listing 1. Example of key-value format
# Sample configuration file lines
server.queueManagerName = SHEEPQM1
server.requestQueueName = RQSTQ
server.queueTimeout = 1000

Advantages:

  • Many parsers or access modules are already available.
  • Very low overhead is associated with opening and reading the text file.
  • The information is human-readable.
  • You can edit the contents with simple text editors.

Disadvantages:

  • Hierarchical information is difficult to store.
  • Repeating information groups are difficult to store.
  • Editing can be error prone whenever you have a large number of key-value pairs.
  • Data is stored in a string representation, which requires conversion to integer or other binary representations.
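
The conversion step in the last disadvantage can be sketched as follows. This is a minimal illustration in Python, not a replacement for an existing parser such as java.util.Properties; the function name parse_key_value is hypothetical, and the sketch assumes an equals sign as the only separator and '#' as the comment marker, as in Listing 1.

```python
def parse_key_value(text):
    """Parse 'key = value' lines, skipping blank lines and '#' comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"missing '=' separator: {line!r}")
        values[key.strip()] = value.strip()
    return values


# The lines from Listing 1.
SAMPLE = """\
# Sample configuration file lines
server.queueManagerName = SHEEPQM1
server.requestQueueName = RQSTQ
server.queueTimeout = 1000
"""

config = parse_key_value(SAMPLE)
# Every value arrives as a string; the program converts what it needs.
timeout = int(config["server.queueTimeout"])
```

Note that the parser itself never learns that the timeout is a number. That knowledge lives only in the program that calls int().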

XML file

The XML format is growing in popularity, although XML has no standard configuration file format. It seems every application does it differently. Some applications use element attributes, while others use only element tags (see the example in Listing 2). You can identify each repeating group either with id or name attributes on a commonly named element tag or with a uniquely named element tag.


Listing 2. Example of the XML format
<!-- Sample configuration file in XML -->
<config>
    <server>
        <queueManagerName>SHEEPQM1</queueManagerName>
        <requestQueueName>RQSTQ</requestQueueName>
        <queueTimeout>1000</queueTimeout>
    </server>
</config>

Advantages:

  • Standard XML parsers are available.
  • Hierarchical information is easy to store.
  • Repeating information groups are easy to store.
  • The information is human-readable.
  • You can edit the contents with text editors and XML editors.

Disadvantages:

  • Increased overhead is associated with opening and parsing the XML document.
  • Information is difficult to read when a large number of elements are present.
  • Data is stored in a string representation, which requires conversion to integer or other binary representations.
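
As a rough sketch of reading such a file, the following Python code parses the document from Listing 2 with the standard library's xml.etree.ElementTree module. The function name load_server_config is hypothetical, and the element names are simply those used in Listing 2; a real application would have its own layout.

```python
import xml.etree.ElementTree as ET

# The document from Listing 2.
SAMPLE_XML = """\
<!-- Sample configuration file in XML -->
<config>
    <server>
        <queueManagerName>SHEEPQM1</queueManagerName>
        <requestQueueName>RQSTQ</requestQueueName>
        <queueTimeout>1000</queueTimeout>
    </server>
</config>
"""


def load_server_config(xml_text):
    """Return the <server> settings as a dict, converting the timeout."""
    root = ET.fromstring(xml_text)
    server = root.find("server")
    if server is None:
        raise ValueError("no <server> element in configuration")
    return {
        "queueManagerName": server.findtext("queueManagerName"),
        "requestQueueName": server.findtext("requestQueueName"),
        # Element text is always a string, so numbers still need converting.
        "queueTimeout": int(server.findtext("queueTimeout")),
    }


server_config = load_server_config(SAMPLE_XML)
```

The hierarchy comes for free from the parser, but the string-to-integer conversion is still the program's job.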

Registry

A registry is a special index object, usually in binary format, that efficiently stores configuration information in a hierarchical structure. Microsoft Windows, for example, implements a system registry.

Advantages:

  • You access the registry from a simple, consistent API. This API hides the location of the registry from the application program.
  • Hierarchical information is easy to store.
  • Repeating information groups are easy to store.
  • Data is stored in a data-type representation more appropriate for the value's usage.
  • All applications running on the system access a single location.

Disadvantages:

  • The information is not human-readable.
  • The binary format requires special editing programs, preferably tailored to the application, to view and edit the configuration values.
  • The configuration values for an application are difficult to extract and store in a form that can be saved with the application for backup and recovery.

Directory service

A directory service is a set of programs and processes that provides directory lookup services. An application sends a request (through a message or remote procedure call) to the directory service, which sends a reply. The directory service can store key-value pairs in a hierarchical structure. An example is an X.500 directory service accessed through LDAP (Lightweight Directory Access Protocol).

Advantages:

  • Directory services are separated from the application and can reside on the same or separate computer systems.
  • A consistent API is available to access the service.
  • Hierarchical information is easy to store.
  • By being system-independent, a single service can be the repository of shared configuration information for many applications running on many computer systems.

Disadvantages:

  • The information is not human-readable.
  • The service requires special edit programs to view and edit the configuration values.
  • Reliability concerns arise if the directory service is not accessible due to the service not running, a broken communications connection, or something else.
  • The configuration values for an application are difficult to extract in a form that can be used for application backup and recovery.
  • A local configuration object still must store the naming and routing information needed by the application to identify the directory service.

Preferences

The Java 2 SDK, Standard Edition, Version 1.4, introduces a new class called Preferences (java.util.prefs.Preferences). (See Resources.) The standard allows Preferences to be stored in an implementation-dependent back-end, which could be a file, an LDAP directory server, the Windows Registry, or some other storage mechanism.

Advantages and disadvantages:

  • The advantages are those of the implementation-dependent back-end, which is one of the formats previously listed.
  • The disadvantages are, likewise, those of the implementation-dependent back-end.

Database

You might store configuration information in a database table. One approach is a table with a separate column for each configuration element and a single row in the table. Reading this row retrieves all the configuration information at once. Another approach is a table with two columns -- a key column and a value column. Each key-value pair forms a row.


Listing 3. Example of SQL configuration table
CREATE TABLE the_config (
    queue_manager_name       VARCHAR(32)
                                NOT NULL DEFAULT('SHEEPQM1'),
    request_queue_name       VARCHAR(32)
                                NOT NULL DEFAULT('RQSTQ'),
    queue_timeout            INTEGER
                                NOT NULL DEFAULT(1000)
);

Advantages:

  • Database methods, such as SQL over JDBC, can access the data.
  • Parsing of values is unnecessary since information is stored in a more appropriate data representation.
  • Many applications running on many computer systems can easily access configuration information.

Disadvantages:

  • The information is not directly human-readable.
  • The format requires special database query tools or custom edit programs to view and edit the values.
  • Reliability concerns arise if the database is not accessible.
  • If a schema stores the configuration data separately from the application data, application configuration values might be difficult to extract and save for backup and recovery purposes.
  • A local configuration object must store database access information.
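
Listing 3 shows the single-row approach. The two-column approach might look like the following sketch, which uses Python's built-in sqlite3 module with an in-memory database purely as a stand-in for whatever database the application really uses. The table name app_config, its column names, and the helper functions are all hypothetical.

```python
import sqlite3


def create_config_table(conn):
    """Create a two-column key-value table and load sample rows."""
    conn.execute(
        "CREATE TABLE app_config ("
        " config_key TEXT PRIMARY KEY,"
        " config_value TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO app_config (config_key, config_value) VALUES (?, ?)",
        [
            ("queue_manager_name", "SHEEPQM1"),
            ("request_queue_name", "RQSTQ"),
            ("queue_timeout", "1000"),
        ],
    )
    conn.commit()


def get_config_value(conn, key, default=None):
    """Return the value stored for key, or default when no row exists."""
    row = conn.execute(
        "SELECT config_value FROM app_config WHERE config_key = ?", (key,)
    ).fetchone()
    return row[0] if row is not None else default


conn = sqlite3.connect(":memory:")
create_config_table(conn)
```

One trade-off is worth noting: because every value shares a single column, the key-value layout stores values as strings and gives up the typed-column advantage of the single-row layout in Listing 3.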

Environment variables

Most operating systems provide support for environment variables or system variables. Each process, when it starts, is loaded with a copy of the system-level environment variables. The process can then change the value of these variables or define additional environment variables. A program can retrieve the value of these environment variables. As a result, environment variables provide a facility for process level management of configuration information.


Listing 4. Example of DOS script with environment variables
set QUEUE_MANAGER_NAME=SHEEPQM4
echo %QUEUE_MANAGER_NAME%
myApplication.exe

Advantages:

  • You can define environment variables at the process level.
  • A parent program in the process can change the environment variable, thus affecting a child program that starts afterwards.
  • The information is generally human-readable since the setting of environment variables often occurs within runtime scripts.

Disadvantages:

  • The same environment variable assignments are often repeated across many runtime scripts. This repetition creates a maintenance problem, because you must find and update every copy when a value changes.
  • Diagnosing a problem becomes harder if another task in a process changes configuration information at runtime.
  • Changing a value usually requires a programmer, since these variables are set in locations that users typically cannot access.
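
The parent-to-child behavior described above can be sketched in Python as follows. The helper name run_child_with is hypothetical; the child here is just a second Python interpreter that prints the variable it received, standing in for a program such as myApplication.exe.

```python
import os
import subprocess
import sys


def run_child_with(name, value):
    """Start a child process with one environment variable overridden.

    Returns what the child saw for that variable.
    """
    env = dict(os.environ)   # copy of this process's environment
    env[name] = value        # change only the copy handed to the child
    child_code = f"import os; print(os.environ.get({name!r}, ''))"
    result = subprocess.run(
        [sys.executable, "-c", child_code],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


seen_by_child = run_child_with("QUEUE_MANAGER_NAME", "SHEEPQM4")
```

Because only a copy of the environment is modified, the parent's own variables are left untouched, which is what makes this a process-level mechanism.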

Command-line parameters

Finally, some configuration information can pass to the program through command-line parameters. Command-line parameters can select the configuration object to use, specify the method for finding it, or override specific values found in other configuration objects. Command-line parameters provide a facility for program-level management of configuration information.


Listing 5. Example of command-line parameters
myApplication.exe -qm:SHEEPQM4

Advantages:

  • Command-line parameters are defined at the program level.
  • Programmers can easily force overrides to the default sources of configuration values.
  • Programming is relatively easy since command-line parameters have no external references to files or services.

Disadvantages:

  • Maintenance efforts increase because you must search all runtime scripts to find parameter usage.
  • Dynamically computed parameters can create diagnostic issues.
  • Accessibility diminishes because the values are stored in sources that only programmers can modify.
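
A small sketch of parameter-based overrides, using Python's standard argparse module, is shown below. The option names --qm and --timeout are assumptions chosen for this example; argparse does not accept the colon syntax of Listing 5 (-qm:SHEEPQM4), so the value is passed as a separate argument instead.

```python
import argparse


def parse_overrides(argv):
    """Return only the settings that were given on the command line."""
    parser = argparse.ArgumentParser(prog="myApplication")
    parser.add_argument("--qm", dest="queueManagerName", default=None,
                        help="override the queue manager name")
    parser.add_argument("--timeout", dest="queueTimeout", type=int,
                        default=None, help="override the queue timeout")
    args = parser.parse_args(argv)
    # Leave out options that were not supplied so they do not mask
    # values from other configuration objects.
    return {key: value for key, value in vars(args).items()
            if value is not None}


overrides = parse_overrides(["--qm", "SHEEPQM4"])
```

Returning only the supplied options keeps the override explicit: anything absent from the command line falls through to the other configuration sources.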

A combination of formats

Using a combination of formats is usually a good idea. When you add the ability to override configuration values, the programmer can selectively test pieces of the program without the worry of managing configuration objects -- which might be shared by other users and developers.

A common combination approach goes like this:

  • Use a search path, such as the classpath in Java environments, to search for the configuration object. If no object is found, then attempt reading from a default location.
  • Override the configuration values with the values of any matching environment variables.
  • Override the resulting values with the values of any command-line parameters.
  • Log the final configuration values to assist with diagnosing program problems.
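
The layering in this list can be sketched as a single merge function. The Python code below is an illustration under stated assumptions: the function name resolve_config, the logger name, and the mapping from configuration keys to environment variable names are all hypothetical, and the values read from the configuration object are passed in as a plain dictionary.

```python
import logging

logger = logging.getLogger("config")


def resolve_config(file_values, environ, cli_values, env_names):
    """Merge settings so that file < environment < command line.

    env_names maps a configuration key to the environment variable
    that may override it.
    """
    final = dict(file_values)
    for key, env_name in env_names.items():
        if env_name in environ:
            final[key] = environ[env_name]
    final.update(cli_values)
    # Log the outcome so a later diagnosis can see what was really used.
    for key in sorted(final):
        logger.info("config %s = %s", key, final[key])
    return final


final = resolve_config(
    file_values={"queueManagerName": "SHEEPQM1", "queueTimeout": "1000"},
    environ={"QUEUE_MANAGER_NAME": "SHEEPQM4"},
    cli_values={"queueTimeout": "2500"},
    env_names={"queueManagerName": "QUEUE_MANAGER_NAME"},
)
```

In a real program, environ would typically be os.environ and cli_values the parsed command-line parameters. Values with security implications, such as passwords, probably should not be written to the log in plain text.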


Data representation

Configuration objects store the key-value information in one of two possible data representations: string or data type-enabled.

String

The key-value text file, XML file, environment variable, and command-line parameter formats store values in a string representation. The program that uses the value must convert it from the string representation to the desired internal representation. While the string representation makes it easy to edit the configuration object, it also makes it easy to enter incorrect values. For example, a user might type the letter 'O' instead of the digit '0', and a text editor cannot detect the mistake.

Data type-enabled

The registry, preferences, and database formats store values in data type-specific representations. For example, numbers are stored in a numeric format, usually integer. This reduces the need for the program to convert values from one representation to another. It also reduces the likelihood of entering incorrectly formed values. However, with this representation, you need special programs to edit the configuration object.



Location

A major decision regarding configuration objects is where to put them. You have several options:

  • Script
  • Program directory
  • Fixed directory
  • Search path
  • Separate service

Real-world applications often use a combination of locations involving many configuration objects.

Script

For configuration information that is very program instance-specific, you might find it advantageous to put some information into environment variables or command-line parameters within the script that launches the program. This approach is rarely used because you must search all scripts to find out whether a changing configuration item is referenced within a script.

Program directory

You might place a configuration object in the same directory where the program itself resides. Finding the configuration object is easier since the program can determine where it resides and simply check that directory. This approach has limited ability to share configuration information. Only programs in the same directory can share the configuration object; programs in another directory are not able to find it.

Fixed directory

On systems such as UNIX, Windows, or OS/400, with a well-known and stable directory structure, you might place configuration objects in a well-known fixed directory, such as the root directory or the QGPL library. All programs on the system can access this fixed directory. As a result, many programs and applications can share the configuration object. In most business environments, however, the technical support team does not permit the addition of user objects to these fixed directories, so this approach is often discouraged because of operational support and security concerns.

Search path

Most systems provide a search path capability, such as the UNIX PATH environment variable or the Java CLASSPATH variable. By checking each directory in the search path for the configuration file, the program gains flexibility in where the file can reside. This approach also supports testing better, because a tester can put a tailored configuration object earlier in the search path. This benefit, however, is also its weakness. If, during operation, an incorrect version of the configuration object is inserted earlier in the search path, the program will likely behave differently than expected, and the cause can be difficult to diagnose.
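
A first-match search of this kind might be sketched as follows. The function name find_config and the optional default_location parameter are hypothetical; the search path is passed in as an ordered list of directories rather than read from any particular environment variable.

```python
import os


def find_config(filename, search_path, default_location=None):
    """Return the first existing copy of filename on the search path.

    Directories are checked in order, so an earlier directory shadows a
    later one. Falls back to default_location, then to None.
    """
    for directory in search_path:
        candidate = os.path.join(directory, filename)
        if os.path.isfile(candidate):
            return candidate
    if default_location is not None and os.path.isfile(default_location):
        return default_location
    return None
```

Because the first match wins, logging the path that find_config returns is a cheap way to catch the shadowing problem described above.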

Separate service

Finally, using approaches such as the registry, directory service, or database, you can separate the configuration object altogether from the application. The configuration information might even reside on a different computer system. However, as mentioned before, this approach requires a small local configuration object that identifies how to access the configuration service. Also, this approach has a reliability concern if the service becomes inaccessible.



Scope

A consideration similar to location is the scope of the configuration object; that is, how many program components will use it. Scope includes several levels -- program, process, application, system, or enterprise. Real-world applications often combine scope levels involving many configuration objects. These levels are as follows:

  • Program. The configuration information is applicable to a specific instance of a program. A session identifier is one example.
  • Process. The configuration information is applicable to all threads, units of execution, and program modules that operate within the life of a process. The name of a response queue associated with the process is an example.
  • Application. The configuration information is applicable to all programs that comprise an application. The database connection details are an example.
  • System. The configuration information is applicable to all programs, independent of the owning application, that reside on a computer system. The computer name and operations notification console are examples.
  • Enterprise. The configuration information is applicable to all computer systems within the enterprise. The names of the enterprise domain name servers (DNS) are an example.


Retrieval frequency

Another important consideration regarding configuration objects is how often a program retrieves values from the object. Your decision will be influenced by how often configuration values might change and by the business rules regarding how up-to-date the program must be. The more frequently a program retrieves configuration values, the more overhead the program will have. If this is the case, the architect should choose a configuration object that has lower overhead associated with value retrieval. The following are commonly encountered retrieval frequencies:

  • Program startup. The program reads the configuration object once, when the program starts. Any changes to the configuration object are ignored until the program is restarted.
  • Periodic refresh. The program re-reads the configuration object on a periodic basis. Any changes are detected at the next scheduled refresh from the configuration object.
  • Triggered refresh. A trigger in the program can force a re-read of the configuration object. The trigger might be a signal, a special message, a detected change in the configuration object's modify date, or some other event.
  • Transactional. The program re-reads the configuration object for each transaction. With this approach, the program guarantees it is using the latest configuration value.
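
One way to approximate a triggered refresh is to compare the configuration file's modification time on each access and re-read the file only when it has changed. The Python sketch below assumes a file-based configuration object; the class name ReloadingConfig, the helper parse_pairs, and the reload_count attribute are hypothetical names introduced for this illustration.

```python
import os


def parse_pairs(text):
    """Parse 'key = value' lines, ignoring comments and malformed lines."""
    pairs = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep and not line.lstrip().startswith("#"):
            pairs[key.strip()] = value.strip()
    return pairs


class ReloadingConfig:
    """Re-read a configuration file when its modification time changes."""

    def __init__(self, path, parse=parse_pairs):
        self._path = path
        self._parse = parse
        self._mtime = None
        self._values = {}
        self.reload_count = 0  # exposed only to make the behavior visible

    def get(self, key, default=None):
        self._refresh_if_changed()
        return self._values.get(key, default)

    def _refresh_if_changed(self):
        mtime = os.stat(self._path).st_mtime_ns
        if mtime != self._mtime:
            with open(self._path, encoding="utf-8") as handle:
                self._values = self._parse(handle.read())
            self._mtime = mtime
            self.reload_count += 1
```

Checking the modification time costs one file-system call per access, which is usually far cheaper than the full re-read of the transactional approach, though it will miss a change that leaves the timestamp unaltered.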


Maintenance

Finally, you must make a decision about who will maintain the configuration information -- developers, the operations department, or the users. In reality, you often use a combination of all three. Developers might maintain configuration values that support the ability to diagnose the programs. Operations might maintain configuration values that represent the system infrastructure and runtime environment. Finally, users might maintain personalization, locale, and other usage-oriented configuration values.

A tailored configuration edit program generally benefits operations and user personnel. The tailored program can ensure that configuration values are correct and meaningful before they are saved to the configuration object. This edit program might be part of the application program itself, similar to the Tools -> Options dialog in many Windows-based applications.
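
The kind of check such an edit program might run before saving can be sketched as follows. The function name validate_config and the two rules (a required queue manager name and a positive whole-number timeout) are assumptions made for this example, using the setting names from the earlier listings.

```python
def validate_config(values):
    """Return a list of problems; an empty list means the values may be saved."""
    problems = []

    if not values.get("queueManagerName", "").strip():
        problems.append("queueManagerName is required")

    timeout = values.get("queueTimeout", "")
    try:
        if int(timeout) <= 0:
            problems.append("queueTimeout must be greater than zero")
    except (TypeError, ValueError):
        # Catches entries such as '1OOO' typed with the letter O.
        problems.append(f"queueTimeout is not a whole number: {timeout!r}")

    return problems


good = validate_config({"queueManagerName": "SHEEPQM1", "queueTimeout": "1000"})
bad = validate_config({"queueManagerName": "SHEEPQM1", "queueTimeout": "1OOO"})
```

Collecting every problem in a list, rather than stopping at the first one, lets the edit program show the user all the corrections needed in a single pass.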



Considerations

With so many different approaches and formats associated with configuration objects, how do you, as an architect, decide what to use?

As this series will show, it is all a matter of making trade-offs. The discussion above showed some of the advantages and disadvantages associated with each approach and format. In the end, you might end up using several approaches.

The architect might ask some of these questions before deciding which approach to take:

  • What types of configuration items do you need to store in a configuration object?
  • Can you dynamically compute the configuration item?
  • What is the set of valid values for each configuration item?
  • Is a default value associated with the configuration item, or is a user-specified value required?
  • Where is enforcement of the configuration item values located -- in the program after reading the value, or in an edit function that updates the value?
  • What is the scope of the configuration item?
  • What is the appropriate location, format, and retrieval frequency for the configuration item?
  • Who maintains the configuration item -- a programmer, operations team, or end-user?
  • Can existing tools (such as a text editor) edit the configuration object, or must you create a custom editor?
  • If you create a custom editor, can you integrate it into the application, or will it be a stand-alone program?
  • Is the configuration item subject to security concerns? For example, storing a database access password in a text file might not be acceptable to security audits. To meet any security requirements, how will you store the configuration item -- plain text, encrypted, or in a secured object?
  • Must you synchronize the configuration item across several systems?
  • How often will the configuration item be read from the configuration object?
  • How should the program behave when a configuration item is not found?
  • How should it behave when a configuration object is corrupted?
  • Should the application repeat configuration information from another application or use the other application's configuration object directly? Repeating the information can result in replication and synchronization issues. Reusing the other application's configuration object can result in tight coupling and dependency issues.


In summary

In this column, I presented many approaches to storing and managing configuration (or customization) information, along with the advantages and disadvantages associated with each approach. This guide is meant to be representative, not exhaustive. The goal is to challenge you, as an architect, to think about the effects on the operational environment and non-functional requirements brought about by your chosen approach.



Resources

  • Read the author's other articles in the Quality busters series on developerWorks.

  • If you work with Windows-based initialization (INI) files, try the GetPrivateProfileString(...) and GetProfileString(...) functions in the Windows SDK.

  • If you work with the System Registry, try several APIs, such as RegQueryValueEx(...) and RegOpenKeyEx(...) from the Windows SDK.

  • To work with key=value pair text files, use the java.util.Properties class in the Java SDK. In version 1.4, the Java SDK introduces the java.util.prefs package, which includes the java.util.prefs.Preferences class, for working with implementation-dependent configuration objects, such as the System Registry on Windows platforms.

  • Find more information about the LDAP protocol and learn to access directory services with it in this recently updated IBM Redbook, Understanding LDAP - Design and Implementation SG24-4986-01 (June 2004).

  • Get an example of using XML for a configuration file in "Java configuration with XML Schema" by Marcello Vitaletti (developerWorks, November 2001).

About the author

Michael Russell has a bachelor's degree in Physics and a master's degree in Computer Science. He was a logistics engineer, a technical services manager, and a certified IT architect at IBM for nearly 14 years. Michael has experience in Windows, UNIX, and OS/400 environments and is currently a Web application architect for a resort company in Orlando. He uses Web technology for entertainment through his own company, Vicki Fox Productions (http://www.VickiFox.com). He can be reached at MikeRussell@VickiFox.com.



