 | Level: Introductory Bobby Fielding, DB2 Development, IBM Toronto Lab Claire McFeely, DB2 Development, IBM Silicon Valley Lab
27 Feb 2003 Using examples, our SQL experts show you typical uses of identity columns and the recently introduced sequence objects to automatically generate numeric values.
© 2003 International Business Machines Corporation.
All rights reserved.
Introduction IBM® DB2® Universal DatabaseTM (UDB) for Linux, UNIX®, and Windows® Version 8.1 and the recently announced DB2 Universal Database for z/OSTM and OS/390® Version 8 support a rich set of features for automatically generating sequences of numeric values. These most recent releases of DB2 UDB introduce database sequence objects, building upon the previous introduction of identity columns. In this article, we briefly discuss typical uses of identity columns and sequences and then present some examples demonstrating them and comparing their features. | DB2 UDB for Linux, UNIX and Windows | DB2 UDB for z/OS and OS/390 | DB2 UDB for iSeries | | Identity columns | V7 | V6 | V5R2 | | Sequence objects | V7.2 | V8 (in beta at the time this article was written) | Planned for a future release | | International SQL Standard: Both identity columns and sequence objects are currently proposed to be in the next version of the international SQL standard. The support implemented in DB2 closely matches what has been proposed and is in the current draft version of the international SQL standard. |
Two ways to generate a sequence of numbers
A sequence of numbers can be automatically generated in DB2 in two ways:
The IDENTITY column When a column of a table is defined with the IDENTITY attribute, a numeric value is automatically generated for that column whenever a row is inserted into the table.
The SEQUENCE object The second way to have DB2 automatically generate a sequence of numbers is to create a SEQUENCE object. You can refer to a sequence object using a sequence expression. A sequence expression can appear most places that an expression can appear. A sequence expression can specify whether the value to be returned is a newly generated value, or the previously generated value.
If the sequence reference is for the next value, a numeric value is automatically generated for the sequence and returned as the result for the sequence expression. For example, if we assume that a sequence named orders_seq has been created, this sequence expression returns the next value generated for the sequence:
NEXT VALUE FOR orders_seq
|
If the sequence reference is for the previous value, the numeric value that was generated in the previous SQL statement for the sequence is returned as the result for the sequence expression. This sequence expression returns the previous value that was generated by the sequence:
PREVIOUS VALUE FOR orders_seq
|
Note that when DB2 UDB introduced sequences, the non-SQL-standard syntax of NEXTVAL was supported in place of NEXT VALUE, and PREVVAL was supported in place of PREVIOUS VALUE. These variations continue to be supported.
Choosing between identities and sequences
Although both identities and sequences are used for generating numeric values, you might choose to use one instead of another depending on the particular situation.
Identity columns are useful when:
- Only one column in a table requires automatically generated values
- Each row requires a separate value
- An automatic generator is desired for a primary key of a table
- The process of generating a new value is tied closely to inserting into a table, regardless of how the insert happens
Sequence objects are useful when:
- Values generated from one sequence are to be stored in more than one table
- More than one column per table requires automatically generated values (multiple values may be generated for each row using the same sequence or more than one sequence)
- The process of generating a new value is independent of any reference to a table
Unlike sequence objects, identity columns are defined on a table, and so require certain restrictions. Each table can have at most one identity column. When you create a column as an identity column, you must use an exact numeric data type for the column. Because the identity attribute generates a value for the column, which is similar to what the DEFAULT clause does, you cannot specify the DEFAULT clause when defining an identity column. An identity column is implicitly defined to be NOT NULL.
Example 1. Combining customer and supplier tables
To illustrate a typical use of the identity column, consider a table that contains customer order information for a fictional Widget company's database. The company wants the order number to be generated automatically for each row (order) that is inserted into the table.
The DDL for our example
As shown in Listing 1, we use an identity column for the order numbers, and they define the order number column as part of the primary key. Note that the IDENTITY attribute in and of itself does not guarantee that the sequence values generated will be unique. The PRIMARY KEY constraint, however, guarantees uniqueness for the rows in the table. To ensure that only automatically generated values are inserted into the identity column, they specify the GENERATED ALWAYS clause. At the end of each quarter, the Widget company uses the last generated order_id to determine how many orders were taken for that quarter. The options NO CACHE and ORDER make sure that in the event of a system failure, no unused identity values are discarded. The Widget company plans to restart the order number column from 1 again to start a new quarter, using an ALTER TABLE statement.
All attributes of the identity column are explicitly shown in Listing 1, even thought they are set to the values that they would default to if they were not specified. Because default values can vary depending on the vendor implementation, specifying all of the options is good coding practice.
Listing 1. Creating an order table with an IDENTITY column
CREATE TABLE customer_orders_t (
order_id INT NOT NULL
GENERATED ALWAYS
AS IDENTITY
(START WITH 1
INCREMENT BY 1
MINVALUE 1
NO MAXVALUE
NO CYCLE
NO CACHE
ORDER),
order_date DATE NOT NULL,
cust_id INT NOT NULL,
product_id INT NOT NULL,
quantity INT NOT NULL,
price DECIMAL(10,2) NOT NULL,
status CHAR(9) NOT NULL,
PRIMARY KEY (order_date, order_id))
|
Here is an example of an INSERT statement that inserts rows into the table.
INSERT INTO customer_orders_t VALUES
(DEFAULT, CURRENT DATE,
:cid, :pid, :qty, :cost, 'PENDING')
|
The Widget company not only manages customer orders, but also supplier orders. The supplier orders are in a separate supplier order table that is defined very similarly to the customer order table.
CREATE TABLE supplier_orders_t (
order_id INT NOT NULL
GENERATED ALWAYS
AS IDENTITY
(START WITH 1
INCREMENT BY 1
MINVALUE 1
NO MAXVALUE
NO CYCLE
NO CACHE
ORDER),
order_date DATE NOT NULL,
supp_id INT NOT NULL,
product_id INT NOT NULL,
quantity INT NOT NULL,
price DECIMAL(10,2) NOT NULL,
status CHAR(9) NOT NULL,
PRIMARY KEY (order_date, order_id))
|
Here is an example of an INSERT statement that inserts rows into the supplier_orders_t table:
INSERT INTO supplier_orders_t VALUES
(DEFAULT, CURRENT DATE,
:sid, :pid, :qty, :cost, 'PENDING')
|
Combining the tables The company realizes that greater efficiency and synergy can be obtained by combining both the customer order table and the supplier order table into one overall order table for both customers and suppliers. The only thing that is different between a customer order or a supplier order is whether the order is coming in or going out, and this is reflected in the STATUS field. To combine these tables with minimal disruption they plan to phase in such a change over time. The steps in their plan include:
- Synchronize the order numbers being generated for each table so that they would be unique between the tables.
- Wait until all the non-synchronized orders are completed. (Alternatively, they could wait until the beginning of a quarter, when the order numbers are reset.)
- Phase out the supplier orders table and use the customer orders table for managing all orders from customers and with suppliers.
- Clean up.
Step 1. Synchronize order numbers
To synchronize the generated order_id numbers used in both tables, both tables are changed so that values can be supplied by a SEQUENCE object for the identity column instead of always being generated. A single sequence, orders_seq, generates values for the identity columns in both tables. The INSERT statements for each table are changed to explicitly provide a value for the identity columns by referencing orders_seq in a NEXT VALUE expression. The orders_seq sequence is defined as follows:
CREATE SEQUENCE orders_seq AS INT
START WITH 1
INCREMENT BY 1
MINVALUE 1
NO MAXVALUE
NO CYCLE
NO CACHE
ORDER
|
Both the customer order table and supplier order table are altered with the SET GENERATED BY DEFAULT clause to allow insert operations to explicitly provide a value for the identity column. (Note that SET GENERATED BY DEFAULT and SET GENERATED ALWAYS for the ALTER TABLE statement is available on DB2 UDB for iSeries in V5R2, on DB2 UDB for z/OS and OS/390 in V8, and will be available on DB2 UDB for Linux, UNIX and Windows in a future release.)
ALTER TABLE customer_orders_t
ALTER COLUMN order_id SET GENERATED BY DEFAULT
ALTER TABLE supplier_orders_t
ALTER COLUMN order_id SET GENERATED BY DEFAULT
|
LOCK TABLE statements are issued to restrict inserts to the two tables while all INSERT statements for the order tables are modified to supply an explicit value for the order_id column and the orders_seq sequence is started at the appropriate value.
Here is how the INSERT statements for the supplier and customer order tables are changed:
INSERT INTO customer_orders_t VALUES
(NEXT VALUE FOR orders_seq, CURRENT DATE,
:cid, :pid, :qty, :cost, 'PENDING')
INSERT INTO supplier_orders_t VALUES
(NEXT VALUE FOR orders_seq, CURRENT DATE,
:sid, :pid, :qty, :cost, 'PENDING')
|
And now here is how orders_seq sequence is modified to start at the value that is next after the largest value that has been generated by the order_id identity columns of both the customer orders and supplier orders tables. First, a SELECT statement is used to determine this value:
SELECT MAX(c.order_id), MAX(s.order_id)
FROM customer_orders_t c, supplier_orders_t s
|
For example, suppose the above query returned the two values: 42331 and 57231. The orders_seq sequence would then be altered as follows:
ALTER SEQUENCE orders_seq
RESTART WITH 57232
|
A COMMIT statement releases the locks on the tables, and the two order tables are available again for inserts. Instead of each order_id column having values generated independently of each other by the identity attribute, the values inserted are generated from a single sequence, orders_seq, and thus the values will be unique across the two tables.
See Figure 1 for an illustration of Step 1.
Figure 1. order_id sequence values override identity columns

Step 2. Wait until orders are synchronized
The Widget company does not want to wait for the beginning of the quarter when the order IDs are reset; instead, they decide to monitor the status of the orders. When the result of the following query is an empty table, they can move to Step 3:
SELECT order_id
FROM (SELECT order_id FROM customer_orders_t
WHERE status <> 'COMPLETED' AND order_id < 57232) AS x
UNION ALL (SELECT order_id FROM supplier_orders_t
WHERE status <> 'COMPLETED' AND order_id < 57232)
|
See Figure 2 for an illustration of Step 2.
Figure 2. All potential duplicate order_id numbers are completed

Step 3. Phase out supplier orders
To phase out the supplier_orders_t table, the table is briefly made unavailable by renaming it as follows:
RENAME TABLE supplier_orders_t TO supplier_orders_t_old
|
Then a view is created to allow existing references to supplier_orders_t to continue to access the underlying data:
CREATE VIEW supplier_orders_t
(order_id, order_date, supp_id, product_id, quantity,
price, status)
AS SELECT
order_id, order_date, cust_id, product_id, quantity,
price, status
FROM customer_orders_t
|
Now all active orders for both traditional customers and suppliers are managed in the customer_orders_t table. To make things easier and more straightforward for future maintenance on these tables, a bit of cleanup work is also done in Step 4.
See Figure 3 for an illustration of Step 3.
Figure 3. All new orders go in customer_orders_t

Step 4. Clean up
Because order numbers must now only be generated for a single column (order_id), the values can be generated by the identity column rather than using a separate sequence object. Again, the table is briefly made unavailable and the identity value is reset to the next value that would be generated by the sequence.
LOCK TABLE customer_orders_t IN EXCLUSIVE MODE
VALUES NEXT VALUE FOR orders_seq INTO :nextorder
|
Assume, for example, the above query returned the value: 64243. The customer_orders_t table can then be altered as follows:
ALTER TABLE customer_orders_t
ALTER COLUMN order_id
SET GENERATED ALWAYS
RESTART WITH 64243
|
Again, each instance where an INSERT statement uses the orders_seq sequence needs to be changed, to go back to using DEFAULT, as was shown previously in Step 1. Now the sequence can be dropped:
DROP SEQUENCE orders_seq RESTRICT
|
After the data in the old supplier orders table is archived, this table can then be dropped, too.
See Figure 4 for an illustration of Step 4.
Figure 4. Use the identity column again to generate values, starting with 64243

Tuning performance
Now that the customer_orders_t table is being used for almost twice as much activity, the Widget company decides to place this table in a parallel environment. In order to take advantage of the parallel inserts that can now be done, they decide to tune the performance of the identity column for this table by caching values. They determine that a cache size of 50 would be appropriate for the number of orders that get created per hour and for how often the database system is restarted for any reason. They also changed how they calculate how many orders are actually created in a quarter and so that there is no reason to force the sequence values to be generated in order. The main requirement for unique values to be generated during the period of one quarter continues to be met and so the following adjustment is made to enhance the performance of the sequence generation for the identity column:
ALTER TABLE customer_orders_t
ALTER COLUMN order_id
SET CACHE 50
SET NO ORDER
|
Example 2. Collecting satellite readings
Sequences allow unique values to be generated across more than one table, as we showed in Example 1. Sequences also allow more than one column in a table to have its values automatically generated. For this example, consider an orbiting satellite around a planet or moon. This particular satellite, SAT1, is designed to take data readings at 16 different points along its orbit. The table that collects this data will have three columns where the values are automatically generated: one column will use the identity attribute to generate a reading ID, and two others will obtain their values from sequences. See Figure 5 for an illustration.
Figure 5. A satellite and the 16 data reading points along its orbit

The DDL for our example
Because the number of readings is likely to be quite large, the identity column uses the DECIMAL(31) data type.
CREATE TABLE SAT1_readings (
reading_id DECIMAL(31) NOT NULL PRIMARY KEY
GENERATED ALWAYS AS IDENTITY
(START WITH 1
INCREMENT BY 1
MINVALUE 1
NO MAXVALUE
NO CYCLE
NO CACHE
ORDER),
orbit_location SMALLINT NOT NULL,
horizon_adjustment SMALLINT NOT NULL,
planet_image BLOB(100 M))
|
The orbit_location has a value from between 0 and 15 for the 16 points along the orbit that readings are taken. The following sequence is created to generate a cycling sequence of these 16 values:
CREATE SEQUENCE orbit_location_seq
AS SMALLINT
START WITH 0
INCREMENT BY 1
MINVALUE 0
MAXVALUE 15
CYCLE
NO CACHE
ORDER
|
The horizon_adjustment value indicates where the satellite is in relation to the plane of horizon. A value of 0 means that it is on the plane of horizon, a value of +4 means it is at its maximum point above the plane of horizon and a value of -4 means it is at its maximum value below the plane of horizon. The sequence starts at the value -4, which is neither the minimum value nor the maximum value, because it will be used in an equation in order to calculate the horizon adjustment value. The sequence cycles for each orbit. The following sequence will be used in the calculation of the horizon_adjustment value:
CREATE SEQUENCE horizon_adjustment_seq
AS SMALLINT
START WITH -4
INCREMENT BY 1
MINVALUE -7
MAXVALUE 8
CYCLE
NO CACHE
ORDER
|
Populating the readings table
The following INSERT statement populates the table each time a reading is made:
INSERT INTO SAT1_readings VALUES
(DEFAULT, NEXT VALUE FOR orbit_location_seq,
ABS(NEXT VALUE FOR horizon_adjustment_seq) - 4,
:planet_image)
|
After the first 17 readings, and the corresponding INSERT statements, the values that are automatically generated for the columns reading_id (r_i), orbit_location (o_l) and horizon_adjustment (h_a) are:
| r_i | 1. | 2. | 3. | 4. | 5. | 6. | 7. | 8. | 9. | 10. | 11. | 12. | 13. | 14. | 15. | 16. | 17. | | o_l | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 0 | | h_a | 0 | -1 | -2 | -3 | -4 | -3 | -2 | -1 | 0 | 1 | 2 | 3 | 4 | 3 | 2 | 1 | 0 |
Capturing additional readings
Because the instruments are sensitive enough, readings on the planet can be made before the satellite actually achieves orbit. The range of the instruments allow 10 readings to be made as the satellite approaches the planet, before it settles into its orbit. To capture these additional readings, which are outside of the regular cycle of the orbit, the sequence values are started outside of the range of the MINVALUE and MAXVALUE values that define limits of the cycle.
The following statements alter the identity value and the sequence values to allow the extra 10 readings to be captured before readings are taken at the 16 points along the orbit. The RESTART option either restarts the identity column or sequence at the specified value, or when no value is specified, restarts at the value specified as the START WITH value when the identity column or sequence was created.
ALTER TABLE SAT1_readings
ALTER COLUMN reading_id
RESTART
ALTER SEQUENCE orbit_location_seq
RESTART WITH -10
ALTER SEQUENCE horizon_adjustment_seq
RESTART WITH -14
|
The following table shows what the first 17 readings would be if the above three ALTER statements were performed before the corresponding INSERT statements. The 17 readings include the 10 readings made prior to orbit and the first seven readings made during the orbit:
| r_i | 1. | 2. | 3. | 4. | 5. | 6. | 7. | 8. | 9. | 10. | 11. | 12. | 13. | 14. | 15. | 16. | 17. | | o_l | -10 | -9 | -8 | -7 | -6 | -5 | -4 | -3 | -2 | -1 | 0 | 1 | 2 | 3 | 4 | 5 | 6 | | h_a | 10 | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 | -1 | -2 | -3 | -4 | -3 | -2 |
Conclusion DB2 supports two flexible ways to automatically generate numeric values: identity columns, which are tied closely to a table, and sequence objects, which generate values independent of any table reference. The examples above illustrate how powerful and flexible both identity columns and sequence objects are for automatically generating numeric values. Specifically, the examples demonstrate the following features of identity columns and sequences:
- Using an identity column in a primary key
- Using a sequence object with two tables
- Explicitly specifying a value for an identity column instead of having a value generated
- Starting a sequence of values at a predetermined value
- Restarting an identity column value at a specified value
- Caching identity values to improve performance
- Defining a sequence or identity column to allow the values to cycle
- Specifying a starting value for a sequence that is greater than the minimum value defined for the sequence
- Specifying a starting value for a sequence that is a negative value
- Defining a sequence to generate values that increase and then decrease
- Ensuring that values are returned in the order they are generated
We hope you find sequences and identities useful for your applications.
Resources
About the authors  | 
|  | Bobby Fielding has 14 years of experience in IBM database development. He is an SQL language architect in the DB2 UDB for Linux, UNIX and Windows development team. He is a member of the IBM SQL Language Council, which proposes enhancements to the international SQL standard. |
 | 
|  | Claire McFeely has 14 years of experience in IBM database development. She is an SQL language architect in the DB2 UDB for z/OS development team. She is a member of the IBM SQL Language Council, which proposes enhancements to the international SQL standard. |
Rate this page
|  |