Using the UNION Operator

Selecting Data from Multiple Tables › Using the UNION Operator

Using the UNION Operator

Using the UNION operator derives a result table by combining two other result tables.

The set of rows in the UNION of result tables R1 and R2 is the set of rows in either R1 or R2, with redundant duplicate rows eliminated. Each row of the UNION table is either a row from R1 or a row from R2.

The columns of the result table are not named.

Duplicate Rows

Two rows are duplicates of one another only if each value in the first row is equal to the corresponding value of the second row.

All but one row of each set of duplicates are eliminated by a UNION. The number of rows in the UNION table is the sum of the number of rows in R1 and R2, less the number of duplicates eliminated.

If you specify UNION ALL, duplicate rows are not eliminated.

Rules for Columns

Result tables R1 and R2 must have the same number of columns.

With the exception of column names, the description of the first column of R1 must be identical to the description of the first column of R2, that is to say, the data type and the length must be the same. The description of the second column of R1 must be identical to the description of the second column of R2, and so on.

Example

The following example (see next page) performs a union on the result tables derived from the CUSTOMERS and ORDERS tables. This example is taken from DBCOBSQF, a sample program available on the installation tape.


 Problem

List all customer numbers for customers who have more than $300,000 in
year-to-date sales.
 Solution

               .
               .
       (COBOL statements)
               .
               .
  1   EXEC SQL
  2      DECLARE CUSTORD2 CURSOR FOR
  3          SELECT CUST_NO
  4              FROM CUSTOMERS
  5              WHERE YTD_SALES > 300000
  6          UNION
  7          SELECT CUST_NO
  8              FROM ORDERS
  9              ORDER BY 1
 10   END-EXEC
               .
               .
       (COBOL statements)
               .
               .

Lines 3-5

The first subselect specifies the column to retrieve from the CUSTOMERS table. The search condition in the WHERE clause limits the retrieved rows to those where the value of YTD_SALES is greater than $300,000. You do not have to qualify the CUST_NO column in this SELECT to distinguish it from the CUST_NO column of the ORDERS table because each subselect is evaluated separately. The union is performed after each subselect has been processed.

Line 6

The UNION operator between the two subselect statements means the final result table contains data that is a set formed from the data retrieved by each subselect.

Lines 7-9

The second subselect specifies the column to retrieve from the ORDERS table. No search condition limits the number of rows retrieved by this subselect. The ORDER BY clause specifies that the rows in the result table of this union are to be placed in ascending order according to the value in the first column. This column is referenced by a number because columns do not have names in a result table formed by a union operation.

In the previous example, each subselect has one column and the definition of the column in the first subselect is identical to the definition of the column in the second subselect.

The union of the two result tables would not be possible if one subselect had more columns specified than the other, or the column definitions did not match.

The result table formed by the UNION operation is a set of the data retrieved by each subselect. The columns of the result table are not named.

Sample Output

Following is the report produced by running DBCOBSQF:

----------------------------------------------- CURRENT ORDERS ----------------------------------------------PAGE 1 CUSTOMER NO ---------------------------------------------------------------------------------------------------------------------- 0030 0170 0230 1210 1450 1630 1850 1890 1950 1970 2010 2050 2070 2090 2250 2330 2690 3910 4310 4350 5590 6390 7150 7290 7350 7410 7790 9130 END OF REPORT