The VARCHAR and LONG VARCHAR data types provide for varying-length character strings.
VARCHAR
VARCHAR(n) specifies a varying-length character string of maximum length n. The length may range from 1 to the maximum row size.
The physical size in bytes of a VARCHAR(n) column is:
n + 2 + x
where x=1 if the column allows nulls, else x=0.
The 2 is added because of the two-byte current-length field which precedes the character string. The 1 is added if the column allows nulls because of the one-byte null indicator flag which precedes the length and data fields.
LONG VARCHAR
Specifies a varying-length character string whose maximum length is determined by the amount of space available in a block.
The physical size of a LONG VARCHAR field varies with the block size of the table and the number of LONG VARCHAR fields defined for the table. Assuming:
m = Maximum row size (block size - 14)
i = The sum of the byte counts of all columns in the table that are not LONG VARCHAR
j = the number of LONG VARCHAR columns in the table
k = the number of LONG VARCHAR columns which allow nulls
then the physical size in bytes of a LONG VARCHAR column is:
2 * (INTEGER ((INTEGER ((m - i - k) / j )) / 2))
LONG VARCHAR columns get whatever space is left in the block after all of the other columns are defined. If only one LONG VARCHAR is defined, it gets all of the space left in the block that is not taken up by the other columns. If there is more than one LONG VARCHAR column, they share the space evenly. Defining any LONG VARCHAR columns causes the row size of the table to be equal to the block size (minus block overhead).
Defining a LONG VARCHAR column is exactly equivalent to defining a VARCHAR(n) where n = the size as calculated above. If the block size of the table is later changed, the size of its LONG VARCHAR columns is not recalculated. When modifying a LONG VARCHAR column using ALTER TABLE, the column is treated like any VARCHAR(n) column. The LONG VARCHAR data type is provided to simplify defining a column that is as long as possible to be used for storing text, binary data, and so forth.
Using VARCHAR Columns
Each VARCHAR or LONG VARCHAR column value is made up of two parts, a SMALLINT current-length subfield, which contains the length in bytes of the character string, followed by the character string itself. The length specified when the column is defined is the maximum length; at any given time, a column value may contain a character string whose length is between 0 and the maximum length of the column.
In a COBOL program, a VARCHAR host variable must have the following form:
01 VAR1.
49 VAR1-LEN PIC S9(4) USAGE COMP.
49 VAR1-TEXT PIC X(n).
Where VAR1, VAR1-LEN, and VAR1-TEXT are user-defined names, and n is the maximum length for the VARCHAR column. The group-level may be numbered 01 through 48. The group must contain two elementary items with the level number of 49. The first elementary item must be a SMALLINT integer variable. The second elementary item must have the same description as a fixed-length character string. When referring to the VARCHAR host variable in an embedded SQL statement, the group name is used, for example :VAR1 in the following:
INSERT INTO TABLE1 VALUES (:VAR1, 50)
Before the INSERT statement is executed, the character string to insert must be moved to VAR1-TEXT, and VAR1-LEN must be set to the length of the data in VAR1-TEXT which is to be inserted into the column.
When a VARCHAR column value is retrieved, CA Datacom/DB fills in the length as well as the text. For example, after the following statement is executed:
FETCH CURSOR1 INTO :VAR1
VAR1-LEN is set to the length of the character data returned in VAR1-TEXT.
Null indicators work as they do with any other data type. For instance, to insert a NULL value into a column, define a null indicator variable, set it to -1, and specify the null indicator variable and the variable for the column value:
01 VAR1.
49 VAR1-LEN PIC S9(4) USAGE COMP.
49 VAR1-TEXT PIC X(n).
01 VAR1-NI PIC S9(4) USAGE COMP.
SET VAR1-NI = TO -1.
INSERT INTO TABLE1 VALUES (:VAR1:VAR1-NI, 500)
In this case, neither VAR1-TEXT nor VAR1-LEN need be set to any particular value, because the null indicator is checked first. Similarly, when retrieving a VARCHAR value that might be null, provide a null indicator value and check the null indicator first. If the null indicator indicates that the value is null, the values in VAR1-LEN and VAR1-TEXT are undefined.
FETCH CURSOR1 INTO :VAR1:VAR1-NI
IF VAR1-NI = -1 THEN
PERFORM 'STRING-OUT' USING 4 'NULL'
ELSE
PERFORM 'STRING-OUT' USING VAR1-LEN VAR1-TEXT
END-IF
Various Rules Related to VARCHAR
In general, you can use a varying-length character string column or variable anywhere a fixed-length character string or column could be used.
When a string of length n is assigned to a varying-length string variable with a maximum length greater than n, the characters after the nth character of the variable are undefined and may or may not be set to blanks.
Varying-length strings that differ only in the number of trailing blanks are considered equal.
In CA Datacom/DB, character string constants are considered to be fixed-length character strings and are limited to 32720 characters.
MIXED Data
Three semantic types are allowed on CHAR, VARCHAR, and LONG VARCHAR columns: FOR MIXED DATA, FOR SBCS DATA, and FOR BIT DATA.
FOR MIXED DATA means that Double-Byte Character Set (DBCS) characters are allowed in values stored in the column, in addition to EBCDIC (Single-Byte Character Set (SBCS)) characters. This is relevant when SQL is processing a value in the column, since it must recognize the Shift-Out and Shift-In characters that delimit DBCS substrings. FOR SBCS DATA means that DBCS characters are not used in the column. FOR BIT DATA means that the data is a string of binary data rather than a string of characters.
If a semantic type is not specified when the column is created, the default is the semantic type that was specified on the CXXMAINT OPTION=ALTER,DBCS=xxx option. If xxx was IS or FS, the default is FOR SBCS DATA. If xxx was IM or FM, the default is FOR MIXED DATA.
Default values for MIXED columns may include DBCS characters (each sequence of DBCS characters must be delimited by Shift characters).
Note: SQL does not check to make sure Shift characters are not used in SBCS strings, and there is no validation of DBCS characters. X'42' is used as the first byte of any (non-blank) DBCS character generated by SQL.
|
Copyright © 2014 CA.
All rights reserved.
|
|