Multiple Byte Characters

Multiple byte characters can be used both on the display device and in the screen manager program data areas. When the screen manager data areas contain fill characters, punctuation characters, currency symbols, or strings, the fields are defined large enough to contain multiple byte characters. Nothing in these data areas explicitly indicates whether a single byte or a multiple byte value is stored; the width of the value is determined implicitly. Because no byte within a multiple byte string can have a value less than or equal to a single byte space (with the exception of a double byte space, which is x'4040'), the runtime can determine whether a character in a data area is a 1, 2, 3, or 4 byte value.

Multiple byte character strings are designated as such in the appropriate entry in the attribute table. All data within a multiple byte character string is assumed to consist of multiple byte characters. No Shift Out (SO) or Shift In (SI) control codes are used. When the data is presented to the user, the user is prevented from entering any single byte characters. To block the entry of single byte characters, the field must be created with the Input Control attribute set to disallow generation of SO/SI control codes.

When the field contains characters that occupy less space than is provided, the characters are left aligned and the remainder of the field is padded with spaces. For example, if the fill character is defined as a PIC X(4) area in COBOL, and the fill character is a single byte asterisk (x'5C'), then the field is actually stored as x'5C404040'. Because the second byte is x'40', it cannot legally be part of the character, and x'5C' is taken to be the fill character. If, on the other hand, the fill character is a double byte '?' (x'426F'), then the fill character field is stored as x'426F4040'. Again, the first occurrence of a space (x'40') marks the end of the character sequence, which is x'426F'. The same approach works for ASCII encodings, except that an ASCII space is x'20' rather than x'40' as in EBCDIC.
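The space-terminated rule described above can be sketched in Python as follows. This is only an illustrative sketch, not the runtime's actual code: the function name `leading_character` and its `space` parameter are hypothetical, and the handling of a field that begins with a space (returned as a single byte, so a double byte space x'4040' is not distinguished from padding) is an assumption made to keep the example short.

```python
EBCDIC_SPACE = 0x40  # single byte space in EBCDIC
ASCII_SPACE = 0x20   # single byte space in ASCII


def leading_character(field: bytes, space: int = EBCDIC_SPACE) -> bytes:
    """Return the character stored at the start of a left-aligned,
    space-padded field (hypothetical helper, for illustration only)."""
    if not field:
        return b""
    # Assumption: a field that starts with a byte at or below the space
    # value is treated as holding a single byte character.
    if field[0] <= space:
        return field[:1]
    # No byte of a multiple byte character may be less than or equal to
    # the single byte space, so the first such byte ends the character.
    for index, value in enumerate(field):
        if value <= space:
            return field[:index]
    # No padding found: the character occupies the whole field.
    return field


if __name__ == "__main__":
    print(leading_character(bytes.fromhex("5C404040")).hex())  # single byte asterisk
    print(leading_character(bytes.fromhex("426F4040")).hex())  # double byte '?'
    print(leading_character(b"*   ", space=ASCII_SPACE))       # ASCII variant
```

The two EBCDIC calls correspond to the x'5C404040' and x'426F4040' examples in the text; the third shows the same scan with the ASCII space value.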

The preceding algorithm works when the field is defined to contain only one character. When a field contains a string, an indicator is needed to inform the runtime whether the value is a multiple byte, mixed, or single byte string. For user-defined fields, this indicator is provided in the field table entry for the field. This algorithm is used for constant data encoded in the data areas of the screen manager.
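To illustrate why a string needs an explicit indicator, the sketch below splits the same kind of byte data into characters according to a string type. Everything here is an assumption made for illustration: the `StringKind` enumeration and `split_characters` function are hypothetical names, multiple byte strings are assumed to hold two-byte characters only, and mixed strings are assumed to bracket double byte runs with the EBCDIC Shift Out (x'0E') and Shift In (x'0F') codes. The actual layout of the field table entry is not shown in this topic.

```python
from enum import Enum

SHIFT_OUT = 0x0E  # EBCDIC SO: a double byte run begins (assumed for mixed strings)
SHIFT_IN = 0x0F   # EBCDIC SI: return to single byte characters


class StringKind(Enum):
    """Hypothetical stand-in for the indicator in the field table entry."""
    SINGLE = "single"
    MIXED = "mixed"
    MULTIPLE = "multiple"


def split_characters(data: bytes, kind: StringKind) -> list:
    """Split a string field into characters according to its indicator."""
    if kind is StringKind.SINGLE:
        return [data[i:i + 1] for i in range(len(data))]
    if kind is StringKind.MULTIPLE:
        # Assumption: every character is exactly two bytes; no SO/SI codes.
        if len(data) % 2:
            raise ValueError("double byte string has an odd length")
        return [data[i:i + 2] for i in range(0, len(data), 2)]
    # Mixed string: SO/SI codes switch between the two widths.
    chars = []
    double = False
    i = 0
    while i < len(data):
        byte = data[i]
        if byte == SHIFT_OUT:
            double = True
            i += 1
        elif byte == SHIFT_IN:
            double = False
            i += 1
        elif double:
            chars.append(data[i:i + 2])
            i += 2
        else:
            chars.append(data[i:i + 1])
            i += 1
    return chars


if __name__ == "__main__":
    mixed = bytes.fromhex("C10E426F0FC2")  # 'A', SO, double byte '?', SI, 'B'
    print([c.hex() for c in split_characters(mixed, StringKind.MIXED)])
```

Without the indicator, the bytes x'426F' could be read either as one double byte character or as two single byte characters, which is why the implicit rule is sufficient only for single-character fields.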

When a multiple byte character is transmitted to a screen in a field, the appropriate mechanisms are used to represent it.