Non-Printable Characters in XML Files

Non-printable characters are low-order ASCII characters in the following ranges:

#x0 - #x8 (ASCII 0 - 8)

#xB - #xC (ASCII 11 - 12)

#xE - #x1F (ASCII 14 - 31)

An example of such a character is CTRL-Z (ASCII value 26, or #x1A).
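The ranges above can be checked programmatically before a file is exchanged. The following is a minimal Python sketch, not part of the product; the function name find_non_printable and the sample string are hypothetical and chosen only for illustration.

```python
import re

# Character class built from the documented ranges:
# #x0-#x8, #xB-#xC, and #xE-#x1F. Tab (#x9), line feed (#xA),
# and carriage return (#xD) are deliberately excluded.
NON_PRINTABLE = re.compile(r"[\x00-\x08\x0B\x0C\x0E-\x1F]")

def find_non_printable(text):
    """Return a list of (offset, ASCII value) pairs, one per non-printable character."""
    return [(m.start(), ord(m.group())) for m in NON_PRINTABLE.finditer(text)]

# Hypothetical sample: CTRL-Z embedded in a name, followed by a tab and a newline.
print(find_non_printable("Customer\x1aName\tOK\n"))  # prints [(8, 26)]
```

Only the CTRL-Z at offset 8 is reported; the tab and the newline fall outside the listed ranges.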

The World Wide Web Consortium (W3C) XML specification states that non-printable characters are not legal in an XML file. The XML parser from Microsoft, like other commercially available XML parsers, enforces this restriction. Any XML parser that conforms to the W3C XML specification reports an error when an XML file contains non-printable characters.
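This behavior can be observed with any conforming parser. The sketch below uses the parser from the Python standard library (xml.etree.ElementTree) as a stand-in; it is not the Microsoft parser mentioned above, and the helper name is_well_formed is hypothetical.

```python
import xml.etree.ElementTree as ET

def is_well_formed(xml_text):
    """Return True if the parser accepts xml_text, False if it raises a parse error."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

print(is_well_formed("<Name>Customer</Name>"))      # prints True
print(is_well_formed("<Name>Cust\x1aomer</Name>"))  # prints False
```

The second document differs from the first only by an embedded CTRL-Z, which is enough for the parser to reject it.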

Non-printable characters can enter a model through reverse engineering, third-party integrations, or any other mechanism that brings external data into a model. Because saving a model as an XML file does not require the services of an XML parser, a model can be saved as an XML file regardless of whether it contains non-printable characters. However, an attempt to open an XML file that contains non-printable characters fails. If the validation process was run against the XML file, the error log contains a message similar to the following:

Error XML - 187: The XML file is not valid.

File: C:\testModel.xml

Reason: An invalid character was found in text content.

Line: 1

Source:

"><EMX:Drawing_Object_EntityProps>

If the validation process was not run before the XML file was opened, a message similar to the following appears in the error log:

Error XML - 155: Could not open ' C:\testModel.xml. If this XML file was not validated against the schema, you may be able to obtain more information regarding this error by re-running the import with the XML validation option turned on.

When a model is saved as an XML file, each non-printable character is replaced with its corresponding escape sequence in the XML file. The escape sequence is \#xNN, where NN is the hexadecimal ASCII value of the non-printable character. For example, the non-printable character CTRL-Z, whose hexadecimal ASCII value is 1A, is replaced with \#x1A. If such a replacement occurs during the save, the HandleNonPrintableChar attribute of the XML element is created and set to Y. If the model does not contain any non-printable characters, the HandleNonPrintableChar attribute is not created.
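The replacement rule can be sketched in a few lines of Python. This is an illustration of the rule as described, not the product's implementation; the function name escape_non_printable is hypothetical, and zero-padding values below #x10 to two digits (for example \#x08) is an assumption based on the NN notation.

```python
import re

# Same ranges as the definition of non-printable characters:
# #x0-#x8, #xB-#xC, and #xE-#x1F.
NON_PRINTABLE = re.compile(r"[\x00-\x08\x0B\x0C\x0E-\x1F]")

def escape_non_printable(text):
    """Replace each non-printable character with \\#xNN.

    Returns (escaped_text, replaced), where replaced indicates whether
    a HandleNonPrintableChar="Y" attribute would be needed.
    Assumption: NN is two uppercase hexadecimal digits, zero-padded.
    """
    escaped, count = NON_PRINTABLE.subn(
        lambda m: "\\#x%02X" % ord(m.group()), text
    )
    return escaped, count > 0

print(escape_non_printable("\x1a"))       # prints ('\\#x1A', True)
print(escape_non_printable("Customer"))   # prints ('Customer', False)
```

The boolean in the result mirrors the rule that the attribute is created only when at least one replacement took place.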

The following is an excerpt of an XML file generated from a model that contains non-printable characters:

<Entity id="{4BB100C4-8DC7-4DDB-8E5B-727769522065}+00000000" name="\#x1A">

<EntityProps>

<Name HandleNonPrintableChar="Y">\#x1A</Name>