Define Parse Settings

When you publish a document to the knowledge base, the product parses the information in the Title, Summary, Problem, and Resolution fields of the document into keywords. When a user searches the knowledge base, the product compares keywords from the user query with the keywords parsed from the knowledge base to produce a result list.
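The comparison described above can be pictured as a keyword-overlap lookup. The following is a minimal sketch only, not the product's actual matching or ranking logic: the function name `rank_documents`, the document IDs, and the overlap-count ordering are all assumptions made for illustration.

```python
def rank_documents(query_keywords, documents):
    """Rank documents by how many query keywords they share.

    documents maps a (hypothetical) document ID to the set of keywords
    parsed from its Title, Summary, Problem, and Resolution fields.
    """
    query = set(query_keywords)
    scored = []
    for doc_id, doc_keywords in documents.items():
        overlap = len(query & doc_keywords)
        if overlap:
            scored.append((overlap, doc_id))
    # Highest overlap first; ties are broken by document ID for stable output.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in scored]


# Hypothetical parsed keywords for three documents.
documents = {
    "KD1": {"printer", "driver", "install"},
    "KD2": {"password", "reset"},
    "KD3": {"printer", "offline"},
}
print(rank_documents(["printer", "driver"], documents))
```

In this sketch, KD1 shares two keywords with the query and KD3 shares one, so the result list is KD1 followed by KD3, and KD2 is left out.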

To define the settings used to parse documents in the knowledge base, go to the Administration tab and browse to Knowledge, Search, Parse Settings.

The Parse Settings page appears. Use the following fields to define the settings:

Maximum Search Keywords

Defines the maximum number of keywords to extract when the product parses the search text.

Default: 20

Note: The valid range is 1-100. A CA SDM Knowledge administrator can change the value within this range, based on the search needs and parameters of a specific Knowledge database. Use a lower number of search keywords for faster performance.
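As an illustration of how such a limit behaves, the sketch below caps a keyword list at a configured maximum and rejects values outside the documented 1-100 range. The function name `limit_keywords` and the choice to keep the first keywords in order are assumptions; the product may select keywords differently.

```python
MIN_KEYWORDS = 1        # lower bound of the documented valid range
MAX_KEYWORDS = 100      # upper bound of the documented valid range
DEFAULT_KEYWORDS = 20   # documented default


def limit_keywords(keywords, maximum=DEFAULT_KEYWORDS):
    """Return at most `maximum` keywords, keeping their original order."""
    if not MIN_KEYWORDS <= maximum <= MAX_KEYWORDS:
        raise ValueError(
            f"maximum must be between {MIN_KEYWORDS} and {MAX_KEYWORDS}"
        )
    # Assumption: keywords beyond the limit are simply dropped.
    return list(keywords)[:maximum]


search_text_keywords = [f"word{i}" for i in range(50)]
print(len(limit_keywords(search_text_keywords)))     # uses the default of 20
print(len(limit_keywords(search_text_keywords, 5)))  # a lower, faster setting
```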

Language

Specifies the language type to use for parse processing. Select one of the following settings:

English

Performs certain types of processing specific to the English language (for example, de-pluralizing search terms) during a search, if applicable.

Other European

Performs only European-specific processing during the search.

Korean

Performs only Korean-specific processing during the search.

Other Far East

Performs processing for other Far East languages during the search.

Note: If you work in a Chinese, Japanese, or Korean operating environment, make sure that you understand the available parsing approaches and the limitations of Multi-Byte Character Set (MBCS) languages before you implement your Knowledge Management system, so that user expectations are set appropriately.

Valid Character Range

Defines the range of alphanumeric characters to consider valid when parsing the Title, Summary, Problem, and Resolution fields in a document. The product treats any other characters as separators.

Note: When you select Yes from the Recognize Special Terms list, the product does not parse words and phrases defined as special terms.

Default: a-z, which indicates that the alphabetic characters a through z are valid characters for parsing.

The Valid Character Range field contains the letters that parsing uses. Letters that do not appear in the Valid Character Range are removed.

The recommended values for different languages are as follows:

Language               Valid Character Range
German                 a-zäöüß
Spanish                a-záéíóúüñ
French                 a-zàäâçéèêëîïôùû
Portuguese (Brazil)    a-zàáãçéêíóúü
Italian                a-zàèéìîïù
Simplified Chinese     a-z
Japanese               a-z
Traditional Chinese    a-z
Korean                 a-z

Note: For Japanese, the valid range is a-z plus a list of valid Katakana characters, excluding punctuation marks.
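To show why the range matters for non-English text, the sketch below splits text into keywords by treating every character outside the valid range as a separator. It is an approximation under stated assumptions: the function name `split_keywords` is hypothetical, the range is used directly as a regular-expression character class, and the text is lowercased first (the product's case handling is not described here).

```python
import re


def split_keywords(text, valid_range="a-z"):
    """Split text into keywords; characters outside valid_range separate words."""
    # Assumption: the range string maps directly onto a regex character class,
    # for example "a-zäöüß" becomes [a-zäöüß]+.
    pattern = re.compile(f"[{valid_range}]+")
    return pattern.findall(text.lower())


german_text = "Drucker-Größe ändern"

# With the default range, the umlauts and ß act as separators and break words.
print(split_keywords(german_text, "a-z"))

# With the recommended German range, the words stay intact.
print(split_keywords(german_text, "a-zäöüß"))
```

With the default range, "Größe" is split into the fragments "gr" and "e", which is why a range that matches the language of the knowledge base documents is recommended.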

Remove Similar Words

Specifies whether the product removes structurally similar keywords from the groups used in a search. You can select one of the following settings:

Yes

Removes structurally similar keywords from the search criteria.

Note: When you select Yes, the product also removes similar words when you save or publish a document. This setting can therefore affect whether a document is searchable: a similar word that was removed when the document was saved or published may not have been indexed, so a later search on that word may not retrieve the document.

No

Leaves structurally similar keywords in the search criteria.

Default: No

Remove Noise Words

Specifies whether the product removes noise words when parsing the Title, Summary, Problem, and Resolution fields in a document. You can select one of the following settings:

Yes

Removes noise words from the search criteria.

No

Leaves noise words in the search criteria.

Default: Yes
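The effect of this setting can be sketched as a simple filter. The noise word list below is hypothetical and far shorter than any real list, and the function name `remove_noise_words` is an assumption made for the example.

```python
# Hypothetical noise word list, for illustration only.
NOISE_WORDS = {"a", "an", "the", "is", "to", "of"}


def remove_noise_words(keywords, enabled=True):
    """Drop noise words when the setting is Yes; return keywords as-is when No."""
    if not enabled:
        return list(keywords)
    return [word for word in keywords if word not in NOISE_WORDS]


keywords = ["how", "to", "reset", "the", "password"]
print(remove_noise_words(keywords, enabled=True))   # setting: Yes
print(remove_noise_words(keywords, enabled=False))  # setting: No
```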

Recognize Special Terms

Specifies whether the product considers special terms as single entities or as multiple words when parsing the Title, Summary, Problem, and Resolution fields in a document. You can select one of the following settings:

Yes

Processes special terms as single entities in the search criteria.

No

Processes the words that make up special terms as separate entities in the search criteria.

Default: Yes
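The difference between the two settings can be sketched as follows. This is an illustrative approximation, not the product's parser: the special terms list, the function name `parse_keywords`, the default a-z range, and the lowercasing step are all assumptions.

```python
import re

# Hypothetical special terms, for illustration only.
SPECIAL_TERMS = ["tcp/ip", "service desk"]


def parse_keywords(text, recognize_special_terms=True):
    """Parse text into keywords, optionally keeping special terms whole."""
    word = r"[a-z]+"  # assumes the default Valid Character Range of a-z
    if recognize_special_terms and SPECIAL_TERMS:
        # Try longer terms first so a longer term wins over a shorter prefix.
        terms = sorted(SPECIAL_TERMS, key=len, reverse=True)
        alternatives = "|".join(re.escape(term) for term in terms)
        pattern = f"{alternatives}|{word}"
    else:
        pattern = word
    return re.findall(pattern, text.lower())


text = "Configure TCP/IP on the Service Desk server"
print(parse_keywords(text, recognize_special_terms=True))   # setting: Yes
print(parse_keywords(text, recognize_special_terms=False))  # setting: No
```

With Yes, "tcp/ip" and "service desk" each come through as one keyword. With No, the slash and the space act as separators, so the same text yields "tcp", "ip", "service", and "desk" as separate keywords.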