A CA DataMinder content database differs from a conventional relational database. All documents stored in the content database are rigorously dissected and indexed. As the database accumulates data, it acquires a sophisticated understanding of these documents, based on content analysis that can contextualize occurrences of individual words within a document. This enables it to identify clusters of related documents, each defined by characteristic text patterns that reveal a shared subject or theme.
When you run a content search, the content database uses its acquired expertise to discern the theme embodied by your search criteria. It then examines each matching document for the text patterns that characterize this theme. For example, if you search for occurrences of the word 'sales', the content database compares each matching document against the characteristic profile of other sales-themed documents in the content database.
By calculating the comparative strength of the various text patterns discernible in a document, the content database is able to quantify how closely that document matches a particular theme. It then generates a probability, expressed as a percentage, that the document corresponds to that theme. This probability is the confidence level.
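The idea of a confidence level can be illustrated with a toy calculation. The sketch below is not the algorithm used by the content database, which is not described here. It assumes a hypothetical theme profile (a set of weighted terms) and a hypothetical function `confidence_level` that reports the share of the profile's total weight found in a document, as a percentage.

```python
from typing import Dict


def confidence_level(document_text: str, theme_profile: Dict[str, float]) -> float:
    """Return a toy confidence percentage (0 to 100) for one document.

    Illustrative only: the score is the weight of the profile terms that
    occur in the document, divided by the total weight of the profile.
    """
    words = set(document_text.lower().split())
    total_weight = sum(theme_profile.values())
    if total_weight == 0:
        return 0.0
    matched_weight = sum(
        weight for term, weight in theme_profile.items() if term in words
    )
    return 100.0 * matched_weight / total_weight


# Hypothetical profile for a sales theme; terms and weights are invented.
sales_profile = {"sales": 3.0, "revenue": 2.0, "quarter": 1.0, "forecast": 2.0}

strong = confidence_level("Sales revenue rose this quarter", sales_profile)
weak = confidence_level("Sign for the garage sales", sales_profile)
```

In this sketch both documents contain the word 'sales', but only the first also contains other terms typical of the theme, so it receives the higher score.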
When you run a content search, you specify a minimum confidence level. Documents that meet the search criteria but fall below the minimum confidence level are considered irrelevant to the search. This eliminates false hits and focuses the search on documents that are relevant to the search theme.
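The effect of a minimum confidence level can be sketched as a simple filter. The function name `filter_by_confidence`, the document identifiers, and the scores below are all hypothetical, and the sketch assumes that a document whose confidence level equals the minimum is kept.

```python
from typing import List, Tuple


def filter_by_confidence(
    hits: List[Tuple[str, float]], minimum_confidence: float
) -> List[Tuple[str, float]]:
    """Keep only the hits whose confidence level meets the minimum.

    Each hit is a (document_id, confidence_percent) pair. Treating the
    threshold as inclusive is an assumption made for this sketch.
    """
    return [(doc_id, conf) for doc_id, conf in hits if conf >= minimum_confidence]


# Invented search results: all three documents matched the search criteria.
hits = [("memo-1", 82.0), ("memo-2", 35.0), ("memo-3", 60.0)]

relevant = filter_by_confidence(hits, minimum_confidence=60.0)
```

Here `memo-2` matches the search criteria but is dropped as a false hit because its confidence level is below the minimum.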
Copyright © 2014 CA.
All rights reserved.