Debug the Web Crawler in the Knowledge Center

The Web Crawler in the Knowledge Center can index websites that run on an intranet or on the Internet. Note the following points:

Index Location

The Web Crawler generates the indexes in the following location:

<CHORUS_HOME>/userdoc/mfui/webcrawls/crawlresults/<DIRECTORY WITH URL NAME>

URL Log

The URL indexing logs reside in the following location:

<CHORUS_HOME>/userdoc/mfui/webcrawls/crawlresources/<DIRECTORY WITH URL NAME>
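The index and log locations above share the same layout, so both can be derived from the installation directory and the per-URL directory name. The following is a minimal Python sketch under that assumption; the function name `crawl_paths`, the fallback `/opt/chorus`, and the directory name `example.com` are hypothetical placeholders, not values defined by the product.

```python
import os
from pathlib import Path


def crawl_paths(chorus_home, url_dir):
    """Return the index and log directories for one crawled URL."""
    base = Path(chorus_home) / "userdoc" / "mfui" / "webcrawls"
    return {
        "index": base / "crawlresults" / url_dir,    # generated indexes
        "logs": base / "crawlresources" / url_dir,   # URL indexing logs
    }


if __name__ == "__main__":
    # Assumption: CHORUS_HOME is set in the environment; the fallback
    # and the directory name below are illustrative only.
    home = os.environ.get("CHORUS_HOME", "/opt/chorus")
    for label, path in crawl_paths(home, "example.com").items():
        state = "exists" if path.is_dir() else "missing"
        print(f"{label}: {path} ({state})")
```

Running the script prints both directories and reports whether each one exists, which is a quick first check when a crawl appears to produce no output.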

The View Log screen shows only the files that are being sent for indexing, along with the timestamp and status of each file.

Uploading documents and indexing are two separate actions. A file that you upload is not indexed automatically, even when indexed files appear in the Index View log window. To index uploaded files, explicitly index the folder (<CHORUS_HOME>/userdoc/<folder with user name>) by using the Index Documents tab.

Server Log

The server log in the following location can indicate which documents have been uploaded to the <CHORUS_HOME>/userdoc folder:

$CHORUS_HOME/logs
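One way to look for upload activity is to scan every file in the log directory for lines that mention the userdoc folder. The sketch below assumes that the logs are plain text and that upload entries contain the string `userdoc`; the actual log format is not described here, so treat the search term and the helper name `find_mentions` as hypothetical.

```python
import os
from pathlib import Path


def find_mentions(log_dir, needle="userdoc"):
    """Yield (file name, line number, line) for each line containing needle."""
    for log_file in sorted(Path(log_dir).iterdir()):
        if not log_file.is_file():
            continue  # skip subdirectories
        # errors="replace" keeps the scan going if a log has odd bytes
        with open(log_file, "r", encoding="utf-8", errors="replace") as handle:
            for number, line in enumerate(handle, start=1):
                if needle in line:
                    yield log_file.name, number, line.rstrip("\n")


if __name__ == "__main__":
    # Assumption: CHORUS_HOME is set; the fallback path is illustrative.
    logs = Path(os.environ.get("CHORUS_HOME", "/opt/chorus")) / "logs"
    if logs.is_dir():
        for name, number, line in find_mentions(logs):
            print(f"{name}:{number}: {line}")
```

If the search term turns out to be too broad, pass a narrower string such as a user folder name as the second argument.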

order.xml

The order.xml file stores Web Crawler configuration information, including the proxy server name and port. You can find this file in the following location:

<CHORUS_HOME>/userdoc/mfui/webcrawls/crawlresources/<DIRECTORY WITH URL NAME>/order.xml
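When a crawl fails behind a proxy, it can help to check which proxy values order.xml actually contains. Because the element names in order.xml are not listed here, the following sketch makes no assumption about the schema: it reports every element whose tag or attribute values mention "proxy". The helper name `find_proxy_settings` and the command-line usage are hypothetical.

```python
import sys
import xml.etree.ElementTree as ET


def find_proxy_settings(xml_text):
    """Return (tag, attributes, text) for every element that mentions 'proxy'."""
    root = ET.fromstring(xml_text)
    hits = []
    for elem in root.iter():
        # Check the tag name and all attribute values, case-insensitively.
        candidates = [elem.tag] + list(elem.attrib.values())
        if any("proxy" in str(value).lower() for value in candidates):
            hits.append((elem.tag, dict(elem.attrib), (elem.text or "").strip()))
    return hits


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage (illustrative): python find_proxy.py /path/to/order.xml
    with open(sys.argv[1], "r", encoding="utf-8") as handle:
        for tag, attrs, text in find_proxy_settings(handle.read()):
            print(tag, attrs, repr(text))
```

An empty result suggests that no proxy is configured in the file, or that the proxy settings use names that do not contain the word "proxy".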

Error Status File (crawlErrorDetails.txt)

The crawlErrorDetails.txt error status file stores information about each error that has occurred. You can find this file in the following location:

<CHORUS_HOME>/userdoc/mfui/webcrawls/crawlresources/<DIRECTORY WITH URL NAME>/crawlErrorDetails.txt
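To review the most recent errors without opening the whole file, you can print only its last few lines. This is a minimal sketch that assumes crawlErrorDetails.txt is a plain text file with one entry per line and the newest entries at the end; neither detail is confirmed here, and the helper name `last_lines` is hypothetical.

```python
import sys
from collections import deque


def last_lines(path, count=20):
    """Return the last `count` lines of a text file, without trailing newlines."""
    with open(path, "r", encoding="utf-8", errors="replace") as handle:
        # deque with maxlen keeps only the final `count` lines in memory
        return [line.rstrip("\n") for line in deque(handle, maxlen=count)]


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage (illustrative): python tail_errors.py /path/to/crawlErrorDetails.txt
    for line in last_lines(sys.argv[1]):
        print(line)
```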