Configuring Dovetail Seeker

The Dovetail Seeker Service, Console, and web service are set up by default to use the same configuration file, seeker.config. You should only need to edit the database connection information. For more configuration information, see the Other configuration properties section.

To edit the settings:

  1. Go to the directory where you installed Dovetail Seeker. The default directory is C:\Program Files\Dovetail Software\Seeker.
  2. Go to the config directory.
  3. Open the seeker.config file in your favorite text editor.

Indexer Configuration

The seeker.config file defines all the application settings for the indexing applications.

Key | Required | Default | Description
seekerIndexer.databaseType | Yes | mssql | Specifies the type of provider ClarifyApplication should use to connect to the database. Standard values are "MSSQL", "ORACLE", or "ODPNET".
seekerIndexer.databaseConnectionString | Yes | Data Source=server; Initial Catalog=clarify; User Id=user; Password=password; | Alternatively, Integrated Security=SSPI; Persist Security Info=True; can be used in place of User Id and Password. Important: Before Integrated Security can be used, a server setup procedure must be followed. See Integrated Security with Dovetail server applications for details.
seekerIndexer.ApplicationUsername | Depends | emptyString | A valid Clarify username used by the application. Used only when Integrated Security is specified in seekerIndexer.databaseConnectionString, in which case it must have a value.
seekerCommon.fileSystemIndexDirectory | Yes | [Installation Path]\seekerIndex | Where the indexer creates and updates the search index.
seekerCommon.documentSpecificationsFilePath | Yes | [Installation Path]\config\documentSpecifications.xml | Where the indexer looks for the document specifications configuration.
seekerIndexer.dovetailDocumentPollingFrequencyInSeconds | Yes | 15 | How often the database and filesystem are polled for objects that have been updated.
seekerIndexer.fileDocumentPollingFrequencyInSeconds | Yes | 120 | How often the database and filesystem are polled for documents that have been updated.
seekerIndexer.maxUpdatesPerMessageInDovetailDocumentMessage | No | 100 | Document updates batch size.
seekerIndexer.maxUpdatesPerMessageInFileDocumentMessage | No | 25 | File document message batch size.
seekerIndexer.fileDocumentValidationPollingFrequencyInHours | Yes | 3 | How often file documents based on deleted or moved files are removed from the index.
seekerIndexer.documentSummaryFieldLength | No | 1000 | How many characters are included in the summary field of dovetail and file documents.
seekerIndexer.numberOfIndexChangesBeforeOptimization | No | 1000 | How often the index is optimized.
seekerIndexer.numberOfIndexSegmentsAfterOptimization | No | 3 | Number of index segments after index optimization.
SearchAnalysis.StopWords | No | Comma delimited list of words | Comma delimited list of words which the indexer should skip when indexing.
SearchAnalysis.Stemmer | No | emptyString | The algorithm Seeker will use to reduce words to their stem.

Database settings

If you skipped the database settings section during the install, you must edit a configuration file which tells Seeker how to talk to the Dovetail CRM database. Optionally, you can control where index files are created and how often the Seeker Service updates your index.

  1. Edit the seekerIndexer.databaseType node to be either 'mssql' or 'oracle' depending on your database platform.
  2. Edit the seekerIndexer.databaseConnectionString node to contain a valid database connection string that has at least read access to your Dovetail CRM database. It may contain Integrated Security=SSPI; and Persist Security Info=True; instead of User Id=user; Password=password;.

Important: We advise that you limit the database credentials used for running indexing applications to be read-only.
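
As a rough illustration of the two connection string forms described above, the following Python sketch assembles each one. The helper name and its parameters are hypothetical and not part of Dovetail Seeker, and the server, catalog, and credential values are the placeholders from the settings table, not working values.

```python
def build_connection_string(server, catalog, user=None, password=None):
    """Assemble a connection string in one of the two documented forms.

    Hypothetical helper for illustration only; not part of Dovetail Seeker.
    """
    parts = [f"Data Source={server}", f"Initial Catalog={catalog}"]
    if user is not None and password is not None:
        # Explicit database credentials.
        parts += [f"User Id={user}", f"Password={password}"]
    else:
        # Integrated Security form; seekerIndexer.ApplicationUsername
        # must also have a value when this form is used.
        parts += ["Integrated Security=SSPI", "Persist Security Info=True"]
    return "; ".join(parts) + ";"


print(build_connection_string("server", "clarify", "user", "password"))
print(build_connection_string("server", "clarify"))
```

The first call reproduces the default value shown for seekerIndexer.databaseConnectionString; the second shows the Integrated Security variant.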

Where Is The Seeker Index Located?

Optionally, you can change the location of the seekerCommon.fileSystemIndexDirectory to control where the search index will be created. Be aware that the Seeker web application will need to have at least read access to this path.

What is Indexed Out of the Box?

Dovetail Seeker is shipped with a predefined set of specifications about what Dovetail CRM entities to index. These specifications are defined in an XML configuration file.

You can use the seekerCommon.documentSpecificationsFilePath in the configuration file to control the location where indexing applications look for document specifications. This should be an absolute path.

How Often Should the Seeker Service Update Dovetail Specifications?

By default, the Seeker Service checks whether the search index needs updating every 15 seconds for Dovetail CRM entities and every 120 seconds (2 minutes) for file documents. To change how long the Seeker Service waits between index updates, set seekerIndexer.dovetailDocumentPollingFrequencyInSeconds (for Dovetail CRM entities) or seekerIndexer.fileDocumentPollingFrequencyInSeconds (for file documents) to the desired number of seconds.

How Many Documents Get Updated Per Message?

The Dovetail Seeker Service automatically limits the number of document updates per message to 100 for Dovetail CRM messages and 25 for file document messages. You can change the value of the seekerIndexer.maxUpdatesPerMessageInDovetailDocumentMessage entry for Dovetail CRM entities and seekerIndexer.maxUpdatesPerMessageInFileDocumentMessage for file documents. If either entry is left blank, the service will not batch those updates and will do them all at once.
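
The effect of the batch size can be pictured with a small Python sketch. It is only a model of the behavior described above, not the Seeker Service's actual implementation; the function name is hypothetical, and None stands in for a blank setting.

```python
def batch_updates(updates, max_per_message=None):
    """Split updates into messages of at most max_per_message items.

    None models a blank setting: everything goes into a single message.
    """
    if not updates:
        return []
    if max_per_message is None:
        return [list(updates)]
    return [updates[i:i + max_per_message]
            for i in range(0, len(updates), max_per_message)]


pending = list(range(230))               # 230 hypothetical document updates
messages = batch_updates(pending, 100)   # default for Dovetail CRM entities
print([len(m) for m in messages])        # [100, 100, 30]
print(len(batch_updates(pending)))       # 1 (blank setting, no batching)
```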

How Often Should Invalid File Documents Get Removed?

The Dovetail Seeker service automatically removes file documents from the index that have moved or been deleted. The default time for removing the invalid files from the index is 3 hours. You can change the value of seekerIndexer.fileDocumentValidationPollingFrequencyInHours entry.

How Many Characters Should Be Included In Every Document's Summary Field?

To avoid bloating the index with extremely large document summaries, it is recommended to limit the number of characters from the file being indexed that are included in the document's summary field. You can change the value of seekerIndexer.documentSummaryFieldLength to control the number of characters included in the file document's summary field.

After How Many Document Updates Should The Index Be Optimized?

By default the Seeker Service will only optimize the search index after 10000 documents have been added or updated in the search index. You can change the value of the seekerIndexer.numberOfIndexChangesBeforeOptimization entry to a numeric value representing the number of documents that need to change before the index is optimized.

How Many Index Segments Should The Index Be Optimized To?

When the index is optimized, Lucene works to reduce the number of index segments. For ideal search performance, the number of index segments is one. The seekerIndexer.numberOfIndexSegmentsAfterOptimization setting defaults to 3 segments in an attempt to strike a balance between search performance and index optimization time. If you experience long index optimization times, increase this value. If you would like to boost your search performance, reduce this value to 1.

Stop Words

SearchAnalysis.StopWords is a comma delimited list of words which the indexer should skip when indexing.
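
A minimal Python sketch of what skipping stop words means in practice. The word list, the whitespace tokenization, and the case-insensitive comparison are all assumptions made for illustration; Seeker's analyzer may tokenize and compare differently.

```python
def parse_stop_words(setting_value):
    """Turn a comma delimited setting value into a set of lowercase words."""
    return {w.strip().lower() for w in setting_value.split(",") if w.strip()}


def remove_stop_words(text, stop_words):
    """Drop stop words from whitespace-separated text (a simplification)."""
    return [t for t in text.split() if t.lower() not in stop_words]


stop_words = parse_stop_words("a, an, and, the, of")   # hypothetical list
print(remove_stop_words("The status of the case", stop_words))
# ['status', 'case']
```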

Stemming

SearchAnalysis.Stemmer is the algorithm Seeker will use to reduce words to their stem (e.g. Cases -> Case, captured -> capture). Valid options are: Porter, Snowball, or an empty string (which will not use a stemmer).


Logging Configuration

Logging is done using log4net and is configured by editing the log4net configuration in the log4net.config file. All logging output defaults to the logs directory under the installation location.

Logging capabilities include:

Default location for indexing applications: [InstallDir]\indexer\logs

Default location for web service: [InstallDir]\webservice\logs

Logging For File Document Indexing

When rich documents, such as Word and PDF files, are indexed, a text extraction library is used. This library has its own logging infrastructure, which is unfortunately different from log4net. By default, the configuration of this alternate logging infrastructure is hard-coded to log all errors to the following log file:

[InstallDir]\indexer\logs\text-extraction-errors.log

If you wish to customize the logging of this library you can create a log4j.config file according to the log4j manual and put it into the indexer's application directory.

[InstallDir]\indexer\log4j.config


Web Service Configuration

The seeker.config file defines all the application settings for the Web Service.

Key | Required | Default
seekerWebService.luceneMaximumClauseCount | No | 1024
seekerWebService.fileStoreDirectoryPath | No | null
seekerWebService.fileDocumentProxyBaseUrl | No | http:///seekerproxy
seekerWebService.fileProxyTokenTimeoutPeriodInSeconds | Yes | 300
seekerWebService.spellingDictionaryFilePath | Yes | [Installation Path]\webservice\dictionaries
seekerWebService.spellingNativeDllFilePath | No | The spellingDictionaryFilePath setting path.
seekerWebService.defaultResourceTimeout | No | 7 days
seekerWebService.attachmentDirectoryPath | No | null
seekerWebService.attachmentMode | No | ModeB

Important: If Integrated Security is to be used with Seeker Indexer, the Seeker Web Service must also be configured accordingly. See Integrated Security with Dovetail server applications for details.

Configuring Lucene's Maximum Number Of Search Terms

Lucene limits the number of search terms in a search query. The default limit is 1024 search terms. This limit exists to keep complex search queries from slowing down search requests. The most common form of this problem is a query using wildcards on terms with few characters. For example, a search query of a* against an index of respectable size will almost always result in a Too Many Clauses error.

Administrators wishing to allow users to search broadly can change this limit to be a larger number. How large this needs to be depends on your index.

You can change the maximum search term limit by enabling and changing the seekerWebService.luceneMaximumClauseCount entry.

Note: Be aware that the maximum search term limit is there for a reason and raising it may impact your search performance.
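
To see why a short wildcard term runs into the limit, the following Python sketch models a wildcard query as one clause per matching indexed term. This is a simplified picture under that assumption, not Lucene's actual code, and the index contents and function names are hypothetical.

```python
MAX_CLAUSE_COUNT = 1024  # default of seekerWebService.luceneMaximumClauseCount


def expand_prefix(prefix, indexed_terms):
    """Return every indexed term that a query such as a* would match."""
    return [t for t in indexed_terms if t.startswith(prefix)]


def check_query(prefix, indexed_terms, limit=MAX_CLAUSE_COUNT):
    """Raise if the expanded query needs more clauses than the limit allows."""
    clauses = expand_prefix(prefix, indexed_terms)
    if len(clauses) > limit:
        raise ValueError(f"Too many clauses: {len(clauses)} > {limit}")
    return len(clauses)


# Hypothetical index: 2000 distinct terms starting with "a", 50 with "zeb".
terms = [f"a{i}" for i in range(2000)] + [f"zeb{i}" for i in range(50)]
print(check_query("zeb", terms))     # 50
try:
    check_query("a", terms)
except ValueError as err:
    print(err)                       # Too many clauses: 2000 > 1024
```

In this model, the longer prefix stays well under the limit, while the single-character prefix exceeds it. This is why the required limit depends on how many distinct terms your index contains.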

File Store Directory Path

The Seeker Web Service supports file uploads. Files uploaded will be saved into a directory structure under this local or UNC file path. You can change the value of seekerWebService.fileStoreDirectoryPath entry to represent the location of where you wish files to be saved. You may wish to set this to a directory under your existing Clarify attachments directory.

File Document Proxy URL

The Seeker Web Service Search API returns paths to a file proxy for file documents. The default URL is entered during the installation. You can change the value of the seekerWebService.fileDocumentProxyBaseUrl entry to represent the location of the file proxy.

File Token Timeout

Search results that contain files are associated with a token that times out for security reasons. The default timeout for the tokens is 300 seconds (5 minutes). You can change the value of the seekerWebService.fileProxyTokenTimeoutPeriodInSeconds entry to a numeric value representing the number of seconds before a token times out.
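
The timeout can be thought of as a simple elapsed-time check, sketched below in Python. How Seeker actually issues and validates tokens is not described here, so the function and the timestamps are purely illustrative assumptions.

```python
from datetime import datetime, timedelta

TOKEN_TIMEOUT_SECONDS = 300  # default of fileProxyTokenTimeoutPeriodInSeconds


def is_token_expired(issued_at, now, timeout_seconds=TOKEN_TIMEOUT_SECONDS):
    """Return True once more than timeout_seconds have elapsed since issue."""
    return now - issued_at > timedelta(seconds=timeout_seconds)


issued = datetime(2010, 7, 28, 9, 22, 3)   # hypothetical issue time
print(is_token_expired(issued, issued + timedelta(minutes=4)))   # False
print(is_token_expired(issued, issued + timedelta(minutes=6)))   # True
```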

Spelling Dictionary Path

The Dovetail Seeker Web APIs include a spell check service that requires a path to the dictionary files. The default path is under the installation location of the web service (c:\Program Files\Dovetail Software\Seeker\webservice\dictionaries). You can change the value of the seekerWebService.spellingDictionaryFilePath entry to the path of the dictionary files.

attachmentDirectoryPath

The base path where file attachments should be stored when using the Attachment Upload API.

attachmentMode

The attachmentMode setting specifies which sub-folder mode should be used when saving file attachments using the Attachment Upload API. Valid values are ModeA, ModeB, or ModeC. For specific details, refer to the AttachmentMode Knowledgebase Article.

web.config

The web.config file defines .NET specific settings for the Web Service. Most settings in this file will be left unchanged.

You may wish to alter the following settings for your specific environment.

Setting | Default Value | Description
system.web / httpRuntime / maxRequestLength | 10241 | maxRequestLength for ASP.NET, in KB. Controls file upload size limits. For more information: Controlling upload file size in ASP.NET applications
system.webServer / security / requestFiltering / requestLimits / maxAllowedContentLength | 10485760 | maxAllowedContentLength for IIS, in bytes. Controls file upload size limits. For more information: Controlling upload file size in ASP.NET applications
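
Because the two limits use different units, it can help to compare them in bytes. The Python sketch below parses a minimal fragment containing only the two documented defaults (a real web.config contains much more) and converts both values. The helper name is hypothetical.

```python
import xml.etree.ElementTree as ET

# Fragment with the documented defaults only; not a complete web.config.
WEB_CONFIG = """
<configuration>
  <system.web>
    <httpRuntime maxRequestLength="10241" />
  </system.web>
  <system.webServer>
    <security>
      <requestFiltering>
        <requestLimits maxAllowedContentLength="10485760" />
      </requestFiltering>
    </security>
  </system.webServer>
</configuration>
"""


def upload_limits_in_bytes(config_xml):
    """Return (ASP.NET limit, IIS limit), both expressed in bytes."""
    root = ET.fromstring(config_xml)
    runtime = next(root.iter("httpRuntime"))
    limits = next(root.iter("requestLimits"))
    aspnet_bytes = int(runtime.get("maxRequestLength")) * 1024  # value is in KB
    iis_bytes = int(limits.get("maxAllowedContentLength"))      # value is in bytes
    return aspnet_bytes, iis_bytes


aspnet_bytes, iis_bytes = upload_limits_in_bytes(WEB_CONFIG)
print(aspnet_bytes, iis_bytes)         # 10486784 10485760
print(min(aspnet_bytes, iis_bytes))    # 10485760
```

An upload has to pass both checks, so the smaller of the two values is the one that matters. If you raise one limit, raise the other to match.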

Backups

Dovetail Seeker

Uninstalling Dovetail Seeker creates backups of documentSpecifications.xml and seeker.config files in the config folder of the installation directory.

The default location is C:\Program Files\Dovetail Software\Seeker\config

The backup configuration files are renamed with the suffix backup.[yyyy-MM-ddThh-mm-ss].

The following is an example of a backup file for documentSpecifications.xml on 07-28-2010 at 9:22 am:

documentSpecifications.xml.backup.2010-07-28T09-22-03
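
The naming pattern can be reproduced with a short Python sketch. It assumes that hh in the documented pattern denotes a 12-hour clock, as in .NET date format strings, which Python expresses as %I. The function name is hypothetical.

```python
from datetime import datetime


def backup_file_name(original_name, uninstalled_at):
    """Build a name of the form <file>.backup.[yyyy-MM-ddThh-mm-ss]."""
    # %I mirrors hh (assumed here to be a 12-hour clock).
    stamp = uninstalled_at.strftime("%Y-%m-%dT%I-%M-%S")
    return f"{original_name}.backup.{stamp}"


print(backup_file_name("documentSpecifications.xml",
                       datetime(2010, 7, 28, 9, 22, 3)))
# documentSpecifications.xml.backup.2010-07-28T09-22-03
```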

Reinstalling Dovetail Seeker

If you reinstall Dovetail Seeker, the backup files do not overwrite the new files. It is at your discretion to overwrite the default configuration files with the backup files.

Configuring File Proxy

You should not need to do anything manually here as the Seeker file proxy installer should automatically configure the proxy settings. If you do run into problems downloading or uploading files you can use this guide to manually configure the proxy.

The Seeker file proxy installer creates a seeker file proxy web application. The web.config of this web application should be automatically configured by the installer. If you need to manually change these settings use the following guidance.

The file proxy currently proxies two Dovetail Seeker web service APIs.

Editing the file proxy web.config

The following is the rewrite element from the file proxy's web.config file:

<rewrite>
  <rules>
    <rule name="Seeker File Upload" stopProcessing="true">
      <match url="^file/upload" />
      <action type="Rewrite" url="@URL@/file/upload" />
    </rule>
    <rule name="Seeker File Download" stopProcessing="true">
      <match url="^file/download" />
      <action type="Rewrite" url="@URL@/file/download" />
    </rule>
  </rules>
</rewrite>

The only text you may need to customize is the @URL@ placeholder. Every occurrence of @URL@ needs to be replaced with the URL of the Seeker web service API. The installer defaults it to http://localhost/seeker.
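
If you prefer to script the substitution rather than edit the file by hand, a sketch along these lines would work. The function name is hypothetical, and the example operates on an in-memory copy of the element rather than on the web.config file itself.

```python
REWRITE_RULES = """\
<rewrite>
  <rules>
    <rule name="Seeker File Upload" stopProcessing="true">
      <match url="^file/upload" />
      <action type="Rewrite" url="@URL@/file/upload" />
    </rule>
    <rule name="Seeker File Download" stopProcessing="true">
      <match url="^file/download" />
      <action type="Rewrite" url="@URL@/file/download" />
    </rule>
  </rules>
</rewrite>
"""


def fill_in_url(template, web_service_url):
    """Replace every @URL@ placeholder, avoiding a doubled slash."""
    return template.replace("@URL@", web_service_url.rstrip("/"))


configured = fill_in_url(REWRITE_RULES, "http://localhost/seeker")
print(configured)
```

With the installer default, the two action URLs become http://localhost/seeker/file/upload and http://localhost/seeker/file/download.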