Specifying CrawlDB in Config
The config file in question is sparkler-default.yaml. Note, however, that three copies of sparkler-default.yaml are in use:
- sparkler-core/conf/sparkler-default.yaml
- sparkler-core/sparkler-api/src/test/resources/sparkler-default.yaml
- sparkler-core/sparkler-app/src/test/resources/sparkler-default.yaml
Changes to the config file should be made across all 3 files for consistency.
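Because the three copies are edited by hand, they can drift apart. The following is a minimal sketch, not part of Sparkler, of a check that pulls the crawldb-related settings out of each copy and reports whether they agree. The class name CrawlDbConfigCheck and the simple line-based parsing (rather than a real YAML parser) are assumptions made for illustration; it only looks at flat "key: value" lines for 'crawldb.backend' and keys ending in '.uri', and it assumes it is run from the directory that contains sparkler-core.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper for illustration; not part of Sparkler.
class CrawlDbConfigCheck {

    // Collects 'crawldb.backend' and every '*.uri' entry from flat "key: value" lines.
    static Map<String, String> crawlDbSettings(List<String> lines) {
        Map<String, String> settings = new TreeMap<>();
        for (String raw : lines) {
            String line = raw.trim();
            if (line.isEmpty() || line.startsWith("#")) continue; // skip blanks and comment lines
            int colon = line.indexOf(':');
            if (colon <= 0) continue; // not a "key: value" line
            String key = line.substring(0, colon).trim();
            if (!key.equals("crawldb.backend") && !key.endsWith(".uri")) continue;
            String value = line.substring(colon + 1);
            int comment = value.indexOf(" #"); // drop a trailing "# ..." comment, if any
            if (comment >= 0) value = value.substring(0, comment);
            settings.put(key, value.trim());
        }
        return settings;
    }

    // Returns true when every file yields the same crawldb settings as the first one.
    static boolean allConsistent(List<List<String>> files) {
        Map<String, String> first = null;
        for (List<String> lines : files) {
            Map<String, String> current = crawlDbSettings(lines);
            if (first == null) {
                first = current;
            } else if (!first.equals(current)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) throws IOException {
        // Assumption: paths are resolved against the current working directory.
        String[] paths = {
            "sparkler-core/conf/sparkler-default.yaml",
            "sparkler-core/sparkler-api/src/test/resources/sparkler-default.yaml",
            "sparkler-core/sparkler-app/src/test/resources/sparkler-default.yaml"
        };
        List<List<String>> files = new ArrayList<>();
        for (String path : paths) {
            files.add(Files.readAllLines(Paths.get(path)));
        }
        System.out.println(allConsistent(files) ? "crawldb settings are consistent" : "crawldb settings differ");
    }
}
```

Comparing only the crawldb keys keeps the check narrow; other settings that legitimately differ between the main and test configs are ignored.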
The section of the config file pertaining to crawldb is set up as follows (subject to change):
  crawldb.backend: solr
  solr.uri: http://localhost:8983/solr/crawldb
  elasticsearch.uri: http://localhost:9200
The 'crawldb.backend' field specifies which crawldb to use. Note that the value of 'crawldb.backend' must match the prefix of one of the '*.uri' fields below it. For example, the following specifies elasticsearch as the crawldb to use:
  crawldb.backend: elasticsearch
  solr.uri: http://localhost:8983/solr/crawldb
  elasticsearch.uri: http://localhost:9200
To add a crawldb to this config file, add its URI and set 'crawldb.backend' to the new crawldb. The following example uses a hypothetical crawldb called 'testdb'.
  crawldb.backend: testdb
  solr.uri: http://localhost:8983/solr/crawldb
  elasticsearch.uri: http://localhost:9200
  testdb.uri: http://localhost:9999  # replace http://localhost:9999 with the appropriate URI
Constants.java holds an interface through which the config file values can be accessed. In code, this will look like:
Constants.key.CRAWLDB_BACKEND // for example, this may equal 'solr' or 'elasticsearch'
To get the crawldb URI, use SparklerConfiguration.java's getDatabaseURI() method. In code, this might look like:
config.getDatabaseURI() // where config is a SparklerConfiguration instance
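The rule that 'crawldb.backend' must match one of the '*.uri' fields can be sketched as a two-step lookup: read the backend name, then read the key '<backend>.uri'. The class below is a hypothetical stand-in that works on a plain Map; it is not SparklerConfiguration, and the real getDatabaseURI() may be implemented differently. The name CrawlDbUriResolver and the error messages are assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for illustration; not Sparkler's SparklerConfiguration.
class CrawlDbUriResolver {

    // Looks up 'crawldb.backend', then the matching '<backend>.uri' entry.
    static String resolve(Map<String, String> config) {
        String backend = config.get("crawldb.backend");
        if (backend == null) {
            throw new IllegalArgumentException("'crawldb.backend' is not set");
        }
        String uriKey = backend + ".uri";
        String uri = config.get(uriKey);
        if (uri == null) {
            // The backend name has no matching '*.uri' field.
            throw new IllegalArgumentException(
                "No '" + uriKey + "' entry for backend '" + backend + "'");
        }
        return uri;
    }

    public static void main(String[] args) {
        // Mirrors the elasticsearch example from the config section.
        Map<String, String> config = new HashMap<>();
        config.put("crawldb.backend", "elasticsearch");
        config.put("solr.uri", "http://localhost:8983/solr/crawldb");
        config.put("elasticsearch.uri", "http://localhost:9200");
        System.out.println(resolve(config)); // prints http://localhost:9200
    }
}
```

Failing fast when the '<backend>.uri' key is missing surfaces a mismatched backend name at startup instead of at the first crawldb request.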