Commit 4790e9e

Merge pull request #50 from nsidc/srch-58
SRCH-58
2 parents 90a76fc + 0bbff08 commit 4790e9e

17 files changed: 224 additions, 46 deletions

CHANGELOG.md

Lines changed: 6 additions & 0 deletions
@@ -1,3 +1,9 @@
+## Unreleased
+
+- Adding logging functionality to the code, including the ability
+  to specify log file destination and log level for both the file and
+  console output
+
 ## v6.4.1 (2023-09-15)

 - Added GitHub Action workflows for continuous integration features

Gemfile.lock

Lines changed: 5 additions & 0 deletions
@@ -4,6 +4,7 @@ PATH
     search_solr_tools (6.4.1)
       ffi-geos (~> 2.4.0)
       iso8601 (~> 0.13.0)
+      logging (~> 2.3.1)
       multi_json (~> 1.15.0)
       nokogiri (~> 1.15.4)
       rest-client (~> 2.1.0)
@@ -65,6 +66,10 @@ GEM
     listen (3.8.0)
       rb-fsevent (~> 0.10, >= 0.10.3)
       rb-inotify (~> 0.9, >= 0.9.10)
+    little-plugger (1.1.4)
+    logging (2.3.1)
+      little-plugger (~> 1.1)
+      multi_json (~> 1.14)
     lumberjack (1.2.9)
     method_source (1.0.0)
     mime-types (3.5.1)

README.md

Lines changed: 45 additions & 0 deletions
@@ -101,6 +101,10 @@ the tests whenever the appropriate files are changed.

 Please be sure to run them in the `bundle exec` context if you're utilizing bundler.

+By default, tests are run with minimal logging - no log file and only fatal errors
+written to the console. This can be changed by setting the environment variables
+as described in [Logging](#logging) below.
+
 ### Creating Releases (NSIDC devs only)

 Requirements:
@@ -208,6 +212,47 @@ which can be modified, or additional environments can be added by just adding a
 new YAML stanza with the right keys; this new environment can then be used with
 the `--environment` flag when running `search_solr_tools harvest`.

+#### Logging
+
+By default, when running the harvest, harvest logs are written to the file
+`/var/log/search-solr-tools.log` (set to `warn` level), as well as to the console
+at `info` level. These settings are configured in the `environments.yaml` config
+file, in the `common` section.
+
+The keys in the `environments.yaml` file to consider are as follows:
+
+* `log_file` - The full name and path of the file to which log output will be written.
+  If set to the special value `none`, no log file will be written at all.
+  Log output will be **appended** to the file, if it exists; otherwise, the file will
+  be created.
+* `log_file_level` - Indicates the level of logging which should be written to the log file.
+* `log_stdout_level` - Indicates the level of logging which should be written to the console.
+  This can be different than the level written to the log file.
+
+You can also override the configuration file settings at the command line with the
+following environment variables (useful when doing development work):
+
+* `SEARCH_SOLR_LOG_FILE` - Overrides the `log_file` setting
+* `SEARCH_SOLR_LOG_LEVEL` - Overrides the `log_file_level` setting
+* `SEARCH_SOLR_STDOUT_LEVEL` - Overrides the `log_stdout_level` setting
+
+When running the spec tests, `SEARCH_SOLR_LOG_FILE` is set to `none` and
+`SEARCH_SOLR_STDOUT_LEVEL` is set to `fatal`, unless you manually set those
+environment variables prior to running the tests. This is to keep the test output
+clean unless you need more detail for debugging.
+
+The following are the levels of logging that can be specified. These levels are
+cumulative; for example, `error` will also output `fatal` log entries, and `debug`
+will output **all** log entries.
+
+* `none` - No logging outputs will be written.
+* `fatal` - Only outputs errors which result in a crash.
+* `error` - Outputs any error that occurs while harvesting.
+* `warn` - Outputs warnings that occur that do not cause issues with the harvesting,
+  but might indicate things that may need to be addressed (such as deprecations, etc)
+* `info` - Outputs general information, such as harvesting status
+* `debug` - Outputs detailed information that can be used for debugging and code tracing.
+
 ## Organization Info

 ### How to contact NSIDC
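
The override order described in the new Logging section (an environment variable, when set, wins over the matching `environments.yaml` key) can be sketched as follows. This is only an illustration of the documented precedence, not the gem's actual logger setup, which this diff does not show; `resolve_log_settings` and `OVERRIDES` are hypothetical names, and treating an empty variable as unset is an assumption.

```ruby
# Hypothetical helper: illustrates the documented precedence only.
# Maps each config key to the environment variable that overrides it.
OVERRIDES = {
  log_file: 'SEARCH_SOLR_LOG_FILE',
  log_file_level: 'SEARCH_SOLR_LOG_LEVEL',
  log_stdout_level: 'SEARCH_SOLR_STDOUT_LEVEL'
}.freeze

# config: settings as loaded from environments.yaml
# env:    hash-like source of environment variables (ENV by default)
def resolve_log_settings(config, env = ENV)
  OVERRIDES.to_h do |key, var|
    value = env[var]
    # Assumption: an unset or empty variable falls back to the config value.
    [key, (value.nil? || value.empty?) ? config[key] : value]
  end
end

defaults = {
  log_file: '/var/log/search-solr-tools.log',
  log_file_level: 'warn',
  log_stdout_level: 'info'
}

# Mirrors what the README says the spec tests set by default.
test_env = { 'SEARCH_SOLR_LOG_FILE' => 'none', 'SEARCH_SOLR_STDOUT_LEVEL' => 'fatal' }
settings = resolve_log_settings(defaults, test_env)
```

With the simulated test environment, `settings` keeps the file level from the config while the log file and console level come from the variables.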

bin/search_solr_tools

Lines changed: 18 additions & 5 deletions
@@ -6,8 +6,14 @@ require 'thor'

 # rubocop:disable Metrics/AbcSize
 class SolrHarvestCLI < Thor
+  include SSTLogger
+
   map %w[--version -v] => :__print_version

+  def self.exit_on_failure?
+    false
+  end
+
   desc '--version, -v', 'print the version'
   def __print_version
     puts SearchSolrTools::VERSION
@@ -39,10 +45,11 @@ class SolrHarvestCLI < Thor
       rescue StandardError => e
         solr_status = false
         source_status = false
-        puts "Error trying to ping for #{target}: #{e}"
+        logger.error "Ping failed for #{target}: #{e}"
       end
       solr_success &&= solr_status
       source_success &&= source_status
+
       puts "Target: #{target}, Solr ping OK? #{solr_status}, data center ping OK? #{source_status}"
     end

@@ -61,7 +68,7 @@ class SolrHarvestCLI < Thor
   option :die_on_failure, type: :boolean
   def harvest(die_on_failure = options[:die_on_failure] || false)
     options[:data_center].each do |target|
-      puts "Target: #{target}"
+      logger.info "Target: #{target}"
       begin
         harvest_class = get_harvester_class(target)
         harvester = harvest_class.new(options[:environment], die_on_failure:)
@@ -73,12 +80,12 @@ class SolrHarvestCLI < Thor

         harvester.harvest_and_delete
       rescue SearchSolrTools::Errors::HarvestError => e
-        puts "THERE WERE HARVEST STATUS ERRORS:\n#{e.message}"
+        logger.error "THERE WERE HARVEST STATUS ERRORS:\n#{e.message}"
         exit e.exit_code
       rescue StandardError => e
         # If it gets here, there is an error that we aren't expecting.
-        puts "harvest failed for #{target}: #{e.message}"
-        puts e.backtrace
+        logger.error "harvest failed for #{target}: #{e.message}"
+        logger.error e.backtrace
         exit SearchSolrTools::Errors::HarvestError::ERRCODE_OTHER
       end
     end
@@ -93,28 +100,34 @@ class SolrHarvestCLI < Thor
   option :environment, required: true
   def delete_all
     env = SearchSolrTools::SolrEnvironments[options[:environment]]
+    logger.info('DELETE ALL started')
     `curl 'http://#{env[:host]}:#{env[:port]}/solr/update' -H 'Content-Type: text/xml; charset=utf-8' --data '<delete><query>*:*</query></delete>'`
     `curl 'http://#{env[:host]}:#{env[:port]}/solr/update' -H 'Content-Type: text/xml; charset=utf-8' --data '<commit/>'`
+    logger.info('DELETE ALL complete')
   end

   desc 'delete_all_auto_suggest', 'Delete all documents from the auto_suggest index'
   option :environment, required: true
   def delete_all_auto_suggest
     env = SearchSolrTools::SolrEnvironments[options[:environment]]
+    logger.info('DELETE ALL AUTO_SUGGEST started')
     `curl 'http://#{env[:host]}:#{env[:port]}/solr/update' -H 'Content-Type: text/xml; charset=utf-8' --data '<delete><query>*:*</query></delete>'`
     `curl 'http://#{env[:host]}:#{env[:port]}/solr/update' -H 'Content-Type: text/xml; charset=utf-8' --data '<commit/>'`
+    logger.info('DELETE ALL AUTO_SUGGEST complete')
   end

   desc 'delete_by_data_center', 'Force deletion of documents for a specific data center with timestamps before the passed timestamp in format iso8601 (2014-07-14T21:49:21Z)'
   option :timestamp, required: true
   option :environment, required: true
   option :data_center, required: true
   def delete_by_data_center
+    logger.info("DELETE ALL for data center '#{options[:data_center]}' started")
     harvester = get_harvester_class(options[:data_center]).new options[:environment]
     harvester.delete_old_documents(options[:timestamp],
                                    "data_centers:\"#{SearchSolrTools::Helpers::SolrFormat::DATA_CENTER_NAMES[options[:data_center].upcase.to_sym][:long_name]}\"",
                                    SearchSolrTools::SolrEnvironments[harvester.environment][:collection_name],
                                    true)
+    logger.info("DELETE ALL for data center '#{options[:data_center]}' complete")
   end

   no_tasks do
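
The CLI now calls `logger.info` and `logger.error` through an included `SSTLogger` module, whose source is not part of the hunks shown here. As a rough sketch of how such a mixin can behave, the example below uses Ruby's standard-library `Logger` rather than the `logging` gem the commit adds; `DemoLogger` and `DemoTask` are hypothetical names, and the `WARN` threshold is chosen only to show how level filtering works.

```ruby
require 'logger'
require 'stringio'

# Hypothetical stand-in for a logging mixin; uses Ruby's stdlib Logger,
# not the `logging` gem that the commit adds.
module DemoLogger
  # Shared in-memory sink so the example can inspect what was written.
  SINK = StringIO.new

  def logger
    # One logger per including object, created on first use, at WARN threshold.
    @logger ||= Logger.new(SINK, level: Logger::WARN)
  end
end

class DemoTask
  include DemoLogger

  def run(target)
    logger.info "Target: #{target}"          # below WARN: suppressed
    logger.error "Ping failed for #{target}" # at or above WARN: written
  end
end

DemoTask.new.run('nsidc')
output = DemoLogger::SINK.string
```

Only the `error` entry reaches the sink, which matches the cumulative level behaviour the README describes: a threshold admits its own level and everything more severe.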

lib/search_solr_tools.rb

Lines changed: 2 additions & 0 deletions
@@ -6,6 +6,8 @@
 require_relative 'search_solr_tools/helpers/harvest_status'
 require_relative 'search_solr_tools/errors/harvest_error'

+require_relative 'search_solr_tools/logging/sst_logger'
+
 %w[harvesters translators].each do |subdir|
   Dir[File.join(__dir__, 'search_solr_tools', subdir, '*.rb')].each { |file| require file }
 end

lib/search_solr_tools/config/environments.rb

Lines changed: 2 additions & 2 deletions
@@ -5,10 +5,10 @@
 module SearchSolrTools
   # configuration to work with solr locally, or on integration/qa/staging/prod
   module SolrEnvironments
-    YAML_ENVS = YAML.load_file(File.expand_path('environments.yaml', __dir__))
+    YAML_ENVS = YAML.load_file(File.expand_path('environments.yaml', __dir__), aliases: true)

     def self.[](env = :development)
-      YAML_ENVS[:common].merge(YAML_ENVS[env.to_sym])
+      YAML_ENVS[env.to_sym]
     end
   end
 end

lib/search_solr_tools/config/environments.yaml

Lines changed: 17 additions & 2 deletions
@@ -1,4 +1,4 @@
-:common:
+:common: &common
   :auto_suggest_collection_name: auto_suggest
   :collection_name: nsidc_oai
   :collection_path: solr
@@ -9,33 +9,48 @@
   # should be. GLA01.018 will show up if we use DCS API v2.
   :nsidc_oai_identifiers_url: oai?verb=ListIdentifiers&metadataPrefix=dif&retired=false

+  # Log details. Can be overridden by environment-specific values
+  :log_file: /var/log/search-solr-tools.log
+  :log_file_level: warn
+  :log_stdout_level: info
+
 :local:
+  <<: *common
   :host: localhost
   :nsidc_dataset_metadata_url: http://integration.nsidc.org/api/dataset/metadata/

-:dev:
+:dev: &dev
+  <<: *common
   ## For the below, you'll need to instantiate your own search-solr instance, and point host to that.
   :host: dev.search-solr.USERNAME.dev.int.nsidc.org
   ## For the metadata content, either set up your own instance of dataset-catalog-services
   ## or change the URL below to point to integration
   :nsidc_dataset_metadata_url: http://dev.dcs.USERNAME.dev.int.nsidc.org:1580/api/dataset/metadata/

+:development:
+  <<: *dev
+
 :integration:
+  <<: *common
   :host: integration.search-solr.apps.int.nsidc.org
   :nsidc_dataset_metadata_url: http://integration.nsidc.org/api/dataset/metadata/

 :qa:
+  <<: *common
   :host: qa.search-solr.apps.int.nsidc.org
   :nsidc_dataset_metadata_url: http://qa.nsidc.org/api/dataset/metadata/

 :staging:
+  <<: *common
   :host: staging.search-solr.apps.int.nsidc.org
   :nsidc_dataset_metadata_url: http://staging.nsidc.org/api/dataset/metadata/

 :blue:
+  <<: *common
   :host: blue.search-solr.apps.int.nsidc.org
   :nsidc_dataset_metadata_url: http://nsidc.org/api/dataset/metadata/

 :production:
+  <<: *common
   :host: search-solr.apps.int.nsidc.org
   :nsidc_dataset_metadata_url: http://nsidc.org/api/dataset/metadata/
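
The YAML anchor and merge-key layout above moves the `common` merge out of Ruby and into the file itself, which appears to be why `environments.rb` now passes `aliases: true`. The sketch below loads a trimmed-down, illustrative copy of that layout from an inline string; it uses `YAML.safe_load` with `Symbol` permitted instead of the `YAML.load_file` call in the commit, so treat it as a demonstration of the mechanism rather than the project's loading code.

```ruby
require 'yaml'

# Trimmed-down, illustrative version of the layout in environments.yaml.
DOC = <<~EOS
  :common: &common
    :collection_name: nsidc_oai
    :log_file_level: warn
  :local:
    <<: *common
    :host: localhost
  :dev: &dev
    <<: *common
    :host: dev.search-solr.USERNAME.dev.int.nsidc.org
  :development:
    <<: *dev
EOS

# Aliases must be enabled explicitly; Symbol is permitted for the :key style.
envs = YAML.safe_load(DOC, permitted_classes: [Symbol], aliases: true)

local = envs[:local]             # common keys plus :host
development = envs[:development] # inherits everything from :dev

# Without aliases: true, safe loading rejects the document.
rejected = begin
  YAML.safe_load(DOC, permitted_classes: [Symbol])
  false
rescue Psych::BadAlias
  true
end
```

Keys written after the `<<` line take precedence over merged ones, so an environment can still set its own `:log_file_level:` after merging `*common`.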

lib/search_solr_tools/errors/harvest_error.rb

Lines changed: 6 additions & 2 deletions
@@ -1,8 +1,12 @@
 # frozen_string_literal: true

+require_relative '../logging/sst_logger'
+
 module SearchSolrTools
   module Errors
     class HarvestError < StandardError
+      include SSTLogger
+
       ERRCODE_SOLR_PING = 1
       ERRCODE_SOURCE_PING = 2
       ERRCODE_SOURCE_NO_RESULTS = 4
@@ -73,11 +77,11 @@ def initialize(status, message = nil)
       # rubocop:disable Metrics/AbcSize
       def exit_code
         if @status_data.nil?
-          puts "OTHER ERROR REPORTED: #{@other_message}"
+          logger.error "OTHER ERROR REPORTED: #{@other_message}"
           return ERRCODE_OTHER
         end

-        puts "EXIT CODE STATUS:\n#{@status_data.status}"
+        logger.error "EXIT CODE STATUS:\n#{@status_data.status}"

         code = 0
         code += ERRCODE_SOLR_PING unless @status_data.ping_solr
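
The visible constants (1, 2, 4) and the way `exit_code` adds them to a running total suggest that the exit code works as a set of bit flags, so one number can report several failures at once. A caller could decode such a value roughly as shown below. `decode_exit_code` and `FLAGS` are hypothetical names, and the table covers only the three constants visible in this hunk; the class defines others that are not shown.

```ruby
# The three flag values visible in the diff; the class defines more that are
# not shown here, so this table is deliberately incomplete.
FLAGS = {
  1 => :solr_ping,
  2 => :source_ping,
  4 => :source_no_results
}.freeze

# Hypothetical helper: returns the names of the flags set in an exit code.
def decode_exit_code(code)
  FLAGS.select { |bit, _name| (code & bit) != 0 }.values
end

# 5 = 1 + 4: Solr ping failed and the source returned no results.
failures = decode_exit_code(5)
```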

lib/search_solr_tools/harvesters/auto_suggest.rb

Lines changed: 2 additions & 2 deletions
@@ -51,10 +51,10 @@ def add_documents_to_solr(add_docs)
         status = insert_solr_doc add_docs, Base::JSON_CONTENT_TYPE, @env_settings[:auto_suggest_collection_name]

         if status == Helpers::HarvestStatus::INGEST_OK
-          puts "Added #{add_docs.size} auto suggest documents in one commit"
+          logger.info "Added #{add_docs.size} auto suggest documents in one commit"
           Helpers::HarvestStatus.new(Helpers::HarvestStatus::INGEST_OK => add_docs)
         else
-          puts "Failed adding #{add_docs.size} documents in single commit, retrying one by one"
+          logger.error "Failed adding #{add_docs.size} documents in single commit, retrying one by one"
           new_add_docs = []
           add_docs.each do |doc|
             new_add_docs << { 'add' => { 'doc' => doc } }

lib/search_solr_tools/harvesters/base.rb

Lines changed: 17 additions & 15 deletions
@@ -15,6 +15,8 @@ module SearchSolrTools
   module Harvesters
     # base class for solr harvesters
     class Base
+      include SSTLogger
+
       attr_accessor :environment

       DELETE_DOCUMENTS_RATIO = 0.1
@@ -50,10 +52,10 @@ def ping_solr(core = SolrEnvironments[@environment][:collection_name])
         begin
           RestClient.get(url) do |response, _request, _result|
             success = response.code == 200
-            puts "Error in ping request: #{response.body}" unless success
+            logger.error "Error in ping request: #{response.body}" unless success
           end
         rescue StandardError => e
-          puts "Rest exception while pinging Solr: #{e}"
+          logger.error "Rest exception while pinging Solr: #{e}"
         end
         success
       end
@@ -62,7 +64,7 @@ def ping_solr(core = SolrEnvironments[@environment][:collection_name])
       # to "ping" the data center. Returns true if the ping is successful (or, as
       # in this default, no ping method was defined)
       def ping_source
-        puts 'Harvester does not have ping method defined, assuming true'
+        logger.info 'Harvester does not have ping method defined, assuming true'
         true
       end

@@ -81,9 +83,9 @@ def delete_old_documents(timestamp, constraints, solr_core, force: false)
         solr = RSolr.connect url: solr_url + "/#{solr_core}"
         unchanged_count = (solr.get 'select', params: { wt: :ruby, q: delete_query, rows: 0 })['response']['numFound'].to_i
         if unchanged_count.zero?
-          puts "All documents were updated after #{timestamp}, nothing to delete"
+          logger.info "All documents were updated after #{timestamp}, nothing to delete"
         else
-          puts "Begin removing documents older than #{timestamp}"
+          logger.info "Begin removing documents older than #{timestamp}"
           remove_documents(solr, delete_query, constraints, force, unchanged_count)
         end
       end
@@ -99,13 +101,13 @@ def sanitize_data_centers_constraints(query_string)
       def remove_documents(solr, delete_query, constraints, force, numfound)
         all_response_count = (solr.get 'select', params: { wt: :ruby, q: constraints, rows: 0 })['response']['numFound']
         if force || (numfound / all_response_count.to_f < DELETE_DOCUMENTS_RATIO)
-          puts "Deleting #{numfound} documents for #{constraints}"
+          logger.info "Deleting #{numfound} documents for #{constraints}"
           solr.delete_by_query delete_query
           solr.commit
         else
-          puts "Failed to delete records older than current harvest start because they exceeded #{DELETE_DOCUMENTS_RATIO} of the total records for this data center."
-          puts "\tTotal records: #{all_response_count}"
-          puts "\tNon-updated records: #{numfound}"
+          logger.info "Failed to delete records older than current harvest start because they exceeded #{DELETE_DOCUMENTS_RATIO} of the total records for this data center."
+          logger.info "\tTotal records: #{all_response_count}"
+          logger.info "\tNon-updated records: #{numfound}"
         end
       end

@@ -121,8 +123,8 @@ def insert_solr_docs(docs, content_type = XML_CONTENT_TYPE, core = SolrEnvironme
           status.record_status doc_status
           doc_status == Helpers::HarvestStatus::INGEST_OK ? success += 1 : failure += 1
         end
-        puts "#{success} document#{success == 1 ? '' : 's'} successfully added to Solr."
-        puts "#{failure} document#{failure == 1 ? '' : 's'} not added to Solr."
+        logger.info "#{success} document#{success == 1 ? '' : 's'} successfully added to Solr."
+        logger.info "#{failure} document#{failure == 1 ? '' : 's'} not added to Solr."

         status
       end
@@ -146,14 +148,14 @@ def insert_solr_doc(doc, content_type = XML_CONTENT_TYPE, core = SolrEnvironment
           RestClient.post(url, doc_serialized, content_type:) do |response, _request, _result|
             success = response.code == 200
             unless success
-              puts "Error for #{doc_serialized}\n\n response: #{response.body}"
+              logger.error "Error for #{doc_serialized}\n\n response: #{response.body}"
               status = Helpers::HarvestStatus::INGEST_ERR_SOLR_ERROR
             end
           end
         rescue StandardError => e
           # TODO: Need to provide more detail re: this failure so we know whether to
           # exit the job with a status != 0
-          puts "Rest exception while POSTing to Solr: #{e}, for doc: #{doc_serialized}"
+          logger.error "Rest exception while POSTing to Solr: #{e}, for doc: #{doc_serialized}"
           status = Helpers::HarvestStatus::INGEST_ERR_SOLR_ERROR
         end
         status
@@ -177,11 +179,11 @@ def get_results(request_url, metadata_path, content_type = 'application/xml')
         request_url = encode_data_provider_url(request_url)

         begin
-          puts "Request: #{request_url}"
+          logger.debug "Request: #{request_url}"
           response = URI.parse(request_url).open(read_timeout: timeout, 'Content-Type' => content_type)
         rescue OpenURI::HTTPError, Timeout::Error, Errno::ETIMEDOUT => e
           retries_left -= 1
-          logger.error "## REQUEST FAILED ## #{e.class} ## Retrying #{retries_left} more times..."
+          puts "## REQUEST FAILED ## #{e.class} ## Retrying #{retries_left} more times..."
+          logger.error "## REQUEST FAILED ## #{e.class} ## Retrying #{retries_left} more times..."

           retry if retries_left.positive?
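
The last hunk logs each failed request and then uses `retry` to re-run the `begin` block while attempts remain. The bounded-retry pattern can be sketched in isolation as below. `with_retries` and `TransientError` are hypothetical names, the array stands in for the logger, and re-raising once the attempts are exhausted is an assumption, since the hunk is cut off before that point.

```ruby
# Hypothetical error type standing in for the HTTP and timeout errors.
class TransientError < StandardError; end

# Hypothetical helper illustrating begin/rescue/retry with a bounded count.
def with_retries(retries_left, log: [])
  begin
    yield
  rescue TransientError => e
    retries_left -= 1
    log << "## REQUEST FAILED ## #{e.class} ## Retrying #{retries_left} more times..."
    # `retry` re-runs the begin block; retries_left keeps its decremented value.
    retry if retries_left.positive?
    raise # assumption: give up by re-raising once attempts are exhausted
  end
end

attempts = 0
log = []
result = with_retries(3, log: log) do
  attempts += 1
  raise TransientError, 'timeout' if attempts < 3

  :ok
end
```

Here the block fails twice and succeeds on the third attempt, so two failure messages are recorded before the result is returned.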
