Skip to content

Commit 893b514

Browse files
committed
Fix logging for Tool Meisters
The debug-ability of the Tool Meisters running remotely is very difficult without a way to properly capture log output from them. We take three major steps to address this. The first step is to not use a Python logger until we will (or would have otherwise) daemonized the process. This allows us to emit error messages on stderr, which will be exposed in the callers context. Once we are daemonized, `stdout` and `stderr` are detached from the caller, so logging to some place other than `stdout`/`stderr` is the best option from that point forward. The second step is to recognize that logging to a file on a remote host is useless unless the operator can get a hold of that log file. However, remote Tool Meisters use a temporary directory, so writing to a file will just be lost when the temporary directory is removed by the process of shutting down the Tool Meister. Instead, we now log to `/dev/log` when daemonized. We then add the ability to run the Tool Meister without daemonizing so that in a container we can log to `stdout`/`stderr` to leverage the normal container logging mechanism. As a result of making this change, we introduce a method to request the Tool Meister not be daemonized, and re-organize the code a bit so that we can take advantage of those two environments. The third step is to add the ability for each Tool Meister to send logs back to the Tool Data Sink via Redis, so that the Tool Data Sink can collect them into a single log file making it easier to see what happened remotely. If the Redis server logging fails, then the operator can fall back to looking for Tool Meister logs in the container logs or in the remote host's normal syslog sink. Finally, we address dropped connections to the Redis server so that both the Tool Meister and the Tool Data Sink handle that a bit more gracefully.
1 parent 200de32 commit 893b514

File tree

10 files changed

+582
-258
lines changed

10 files changed

+582
-258
lines changed

agent/util-scripts/gold/test-client-tool-meister/test-53.txt

Lines changed: 24 additions & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -71,8 +71,8 @@ Collecting system information
7171
/var/tmp/pbench-test-utils/pbench/mock-run/tm/redis.conf
7272
/var/tmp/pbench-test-utils/pbench/mock-run/tm/redis.log
7373
/var/tmp/pbench-test-utils/pbench/mock-run/tm/tm-default-testhost.example.com.err
74-
/var/tmp/pbench-test-utils/pbench/mock-run/tm/tm-default-testhost.example.com.log
7574
/var/tmp/pbench-test-utils/pbench/mock-run/tm/tm-default-testhost.example.com.out
75+
/var/tmp/pbench-test-utils/pbench/mock-run/tm/tm.logs
7676
/var/tmp/pbench-test-utils/pbench/mock-run/tools-default
7777
/var/tmp/pbench-test-utils/pbench/pbench.log
7878
/var/tmp/pbench-test-utils/pbench/tmp
@@ -105,7 +105,6 @@ Hit Ctrl-C to quit.
105105
INFO pbench-tool-data-sink run -- Running Bottle web server ...
106106
INFO pbench-tool-data-sink execute -- Terminate bottle server
107107
INFO pbench-tool-data-sink run -- Bottle web server exited
108-
INFO pbench-tool-data-sink main -- Remove pid file ... (pbench-tool-data-sink.pid)
109108
--- mock-run/tm/pbench-tool-data-sink.log file contents
110109
+++ mock-run/tm/pbench-tool-data-sink.out file contents
111110
--- mock-run/tm/pbench-tool-data-sink.out file contents
@@ -128,9 +127,6 @@ port 17001
128127
# Redis is now ready to exit, bye bye...
129128
--- mock-run/tm/redis.log file contents
130129
+++ mock-run/tm/tm-default-testhost.example.com.err file contents
131-
--- mock-run/tm/tm-default-testhost.example.com.err file contents
132-
+++ mock-run/tm/tm-default-testhost.example.com.log file contents
133-
INFO pbench-tool-meister main -- params_key (tm-default-testhost.example.com): b'{"benchmark_run_dir": "/var/tmp/pbench-test-utils/pbench/mock-run", "channel": "tool-meister-chan", "controller": "testhost.example.com", "group": "default", "hostname": "testhost.example.com", "tools": {"iostat": "--interval=42 --options=forty-two", "mpstat": "--interval=42 --options=forty-two"}}'
134130
INFO pbench-tool-meister sysinfo -- pbench-sysinfo-dump -- /var/tmp/pbench-test-utils/opt/pbench-agent/unittest-scripts/pbench-sysinfo-dump /var/tmp/pbench-test-utils/pbench/mock-run/sysinfo/beg block,security_mitigations,sos
135131
INFO pbench-tool-meister sysinfo -- testhost.example.com: sysinfo send (no-op) default /var/tmp/pbench-test-utils/pbench/mock-run/sysinfo/beg
136132
INFO pbench-tool-meister start -- iostat: start_tool -- /var/tmp/pbench-test-utils/opt/pbench-agent/tool-scripts/iostat --start --dir=/var/tmp/pbench-test-utils/pbench/mock-run/0-iter-zero/sample42/tools-default/testhost.example.com --interval=42 --options=forty-two
@@ -149,11 +145,31 @@ INFO pbench-tool-meister wait -- waiting for stop mpstat
149145
INFO pbench-tool-meister send_tools -- testhost.example.com: send_tools (no-op) default /var/tmp/pbench-test-utils/pbench/mock-run/1-iter-one/sample42/tools-default/testhost.example.com
150146
INFO pbench-tool-meister sysinfo -- pbench-sysinfo-dump -- /var/tmp/pbench-test-utils/opt/pbench-agent/unittest-scripts/pbench-sysinfo-dump /var/tmp/pbench-test-utils/pbench/mock-run/sysinfo/end block,security_mitigations,sos
151147
INFO pbench-tool-meister sysinfo -- testhost.example.com: sysinfo send (no-op) default /var/tmp/pbench-test-utils/pbench/mock-run/sysinfo/end
152-
INFO pbench-tool-meister main -- terminating
153-
INFO pbench-tool-meister main -- Remove pid file ... (tm-default-testhost.example.com.pid)
154-
--- mock-run/tm/tm-default-testhost.example.com.log file contents
148+
INFO pbench-tool-meister driver -- terminating
149+
--- mock-run/tm/tm-default-testhost.example.com.err file contents
155150
+++ mock-run/tm/tm-default-testhost.example.com.out file contents
156151
--- mock-run/tm/tm-default-testhost.example.com.out file contents
152+
+++ mock-run/tm/tm.logs file contents
153+
testhost.example.com 0000 INFO pbench-tool-meister sysinfo -- pbench-sysinfo-dump -- /var/tmp/pbench-test-utils/opt/pbench-agent/unittest-scripts/pbench-sysinfo-dump /var/tmp/pbench-test-utils/pbench/mock-run/sysinfo/beg block,security_mitigations,sos
154+
testhost.example.com 0001 INFO pbench-tool-meister sysinfo -- testhost.example.com: sysinfo send (no-op) default /var/tmp/pbench-test-utils/pbench/mock-run/sysinfo/beg
155+
testhost.example.com 0002 INFO pbench-tool-meister start -- iostat: start_tool -- /var/tmp/pbench-test-utils/opt/pbench-agent/tool-scripts/iostat --start --dir=/var/tmp/pbench-test-utils/pbench/mock-run/0-iter-zero/sample42/tools-default/testhost.example.com --interval=42 --options=forty-two
156+
testhost.example.com 0003 INFO pbench-tool-meister start -- mpstat: start_tool -- /var/tmp/pbench-test-utils/opt/pbench-agent/tool-scripts/mpstat --start --dir=/var/tmp/pbench-test-utils/pbench/mock-run/0-iter-zero/sample42/tools-default/testhost.example.com --interval=42 --options=forty-two
157+
testhost.example.com 0004 INFO pbench-tool-meister stop -- iostat: stop_tool -- /var/tmp/pbench-test-utils/opt/pbench-agent/tool-scripts/iostat --stop --dir=/var/tmp/pbench-test-utils/pbench/mock-run/0-iter-zero/sample42/tools-default/testhost.example.com --interval=42 --options=forty-two
158+
testhost.example.com 0005 INFO pbench-tool-meister stop -- mpstat: stop_tool -- /var/tmp/pbench-test-utils/opt/pbench-agent/tool-scripts/mpstat --stop --dir=/var/tmp/pbench-test-utils/pbench/mock-run/0-iter-zero/sample42/tools-default/testhost.example.com --interval=42 --options=forty-two
159+
testhost.example.com 0006 INFO pbench-tool-meister wait -- waiting for stop iostat
160+
testhost.example.com 0007 INFO pbench-tool-meister wait -- waiting for stop mpstat
161+
testhost.example.com 0008 INFO pbench-tool-meister send_tools -- testhost.example.com: send_tools (no-op) default /var/tmp/pbench-test-utils/pbench/mock-run/0-iter-zero/sample42/tools-default/testhost.example.com
162+
testhost.example.com 0009 INFO pbench-tool-meister start -- iostat: start_tool -- /var/tmp/pbench-test-utils/opt/pbench-agent/tool-scripts/iostat --start --dir=/var/tmp/pbench-test-utils/pbench/mock-run/1-iter-one/sample42/tools-default/testhost.example.com --interval=42 --options=forty-two
163+
testhost.example.com 0010 INFO pbench-tool-meister start -- mpstat: start_tool -- /var/tmp/pbench-test-utils/opt/pbench-agent/tool-scripts/mpstat --start --dir=/var/tmp/pbench-test-utils/pbench/mock-run/1-iter-one/sample42/tools-default/testhost.example.com --interval=42 --options=forty-two
164+
testhost.example.com 0011 INFO pbench-tool-meister stop -- iostat: stop_tool -- /var/tmp/pbench-test-utils/opt/pbench-agent/tool-scripts/iostat --stop --dir=/var/tmp/pbench-test-utils/pbench/mock-run/1-iter-one/sample42/tools-default/testhost.example.com --interval=42 --options=forty-two
165+
testhost.example.com 0012 INFO pbench-tool-meister stop -- mpstat: stop_tool -- /var/tmp/pbench-test-utils/opt/pbench-agent/tool-scripts/mpstat --stop --dir=/var/tmp/pbench-test-utils/pbench/mock-run/1-iter-one/sample42/tools-default/testhost.example.com --interval=42 --options=forty-two
166+
testhost.example.com 0013 INFO pbench-tool-meister wait -- waiting for stop iostat
167+
testhost.example.com 0014 INFO pbench-tool-meister wait -- waiting for stop mpstat
168+
testhost.example.com 0015 INFO pbench-tool-meister send_tools -- testhost.example.com: send_tools (no-op) default /var/tmp/pbench-test-utils/pbench/mock-run/1-iter-one/sample42/tools-default/testhost.example.com
169+
testhost.example.com 0016 INFO pbench-tool-meister sysinfo -- pbench-sysinfo-dump -- /var/tmp/pbench-test-utils/opt/pbench-agent/unittest-scripts/pbench-sysinfo-dump /var/tmp/pbench-test-utils/pbench/mock-run/sysinfo/end block,security_mitigations,sos
170+
testhost.example.com 0017 INFO pbench-tool-meister sysinfo -- testhost.example.com: sysinfo send (no-op) default /var/tmp/pbench-test-utils/pbench/mock-run/sysinfo/end
171+
testhost.example.com 0018 INFO pbench-tool-meister driver -- terminating
172+
--- mock-run/tm/tm.logs file contents
157173
+++ test-execution.log file contents
158174
/var/tmp/pbench-test-utils/opt/pbench-agent/unittest-scripts/pbench-sysinfo-dump /var/tmp/pbench-test-utils/pbench/mock-run/sysinfo/beg block,security_mitigations,sos
159175
/var/tmp/pbench-test-utils/opt/pbench-agent/unittest-scripts/pbench-sysinfo-dump /var/tmp/pbench-test-utils/pbench/mock-run/sysinfo/end block,security_mitigations,sos

0 commit comments

Comments
 (0)