Commit 9abe90d

Remove HTTP basic auth credentials from log, stacktrace, segment (#152)
1 parent 48eb4fa commit 9abe90d

11 files changed: +142 -42 lines changed
CHANGELOG.md

Lines changed: 7 additions & 1 deletion
@@ -2,9 +2,15 @@
 
 ### 0.7.0
 
-- New plugins
+- Feature:
+    - Support collecting and reporting logs to backend (#147)
+
+- New plugins:
     - Falcon Plugin (#146)
 
+- Fixes:
+    - Now properly removes HTTP basic auth credentials from segments and logs (#152)
+
 ### 0.6.0
 
 - Fixes:

docs/EnvVars.md

Lines changed: 1 addition & 0 deletions
@@ -38,3 +38,4 @@ Environment Variable | Description | Default
 | `SW_AGENT_LOG_IGNORE_FILTER` | This config customizes whether to ignore the application-defined logger filters, if `True`, all logs are reported disregarding any filter rules. | `False` |
 | `SW_AGENT_LOG_REPORTER_FORMATTED` | If `True`, the log reporter will transmit the logs as formatted. Otherwise, puts logRecord.msg and logRecord.args into message content and tags(`argument.n`), respectively. Along with an `exception` tag if an exception was raised. | `True` |
 | `SW_AGENT_LOG_REPORTER_LAYOUT` | The log reporter formats the logRecord message based on the layout given. | `%(asctime)s [%(threadName)s] %(levelname)s %(name)s - %(message)s` |
+| `SW_AGENT_CAUSE_EXCEPTION_DEPTH` | This config limits agent to report up to `limit` stacktrace, please refer to [Python traceback](https://docs.python.org/3/library/traceback.html#traceback.print_tb) for more explanations. | `5` |

docs/LogReporter.md

Lines changed: 15 additions & 8 deletions
@@ -9,13 +9,13 @@ To utilize this feature, you will need to add some new configurations to the age
 from skywalking import agent, config
 
 config.init(collector_address='127.0.0.1:11800', service_name='your awesome service',
-                log_grpc_reporter_active=True)
+                log_reporter_active=True)
 agent.start()
 ```
 
-`log_grpc_reporter_active=True` - Enables the log reporter.
+`log_reporter_active=True` - Enables the log reporter.
 
-`log_grpc_reporter_max_buffer_size` - The maximum queue backlog size for sending log data to backend, logs beyond this are silently dropped.
+`log_reporter_max_buffer_size` - The maximum queue backlog size for sending log data to backend, logs beyond this are silently dropped.
 
 Alternatively, you can pass configurations through environment variables.
 Please refer to [EnvVars.md](EnvVars.md) for the list of environment variables associated with the log reporter.
@@ -24,7 +24,7 @@ Please refer to [EnvVars.md](EnvVars.md) for the list of environment variables a
 Only the logs with a level equal to or higher than the specified will be collected and reported.
 In other words, the agent ignores some unwanted logs based on your level threshold.
 
-`log_grpc_reporter_level` - The string name of a logger level.
+`log_reporter_level` - The string name of a logger level.
 
 Note that it also works with your custom logger levels, simply specify its string name in the config.
 
@@ -40,7 +40,7 @@ class AppFilter(logging.Filter):
 
 logger.addFilter(AppFilter())
 ```
-However, if you do would like to report those filtered logs, set the `log_grpc_reporter_ignore_filter` to `True`.
+However, if you do would like to report those filtered logs, set the `log_reporter_ignore_filter` to `True`.
 
 
 ## Formatting
@@ -51,20 +51,27 @@ Note that regardless of the formatting, Python agent will always report the foll
 `logger` - the logger name
 
 `thread` - the thread name
+
+### Limit stacktrace depth
+You can set the `cause_exception_depth` config entry to a desired level(defaults to 5), which limits the output depth of exception stacktrace in reporting.
+
+This config limits agent to report up to `limit` stacktrace, please refer to [Python traceback](https://docs.python.org/3/library/traceback.html#traceback.print_tb) for more explanations.
+
 ### Customize the reported log format
 You can choose to report collected logs in a custom layout.
 
-If not set, the agent uses the layout below by default, else the agent uses your custom layout set in `log_grpc_reporter_layout`.
+If not set, the agent uses the layout below by default, else the agent uses your custom layout set in `log_reporter_layout`.
 
 `'%(asctime)s [%(threadName)s] %(levelname)s %(name)s - %(message)s'`
 
-If the layout is set to `None`, the reported log content will only contain the pre-formatted `LogRecord.message`(`msg % args`) without any additional styles, information or extra fields.
+If the layout is set to `None`, the reported log content will only contain
+the pre-formatted `LogRecord.message`(`msg % args`) without any additional styles or extra fields, stacktrace will be attached if an exception was raised.
 
 ### Transmit un-formatted logs
 You can also choose to report the log messages without any formatting.
 It separates the raw log msg `logRecord.msg` and `logRecord.args`, then puts them into message content and tags starting from `argument.0`, respectively, along with an `exception` tag if an exception was raised.
 
-Note when you set `log_grpc_reporter_formatted` to False, it ignores your custom layout introduced above.
+Note when you set `log_reporter_formatted` to False, it ignores your custom layout introduced above.
 
 As an example, the following code:
 ```python

skywalking/agent/__init__.py

Lines changed: 4 additions & 4 deletions
@@ -99,8 +99,8 @@ def __init_threading():
     __report_thread.start()
     __command_dispatch_thread.start()
 
-    if config.log_grpc_reporter_active:
-        __log_queue = Queue(maxsize=config.log_grpc_reporter_max_buffer_size)
+    if config.log_reporter_active:
+        __log_queue = Queue(maxsize=config.log_reporter_max_buffer_size)
         __log_report_thread = Thread(name='LogReportThread', target=__report_log, daemon=True)
         __log_report_thread.start()
 
@@ -122,7 +122,7 @@ def __init():
         __protocol = KafkaProtocol()
 
     plugins.install()
-    if config.log_grpc_reporter_active:  # todo - Add support for printing traceID/ context in logs
+    if config.log_reporter_active:  # todo - Add support for printing traceID/ context in logs
         from skywalking import log
         log.install()
 
@@ -132,7 +132,7 @@ def __init():
 def __fini():
     __protocol.report(__queue, False)
     __queue.join()
-    if config.log_grpc_reporter_active:
+    if config.log_reporter_active:
         __protocol.report_log(__log_queue, False)
         __log_queue.join()
     __finished.set()

skywalking/config.py

Lines changed: 8 additions & 8 deletions
@@ -68,17 +68,17 @@
 profile_task_query_interval = int(os.getenv('SW_PROFILE_TASK_QUERY_INTERVAL') or '20')
 
 # NOTE - Log reporting requires a separate channel, will merge in the future.
-log_grpc_reporter_active = True if os.getenv('SW_AGENT_LOG_REPORTER_ACTIVE') and \
-    os.getenv('SW_AGENT_LOG_REPORTER_ACTIVE') == 'True' else False  # type: bool
-log_grpc_reporter_max_buffer_size = int(os.getenv('SW_AGENT_LOG_REPORTER_BUFFER_SIZE') or '10000')  # type: int
-log_grpc_reporter_level = os.getenv('SW_AGENT_LOG_REPORTER_LEVEL') or 'WARNING'  # type: str
-log_grpc_reporter_ignore_filter = True if os.getenv('SW_AGENT_LOG_IGNORE_FILTER') and \
+log_reporter_active = True if os.getenv('SW_AGENT_LOG_REPORTER_ACTIVE') and \
+    os.getenv('SW_AGENT_LOG_REPORTER_ACTIVE') == 'True' else False  # type: bool
+log_reporter_max_buffer_size = int(os.getenv('SW_AGENT_LOG_REPORTER_BUFFER_SIZE') or '10000')  # type: int
+log_reporter_level = os.getenv('SW_AGENT_LOG_REPORTER_LEVEL') or 'WARNING'  # type: str
+log_reporter_ignore_filter = True if os.getenv('SW_AGENT_LOG_IGNORE_FILTER') and \
     os.getenv('SW_AGENT_LOG_REPORTER_FORMATTED') == 'True' else False  # type: bool
-log_grpc_reporter_formatted = False if os.getenv('SW_AGENT_LOG_REPORTER_FORMATTED') and \
+log_reporter_formatted = False if os.getenv('SW_AGENT_LOG_REPORTER_FORMATTED') and \
     os.getenv('SW_AGENT_LOG_REPORTER_FORMATTED') == 'False' else True  # type: bool
-log_grpc_reporter_layout = os.getenv('SW_AGENT_LOG_REPORTER_LAYOUT') or \
+log_reporter_layout = os.getenv('SW_AGENT_LOG_REPORTER_LAYOUT') or \
     '%(asctime)s [%(threadName)s] %(levelname)s %(name)s - %(message)s'  # type: str
-
+cause_exception_depth = int(os.getenv('SW_AGENT_CAUSE_EXCEPTION_DEPTH') or '5')  # type: int
 
 options = {key for key in globals() if key not in options}  # THIS MUST FOLLOW DIRECTLY AFTER LIST OF CONFIG OPTIONS!
 
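
The boolean options in this hunk only honour the exact strings `'True'` and `'False'`; anything else falls back to the default. A minimal sketch of the same rule as a single helper follows. `env_flag` is a hypothetical name and is not part of the agent. Reading one variable per option would also avoid mixing names, as in the `log_reporter_ignore_filter` expression, which checks `SW_AGENT_LOG_IGNORE_FILTER` for presence but compares `SW_AGENT_LOG_REPORTER_FORMATTED`.

```python
import os


def env_flag(name, default):
    """Hypothetical helper: map the strings 'True'/'False' to booleans, else use the default."""
    value = os.getenv(name)
    if value == 'True':
        return True
    if value == 'False':
        return False
    return default


# Same semantics as the expressions in the diff: only the exact strings count.
log_reporter_active = env_flag('SW_AGENT_LOG_REPORTER_ACTIVE', default=False)
log_reporter_formatted = env_flag('SW_AGENT_LOG_REPORTER_FORMATTED', default=True)
```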

skywalking/log/formatter.py

Lines changed: 38 additions & 0 deletions
@@ -0,0 +1,38 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements. See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+import io
+import logging
+import traceback
+
+
+class SWFormatter(logging.Formatter):
+    """ A slightly modified formatter that allows traceback depth """
+
+    def __init__(self, fmt, tb_limit):
+        logging.Formatter.__init__(self, fmt)
+        self.tb_limit = tb_limit
+
+    def formatException(self, ei):
+        sio = io.StringIO()
+        tb = ei[2]
+        traceback.print_exception(ei[0], ei[1], tb, self.tb_limit, sio)
+        s = sio.getvalue()
+        sio.close()
+        if s[-1:] == "\n":
+            s = s[:-1]
+        return s
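
To see what the traceback limit does in practice, the sketch below formats an exception raised ten calls deep with a limit of 2. It is an illustration under assumptions, not the agent's code: `LimitedTracebackFormatter`, `recurse` and `format_failure` are hypothetical names, and `limit`/`file` are passed by keyword rather than positionally as in the diff.

```python
import io
import logging
import sys
import traceback


class LimitedTracebackFormatter(logging.Formatter):
    """Hypothetical stand-in for SWFormatter: caps the number of traceback frames."""

    def __init__(self, fmt, tb_limit):
        super().__init__(fmt)
        self.tb_limit = tb_limit

    def formatException(self, ei):
        sio = io.StringIO()
        # A positive limit keeps only that many frames, counted from the outermost one.
        traceback.print_exception(ei[0], ei[1], ei[2], limit=self.tb_limit, file=sio)
        return sio.getvalue().rstrip("\n")


def recurse(n):
    if n == 0:
        raise ValueError("boom")
    recurse(n - 1)


def format_failure(tb_limit):
    try:
        recurse(10)
    except ValueError:
        record = logging.LogRecord("demo", logging.ERROR, "demo.py", 0,
                                   "failed", None, sys.exc_info())
    return LimitedTracebackFormatter("%(levelname)s %(message)s", tb_limit).format(record)


text = format_failure(tb_limit=2)
frame_count = text.count('  File "')  # number of frames kept in the output
```

With a limit of 2, only the two outermost frames survive, while the final `ValueError: boom` line is still reported.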

skywalking/log/sw_logging.py

Lines changed: 15 additions & 13 deletions
@@ -22,17 +22,19 @@
 
 from skywalking import config, agent
 from skywalking.trace.context import get_context
+from skywalking.utils.filter import sw_traceback, sw_filter
 
 
 def install():
-    from logging import Logger, Formatter
+    from logging import Logger
 
-    layout = config.log_grpc_reporter_layout  # type: str
+    layout = config.log_reporter_layout  # type: str
     if layout:
-        formatter = Formatter(fmt=layout)
+        from skywalking.log.formatter import SWFormatter
+        formatter = SWFormatter(fmt=layout, tb_limit=config.cause_exception_depth)
 
     _handle = Logger.handle
-    log_reporter_level = logging.getLevelName(config.log_grpc_reporter_level)  # type: int
+    log_reporter_level = logging.getLevelName(config.log_reporter_level)  # type: int
 
     def _sw_handle(self, record):
         if record.name == "skywalking":  # Ignore SkyWalking internal logger
@@ -41,7 +43,7 @@ def _sw_handle(self, record):
         if record.levelno < log_reporter_level:
             return _handle(self, record)
 
-        if not config.log_grpc_reporter_ignore_filter and not self.filter(record):  # ignore filtered logs
+        if not config.log_reporter_ignore_filter and not self.filter(record):  # ignore filtered logs
             return _handle(self, record)  # return handle to original if record is vetoed, just to be safe
 
         def build_log_tags() -> LogTags:
@@ -53,15 +55,16 @@ def build_log_tags() -> LogTags:
             l_tags = LogTags()
             l_tags.data.extend(core_tags)
 
-            if config.log_grpc_reporter_formatted:
+            if config.log_reporter_formatted:
                 return l_tags
 
             for i, arg in enumerate(record.args):
                 l_tags.data.append(KeyStringValuePair(key='argument.' + str(i), value=str(arg)))
 
             if record.exc_info:
-                l_tags.data.append(KeyStringValuePair(key='exception', value=str(record.exc_info)))
-
+                l_tags.data.append(KeyStringValuePair(key='exception',
+                                                      value=sw_traceback()
+                                                      ))  # \n doesn't work in tags for UI
             return l_tags
 
         context = get_context()
@@ -73,7 +76,7 @@ def build_log_tags() -> LogTags:
                 body=LogDataBody(
                     type='text',
                     text=TextLog(
-                        text=transform(record)
+                        text=sw_filter(transform(record))
                     )
                 ),
                 traceContext=TraceContext(
@@ -90,9 +93,8 @@ def build_log_tags() -> LogTags:
     Logger.handle = _sw_handle
 
     def transform(record) -> str:
-        if config.log_grpc_reporter_formatted:
+        if config.log_reporter_formatted:
             if layout:
                 return formatter.format(record=record)
-            return record.getMessage()
-
-        return record.msg
+            return record.getMessage() + '\n' + sw_traceback()
+        return str(record.msg)  # convert possible exception to str
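
The `sw_filter` and `sw_traceback` helpers are imported from `skywalking/utils/filter.py`, which is not shown on this page. The sketch below is only a guess at how such helpers could behave: a regular expression removes the `user:password@` part of any URL in free text, and the traceback is capped before being filtered. The names `redact_credentials` and `redacted_traceback`, the regex, and the default limit of 5 are all assumptions.

```python
import re
import traceback

# Matches "scheme://userinfo@" and captures the scheme so that it can be kept.
_CREDENTIALS = re.compile(r"(\b[a-zA-Z][a-zA-Z0-9+.-]*://)[^/\s@]+@")


def redact_credentials(text):
    """Hypothetical analogue of sw_filter: drop userinfo from URLs embedded in text."""
    return _CREDENTIALS.sub(r"\1", text)


def redacted_traceback(limit=5):
    """Hypothetical analogue of sw_traceback: depth-limited, credential-free traceback."""
    return redact_credentials(traceback.format_exc(limit=limit))


print(redact_credentials("GET http://user:pass@host/path failed"))
# GET http://host/path failed
```

`redacted_traceback` is meant to be called from inside an `except` block, where `traceback.format_exc` can see the exception being handled.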

skywalking/plugins/sw_requests.py

Lines changed: 3 additions & 3 deletions
@@ -31,8 +31,8 @@ def _sw_request(this: Session, method, url,
                     auth=None, timeout=None, allow_redirects=True, proxies=None,
                     hooks=None, stream=None, verify=None, cert=None, json=None):
 
-        from urllib.parse import urlparse
-        url_param = urlparse(url)
+        from skywalking.utils.filter import sw_urlparse
+        url_param = sw_urlparse(url)
 
         # ignore trace skywalking self request
         if config.protocol == 'http' and config.collector_address.rstrip('/').endswith(url_param.netloc):
@@ -55,7 +55,7 @@ def _sw_request(this: Session, method, url,
                 headers[item.key] = item.val
 
             span.tag(TagHttpMethod(method.upper()))
-            span.tag(TagHttpURL(url))
+            span.tag(TagHttpURL(url_param.geturl()))
 
             res = _request(this, method, url, params, data, headers, cookies, files, auth, timeout,
                            allow_redirects,

skywalking/plugins/sw_urllib3.py

Lines changed: 3 additions & 3 deletions
@@ -27,9 +27,9 @@ def install():
     _request = RequestMethods.request
 
     def _sw_request(this: RequestMethods, method, url, fields=None, headers=None, **urlopen_kw):
-        from urllib.parse import urlparse
+        from skywalking.utils.filter import sw_urlparse
 
-        url_param = urlparse(url)
+        url_param = sw_urlparse(url)
 
         span = NoopSpan(NoopContext()) if config.ignore_http_method_check(method) \
             else get_context().new_exit_span(op=url_param.path or "/", peer=url_param.netloc,
@@ -45,7 +45,7 @@ def _sw_request(this: RequestMethods, method, url, fields=None, headers=None, **
                 headers[item.key] = item.val
 
             span.tag(TagHttpMethod(method.upper()))
-            span.tag(TagHttpURL(url))
+            span.tag(TagHttpURL(url_param.geturl()))
 
             res = _request(this, method, url, fields=fields, headers=headers, **urlopen_kw)
 
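
Both HTTP client plugins now take the span peer from `url_param.netloc` and the URL tag from `url_param.geturl()`, so the credentials have to be gone from the parse result itself. The implementation of `sw_urlparse` is not shown on this page; the following is a minimal sketch of one way to get that behaviour with the standard library, where `strip_credentials` is a hypothetical name.

```python
from urllib.parse import urlparse


def strip_credentials(url):
    """Hypothetical analogue of sw_urlparse: parse a URL and drop any userinfo."""
    parsed = urlparse(url)
    # netloc has the form [user[:password]@]host[:port]; keep what follows the last '@'.
    host = parsed.netloc.rpartition("@")[2]
    return parsed._replace(netloc=host)


cleaned = strip_credentials("https://user:secret@example.com:8080/path?q=1")
print(cleaned.geturl())  # https://example.com:8080/path?q=1
print(cleaned.netloc)    # example.com:8080
```

Because the result is still a `ParseResult`, callers that read `path`, `netloc` or `geturl()` keep working unchanged, which matches how the plugins use `url_param` in this diff.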

skywalking/trace/span.py

Lines changed: 2 additions & 2 deletions
@@ -16,7 +16,6 @@
 #
 
 import time
-import traceback
 from abc import ABC
 from collections import defaultdict
 from typing import List, Union, DefaultDict
@@ -85,9 +84,10 @@ def finish(self, segment: 'Segment') -> bool:
         return True
 
     def raised(self) -> 'Span':
+        from skywalking.utils.filter import sw_traceback
         self.error_occurred = True
         self.logs = [Log(items=[
-            LogItem(key='Traceback', val=traceback.format_exc()),
+            LogItem(key='Traceback', val=sw_traceback()),
         ])]
         return self
 
