[Service Discovery] create a template cache to reduce calls to the KV store#3060
[Service Discovery] create a template cache to reduce calls to the KV store#3060
Conversation
9122643 to
6defada
Compare
6defada to
e3094d3
Compare
truthbk
left a comment
There was a problem hiding this comment.
This looks really good. Just a few minor comments/questions.
| init_tpl = auto_conf.get('init_config') or {} | ||
| instance_tpl = auto_conf.get('instances', []) | ||
| if image not in auto_conf_images: | ||
| auto_conf_images[image] = [[check_name], [init_tpl], [instance_tpl]] |
There was a problem hiding this comment.
Wouldn't it end up being less error-prone to have the init and instance configs/templates in a dict with check_name as the key?
There was a problem hiding this comment.
you mean like auto_conf_images[(image, check_name)] = [init_tpl, instance_tpl]?
Problem with that is if users have several instances of the same check for the same image (and a customer actually asked how to do that yesterday so it happens). In this case we still need to wrap init_tpl and instance_tpl in lists and end up with the same logic as we have now.
But maybe I didn't understand correctly your suggestion
There was a problem hiding this comment.
I think I originally meant auto_conf_images[image] = {'checkname': {'init': init_tpl, 'instances: [intance_tpl_0, instance_tpl_1,...]}}
But then I read on, and I saw it wasn't necessarily a good idea given the current structure/logic. I felt it would make the code a little more maintainable (perhaps). We can maybe just keep in mind for the future (it'd be a pretty big refactor), there's nothing wrong with the current logic.
| log.warning("Conflicting templates in auto_conf for image %s and check %s. " | ||
| "Please check your template files." % (image, check_name)) | ||
| continue | ||
| self.auto_conf_templates[image][0].append(check_name) |
There was a problem hiding this comment.
Likewise, I'm not sure if this wouldn't be better as self.auto_conf_templates[image][check_name] = (init_tpl, instance_tpl)
There was a problem hiding this comment.
Oh I get it, we stick to lists for "compatibility" across the board when dealing with the configs. I think we could maybe refactor in the future, but it's a legit choice.
|
|
||
|
|
||
| def get_auto_conf(agentConfig, check_name): | ||
| def get_auto_conf(check_name): |
There was a problem hiding this comment.
I feel like get_auto_conf() and get_auto_config() as names are a bit too close to each other...
| # if _issue_read doesn't return anything we still create a key in the cache | ||
| # so that subsequent reads don't trigger issue_read every time | ||
| else: | ||
| self.kv_templates[identifier] = None |
There was a problem hiding this comment.
I know this is a NOOP but I noticed that we do include in the results templates[CONFIG_FROM_TEMPLATE] = self.kv_templates[identifier] even when self.kv_templates[identifier] is None if we have a hit, but we don't do that with a miss.
Just saying because it occured to me we could refactor this as:
if identifier not in self.kv_templates:
tpls = self._issue_read(identifier)
if tpls:
templates[CONFIG_FROM_TEMPLATE] = tpls
self.kv_templates[identifier] = tpls
else:
self.kv_templates[identifier] = None
templates[CONFIG_FROM_TEMPLATE] = self.kv_templates[identifier]
because we always populate after a hit or fetch.
There was a problem hiding this comment.
man that snippet is on point! 👌
| # lastly fallback to auto_conf | ||
| return set(self.identifier_to_checks[self._get_image_ident(identifier)]) | ||
| # lastly fallback to legacy name for auto-conf | ||
| return self.template_cache.get_check_names(self._get_image_ident(identifier)) |
There was a problem hiding this comment.
Question: can we have checks under the legacy name that should also be applied? So a case where identifier and the legacy self._get_image_ident(identifier) would both apply to an image?
There was a problem hiding this comment.
Good catch! That's a use case we didn't support before (if user-defined templates were found we skipped auto config) but I added logic for that here and forgot to update this part. Will do it now.
| @mock.patch.object(AbstractConfigStore, '_issue_read', side_effect=issue_read) | ||
| def test_read_config_from_store(self, issue_read): | ||
| @mock.patch.object(_TemplateCache, '_issue_read', side_effect=issue_read) | ||
| def test_read_config_from_store(self, args): |
There was a problem hiding this comment.
Big 🙇 for all these awesome tests!
e3094d3 to
999f7bd
Compare
truthbk
left a comment
There was a problem hiding this comment.
Don't see anything wrong with this! ![]()
7543301 to
2eaab69
Compare
truthbk
left a comment
There was a problem hiding this comment.
Just a couple of nits, but I approve! ;)
agent.py
Outdated
| self.sd_backend = get_sd_backend(self._agentConfig) | ||
|
|
||
| if _is_affirmative(self._agentConfig.get('sd_jmx_enable')): | ||
| if _is_affirmative(self._agentConfig.get('sd_jmx_enable', '')): |
There was a problem hiding this comment.
This would probably be more readable as: _is_affirmative(self._agentConfig.get('sd_jmx_enable', False)).... Am I missing something?
| log.exception('Failed to retrieve a template for %s.' % identifier) | ||
| # create a key in the cache even if _issue_read doesn't return a tpl | ||
| # so that subsequent reads don't trigger issue_read | ||
| self.kv_templates[identifier] = tpls or None |
There was a problem hiding this comment.
I'm not sure you need the or None? tpls will already be something if self._issue_read(identifier) returns either a list or None, or the None we've set it in the except blocks.
This cache is invalidated when a template change is detected. It doesn't detect changes to the auto_conf folder or JMX checks, only to the KV store. Refreshing configs from files still requires a SIGHUP or a restart.
2eaab69 to
13d8a24
Compare
|
Thanks @truthbk ! I addressed your comments and squashed the commits. Will merge once the tests pass 🎉 |
Note: Please remember to review the Datadog Contribution Guidelines
if you have not yet done so.
What does this PR do?
Create a cache for configuration templates
SIGHUP.Minor clean ups and test updates
Motivation
Using this cache avoids querying the config store at every container update.
Testing Guidelines
Tests were updated accordingly. A custom docker image using this build is also available at the
TODOtag.