New aws-secret argument resolver & uuid value provider #425

AdityaSatnalika0202 · 2025-06-26T04:07:53Z

Adding a new Resolver to fetch AWS Secrets from Secrets Manager.

Usage:

    Supports two formats:
    1. 'ENV_VAR_NAME.json_key' - For JSON secrets, returns the specific JSON key value
    2. 'ENV_VAR_NAME' - For string secrets, returns the entire secret value

    Args:
        variable_name: The variable name in either format:
                     - 'ENV_VAR_NAME.json_key' for JSON secrets
                     - 'ENV_VAR_NAME' for string secrets

    Returns:
        The resolved value from the secret, or None if resolution failed

    Example:
        In nodestream.yaml:
            password: !aws-secret NEO4J_PASSWORD.password
            # OR
            password: !aws-secret NEO4J_PASSWORD
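The two formats above could be handled roughly as in the sketch below. It is not the resolver's actual code: `split_variable_name` and `pick_value` are hypothetical helper names, and the secret string is passed in directly rather than fetched from Secrets Manager. It follows the description's contract of returning None when resolution fails.

```python
import json
from typing import Optional, Tuple


def split_variable_name(variable_name: str) -> Tuple[str, Optional[str]]:
    """Split 'ENV_VAR_NAME.json_key' into (env_var_name, json_key).

    A name without a dot is treated as a plain string secret.
    """
    env_var_name, _, json_key = variable_name.partition(".")
    return env_var_name, (json_key or None)


def pick_value(secret_string: str, json_key: Optional[str]) -> Optional[str]:
    """Return the whole secret, or a single key of a JSON secret."""
    if json_key is None:
        return secret_string
    try:
        payload = json.loads(secret_string)
    except json.JSONDecodeError:
        return None  # not valid JSON, so the key cannot be resolved
    if not isinstance(payload, dict) or json_key not in payload:
        return None
    return str(payload[json_key])
```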

Adding a new UUID value provider.

Usage:

Supports both simple string input and structured configuration:

Simple format:
    id: !uuid  # Random UUID v4
    id: !uuid "finding"  # Deterministic UUID v5 based on "finding"

Structured format:
    # Full configuration with both variable_name and namespace
    id: !uuid
      variable_name: "finding"
      namespace: "my-custom-namespace"

    # Only variable_name (uses default namespace "nodestream")
    id: !uuid
      variable_name: "exposure_finding"

    # Only namespace (generates random UUID v4 with custom namespace)
    id: !uuid
      namespace: "my-random-namespace"

    # Empty configuration (generates random UUID v4 with default namespace)
    id: !uuid
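A minimal sketch of the v4/v5 split described above, using only the standard `uuid` module. The function name `make_uuid` is hypothetical, and how the provider turns a namespace string into a namespace UUID is not stated in this PR; deriving it with `uuid5` over `NAMESPACE_DNS` is an assumption made for illustration.

```python
import uuid
from typing import Optional

DEFAULT_NAMESPACE = "nodestream"  # default named in the usage notes above


def make_uuid(variable_name: Optional[str] = None,
              namespace: str = DEFAULT_NAMESPACE) -> str:
    """Random v4 when no name is given; deterministic v5 otherwise."""
    if not variable_name:
        return str(uuid.uuid4())
    # Assumption: derive a namespace UUID from the namespace string.
    namespace_uuid = uuid.uuid5(uuid.NAMESPACE_DNS, namespace)
    return str(uuid.uuid5(namespace_uuid, variable_name))
```

Because a version 4 UUID is purely random, the namespace only changes the result in this sketch when a `variable_name` is supplied.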

jbristow · 2025-07-02T21:46:50Z

nodestream/pipeline/argument_resolvers/aws_secret_resolver.py

+
+    def get_secret_value(self, secret_id: str) -> str | None:
+        """Return the secret value (or None if not found)."""
+        return self._secret_value or None


Where is self._secret_value defined or set?

Done, initialized them in the __init__ function:

    # Initialize instance attributes to prevent AttributeError
    self._secret_value: Optional[str] = None
    self.default: Optional[str] = None

jbristow · 2025-07-02T21:47:14Z

nodestream/pipeline/argument_resolvers/aws_secret_resolver.py

+        """Return the secret value (or a default if not found)."""
+        return self._secret_value or self.default  # type: ignore[no-any-return]
+
+    def get_secret_value(self, secret_id: str) -> str | None:


What is the purpose of the parameter secret_id? It appears unused.

Yup, it was a redundant function; removed it.

jbristow · 2025-07-02T21:50:29Z

nodestream/pipeline/argument_resolvers/aws_secret_resolver.py

+    cache_ttl: int = 300  # 5 minutes
+    max_retries: int = 3
+    retry_delay: float = 1.0
+    # todo get region from environment variable or config


Is there a reason that this TODO is unfinished?
Does it make sense to even have a default for region_name? If we don't want to add the ability to set from an environment variable/config for this initial pass, I think it might make more sense to fail if the region name is not set explicitly by the user.

@zprobst can provide better insights on this from our discussions.

@jbristow nodestream is lacking a way to configure argument resolvers. We can push these things to environment variables but there is no "smart" way to configure these from say... a nodestream.yaml file. I suggested deferring configuration of these until we think through how we want to support that.
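One way the "fail if the region is not set" suggestion might look, as a sketch only. `resolve_region` and `MissingRegionError` are hypothetical names, and reading `AWS_DEFAULT_REGION` (the variable boto3 itself honors) is an assumption about which environment variable would be used.

```python
import os
from typing import Mapping, Optional


class MissingRegionError(RuntimeError):
    """Raised when no AWS region has been configured (hypothetical name)."""


def resolve_region(explicit: Optional[str] = None,
                   env: Mapping[str, str] = os.environ) -> str:
    """Prefer an explicit region, then AWS_DEFAULT_REGION; otherwise fail."""
    region = explicit or env.get("AWS_DEFAULT_REGION")
    if not region:
        raise MissingRegionError(
            "No AWS region configured; set AWS_DEFAULT_REGION or pass one explicitly."
        )
    return region
```

Failing loudly here avoids silently querying a default region that does not hold the secret.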

jbristow · 2025-07-02T21:53:19Z

nodestream/pipeline/argument_resolvers/aws_secret_resolver.py

+# Initialize caches
+secret_cache = SecretCache()
+json_cache = SecretCache()


Should these be initialized on module load? I know it's not a huge deal, but it seems a little aggressive to load these caches whenever someone loads any part of the nodestream library.

How do we do this?

Can I do this: initialize the caches only when the AWSSecretResolver is first instantiated or used?

@jbristow Moved the cache initialization to the first time they're actually needed.
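A rough sketch of lazy initialization as discussed in this thread; the PR's actual change may differ. `SecretCache` here is a bare stand-in for the class in the diff, and `get_secret_cache` is a hypothetical accessor name.

```python
from typing import Dict, Optional


class SecretCache:
    """Stand-in for the PR's SecretCache; only what this sketch needs."""

    def __init__(self) -> None:
        self._cache: Dict[str, str] = {}


# Nothing is allocated at import time; the cache is created on demand.
_secret_cache: Optional[SecretCache] = None


def get_secret_cache() -> SecretCache:
    """Create the cache on first use and reuse it afterwards."""
    global _secret_cache
    if _secret_cache is None:
        _secret_cache = SecretCache()
    return _secret_cache
```

This version is not thread-safe; if resolvers can be created concurrently, the check-and-create step would need a lock.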

jbristow · 2025-07-02T21:55:35Z

nodestream/pipeline/argument_resolvers/aws_secret_resolver.py

+def retry_on_error(max_retries: int = 3, delay: float = 1.0) -> Callable[[F], F]:
+    """Decorator to retry a function on failure.
+
+    Args:
+        max_retries: Maximum number of retries
+        delay: Delay between retries in seconds
+
+    Returns:
+        Decorated function that will retry on failure
+
+    Example:
+        @retry_on_error(max_retries=3, delay=1.0)
+        def my_function():
+            # Function that may fail
+            pass
+    """
+
+    def decorator(func: F) -> F:
+        @wraps(func)
+        def wrapper(*args: Any, **kwargs: Any) -> Any:
+            last_exception = None
+            for attempt in range(max_retries):
+                try:
+                    return func(*args, **kwargs)
+                except Exception as e:
+                    last_exception = e
+                    if attempt < max_retries - 1:
+                        msg = (
+                            f"Attempt {attempt + 1} failed for {func.__name__}, "
+                            f"retrying in {delay} seconds... Error: {str(e)}"
+                        )
+                        logger.warning(msg)
+                        time.sleep(delay)
+            raise last_exception or Exception("Unknown error occurred")
+
+        return cast(F, wrapper)
+
+    return decorator


@zprobst Are we still ok with one-off custom retrier logic proliferating through nodestream? Should we think about consolidating around a specific implementation for nodestream or a library like tenacity?

Yeah, I think it's about time we start to get serious about this issue. I am good with tenacity if you think it's the best way forward, @jbristow.

Yeah, I default to tenacity because it's actively maintained and has a simple API for handling most cases, but still exposes granular config points so that you can do much more complicated things.

jbristow · 2025-07-02T22:33:13Z

nodestream/pipeline/argument_resolvers/aws_secret_resolver.py

+        try:
+            response = self._client.get_secret_value(SecretId=secret_name)
+            if "SecretString" in response:
+                return response["SecretString"]  # type: ignore[no-any-return]


I think this should be safely & explicitly converted to a string rather than ignoring the mypy error.

Done, cast it to str.
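An explicit conversion along the lines the reviewer asked for might look like the sketch below. `secret_string_from` is a hypothetical helper, and `response` is assumed to be the dict returned by the client's `get_secret_value` call, which carries either `SecretString` or `SecretBinary`.

```python
from typing import Any, Mapping, Optional


def secret_string_from(response: Mapping[str, Any]) -> Optional[str]:
    """Return SecretString as a str, or None when it is absent."""
    value = response.get("SecretString")
    if value is None:
        return None
    if not isinstance(value, str):
        # Checked at runtime instead of silencing mypy with a type: ignore.
        raise TypeError(f"SecretString has unexpected type {type(value).__name__}")
    return value
```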

jbristow · 2025-07-02T22:34:01Z

nodestream/pipeline/argument_resolvers/aws_secret_resolver.py

+        """
+        try:
+            response = self._client.get_secret_value(SecretId=secret_name)
+            if "SecretString" in response:


All tests still pass after deleting this line. Why is that?

Can you show me how? I tried and saw some tests failing: tests/unit/pipeline/argument_resolvers/test_aws_secret_resolver.py ..FF... [ 10%]

The tests are failing on a syntax error. Remove the indentation of the next line, and all tests pass.

It's resolved :)

jbristow · 2025-07-02T22:35:11Z

nodestream/pipeline/argument_resolvers/aws_secret_resolver.py

+                raise SecretDecodeError(
+                    f"Secret '{secret_name}' is not valid JSON: {e}"
+                ) from e


Add tests ensuring that this raises the expected exception.
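A test of that kind could be sketched as below. `parse_json_secret` is a hypothetical helper that wraps the raise shown in the diff, `SecretDecodeError` is redefined as a stand-in, and `unittest` is used only to keep the example free of third-party dependencies.

```python
import json
import unittest
from typing import Any


class SecretDecodeError(Exception):
    """Stand-in for the PR's exception of the same name."""


def parse_json_secret(secret_name: str, secret_string: str) -> Any:
    """Hypothetical helper mirroring the raise shown in the diff."""
    try:
        return json.loads(secret_string)
    except json.JSONDecodeError as e:
        raise SecretDecodeError(
            f"Secret '{secret_name}' is not valid JSON: {e}"
        ) from e


class ParseJsonSecretTests(unittest.TestCase):
    def test_invalid_json_raises_secret_decode_error(self) -> None:
        with self.assertRaises(SecretDecodeError) as ctx:
            parse_json_secret("NEO4J_PASSWORD", "{not json")
        # The original JSONDecodeError is preserved as the cause.
        self.assertIsInstance(ctx.exception.__cause__, json.JSONDecodeError)
```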

jbristow · 2025-07-02T22:38:02Z

nodestream/pipeline/argument_resolvers/aws_secret_resolver.py

+        except SecretResolverError as e:
+            logger.error(f"Error resolving JSON secret '{secret_name}': {e}")
+            return None


Why are we catching and not rethrowing here? Using a try-catch block as branch control is almost always a confusing and potentially bug-prone choice over if statements.

Please eliminate this nested try block.

Yup, makes sense. Removed the nested try/except and replaced it with a simplified if/else flow that returns None.

jbristow · 2025-07-02T22:40:41Z

nodestream/pipeline/argument_resolvers/aws_secret_resolver.py

+                value, expiry = self._cache[key]
+                if time.time() < expiry:
+                    logger.debug(f"Cache HIT: {key}")
+                    return value
+                logger.debug(f"Cache EXPIRED: {key}")
+                del self._cache[key]
+            logger.debug(f"Cache MISS: {key}")
+            return None


All tests seem to avoid this inner section. I think we should have some unit tests that return a cache HIT and EXPIRED and check that the inner cache state has changed as expected.

Done, added test cases to check the cache functions.
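The HIT and EXPIRED checks requested above can be made deterministic with a fake clock, as in this sketch. `TTLCache` is a simplified stand-in for the cache in the diff; the injectable `clock` parameter and the `set` method are assumptions added so the test does not have to sleep.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class TTLCache:
    """Minimal stand-in for the cache in the diff, with an injectable clock."""

    def __init__(self, ttl: float = 300,
                 clock: Callable[[], float] = time.time) -> None:
        self._ttl = ttl
        self._clock = clock
        self._cache: Dict[str, Tuple[str, float]] = {}

    def set(self, key: str, value: str) -> None:
        self._cache[key] = (value, self._clock() + self._ttl)

    def get(self, key: str) -> Optional[str]:
        if key in self._cache:
            value, expiry = self._cache[key]
            if self._clock() < expiry:
                return value          # HIT
            del self._cache[key]      # EXPIRED: evict the stale entry
        return None                   # MISS


def check_hit_and_expiry() -> None:
    now = [1000.0]                    # mutable fake clock
    cache = TTLCache(ttl=300, clock=lambda: now[0])
    cache.set("k", "v")
    assert cache.get("k") == "v"      # HIT leaves the entry in place
    assert "k" in cache._cache
    now[0] += 301                     # jump past the TTL
    assert cache.get("k") is None     # EXPIRED
    assert "k" not in cache._cache    # and the entry was evicted
```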


codecov · 2025-07-28T12:45:08Z

Codecov Report

❌ Patch coverage is 91.12150% with 19 lines in your changes missing coverage. Please review.
✅ Project coverage is 98.02%. Comparing base (c6c4652) to head (439db15).
⚠️ Report is 2 commits behind head on main.

Files with missing lines	Patch %	Lines
...pipeline/argument_resolvers/aws_secret_resolver.py	88.81%	18 Missing ⚠️
...am/pipeline/value_providers/uuid_value_provider.py	98.03%	1 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #425      +/-   ##
==========================================
- Coverage   98.26%   98.02%   -0.24%     
==========================================
  Files         152      154       +2     
  Lines        6171     6385     +214     
==========================================
+ Hits         6064     6259     +195     
- Misses        107      126      +19

Flag	Coverage Δ
3.10-macos-latest	`97.99% <91.12%> (-0.26%)`	⬇️
3.10-ubuntu-latest	`97.99% <91.12%> (-0.24%)`	⬇️
3.10-windows-latest	`97.99% <91.12%> (-0.24%)`	⬇️
3.11-macos-latest	`98.01% <91.12%> (-0.23%)`	⬇️
3.11-ubuntu-latest	`97.99% <91.12%> (-0.24%)`	⬇️
3.11-windows-latest	`97.99% <91.12%> (-0.24%)`	⬇️
3.12-macos-latest	`98.01% <91.12%> (-0.23%)`	⬇️
3.12-ubuntu-latest	`97.99% <91.12%> (-0.24%)`	⬇️
3.12-windows-latest	`97.99% <91.12%> (-0.24%)`	⬇️
3.13-macos-latest	`98.01% <91.12%> (-0.24%)`	⬇️
3.13-ubuntu-latest	`97.99% <91.12%> (-0.24%)`	⬇️
3.13-windows-latest	`97.99% <91.12%> (-0.24%)`	⬇️

Adding AWS Secret Resolver

a651d73

AdityaSatnalika0202 requested review from zprobst and ccloes as code owners June 26, 2025 04:07

Aditya Satnalika added 3 commits June 27, 2025 14:11

Adding test cases and fixing linting issues

4a83f0d

Adding UUID 4 & 5 value provider + test cases + doc strings for usage

abf03c3

Adding UUID 4 & 5 value provider + test cases + doc strings for usage

a575bfc

jbristow requested changes Jul 2, 2025


Initialization instance attributes, improving code flow and bug fixes

8592103

zprobst mentioned this pull request Jul 7, 2025

Splunk extractor #426

Open

AdityaSatnalika0202 changed the title ~~Adding AWS Secret Resolver~~ New aws-secret argument resolver & uuid value provider Jul 7, 2025

Aditya Satnalika and others added 3 commits July 7, 2025 22:55

Adding lazy loading, the cache are only initialized after they are fi…

04e3621

…rst needed

Clean of test cases

3d31927

Merge branch 'main' into main

439db15
