Commit 02fcccd

Merge pull request #132 from developmentseed/describe-s3-creds-caching
Describe temp s3 creds caching
2 parents 973080f + 710931c commit 02fcccd

1 file changed (+32, -9 lines)

README.md

Lines changed: 32 additions & 9 deletions
@@ -44,18 +44,41 @@ The behavior of the application is controlled by the S3 authentication settings
 
 ### Direct from S3
 
-When running in an AWS context (e.g. Lambda), you should configure the application to access the data directly from `S3`.
-You can do this in two ways:
+When running in an AWS context (e.g., Lambda), you should configure the
+application to access the data directly from `S3`. You can do this in two ways:
 
-- Configure an AWS IAM role for your runtime environment that has read access to the NASA buckets so that `rasterio/GDAL` can find the AWS credentials when reading data
-- Set the `EARTHDATA_USERNAME` and `EARTHDATA_PASSWORD` environment variables so that the `earthaccess` package can issue temporary AWS credentials
+- **Option 1:** Configure an AWS IAM role for your runtime environment that has
+  read access to the NASA buckets so that `rasterio/GDAL` can find the AWS
+  credentials when reading data.
+- **Option 2:** Set the `EARTHDATA_USERNAME` and `EARTHDATA_PASSWORD`
+  environment variables so that temporary AWS credentials can be retrieved for
+  reading from the relevant NASA buckets.
 
-> [!NOTE]
-> Direct S3 access configuration will only work if the application is running in the same AWS region as the data are stored!
+> [!IMPORTANT]
+>
+> Direct S3 access configuration will only work if the application is running in
+> the same AWS region as the data are stored!
 
-> [!WARNING]
-> At the moment, setting Earthdata credentials as environment variables does not
-> work for the rasterio backend.
+> [!NOTE]
+>
+> To avoid placing heavy load on the endpoints that issue temporary AWS (S3)
+> credentials, and to improve Lambda performance, such credentials are fetched
+> only when necessary, and are held in a cache (keyed by the endpoint URL), and
+> are automatically refreshed 10 minutes prior to their expiration (as a
+> freshness leeway).
+>
+> However, this caching occurs on a per-Lambda basis. That is, each Lambda
+> function maintains its own cache, so the load on the credentials endpoints is
+> still greater than necessary. Each Lambda must repopulate its cache upon cold
+> start (but the cache is maintained across warm starts).
+>
+> Therefore, we plan to explore the use of a distributed cache to be shared
+> across all Lambda instances to not only absolutely minimize our load on the
+> endpoints, but also to improve overall Lambda performance by avoiding having
+> every Lambda fetch credentials for the same endpoints. Further, a distributed
+> cache would be unaffected by Lambda cold starts, so even a cold-starting
+> Lambda would avoid the need to fetch credentials that are already in the
+> cache.
 
 ### External access

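The caching behavior described in the added `[!NOTE]` (one entry per endpoint URL, refreshed 10 minutes before expiration) can be illustrated with a minimal Python sketch. This is not the application's actual implementation: `Credentials`, `CredentialsCache`, and the injected `fetch` and `now` callables are hypothetical names introduced here, and the real code may be structured quite differently.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Callable, Dict

# Refresh this long before expiration (the "freshness leeway" in the note).
LEEWAY = timedelta(minutes=10)


@dataclass(frozen=True)
class Credentials:
    """Hypothetical container for temporary AWS (S3) credentials."""

    access_key_id: str
    secret_access_key: str
    session_token: str
    expiration: datetime


class CredentialsCache:
    """In-process cache of temporary credentials, keyed by endpoint URL."""

    def __init__(
        self,
        fetch: Callable[[str], Credentials],
        now: Callable[[], datetime] = lambda: datetime.now(timezone.utc),
    ) -> None:
        self._fetch = fetch  # assumed: calls the credentials endpoint
        self._now = now      # injectable clock, which simplifies testing
        self._entries: Dict[str, Credentials] = {}

    def get(self, endpoint: str) -> Credentials:
        """Return cached credentials, fetching only when missing or near expiry."""
        cached = self._entries.get(endpoint)
        if cached is None or self._now() >= cached.expiration - LEEWAY:
            cached = self._fetch(endpoint)
            self._entries[endpoint] = cached
        return cached
```

If an instance like this were created at module import time, it would survive warm invocations of the same Lambda execution environment and start empty after a cold start, which matches the per-Lambda limitation the note describes.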