Commit bdd1951

Document preserveMetadata flag for Save/Restore Cache (#11569)
* Document preserveMetadata flag for Save/Restore Cache
* Update docs/continuous-integration/shared/preserve-metadata.md
1 parent ee2b097 commit bdd1951

4 files changed: +68 -0 lines changed

docs/continuous-integration/shared/preserve-metadata.md

Lines changed: 57 additions & 0 deletions
@@ -0,0 +1,57 @@
### Preserve File Metadata

By default, Harness cache steps don’t preserve inode metadata, so restored files can appear "new" on every build. Age-based cleanup then never treats them as stale, nothing gets pruned, and the cache grows over time. This is also called **cache snowballing**.

To address this, Harness supports a `preserveMetadata` flag, which you can configure by setting the stage variable `PLUGIN_PRESERVE_METADATA` to `true`.

#### How it works

- `preserveMetadata: true`
  Cache archives include inode metadata (timestamps, ownership, permissions).
  Restored files keep their original metadata, so Gradle pruning and commands such as `find -atime` work correctly.

- `preserveMetadata: false` (default)
  Metadata is not preserved. Restores are slightly faster, but tools that rely on timestamps treat all files as new.

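The difference between the two modes can be illustrated outside Harness. The following Python sketch is not the cache plugin's implementation; it only mimics the behavior described above using the standard `tarfile` module. The function names (`save_cache`, `restore_cache`, `stale_files`), the file name `unused-dep.jar`, and the 30-day threshold are all hypothetical, and ownership is left out for brevity.

```python
import os
import tarfile
import tempfile
import time

THIRTY_DAYS = 30 * 24 * 3600  # hypothetical pruning threshold, in seconds


def save_cache(src_dir, archive_path):
    """Pack the contents of src_dir into a tar archive (tar records mtime and mode)."""
    with tarfile.open(archive_path, "w") as tar:
        for name in sorted(os.listdir(src_dir)):
            tar.add(os.path.join(src_dir, name), arcname=name)


def restore_cache(archive_path, dest_dir, preserve_metadata):
    """Unpack regular files; reapply the recorded mtime and mode only if asked to."""
    with tarfile.open(archive_path) as tar:
        for member in tar.getmembers():
            if not member.isfile():
                continue
            target = os.path.join(dest_dir, member.name)
            os.makedirs(os.path.dirname(target), exist_ok=True)
            with tar.extractfile(member) as src, open(target, "wb") as dst:
                dst.write(src.read())
            if preserve_metadata:
                os.utime(target, (member.mtime, member.mtime))
                os.chmod(target, member.mode)


def stale_files(directory, max_age_seconds):
    """List files an age-based cleanup would prune (mtime older than the cutoff)."""
    cutoff = time.time() - max_age_seconds
    stale = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if os.path.isfile(path) and os.path.getmtime(path) < cutoff:
            stale.append(name)
    return stale


def demo():
    """Restore the same archive both ways and report what a cleanup would prune."""
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "cache")
        os.mkdir(src)
        old_jar = os.path.join(src, "unused-dep.jar")
        with open(old_jar, "w") as f:
            f.write("stale")
        sixty_days_ago = time.time() - 2 * THIRTY_DAYS
        os.utime(old_jar, (sixty_days_ago, sixty_days_ago))

        archive = os.path.join(tmp, "cache.tar")
        save_cache(src, archive)

        results = {}
        for preserve in (True, False):
            dest = os.path.join(tmp, f"restore-{preserve}")
            os.mkdir(dest)
            restore_cache(archive, dest, preserve)
            results[preserve] = stale_files(dest, THIRTY_DAYS)
        return results
```

In this sketch, `demo()` reports `unused-dep.jar` as prunable only for the restore that kept its metadata. The restore that discarded metadata leaves the file looking freshly created, so an age-based cleanup would keep it indefinitely.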
#### YAML examples

Here’s an example of saving and restoring a Gradle cache with metadata preserved:

```yaml
steps:
  - step:
      type: SaveCache
      name: save-gradle-cache
      spec:
        key: gradle-cache
        paths:
          - ~/.gradle
        preserveMetadata: true
  - step:
      type: RestoreCache
      name: restore-gradle-cache
      spec:
        key: gradle-cache
        preserveMetadata: true
```

#### Backward compatibility

* The flag is optional and defaults to `false`.
* Pipelines without this setting continue to work as before.
* Old cache archives can still be restored even if `preserveMetadata` is enabled.

#### Benefits

* Prevents **cache snowballing** by keeping Gradle cache size stable across builds.
* Enables accurate pruning of unused dependencies in Gradle and similar tools.
* Supports other tools that rely on inode metadata (for example, `find` and cleanup scripts).

#### Troubleshooting

If your Gradle cache grows unexpectedly or pruning doesn’t work:

* Enable `preserveMetadata: true` in both **Save Cache** and **Restore Cache** steps.
* Make sure the `key` values match across your Save/Restore steps.
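One way to check whether restored files kept their timestamps is to measure how many of them look freshly modified right after a restore. The sketch below is a generic diagnostic and not part of Harness; the function name `fraction_recently_modified` and the 10-minute window are assumptions chosen for illustration.

```python
import os
import time


def fraction_recently_modified(root, window_seconds=600):
    """Return the share of files under root modified within the last window_seconds."""
    cutoff = time.time() - window_seconds
    total = 0
    recent = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                mtime = os.lstat(path).st_mtime
            except OSError:
                continue  # file vanished or is unreadable; skip it
            total += 1
            if mtime >= cutoff:
                recent += 1
    return recent / total if total else 0.0
```

Run against the restored cache directory (for example, the Gradle user home) shortly after the restore step, a value close to 1.0 suggests timestamps were reset, while a low value suggests the original metadata survived. A cache populated for the first time also reports a value near 1.0, so the check is only meaningful from the second build onward.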

docs/continuous-integration/use-ci/caching-ci-data/save-cache-azure.md

Lines changed: 4 additions & 0 deletions
@@ -4,6 +4,8 @@ description: Caching improves build times and enables you to share data across s
 sidebar_position: 41
 ---

+import PreserveMetadata from '/docs/continuous-integration/shared/preserve-metadata.md';
+
 You can use the [Cache plugin](https://github.com/drone-plugins/drone-meltwater-cache) in your CI pipelines to save and retrieve cached data from Azure storage.

 :::warning
@@ -118,6 +120,8 @@ When using the `cache` plugin to save an Azure cache, the `cache_key` is the Azu
 <+pipeline.variables.AZURE_CONTAINER>/<+pipeline.identifier>/{{ .Commit.Branch }}-{{ checksum "<+pipeline.variables.BUILD_PATH>/build.gradle" }}
 ```

+<PreserveMetadata/>
+
 ### Set shared paths for cache locations outside the stage workspace

 In a Harness pipeline, all steps in a given stage share the same [workspace](/docs/continuous-integration/use-ci/set-up-build-infrastructure/ci-stage-settings#workspace), which is `/harness`.

docs/continuous-integration/use-ci/caching-ci-data/save-cache-in-gcs.md

Lines changed: 4 additions & 0 deletions
@@ -8,6 +8,8 @@ helpdocs_is_private: false
 helpdocs_is_published: true
 ---

+import PreserveMetadata from '/docs/continuous-integration/shared/preserve-metadata.md';
+
 Modern continuous integration systems execute pipelines inside ephemeral environments that are provisioned solely for pipeline execution and are not reused from prior pipeline runs. As builds often require downloading and installing many library and software dependencies, caching these dependencies for quick retrieval at runtime can save a significant amount of time.

 In addition to loading dependencies faster, you can also use caching to share data across stages in your Harness CI pipelines. You need to use caching to share data across stages because each stage in a Harness CI pipeline has its own build infrastructure.
@@ -198,6 +200,8 @@ Set the timeout limit for the step. Once the timeout limit is reached, the step
 * [Step Skip Condition settings](/docs/platform/pipelines/step-skip-condition-settings.md)
 * [Step Failure Strategy settings](/docs/platform/pipelines/failure-handling/define-a-failure-strategy-on-stages-and-steps)

+<PreserveMetadata/>
+
 ### Set shared paths for cache locations outside the stage workspace

 Steps in the same stage share the same [workspace](/docs/continuous-integration/use-ci/set-up-build-infrastructure/ci-stage-settings#workspace), which is `/harness`. If your steps need to use data in locations outside the stage workspace, you must specify these as [shared paths](/docs/continuous-integration/use-ci/caching-ci-data/share-ci-data-across-steps-and-stages#share-data-between-steps-in-a-stage). This is required if you want to cache directories outside `/harness`. For example:

docs/continuous-integration/use-ci/caching-ci-data/saving-cache.md

Lines changed: 3 additions & 0 deletions
@@ -11,6 +11,7 @@ helpdocs_is_published: true

 import Tabs from '@theme/Tabs';
 import TabItem from '@theme/TabItem';
+import PreserveMetadata from '/docs/continuous-integration/shared/preserve-metadata.md';

 You can use caching to share data cross stages or run pipelines faster by reusing the expensive fetch operation data from previous builds.

@@ -295,6 +296,8 @@ Set the timeout limit for the step. Once the timeout limit is reached, the step
 * [Step Skip Condition settings](/docs/platform/pipelines/step-skip-condition-settings.md)
 * [Step Failure Strategy settings](/docs/platform/pipelines/failure-handling/define-a-failure-strategy-on-stages-and-steps)

+<PreserveMetadata/>
+
 ### Set shared paths for cache locations outside the stage workspace

 Steps in the same stage share the same [workspace](/docs/continuous-integration/use-ci/set-up-build-infrastructure/ci-stage-settings#workspace), which is `/harness`. If your steps need to use data in locations outside the stage workspace, you must specify these as [shared paths](/docs/continuous-integration/use-ci/caching-ci-data/share-ci-data-across-steps-and-stages#share-data-between-steps-in-a-stage). This is required if you want to cache directories outside `/harness`. For example:
