ff5ecf3 to
3e921b6
Compare
b879c95 to
457b45b
Compare
8829e66 to
92702af
Compare
|
Needs rebase |
92702af to
1d5fd08
Compare
Done. |
1d5fd08 to
8694c9a
Compare
Just retriggering CI, no code change.
@ktock I ran experiments with several basic images, converting them to the eStargz format and running them in containers with a simple 'echo "hello"' command. The tests relied only on stargz's background threads to pull the images to the local machine. Measuring overall memory and disk usage, I observed that using hardlinks reduced both memory consumption and disk usage by 20-30%.
kindly ping @ktock |
efbcc75 to
01b70ae
Compare
1d0858b to
756bb86
Compare
cc @ktock
We're very interested in this PR. Is there any plan for review, @ktock?
Needs rebase
}

// ChunkDigest option allows specifying a chunk digest for the cache
func ChunkDigest(digest string) Option {
9669477 to
349ad77
Compare
|
CI failure in |
170ded5 to
d62a0aa
Compare
e85935b to
f39d6a2
Compare
|
ping @ktock |
3351d93 to
e90577f
Compare
6d9d2bb to
c3fa985
Compare
c3fa985 to
e376da9
Compare
Implement hardlink management for Stargz Snapshotter cache to reduce
storage usage by deduplicating identical content chunks.
Implementation:
- Store canonical files in {root}/hardlinks/ directory
- Use nlink-based garbage collection (remove when nlink == 1)
- Track chunk digests with LRU cache for efficient lookups
- Create hardlinks from canonical files to cache locations
Configuration:
- Add HardlinkRoot option for hardlink storage directory
API:
- Enroll(): Register canonical files by digest
- Get(): Retrieve canonical file paths
- Add(): Create hardlinks to target locations
Signed-off-by: ChengyuZhu6 <hudson@cyzhu.com>
e376da9 to
c0aac13
Compare
|
ping @ktock |
}

if sys, ok := stat.Sys().(*unix.Stat_t); ok {
	if sys.Nlink == 1 {
Hi @ChengyuZhu6,
Really like the idea and see lots of benefits from this. I have a few questions about the implementation:
- When does Nlink become 1? As far as I understand, stargz never deletes its temp dir cache, so Nlink will never become 1? https://github.com/containerd/stargz-snapshotter/blob/main/fs/layer/layer.go#L231
- While evicting on the hardlink_manager side, what if cleanup never goes through, i.e. Nlink > 1? Then these files will never get cleaned up, since we are clearing the LRU entries.
- What's the purpose of the data structures digestToKeys and keyToDigest? As far as I can see, #Get relies completely on chunkDigest to retrieve the link.
- Does hardlink_manager recover the existing state during startup? Say stargz was running fine and had to restart: should it populate digestToFile during initialization for existing hardlinks?
Thanks!



Propose the implementation of a hardlink feature in the caching mechanism to optimize memory usage, improve performance and save disk space.
Fixes: #1953