- add links to specs.ipfs.tech/unixfs throughout docs
- replace inline protobuf with link to spec
- update Go implementation link from kubo to boxo
- add HAMT spec link to glossary
- clarify raw leaves are recommended, not legacy
The UnixFS spec is now published at specs.ipfs.tech/unixfs,
replacing the old GitHub location. This commit updates all
references and reduces duplication by linking to the authoritative
specification instead of maintaining inline technical details.
-When you add a _file_ to IPFS, it might be too big to fit in a single block, so it needs metadata to link all its blocks together. UnixFS is a [protocol-buffers](https://developers.google.com/protocol-buffers/)-based format for describing files, directories, and symlinks in IPFS. This data format is used to represent files and all their links and metadata in IPFS. UnixFS creates a block (or a tree of blocks) of linked objects.
+When you add a _file_ to IPFS, it might be too big to fit in a single block, so it needs metadata to link all its blocks together. UnixFS is a [protocol-buffers](https://developers.google.com/protocol-buffers/)-based format for describing files, directories, and symlinks in IPFS. This data format is used to represent files and all their links and metadata in IPFS. UnixFS creates a block (or a tree of blocks) of linked objects. See the [UnixFS specification](https://specs.ipfs.tech/unixfs/) for the complete technical details.
 
-UnixFS currently has [Javascript](https://github.com/ipfs/helia/tree/main/packages/unixfs) and [Go](https://github.com/ipfs/kubo/tree/b3faaad1310bcc32dc3dd24e1919e9edf51edba8/unixfs) implementations. These implementations have modules written in to run different functions:
+UnixFS currently has [Javascript](https://github.com/ipfs/helia/tree/main/packages/unixfs) and [Go](https://github.com/ipfs/boxo/tree/v0.34.0/ipld/unixfs) implementations. These implementations have modules written in to run different functions:
 
 - **Data Formats**: manage the serialization/deserialization of UnixFS objects to protocol buffers
 
@@ -229,50 +229,11 @@ UnixFS currently has [Javascript](https://github.com/ipfs/helia/tree/main/packag
 
 ### Data Formats
 
-On UnixFS-v1 the data format is represented by this protobuf:
+UnixFS uses protocol buffers to define how files and directories are represented in IPFS. The data format includes fields for file types, sizes, permissions, and timestamps.
 
-```
-message Data {
-  enum DataType {
-    Raw = 0;
-    Directory = 1;
-    File = 2;
-    Metadata = 3;
-    Symlink = 4;
-    HAMTShard = 5;
-  }
-
-  required DataType Type = 1;
-  optional bytes Data = 2;
-  optional uint64 filesize = 3;
-  repeated uint64 blocksizes = 4;
-  optional uint64 hashType = 5;
-  optional uint64 fanout = 6;
-  optional uint32 mode = 7;
-  optional UnixTime mtime = 8;
-}
-
-message Metadata {
-  optional string MimeType = 1;
-}
-
-message UnixTime {
-  required int64 Seconds = 1;
-  optional fixed32 FractionalNanoseconds = 2;
-}
-```
-
-This `Data` object is used for all non-leaf nodes in UnixFS:
-
-- For files that are comprised of more than a single block, the `Type` field will be set to `File`, the `filesize` field will be set to the total number of bytes in the files, and `blocksizes` will contain a list of the filesizes of each child node.
-
-- For files comprised of a single block, the `Type` field will be set to `File`, `filesize` will be set to the total number of bytes in the file, and file data will be stored in the `Data` field.
-
-UnixFS also supports two optional metadata format fields:
-
-- `mode` - used for persisting the file permissions in [numeric notation](https://en.wikipedia.org/wiki/File_system_permissions#Numeric_notation). If unspecified, this field defaults to `0755` for directories/HAMT shards and `0644` for all the other types where applicable.
-
-- `mtime` - is a two-element structure (`Seconds`, `FractionalNanoseconds`) representing the modification time in seconds relative to the Unix epoch `1970-01-01T00:00:00Z`.
+::: tip Want to see the complete specification?
+For the full protobuf definitions, field descriptions, and technical details about how UnixFS nodes are structured, visit the [official UnixFS specification](https://specs.ipfs.tech/unixfs/#dag-pb-node).
+:::
 
 ### Importer
 
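Reviewer note: the lines removed in this hunk were the only place in these docs that spelled out the `mode` defaults and the `filesize`/`blocksizes` relationship. As a rough illustration of those rules, here is a minimal Python sketch. It is not the boxo or Helia API; the class and function names (`Data`, `effective_mode`, `multi_block_file`, `single_block_file`) are hypothetical and only mirror the field names of the removed protobuf.

```python
from dataclasses import dataclass, field
from enum import IntEnum
from typing import List, Optional


class DataType(IntEnum):
    # Values mirror the DataType enum in the removed protobuf.
    Raw = 0
    Directory = 1
    File = 2
    Metadata = 3
    Symlink = 4
    HAMTShard = 5


@dataclass
class UnixTime:
    seconds: int                     # relative to 1970-01-01T00:00:00Z
    fractional_nanoseconds: int = 0  # optional sub-second part


@dataclass
class Data:
    # Hypothetical in-memory stand-in for the protobuf `Data` message.
    type: DataType
    data: bytes = b""
    filesize: Optional[int] = None
    blocksizes: List[int] = field(default_factory=list)
    mode: Optional[int] = None
    mtime: Optional[UnixTime] = None


def effective_mode(node: Data) -> int:
    """Return the stored mode, or the documented default when it is unset."""
    if node.mode is not None:
        return node.mode
    if node.type in (DataType.Directory, DataType.HAMTShard):
        return 0o755
    return 0o644  # default for the other types, where a mode applies


def multi_block_file(child_sizes: List[int]) -> Data:
    """Root node of a file split across several blocks."""
    return Data(
        type=DataType.File,
        filesize=sum(child_sizes),     # total number of bytes in the file
        blocksizes=list(child_sizes),  # one entry per child node
    )


def single_block_file(content: bytes) -> Data:
    """A file small enough to store its bytes directly in the Data field."""
    return Data(type=DataType.File, data=content, filesize=len(content))
```

For example, `effective_mode(Data(DataType.Directory))` falls back to `0o755`, while a file node with no explicit `mode` falls back to `0o644`.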
@@ -286,7 +247,7 @@ The leaf format takes two format options, UnixFS leaves and raw leaves:
 
 - The UnixFS leaves format adds a data wrapper on newly added objects to produce UnixFS leaves with additional data sizes. This wrapper is used to determine whether newly added objects are files or directories. This format is the default for CIDv0.
 
-- The raw leaves format on IPFS where nodes output from chunking will be raw data from the file with a CID codec of 'raw'. This is mainly configured for backward compatibility with formats that used a UnixFS Data object. This format is the default for CIDv1 created with `ipfs add --cid-version 1`, soon to become the global default.
+- The raw leaves format on IPFS where nodes output from chunking will be raw data from the file with a CID codec of 'raw' (0x55). This format provides canonical CIDs for single-block files and is recommended over dag-pb wrapped blocks. This format is the default for CIDv1 created with `ipfs add --cid-version 1`.
 
 The chunking strategy is used to determine the size options available during the chunking process. The strategy currently has two different options, 'fixed size' and 'rabin'.
 
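Reviewer note: to make the "raw leaves" wording in this hunk concrete, the sketch below splits a byte string with a 'fixed size' strategy and derives a CIDv1 for each chunk using the 'raw' codec (0x55). It is a minimal illustration, not how Kubo or boxo implement the importer. The function names are hypothetical, the 262144-byte default chunk size is an assumption, and sha2-256 is assumed as the hash function.

```python
import base64
import hashlib
from typing import Iterator, List

RAW_CODEC = 0x55   # 'raw' multicodec, as named in the diff
SHA2_256 = 0x12    # multihash code for sha2-256 (assumed hash function)
DIGEST_LEN = 0x20  # sha2-256 digests are 32 bytes long


def fixed_size_chunks(data: bytes, chunk_size: int = 262144) -> Iterator[bytes]:
    """Yield consecutive chunks of at most chunk_size bytes."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for offset in range(0, len(data), chunk_size):
        yield data[offset:offset + chunk_size]


def raw_leaf_cid(block: bytes) -> str:
    """Return the base32 CIDv1 of a block addressed with the 'raw' codec."""
    digest = hashlib.sha256(block).digest()
    # Layout: <version=1><codec><hash code><digest length><digest>.
    # Every prefix value is below 0x80, so each varint is a single byte.
    cid_bytes = bytes([0x01, RAW_CODEC, SHA2_256, DIGEST_LEN]) + digest
    encoded = base64.b32encode(cid_bytes).decode("ascii").lower().rstrip("=")
    return "b" + encoded  # 'b' is the multibase prefix for lowercase base32


def raw_leaf_cids(data: bytes, chunk_size: int = 262144) -> List[str]:
    """Chunk data with the fixed-size strategy and address every chunk."""
    return [raw_leaf_cid(chunk) for chunk in fixed_size_chunks(data, chunk_size)]
```

Because the CID depends only on the chunk bytes, a file that fits in one chunk gets the same CID no matter how it was added, which is the "canonical CIDs for single-block files" property the new wording refers to.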
@@ -315,5 +276,5 @@ You can find additional resources to familiarize with these file systems at:
 - [Understanding how the InterPlanetary File System deals with Files](https://github.com/ipfs/camp/tree/master/CORE_AND_ELECTIVE_COURSES/CORE_COURSE_A), from IPFS Camp 2019
File: docs/concepts/glossary.md (2 additions & 2 deletions)
@@ -230,7 +230,7 @@ Graphsync is a legacy content replication protocol, similar to [Bitswap](#bitswa
 
 ### HAMT-sharding
 
-The sharding technique used for [sharding](#sharding) big UnixFS directories. It leverages properties of hash array mapped tries (HAMT). [More about HAMT](https://en.wikipedia.org/wiki/Hash_array_mapped_trie).
+The sharding technique used for [sharding](#sharding) big UnixFS directories. It leverages properties of hash array mapped tries (HAMT). [UnixFS HAMT specification](https://specs.ipfs.tech/unixfs/#dag-pb-hamtdirectory) | [More about HAMT](https://en.wikipedia.org/wiki/Hash_array_mapped_trie).
 
 ### Hash
 
@@ -512,7 +512,7 @@ In [IPLD](#ipld), the act of walking across the [Data Model](#data-model). [More
 
 ### UnixFS
 
-The Unix File System (UnixFS) is the data format used to represent files and all their links and metadata in IPFS. It is loosely based on how files work in Unix. Adding a file to IPFS creates a block, or a _tree_ of blocks, in the UnixFS format and protects it from being garbage-collected. [More about UnixFS](file-systems.md#unix-file-system-unixfs)
+The Unix File System (UnixFS) is the data format used to represent files and all their links and metadata in IPFS. It is loosely based on how files work in Unix. Adding a file to IPFS creates a block, or a _tree_ of blocks, in the UnixFS format and protects it from being garbage-collected. [UnixFS specification](https://specs.ipfs.tech/unixfs/) | [More about UnixFS](file-systems.md#unix-file-system-unixfs)
File: docs/concepts/lifecycle.md (1 addition & 1 deletion)
@@ -14,7 +14,7 @@ description: Learn about the lifecycle of data in IPFS.
 
 The first stage in the lifecycle of data in IPFS is to address it by CID. This is a local operation that takes arbitrary data and encodes it so it can be addressed by a CID. This is also known as _merkleizing_ the data, because the input data is transformed into a [Merkle DAG](./merkle-dag.md).
 
-The exact process depends on the type of data. For files and directories, this is done by constructing a [UnixFS](./file-systems.md#unix-file-system-unixfs)[Merkle DAG](./merkle-dag.md). For other data types, such as dag-cbor, this is done by encoding the data with [dag-cbor](https://ipld.io/docs/codecs/known/dag-cbor/) which is hashed to produce a CID.
+The exact process depends on the type of data. For files and directories, this is done by constructing a [UnixFS](./file-systems.md#unix-file-system-unixfs)[Merkle DAG](./merkle-dag.md) ([specification](https://specs.ipfs.tech/unixfs/)). For other data types, such as dag-cbor, this is done by encoding the data with [dag-cbor](https://ipld.io/docs/codecs/known/dag-cbor/) which is hashed to produce a CID.
 
 For example, merkleizing a static web application into a UnixFS DAG looks like this, where the whole application is addressed by the CID in the top block (`bafy...jomu`):