Commit 66a95a4

New "serialized-file" command with docs and new test data (#47)

* New command for quick access to information from the header of a serialized file (external references and object list)
* In future we can expose typetree information
* Add a more complete (but tiny) Player build into the test data (with and without typetrees)
* Introduce list of common types
* In a follow up could use it to improve the BuildReport importing (#55)

1 parent c643ac8 commit 66a95a4

26 files changed, +1243 -8 lines changed

AGENTS.md

Lines changed: 8 additions & 2 deletions
@@ -78,6 +78,10 @@ UnityDataTool dump /path/to/file.bundle -o /output/path
 # Extract archive contents
 UnityDataTool archive extract file.bundle -o contents/

+# Quick inspect SerializedFile metadata
+UnityDataTool serialized-file objectlist level0
+UnityDataTool sf externalrefs sharedassets0.assets --format json
+
 # Find reference chains to an object
 UnityDataTool find-refs database.db -n "ObjectName" -t "Texture2D"
 ```
@@ -139,13 +143,15 @@ UnityDataTool (CLI executable)

 **Entry Points**:
 - `UnityDataTool/Program.cs` - CLI using System.CommandLine
-- `UnityDataTool/Commands/` - Command handlers (Analyze.cs, Dump.cs, Archive.cs, FindReferences.cs)
-- `Documentation/` - Command documentation (command-analyze.md, command-dump.md, command-archive.md, command-find-refs.md)
+- `UnityDataTool/SerializedFileCommands.cs` - SerializedFile inspection handlers
+- `UnityDataTool/Archive.cs` - Archive manipulation handlers
+- `Documentation/` - Command documentation (command-analyze.md, command-dump.md, command-archive.md, command-serialized-file.md, command-find-refs.md)

 **Core Libraries**:
 - `UnityFileSystem/UnityFileSystem.cs` - Init(), MountArchive(), OpenSerializedFile()
 - `UnityFileSystem/DllWrapper.cs` - P/Invoke bindings to native library
 - `UnityFileSystem/SerializedFile.cs` - Represents binary data files
+- `UnityFileSystem/TypeIdRegistry.cs` - Built-in TypeId to type name mappings
 - `UnityFileSystem/RandomAccessReader.cs` - TypeTree property navigation

 **Analyzer**:

Analyzer/SQLite/Handlers/PackedAssetsHandler.cs

Lines changed: 6 additions & 1 deletion
@@ -64,7 +64,7 @@ public void Init(SqliteConnection db)
 public void Process(Context ctx, long objectId, RandomAccessReader reader, out string name, out long streamDataSize)
 {
     var packedAssets = PackedAssets.Read(reader);
-
+
     m_InsertPackedAssetsCommand.Transaction = ctx.Transaction;
     m_InsertPackedAssetsCommand.Parameters["@id"].Value = objectId;
     m_InsertPackedAssetsCommand.Parameters["@path"].Value = packedAssets.Path;
@@ -96,6 +96,11 @@ public void Process(Context ctx, long objectId, RandomAccessReader reader, out s
     m_InsertContentsCommand.Transaction = ctx.Transaction;
     m_InsertContentsCommand.Parameters["@packed_assets_id"].Value = objectId;
     m_InsertContentsCommand.Parameters["@object_id"].Value = content.ObjectID;
+
+    // TODO: Ideally we would also populate the type table if the content.Type is
+    // not already in that table, and if we have a string value for it in TypeIdRegistry. That would
+    // make it possible to view object types as strings, for the most common types, when importing a BuildReport
+    // without the associated built content.
     m_InsertContentsCommand.Parameters["@type"].Value = content.Type;
     m_InsertContentsCommand.Parameters["@size"].Value = (long)content.Size;
     m_InsertContentsCommand.Parameters["@offset"].Value = (long)content.Offset;
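
The TODO in this hunk describes a lookup-then-insert step: if a type id is missing from the type table and a built-in name exists for it, add the row. A minimal sketch of that idea follows, written in Python with `sqlite3` rather than the project's C#. The `types (id, name)` table, the `KNOWN_TYPE_NAMES` dictionary (a stand-in for `TypeIdRegistry`) and the `ensure_type_row` helper are all assumptions made for illustration and are not taken from the commit.

```python
import sqlite3

# Hypothetical stand-in for TypeIdRegistry: a few well-known Unity class IDs.
# The real registry contents are not shown in this diff.
KNOWN_TYPE_NAMES = {
    1: "GameObject",
    4: "Transform",
    21: "Material",
    48: "Shader",
    114: "MonoBehaviour",
}


def ensure_type_row(db: sqlite3.Connection, type_id: int) -> bool:
    """Add a (id, name) row for type_id when a name is known and the row is missing.

    Returns True only when a new row was inserted.
    """
    name = KNOWN_TYPE_NAMES.get(type_id)
    if name is None:
        return False  # no string available; the type stays numeric
    exists = db.execute("SELECT 1 FROM types WHERE id = ?", (type_id,)).fetchone()
    if exists:
        return False  # already populated, e.g. by analyzing the built content
    db.execute("INSERT INTO types (id, name) VALUES (?, ?)", (type_id, name))
    return True


db = sqlite3.connect(":memory:")
# Assumed schema, for the sketch only.
db.execute("CREATE TABLE types (id INTEGER PRIMARY KEY, name TEXT)")
db.execute("INSERT INTO types VALUES (1, 'GameObject')")

# 1 is already present, 21 is known and missing, 9999 has no known name.
inserted = [t for t in (1, 21, 9999) if ensure_type_row(db, t)]
rows = db.execute("SELECT id, name FROM types ORDER BY id").fetchall()
print(inserted, rows)
```

Checking for the row before inserting keeps existing names untouched, so a name written during a full analysis would take precedence over the built-in fallback.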

Analyzer/SQLite/Parsers/SerializedFileParser.cs

Lines changed: 1 addition & 1 deletion
@@ -64,7 +64,7 @@ bool ShouldIgnoreFile(string file)
 private static readonly HashSet<string> IgnoredExtensions = new()
 {
     ".txt", ".resS", ".resource", ".json", ".dll", ".pdb", ".exe", ".manifest", ".entities", ".entityheader",
-    ".ini", ".config", ".hash"
+    ".ini", ".config", ".hash", ".md"
 };

 bool ProcessFile(string file, string rootDirectory)

Analyzer/SerializedObjects/BuildReport.cs

Lines changed: 1 addition & 1 deletion
@@ -120,7 +120,7 @@ public static string GetBuildTypeString(int buildType)
 {
     1 => "Player",
     2 => "AssetBundle",
-    3 => "Player, AssetBundle",
+    3 => "ContentDirectory",
     _ => buildType.ToString()
 };
 }
Lines changed: 181 additions & 0 deletions
@@ -0,0 +1,181 @@
+# serialized-file Command
+
+The `serialized-file` command (alias: `sf`) provides utilities for quickly inspecting SerializedFile metadata without performing a full analysis.
+
+## Sub-Commands
+
+| Sub-Command | Description |
+|-------------|-------------|
+| [`externalrefs`](#externalrefs) | List external file references |
+| [`objectlist`](#objectlist) | List all objects in the file |
+
+---
+
+## externalrefs
+
+Lists the external file references (dependencies) in a SerializedFile. This shows which other files the SerializedFile depends on.
+
+### Quick Reference
+
+```
+UnityDataTool serialized-file externalrefs <filename> [options]
+UnityDataTool sf externalrefs <filename> [options]
+```
+
+| Option | Description | Default |
+|--------|-------------|---------|
+| `<filename>` | Path to the SerializedFile | *(required)* |
+| `-f, --format <format>` | Output format: `Text` or `Json` | `Text` |
+
+### Example - Text Output
+
+```bash
+UnityDataTool serialized-file externalrefs level0
+```
+
+**Output:**
+```
+Index: 1, Path: globalgamemanagers.assets
+Index: 2, Path: sharedassets0.assets
+Index: 3, Path: Library/unity default resources
+```
+
+### Example - JSON Output
+
+```bash
+UnityDataTool sf externalrefs sharedassets0.assets --format json
+```
+
+**Output:**
+```json
+[
+  {
+    "index": 1,
+    "path": "globalgamemanagers.assets",
+    "guid": "00000000000000000000000000000000",
+    "type": "NonAssetType"
+  },
+  {
+    "index": 2,
+    "path": "Library/unity default resources",
+    "guid": "0000000000000000e000000000000000",
+    "type": "NonAssetType"
+  }
+]
+```
+
+---
+
+## objectlist
+
+Lists all objects contained in a SerializedFile, showing their IDs, types, offsets, and sizes.
+
+### Quick Reference
+
+```
+UnityDataTool serialized-file objectlist <filename> [options]
+UnityDataTool sf objectlist <filename> [options]
+```
+
+| Option | Description | Default |
+|--------|-------------|---------|
+| `<filename>` | Path to the SerializedFile | *(required)* |
+| `-f, --format <format>` | Output format: `Text` or `Json` | `Text` |
+
+### Example - Text Output
+
+```bash
+UnityDataTool sf objectlist sharedassets0.assets
+```
+
+**Output:**
+```
+Id     Type             Offset     Size
+------------------------------------------------------------------------------------------
+1      PreloadData      83872      49
+2      Material         83936      268
+3      Shader           84208      6964
+4      Cubemap          91184      240
+5      MonoBehaviour    91424      60
+6      MonoBehaviour    91488      72
+```
+
+### Example - JSON Output
+
+```bash
+UnityDataTool serialized-file objectlist level0 --format json
+```
+
+**Output:**
+```json
+[
+  {
+    "id": 1,
+    "typeId": 1,
+    "typeName": "GameObject",
+    "offset": 4864,
+    "size": 132
+  },
+  {
+    "id": 2,
+    "typeId": 4,
+    "typeName": "Transform",
+    "offset": 5008,
+    "size": 104
+  }
+]
+```
+
+---
+
+## Use Cases
+
+### Quick File Inspection
+
+Use `serialized-file` when you need quick information about a SerializedFile without generating a full SQLite database:
+
+```bash
+# Check what objects are in a file
+UnityDataTool sf objectlist sharedassets0.assets
+
+# Check file dependencies
+UnityDataTool sf externalrefs level0
+```
+
+### Scripting and Automation
+
+The JSON output format is ideal for scripts and automated processing:
+
+```bash
+# Extract object count
+UnityDataTool sf objectlist level0 -f json | jq 'length'
+
+# Find specific object types
+UnityDataTool sf objectlist sharedassets0.assets -f json | jq '.[] | select(.typeName == "Material")'
+```
+
+---
+
+## SerializedFile vs Archive
+
+When working with AssetBundles (or a compressed Player build) you need to extract the contents first (with `archive extract`), then run the `serialized-file` command on individual files in the extracted output.
+
+**Example workflow:**
+```bash
+# 1. List contents of an archive
+UnityDataTool archive list scenes.bundle
+
+# 2. Extract the archive
+UnityDataTool archive extract scenes.bundle -o extracted/
+
+# 3. Inspect individual SerializedFiles
+UnityDataTool sf objectlist extracted/CAB-5d40f7cad7c871cf2ad2af19ac542994
+```
+
+---
+
+## Notes
+
+- This command only supports extracting information from the SerializedFile header of individual files. It does not extract detailed type-specific properties. Use `analyze` for full analysis of one or more SerializedFiles.
+- The command uses the same native library (UnityFileSystemApi) as other UnityDataTool commands, ensuring consistent file reading across all Unity versions.
+
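
The "Scripting and Automation" section of the new document pipes the `objectlist` JSON through `jq`. The same post-processing can be sketched in Python, assuming the output is a JSON array whose entries carry the `typeName` and `size` fields shown in the documented example. The sample below is copied from that example, and `summarize_objects` is a hypothetical helper name, not part of the tool.

```python
import json
from collections import Counter

# Sample copied from the documented `objectlist --format json` example output.
SAMPLE_OBJECTLIST_JSON = """
[
  {"id": 1, "typeId": 1, "typeName": "GameObject", "offset": 4864, "size": 132},
  {"id": 2, "typeId": 4, "typeName": "Transform", "offset": 5008, "size": 104}
]
"""


def summarize_objects(json_text):
    """Return (object count, total serialized size per type name)."""
    objects = json.loads(json_text)
    size_by_type = Counter()
    for obj in objects:
        size_by_type[obj["typeName"]] += obj["size"]
    return len(objects), dict(size_by_type)


count, size_by_type = summarize_objects(SAMPLE_OBJECTLIST_JSON)
print(count, size_by_type)
```

In a real script the JSON text would presumably come from the tool's standard output, for example via `subprocess.run(["UnityDataTool", "sf", "objectlist", path, "-f", "json"], capture_output=True, text=True).stdout`, assuming the executable is on the `PATH`.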

Documentation/unity-content-format.md

Lines changed: 3 additions & 3 deletions
@@ -108,8 +108,8 @@ However in cases where you want to understand what contributes to the size your

 Often the source of content can be easily inferred, based on your own knowledge of your project, and the names of objects. For example the name of a Shader should be unique, and typically has a filename that closely matches the Shader name.

-You can also use the [BuildReport](https://docs.unity3d.com/Documentation/ScriptReference/Build.Reporting.BuildReport.html) for Player and AssetBundle builds (excluding Addressables). The [Build Report Inspector](https://github.com/Unity-Technologies/BuildReportInspector) is a tool to aid in analyzing that data.
+You can include a Unity BuildReport file when running `UnityDataTools analyze`. This will import the PackedAsset information, tracking the source asset information for each object in the build output. See [Build Reports](./build-reports.md) for more information, including alternative ways to view the build report.

-For AssetBundles built by [BuildPipeline.BuildAssetBundles()](https://docs.unity3d.com/ScriptReference/BuildPipeline.BuildAssetBundles.html), there is also source information available in the .manifest files for each bundle.
+`UnityDataTools analyze` can also import Addressables build layout files, which include source asset information. See [Addressable Build Reports](./addressables-build-reports.md).

-Addressables builds do not produce a BuildReport or .manifest files, but it offers similar build information in the user interface.
+For AssetBundles built by [BuildPipeline.BuildAssetBundles()](https://docs.unity3d.com/ScriptReference/BuildPipeline.BuildAssetBundles.html) Unity creates a .manifest file for each AssetBundle that has source information. This is a text-base format.

Documentation/unitydatatool.md

Lines changed: 5 additions & 0 deletions
@@ -9,6 +9,7 @@ A command-line tool for analyzing and inspecting Unity build output—AssetBundl
 | [`analyze`](command-analyze.md) | Extract data from Unity files into a SQLite database |
 | [`dump`](command-dump.md) | Convert SerializedFiles to human-readable text |
 | [`archive`](command-archive.md) | List or extract contents of Unity Archives |
+| [`serialized-file`](command-serialized-file.md) | Quick inspection of SerializedFile metadata |
 | [`find-refs`](command-find-refs.md) | Trace reference chains to objects *(experimental)* |

 ---
@@ -28,6 +29,10 @@ UnityDataTool dump /path/to/file.bundle -o /output/path
 # Extract archive contents
 UnityDataTool archive extract file.bundle -o contents/

+# Quick inspect SerializedFile
+UnityDataTool serialized-file objectlist level0
+UnityDataTool sf externalrefs sharedassets0.assets --format json
+
 # Find reference chains to an object
 UnityDataTool find-refs database.db -n "ObjectName" -t "Texture2D"
 ```
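
The `sf externalrefs ... --format json` invocation added in this hunk emits a JSON array of references. As a rough illustration of consuming it, the Python sketch below maps each reference index to its path, assuming the `index`, `path`, `guid` and `type` fields shown in the command documentation's example output. The sample is copied from that example, and `external_paths` is a hypothetical helper name.

```python
import json

# Sample copied from the documented `externalrefs --format json` example output.
SAMPLE_EXTERNALREFS_JSON = """
[
  {
    "index": 1,
    "path": "globalgamemanagers.assets",
    "guid": "00000000000000000000000000000000",
    "type": "NonAssetType"
  },
  {
    "index": 2,
    "path": "Library/unity default resources",
    "guid": "0000000000000000e000000000000000",
    "type": "NonAssetType"
  }
]
"""


def external_paths(json_text):
    """Map each external reference index to the path of the file it points at."""
    refs = json.loads(json_text)
    return {ref["index"]: ref["path"] for ref in refs}


paths = external_paths(SAMPLE_EXTERNALREFS_JSON)
for index, path in sorted(paths.items()):
    print(index, path)
```

Keying the result by `index` keeps the lookup aligned with the numbering the tool itself prints in its text output.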
Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
+This is a partial copy of the same build as PlayerWithTypeTrees, but with typetrees turned off.
+
+Without typetrees the information that can be retrieved is quite limited.
1.21 KB
Binary file not shown.
1.25 KB
Binary file not shown.
