Commit d81b014

woodruffw authored and teresajohnson committed
[NFC][Bitstream] Improve the dumpability of bitstream/bitcode headers
The `LLVMBitCodes.h` header contains various enums that are updated whenever LLVM's bitcode fundamentally changes. It would be nice to track these changes in a semi-automated way, so that external tools that attempt to parse LLVM's bitstream and bitcode can remain in sync.

Before this change, `LLVMBitCodes.h` had a single dependency -- it needed the `FIRST_APPLICATION_BLOCKID` enum value from `BitCodes.h`. `BitCodes.h`, in turn, had a whole tree of include dependencies that boiled down to `llvm-config.h`, meaning that it was impossible to dump the AST of either file without having a partial or full LLVM build tree already present.

To eliminate that requirement, this patch introduces a new leaf-only header, `BitCodeEnums.h`, which includes the "core" enums originally in `BitCodes.h`. `LLVMBitCodes.h` and `BitCodes.h` both include this new header in turn, preserving the current header relationships while allowing `LLVMBitCodes.h` to be dumped fully independently with a command like this (run from the repository root):

```
clang -fsyntax-only -x c++ -Illvm/include \
  -Xclang -ast-dump=json \
  -Xclang -ast-dump-filter -Xclang llvm::bitc::BlockIDs \
  llvm/include/llvm/Bitcode/LLVMBitCodes.h
```

I recognize that this is a pretty unusual change and perhaps not a guarantee that the LLVM authors would like to make in the general case (i.e., that individual files within LLVM can have their AST dumped with minimal dependencies). However, I believe the criticality/limited scope of the file(s) in this patch warrants an exception.

Please let me know if there's any other information I can provide, or anything else I can do to improve this patch!

Reviewed By: tejohnson

Differential Revision: https://reviews.llvm.org/D108438
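As a complement to the AST-dump approach described above, an external parser could also pin the values it relies on at compile time. The following is a minimal sketch, not part of this patch: the `mytool` namespace, its constant names, and the `MYTOOL_HAVE_LLVM_ENUMS` macro are hypothetical, and the cross-check only runs when the new leaf header happens to be on the include path.

```cpp
// Hypothetical downstream-tool constants; `mytool` and every name inside it
// are illustrative and not part of LLVM.
#if defined(__has_include)
#if __has_include("llvm/Bitstream/BitCodeEnums.h")
#include "llvm/Bitstream/BitCodeEnums.h"
#define MYTOOL_HAVE_LLVM_ENUMS 1
#endif
#endif

namespace mytool {
// Values copied by hand from the enums moved into BitCodeEnums.h.
constexpr unsigned kBlockInfoBlockId = 0;
constexpr unsigned kFirstApplicationBlockId = 8;
constexpr unsigned kFirstApplicationAbbrev = 4;

// Block IDs below FIRST_APPLICATION_BLOCKID are standard or reserved.
constexpr bool isApplicationBlock(unsigned BlockId) {
  return BlockId >= kFirstApplicationBlockId;
}
} // namespace mytool

#ifdef MYTOOL_HAVE_LLVM_ENUMS
// When the leaf header is available, fail the build if the copies drift.
static_assert(mytool::kBlockInfoBlockId ==
                  static_cast<unsigned>(llvm::bitc::BLOCKINFO_BLOCK_ID),
              "BLOCKINFO_BLOCK_ID changed");
static_assert(mytool::kFirstApplicationBlockId ==
                  static_cast<unsigned>(llvm::bitc::FIRST_APPLICATION_BLOCKID),
              "FIRST_APPLICATION_BLOCKID changed");
static_assert(mytool::kFirstApplicationAbbrev ==
                  static_cast<unsigned>(llvm::bitc::FIRST_APPLICATION_ABBREV),
              "FIRST_APPLICATION_ABBREV changed");
#endif
```

Because the header is a leaf, the cross-check needs only `-Illvm/include` from a source checkout rather than a configured build tree, which is the property this patch is after.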
1 parent 96e9b6c commit d81b014

3 files changed, 96 additions and 66 deletions

llvm/include/llvm/Bitcode/LLVMBitCodes.h

Lines changed: 4 additions & 1 deletion
@@ -17,7 +17,10 @@
 #ifndef LLVM_BITCODE_LLVMBITCODES_H
 #define LLVM_BITCODE_LLVMBITCODES_H
 
-#include "llvm/Bitstream/BitCodes.h"
+// This is the only file included, and it, in turn, is a leaf header.
+// This allows external tools to dump the AST of this file and analyze it for
+// changes without needing to fully or partially build LLVM itself.
+#include "llvm/Bitstream/BitCodeEnums.h"
 
 namespace llvm {
 namespace bitc {
llvm/include/llvm/Bitstream/BitCodeEnums.h

Lines changed: 90 additions & 0 deletions
@@ -0,0 +1,90 @@
+//===- BitCodeEnums.h - Core enums for the bitstream format -----*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+//
+// This header defines "core" bitstream enum values.
+// It has been separated from the other header that defines bitstream enum
+// values, BitCodes.h, to allow tools to track changes to the various
+// bitstream and bitcode enums without needing to fully or partially build
+// LLVM itself.
+//
+// The enum values defined in this file should be considered permanent. If
+// new features are added, they should have values added at the end of the
+// respective lists.
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef LLVM_BITSTREAM_BITCODEENUMS_H
+#define LLVM_BITSTREAM_BITCODEENUMS_H
+
+namespace llvm {
+/// Offsets of the 32-bit fields of bitstream wrapper header.
+enum BitstreamWrapperHeader : unsigned {
+  BWH_MagicField = 0 * 4,
+  BWH_VersionField = 1 * 4,
+  BWH_OffsetField = 2 * 4,
+  BWH_SizeField = 3 * 4,
+  BWH_CPUTypeField = 4 * 4,
+  BWH_HeaderSize = 5 * 4
+};
+
+namespace bitc {
+enum StandardWidths {
+  BlockIDWidth = 8, // We use VBR-8 for block IDs.
+  CodeLenWidth = 4, // Codelen are VBR-4.
+  BlockSizeWidth = 32 // BlockSize up to 2^32 32-bit words = 16GB per block.
+};
+
+// The standard abbrev namespace always has a way to exit a block, enter a
+// nested block, define abbrevs, and define an unabbreviated record.
+enum FixedAbbrevIDs {
+  END_BLOCK = 0, // Must be zero to guarantee termination for broken bitcode.
+  ENTER_SUBBLOCK = 1,
+
+  /// DEFINE_ABBREV - Defines an abbrev for the current block. It consists
+  /// of a vbr5 for # operand infos. Each operand info is emitted with a
+  /// single bit to indicate if it is a literal encoding. If so, the value is
+  /// emitted with a vbr8. If not, the encoding is emitted as 3 bits followed
+  /// by the info value as a vbr5 if needed.
+  DEFINE_ABBREV = 2,
+
+  // UNABBREV_RECORDs are emitted with a vbr6 for the record code, followed by
+  // a vbr6 for the # operands, followed by vbr6's for each operand.
+  UNABBREV_RECORD = 3,
+
+  // This is not a code, this is a marker for the first abbrev assignment.
+  FIRST_APPLICATION_ABBREV = 4
+};
+
+/// StandardBlockIDs - All bitcode files can optionally include a BLOCKINFO
+/// block, which contains metadata about other blocks in the file.
+enum StandardBlockIDs {
+  /// BLOCKINFO_BLOCK is used to define metadata about blocks, for example,
+  /// standard abbrevs that should be available to all blocks of a specified
+  /// ID.
+  BLOCKINFO_BLOCK_ID = 0,
+
+  // Block IDs 1-7 are reserved for future expansion.
+  FIRST_APPLICATION_BLOCKID = 8
+};
+
+/// BlockInfoCodes - The blockinfo block contains metadata about user-defined
+/// blocks.
+enum BlockInfoCodes {
+  // DEFINE_ABBREV has magic semantics here, applying to the current SETBID'd
+  // block, instead of the BlockInfo block.
+
+  BLOCKINFO_CODE_SETBID = 1, // SETBID: [blockid#]
+  BLOCKINFO_CODE_BLOCKNAME = 2, // BLOCKNAME: [name]
+  BLOCKINFO_CODE_SETRECORDNAME = 3 // BLOCKINFO_CODE_SETRECORDNAME:
+  // [id, name]
+};
+
+} // namespace bitc
+} // namespace llvm
+
+#endif
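
The `UNABBREV_RECORD` comment in the new header describes records as a sequence of vbr6 values. As a rough illustration of what that means, here is a minimal sketch of variable-bit-rate chunking, under the assumption that each chunk of N bits carries N-1 data bits (least significant first) plus a high continuation bit. The function names `encodeVBR`, `decodeVBR`, and `encodeUnabbrevRecord` are hypothetical and are not LLVM APIs; the sketch returns chunk values rather than packing them into a bitstream.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Split Value into VBR chunks of Width bits each. The top bit of a chunk is
// set when another chunk follows; the remaining Width-1 bits carry data.
std::vector<std::uint32_t> encodeVBR(std::uint64_t Value, unsigned Width) {
  assert(Width >= 2 && Width <= 32 && "chunk width out of range for sketch");
  const std::uint64_t HiBit = std::uint64_t(1) << (Width - 1);
  std::vector<std::uint32_t> Chunks;
  while (Value >= HiBit) {
    Chunks.push_back(std::uint32_t((Value & (HiBit - 1)) | HiBit));
    Value >>= (Width - 1);
  }
  Chunks.push_back(std::uint32_t(Value)); // final chunk, continuation bit clear
  return Chunks;
}

// Reassemble a value from the chunks of a single VBR-encoded number.
std::uint64_t decodeVBR(const std::vector<std::uint32_t> &Chunks,
                        unsigned Width) {
  assert(Width >= 2 && Width <= 32 && "chunk width out of range for sketch");
  const std::uint64_t HiBit = std::uint64_t(1) << (Width - 1);
  std::uint64_t Value = 0;
  unsigned Shift = 0;
  for (std::uint32_t Chunk : Chunks) {
    Value |= (std::uint64_t(Chunk) & (HiBit - 1)) << Shift;
    Shift += Width - 1;
    if (!(Chunk & HiBit))
      break; // continuation bit clear: this was the last chunk
  }
  return Value;
}

// Payload of an unabbreviated record as vbr6 chunks: record code, operand
// count, then each operand. The leading UNABBREV_RECORD abbrev ID (3) is
// emitted separately at the current abbrev width and is omitted here.
std::vector<std::uint32_t>
encodeUnabbrevRecord(unsigned Code, const std::vector<std::uint64_t> &Ops) {
  std::vector<std::uint32_t> Out = encodeVBR(Code, 6);
  const std::vector<std::uint32_t> Count = encodeVBR(Ops.size(), 6);
  Out.insert(Out.end(), Count.begin(), Count.end());
  for (std::uint64_t Op : Ops) {
    const std::vector<std::uint32_t> Enc = encodeVBR(Op, 6);
    Out.insert(Out.end(), Enc.begin(), Enc.end());
  }
  return Out;
}
```

For instance, `encodeUnabbrevRecord(1, {8})` models a `BLOCKINFO_CODE_SETBID` record whose single operand is `FIRST_APPLICATION_BLOCKID`; each of its three values fits in one vbr6 chunk.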

llvm/include/llvm/Bitstream/BitCodes.h

Lines changed: 2 additions & 65 deletions
@@ -19,75 +19,12 @@
 
 #include "llvm/ADT/SmallVector.h"
 #include "llvm/ADT/StringExtras.h"
+#include "llvm/Bitstream/BitCodeEnums.h"
 #include "llvm/Support/DataTypes.h"
 #include "llvm/Support/ErrorHandling.h"
 #include <cassert>
 
 namespace llvm {
-/// Offsets of the 32-bit fields of bitstream wrapper header.
-enum BitstreamWrapperHeader : unsigned {
-  BWH_MagicField = 0 * 4,
-  BWH_VersionField = 1 * 4,
-  BWH_OffsetField = 2 * 4,
-  BWH_SizeField = 3 * 4,
-  BWH_CPUTypeField = 4 * 4,
-  BWH_HeaderSize = 5 * 4
-};
-
-namespace bitc {
-enum StandardWidths {
-  BlockIDWidth = 8, // We use VBR-8 for block IDs.
-  CodeLenWidth = 4, // Codelen are VBR-4.
-  BlockSizeWidth = 32 // BlockSize up to 2^32 32-bit words = 16GB per block.
-};
-
-// The standard abbrev namespace always has a way to exit a block, enter a
-// nested block, define abbrevs, and define an unabbreviated record.
-enum FixedAbbrevIDs {
-  END_BLOCK = 0, // Must be zero to guarantee termination for broken bitcode.
-  ENTER_SUBBLOCK = 1,
-
-  /// DEFINE_ABBREV - Defines an abbrev for the current block. It consists
-  /// of a vbr5 for # operand infos. Each operand info is emitted with a
-  /// single bit to indicate if it is a literal encoding. If so, the value is
-  /// emitted with a vbr8. If not, the encoding is emitted as 3 bits followed
-  /// by the info value as a vbr5 if needed.
-  DEFINE_ABBREV = 2,
-
-  // UNABBREV_RECORDs are emitted with a vbr6 for the record code, followed by
-  // a vbr6 for the # operands, followed by vbr6's for each operand.
-  UNABBREV_RECORD = 3,
-
-  // This is not a code, this is a marker for the first abbrev assignment.
-  FIRST_APPLICATION_ABBREV = 4
-};
-
-/// StandardBlockIDs - All bitcode files can optionally include a BLOCKINFO
-/// block, which contains metadata about other blocks in the file.
-enum StandardBlockIDs {
-  /// BLOCKINFO_BLOCK is used to define metadata about blocks, for example,
-  /// standard abbrevs that should be available to all blocks of a specified
-  /// ID.
-  BLOCKINFO_BLOCK_ID = 0,
-
-  // Block IDs 1-7 are reserved for future expansion.
-  FIRST_APPLICATION_BLOCKID = 8
-};
-
-/// BlockInfoCodes - The blockinfo block contains metadata about user-defined
-/// blocks.
-enum BlockInfoCodes {
-  // DEFINE_ABBREV has magic semantics here, applying to the current SETBID'd
-  // block, instead of the BlockInfo block.
-
-  BLOCKINFO_CODE_SETBID = 1, // SETBID: [blockid#]
-  BLOCKINFO_CODE_BLOCKNAME = 2, // BLOCKNAME: [name]
-  BLOCKINFO_CODE_SETRECORDNAME = 3 // BLOCKINFO_CODE_SETRECORDNAME:
-  // [id, name]
-};
-
-} // End bitc namespace
-
 /// BitCodeAbbrevOp - This describes one or more operands in an abbreviation.
 /// This is actually a union of two different things:
 /// 1. It could be a literal integer value ("the operand is always 17").
@@ -183,6 +120,6 @@ class BitCodeAbbrev {
     OperandList.push_back(OpInfo);
   }
 };
-} // End llvm namespace
+} // namespace llvm
 
 #endif
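
The `BitstreamWrapperHeader` offsets moved out of `BitCodes.h` index five 32-bit fields of the optional bitcode wrapper. As an illustration of how a standalone tool might use them, here is a minimal sketch that reads those fields from a byte buffer. The `sketch` namespace, `WrapperHeader` struct, and `parseWrapperHeader` function are hypothetical; the little-endian field layout and the magic value `0x0B17C0DE` come from the LLVM bitcode format documentation, not from this commit.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

namespace sketch {
// Byte offsets mirrored from llvm::BitstreamWrapperHeader in this commit.
enum : unsigned {
  MagicField = 0 * 4,
  VersionField = 1 * 4,
  OffsetField = 2 * 4,
  SizeField = 3 * 4,
  CPUTypeField = 4 * 4,
  HeaderSize = 5 * 4
};

// Wrapper magic per the LLVM bitcode format documentation (an assumption
// relative to this commit, which does not mention it).
constexpr std::uint32_t WrapperMagic = 0x0B17C0DE;

struct WrapperHeader {
  std::uint32_t Version, Offset, Size, CPUType;
};

// Read one little-endian 32-bit field.
inline std::uint32_t readLE32(const unsigned char *P) {
  return std::uint32_t(P[0]) | (std::uint32_t(P[1]) << 8) |
         (std::uint32_t(P[2]) << 16) | (std::uint32_t(P[3]) << 24);
}

// Fill Out and return true if Buf starts with a complete wrapper header.
inline bool parseWrapperHeader(const unsigned char *Buf, std::size_t Len,
                               WrapperHeader &Out) {
  assert((Buf != nullptr || Len == 0) && "null buffer with nonzero length");
  if (Len < HeaderSize)
    return false; // too short to hold all five fields
  if (readLE32(Buf + MagicField) != WrapperMagic)
    return false; // not a wrapped bitcode file
  Out.Version = readLE32(Buf + VersionField);
  Out.Offset = readLE32(Buf + OffsetField);
  Out.Size = readLE32(Buf + SizeField);
  Out.CPUType = readLE32(Buf + CPUTypeField);
  return true;
}
} // namespace sketch
```

A caller would pass the first bytes of a file and, on success, use `Offset` and `Size` to locate the embedded bitstream; validating those two values against the file length is left out of the sketch.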
