-
Notifications
You must be signed in to change notification settings - Fork 0
[PowerPC][AIX] Support #pragma comment copyright for AIX #1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: XCOFFAssociatedMetadata
Are you sure you want to change the base?
[PowerPC][AIX] Support #pragma comment copyright for AIX #1
Conversation
| /*Name=*/"__aix_copyright_str"); | ||
| StrGV->setUnnamedAddr(GlobalValue::UnnamedAddr::Global); | ||
| StrGV->setAlignment(Align(1)); | ||
| StrGV->setSection("__llvm_copyright"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The comments for this transformation do not say anything about this. What is the benefit for doing this? It increases the chances of the string being kept under LTO situations, but is that always a good thing?
Also, if this is AIX-specific, why call this section __llvm_copyright?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
setUnnamedAddrwill allow linker to merge identical strings. I think using InternalLinkage will do the exact same thing.- I will change the name to
__aix_copyright.
xlcalso emitted a section and.refwas pointing to it as shown below.
.csect H.10.NO_SYMBOL{PR}, 7
.func1: # 0x00000000 (H.10.NO_SYMBOL)
cal r3,1(r0)
bcr BO_ALWAYS,CR0_LT
.long 0x00000000
.ref E.16.__STATIC{RW}. ############# .ref to E.16.__STATIC section
.csect E.16.__STATIC{RW}, 3
.long 0x436f7079 # "Copy"
.long 0x72696768 # "righ"
.long 0x74203230 # "t 20"
.long 0x32352043 # "25 C"
.long 0x6f6d7061 # "ompa"
.long 0x6e792041 # "ny A"
# End csect E.16.__STATIC{RW}
.long 0x00000000 # "\0\0\0\0"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
xlc puts all internal linkage data into the same section (which unnecessarily bloats the loaded image if some of the data is unreferenced and could otherwise be garbage-collected, but it has the benefit of reducing TOC usage).
Providing the disassembly here for discussion is a good call: It seems xlc put the string into a writable section. There is no documented rationale for that. We should consider marking the string as read-only (and testing for that).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We are generating a csect with RO mapping class as shown below:
.file "file1.c",,"LLVM version 22.0.0git"
.csect ..text..[PR],5
.rename ..text..[PR],""
.machine "PWR7"
.globl func1[DS] # -- Begin function func1
.globl .func1
.align 4
.csect func1[DS],2
.vbyte 4, .func1 # @func1
.vbyte 4, TOC[TC0]
.vbyte 4, 0
.csect ..text..[PR],5
.func1:
.ref __aix_copyright_str
# %bb.0: # %entry
blr
L..func10:
.vbyte 4, 0x00000000 # Traceback table begin
.byte 0x00 # Version = 0
.byte 0x09 # Language = CPlusPlus
.byte 0x20 # -IsGlobaLinkage, -IsOutOfLineEpilogOrPrologue
# +HasTraceBackTableOffset, -IsInternalProcedure
# -HasControlledStorage, -IsTOCless
# -IsFloatingPointPresent
# -IsFloatingPointOperationLogOrAbortEnabled
.byte 0x40 # -IsInterruptHandler, +IsFunctionNamePresent, -IsAllocaUsed
# OnConditionDirective = 0, -IsCRSaved, -IsLRSaved
.byte 0x00 # -IsBackChainStored, -IsFixup, NumOfFPRsSaved = 0
.byte 0x00 # -HasExtensionTable, -HasVectorInfo, NumOfGPRsSaved = 0
.byte 0x00 # NumberOfFixedParms = 0
.byte 0x01 # NumberOfFPParms = 0, +HasParmsOnStack
.vbyte 4, L..func10-.func1 # Function size
.vbyte 2, 0x0005 # Function name len = 5
.byte "func1" # Function Name
# -- End function
.csect __aix_copyright[RO],2
.lglobl __aix_copyright_str # @__aix_copyright_str
__aix_copyright_str:
.string "Copyright 2025 TU A"
.toc
We should consider marking the string as read-only (and testing for that).
Marking the csect as RO would suffice right?
To test this, we can do // RUN: llc -mtriple=powerpc-ibm-aix -filetype=asm %t/file1.bc -o - | FileCheck %s --check-prefix=CHECK-ASM like we did in the previous commit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reintroduced llvm/test/CodeGen/PowerPC/pragma-comment-copyright-aix.ll for backend testing. We can remove this if not required.
| @@ -0,0 +1,110 @@ | |||
| // REQUIRES: powerpc-registered-target, system-aix, clang | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this test fits with the LLVM unit testing philosophy.
My understanding is that we have tests
- for the front-end generation of the
aix.copyright.commentmetadata, and - for the transformation of the
aix.copyright.commentmetadata toimplicit.ref, etc.
What we do not have are targeted tests that specifically verify that the transformation is run at the right point in the various possible pipelines.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you are right. Removed this test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What we do not have are targeted tests that specifically verify that the transformation is run at the right point in the various possible pipelines.
Added the opt levels testing to llvm/test/Transforms/CopyrightMetadata/copyright-metadata.ll. Do we need an end-to-end test ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need a end-to-end test ?
I can't say that we shouldn't have one. It is just that I am not aware of there being any in upstream LLVM that involve the linker (except when it's an lld test).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I couldn't find anything similar as well.
438f04c to
cfa6019
Compare
…slation unit, add backend tests
252f063 to
82254ee
Compare
|
Added the test |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some partial review comments...
| ; RUN: opt --O0 -S %s -o - | FileCheck %s --check-prefixes=CHECK,CHECK-O0 | ||
| ; RUN: opt --O1 -S %s -o - | FileCheck %s --check-prefixes=CHECK,CHECK-ON | ||
| ; RUN: opt --O2 -S %s -o - | FileCheck %s --check-prefixes=CHECK,CHECK-ON | ||
| ; RUN: opt --O3 -S %s -o - | FileCheck %s --check-prefixes=CHECK,CHECK-ON |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not familiar enough with how the pipelines are built between different tools (clang, opt, etc.) to know if this is the right way to check that the pass is in the pipeline at the right place (especially for LTO).
@w2yehia @mandlebug, your input would be appreciated.
| ; - Emits the string in a dedicated read-only csect | ||
| ; | ||
| ; CHECK-ASM: .ref __aix_copyright_str | ||
| ; CHECK-ASM: .csect __aix_copyright[RO],2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for the churn. The check in llvm/test/Transforms/CopyrightMetadata/copyright-metadata.ll for constant in the IR is sufficient for establishing that we are generating as read-only, so the testing request from #1 (comment) was already satisfied (and we don't need this test just to cover that).
It is still a question whether we want to group the strings into one csect (which probably has no effect except under "full" LTO where it will increase the chance of the string being kept).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, sure. Removing the llvm/test/CodeGen/PowerPC/pragma-comment-copyright-aix.ll.
It is still a question whether we want to group the strings into one csect (which probably has no effect except under "full" LTO where it will increase the chance of the string being kept).
Are there any downsides to the current approach ? I think it is cleaner to group them seperately in a csect.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are there any downsides to the current approach ?
It generates "false positives" to the question of whether the content associated with the copyright string was actually retained/relevant to the result.
clang/lib/CodeGen/CodeGenModule.cpp
Outdated
|
|
||
| // Emit copyright metadata for AIX | ||
| if (AIXCopyrightComment) { | ||
| auto *NMD = getModule().getOrInsertNamedMetadata("aix.copyright.comment"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you rename "aix.copyright.comment" to be something generic so other platforms can use it. Something like "comment.module". Hubert had a better suggestion than the word "module". I'd use that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed to loadtime.copyright.comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Drop copyright from the name. That reflects a use, not the purpose.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Additionally, it may be helpful (in case there are additional #pragma comment IR annotations in the future) to name it comment_string.loadtime so that we can have a family of comment_string.* annotations with the "high-order qualifier" first (assuming that the comments will uniformly be generated as strings out of the front-end).
clang/lib/CodeGen/CodeGenModule.h
Outdated
|
|
||
| /// Single module-level copyright comment for AIX (if any). | ||
| /// We only ever accept one per TU. | ||
| llvm::MDNode *AIXCopyrightComment = nullptr; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you make this a SmallVector<> too. AIX might limit this to only one pragma, other platforms may not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to evaluate the impact of making this a SmallVector. Will it cause the code that processes it to become a loop? If so, can that loop be truly tested in this PR? We should not be leaving "inoperative" code paths lying around.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree with Hubert here, we can always change it to a vector if a need arises.
| // verify that module interface emit copyright string when compiled to assembly | ||
| // RUN: %clang_cc1 -std=c++20 -triple powerpc-ibm-aix -S %t/copymod.cppm -o - \ | ||
| // RUN: | FileCheck %s --check-prefix=CHECK-MOD | ||
| // CHECK-MOD: .string "module me" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The test can be written to check LLVM IR (to avoid the PowerPC backend requirement).
clang/lib/CodeGen/CodeGenModule.cpp
Outdated
|
|
||
| // Emit copyright metadata for AIX | ||
| if (AIXCopyrightComment) { | ||
| auto *NMD = getModule().getOrInsertNamedMetadata("aix.copyright.comment"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Additionally, it may be helpful (in case there are additional #pragma comment IR annotations in the future) to name it comment_string.loadtime so that we can have a family of comment_string.* annotations with the "high-order qualifier" first (assuming that the comments will uniformly be generated as strings out of the front-end).
| void CodeGenModule::ProcessPragmaComment(PragmaMSCommentKind Kind, | ||
| StringRef Comment) { | ||
|
|
||
| if (!getTriple().isOSAIX() || Kind != PCK_Copyright) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider asserting either if this is called not for copyright or if it is called for copyright but not for AIX.
Silently doing nothing seems questionable (e.g., why was the function called to do nothing in the first place?).
| /// A vector of metadata strings for dependent libraries for ELF. | ||
| SmallVector<llvm::MDNode *, 16> ELFDependentLibraries; | ||
|
|
||
| /// Single module-level copyright comment (if any). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Avoid "module" without qualification. "Single" is also redundant given the next sentence.
| /// Single module-level copyright comment (if any). | |
| /// Metadata for copyright pragma comment (if present). |
| // Restrict copyright to AIX targets only | ||
| if (!PP.getTargetInfo().getTriple().isOSAIX()) { | ||
| switch (Kind) { | ||
| case PCK_Copyright: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This choice of control flow does not stand on its own. To explain the choice, a comment should be added to the code expressing future direction that justifies this structure.
For example, we can say that this should be applied for z/OS and extended with other IBM pragma comment kinds.
| // into concrete, translation-unit–local globals to ensure that copyright | ||
| // strings: | ||
| // - survive all optimization and LTO pipelines, | ||
| // - are not removed by linker garbage collection, and | ||
| // - remain visible in the final XCOFF binary. | ||
| // |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove this. It is inaccurate and otherwise redundant with the detailed description below.
| // - are not removed by linker garbage collection, and | ||
| // - remain visible in the final XCOFF binary. | ||
| // | ||
| // For each module (translation unit), the pass performs the following: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add "only for AIX".
| // 1. Create a single NULL-terminated string global | ||
| Constant *StrInit = ConstantDataArray::getString(Ctx, Text, /*AddNull=*/true); | ||
|
|
||
| // Internal, constant, TU-local — avoids duplicate symbol issues across TUs. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please try to avoid gratuitous UTF-8.
| // Internal, constant, TU-local — avoids duplicate symbol issues across TUs. | |
| // Internal, constant, TU-local--avoids duplicate symbol issues across TUs. |
| // Backend maps this to an appropriate XCOFF csect (typically [RO]) | ||
| // The section will appear in assembly as: | ||
| // .csect __copyright_comment[RO],2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's limit the amount that we say about XCOFF in this IR pass.
| // Backend maps this to an appropriate XCOFF csect (typically [RO]) | |
| // The section will appear in assembly as: | |
| // .csect __copyright_comment[RO],2 | |
| // The GV is constant, so we expect a read-only section. |
| MDNode *ImplicitRefMD = MDNode::get(Ctx, Ops); | ||
|
|
||
| // Lambda to attach implicit.ref metadata to a function. | ||
| // The backend will translate this into .ref assembly directives. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| // The backend will translate this into .ref assembly directives. |
Add support for the #pragma comment (copyright, "Copyright info") directive for AIX.
This builds on top of PR #159096
Refernce: XL documentation for #pragma comment