feat: add support for LLVM IR files in the linker #323
base: main
tamird
left a comment
Neat!
@tamird reviewed 4 of 5 files at r1, all commit messages.
Reviewable status: 4 of 5 files reviewed, 10 unresolved discussions (waiting on @alessandrod)
-- commits line 2 at r1:
it would be great to include your motivation here
src/llvm/mod.rs line 144 at r1 (raw file):
    linked
}
#[must_use]
newline between functions please
src/llvm/mod.rs line 148 at r1 (raw file):
context: &'ctx LLVMContext,
module: &mut LLVMModule<'ctx>,
buffer: &CStr,
why does this need a c-string? LLVMCreateMemoryBufferWithMemoryRange takes a pointer and a length, i think this can be &[u8]
src/llvm/mod.rs line 176 at r1 (raw file):
    linked = unsafe { LLVMLinkModules2(module.as_mut_ptr(), temp_module) } == 0;
} else {
    if !error_msg.is_null() {
we should return the error message.
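One way to surface the message is to copy it into an owned String before it is disposed. The sketch below is only an illustration: the helper name message_to_string is hypothetical, and it assumes the pointer is either null or a valid NUL-terminated string, which is what the LLVM C API hands back in its error out-parameters. The caller would still free the original pointer with LLVMDisposeMessage afterwards.

```rust
use std::ffi::{c_char, CStr};

/// Copies a C error message into an owned `String` (hypothetical helper).
///
/// Returns `None` for a null pointer. Ownership of `msg` stays with the
/// caller, who is still responsible for freeing it.
///
/// # Safety
/// `msg` must be null or point to a valid NUL-terminated string.
unsafe fn message_to_string(msg: *const c_char) -> Option<String> {
    if msg.is_null() {
        return None;
    }
    // Lossy conversion: diagnostics are not guaranteed to be valid UTF-8.
    Some(unsafe { CStr::from_ptr(msg) }.to_string_lossy().into_owned())
}
```

With something like this, the linking function could return Result<(), String> instead of a bare bool, so the parse error reaches the user.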
src/llvm/mod.rs line 179 at r1 (raw file):
        unsafe { LLVMDisposeMessage(error_msg) };
    }
    if !temp_module.is_null() {
this should be impossible, no?
Cargo.toml line 43 at r1 (raw file):
rustc-build-sysroot = { workspace = true }
which = { version = "8.0.0", default-features = false, features = ["real-sys", "regex"] }
tempfile = "3.13"
could you keep this alphabetical please and use { version = ... } for consistency?
src/linker.rs line 514 at r1 (raw file):
// buffer used to perform file type detection
let mut buf = [0u8; 1024];
could you add a comment here, or maybe make this more general so that this arbitrary size isn't needed? e.g. i wonder if BufRead would help?
src/linker.rs line 600 at r1 (raw file):
}
InputType::Ir => {
    data.push(0); // force push null terminator
I think you can just use CString::new(data) which will internally append the nul byte.
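A minimal sketch of that suggestion, assuming data is a Vec<u8> holding the IR text. The helper name to_c_buffer is made up for illustration. One behavioural difference from data.push(0) is worth noting: CString::new rejects input that already contains a NUL byte, so the caller has to handle that case.

```rust
use std::ffi::CString;

/// Converts raw IR bytes into a NUL-terminated buffer (hypothetical helper).
/// Returns `None` if the input already contains a NUL byte.
fn to_c_buffer(data: Vec<u8>) -> Option<CString> {
    // `CString::new` takes ownership of the bytes and appends the terminator.
    CString::new(data).ok()
}
```

On the result, as_bytes() gives the content without the terminator and as_bytes_with_nul() includes it, so a pointer plus as_bytes().len() lands exactly on the trailing NUL.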
src/linker.rs line 914 at r1 (raw file):
    Some(position) => &data[position..],
    None => return false,
};
Code quote:
// Trim whitespace from the start of the data
let trimmed = match data.iter().position(|b| !b.is_ascii_whitespace()) {
Some(position) => &data[position..],
None => return false,
};
src/linker.rs line 923 at r1 (raw file):
|| trimmed.starts_with(b"target ")
|| trimmed.starts_with(b"define")
|| trimmed.starts_with(b"!llvm")
might be easier to read as [.....].iter().any(|prefix| trimmed.starts_with(prefix))
Code quote:
trimmed.starts_with(b"; ModuleID")
|| trimmed.starts_with(b"target triple")
|| trimmed.starts_with(b"target datalayout")
|| trimmed.starts_with(b"source_filename")
|| trimmed.starts_with(b"target ")
|| trimmed.starts_with(b"define")
|| trimmed.starts_with(b"!llvm")
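The iterator form suggested for the quoted chain could look roughly like this. It is a sketch rather than the code that landed: the constant name IR_PREFIXES is invented here, and the list omits "target triple" and "target datalayout" because the shorter "target " prefix already matches both.

```rust
/// Prefixes that commonly open a textual LLVM IR file (taken from the quoted code).
const IR_PREFIXES: &[&[u8]] = &[
    b"; ModuleID",
    b"source_filename",
    b"target ",
    b"define",
    b"!llvm",
];

fn is_llvm_ir(data: &[u8]) -> bool {
    // Skip leading ASCII whitespace, then test every known prefix.
    let trimmed = data.trim_ascii_start();
    IR_PREFIXES.iter().any(|prefix| trimmed.starts_with(prefix))
}
```

Declaring the list as &[&[u8]] lets byte-string literals of different lengths live in one slice, and an all-whitespace input falls through to false without a separate emptiness check.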
BretasArthur1
left a comment
Reviewable status: 4 of 5 files reviewed, 10 unresolved discussions (waiting on @alessandrod and @tamird)
src/linker.rs line 514 at r1 (raw file):
Previously, tamird (Tamir Duberstein) wrote…
could you add a comment here, or maybe make this more general so that this arbitrary size isn't needed? e.g. i wonder if BufRead would help?
What do you suggest as a comment for this one?
tamird
left a comment
Reviewable status: 4 of 5 files reviewed, 10 unresolved discussions (waiting on @alessandrod and @BretasArthur1)
src/linker.rs line 514 at r1 (raw file):
Previously, BretasArthur1 (Arthur Bretas) wrote…
What do you suggest as a comment for this one?
I would avoid the arbitrary-size buffer if possible - in the case of IR you could have unbounded whitespace before the thing you look for in is_llvm_ir.
tamird
left a comment
@tamird reviewed 1 of 4 files at r2, all commit messages.
Reviewable status: 2 of 5 files reviewed, 11 unresolved discussions (waiting on @alessandrod and @BretasArthur1)
src/linker.rs line 915 at r2 (raw file):
fn is_llvm_ir(data: &[u8]) -> bool {
    let trimmed = data.trim_ascii_start();
    if trimmed.is_empty() {
this check is not needed, right?
src/llvm/mod.rs line 152 at r2 (raw file):
let buffer_name = c"ir_buffer";
let buffer = buffer.to_bytes();
let mem_buffer = unsafe {
don't you need to unsafe { LLVMDisposeMemoryBuffer(buffer) }; like the function above?
src/llvm/mod.rs line 157 at r2 (raw file):
buffer.len(),
buffer_name.as_ptr(),
1, // LLVM internally sets RequiresTerminator=true
I am now confused by this. Can we just set it to 0 and then we wouldn't need to create a C-string?
BretasArthur1
left a comment
Reviewable status: 2 of 5 files reviewed, 11 unresolved discussions (waiting on @alessandrod and @tamird)
src/linker.rs line 915 at r2 (raw file):
Previously, tamird (Tamir Duberstein) wrote…
this check is not needed, right?
Now that we switched to the approach of no hardcoded buffer I need to remove this, sorry
src/llvm/mod.rs line 152 at r2 (raw file):
Previously, tamird (Tamir Duberstein) wrote…
don't you need to unsafe { LLVMDisposeMemoryBuffer(buffer) }; like the function above?
In some tests I was doing this, but if we drop the buffer here it can't hold the internal reference to it and perform the checks. That was with the previous approach, though, so I need to check again now!
src/llvm/mod.rs line 157 at r2 (raw file):
Previously, tamird (Tamir Duberstein) wrote…
I am now confused by this. Can we just set it to 0 and then we wouldn't need to create a C-string?
So because of the LLVM IR parser we need to set this to 1: the parser hardcodes null termination to true even if the buffer doesn't have a terminator... This was an issue alessandro found with LLVM in debug mode, but maybe he can answer this better
cc @alessandrod
tamird
left a comment
Reviewable status: 2 of 5 files reviewed, 11 unresolved discussions (waiting on @alessandrod and @BretasArthur1)
src/llvm/mod.rs line 157 at r2 (raw file):
Previously, BretasArthur1 (Arthur Bretas) wrote…
So because of the LLVM IR parser we need to set this to 1: the parser hardcodes null termination to true even if the buffer doesn't have a terminator... This was an issue alessandro found with LLVM in debug mode, but maybe he can answer this better
cc @alessandrod
Now that I am looking at the screenshot he posted, I think that was just because of this 1. If you change it to 0 I think you do not need a null terminator.
BretasArthur1
left a comment
Reviewable status: 2 of 5 files reviewed, 11 unresolved discussions (waiting on @alessandrod and @tamird)
src/llvm/mod.rs line 157 at r2 (raw file):
Previously, tamird (Tamir Duberstein) wrote…
Now that I am looking at the screenshot he posted, I think that was just because of this 1. If you change it to 0 I think you do not need a null terminator.
That print was testing with 0
Done, removed the unnecessary check and also tested with
hey @tamird gm! any blockers here?
@tamird wrt the memory buffer thing, I think we actually need to change all the other instances to be null terminated. The RequiresNullTerminator field isn't stored anywhere; it's only used to do a check when the buffer is created: https://github.com/llvm/llvm-project/blob/bde90624185ea2cead0a8d7231536e2625d78798/llvm/lib/Support/MemoryBuffer.cpp#L48
This is why we get the assertion in the screenshot: LLVMParseIRInContext => parseIR => parseAssembly => parseAssemblyInto
so RequiresNullTerminator=true and we hit the assertion if we don't null terminate our buffer. I think in the other instances we're just lucky and only avoid the assertion by accident.
src/linker.rs
Outdated
let mut buf = BufReader::new(input);

// Peek at the buffer to determine file type
let preview = buf
this is still an arbitrary size buffer (the size of the internal BufReader buffer, which is 8 KiB by default)
You don't need BufReader here. What you want to do is pass input to
detect_input_type instead of passing &[u8]
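One possible reading of this suggestion, sketched under assumptions: the detection function takes the reader itself, skips however much leading whitespace there is, and then collects only as many bytes as the longest marker needs. The names reader_starts_with_ir and IR_MARKERS are hypothetical and the marker list is abbreviated; this is not the actual detect_input_type.

```rust
use std::io::{self, Read};

/// Abbreviated list of markers that open a textual LLVM IR file (assumed).
const IR_MARKERS: &[&[u8]] = &[b"; ModuleID", b"source_filename", b"target ", b"define", b"!llvm"];

/// Skips any amount of leading ASCII whitespace in `input`, then checks
/// whether the following bytes start with one of the markers.
fn reader_starts_with_ir(mut input: impl Read) -> io::Result<bool> {
    let longest = IR_MARKERS.iter().map(|m| m.len()).max().unwrap_or(0);
    let mut head = Vec::with_capacity(longest);
    let mut byte = [0u8; 1];
    while head.len() < longest {
        match input.read(&mut byte) {
            Ok(0) => break, // end of input
            Ok(_) => {
                // Whitespace is skipped only before the first real byte.
                if head.is_empty() && byte[0].is_ascii_whitespace() {
                    continue;
                }
                head.push(byte[0]);
            }
            Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        }
    }
    Ok(IR_MARKERS.iter().any(|marker| head.starts_with(marker)))
}
```

Two trade-offs in this sketch: the single-byte reads are slow on an unbuffered file, and the bytes it consumes are gone from the reader, so a real implementation would need to seek back or keep what it read before handing the input on.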
got it, just saw on the implementation
"BufReader can improve the speed of programs that make small and repeated read calls to the same file or network socket. It does not help when reading very large amounts at once, or reading just one or a few times. It also provides no advantage when reading from a source that is already in memory, like a Vec<u8>."
) -> Result<bool, String> {
    let buffer_name = c"ir_buffer";
    let buffer = buffer.to_bytes();
    let mem_buffer = unsafe {
you're leaking this; you need to call LLVMDisposeMemoryBuffer before returning
tests/ir_file_test.rs
Outdated
// Corrupting IR content
let invalid_content = valid_content
    .replace("define", "defXne")
no need to corrupt 3 things, pick one :D
…ter and fix memory leak. remove unnecessary replacements on test


Context
Recently I was testing generating IR with different languages, using LLVM to generate the bitcode and then using sbpf-linker to generate a .so file.
Problem
In that process we need an extra step to generate the .bc with LLVM. I started a thread on X about accepting .ll files as inputs to sbpf-linker, and then alessandro mentioned the possibility of adding it to bpf-linker, and here we are!