Skip to content

Conversation

@Lut99
Copy link

@Lut99 Lut99 commented May 5, 2025

This is the brane counterpart to BraneFramework/policy-reasoner#61

This PR see a complete rework of brane-chk to incorporate changes introduced by the policy-reasoner refactor. Specifically:

  1. Moved the interface-part (API) to the brane-chk code. As such, the code here has become more complex (as in, there is some now) but removes the co-dependency between the Brane and reasoner repos.
  2. Separated out the reasoner and store APIs, as per the new split in the reasoner itself. This mirrors the old deliberation/management API splits.
    • However, note that the reasoner context is the only oddball out here. It is a reasoner concept but used in the store API, and as such, the interface on the Brane side does a little wizardry to inject that concept into the store API.
  3. Moved to the eFLINT Haskell backend. This means the brane-chk-container is no longer compiling Go but Haskell instead.

The merge is almost complete. See #269 and BraneFramework/manual#5 for things left to do.

In addition:

  1. Removed schemes (e.g., http://) from endpoints in config files as these were also fixed to the same anyway and caused confusion when writing the files. Instead, the code using the addresses infer it themselves.
  2. Docker containers now run as non-root (brane user instead) for added security (save for brane-job, as it needs to access the Docker socket, also on mac).
  3. Tested & runs on an ARM mac.
  4. Added the make brane-let-docker command, which compiles the branelet executable in a container to support both Linux and macOS branelet incompatibility issues.

IMPORTANT: Before this is merged, merge the equivalent PR at the policy-reasoner side first. Then, update the dependencies in this repo to remove the branch = "lib-refactor"-keys. Then this PR may be merged.

(Also fix the rebase issues first lel)

@Lut99
Copy link
Author

Lut99 commented May 5, 2025

Oh no, I see what went wrong with the rebase now

WHY DID IT PUT MAIN ON MY CHANGES AND THEN ADDED THESE CHANGES ATOP THE MAIN ONES AGAIN???

Daniel it's a mess 😂 help

@Lut99
Copy link
Author

Lut99 commented May 5, 2025

OK, I completely see what I did wrong. I got confused after I finished the rebase and then I merged the remote back into it 🤦‍♂️ Clearly, since the histories now diverge, I should've just force pushed.

Is there an easy way to fix this? I think I added commits since that happened whoop

@Lut99
Copy link
Author

Lut99 commented May 6, 2025

I'm having a bad time fixing this xD man I hate rebases (of this size)

I tried to rebase the last couple of commits (7f33db8 - 5b77c67) onto the commit where this starts to differ from main. Completely failed. 50% of my changes didn't even make it to the final product (which took me well over an hour). Idk man, I don't think I quite get how to do rebases just yet xD

Whatever's in this branch now works. I just want to end up with this but then without these double-changes commits. Should I just squash the whole branch to get it over with? Would be a shame, because it's an important refactor and a lot of history would be lost. But at least it'd be a straightforward history.

Oh yeah, PS, this would need an additional commit syncing with the latest version of policy-reasoner s.t. we work around the Gitlab syncing issue. Fails to do anything meaningful with Cargo otherwise.

@DanielVoogsgerd
Copy link
Collaborator

Yeah, this is not a fun rebase. I can take a look tomorrow, I just need a working version of the working tree so I know what we are working towards.

@Lut99
Copy link
Author

Lut99 commented May 6, 2025

The current state of lib-refactor passes my "integration tests" (i.e., can remotely run workflows and do data transfers, both to other workers and end users). If that's what you mean. Would be great if you can take a look :) but don't rush your deadline of course. Let me know if it's too much work, then I'll see what I can do on my end.

Copy link

@github-advanced-security github-advanced-security bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

devskim found more than 20 potential problems in the proposed changes. Check the Files changed tab for more details.

Copy link
Collaborator

@DanielVoogsgerd DanielVoogsgerd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am going to need a couple of passes through this code if I want to catch most of the "issues". This first pass is very course and more focussed on style and blatant issues. Once I am happy with where the code is at, I will move on to testing and running it to see what has happened from an architectural perspective.

I hope I can get a mental model of it all, but such a large changeset without documentation is really pushing/exceeding my boundaries if I want to form a proper idea what has changed and what all the implications are.

Feel free to leave the changes to me. A lot of this stuff is trivial and sometimes subjective. I will gladly make the changes, myself, but maybe it is nice to get some feedback on this stuff.

Finally, most of it actually looks quite good, and I really like some structural changes I am seeing, so I really feel like this is a change for the better.

@Lut99
Copy link
Author

Lut99 commented May 12, 2025

OK; see my comments. What do you think?

Copy link
Collaborator

@DanielVoogsgerd DanielVoogsgerd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A last couple of comments so I can fix them in #276

Comment on lines 601 to 334
struct CallFinder<'w> {
/// The workflow ID (for debugging)
wf_id: &'w str,
/// The task to find.
call: &'w str,
/// Whether we already found it or not.
found: bool,
}
impl<'w> CallFinder<'w> {
/// Constructor for the CallFinder.
///
/// # Arguments
/// - `wf_id`: The ID of the workflow we're asserting.
/// - `call`: The ID of the call to find.
///
/// # Returns
/// A new instance of Self, ready to sniff out the call!
#[inline]
fn new(wf_id: &'w str, call: &'w str) -> Self { Self { wf_id, call, found: false } }
}
impl<'w> Visitor<'w> for CallFinder<'w> {
type Error = Error;

#[inline]
fn visit_call(&mut self, elem: &'w ElemCall) -> Result<Option<&'w Elem>, Self::Error> {
// Check if it's the one
if self.call == elem.id {
if !self.found {
self.found = true;
} else {
return Err(Error::DuplicateCallId { workflow: self.wf_id.into(), call: elem.id.clone() });
}
}

// OK, continue
Ok(Some(&elem.next))
}
}

/// Asserts that the given task occurs exactly once in the workflow and that it has exactly one
/// input with the given name.
#[derive(Debug)]
struct CallInputFinder<'w> {
/// The workflow ID (for debugging)
wf_id: &'w str,
/// The task to find.
call: &'w str,
/// The input to find.
input: &'w str,
/// Whether we already found the call it or not.
found_call: bool,
}
impl<'w> CallInputFinder<'w> {
/// Constructor for the CallInputFinder.
///
/// # Arguments
/// - `wf_id`: The ID of the workflow we're asserting.
/// - `call`: The ID of the call to find.
/// - `input`: The ID of the input to the given call to find.
///
/// # Returns
/// A new instance of Self, ready to scooby the input to call.
#[inline]
fn new(wf_id: &'w str, call: &'w str, input: &'w str) -> Self { Self { wf_id, call, input, found_call: false } }
}
impl<'w> Visitor<'w> for CallInputFinder<'w> {
type Error = Error;

#[inline]
fn visit_call(&mut self, elem: &'w ElemCall) -> Result<Option<&'w Elem>, Self::Error> {
// Check if it's the one
if self.call == elem.id {
// It is, so mark it (or complain we've seen it before)
if !self.found_call {
self.found_call = true;
} else {
return Err(Error::DuplicateCallId { workflow: self.wf_id.into(), call: elem.id.clone() });
}

// Also verify the input exists in this call
let mut found_input: bool = false;
for i in &elem.input {
if self.input == i.id {
if !found_input {
found_input = true;
} else {
return Err(Error::DuplicateInputId { workflow: self.wf_id.into(), call: elem.id.clone(), input: i.id.clone() });
}
}
}
if !found_input {
return Err(Error::UnknownInputToCall { workflow: self.wf_id.into(), call: elem.id.clone(), input: self.input.into() });
}
}

// OK, continue
Ok(Some(&elem.next))
}
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this a Brane thing or a policy reasoner thing? I am not sure if policy reasoner wants to make these assertions of a workflow in general, or whether this is specific for our usecase.

Also, a finder doesn't match with what these visitors seem to do.

DanielVoogsgerd and others added 5 commits July 30, 2025 17:40
Furthermore, in order to do so:
- Move to main branch for policy store
- Remove removed generic from instantiated_path
Required an insert of a missing `Send` bound over in `policy-reasoner`
land.

Also, unintentionally, got a bunch of TOML formats now I switched
editors since last time, whoops
DanielVoogsgerd and others added 5 commits July 30, 2025 17:43
Because `tonic` `1.13.0` requires it, if I read this right
This because `policy-store` (which `specifications`, among others,
depends on) requires it.
It was both testing an unsupported test and using a wrong (and brittle)
test dir path
@Lut99
Copy link
Author

Lut99 commented Jul 30, 2025

OK hold your breath fellas

@Lut99
Copy link
Author

Lut99 commented Jul 30, 2025

(Assuming you can hold your breath for 15+ minutes)

@Lut99
Copy link
Author

Lut99 commented Jul 30, 2025

NO

@Lut99
Copy link
Author

Lut99 commented Jul 30, 2025

Of course now nightly clippy wakes up. Great.

@Lut99
Copy link
Author

Lut99 commented Jul 30, 2025

Time to hold your breath again

@Lut99
Copy link
Author

Lut99 commented Jul 31, 2025

image

This runner cannot be having a good time

@Lut99
Copy link
Author

Lut99 commented Jul 31, 2025

W-What...? It literally failed again after 19 minutes of compilation...?

@Lut99
Copy link
Author

Lut99 commented Jul 31, 2025

Guess I'll never go to sleep, huh

@Lut99
Copy link
Author

Lut99 commented Jul 31, 2025

... (I locally tested building build-eflint-repl only, not brane-chk. Take a lesson from that, kids, or something)

@Lut99
Copy link
Author

Lut99 commented Jul 31, 2025

Queue breath holding time *again*

@Lut99
Copy link
Author

Lut99 commented Jul 31, 2025

image image

Time to close this chapter 😎

@Lut99 Lut99 merged commit ad288df into main Jul 31, 2025
22 checks passed
@DanielVoogsgerd
Copy link
Collaborator

DanielVoogsgerd commented Jul 31, 2025

I will bookmark this under the name: Tim slowly rapidly losing his mind. Thanks for carrying this over the finish line.

It was a horrible refactor to land, but I think it is worth the struggle for Brane in the end. Both in removing the circular dependency (thank god), but also in the scope creep changes that inevitably made it into the PR. Thanks for all the hard work! ❤️

@Lut99
Copy link
Author

Lut99 commented Jul 31, 2025

Haha yeah it was an adventure alright. But it's in a really good shape now, and quite the milestone. Thanks to you too for all the hard work, patience and lessons!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants