@heeplr
Contributor

@heeplr heeplr commented Nov 25, 2023

This PR adds 3 new targets to the Makefile:

  • format-python - that uses black to format all python code within src
  • format-cpp - that uses clang-format to format C/C++ files within src
  • format - to combine the above targets

The .clang-format config is based on @rmu75's work, with minor changes and many options disabled. The goal is to

  • introduce as few changes to the source as possible while
  • gaining readability and
  • gaining uniformity

I'd like to suggest the following steps:

  1. test the output of the formatting and adjust the config accordingly.
  2. reformat the whole codebase in one giant commit
  3. wait for dev feedback and adjust the config accordingly, if needed
  4. reformat the whole codebase again, if needed
  5. add some action to the CI that checks/adjusts/enforces formatting in future PRs

@andypugh
Collaborator

I am not sure that I have an opinion on this.
Whilst the inconsistency in the file formatting is annoying, a massive re-format would make it practically impossible to do any meaningful code-comparison between the code before the re-format and after.
I would like to hear opinions from actual proper programmers before doing this. (I realise that this commit only makes the major change possible, it doesn't actually do it)

@heeplr
Contributor Author

heeplr commented Dec 14, 2023

I'd also be interested in opinions from the community.
For reasons of trust, I'd suggest that you run the target and make the commit yourself (once there's broad consensus on everything).

a massive re-format would make it practically impossible to do any meaningful code-comparison between the code before the re-format and after.

Not impossible, but it certainly adds a step. It's a price to pay and I currently don't know any way around it.
That said, I wouldn't say the cost is too high, and it won't last forever (at some point, really old code no longer needs to be diffed).

In the meantime, there are

  • Tools like astdiff that can deal with reformatting and only show real semantic changes.
  • git diff --ignore-all-space --ignore-blank-lines ... which is useful to clean diffs a bit when no heavy reformatting took place.
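
As a rough illustration of what such a whitespace-insensitive comparison does, and where it stops helping, here is a minimal Python sketch. It is not git's implementation: `strip_ws` and `semantic_diff` are hypothetical helper names, and dropping all whitespace and blank lines before diffing only approximates `--ignore-all-space --ignore-blank-lines`.

```python
import difflib


def strip_ws(line):
    """Remove every whitespace character from a line (rough analogue of --ignore-all-space)."""
    return "".join(line.split())


def semantic_diff(old, new):
    """Diff two source texts while ignoring whitespace and blank lines.

    Illustration only: this approximates the git flags, it is not how git works.
    """
    old_lines = [strip_ws(l) for l in old.splitlines() if l.strip()]
    new_lines = [strip_ws(l) for l in new.splitlines() if l.strip()]
    return list(difflib.unified_diff(old_lines, new_lines, lineterm="", n=0))


before = "if (x){\n\tfoo( a,b );\n}\n"
respaced = "if (x) {\n    foo(a, b);\n\n}\n"   # only spacing and blank lines changed
rewrapped = "if (x)\n{\n    foo(a, b);\n}\n"    # brace moved to its own line

print(semantic_diff(before, respaced))    # empty list: nothing but whitespace changed
print(semantic_diff(before, rewrapped))   # non-empty: line breaks moved
```

The second case is why the flags only help "when no heavy reformatting took place": once a formatter moves line breaks, a line-based diff reports changes again, and an AST-level tool is needed.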

@rmu75
Contributor

rmu75 commented Dec 14, 2023

In theory, for code comparisons, it should be possible to pipe the old code through clang-format before creating the diff, but I'm not sure that can be automated with git / GitHub.
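
A minimal Python sketch of that idea, assuming both revisions of the file are already available as strings (for example via `git show <rev>:<path>`). The function names are hypothetical. The default formatter shells out to `clang-format`, reading from stdin and passing `--assume-filename` so that the language and the `.clang-format` file can be determined. Whether this can be wired into git or GitHub automatically remains an open question.

```python
import difflib
import subprocess


def clang_format(source, filename="example.cc"):
    """Format source text with clang-format (must be installed and on PATH).

    --assume-filename tells clang-format which language and which
    .clang-format file to use when the input arrives on stdin.
    """
    result = subprocess.run(
        ["clang-format", f"--assume-filename={filename}"],
        input=source, capture_output=True, text=True, check=True,
    )
    return result.stdout


def diff_normalized(old, new, formatter=clang_format):
    """Run both versions through the same formatter, then diff the results."""
    old_fmt = formatter(old).splitlines()
    new_fmt = formatter(new).splitlines()
    return list(difflib.unified_diff(old_fmt, new_fmt, "old", "new", lineterm=""))
```

Because `formatter` is a parameter, the same function works with any deterministic formatter. If the two versions differ only in layout, the result is an empty list.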

@smoe
Collaborator

smoe commented Dec 30, 2023

I am convinced that source code formatting is important to attract new devs and to make a good first impression on whoever is making decisions to adopt LinuxCNC. So, I would go for it. Concerning diffs, I think we need to talk with GitHub about that.

@petterreinholdtsen
Collaborator

petterreinholdtsen commented Jan 2, 2024 via email

@heeplr
Contributor Author

heeplr commented Jan 2, 2024

introduce any formatting errors in small steps
do any reformattings as separate commits

What's the advantage here? I can't see any, and I think it makes a nasty process even more complicated.

once any code needing reformatting was touched for other reasons.

Isn't that exactly what you want to prevent? It's very easy to confirm a commit that has only reformatting (astdiff must be empty) and you'd want a clean diff without format changes for any commit with semantic changes.

Personally I don't care about diffing very old code with properly formatted newer code. It's a solvable problem and a rare edge case.

needing reformatting

I have yet to find a part of the code (.c, .cpp) that is properly formatted. I'm using vscodium and pretty much every file contains wrong/misleading indentation.
For Python it's easy, since PEP 8 is the official formatting standard, which lots of our files violate.

@petterreinholdtsen
Collaborator

petterreinholdtsen commented Jan 2, 2024 via email

@heeplr
Contributor Author

heeplr commented Jan 2, 2024

One advantage is that 'git blame' and 'git log'

Yes, that's basically the problem @andypugh mentioned. It affects all reformatting that isn't white space only.

But I can't see an advantage of many small commits instead of a single big commit. It's just shifting the same problem into the future. Plus it adds logistics like "please separate your reformatting commit and your semantic change commit before submitting a PR" (how to communicate that?). And code review needs to pull out astdiff over and over again until everything is according to standard eventually.
Also style can't be enforced until this process ends eventually, so new PRs can introduce wrong formatting which starts this hellish loop again.
There are parts of the code that haven't been touched in years. They will never be reformatted then.

If the code line changes anyway, it does not make much difference to 'git blame' and 'git log' if its indentation also changes.

But that line will drown in other lines where just the formatting changed if commits are not cleanly separated.

Or are you suggesting to format just parts of a file? omg that would introduce even more mess and can't be automated. Contributors would need to do the work of clang-format manually. I really couldn't see that in practice.

Which code style do you use as your baseline for 'properly formatted'?

Short answer: "properly formatted" is not a matter of choosing a style or ruleset, but rather anything that's

1. enforceable in CI

and

2. consistent across the codebase.

Longer answer: Currently, the .clang-format config of this PR is based on @rmu75's work, with minor changes and many options disabled. The goal is to

  • introduce as few changes to the source as possible while
  • gaining readability and
  • gaining uniformity

I tried to stick to the clang-format defaults as closely as possible with respect to the above aspects.
With black, no config is needed (I just glanced quickly at the resulting Python files and saw no problem; Python developers might introduce custom settings/exceptions).

The final ruleset will hopefully be a result of this discussion, and there's no trouble changing it in the future and reformatting everything again. But I really don't have an opinion on the final style as long as it's consistent and enforceable.

docs/src/code/style-guide.adoc and src/CodingStyle* are slightly different, so it is a bit unclear to me what the target formatting should be.

Yeah, those would be obsolete (they obviously are already) and might be replaced by a short note like "run make format before committing".
If the CI would just re-format every commit, which is possible iirc, the styleguides can be removed completely as IDEs could use the .clang-format or pyproject.toml so contributors don't need to care about formatting at all (who reads styleguides anyway?).

@petterreinholdtsen
Collaborator

petterreinholdtsen commented Jan 2, 2024 via email

@heeplr
Contributor Author

heeplr commented Jan 2, 2024

I have no idea how attached the project is to those style guides.

Working with the code until now gave me a pretty good idea: No one gives a darn. ;)

@smoe
Collaborator

smoe commented Jan 3, 2024

IMHO it all depends on what we want to achieve with our code. The doxygen changes (#2827) and the code formatting both help to make LinuxCNC look more professional and likely help new devs to become productive more quickly. I personally also see a certain beauty in an indentation that perfectly and reliably matches the semantics. Anything else to me is like a typo in the documentation, or worse. You just want to fix this.

Git blame and diff can work across multiple revisions, so I am not too worried. Also, I am more of a bisect person than a diff person. And if one is truly interested in performing a diff against a particular branch, one can still apply the formatting to that branch/revision, no matter how old it is.

I very much get that anyone deep in the development of LinuxCNC will not necessarily care about the beauty of the code. The community contributing the reformatting and an automation of that process via git hooks should be happily embraced, though.

@andypugh
Collaborator

andypugh commented Jan 7, 2024

I haven't found much code where the indentation is wrong, as long as you experiment a bit to see what the original coder wanted to do. The main culprits here are the files that mix tabs and spaces; these look wrong until you set tabs to 8 spaces.
I agree that this isn't ideal, but neither is it a huge problem, given how rarely anyone visits those corners of the codebase.

I feel that losing the "blame" view in GitHub would be a significant sacrifice. It's super-useful for things that bisect can't do, like checking if changes properly propagated, and if so, to which version.

@heeplr
Contributor Author

heeplr commented Feb 13, 2024

I haven't found much code where the indentation is wrong

This is not just about indentation. Formatting also includes braces placement, (non)spacing etc.
This is to make all code look the same everywhere. Also automating it will make it future-proof in case anyone comes along, messing up the style even further.

as long as you experiment a bit to see what the original coder wanted to do.

Yeah, I don't really feel like experimenting. And I don't think anyone should be bothered with such a time waster. It's certainly not a good strategy in terms of working on the project.

The main culprits here are the files that mix tabs and spaces; these look wrong until you set tabs to 8 spaces.

It's not the "main culprit". It's one flaw among many.

I agree that this isn't ideal, but neither is it a huge problem,

I never said it's a huge problem, but it's probably the lowest-hanging fruit on the way to a clean & modern codebase.
I just don't see why small problems should stay unsolved when they are fundamental.

given how rarely anyone visits those corners of the codebase.

Understandable. They're not very inviting.

I feel that losing the "blame" view in GitHub would be a significant sacrifice. It's super-useful for things that bisect can't do, like checking if changes properly propagated, and if so, to which version.

I guess that's what --ignore-rev <rev> was made for. From the git man page:

--ignore-rev <rev>
           Ignore changes made by the revision when assigning blame, as if the change never happened.
           [...]

If you care about the GitHub web interface, please consider this solution.

Please remember:
You are worrying about losing the ability to read the code nicely when running git blame, which is an edge case,
while this whole thing is about readability in ALL cases. Git blame will also look better after this.

And I'd really like to stress (again) how backwards the current state is.
Every major FOSS project that's old enough will go through this eventually. Most have already. (Take OpenSSL as just one example. It took them multiple big commits until things had settled.) This is a common milestone and it certainly isn't a huge problem.

IMHO this discussion should focus on "how it's done", not "if it's done". This is just my suggestion for a first draft to tackle the problem. Considering the other, much larger issues in the LinuxCNC codebase, which require a lot of work and a lot of important decisions, something like "automating code formatting (rule enforcement)" should be easy.

@smoe
Collaborator

smoe commented Nov 9, 2025

Hello @heeplr,

It took a while, but in today's spontaneous video meet-up our review of PRs had us agree that any automation that gets our source tree into a nicely formatted, consistent state would indeed be good. The only concern is that such a dominating set of changes will ruin what git blame shows. Now, according to https://stackoverflow.com/questions/34957237/can-i-configure-git-blame-to-always-ignore-certain-commits-want-to-fix-git-blam the commits to ignore can be set via git config. But will GitHub then also know about that? If that is settled, and I have properly understood our discussion, then I think your work will be happily accepted.

@rmu75
Contributor

rmu75 commented Nov 10, 2025

Should be easy to just test what is suggested in the solution heeplr linked

It is possible to instruct both GitHub's blame view and git blame to ignore revisions (basically, put the commit hashes into .git-blame-ignore-revs), and it is possible to override that ignore (add a ~ to the URL, see link).
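
A minimal Python sketch of how the big reformatting commit could be recorded, assuming a SHA-1 repository and that the file lives at the repository root under the name mentioned above. `record_ignored_rev` is a hypothetical helper. The file format (one full commit hash per line, `#` for comments) is the one read by `git blame --ignore-revs-file`. Local `git blame` additionally needs `git config blame.ignoreRevsFile .git-blame-ignore-revs` once per clone.

```python
import re
from pathlib import Path

# Assumption: SHA-1 object names, i.e. 40 lowercase hex digits, unabbreviated.
FULL_SHA1 = re.compile(r"[0-9a-f]{40}")


def record_ignored_rev(repo_root, commit_hash, note):
    """Append a commit to .git-blame-ignore-revs with an explanatory comment."""
    if not FULL_SHA1.fullmatch(commit_hash):
        raise ValueError("expected a full 40-character commit hash")
    ignore_file = Path(repo_root) / ".git-blame-ignore-revs"
    with ignore_file.open("a", encoding="utf-8") as fh:
        fh.write(f"# {note}\n{commit_hash}\n")
    return ignore_file
```

Rejecting abbreviated hashes up front avoids an entry that blame would not accept later.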

@smoe
Collaborator

smoe commented Nov 11, 2025

I missed that. Thank you, @rmu75 . No immediate idea about how to proceed from here, though.

@c-morley
Collaborator

Is the idea here to add continuous automatic reformatting, or an option to reformat as needed or ??

The first one sounds pretty horrible and the second seems troublesome too. What are we trying to fix? I feel the medicine might be worse than the sickness.

@rmu75
Contributor

rmu75 commented Nov 11, 2025

At least on the C/C++ side, linuxcnc source code is pretty consistent. IMO it would make sense to reformat the code that doesn't adhere to style conventions in one commit (which will be small) and automatically apply clang-format upon commit / pull request. That could also make editing code in modern IDEs like VS code/codium more convenient because they can automatically use clang-format settings and nobody has to think about it any more.

I don't know about python code. Personally I use black to have consistent python style, but I admit the result can look strange.

@andypugh
Collaborator

At least on the C/C++ side, linuxcnc source code is pretty consistent.

I am not sure that I agree with this. Many files are formatted with mixed spaces and tabs, whereas others are all tabs or all spaces.

@rmu75
Contributor

rmu75 commented Nov 12, 2025

At least on the C/C++ side, linuxcnc source code is pretty consistent.

I am not sure that I agree with this. Many files are formatted with mixed spaces and tabs, whereas others are all tabs or all spaces.

At least it looks consistent if the expected tab length is used (4, I think).

@c-morley
Collaborator

Could you explain this more please:
'automatically apply clang-format upon commit / pull request. '

@smoe
Collaborator

smoe commented Nov 12, 2025

Could you explain this more please: 'automatically apply clang-format upon commit / pull request. '

I did not read about any auto-applied formatting in the PR. But repeated white-space commits are to be avoided, hence we need some way to a) format everything how it should be in a single huge commit (to then be masked from git blame) and b) only accept properly formatted commits afterwards. No idea if any git hook would be beneficial or if just the CI should fail.

I am confident we could all benefit from a consistent coding style introduced this way. It’s not something that will directly fix existing issues, but I expect it would make the codebase easier to navigate and maintain. Even if it doesn’t immediately attract new contributors, it could make reviews and external engagement smoother and strengthen the overall impression of LinuxCNC as a well-structured and professionally maintained project.

Could you explain what exactly your concerns are?

@rmu75
Contributor

rmu75 commented Nov 12, 2025

Could you explain this more please: 'automatically apply clang-format upon commit / pull request. '

I would do it with a git pre-commit hook like FreeCAD, explained here https://freecad.github.io/DevelopersHandbook/codeformatting/

That would guarantee consistent formatting.

@c-morley
Collaborator

Thanks for replying.

First, reasonably consistent formatting is a worthy goal. Getting rid of all tabs would be great IMO.
That said I also don't think we need to have perfectly consistent formatting.

For instance, we have a version of classicladder in the LinuxCNC code. It would be great to have its code consistently formatted, but it doesn't have to have the exact same style as the mesa hostmot2 code.

I do a lot of Python coding. While I could surely use some more consistent formatting, I certainly don't want to follow PEP 8 perfectly.

I absolutely don't want automatic format changing for code I commit and I don't want code automatically blocked because it doesn't perfectly follow some formatting rules.
If you want to add tests with warnings, that would probably be helpful.
If I cherry pick code from an on going branch I'm working on and we automatically 'fix' it, and then later I try to merge the branches, I'm sure that would be a mess (or just annoying) to figure out.

As the developers that have commit authority, we are the ones who should decide if the formatting is so bad it needs to be fixed.

But the truth is we are so short on developers and pull requesters that we can ill afford to chase away code that fixes a problem but doesn't perfectly follow a formatting style. I've seen people so annoyed by the need to rewrite a pull request because of multiple trivial requests that they never try again.

If we want to fix formatting in the code, let's fix it in one big push and then leave it alone.
There are far better things to spend time on than automatic authoritarianism :)

@rmu75
Contributor

rmu75 commented Nov 13, 2025

I certainly do not want to impose anything upon anybody. But. Some of the C++ code is really hard to follow in github or with wrong tab size configured (e.g. interp_G7x.cc or those nested switch/case statements in emctask*.cc) so IMHO "fixing" that will really lower the barrier to understanding and participation. I don't have nor want commit access, nevertheless after kludging zmq into emctask and tinkering with the TP I would argue that there are indeed places where the situation is bad enough it needs to be fixed ;-)

Even fixing only the GitHub display should be reason enough. Just as an example, this is unreadable on GitHub:

class straight_segment:public segment {
public:
    straight_segment(double sx, double sz,double ex, double ez):
	segment(sz,sx,ez,ex) {}
    straight_segment(std::complex<double> s,std::complex<double> e):
	segment(s,e) {}
    void intersection_z(double x, intersections_t &is) override;
    bool climb(std::complex<double>&,motion_base*) override;
    bool dive(std::complex<double>&,double, motion_base*,bool) override;
    void climb_only(std::complex<double>&,motion_base*) override;
    void draw(motion_base *out) override { out->straight_move(end); }
    void offset(double distance) override {
	std::complex<double> d=I*distance*(start-end)/abs(start-end);
	start+=d;
	end+=d;
    }
    void intersect(segment *p) override;
    void intersect_end(round_segment *p) override;
    void intersect_end(straight_segment *p) override;
    std::unique_ptr<segment> dup() override {
	return std::make_unique<straight_segment>(*this);
    }
    double radius() override { return abs(start-end); }
};

This code was formatted with an indentation of 4 characters and a tab size of 8 characters, whereas GitHub seems to advance a tab by only 4 spaces.
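
The effect is easy to reproduce. A minimal Python sketch, using two lines from the class above and the standard `str.expandtabs`; `indent_width` is a hypothetical helper name:

```python
def indent_width(line, tabsize):
    """Visual indentation of a line when tabs advance to multiples of tabsize."""
    expanded = line.expandtabs(tabsize)
    return len(expanded) - len(expanded.lstrip(" "))


constructor = "    straight_segment(double sx, double sz,double ex, double ez):"
initializer = "\tsegment(sz,sx,ez,ex) {}"

for tabsize in (8, 4):
    print(tabsize, indent_width(constructor, tabsize), indent_width(initializer, tabsize))
# 8 4 8   -> the initializer is nested one level deeper, as intended
# 4 4 4   -> both lines collapse to the same level
```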

I'm less sure about the consequences of using something like "black" on Python code; e.g. using it on axis generates a large number of changes from 'single quote style strings' to "double quote style strings": 1 file changed, 1671 insertions(+), 908 deletions(-)

AFAIU a pre-commit hook would happen on the developer machine, so if one were to use that it would assure consistent formatting among all branches from that dev. I don't know how to deal with cherry-picking or old branches. In theory, it should be possible to apply the formatting to the old branch / cherry-picked-commit before merging, but I don't know if that works in practice or if there is some automated support of that in git or github.

Automatic formatting would take care of "not perfectly following a formatting style". OTOH the requirement to install some pre-commit tooling is an additional hurdle. Failing CI because of formatting / style problems is a bad idea IMO. In personal projects, I configure emacs / VS Code to automatically invoke clang-format or black, respectively, when saving a file.

Styling should probably not be applied to imported code like classicladder (but last I looked upgrading classicladder would amount to completely re-do the integration again because of extensive changes on both sides since branching off the common ancestor).

I will take a closer look how FreeCAD implements per-subsystem opt-in/opt-out to automatic formatting.

@heeplr
Contributor Author

heeplr commented Nov 13, 2025

I do a lot of Python coding. While I could surely use some more consistent formatting, I certainly don't want to follow PEP 8 perfectly.

standardization > personal taste
You can now contribute to the final set of python formatting rules and I guess if people agree, it won't differ from your current style preference.
But in the end, those rules might not meet the taste of every current or future python developer, but it will be a fixed standard for this codebase. The former is less important than the latter.

I absolutely don't want automatic format changing for code I commit and I don't want code automatically blocked because it doesn't perfectly follow some formatting rules.

Why not? The upsides outnumber the downsides and nothing is blocking you from re-formatting to your own taste automagically after each pull/commit and using your own style. Just make sure to run the provided formatter before committing (could be automated).
Although I think it's strange to jump through hoops just to use one's own style preferences, it's perfectly possible.

If you want to add tests with warnings, that would probably be helpful.

I'd say that another constantly failing test, which no one has the time to fix, is a very bad idea.

As the developers that have commit authority, we are the ones who should decide if the formatting is so bad it needs to be fixed.

Exactly. But you decide that once when you agree on rules. Not for each and every commit or PR.

But the truth is we are so short on developers [...] that we can ill afford to chase away code that fixes a problem but doesn't perfectly follow a formatting style.

You could offer a nice way to format correctly so no one has to worry about that. Ever again.
Currently that's not something that is done automatically for contributors, and it's each contributor's own responsibility to get it right. A lot of the burden can be lifted from contributors thanks to auto-formatting, linting, git hooks etc.
It's a good thing.

As mentioned above I'd argue the opposite:
Even reading the whole code is a hassle, partly because of style.
Not to mention trying to follow the style guide when writing code. Those hassles might chase away new code, and at the very least they don't make contributing easier or more attractive.

I've seen people so annoyed by the need to rewrite a pull request because of some multiple trivial requests that they never try again.

Do you have an example for frustration about wrong style? Normally this is done automatically. (If you use some common style, changes are subtle most of the time.)

If we want to fix formatting in the code, let's fix it in one big push and then leave it alone. There are far better things to spend time on than automatic authoritarianism :)

What about the future then? Would you volunteer to fix wrong style regularly?

It just makes so much sense to automate this.
I don't know all projects that use automated (enforced) style standards, but those I know would never want to look back, I'm 100% sure.

@heeplr
Contributor Author

heeplr commented Nov 13, 2025

@rmu75

AFAIU a pre-commit hook would happen on the developer machine

If github rejects a push with style-breaking commits, one could certainly set up a pre-commit hook.
Or configure the IDE accordingly.

The important thing here imho are clear instructions. If there's a nice message like Please fix your stylebreaking commits by running those commands: ... the hurdle is quite low.
And of course everyone can still decide about the exact way to get the style right.

Failing CI because of formatting / style problems is a bad idea IMO

IMHO a double check in CI wouldn't hurt if there's a mechanism to ensure that failing CI is the exception, caused only by weird edge cases or bugs. If a git hook ensures coherent style, that CI test will probably never fail unless the hook is buggy or not triggered.
Also tests that can be run locally would check uncommitted code, which might make sense.
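
A minimal Python sketch of such a check, usable from a local hook or from CI. The helper names and the suffix list are assumptions, not part of this PR. The only external calls are `clang-format --dry-run --Werror` (available in recent clang-format releases) and `black --check`, both of which exit non-zero when a file would be changed.

```python
import subprocess
from pathlib import Path

CPP_SUFFIXES = {".c", ".cc", ".cpp", ".h", ".hh", ".hpp"}  # assumed list


def check_command(path):
    """Return the formatter invocation that verifies path, or None to skip it."""
    suffix = Path(path).suffix
    if suffix in CPP_SUFFIXES:
        return ["clang-format", "--dry-run", "--Werror", str(path)]
    if suffix == ".py":
        return ["black", "--check", "--quiet", str(path)]
    return None


def unformatted(paths, run=subprocess.run):
    """Return the subset of paths whose formatter check fails."""
    bad = []
    for path in paths:
        cmd = check_command(path)
        if cmd is not None and run(cmd).returncode != 0:
            bad.append(str(path))
    return bad


def report(bad):
    """Build the message shown to the contributor (empty when all is well)."""
    if not bad:
        return ""
    return "Please run `make format` and commit again; unformatted files:\n" + "\n".join(bad)
```

The check never rewrites files itself. It only names the offending files and points at the one command that fixes them, which keeps the hurdle low.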

@andypugh
Collaborator

I am not sure that I agree with this. Many files are formatted with mixed spaces and tabs, whereas others are all tabs or all spaces.

at least it looks consistent if the expected tab length is used (4 I think).

Have a look at src/hal/drivers/hal_motenc

That is one of many files formatted for 8-space tabs where the indent levels are

space space space space
tab
space space space space tab
tab tab

and so on.

They look sort-of OK at 4 spaces per tab, but better at 8.
The problem comes when they have been edited with tab=4-spaces by subsequent developers.

I am unsure if the code-tidying tools can handle this case.
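
Whether a given file falls into this category can be checked mechanically before any formatter is involved. A minimal Python sketch; `indent_style` and its return labels are hypothetical:

```python
def indent_style(text):
    """Classify the leading whitespace used in a source text.

    Returns "tabs", "spaces", "mixed" (both appear somewhere in the
    indentation) or "none" (no indented lines at all).
    """
    uses_tabs = uses_spaces = False
    for line in text.splitlines():
        # Leading run of spaces and tabs on this line.
        lead = line[: len(line) - len(line.lstrip(" \t"))]
        uses_tabs = uses_tabs or "\t" in lead
        uses_spaces = uses_spaces or " " in lead
    if uses_tabs and uses_spaces:
        return "mixed"
    if uses_tabs:
        return "tabs"
    if uses_spaces:
        return "spaces"
    return "none"


sample = "if (a) {\n    if (b) {\n\tfoo();\n    }\n}\n"
print(indent_style(sample))  # mixed
```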

@rmu75
Contributor

rmu75 commented Nov 13, 2025

Mixing tabs and spaces is really very annoying, doing indentation with mixed spaces/tabs on the same line is evil.

I am unsure if the code-tidying tools can handle this case.

clang-format is a compiler that produces source code as output.

What I want to say is that clang-format is not some dumb tool that superficially does some guesswork like IDEs used to do but it really understands the code. You can remove/add non-essential whitespace however you like and it will always reformat into the same result. (At least it can be configured to do that afaiu, maybe there is also a "don't care mode" that can leave some stuff as is).

@smoe
Collaborator

smoe commented Nov 13, 2025

For instance, we have a version of classicladder in the LinuxCNC code. It would be great to have its code consistently formatted, but it doesn't have to have the exact same style as the mesa hostmot2 code.

This, like any other external code, is to me actually a good example of where I would indeed see problems with reformatting, since the challenge would be to adopt updates from the upstream source tree, which is external to us.

@mozmck
Collaborator

mozmck commented Nov 13, 2025

I would agree with c-morley. I have had the experience a couple of times where a fix was kicked back multiple times for minor code style issues. It was not automated, and the admin spent more time kicking it back than it would have taken him to fix the petty stuff himself, and I wound up spending more time fixing them than I did fixing the bug in the first place. This to me is a far greater turnoff than minor differences in code style.
