Skip to content

Commit 18e133e

Browse files
authored
improvements (#2)
* add tests (not working yet) * tests do work * use yaml config * fix relpath issue * fix for .hpp * improve readme text * add "catch all" for the default policy Co-authored-by: Vasily Kulikov <[email protected]>
1 parent 67ae2c5 commit 18e133e

File tree

3 files changed

+204
-27
lines changed

3 files changed

+204
-27
lines changed

README.md

Lines changed: 58 additions & 10 deletions
Original file line numberDiff line numberDiff line change
@@ -27,28 +27,76 @@ The tool is simple to use. To sort your #include's just pass it your
2727

2828
```bash
2929
cmake -DCMAKE_EXPORT_COMPILE_COMMANDS=ON ...
30-
sort-cpp-includes --compile-commands compile_commands.json src/ include/
31-
sort-cpp-includes --compile-commands compile_commands.json src/ include/
30+
sort-cpp-includes --compile-commands compile_commands.json src/ include/ main.cpp
3231
```
3332

34-
`compile_commands.json` can be generated via `cmake` or vscode.
35-
After the successful run you might notice changes in your source files.
33+
`compile_commands.json` can be generated via `cmake` or `vscode`.
34+
After the successful run you might notice changes in your source files,
35+
if the sorting order was not met before.
3636

37+
The tool comes with a simple sorting policy: pair header, C headers, C++ headers,
38+
headers from /usr/include, the rest files. If it doesn't fit you, you may
39+
define your own policy and pass it to `sort-cpp-includes` using '-d' option.
40+
An example from Yandex.TPlatform config:
3741

38-
## Implementation
42+
```yaml
43+
rules:
44+
- matchers:
45+
- virtual: "@pair"
46+
47+
- matchers:
48+
- virtual: "@std-c"
49+
50+
- matchers:
51+
- virtual: "@std-cpp"
52+
53+
- matchers:
54+
- regex: "/usr/include/.*"
55+
56+
- matchers:
57+
- regex: ".*/third_party/.*"
58+
- regex: ".*/google-benchmark/.*"
59+
60+
- matchers:
61+
- regex: ".*/userver/.*"
62+
- matchers:
63+
- regex: ".*/build/.*"
64+
- matchers:
65+
- regex: ".*/libraries/.*"
66+
- matchers:
67+
- regex: ".*/services/.*"
68+
```
69+
70+
The sorting and grouping runs in 2 steps. First, the include group is calculated.
71+
Adjacent include groups will be separated with a single empty line.
72+
Second, the sorting takes place for each group, independently. Each matcher
73+
define a sort group. It may consist of one or more matchers.
74+
For now the following matchers are implemented:
75+
76+
* regex - a simple Python's `re` matcher
77+
* virtual: @pair - the pair header (`server.hpp` for `server.cpp`)
78+
* virtual: @std-c - standard C language headers
79+
* virtual: @std-cpp - standard C++ language headers (C++17)
80+
81+
You may add several matchers to the same group, it would mean "at least one matcher
82+
from the group must match". The first match wins, IOW, if a first group matcher
83+
matches the include, the rest groups are skipped.
84+
85+
## Implementation details
3986

4087
To properly split includes into groups separated by newlines, we have to know
4188
the full paths of the included header files. The full path is used to identify
42-
an include group according to the user policy. E.g. '#include <os.hpp>` may include
43-
`$PWD/os.hpp`, `/usr/include/os.hpp`, and so on. sort-cpp-includes doesn't
89+
an include group according to the user policy. E.g. `#include <os.hpp>` may include
90+
`$PWD/os.hpp`, `/usr/include/os.hpp`, and so on. `sort-cpp-includes` doesn't
4491
resolve includes by itself, it uses a C++ compiler's ability to preprocess
4592
your source files.
4693

47-
4894
## Errata
4995

5096
* Sourceless header files: you have to include each header into matched .cpp files
5197
at least once.
52-
* sort-cpp-includes was tested on clang only, so sorting with alternative compilers
98+
* `sort-cpp-includes` was tested on clang only, so sorting with alternative compilers
5399
might not work as expected.
54-
* only a single thread is used, but we could utilize more
100+
* Only a single thread is used, but we could utilize more
101+
* `sort-cpp-includes` stops searching include directives on the first non-include line
102+
except `#pragma once`.

src/sort_cpp_includes/sort_cpp_includes.py

Lines changed: 21 additions & 17 deletions
Original file line numberDiff line numberDiff line change
@@ -10,6 +10,7 @@
1010
import sys
1111
import tempfile
1212
import typing
13+
import yaml
1314

1415

1516
def read_file_contents(path: str) -> str:
@@ -40,6 +41,8 @@ def command_to_cmdline(command):
4041

4142

4243
def read_compile_commands(path: str) -> typing.Dict[str, CCEntry]:
44+
print('Loading compile_commands.json...')
45+
4346
compile_commands_raw = read_file_contents(path)
4447
compile_commands_json = json.loads(compile_commands_raw)
4548

@@ -51,6 +54,8 @@ def read_compile_commands(path: str) -> typing.Dict[str, CCEntry]:
5154
)
5255
for entry in compile_commands_json
5356
}
57+
58+
print('compile_commands.json is loaded.')
5459
return compile_commands
5560

5661

@@ -346,16 +351,7 @@ def is_match(self, path: str, orig_path: str, my_filename: str) -> bool:
346351
{'matchers': [{'virtual': '@std-c'}]},
347352
{'matchers': [{'virtual': '@std-cpp'}]},
348353
{'matchers': [{'regex': '/usr/include/.*'}]},
349-
{
350-
'matchers': [
351-
{'regex': '.*/third_party/.*'},
352-
{'regex': '.*/google-benchmark/.*'},
353-
],
354-
},
355-
{'matchers': [{'regex': '.*/userver/.*'}]},
356-
{'matchers': [{'regex': '.*/build/.*'}]},
357-
{'matchers': [{'regex': '.*/libraries/.*'}]},
358-
{'matchers': [{'regex': '.*/services/.*'}]},
354+
{'matchers': [{'regex': '.*'}]},
359355
],
360356
}
361357

@@ -435,11 +431,13 @@ def sort_includes(
435431
if match:
436432
break
437433
if match:
434+
# print(f'write {line}')
438435
res[i].append(line)
439436
break
440437

441438
if not match:
442439
raise Exception(f'Include "{line}" doesn\'t match any pattern')
440+
# print(res)
443441

444442
for group in res:
445443
group.sort(key=lambda x: (not x.endswith('.h>'), x))
@@ -477,6 +475,7 @@ def set(self, key: typing.Any, value: str) -> None:
477475

478476
# Returns the absolute path of a header from 'include_line'
479477
def include_realpath_cached(
478+
filepath: str,
480479
source_filepath: str,
481480
include_line: str,
482481
compile_commands: typing.Dict[str, CCEntry],
@@ -489,30 +488,32 @@ def include_realpath_cached(
489488
if entry:
490489
return entry
491490

492-
result = include_realpath(source_filepath, include_line, compile_commands)
491+
result = include_realpath(filepath, source_filepath, include_line, compile_commands)
493492
if result:
494493
realpath_cache.set(key, result)
495494
return result
496495

497496

498497
def include_realpath(
498+
filepath: str,
499499
source_filepath: str,
500500
include_line: str,
501501
compile_commands: typing.Dict[str, CCEntry],
502502
) -> str:
503-
filepath = os.path.abspath(source_filepath)
503+
filepath = os.path.abspath(filepath)
504+
source_filepath = os.path.abspath(source_filepath)
504505

505-
directory = os.path.dirname(source_filepath)
506+
directory = os.path.dirname(filepath)
506507
tmp = tempfile.NamedTemporaryFile(suffix='.cpp', dir=directory)
507508
tmp_name = tmp.name
508509
tmp.write(include_line.encode())
509510
tmp.write('\n'.encode())
510511
tmp.flush()
511512

512-
command = compile_commands.get(filepath)
513+
command = compile_commands.get(source_filepath)
513514
if not command:
514515
raise Exception(
515-
f'Failed to find "{filepath}" in compile_commands.json',
516+
f'Failed to find "{source_filepath}" in compile_commands.json',
516517
)
517518
command_items = adjust_cc_command(command) + [tmp_name]
518519

@@ -650,6 +651,7 @@ def do_handle_single_file(
650651
if not line.strip():
651652
continue
652653
abs_include = include_realpath_cached(
654+
filename,
653655
filename_for_cc, line, compile_commands, realpath_cache,
654656
)
655657

@@ -672,7 +674,7 @@ def do_handle_single_file(
672674
ofile.write('#pragma once\n\n')
673675
write_includes(sorted_includes, ofile)
674676

675-
for line in orig_file_contents[i:]:
677+
for line in orig_file_contents[i + 1 :]:
676678
ofile.write(line)
677679
ofile.write('\n')
678680
os.rename(src=tmp_filename, dst=filename)
@@ -684,7 +686,7 @@ def read_config(filepath: typing.Optional[str]) -> Config:
684686
return Config(DEFAULT_RULES)
685687

686688
with open(filepath, 'r') as ifile:
687-
contents = json.load(ifile)
689+
contents = yaml.safe_load(ifile)
688690
print(f'loaded {filepath} rule set')
689691
return Config(contents)
690692

@@ -703,13 +705,15 @@ def collect_files(
703705
def collect_all_files(
704706
paths: typing.List[str], suffixes: typing.List[str],
705707
) -> typing.List[str]:
708+
print('Collecting input files...')
706709
headers = []
707710
for filepath in paths:
708711
if os.path.isfile(filepath):
709712
headers.append(filepath)
710713
elif os.path.isdir(filepath):
711714
for header in collect_files(filepath, suffixes):
712715
headers.append(header)
716+
print(f'Collected files.')
713717
return headers
714718

715719

src/sort_cpp_includes/test_sort.py

Lines changed: 125 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,125 @@
1+
import typing
2+
import pytest
3+
import dataclasses
4+
import json
5+
6+
from . import sort_cpp_includes
7+
8+
9+
@dataclasses.dataclass(frozen=True)
10+
class FakeArgs:
11+
paths: typing.List[str]
12+
compile_commands: str
13+
config: str
14+
hpp_suffixes: str
15+
cpp_suffixes: str
16+
17+
18+
# TODO: ad-hoc
19+
COMPILER = '/usr/bin/clang++-9'
20+
21+
22+
def compose_compile_commands(args):
23+
result = []
24+
for arg in args:
25+
result.append({
26+
'command': ' '.join([f'{COMPILER}', '-c', arg, '-o', f'{arg}.o']),
27+
'directory': '/',
28+
'file': arg,
29+
},
30+
)
31+
return result
32+
33+
34+
@pytest.fixture
35+
def check(tmp_path):
36+
def _check(input_cpp: str, expected_output: str, rules: dict):
37+
input_fname = tmp_path / 'input.cpp'
38+
with open(input_fname, 'w') as ifile:
39+
ifile.write(input_cpp)
40+
41+
cc = compose_compile_commands([str(input_fname)])
42+
cc_fname = tmp_path / 'compile_commands.json'
43+
with open(cc_fname, 'w') as cc_file:
44+
cc_file.write(json.dumps(cc))
45+
46+
config_fname = tmp_path / 'sorting-rules.json'
47+
with open(config_fname, 'w') as config_file:
48+
config_file.write(json.dumps(rules))
49+
50+
args = FakeArgs(
51+
paths=[str(input_fname)],
52+
compile_commands=str(cc_fname),
53+
config=config_fname,
54+
hpp_suffixes='.hpp,.h',
55+
cpp_suffixes='.cpp,.cc',
56+
)
57+
sort_cpp_includes.process(args)
58+
59+
with open(input_fname, 'r') as ifile:
60+
contents = ifile.read()
61+
62+
# print(contents)
63+
# print(expected_output)
64+
assert contents.strip() == expected_output.strip()
65+
66+
return _check
67+
68+
69+
def test_one_group(tmp_path, check):
70+
input_cpp="""
71+
#include <iostream>
72+
#include <atomic>
73+
#include <cstdio>
74+
"""
75+
76+
expected_output="""#include <atomic>
77+
#include <cstdio>
78+
#include <iostream>
79+
80+
"""
81+
82+
rules = {
83+
'rules': [
84+
{'matchers': [
85+
{'regex': '.*'},
86+
],
87+
},
88+
],
89+
}
90+
check(input_cpp, expected_output, rules)
91+
92+
93+
def test_multiple_groups(tmp_path, check):
94+
input_cpp="""
95+
#include <iostream>
96+
#include <time.h>
97+
#include <atomic>
98+
#include <locale.h>
99+
#include <cstdio>
100+
#include <stdio.h>
101+
#include <signal.h>
102+
"""
103+
104+
expected_output="""
105+
#include <locale.h>
106+
#include <signal.h>
107+
#include <stdio.h>
108+
#include <time.h>
109+
110+
#include <atomic>
111+
#include <cstdio>
112+
#include <iostream>
113+
"""
114+
115+
rules = {
116+
'rules': [
117+
{'matchers': [{'virtual': '@std-c'}]},
118+
{'matchers': [{'virtual': '@std-cpp'}]},
119+
{'matchers': [
120+
{'regex': '.*'},
121+
],
122+
},
123+
],
124+
}
125+
check(input_cpp, expected_output, rules)

0 commit comments

Comments
 (0)