Commit 5fa02d7

StaticRocket authored and praneethbajjuri committed
ci(check-files): replace bash with faster python
Apply some tricks to speed up this lookup. Use proper regex escaping on file names to prevent false positives. Also filter out files to prevent picking up self-references in comments.

Signed-off-by: Randolph Sapp <[email protected]>
1 parent 0e4cc52 commit 5fa02d7
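
The commit message's point about regex escaping can be illustrated with a small sketch. It is not part of the commit: the helper names `naive_match` and `escaped_match` and the sample RST line are assumptions made up for the example. It shows how an unescaped `.` in a file name can match unrelated text, and how `re.escape` avoids that.

```python
import re


def naive_match(name: str, text: str) -> bool:
    """Treat the file name as a raw regex, so '.' matches any character."""
    return re.search(name, text) is not None


def escaped_match(name: str, text: str) -> bool:
    """Escape the file name first, so '.' only matches a literal dot."""
    return re.search(re.escape(name), text) is not None


# Hypothetical RST content that never mentions "board.png".
rst_text = ".. image:: images/board_png_old.svg"

print(naive_match("board.png", rst_text))    # True (false positive)
print(escaped_match("board.png", rst_text))  # False
```

Without escaping, `board.png` would be reported as used even though no file references it, which is the kind of false positive the commit message describes.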

3 files changed: 86 additions, 18 deletions
.github/workflows/check-files.yml

Lines changed: 4 additions & 4 deletions
@@ -32,23 +32,23 @@ jobs:
           git fetch --no-tags --depth=1 origin master
           git switch master

-      - name: Run check-files.sh
+      - name: Run check_files.py
         run: |
           # Disable color output
           export NO_COLOR=true

           # Run the test
-          bin/delta.sh -a master -b pr -- ./bin/check-files.sh
+          bin/delta.sh -a master -b pr -- ./bin/check_files.py

           # Prepare summary
           WARNING_COUNT=$(wc -l < _new-warn.log)
           if [ "$WARNING_COUNT" -gt "0" ]; then
-            echo "New unreachable files found with check-files.sh:"
+            echo "New unreachable files found with check_files.py:"
             echo '```text'
             cat _new-warn.log
             echo '```'
           else
-            echo "No new unreachable files found with check-files.sh"
+            echo "No new unreachable files found with check_files.py"
           fi >> "$GITHUB_STEP_SUMMARY"

           # Prepare the artifacts

bin/check-files.sh

Lines changed: 0 additions & 14 deletions
This file was deleted.

bin/check_files.py

Lines changed: 82 additions & 0 deletions
@@ -0,0 +1,82 @@
+#!/usr/bin/env python3
+
+"""Tool to check that all files are being used
+
+SPDX-License-Identifier: MIT
+Copyright (C) 2025 Texas Instruments Incorporated - https://www.ti.com
+"""
+
+import logging
+import re
+from pathlib import Path
+
+logger = logging.getLogger(__name__)
+
+SOURCE_PATH = Path("source/")
+RST_SOURCE = set(SOURCE_PATH.glob("**/*.rst"))
+IGNORED = re.compile(r"([^_].*\.rst)|(version\.txt)")
+
+
+def get_names(base):
+    """Get a set of file names to check for, ignoring anything that matches the IGNORED regex.
+
+    :param base: Pathlib path to directory to search
+    :return: Set of string path names
+    """
+    files_to_check = set()
+    for file in base.glob("**/*"):
+        if file.is_dir():
+            continue
+
+        name = file.name
+        if IGNORED.match(name):
+            logger.debug("Ignored: %s", name)
+            continue
+
+        files_to_check.add(name)
+    return files_to_check
+
+
+def check_file(string, file):
+    """Check to see if the given string appears in the file.
+
+    :param string: String to look up
+    :param file: Pathlib path to file
+    :return: Boolean based on presence of string
+    """
+    pattern = re.compile(re.escape(string))
+    text = file.read_text(encoding="utf-8")
+    for _ in pattern.finditer(text):
+        return True
+    return False
+
+
+def check_all(string):
+    """Scan for any matches in RST_SOURCE files. Do not look for matches in the file itself.
+    That last bit is particularly relevant for RST files that exist to be included in other files.
+
+    :param string: String to look up
+    :return: Boolean based on presence of string in any other files
+    """
+    for file in RST_SOURCE:
+        if file.name == string:  # compare names; a Path never equals a str
+            continue
+
+        if check_file(string, file):
+            return True
+    return False
+
+
+def main():
+    """Main CLI entrypoint"""
+    logging.basicConfig(level=logging.INFO)
+
+    files_to_check = get_names(SOURCE_PATH)
+    for filename in files_to_check:
+        if check_all(filename):
+            continue
+        logging.info("File not used: %s", filename)
+
+
+if __name__ == "__main__":
+    main()
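
To make the filtering behaviour of the script easier to follow, here is a minimal, self-contained sketch run against a temporary directory. It is an illustration under stated assumptions, not the commit's code: `is_ignored` and `used_elsewhere` are hypothetical helpers, the three sample files are invented, and a plain substring test stands in for the escaped regex search (the two are equivalent for literal names). Because a `pathlib.Path` never compares equal to a plain `str`, the sketch compares `Path.name` with the name being looked up when skipping a file's own content.

```python
import re
import tempfile
from pathlib import Path

# Same pattern as IGNORED in bin/check_files.py.
IGNORED = re.compile(r"([^_].*\.rst)|(version\.txt)")


def is_ignored(name: str) -> bool:
    """Hypothetical helper: True if the name would be skipped by the check."""
    return IGNORED.match(name) is not None


def used_elsewhere(name: str, rst_files: list[Path]) -> bool:
    """Hypothetical helper: True if `name` appears in an RST file other than itself."""
    for rst in rst_files:
        if rst.name == name:  # skip self-references (str compared with str)
            continue
        if name in rst.read_text(encoding="utf-8"):
            return True
    return False


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    # Invented sample tree: one page, one included file, one orphan.
    (root / "index.rst").write_text(".. include:: _used.rst\n", encoding="utf-8")
    (root / "_used.rst").write_text("Shared text.\n", encoding="utf-8")
    (root / "_orphan.rst").write_text(
        ".. _orphan.rst is kept for reference\n", encoding="utf-8"
    )
    rst_files = sorted(root.glob("**/*.rst"))

    results = {
        name: used_elsewhere(name, rst_files)
        for name in ("_used.rst", "_orphan.rst")
    }

print(is_ignored("index.rst"), is_ignored("_used.rst"), is_ignored("version.txt"))
# True False True
print(results)
# {'_used.rst': True, '_orphan.rst': False}
```

The orphan file mentions its own name in a comment, yet it is still reported as unused because its own content is excluded from the search.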
