Commit e5617e8

Add profanity scanning to Code Quality checks
1 parent e8ea224 commit e5617e8

4 files changed, +187 -0 lines changed

.github/workflows/content-checks.yml

Lines changed: 11 additions & 0 deletions
@@ -59,6 +59,17 @@ jobs:
           name: spellcheck-output
           path: spellcheck-output.txt
           retention-days: 5 # Default is 90 days
+      - name: Scan for profanities
+        run: |
+          python tools/profanity.py
+          cat profanity_log.txt
+
+      - name: Export profanities
+        uses: actions/upload-artifact@v4
+        with:
+          name: profanities
+          path: profanity_log.txt
+          retention-days: 5
 
       - name: Scan for malware
         run: |

.profanity_ignore.yml

Lines changed: 26 additions & 0 deletions
@@ -0,0 +1,26 @@
+XX
+-kill
+kill
+KVM
+kvm
+X.X.X
+.xx.
+.xxx.
+naked
+facial
+Facial
+screw
+len
+LEN
+test
+TEST
+Test
+--strip-
+(x=x
+**VM
+Kill
+slave
+Slave
+A55
+a55
+455

profanity_log.txt

Lines changed: 111 additions & 0 deletions
@@ -0,0 +1,111 @@
+Profanity found in file: ./content/install-guides/perf.md
+Profanities found: dummy
+
+Profanity found in file: ./content/learning-paths/embedded-and-microcontrollers/tfm/tfm.md
+Profanities found: <NUL>TFM_DUMMY_PROVISIONING, TFM_DUMMY_PROVISIONING, dummy
+
+Profanity found in file: ./content/learning-paths/embedded-and-microcontrollers/advanced_soc/connecting_peripheral.md
+Profanities found: #IO_L3N_T0_DQS_AD1N_35
+
+Profanity found in file: ./content/learning-paths/embedded-and-microcontrollers/cmsisdsp-dev-with-python/how-to-4.md
+Profanities found: vmax=[np.max(l)
+
+Profanity found in file: ./content/learning-paths/embedded-and-microcontrollers/training-inference-pytorch/fine-tune-2.md
+Profanities found: outputs.max(1)
+
+Profanity found in file: ./content/learning-paths/cross-platform/kleidiai-explainer/page1.md
+Profanities found: *z*
+
+Profanity found in file: ./content/learning-paths/cross-platform/ipexplorer/custom1.md
+Profanities found: m4x1_cache_2kb, m4x1_nocache_ws0, m4x1_nocache_ws4, m4x1_cache_64kb
+
+Profanity found in file: ./content/learning-paths/cross-platform/vectorization-friendly-data-layout/a-more-complex-problem-revisited-sve.md
+Profanities found: objects->vz[i], &objects->vz[i]);, box_hi_v);, &objects->vz[i],
+
+Profanity found in file: ./content/learning-paths/cross-platform/vectorization-friendly-data-layout/a-more-complex-problem-revisited.md
+Profanities found: objects->vz[i], vnegq_f32(box_hi_v);, box_hi_v);, vld1q_f32(&objects->vz[i]);, vst1q_f32(&objects->vz[i],
+
+Profanity found in file: ./content/learning-paths/cross-platform/vectorization-friendly-data-layout/a-more-complex-problem-manual-simd.md
+Profanities found: vnegq_f32(box_hi_v);, box_hi_v);
+
+Profanity found in file: ./content/learning-paths/cross-platform/adler32/neon-run-8.md
+Profanities found: Test**:
+
+Profanity found in file: ./content/learning-paths/cross-platform/psa-tfm/run.md
+Profanities found: CVM
+
+Profanity found in file: ./content/learning-paths/mobile-graphics-and-gaming/kleidiai-on-android-with-mediapipe-and-xnnpack/2-run-gemma-2b.md
+Profanities found: --cxxopt=-DABSL_FLAGS_STRIP_NAMES=0
+
+Profanity found in file: ./content/learning-paths/mobile-graphics-and-gaming/get-started-with-arm-asr/04-generic_library.md
+Profanities found: **1**,
+
+Profanity found in file: ./content/learning-paths/mobile-graphics-and-gaming/android_opencv_facedetection/background.md
+Profanities found: Facial
+
+Profanity found in file: ./content/learning-paths/mobile-graphics-and-gaming/best-practices-for-hwrt-lumen-performance/8-lumen-general.md
+Profanities found: color=#00FF00>**1**</font>
+
+Profanity found in file: ./content/learning-paths/mobile-graphics-and-gaming/build-android-selfie-app-using-mediapipe-multimodality/6-flow-data-to-view-1.md
+Profanities found: Paint.Style.STROKE, LANDMARK_STROKE_WIDTH
+
+Profanity found in file: ./content/learning-paths/iot/iot-sdk/openiot.md
+Profanities found: <NUL>TFM_DUMMY_PROVISIONING, TFM_DUMMY_PROVISIONING, dummy
+
+Profanity found in file: ./content/learning-paths/servers-and-cloud-computing/vLLM-quant/3-run-benchmark.md
+Profanities found: metrics.py:455]
+
+Profanity found in file: ./content/learning-paths/servers-and-cloud-computing/arcee-foundation-model-on-aws/02_setting_up_the_instance.md
+Profanities found: G++**:
+
+Profanity found in file: ./content/learning-paths/servers-and-cloud-computing/from-iot-to-the-cloud-part3/how-to-2.md
+Profanities found: **1**, **Dev/Test**
+
+Profanity found in file: ./content/learning-paths/servers-and-cloud-computing/glibc-with-lse/mongo_benchmark.md
+Profanities found: 455
+
+Profanity found in file: ./content/learning-paths/servers-and-cloud-computing/triggering-pmu-events/icache.md
+Profanities found: 455
+
+Profanity found in file: ./content/learning-paths/servers-and-cloud-computing/wordpress/wordpress.md
+Profanities found: --strip
+
+Profanity found in file: ./content/learning-paths/servers-and-cloud-computing/redis/single-node_deployment.md
+Profanities found: **-h**, **-p**
+
+Profanity found in file: ./content/learning-paths/servers-and-cloud-computing/java-gc-tuning/setup.md
+Profanities found: 21.0.4+7-LTS,, 21.0.4+7-LTS)
+
+Profanity found in file: ./content/learning-paths/servers-and-cloud-computing/java-gc-tuning/tuning-parameters.md
+Profanities found: 21.0.4+7-LTS,, 21.0.4+7-LTS)
+
+Profanity found in file: ./content/learning-paths/servers-and-cloud-computing/arm_linux_page_size/centos.md
+Profanities found: Strip
+
+Profanity found in file: ./content/learning-paths/servers-and-cloud-computing/rag/backend.md
+Profanities found: "").strip()
+
+Profanity found in file: ./content/learning-paths/servers-and-cloud-computing/multiarch_ollama_on_gke/0-spin_up_gke_cluster.md
+Profanities found: **1**)., **1**
+
+Profanity found in file: ./content/learning-paths/servers-and-cloud-computing/from-iot-to-the-cloud-part1/how-to-4.md
+Profanities found: **vm-arm64**, **vm-arm64**,
+
+Profanity found in file: ./content/learning-paths/servers-and-cloud-computing/from-iot-to-the-cloud-part1/how-to-2.md
+Profanities found: **vm-arm64**
+
+Profanity found in file: ./content/learning-paths/servers-and-cloud-computing/mysql/install_mysql.md
+Profanities found: Manual](https://dev.mysql.com/doc/refman/8.1/en/), options](https://dev.mysql.com/doc/refman/8.1/en/installing.html)., documentation](https://dev.mysql.com/doc/refman/8.1/en/mysqld-server.html),, [instructions](https://dev.mysql.com/doc/refman/8.1/en/mysql.html)
+
+Profanity found in file: ./content/learning-paths/servers-and-cloud-computing/llama-vision/backend.md
+Profanities found: f"<|user|>\n{prompt.strip()}<|end_of_text|>\n", token.strip():
+
+Profanity found in file: ./content/learning-paths/servers-and-cloud-computing/llama-vision/frontend.md
+Profanities found: data.strip(), line.strip(), user_prompt.strip()
+
+Profanity found in file: ./content/learning-paths/servers-and-cloud-computing/arm_pmu/perf_event_open.md
+Profanities found: configure_event(&pe[3],
+
+Profanity found in file: ./content/learning-paths/servers-and-cloud-computing/distributed-inference-with-llama-cpp/_index.md
+Profanities found: Aryan
+

tools/profanity.py

Lines changed: 39 additions & 0 deletions
@@ -0,0 +1,39 @@
+import os
+from better_profanity import profanity
+import argparse
+import sys
+def load_excluded_words(file_path):
+    with open(file_path, 'r') as f:
+        excluded_words = [word.strip() for word in f.readlines()]
+    return excluded_words
+
+def scan_for_profanities(directory, log_file, excluded_words_file=None):
+    exclude_words = None
+    profanities = None
+    if excluded_words_file:
+        exclude_words = load_excluded_words(excluded_words_file)
+
+    with open(log_file, 'w') as f:
+        for root, dirs, files in os.walk(directory):
+            for file in files:
+                if file.endswith('.md'): # Read only markdown files
+                    file_path = os.path.join(root, file)
+                    with open(file_path, 'r') as code_file:
+                        profanities = None
+                        content = code_file.read()
+                        if exclude_words:
+                            excluded_content = content
+                            for word in exclude_words:
+                                excluded_content = excluded_content.replace(word, '')
+                            if profanity.contains_profanity(excluded_content):
+                                profanities = set(word for word in excluded_content.split() if profanity.contains_profanity(word))
+                        else:
+                            if profanity.contains_profanity(content):
+                                profanities = set(word for word in content.split() if profanity.contains_profanity(word))
+                        if profanities:
+                            f.write(f"Profanity found in file: {file_path}\n")
+                            f.write("Profanities found: ")
+                            f.write(", ".join(profanities))
+                            f.write("\n\n")
+
+scan_for_profanities("./content/", "./profanity_log.txt", excluded_words_file="./.profanity_ignore.yml")
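
Note on the exclusion step in tools/profanity.py: ignore-list entries are removed with `str.replace`, which is a substring replacement, so a short entry such as `len` or `test` is also deleted from inside longer words before the text is split and checked. The script also joins a `set`, so the order of words within a log line is not fixed. The following is only a minimal sketch of a token-level alternative, not the commit's method. It assumes exact-match exclusion is acceptable, which would change how substring-style entries such as `--strip-` or `.xx.` behave. `find_flagged_tokens` and `demo_is_flagged` are hypothetical names; `demo_is_flagged` is a stand-in for `profanity.contains_profanity` so the sketch runs without third-party packages.

```python
def find_flagged_tokens(content, is_flagged, ignored=()):
    """Return a sorted list of whitespace-separated tokens that are flagged.

    Tokens that exactly match an ignore-list entry are skipped; the rest of
    the text is left untouched (no substring deletion).
    """
    ignored_set = set(ignored)
    flagged = set()
    for token in content.split():
        if token in ignored_set:
            continue  # exact-match exclusion only
        if is_flagged(token):
            flagged.add(token)
    # Sorting makes the log output stable from run to run.
    return sorted(flagged)


def demo_is_flagged(token):
    # Hypothetical stand-in predicate, used only to keep the sketch
    # self-contained; a real scan would call the profanity library here.
    return token.lower() in {"dummy"}


if __name__ == "__main__":
    sample = "a dummy value and a test dummy"
    print(find_flagged_tokens(sample, demo_is_flagged, ignored=["test"]))
```

In the script, the same idea would amount to passing `profanity.contains_profanity` as `is_flagged` and the loaded ignore list as `ignored`.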
