Commit 6fae3aa

avar authored and ttaylorr committed
spatchcache: add a ccache-alike for "spatch"
Add a rather trivial "spatchcache", with this running e.g.:

	make cocciclean
	make contrib/coccinelle/free.cocci.patch \
		SPATCH=contrib/coccinelle/spatchcache \
		SPATCH_FLAGS=--very-quiet

Is cut down from ~20s to ~5s on my system. Much of that is either
fixable shell overhead, or the around 40 files we "CANTCACHE" (see the
implementation).

This uses "redis" as a cache by default, but it's configurable. See
the embedded documentation.

This is *not* like ccache in that we won't cache failed spatch
invocations, or those where spatch suggests changes for us. Those
cases are so rare that I didn't think it was worth the bother, by far
the most common case is that it has no suggested changes.

We'll also refuse to cache any "spatch" invocation that has output on
stderr, which means that "--very-quiet" must be added to
"SPATCH_FLAGS".

Because we narrow the cache to that we don't need to save away stdout,
stderr & the exit code. We simply cache the cases where we had no
suggested changes.

Another benchmark is to compare this with the previous
SPATCH_BATCH_SIZE=N, as noted in [1]. Before this (on my 8 core
system) running:

	make clean; time make contrib/coccinelle/array.cocci.patch SPATCH_BATCH_SIZE=0

Would take 33s, but with the preceding changes running without this
"spatchcache" is slightly slower, or around 35s:

	make clean; time make contrib/coccinelle/array.cocci.patch

Now doing the same with SPATCH=contrib/coccinelle/spatchcache will
take around 6s, but we'll need to compile the *.o files first to take
full advantage of it (which can be fast with "ccache"):

	make clean; make; time make contrib/coccinelle/array.cocci.patch SPATCH=contrib/coccinelle/spatchcache

1. https://lore.kernel.org/git/[email protected]/

Signed-off-by: Ævar Arnfjörð Bjarmason <[email protected]>
Signed-off-by: Taylor Blau <[email protected]>
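
The caching rule described in the message (cache only runs that succeeded, suggested no changes, and wrote nothing to stderr) can be condensed into a single predicate. The following is a minimal sketch of that rule, not code taken from this commit; the function name `spatch_nocache_reason` and its argument order are made up for illustration.

```shell
# Hypothetical helper (not part of this commit): print why a finished
# "spatch" run must not be cached, or print nothing if it is cacheable.
#   $1: exit code of spatch
#   $2: file holding its stdout (the suggested patch, if any)
#   $3: file holding its stderr
#   $4: non-empty to tolerate stderr output
spatch_nocache_reason () {
	if test "$1" != 0
	then
		echo "exited non-zero: $1"
	elif test -s "$2"
	then
		echo "had patch output"
	elif test -z "$4" && test -s "$3"
	then
		echo "had stderr"
	fi
}
```

An empty result means the run is a candidate for the cache; any printed reason means it should be passed through uncached.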
1 parent d0e624a commit 6fae3aa

2 files changed: +324 -0 lines changed

contrib/coccinelle/README

Lines changed: 20 additions & 0 deletions
@@ -70,3 +70,23 @@ Git-specific tips & things to know about how we run "spatch":
    my_name", v.s. anonymous "@@") needs to be unique across all our
    *.cocci files. You should only need to name rules if other rules
    depend on them (currently only one rule is named).
+
+ * To speed up incremental runs even more use the "spatchcache" tool
+   in this directory as your "SPATCH". It aims to be a "ccache" for
+   coccinelle, and piggy-backs on "COMPUTE_HEADER_DEPENDENCIES".
+
+   It caches in Redis by default, see its source for a how-to.
+
+   In one setup with a primed cache "make coccicheck" followed by a
+   "make clean && make" takes around 10s to run, but 2m30s with the
+   default of "SPATCH_CONCAT_COCCI=Y".
+
+   With "SPATCH_CONCAT_COCCI=" the total runtime is around 6m, sped
+   up to ~1m with "spatchcache".
+
+   Most of the 10s (or ~1m) is spent on re-running "spatch" on
+   files we couldn't cache, as we didn't compile them (in contrib/*
+   and compat/* mostly).
+
+   The absolute times will differ for you, but the relative speedup
+   from caching should be on that order.
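
To illustrate the piggy-backing on "COMPUTE_HEADER_DEPENDENCIES": the script finds the headers to hash by keeping only the lines of the compiler-generated dependency file that consist of a bare "name:" target. The sketch below assumes a dependency file in the style gcc and clang emit with "-MMD -MP" (one empty target per header); the sample file contents and the helper name `list_header_deps` are invented for illustration.

```shell
# Hypothetical helper: list the headers named in a Makefile-style
# dependency file, using the same grep/tr pipeline as spatchcache.
list_header_deps () {
	grep -E -o '^[^:]+:$' "$1" | tr -d ':'
}

# Invented sample resembling a .depend/grep.o.d (assumption: "-MP"-style
# output with one empty "header:" target per header).
dep=$(mktemp)
cat >"$dep" <<\EOF
grep.o: grep.c git-compat-util.h \
 grep.h

git-compat-util.h:

grep.h:
EOF

list_header_deps "$dep"
rm -f "$dep"
```

For this sample the pipeline prints the two header names, one per line. The first rule ("grep.o: ...") is skipped because its line does not end in a colon, which is why the source file itself is passed to the hash separately.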

contrib/coccinelle/spatchcache

Lines changed: 304 additions & 0 deletions
@@ -0,0 +1,304 @@
+#!/bin/sh
+#
+# spatchcache: a poor-man's "ccache"-alike for "spatch" in git.git
+#
+# This caching command relies on the peculiarities of the Makefile
+# driving "spatch" in git.git, in particular if we invoke:
+#
+#	make
+#	# See "spatchCache.cacheWhenStderr" for why "--very-quiet" is
+#	# used
+#	make coccicheck SPATCH_FLAGS=--very-quiet
+#
+# We can with COMPUTE_HEADER_DEPENDENCIES (auto-detected as true with
+# "gcc" and "clang") write e.g. a .depend/grep.o.d for grep.c, when we
+# compile grep.o.
+#
+# The .depend/grep.o.d will have the full header dependency tree of
+# grep.c, and we can thus cache the output of "spatch" by:
+#
+#	1. Hashing all of those files
+#	2. Hashing our source file, and the *.cocci rule we're
+#	   applying
+#	3. Running spatch; if it suggests no changes (by far the common
+#	   case) we invoke "spatchCache.getCmd" and
+#	   "spatchCache.setCmd" with a "git hash-object" ID to ask "does
+#	   this ID have no changes" or "say that ID had no changes"
+#	4. If no "spatchCache.{set,get}Cmd" is specified we'll use
+#	   "redis-cli" and maintain a SET called "spatch-cache". Set
+#	   appropriate redis memory policies to keep it from growing
+#	   out of control.
+#
+# This along with the general incremental "make" support for
+# "contrib/coccinelle" makes it viable to (re-)run coccicheck
+# e.g. when merging integration branches.
+#
+# Note that the "--very-quiet" flag is currently critical. The cache
+# will refuse to cache anything that has output on STDERR (which might
+# be errors from spatch), but see spatchCache.cacheWhenStderr below.
+#
+# The STDERR (and exit code) could in principle be cached (as with
+# ccache), but then the simple structure in the Redis cache would need
+# to change, so just supply "--very-quiet" for now.
+#
+# To use this, simply set SPATCH to
+# contrib/coccinelle/spatchcache. Then optionally set:
+#
+#	[spatchCache]
+#		# Optional: path to a custom spatch
+#		spatch = ~/g/coccicheck/spatch.opt
+#
+# As well as this trace config (debug implies trace):
+#
+#		cacheWhenStderr = true
+#		trace = false
+#		debug = false
+#
+# The ".depend/grep.o.d" can also be customized, as a string that will
+# be eval'd, it has access to a "$dirname" and "$basename":
+#
+#	[spatchCache]
+#		dependFormat = "$dirname/.depend/${basename%.c}.o.d"
+#
+# Setting "trace" to "true" allows for seeing when we have a cache HIT
+# or MISS. To debug whether the cache is working do that, and run e.g.:
+#
+#	redis-cli FLUSHALL
+#	<make && make coccicheck, as above>
+#	grep -hore HIT -e MISS -e SET -e NOCACHE -e CANTCACHE .build/contrib/coccinelle | sort | uniq -c
+#	    600 CANTCACHE
+#	   7365 MISS
+#	   7365 SET
+#
+# A subsequent "make cocciclean && make coccicheck" should then have
+# all "HIT"'s and "CANTCACHE"'s.
+#
+# The "spatchCache.cacheWhenStderr" option is critical when using
+# spatchCache.{trace,debug} to debug whether something is set in the
+# cache, as we'll write to the spatch logs in .build/* we'd otherwise
+# always emit a NOCACHE.
+#
+# Reading the config can make the command much slower, to work around
+# this the config can be set in the environment, with environment
+# variable name corresponding to the config key. "default" can be used
+# to use whatever's the script default, e.g. setting
+# spatchCache.cacheWhenStderr=true and deferring to the defaults for
+# the rest is:
+#
+#	export GIT_CONTRIB_SPATCHCACHE_DEBUG=default
+#	export GIT_CONTRIB_SPATCHCACHE_TRACE=default
+#	export GIT_CONTRIB_SPATCHCACHE_CACHEWHENSTDERR=true
+#	export GIT_CONTRIB_SPATCHCACHE_SPATCH=default
+#	export GIT_CONTRIB_SPATCHCACHE_DEPENDFORMAT=default
+#	export GIT_CONTRIB_SPATCHCACHE_SETCMD=default
+#	export GIT_CONTRIB_SPATCHCACHE_GETCMD=default
+
+set -e
+
+env_or_config () {
+	env="$1"
+	shift
+	if test "$env" = "default"
+	then
+		# Avoid expensive "git config" invocation
+		return
+	elif test -n "$env"
+	then
+		echo "$env"
+	else
+		git config "$@" || :
+	fi
+}
+
+## Our own configuration & options
+debug=$(env_or_config "$GIT_CONTRIB_SPATCHCACHE_DEBUG" --bool "spatchCache.debug")
+if test "$debug" != "true"
+then
+	debug=
+fi
+if test -n "$debug"
+then
+	set -x
+fi
+
+trace=$(env_or_config "$GIT_CONTRIB_SPATCHCACHE_TRACE" --bool "spatchCache.trace")
+if test "$trace" != "true"
+then
+	trace=
+fi
+if test -n "$debug"
+then
+	# debug implies trace
+	trace=true
+fi
+
+cacheWhenStderr=$(env_or_config "$GIT_CONTRIB_SPATCHCACHE_CACHEWHENSTDERR" --bool "spatchCache.cacheWhenStderr")
+if test "$cacheWhenStderr" != "true"
+then
+	cacheWhenStderr=
+fi
+
+trace_it () {
+	if test -z "$trace"
+	then
+		return
+	fi
+	echo "$@" >&2
+}
+
+spatch=$(env_or_config "$GIT_CONTRIB_SPATCHCACHE_SPATCH" --path "spatchCache.spatch")
+if test -n "$spatch"
+then
+	if test -n "$debug"
+	then
+		trace_it "custom spatchCache.spatch='$spatch'"
+	fi
+else
+	spatch=spatch
+fi
+
+dependFormat='$dirname/.depend/${basename%.c}.o.d'
+dependFormatCfg=$(env_or_config "$GIT_CONTRIB_SPATCHCACHE_DEPENDFORMAT" "spatchCache.dependFormat")
+if test -n "$dependFormatCfg"
+then
+	dependFormat="$dependFormatCfg"
+fi
+
+set=$(env_or_config "$GIT_CONTRIB_SPATCHCACHE_SETCMD" "spatchCache.setCmd")
+get=$(env_or_config "$GIT_CONTRIB_SPATCHCACHE_GETCMD" "spatchCache.getCmd")
+
+## Parse spatch()-like command-line for caching info
+arg_sp=
+arg_file=
+args="$*"
+spatch_opts() {
+	while test $# != 0
+	do
+		arg_file="$1"
+		case "$1" in
+		--sp-file)
+			arg_sp="$2"
+			;;
+		esac
+		shift
+	done
+}
+spatch_opts "$@"
+if ! test -f "$arg_file"
+then
+	arg_file=
+fi
+
+hash_for_cache() {
+	# Parameters that should affect the cache
+	echo "args=$args"
+	echo "config spatchCache.spatch=$spatch"
+	echo "config spatchCache.debug=$debug"
+	echo "config spatchCache.trace=$trace"
+	echo "config spatchCache.cacheWhenStderr=$cacheWhenStderr"
+	echo
+
+	# Our target file and its dependencies
+	git hash-object "$1" "$2" $(grep -E -o '^[^:]+:$' "$3" | tr -d ':')
+}
+
+# Sanity checks
+if ! test -f "$arg_sp" && ! test -f "$arg_file"
+then
+	echo "$0: no idea how to cache $*" >&2
+	exit 128
+fi
+
+# Main logic
+dirname=$(dirname "$arg_file")
+basename=$(basename "$arg_file")
+eval "dep=$dependFormat"
+
+if ! test -f "$dep"
+then
+	trace_it "$0: CANTCACHE have no '$dep' for '$arg_file'!"
+	exec "$spatch" "$@"
+fi
+
+if test -n "$debug"
+then
+	trace_it "$0: The full cache input for '$arg_sp' '$arg_file' '$dep'"
+	hash_for_cache "$arg_sp" "$arg_file" "$dep" >&2
+fi
+sum=$(hash_for_cache "$arg_sp" "$arg_file" "$dep" | git hash-object --stdin)
+
+trace_it "$0: processing '$arg_file' with '$arg_sp' rule, and got hash '$sum' for it + '$dep'"
+
+getret=
+if test -z "$get"
+then
+	if test "$(redis-cli SISMEMBER spatch-cache "$sum")" = 1
+	then
+		getret=0
+	else
+		getret=1
+	fi
+else
+	getret=0
+	$get "$sum" || getret=$?
+fi
+
+if test "$getret" = 0
+then
+	trace_it "$0: HIT for '$arg_file' with '$arg_sp'"
+	exit 0
+else
+	trace_it "$0: MISS: for '$arg_file' with '$arg_sp'"
+fi
+
+out="$(mktemp)"
+err="$(mktemp)"
+
+set +e
+"$spatch" "$@" >"$out" 2>>"$err"
+ret=$?
+cat "$out"
+cat "$err" >&2
+set -e
+
+nocache=
+if test $ret != 0
+then
+	nocache="exited non-zero: $ret"
+elif test -s "$out"
+then
+	nocache="had patch output"
+elif test -z "$cacheWhenStderr" && test -s "$err"
+then
+	nocache="had stderr (use --very-quiet or spatchCache.cacheWhenStderr=true?)"
+fi
+
+if test -n "$nocache"
+then
+	trace_it "$0: NOCACHE ($nocache): for '$arg_file' with '$arg_sp'"
+	exit "$ret"
+fi
+
+trace_it "$0: SET: for '$arg_file' with '$arg_sp'"
+
+setret=
+if test -z "$set"
+then
+	if test "$(redis-cli SADD spatch-cache "$sum")" = 1
+	then
+		setret=0
+	else
+		setret=1
+	fi
+else
+	setret=0
+	$set "$sum" || setret=$?
+fi
+
+if test "$setret" != 0
+then
+	echo "FAILED to set '$sum' in cache!" >&2
+	exit 128
+fi
+
+exit "$ret"
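
The script header says the Redis default can be replaced through "spatchCache.getCmd" and "spatchCache.setCmd". Each command is run with the hash as its argument, and an exit status of 0 is taken to mean a hit (get) or a successful store (set). A minimal sketch of a file-based backend under that contract follows. Everything in it is an assumption rather than part of this commit: the function names, the `SPATCH_FS_CACHE_DIR` variable, and the one-empty-file-per-hash layout. In real use each function body would need to live in its own executable so that it can be named in the config.

```shell
# Hypothetical file-based cache backend (not part of this commit).
# Assumption: one empty marker file per cached hash under this directory.
SPATCH_FS_CACHE_DIR=${SPATCH_FS_CACHE_DIR:-${TMPDIR:-/tmp}/spatch-fs-cache}

# "get": succeed (exit 0) only if the hash was stored earlier.
spatch_fs_get () {
	test -f "$SPATCH_FS_CACHE_DIR/$1"
}

# "set": record that the run identified by this hash had no changes.
spatch_fs_set () {
	mkdir -p "$SPATCH_FS_CACHE_DIR" &&
	: >"$SPATCH_FS_CACHE_DIR/$1"
}
```

Unlike the Redis SET, nothing here evicts old entries, so a directory like this would need occasional manual pruning.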
