Commit 671944d

Cut down on exponential growth in checking switch exhaustivity.

The problem

Assume 'A' and 'B' are enums with cases .a1, .a2, etc. and .b1, .b2, etc. If we try to typecheck this switch:

    switch (a, b) {
    case (.a1, .b1),
         (.a1, .b2):
      break
    ...

The compiler is going to try to perform the following series of operations:

    > var s = (A, B)
    > s -= (.a1, .b1)
    ((A - .a1, B) | (A, B - .b1))
    > s -= (.a1, .b2)
    (((A - .a1, B) | (A - .a1, B - .b2)) | ((A - .a1, B - .b1) | (A, B - .b1 - .b2)))
    ...

As you can see, the disjunction representing the uncovered space is growing exponentially. Eventually some of these will start disappearing (for instance, if B only has .b1 and .b2, that last term can go away), and if the switch is exhaustive they can /all/ start disappearing. But several of them are also redundant: the second and third cases are fully covered by the first.

I exaggerated a little: the compiler is already smart enough to know that the second case is redundant with the first, because it already knows that (.a1, .b2) isn't a subset of (A - .a1, B). However, the third and fourth cases are generated separately from the first two, and so nothing ever checks that the third case is also redundant.

This patch changes the logic for subtracting from a disjunction so that

1. any resulting disjunctions are flattened, and
2. any later terms that are subspaces of earlier terms are dropped

This is a quadratic algorithm in the worst case (compare every space against every earlier space), but because it saves us from exponential growth (or at least brings down the exponent) it's worth it. For the test case now committed in the repository, we went from 40 seconds (using -switch-checking-invocation-threshold=20000000 to avoid cutting off early) to 0.2 seconds.

I'll admit I was only timing this one input, and it's possible that other complex switches will not see the same benefit, or may even see a slowdown. But I do think this kind of switch is common in both hand-written and auto-generated code, and therefore this is likely to be a benefit in many places.

rdar://problem/47365349

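
The two steps described in the commit message (flatten, then drop later terms covered by earlier ones) can be sketched on a toy model. This is only an illustration under stated assumptions, not the compiler's code: `Term`, `minusPattern`, and `subtract` are hypothetical names, a space is modeled as a pair of bitmasks (one bit per still-uncovered case of 'A' and of 'B'), and only two-element tuples are handled.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical toy model: one product term (casesOfA, casesOfB),
// where each set bit is an enum case that is still uncovered.
struct Term {
  std::uint32_t a;
  std::uint32_t b;
  bool empty() const { return a == 0 || b == 0; }
  // True when every value in this term is also in `other`.
  bool isSubspaceOf(const Term &other) const {
    return (a & ~other.a) == 0 && (b & ~other.b) == 0;
  }
};

// term - pattern, following the commit message:
//   (A, B) - (.a1, .b1) == (A - .a1, B) | (A, B - .b1)
// A pattern that does not overlap the term removes nothing.
std::vector<Term> minusPattern(const Term &t, const Term &pat) {
  if ((t.a & pat.a) == 0 || (t.b & pat.b) == 0)
    return {t};
  std::vector<Term> out;
  Term left{t.a & ~pat.a, t.b};
  Term right{t.a, t.b & ~pat.b};
  if (!left.empty()) out.push_back(left);
  if (!right.empty()) out.push_back(right);
  return out;
}

// Subtract a pattern from a disjunction of terms. With prune == true,
// later terms that are subspaces of earlier kept terms are dropped.
std::vector<Term> subtract(const std::vector<Term> &terms, const Term &pat,
                           bool prune) {
  std::vector<Term> flat;
  for (const Term &t : terms) {
    std::vector<Term> diff = minusPattern(t, pat);
    // Step 1: flatten, so the result is a single list of terms.
    flat.insert(flat.end(), diff.begin(), diff.end());
  }
  if (!prune)
    return flat;

  // Step 2: quadratic pass that keeps a term only if no earlier
  // kept term already contains it.
  std::vector<Term> useful;
  for (const Term &t : flat) {
    bool alreadyHandled =
        std::any_of(useful.begin(), useful.end(),
                    [&](const Term &prev) { return t.isSubspaceOf(prev); });
    if (!alreadyHandled)
      useful.push_back(t);
  }
  return useful;
}
```

With four cases in each enum (masks `0xF`), subtracting (.a1, .b1) and then (.a1, .b2) leaves three terms without pruning and two with it, because (A - .a1, B - .b1) is contained in (A - .a1, B).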
1 parent f5a8f26 commit 671944d

2 files changed, +239 -4 lines changed

lib/Sema/TypeCheckSwitchStmt.cpp

Lines changed: 27 additions & 4 deletions
@@ -506,12 +506,35 @@ namespace {
       PAIRCASE (SpaceKind::Disjunct, SpaceKind::UnknownCase): {
         SmallVector<Space, 4> smallSpaces;
         for (auto s : this->getSpaces()) {
-          if (auto diff = s.minus(other, TC, DC, minusCount))
-            smallSpaces.push_back(*diff);
-          else
+          auto diff = s.minus(other, TC, DC, minusCount);
+          if (!diff)
             return None;
+          if (diff->getKind() == SpaceKind::Disjunct) {
+            smallSpaces.append(diff->getSpaces().begin(),
+                               diff->getSpaces().end());
+          } else {
+            smallSpaces.push_back(*diff);
+          }
         }
-        return Space::forDisjunct(smallSpaces);
+
+        // Remove any of the later spaces that are contained entirely in an
+        // earlier one. Since we're not sorting by size, this isn't
+        // guaranteed to give us a minimal set, but it'll still reduce the
+        // general (A, B, C) - ((.a1, .b1, .c1) | (.a1, .b1, .c2)) problem.
+        // This is a quadratic operation but it saves us a LOT of work
+        // overall.
+        SmallVector<Space, 4> usefulSmallSpaces;
+        for (const Space &space : smallSpaces) {
+          bool alreadyHandled = llvm::any_of(usefulSmallSpaces,
+                                             [&](const Space &previousSpace) {
+            return space.isSubspace(previousSpace, TC, DC);
+          });
+          if (alreadyHandled)
+            continue;
+          usefulSmallSpaces.push_back(space);
+        }
+
+        return Space::forDisjunct(usefulSmallSpaces);
       }
       PAIRCASE (SpaceKind::Constructor, SpaceKind::Type):
         return Space();
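
The comment in the hunk above notes that, without sorting by size, the pass is not guaranteed to produce a minimal set. A small sketch of why, assuming a hypothetical stand-in for `Space` (a bitmask of enum cases, `CaseSet`) and hypothetical helper names: the forward pass only compares a term against terms kept before it, so a small term that arrives ahead of a larger one that contains it survives. The sorted variant is shown purely for contrast and is not what the patch does.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for Space: a set of enum cases as a bitmask.
using CaseSet = std::uint32_t;

bool isSubset(CaseSet inner, CaseSet outer) {
  return (inner & ~outer) == 0;
}

// Single forward pass, in the spirit of the patch: keep a term unless
// a term kept earlier already contains it.
std::vector<CaseSet> dropLaterSubsets(const std::vector<CaseSet> &terms) {
  std::vector<CaseSet> kept;
  for (CaseSet t : terms) {
    bool covered = std::any_of(kept.begin(), kept.end(),
                               [&](CaseSet k) { return isSubset(t, k); });
    if (!covered)
      kept.push_back(t);
  }
  return kept;
}

// Number of cases in a set (clears the lowest set bit each iteration).
int bitCount(CaseSet s) {
  int n = 0;
  for (; s != 0; s &= s - 1)
    ++n;
  return n;
}

// Contrast only: visiting the largest sets first means any term that is
// contained in another single term is seen after it and gets dropped.
std::vector<CaseSet> dropAllSubsets(std::vector<CaseSet> terms) {
  std::stable_sort(terms.begin(), terms.end(), [](CaseSet x, CaseSet y) {
    return bitCount(x) > bitCount(y);
  });
  return dropLaterSubsets(terms);
}
```

Given the terms `{0x7, 0x3}`, the forward pass keeps only `0x7`; given `{0x3, 0x7}`, it keeps both, while the sorted variant again keeps only `0x7`. The patch accepts this order dependence in exchange for skipping the sort.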
Lines changed: 212 additions & 0 deletions
@@ -0,0 +1,212 @@
+// RUN: %target-typecheck-verify-swift
+
+enum NumericBase {
+  case binary
+  case ternary
+  case quaternary
+  case quinary
+  case senary
+  case septary
+  case octal
+  case nonary
+  case decimal
+  case undecimal
+  case duodecimal
+}
+
+enum Direction {
+  case left
+  case right
+}
+
+enum WritingSystem {
+  case logographic
+  case alphabet(kind: Alphabet)
+  case abjad
+  case abugida
+  case syllabary
+  case other
+}
+
+enum Alphabet {
+  case roman
+  case greek
+  case cyrillic
+}
+
+func test(base: NumericBase, direction: Direction, writingSystem: WritingSystem) {
+  switch (base, direction, writingSystem) {
+  case (.binary, .left, .logographic),
+       (.binary, .left, .alphabet),
+       (.binary, .left, .abugida):
+    break
+
+  case (.binary, .right, .logographic),
+       (.binary, .right, .alphabet),
+       (.binary, .right, .abugida):
+    break
+
+  case (.binary, _, .abjad):
+    break
+
+  case (.binary, _, .syllabary):
+    break
+
+  case (.ternary, .left, .logographic):
+    break
+
+  case (.ternary, .left, .alphabet),
+       (.ternary, .left, .abugida):
+    break
+
+  case (.ternary, .right, .logographic),
+       (.ternary, .right, .abugida):
+    break
+
+  case (.ternary, .right, .alphabet):
+    break
+
+  case (.ternary, _, .abjad):
+    break
+
+  case (.ternary, _, .syllabary):
+    break
+
+  case (.quaternary, .left, .logographic):
+    break
+
+  case (.quaternary, .left, .alphabet),
+       (.quaternary, .left, .abugida):
+    break
+
+  case (.quaternary, .right, .logographic),
+       (.quaternary, .right, .abugida):
+    break
+
+  case (.quaternary, .right, .alphabet):
+    break
+
+  case (.quaternary, _, .abjad):
+    break
+
+  case (.quaternary, _, .syllabary):
+    break
+
+  case (.quinary, .left, .logographic),
+       (.senary, .left, .logographic):
+    break
+
+  case (.quinary, .left, .alphabet),
+       (.senary, .left, .alphabet),
+       (.quinary, .left, .abugida),
+       (.senary, .left, .abugida):
+    break
+
+  case (.quinary, .right, .logographic),
+       (.senary, .right, .logographic):
+    break
+
+  case (.quinary, .right, .alphabet),
+       (.senary, .right, .alphabet),
+       (.quinary, .right, .abugida),
+       (.senary, .right, .abugida):
+    break
+
+  case (.quinary, _, .abjad),
+       (.senary, _, .abjad):
+    break
+
+  case (.quinary, _, .syllabary),
+       (.senary, _, .syllabary):
+    break
+
+  case (.septary, .left, .logographic):
+    break
+
+  case (.septary, .left, .alphabet),
+       (.septary, .left, .abugida):
+    break
+
+  case (.septary, .right, .logographic):
+    break
+
+  case (.septary, .right, .alphabet),
+       (.septary, .right, .abugida):
+    break
+
+  case (.septary, _, .abjad):
+    break
+
+  case (.septary, _, .syllabary):
+    break
+
+  case (.decimal, .left, .logographic):
+    break
+
+  case (.decimal, .left, .alphabet),
+       (.decimal, .left, .abugida):
+    break
+
+  case (.decimal, .right, .logographic):
+    break
+
+  case (.decimal, .right, .alphabet),
+       (.decimal, .right, .abugida):
+    break
+
+  case (.octal, .left, .logographic),
+       (.nonary, .left, .logographic):
+    break
+
+  case (.octal, .left, .alphabet),
+       (.nonary, .left, .alphabet),
+       (.octal, .left, .abugida),
+       (.nonary, .left, .abugida):
+    break
+
+  case (.octal, .right, .logographic),
+       (.nonary, .right, .logographic):
+    break
+
+  case (.octal, .right, .alphabet),
+       (.nonary, .right, .alphabet),
+       (.octal, .right, .abugida),
+       (.nonary, .right, .abugida):
+    break
+
+  case (.octal, _, .abjad),
+       (.nonary, _, .abjad),
+       (.decimal, _, .abjad):
+    break
+
+  case (.octal, _, .syllabary),
+       (.nonary, _, .syllabary),
+       (.decimal, _, .syllabary):
+    break
+
+  case (.undecimal, .left, .logographic):
+    break
+
+  case (.undecimal, .left, .alphabet),
+       (.undecimal, .left, .abugida):
+    break
+
+  case (.undecimal, .right, .logographic):
+    break
+
+  case (.undecimal, .right, .alphabet),
+       (.undecimal, .right, .abugida):
+    break
+
+  case (.undecimal, _, .abjad):
+    break
+
+  case (.undecimal, _, .syllabary):
+    break
+
+  case (.duodecimal, _, _):
+    break
+  case (_, _, .other):
+    break
+  }
+}
