Comprehensive testing strategy for CleverKeys Android keyboard, designed to enable testing without ADB/emulator dependencies.
| Type | Location | Count | Framework | Works on ARM64 |
|---|---|---|---|---|
| Unit | src/test/kotlin/ | 5 | Robolectric | No (x86_64 only) |
| Instrumented | src/androidTest/kotlin/ | 6 | AndroidJUnit4 | Requires ADB |

- NeuralPredictionTest.kt: SwipeInput data structure tests
- IntegrationTest.kt: Gesture creation helpers
- ComposeKeyTest.kt: Compose key sequences
- OnnxPredictionTest.kt: ONNX prediction basics
- MockClasses.kt: Mock implementations

Decouple Android framework from testable business logic.
:app (Android)
├── CleverKeysService.kt → Humble Object, delegates to core
├── Keyboard2View.kt → View layer only
└── SettingsActivity.kt → UI only
:core (Pure Kotlin) [NEW]
├── prediction/
│ ├── NeuralEngine.kt → Interface
│ ├── BeamSearchEngine.kt → Pure algorithm
│ └── VocabularyTrie.kt → Data structure
├── dictionary/
│ ├── DictionaryLoader.kt → Binary parser
│ └── WordLookup.kt → Search logic
├── gesture/
│ ├── TouchPoint.kt → data class (replaces PointF)
│ ├── GestureClassifier.kt → Tap/Swipe/Hold detection
│ └── SwipeAnalyzer.kt → Path analysis
└── text/
    ├── TextCommitter.kt → Interface (replaces InputConnection)
    ├── AutoCorrector.kt → Correction logic
    └── ContractionHandler.kt → dont → don't
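
One way to read the split above: the Android service stays a thin shell, and anything worth testing talks to an interface. The sketch below is a minimal illustration under that assumption, not the project's code. `TextCommitter` mirrors the interface this document proposes; `FakeTextCommitter` and `replaceLastWord` are hypothetical names introduced only for the example.

```kotlin
// Mirrors the TextCommitter abstraction proposed in this document.
interface TextCommitter {
    fun commitText(text: CharSequence)
    fun deleteSurroundingText(beforeLength: Int, afterLength: Int)
    fun getTextBeforeCursor(length: Int): CharSequence?
    fun getTextAfterCursor(length: Int): CharSequence?
}

// Hypothetical in-memory fake: a text buffer plus a cursor index.
class FakeTextCommitter(initial: String = "") : TextCommitter {
    private val buffer = StringBuilder(initial)
    private var cursor = initial.length

    val text: String get() = buffer.toString()

    override fun commitText(text: CharSequence) {
        buffer.insert(cursor, text)
        cursor += text.length
    }

    override fun deleteSurroundingText(beforeLength: Int, afterLength: Int) {
        val start = maxOf(0, cursor - beforeLength)
        val end = minOf(buffer.length, cursor + afterLength)
        buffer.delete(start, end)
        cursor = start
    }

    override fun getTextBeforeCursor(length: Int): CharSequence? =
        buffer.substring(maxOf(0, cursor - length), cursor)

    override fun getTextAfterCursor(length: Int): CharSequence? =
        buffer.substring(cursor, minOf(buffer.length, cursor + length))
}

// Hypothetical core-side helper: replaces the word just before the cursor.
// It only sees TextCommitter, so it can run on a plain JVM without Android.
fun replaceLastWord(committer: TextCommitter, replacement: String) {
    val before = committer.getTextBeforeCursor(64)?.toString() ?: return
    val lastWord = before.takeLastWhile { !it.isWhitespace() }
    if (lastWord.isEmpty()) return
    committer.deleteSurroundingText(lastWord.length, 0)
    committer.commitText(replacement)
}
```

In production the same helper would receive an adapter that wraps InputConnection; in tests it receives the fake, which is the point of the Humble Object arrangement.
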
interface NeuralEngine {
    fun predict(features: FloatArray): PredictionResult
    fun isReady(): Boolean
}

data class PredictionResult(
    val probabilities: Map<Char, Float>,
    val confidence: Float
)

interface TextCommitter {
    fun commitText(text: CharSequence)
    fun deleteSurroundingText(beforeLength: Int, afterLength: Int)
    fun getTextBeforeCursor(length: Int): CharSequence?
    fun getTextAfterCursor(length: Int): CharSequence?
}

data class TouchPoint(
    val x: Float,
    val y: Float,
    val timestamp: Long = System.currentTimeMillis()
)

// build.gradle (:core module)
testImplementation "org.junit.jupiter:junit-jupiter:5.10.0"
testImplementation "io.mockk:mockk:1.13.8"
testImplementation "com.google.truth:truth:1.1.5"
testImplementation "org.jetbrains.kotlinx:kotlinx-coroutines-test:1.7.3"

| Component | Tests | Android Deps |
|---|---|---|
| VocabularyTrie | Insert, lookup, prefix search | None |
| BeamSearchEngine | Decoding, pruning, scoring | None |
| DictionaryLoader | V2 binary parsing | None |
| ContractionHandler | Mapping, reverse lookup | None |
| AutoCorrector | Edit distance, threshold | None |
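
The AutoCorrector row lists edit distance and a threshold as its testable surface. A minimal sketch of both follows, assuming plain Levenshtein distance; the function name `editDistance` is an assumption about the project's API, and `closestWithin` is a hypothetical helper added only to show how a threshold might be applied.

```kotlin
// Levenshtein distance computed with two rolling rows instead of a full matrix.
fun editDistance(a: String, b: String): Int {
    if (a == b) return 0
    if (a.isEmpty()) return b.length
    if (b.isEmpty()) return a.length

    var previous = IntArray(b.length + 1) { it } // distance from "" to each prefix of b
    var current = IntArray(b.length + 1)

    for (i in 1..a.length) {
        current[0] = i
        for (j in 1..b.length) {
            val cost = if (a[i - 1] == b[j - 1]) 0 else 1
            val substitution = previous[j - 1] + cost
            val deletion = previous[j] + 1
            val insertion = current[j - 1] + 1
            current[j] = minOf(substitution, deletion, insertion)
        }
        // Swap rows so `previous` always holds the last completed row.
        val swap = previous
        previous = current
        current = swap
    }
    return previous[b.length]
}

// Hypothetical threshold rule: the closest candidate within maxDistance, or null.
// Ties resolve to the earliest candidate in iteration order.
fun closestWithin(word: String, candidates: Collection<String>, maxDistance: Int): String? =
    candidates
        .map { it to editDistance(word, it) }
        .filter { it.second <= maxDistance }
        .minByOrNull { it.second }
        ?.first
```

Because neither function touches an Android type, both fit the "None" dependency column and can be covered by ordinary JVM tests.
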
| Component | Tests | Android Deps |
|---|---|---|
| GestureClassifier | Tap vs swipe vs hold | TouchPoint only |
| SwipeAnalyzer | Path smoothing, key detection | TouchPoint only |
| FeatureExtractor | Velocity, acceleration | TouchPoint only |
| Config validation | Setting ranges, defaults | None |
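
To show why "TouchPoint only" is enough for the gesture rows, here is a hedged sketch of tap/swipe/hold classification. The thresholds (40 px, 400 ms), the `Gesture` enum, and the `classify` signature are assumptions made for illustration; the real GestureClassifier may use different rules.

```kotlin
import kotlin.math.hypot

// Same shape as the TouchPoint data class proposed in this document.
data class TouchPoint(
    val x: Float,
    val y: Float,
    val timestamp: Long = System.currentTimeMillis()
)

enum class Gesture { TAP, SWIPE, HOLD }

// Hypothetical thresholds; real values would come from keyboard configuration.
class GestureClassifier(
    private val swipeDistancePx: Float = 40f,
    private val holdDurationMs: Long = 400L
) {
    fun classify(points: List<TouchPoint>): Gesture {
        require(points.isNotEmpty()) { "a gesture needs at least one point" }
        val first = points.first()
        val last = points.last()
        // Largest displacement from the start, so a path that loops back still counts as a swipe.
        val maxDisplacement = points.maxOf { hypot(it.x - first.x, it.y - first.y) }
        val durationMs = last.timestamp - first.timestamp
        return when {
            maxDisplacement >= swipeDistancePx -> Gesture.SWIPE
            durationMs >= holdDurationMs -> Gesture.HOLD
            else -> Gesture.TAP
        }
    }
}
```

Passing explicit timestamps in tests, rather than relying on the `System.currentTimeMillis()` default, keeps the classification deterministic.
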
| Component | Tests | Android Deps |
|---|---|---|
| KeyboardState | Layer switching, modifiers | None |
| LayoutParser | XML parsing | Resources abstraction |
| LanguageDetector | Unigram scoring | None |
| PrefixBoostTrie | Aho-Corasick traversal | None |
| Component | Tests | Reason |
|---|---|---|
| View rendering | Screenshot comparison | Needs real Views |
| IME lifecycle | onStartInput, onFinishInput | Needs Android |
| Haptics | Vibration patterns | Needs hardware |
Tests that can run today with minimal changes:
// VocabularyTrieTest.kt
@Test
fun `trie prefix search returns all matches`() {
    val trie = VocabularyTrie()
    trie.insert("hello", 1000)
    trie.insert("help", 900)
    trie.insert("helicopter", 500)
    val matches = trie.prefixSearch("hel")
    assertThat(matches).containsExactly("hello", "help", "helicopter")
}

// ContractionTest.kt
@Test
fun `contraction mapping works for common words`() {
    val handler = ContractionHandler()
    handler.loadMappings(mapOf("dont" to "don't", "cant" to "can't"))
    assertThat(handler.expand("dont")).isEqualTo("don't")
    assertThat(handler.isContractionKey("cant")).isTrue()
}

// EditDistanceTest.kt
@Test
fun `Levenshtein distance calculated correctly`() {
    assertThat(editDistance("hello", "hallo")).isEqualTo(1)
    assertThat(editDistance("hello", "hello")).isEqualTo(0)
    assertThat(editDistance("cat", "cut")).isEqualTo(1)
}

// DictionaryLoaderTest.kt
@Test
fun `V2 binary format parses correctly`() {
    val bytes = createValidV2Header() + createWordEntries(listOf("test", "word"))
    val dict = DictionaryLoader.loadFromBytes(bytes)
    assertThat(dict.contains("test")).isTrue()
    assertThat(dict.getFrequency("test")).isGreaterThan(0)
}

@Test
fun `invalid magic number throws exception`() {
    val bytes = byteArrayOf(0x00, 0x00, 0x00, 0x00)
    assertThrows<InvalidDictionaryException> {
        DictionaryLoader.loadFromBytes(bytes)
    }
}

name: Tests
on: [push, pull_request]
jobs:
  unit-tests:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-java@v4
        with:
          java-version: '17'
          distribution: 'temurin'
      - name: Run Unit Tests
        run: ./gradlew test --continue
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-java@v4
        with:
          java-version: '17'
          distribution: 'temurin'
      - name: Build Debug APK
        run: ./gradlew assembleDebug
  instrumented-tests:
    runs-on: ubuntu-latest
    if: github.event_name == 'push' && github.ref == 'refs/heads/main'
    steps:
      - uses: actions/checkout@v4
      - uses: ReactiveCircus/android-emulator-runner@v2
        with:
          api-level: 29
          script: ./gradlew connectedAndroidTest

- Add JUnit 5 + MockK + Truth to build.gradle
- Create pure algorithm tests (no refactor needed)
- Add CI workflow for unit tests
- Create TouchPoint data class
- Create NeuralEngine interface
- Create TextCommitter interface
- Refactor BeamSearchEngine to use abstractions
- Create :core Gradle module
- Move testable code to :core
- Replace android.* imports with abstractions
- Achieve 80% coverage on :core

| Type | Location | Count | Framework | Runner |
|---|---|---|---|---|
| Pure JVM | src/test/kotlin/ | 857 | JUnit4 + Truth | ./gradlew runPureTests |
| MockK | src/test/kotlin/ | ~176 | JUnit4 + MockK | ./gradlew runMockTests |
| Instrumented | src/androidTest/kotlin/ | ~640 | AndroidJUnit4 | emulator.wtf (Pixel7 API 34) |

The standard testDebugUnitTest task is disabled. A custom runPureTests JavaExec task runs the pure JVM tests directly, and runMockTests adds MockK and android.jar to the classpath.

Single-class run: ./gradlew runPureTests -PtestClass=ClassName

ew-cli \
  --app build/outputs/apk/debug/CleverKeys-v1.2.9-x86_64.apk \
  --test build/outputs/apk/androidTest/debug/CleverKeys-debug-androidTest.apk \
  --device model=Pixel7,version=34 \
  --use-orchestrator --clear-package-data \
  --timeout 15m

Note: the timeout needs a unit suffix (10m, not 600). APKs must be x86_64 for the emulator.

The 5 bugs discovered on 2026-02-24 (contractions, toggle UI, custom words, perf) all lived at composition boundaries: places where multiple components interact in ways that unit tests miss. Specifically:
- SuggestionHandler calls ContractionManager.getNonPairedMapping() but not getPairedContractions()
- WordPredictor.autoCorrect() checks dictionary.containsKey() but dictionary was polluted by contraction aliases
- MainDictionarySource.toggleWord() updates SharedPreferences but not cached DictionaryWord objects
- WordPredictor.isWordDisabled() checks disabledWords but not customAndUserWords
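
The last item can be made concrete with a small stand-in. `WordFilter` below is hypothetical and is not a CleverKeys class; it only reuses the names `disabledWords` and `customAndUserWords` from the bullet, and it assumes the intended rule is that a custom or user word overrides a disabled entry.

```kotlin
// Hypothetical stand-in for the disabled-word check described in the list above.
class WordFilter(
    private val disabledWords: Set<String>,
    private val customAndUserWords: Set<String>
) {
    // Buggy shape: consults only one of the two sets, so each set can look
    // correct in its own unit test while the combined behavior is wrong.
    fun isWordDisabledNaive(word: String): Boolean =
        word.lowercase() in disabledWords

    // Composed shape (assumed rule): a custom/user word wins over a disabled entry.
    fun isWordDisabled(word: String): Boolean {
        val key = word.lowercase()
        return key in disabledWords && key !in customAndUserWords
    }
}
```

A test that populates only `disabledWords` passes against both versions; the difference appears only when both sets contain the same word, which is the composition case the list describes.
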
                                  ┌────────────────────────────────┐
User types "im" ────────────────▶ │ TypingSimulationTest.kt        │
                                  │                                │
                                  │ 1. ContractionManager          │
                                  │    .getNonPairedMapping()      │
                                  │    .getPairedContractions()    │
                                  │                                │
                                  │ 2. WordPredictor               │
                                  │    .predictWordsWithContext()  │
                                  │    .autoCorrect()              │
                                  │                                │
                                  │ 3. DictionaryDataSource        │
                                  │    .toggleWord()               │
                                  │    .getAllWords() (cache)      │
                                  └────────────────────────────────┘
                                             │
Validates: "I'm" ◀───────────────────────────┘
Key insight: We test the PRODUCTION components with REAL data (full dictionary, real contraction files, real SharedPreferences) — not mocks. This catches the composition bugs that mocks hide.
| Category | Count | What It Tests |
|---|---|---|
| Paired contraction lookup | 6 | its→it's, well→we'll, case insensitivity |
| Non-paired contraction mapping | 4 | dont→don't, cant→can't, im→i'm, wont→won't |
| Autocorrect expansion | 10 | Contraction autocorrect, I-capitalization, case preservation |
| Autocorrect regression guards | 3 | "well"/"were"/"ill" should NOT autocorrect |
| Dictionary toggle coherence | 2 | Toggle updates cached list without reload |
| Custom word override | 2 | Custom word overrides disabled word |
| Tap-typing predictions | 3 | Prefix completion, multiple results |
| I-contraction capitalization | 3 | im→I'm, ill preserved, id documented |
| End-to-end scenarios | 3 | Full sentence typing, contraction-heavy, case |
| Pipeline integration | 3 | Scores descending, words=scores length, empty input |
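
The "Pipeline integration" row names structural invariants: scores in descending order, and words and scores of equal length. A sketch of a reusable check follows. `PredictionOutput` is a hypothetical holder, since the real return type of predictWordsWithContext() is not shown in this document, and what an empty input should produce is not specified here, so the sketch only confirms that an empty result passes.

```kotlin
// Hypothetical result holder: parallel lists of candidate words and their scores.
data class PredictionOutput(val words: List<String>, val scores: List<Int>)

// Returns the violated invariants; an empty list means the output is well-formed.
fun invariantViolations(output: PredictionOutput): List<String> {
    val problems = mutableListOf<String>()
    if (output.words.size != output.scores.size) {
        problems += "words (${output.words.size}) and scores (${output.scores.size}) differ in length"
    }
    // zipWithNext pairs each score with its successor; any increase breaks descending order.
    if (output.scores.zipWithNext().any { (earlier, later) -> earlier < later }) {
        problems += "scores are not in descending order"
    }
    return problems
}
```

Returning a list of messages rather than a Boolean lets a single assertion report every broken invariant at once.
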
InputMethodService runs in a separate process — Espresso can't instrument it directly. Options considered:
- Test Activity with EditText + IME simulation — complex, fragile, tests Android plumbing not our code
- UiAutomator keyboard interaction — slow, brittle, device-dependent
- Pipeline-level testing (chosen) — tests all production code paths with real data, fast, reliable
The pipeline approach gives us 95% of the coverage at 5% of the complexity. The remaining 5% (view rendering, touch coordinates, IME lifecycle) stays in manual QA.
- SuggestionHandler pipeline test — requires mocking PredictionCoordinator (SuggestionHandler instantiation needs keyboard context). Could test the full contraction injection + merge + capitalization chain.
- Multi-language scenarios — bilingual typing with secondary dictionary
- Adaptation learning — verify UserAdaptationManager boosts recently used words
- Performance benchmarks — dictionary load time, prediction latency, cache hit rates
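
For the performance benchmarks item, a small timing harness may be enough to start with. The sketch below is an assumption about how latency could be sampled on the JVM; `medianLatencyNanos` is a hypothetical helper, and a dedicated benchmarking library would give more trustworthy numbers.

```kotlin
// Runs `block` warmup + runs times and returns the median of the measured runs, in nanoseconds.
fun medianLatencyNanos(warmup: Int = 5, runs: Int = 21, block: () -> Unit): Long {
    require(runs > 0) { "runs must be positive" }
    repeat(warmup) { block() } // unmeasured iterations so the JIT can settle
    val samples = LongArray(runs) {
        val start = System.nanoTime()
        block()
        System.nanoTime() - start
    }
    samples.sort()
    return samples[runs / 2]
}
```

The median is used rather than the mean so that a single GC pause or scheduler hiccup does not dominate the result.
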
| Type | Count | Execution Time |
|---|---|---|
| Pure JVM | 857 | ~17s |
| MockK | ~176 | ~12s |
| Instrumented | ~640 | ~12min (emulator.wtf) |
| Total | ~1,673 | — |
Updated: 2026-02-24. Original: 2026-01-18 (Gemini 3 Pro consultation).