Commit 8b52f65

Update README
1 parent 5c544b0 commit 8b52f65

2 files changed

+51
-10
lines changed
README.md

Lines changed: 48 additions & 10 deletions
@@ -5,33 +5,71 @@
55
[Documentation](https://abb128.github.io/april-asr/concepts.html)
66

77
## Status
8-
This library is currently under development. Some features are unimplemented, it may have bugs and crashes, and there may be significant changes to the API. It may not yet be production-ready.
8+
This library is currently facing some major rewrites over 2025 to improve efficiency and properly fulfill the API contract of multi-session support. The model format is going to change.
99

10-
Furthermore, there's only one model that only does English and has some accuracy issues at that.
10+
## Language support
11+
The core library is written in C, and has a C API. [Python](https://abb128.github.io/april-asr/python.html) and [C#](https://abb128.github.io/april-asr/csharp.html) bindings are available.
1112

12-
### Language support
13-
The library has a C API, and there are C# and Python bindings available, but these may not be stable yet.
13+
## Example in Python
1414

15-
## Example
15+
Install via `pip install april-asr`
16+
17+
```py
18+
import april_asr as april
19+
import librosa
20+
21+
# Change these values
22+
model_path = "aprilv0_en-us.april"
23+
audio_path = "audio.wav"
24+
25+
model = april.Model(model_path)
26+
27+
28+
def handler(result_type, tokens):
29+
    s = ""
30+
    for token in tokens:
31+
        s = s + token.token
32+
33+
    if result_type == april.Result.FINAL_RECOGNITION:
34+
        print("@"+s)
35+
    elif result_type == april.Result.PARTIAL_RECOGNITION:
36+
        print("-"+s)
37+
    else:
38+
        print("")
39+
40+
session = april.Session(model, handler)
41+
42+
data, sr = librosa.load(audio_path, sr=model.get_sample_rate(), mono=True)
43+
data = (data * 32767).astype("short").astype("<u2").tobytes()
44+
45+
session.feed_pcm16(data)
46+
session.flush()
47+
```
48+
49+
Read the [Python documentation here](https://abb128.github.io/april-asr/python.html).
50+
51+
## Example in C
1652
An example use of this library is provided in `example.cpp`. It can perform speech recognition on a wave file, or do streaming recognition by reading stdin.
1753

1854
It's built as the target `main`. After building aprilasr, you can run it like so:
1955
```
2056
$ ./main /path/to/file.wav /path/to/model.april
2157
```
2258

23-
For streaming recognition, you can pipe parec into it:
59+
For streaming recognition, you can pipe parec into it. The command below will live caption your desktop audio.
2460
```
25-
$ parec --format=s16 --rate=16000 --channels=1 --latency-ms=100 | ./main - /path/to/model.april
61+
$ parec --format=s16 --rate=16000 --channels=1 --latency-ms=100 --device=@DEFAULT_MONITOR@ | ./main - /path/to/model.april
2662
```
2763

2864
## Models
29-
Currently only one model is available, the [English model](https://april.sapples.net/aprilv0_en-us.april), based on [csukuangfj's trained icefall model](https://huggingface.co/csukuangfj/icefall-asr-librispeech-lstm-transducer-stateless2-2022-09-03/tree/main/exp) as the base, and trained with some extra data.
65+
A few models are available, listed [here](https://abb128.github.io/april-asr/models.html).
66+
67+
The English models are based on [csukuangfj's trained icefall model](https://huggingface.co/csukuangfj/icefall-asr-librispeech-lstm-transducer-stateless2-2022-09-03/tree/main/exp) as the base, and trained with some extra data.
3068

31-
To make your own models, check out `extra/exporting-howto.md`
69+
To export your own models, check out `extra/exporting-howto.md`
3270

3371
## Building on Linux
34-
Building requires ONNXRuntime v1.13.1. You can either try to build it from source or just download the release binaries.
72+
Building requires ONNXRuntime. You can either try to build it from source or just download the release binaries.
3573

3674
### Downloading ONNXRuntime
3775
Run `./download_onnx_linux_x64.sh` for linux-x64.
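
The Python example added in this commit scales librosa's float samples by 32767 and passes the resulting 16-bit bytes to `feed_pcm16`. As a rough illustration of that conversion, here is a minimal sketch that uses only the standard library. The helper name `float_to_pcm16_bytes` is hypothetical, and the clipping step is an added assumption that the README example does not perform.

```python
import struct


def float_to_pcm16_bytes(samples):
    """Convert floats in [-1.0, 1.0] to little-endian signed 16-bit PCM bytes.

    Hypothetical helper for illustration; not part of april_asr.
    """
    ints = []
    for x in samples:
        # Assumption: clip first so out-of-range input cannot overflow int16.
        x = max(-1.0, min(1.0, x))
        # Same 32767 scale factor as the README example; int() truncates toward zero.
        ints.append(int(x * 32767))
    return struct.pack("<%dh" % len(ints), *ints)


pcm = float_to_pcm16_bytes([0.0, 0.5, -1.0])
print(len(pcm))  # two bytes per sample
```

The resulting bytes object has the same layout the README example builds with NumPy: one little-endian 16-bit value per mono sample.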

example.cpp

Lines changed: 3 additions & 0 deletions
@@ -1,3 +1,6 @@
1+
// For basic live captioning of desktop audio, run it like so:
2+
// parec --format=s16 --rate=16000 --channels=1 --latency-ms=100 --device=@DEFAULT_MONITOR@ | ./main - /path/to/model.april
3+
14
#include <stdio.h>
25
#include <cstdlib>
36
#include <cstring>
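
The `parec` command in the added comment writes raw signed 16-bit mono PCM at 16 kHz to stdout, which `./main -` reads from stdin. As a rough sketch of the consuming side in Python, the generator below reads a binary stream in fixed-size chunks and yields only whole samples. The name `iter_pcm16_chunks` and the default chunk size of 1600 samples (100 ms at 16 kHz) are assumptions for illustration, not something `example.cpp` or the bindings define.

```python
import io

BYTES_PER_SAMPLE = 2  # s16 mono, matching parec --format=s16 --channels=1


def iter_pcm16_chunks(stream, chunk_samples=1600):
    """Yield byte chunks containing whole 16-bit samples from a binary stream.

    Hypothetical helper for illustration; not part of april_asr.
    """
    chunk_bytes = chunk_samples * BYTES_PER_SAMPLE
    pending = b""
    while True:
        block = stream.read(chunk_bytes)
        if not block:
            break  # end of stream; a trailing odd byte is dropped
        pending += block
        # Only emit an even number of bytes so no sample is split across chunks.
        usable = len(pending) - (len(pending) % BYTES_PER_SAMPLE)
        if usable:
            yield pending[:usable]
            pending = pending[usable:]


# Demo on an in-memory stream: two full samples plus one stray byte.
demo = io.BytesIO(b"\x01\x00\x02\x00\x03")
chunks = list(iter_pcm16_chunks(demo, chunk_samples=1))
print(chunks)
```

With the `session` object from the README's Python example, each chunk read from `sys.stdin.buffer` could be passed to `session.feed_pcm16(chunk)`, followed by `session.flush()` once the stream ends.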