README.md: 14 additions & 14 deletions
@@ -8,7 +8,7 @@
## What is noScribe?
- An app to produce **high quality transcripts of interviews** for qualitative social research or journalistic use
- noScribe is **free and open source** ([GPL-3.0](https://www.gnu.org/licenses/gpl-3.0.html)), available for Windows, MacOS and Linux
-
- It runs **completely local** on your computer, protecting the confidentiallity of your interviews. No cloud, no worries
+
- It runs **completely locally** on your computer, protecting the confidentiality of your interviews. No cloud, no worries
- It can distinguish between different **speakers** and understands around 60 languages (more or less, see below)
- It includes a **nice editor** to review, verify and correct the resulting transcript
- It is standing on the shoulders of giants: [Whisper from OpenAI](https://github.com/openai/whisper), [faster-whisper by Guillaume Klein](https://github.com/guillaumekln/faster-whisper) and [pyannote from Hervé Bredin](https://github.com/pyannote/pyannote-audio)
@@ -22,7 +22,7 @@
- The download is quite large (several gigabytes) due to the included AI models.
- Beware that a one hour interview can take up to three hours to transcribe, depending on your machine.
- Poor audio and background noise will lead to poor transcription results.
-
- No automatic transcription is perfect, there will always be some manual revision necessary. Use the [included Editor](#noscribeedit) to check your transcripts thouroughly. (See also ["Factors Influencing the Quality"](#factors-influencing-the-quality-of-the-transcription) and ["Known Issues"](#known-issues) below.)
+
- No automatic transcription is perfect, there will always be some manual revision necessary. Use the [included Editor](#noscribeedit) to check your transcripts thoroughly. (See also ["Factors Influencing the Quality"](#factors-influencing-the-quality-of-the-transcription) and ["Known Issues"](#known-issues) below.)

If you want to know more and can understand German, Rebecca Schmidt from the University of Paderborn wrote a nice [review of noScribe,](https://sozmethode.hypotheses.org/2315) also discussing its limitations. Also the German [computer magazine c't recommended noScribe in a recent review](https://www.heise.de/select/ct/2025/2/2433207582191637980).
@@ -42,7 +42,7 @@ The [urban dictionary](https://www.urbandictionary.com/define.php?term=Scribe) d
<summary>Click to expand</summary>

- **Download:**
-
- The **general purpose version** for normal PCs without a NVIDIA graphics card: [https://drive.switch.ch/index.php/s/EIVup04qkSHb54j?path=%2FnoScribe%20vers.%200.7%2FWindows%2Fnormal](https://drive.switch.ch/index.php/s/EIVup04qkSHb54j?path=%2FnoScribe%20vers.%200.7%2FWindows%2Fnormal)
+
- The **general purpose version** for normal PCs without an NVIDIA graphics card: [https://drive.switch.ch/index.php/s/EIVup04qkSHb54j?path=%2FnoScribe%20vers.%200.7%2FWindows%2Fnormal](https://drive.switch.ch/index.php/s/EIVup04qkSHb54j?path=%2FnoScribe%20vers.%200.7%2FWindows%2Fnormal)
- A special version using **CUDA acceleration on NVIDIA graphics cards** with at least 6 GB of VRAM: [https://drive.switch.ch/index.php/s/EIVup04qkSHb54j?path=%2FnoScribe%20vers.%200.7%2FWindows%2Fcuda](https://drive.switch.ch/index.php/s/EIVup04qkSHb54j?path=%2FnoScribe%20vers.%200.7%2FWindows%2Fcuda). Make sure that your NVIDIA drivers are on version 570.65 or higher. You must also install the [CUDA toolkit from here](https://developer.nvidia.com/cuda-downloads?target_os=Windows) (a reboot is required afterwards).
- **Installation**:
- Start the downloaded setup file. This may take a while, be patient.
@@ -60,7 +60,7 @@ ported by [gernophil](https://github.com/gernophil) </br>
- **Newer Macs with Apple Silicon M1-M4 processors and macOS 14 or newer**
- Double-click on the downloaded dmg-file, then drag noScribe and noScribeEdit into the link to your applications folder (labeled "drag both here to install").
-
- You will need Apple's Rosetta2 Intel emulator since one component (ffmpeg) is still made the Intel CPUs. If you don't have it installed already, do this as follows:
+
- You will need Apple's Rosetta2 Intel emulator since one component (ffmpeg) is still made for Intel CPUs. If you don't have it installed already, do this as follows:
- Open the Terminal (located at `/Applications/Utilities/Terminal.app`).
- Type `softwareupdate --install-rosetta` or `softwareupdate --install-rosetta --agree-to-license`.
- Hit enter and follow the instructions on the screen.
@@ -96,7 +96,7 @@ ported by [Eckhard Kadasch](https://github.com/eckhrd) and [Florian Dobener](htt
See [this discussion](https://github.com/kaixxx/noScribe/discussions/83) for
more information.

-
As you want to install from sources, `git` and `git-lfs` is necessary to get
+
If you want to install from source, `git` and `git-lfs` are necessary to get
all required pieces. The latest sources are directly fetched from the
repository. Please use the installation above (executable installation) if
- **Mark Pause**: If enabled, parts of your audio without voice activity will be marked as pauses. Pauses are transcribed as round brackets with one dot per second inside, e.g., "(..)" for a two-second pause. Pauses longer than 10 seconds are written out as "(XX seconds pause)" or "(XX minutes pause)". You have the option to mark either pauses of one second and more ("1sec+"), two seconds and more ("2sec+"), or only the longer ones of three seconds and more ("3sec+"). Choose "none" to disable this feature entirely.
- **Speaker Detection:** This feature uses the Pyannote AI model to identify distinct speakers in your audio and organizes the transcript accordingly. Choose the number of speakers if known, or select "auto." Opting for "none" bypasses this step altogether, reducing the processing time by approximately half. However, the resultant transcript will be a continuous block of text without any indicators of speaker transitions.
- **Overlapping Speech**: If enabled, noScribe attempts to mark instances where two people speak simultaneously. The overlapping section is demarcated with //double slashes//. (Note: This is an experimental feature.)
-
- **Disfluencies**: If enabled, common speech disfluencies like filler words ("um"), unfinished words or sentences, etc. will also be transcribed. Note that this is not a hard on/off switch, but more of a 'recommendation' for the transcription AI model which only works to some extend.
+
- **Disfluencies**: If enabled, common speech disfluencies like filler words ("um"), unfinished words or sentences, etc. will also be transcribed. Note that this is not a hard on/off switch, but more of a 'recommendation' for the transcription AI model which only works to some extent.
- **Timestamps**: When enabled, noScribe incorporates timestamps in the format [hh:mm:ss] into the transcript either at every change of speaker or every 60 seconds. I find these timestamps somewhat distracting, hence my decision to disable them by default. However, they can be quite useful in certain contexts. Even with timestamps disabled, determining the audio timecode for a specific segment is straightforward: simply open the transcript in the noScribe Editor, navigate through the text, and the corresponding timecode will appear in the bottom right corner of the app.
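
The pause and timestamp notations described in the options above can be illustrated with a small Python sketch. This is not noScribe's actual code: the function names `format_pause` and `format_timestamp`, the truncation of fractional seconds, and the switch from seconds to minutes at 60 seconds are all assumptions made for illustration.

```python
from typing import Optional


def format_pause(seconds: float, threshold: Optional[int] = 1) -> str:
    """Render a pause marker in the notation the README describes.

    `threshold` mirrors the "1sec+", "2sec+" and "3sec+" options;
    None stands for "none" (feature disabled). Hypothetical helper.
    """
    if threshold is None:
        return ""
    secs = int(seconds)  # assumption: fractional seconds are dropped
    if secs < threshold:
        return ""
    if secs <= 10:
        # One dot per second, e.g. "(..)" for a two-second pause.
        return "(" + "." * secs + ")"
    if secs < 60:  # assumption: minutes are used from 60 seconds on
        return f"({secs} seconds pause)"
    return f"({secs // 60} minutes pause)"


def format_timestamp(seconds: float) -> str:
    """Render a position in the audio as [hh:mm:ss]. Hypothetical helper."""
    hours, rest = divmod(int(seconds), 3600)
    minutes, secs = divmod(rest, 60)
    return f"[{hours:02d}:{minutes:02d}:{secs:02d}]"


print(format_pause(2))         # (..)
print(format_pause(15))        # (15 seconds pause)
print(format_timestamp(3725))  # [01:02:05]
```

Searching a finished transcript for these patterns is one way to locate long silences or to jump to a given point in the recording.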
- The "Queue" tab in the main window shows a list of all jobs as well as their state and progress.
- If you start a new job while another is still running, the new job will wait in the queue to be processed afterwards.
-
- To start multiple jobs at once with the same settings, select as many files files as you want in the audio file dialog. The output files will be named automatically. Use the "Save transcript as" dialog to select a different output folder if needed. Otherwise, the transcripts will be stored in the same folders as the audio.
+
- To start multiple jobs at once with the same settings, select as many files as you want in the audio file dialog. The output files will be named automatically. Use the "Save transcript as" dialog to select a different output folder if needed. Otherwise, the transcripts will be stored in the same folders as the audio.
- The job buttons:
- `X` Deletes a job from the list or cancels a running one.
- `✔` Opens the transcript in the included editor. This also works for unfinished transcripts in case of an error or if the job was canceled by the user.
@@ -185,13 +185,13 @@ The included editor to check the final transcript.

-
The noScribe Editor is a separate app. It will open automatically once the transcript is finished, but can also be run independent from noScribe. It contains some handy features to check your finished transcript for errors and correct them:
+
The noScribe Editor is a separate app. It will open automatically once the transcript is finished, but can also be run independently from noScribe. It contains some handy features to check your finished transcript for errors and correct them:
- Press **Ctrl + Spacebar** (^Space on Mac) or the **orange button in the toolbar** to hear the audio which corresponds to your current position in the text.
- The **selection of the text will follow the audio that you hear**. If you want to **make changes,** click anywhere in the text with your mouse or use the arrow keys to move the cursor. The audio will stop, and you can edit the text.
- You can also **stop the audio** by pressing Ctrl + Spacebar again or clicking the orange button.
- If you want to **speed up or slow down the audio**, change the "100%"-field next to the "Play/Pause Audio"-Button to the appropriate speed.
- To change the **speaker names,** use the Search & Replace feature, accessible from the magnifying glass icon or the Edit menu.
-
- Use the plus und minus icons in the toolbar to **zoom in or out**
+
- Use the plus and minus icons in the toolbar to **zoom in or out**
- You will find the **most common features of a basic text editor** in the toolbar as well as in the menu at the top (basic text formatting, cut, copy & paste, undo & redo).
- Your typical **hotkeys** will also work (e.g., Ctrl+S for Save, Ctrl+F for Find & Replace). You can see all the hotkeys if you open the menu. As already mentioned, 'Ctrl+Space' is the hotkey you'll use the most as it starts or pauses the audio.
@@ -205,23 +205,23 @@ The source code of the editor can be found here: [https://github.com/kaixxx/noSc

## Known Issues

-
- The output of this software needs always checked for quality, misunderstandings, and wrong speaker diarization. This software is based on [OpenAI's Whisper model](https://github.com/openai/whisper) and a first impression on word error rates can be seen [here](https://github.com/openai/whisper?tab=readme-ov-file#available-models-and-languages). See also [this paper](https://doi.org/10.1145/3576915.3624380) for a comparison of different transcription services and their errors.
+
- The output of this software always needs to be checked for quality, misunderstandings, and wrong speaker diarization. This software is based on [OpenAI's Whisper model](https://github.com/openai/whisper). Typical word error rates can be seen [here](https://github.com/openai/whisper?tab=readme-ov-file#available-models-and-languages). See also [this paper](https://doi.org/10.1145/3576915.3624380) for a comparison of different transcription services and their errors.

- Like any other large language model, the whisper model can sometimes **hallucinate**. This is especially prevalent in silent audio passages or when background noise is treated as "text" (see [this study from the Cornell University](https://facctconference.org/static/papers24/facct24-111.pdf) for more info about the issue). We use voice activity detection (VAD) to filter out sections without speech as best as possible.

-
More severely, users also reported cases where words where hallucinated that would fit syntactically into the context, but where actually not present in the orginal audio. Such errors are especially hard to catch.
+
More severely, users also reported cases where words were hallucinated that would fit syntactically into the context, but were actually not present in the original audio. Such errors are especially hard to catch.

- The whisper AI can sometimes get **stuck in a loop of repeating text,** especially on longer audio files. If this happens, try to transcribe shorter sections (using the "Start" and "Stop" fields in noScribe), and join them manually.

-
- **Multilingual audio** is now supported, but experimental. Sometimes it can happen that words in other languages than the main language are translated.
+
- **Multilingual audio** is now supported, but experimental. Sometimes it can happen that words in other languages than the main language are translated.

- **Nonverbal expressions** like laughter are not included in the transcript and must be added later in the editor if you need them.

- **Speaker diarization:** In some recordings, the AI used by noScribe may not be able to tell the voices of certain speakers apart, even if they sound quite different to the human ear. Check the results carefully.

-
- It can happen that **punctuations and capitalizations** are lost over time, especially in longer interviews. If you run into this issue, you can
+
- It can happen that **punctuation and capitalization** are lost over time, especially in longer interviews. If you run into this issue, you can
- Try to transcribe shorter sections (using the "Start" and "Stop" fields in noScribe), and join them manually.
-
- Try to use another model, especially "faster-whisper-large-v2", which is less prone to this problem. You have to install this models first as described [in the Wiki](https://github.com/kaixxx/noScribe/wiki/Add-custom-Whisper-models-for-transcription).
+
- Try to use another model, especially "faster-whisper-large-v2", which is less prone to this problem. You have to install this model first as described [in the Wiki](https://github.com/kaixxx/noScribe/wiki/Add-custom-Whisper-models-for-transcription).
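
The word error rates mentioned under Known Issues can be estimated for your own recordings by comparing a raw transcript with its manually corrected version. The following Python sketch uses the common definition of word error rate (word-level edit distance divided by the number of reference words). The function name `word_error_rate` and the simple lowercase, whitespace-based tokenization are assumptions for illustration, not part of noScribe.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    `reference` is the corrected transcript, `hypothesis` the raw output.
    Hypothetical helper; tokenization is a plain lowercase whitespace split.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # Classic dynamic-programming edit distance, keeping only two rows.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing in output
            insertion = curr[j - 1] + 1  # extra word in output
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


print(word_error_rate("a b c d", "a x c d"))  # 0.25
```

Punctuation and capitalization handling strongly affect the result, so values from this sketch are only roughly comparable to published figures.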