A simple and intuitive AI interface with a convenient modification for ASR decryption.

tatpow/project-aura

Project Aura - Audio in Analysis

About this project

This project was created to make life easier for school and university students. The entire UI is currently in Russian; an English version may be added later. The program converts any* audio file into a .txt file containing a transcript of the recording.

* Audio files are handled with Librosa and FFmpeg (gyan.dev essentials build, version 2025-11-12-git-6cdd2cbe32), so the program will probably work with any audio format.

Important notes

Warning

All neural networks listed below or in the project (file settings.json) are only EXAMPLES. By using them, you automatically agree to their license agreements, if any. For more information about the neural networks used here, see the AI column under AI Examples.

CUDA

I ran my own 'benchmark' using the model bond005/whisper-podlodka-turbo (Apache 2.0), which is the fastest one. The audio file is 2780 seconds long.

I have the following components in my PC:

  • CPU: 12th Gen Intel Core i5-12500H, 2500 MHz
  • Integrated GPU: Iris Xe Graphics
  • GPU: GeForce RTX 3050 Laptop
  • RAM: 16 GB

The laptop was plugged in the whole time. No third-party programs were open. Only one file, in OGG format, was used.

Also, the setup program on my laptop lets you select the operating mode:

  • Quiet (uses the integrated GPU)
  • Efficiency (according to the manufacturer, this mode “balances” between the two GPUs)
  • Turbo (everything runs at maximum)

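Benchmark results:

| Device type | Time (sec) | Laptop mode |
|-------------|------------|-------------|
| GPU         | 330        | Quiet       |
| GPU         | 170        | Efficiency  |
| GPU         | 165        | Turbo       |
| CPU         | > ~2100    | Quiet       |
| CPU         | > ~2100    | Efficiency  |
| CPU         | > ~2100    | Turbo       |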
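To put these timings in perspective, the short sketch below divides the audio length (2780 seconds) by each processing time to get a "times faster than real time" factor. The function and variable names are illustrative and not taken from the project. The CPU time is only a lower bound (more than roughly 2100 seconds), so its factor is an upper bound.

```python
# Numbers come from the benchmark table; the names are illustrative only.
AUDIO_SECONDS = 2780

GPU_RESULTS = {
    "Quiet": 330,
    "Efficiency": 170,
    "Turbo": 165,
}

# The CPU runs took "more than about 2100 s" in every mode.
CPU_LOWER_BOUND_SECONDS = 2100


def speed_factor(audio_seconds: float, processing_seconds: float) -> float:
    """How many seconds of audio are processed per second of wall time."""
    return audio_seconds / processing_seconds


for mode, seconds in GPU_RESULTS.items():
    factor = speed_factor(AUDIO_SECONDS, seconds)
    print(f"GPU, {mode}: {factor:.1f}x real time")

cpu_factor = speed_factor(AUDIO_SECONDS, CPU_LOWER_BOUND_SECONDS)
print(f"CPU, any mode: less than about {cpu_factor:.1f}x real time")
```

On these numbers, the GPU in Turbo mode is at least about 12 times faster than the CPU (2100 / 165).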

I doubt this CPU result is representative: transcription on the CPU was faster on my desktop PC than on this laptop. The OGG file format may be the cause.

AI Examples

Warning

All URLs point to the Hugging Face website. The program runs the models using their Transformers library.
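For readers unfamiliar with Transformers, here is a minimal sketch of transcribing a file to .txt with its `pipeline` API. It is not the project's actual code: the helper names, the 30-second chunk length, and the file names are assumptions, and the model ID is simply the one from the benchmark. It requires `torch`, `transformers`, and FFmpeg to be installed.

```python
from pathlib import Path

# Model used in the benchmark above; any Hugging Face ASR model ID could go here.
MODEL_ID = "bond005/whisper-podlodka-turbo"


def pipeline_kwargs(model_id: str, use_cuda: bool) -> dict:
    """Build the arguments for transformers.pipeline (hypothetical helper)."""
    return {
        "task": "automatic-speech-recognition",
        "model": model_id,
        "device": "cuda:0" if use_cuda else "cpu",
        "chunk_length_s": 30,  # assumption: split long recordings into 30 s chunks
    }


def transcribe_to_txt(audio_path: str, txt_path: str, model_id: str = MODEL_ID) -> Path:
    """Transcribe one audio file and save the text (illustrative sketch)."""
    # Heavy imports are kept local so the helper above stays importable without them.
    import torch
    from transformers import pipeline

    asr = pipeline(**pipeline_kwargs(model_id, torch.cuda.is_available()))
    result = asr(audio_path)  # file paths are decoded through FFmpeg
    out = Path(txt_path)
    out.write_text(result["text"].strip() + "\n", encoding="utf-8")
    return out
```

A call such as `transcribe_to_txt("lecture.ogg", "lecture.txt")` (hypothetical file names) would download the model on first use and write the transcript next to the audio.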

AI Description

You can read about it right here.

Modifying JSON files

If you want to change, for example, the list of models, just update the corresponding JSON file. The files are located in:

_internal\app\json (BUILD)

app/json (SOURCE CODE)
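The files can be edited by hand, but a script works too. The sketch below appends a model ID to a list in a JSON file. The project's real schema is not documented here, so the `"models"` key and the function name are purely hypothetical; check the actual file before adapting this.

```python
import json
from pathlib import Path


def add_model(json_path: str, model_id: str, key: str = "models") -> list:
    """Append model_id to the list stored under `key` (hypothetical schema)."""
    path = Path(json_path)
    data = json.loads(path.read_text(encoding="utf-8"))

    # Create the list if the key is missing; skip IDs that are already present.
    models = data.setdefault(key, [])
    if model_id not in models:
        models.append(model_id)

    # ensure_ascii=False keeps Russian text readable in the saved file.
    path.write_text(json.dumps(data, ensure_ascii=False, indent=4), encoding="utf-8")
    return models
```

For example, `add_model("app/json/settings.json", "bond005/whisper-podlodka-turbo")` would update the source-tree copy, assuming that file stores its models under a top-level `"models"` list.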

Roadmap

  • Add a multi-system for detecting models.
  • Fix the error 'expected str, bytes or os.PathLike object, not NoneType'.
  • Add torchaudio, etc.
  • Safer operation.
  • New UI
  • New architecture
  • More functionality
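About the error in the roadmap: Python raises this message whenever `None` is passed where a file path is expected. Where that happens in this project is not stated, so the sketch below only shows the general pattern of checking a path early and failing with a clearer message. `require_path` is a hypothetical helper, not part of the project.

```python
import os


def require_path(value, name: str) -> str:
    """Return value as a path string, or fail with a readable message."""
    if value is None:
        # Without this check, os.fspath(None) or open(None) raises the
        # TypeError quoted in the roadmap.
        raise ValueError(f"{name} is not set: no file was selected or found")
    return os.fspath(value)


try:
    require_path(None, "audio file")
except ValueError as exc:
    print(exc)
```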

License

MIT © tatpow
