title: Amanu
emoji: π
colorFrom: yellow
colorTo: purple
sdk: gradio
sdk_version: 3.44.4
app_file: app.py
pinned: false
# This repo's goal is to support the transcription and annotation of audio.
## Parts
- `audio.py`: Everything related to audio preprocessing and analysis.
- `transcription.py`: All code for transcribing audio using fast-whisper.
- `diarization.py`: Everything related to pyannotation.
- `textformatting.py`: Everything related to formatting the text into specific output formats.
## UI parts
1. Transcription.
2. Diarization.
3. Revision.
4. Output formatting.
## How to access the service?
Users log in with a username and password that I assign. I manage these credentials manually.
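
A minimal sketch of how manually managed credentials could be checked. The `USERS` table and the `check_credentials` name are hypothetical and not part of this repo. Gradio's `launch` accepts an `auth` argument that can be a callable taking a username and a password and returning a boolean, so a function like this could be plugged in there.

```python
import hmac

# Hypothetical credential table (illustration only). In practice these
# values would come from environment variables or Space secrets, not
# from source code.
USERS = {"alice": "correct-horse"}


def check_credentials(username: str, password: str) -> bool:
    """Return True only if the user exists and the password matches."""
    expected = USERS.get(username)
    if expected is None:
        return False
    # compare_digest avoids leaking information through timing differences.
    return hmac.compare_digest(password.encode(), expected.encode())


# Assumed usage with Gradio (not executed here):
# demo.launch(auth=check_credentials)
```

Keeping the check in one function means adding or revoking a user only touches the credential table.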
## Pricing
1. Calculate the fixed cost of a server running for a long period of time.
2. Check if I can use the hibernation period to save some money.
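
The two steps above can be sketched as simple arithmetic. The hourly rate and the active hours below are placeholder numbers, not real prices, and the sketch assumes that a hibernating server is not billed at all, which needs to be checked against the provider's actual pricing.

```python
def monthly_cost(hourly_rate: float, active_hours_per_day: float, days: int = 30) -> float:
    """Cost of a server billed per hour, only while it is running."""
    if not 0 <= active_hours_per_day <= 24:
        raise ValueError("active_hours_per_day must be between 0 and 24")
    return hourly_rate * active_hours_per_day * days


# Placeholder rate (hypothetical, not a quoted price).
RATE = 0.5

always_on = monthly_cost(RATE, 24)        # server never sleeps
with_hibernation = monthly_cost(RATE, 4)  # assumed 4 active hours per day
savings = always_on - with_hibernation

print(f"always on: {always_on:.2f}, with hibernation: {with_hibernation:.2f}, saved: {savings:.2f}")
```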
## Development
- [x] Add word-level timestamps
- [x] Add accuracy at word level
- [ ] Add mel spectrogram?
- [ ] Add Whisper parameters to the interface
- [x] Add WhisperX
- [x] Introduce SRT as output
- [x] Obtain txt with diarization.
- [x] Obtain plain txt with segments.
- [ ] Introduce POS.
- [x] Optional preprocessing
- [ ] Transcription box that updates as the text is being written.

Introduce a tab for analysis, including POS. It might be useful to have a visualizer in Streamlit showing the timestamps and other features. Perhaps corrections as well.
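
As an illustration of the SRT output item above, here is a minimal sketch of turning timed segments into SRT text. The segment shape (dicts with `start`, `end` in seconds and `text`) and the function names are assumptions for the example and may differ from what `textformatting.py` actually does.

```python
def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments) -> str:
    """Build SRT text from segments shaped like {'start', 'end', 'text'} (assumed shape)."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        start = srt_timestamp(seg["start"])
        end = srt_timestamp(seg["end"])
        # Each cue: running index, time range, text, then a blank separator line.
        blocks.append(f"{index}\n{start} --> {end}\n{seg['text'].strip()}\n")
    return "\n".join(blocks)


example = [
    {"start": 0.0, "end": 1.25, "text": " Hello "},
    {"start": 1.25, "end": 3.0, "text": "world"},
]
print(segments_to_srt(example))
```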
## Dev
I used Hugging Face LFS:
``` | |
git lfs install
``` | |
``` | |
huggingface-cli lfs-enable-largefiles . | |
```