	---
	title: Amanu
	emoji: 👁
	colorFrom: yellow
	colorTo: purple
	sdk: gradio
	sdk_version: 3.44.4
	app_file: app.py
	pinned: false
	---

	# This repo's goal is to support the transcription and annotation of audio recordings.

	## Parts

	- `audio.py`: Everything related to audio preprocessing and analysis.
	- `transcription.py`: All code for transcribing audio with faster-whisper.
	- `diarization.py`: Everything related to speaker diarization with pyannote.
	- `textformatting.py`: Everything related to formatting the text into specific output formats.

	## UI parts

	1. Transcription.
	2. Diarization.
	3. Revision.
	4. Output formatting.
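
Steps 1 and 2 produce two separate timelines (transcript segments and speaker turns) that have to be combined before revision. The following is a minimal sketch of one common way to do that, assigning each segment to the speaker turn it overlaps most. The `Span` type, the `assign_speakers` helper, and the sample data are hypothetical and are not taken from this repo's modules.

```python
from dataclasses import dataclass


@dataclass
class Span:
    start: float  # seconds
    end: float    # seconds
    label: str    # transcript text for a segment, speaker name for a turn


def overlap(a: Span, b: Span) -> float:
    """Length in seconds of the time shared by two spans (0.0 if disjoint)."""
    return max(0.0, min(a.end, b.end) - max(a.start, b.start))


def assign_speakers(segments, turns, unknown="UNKNOWN"):
    """Pair each transcript segment with the speaker turn it overlaps most."""
    result = []
    for seg in segments:
        best = max(turns, key=lambda t: overlap(seg, t), default=None)
        has_overlap = best is not None and overlap(seg, best) > 0
        result.append((best.label if has_overlap else unknown, seg.label))
    return result


# Hypothetical sample data standing in for real transcription/diarization output.
segments = [Span(0.0, 2.5, "Hello there."), Span(2.6, 5.0, "Hi, how are you?")]
turns = [Span(0.0, 2.55, "SPEAKER_00"), Span(2.55, 6.0, "SPEAKER_01")]
labeled = assign_speakers(segments, turns)
```

Segments that overlap no turn at all are labeled `UNKNOWN` so they can be fixed by hand in the revision step.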

	## How to access the service?

	Users log in with a username and password that I assign. I manage these credentials manually.
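
A minimal sketch of how manually managed credentials could be checked. Gradio's `launch()` accepts an `auth` argument that can be a callable taking a username and password and returning a boolean, so a function like the one below could be passed to it. The `USERS` table and the `check_login` name are hypothetical, and in a real deployment the credentials would more likely come from the Space's secrets than from source code.

```python
import hmac

# Hypothetical credential table, for illustration only.
USERS = {"alice": "correct horse battery staple"}


def check_login(username: str, password: str) -> bool:
    """Return True only if the username exists and the password matches."""
    expected = USERS.get(username)
    if expected is None:
        return False
    # compare_digest avoids leaking information through comparison timing.
    return hmac.compare_digest(password.encode(), expected.encode())


# Assumed usage with a Gradio app object named `demo`:
#   demo.launch(auth=check_login)
```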

	## Pricing

	1. Calculate the fixed cost of a server running for a long period of time.
	2. Check if I can use the hibernation period to save some money.
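
The two points above come down to simple arithmetic. The sketch below compares an always-on server with one that hibernates when idle. The hourly rate and usage pattern are made-up placeholders, and the calculation assumes that billing stops entirely while the server is hibernating, which needs to be checked against the provider's actual pricing.

```python
HOURS_PER_MONTH = 730  # average hours in a month (8760 / 12)


def monthly_cost(hourly_rate: float, active_hours: float = HOURS_PER_MONTH) -> float:
    """Cost of a server billed per hour, only for the hours it is running."""
    return hourly_rate * active_hours


rate = 0.60  # hypothetical price per hour, not a real quote

always_on = monthly_cost(rate)
# Hypothetical usage: awake 4 hours a day on 22 working days, asleep otherwise.
with_hibernation = monthly_cost(rate, active_hours=4 * 22)
savings = always_on - with_hibernation
```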

	## Development

	- [x] Add word-level timestamps
	- [x] Add word-level accuracy
	- [ ] Add mel spectrogram?
	- [ ] Add Whisper parameters to the interface
	- [x] Add WhisperX
	- [x] Introduce SRT as output
	- [x] Obtain txt with Diarization.
	- [x] Obtain plain txt with segments.
	- [ ] Introduce POS (part-of-speech) tagging.
	- [x] Optional Preprocessing
	- [ ] Transcription box that updates as the text is being written.


	Introduce a tab for analysis, including POS tagging. It might be worth having a visualizer in Streamlit that shows the timestamps and other features. Perhaps corrections as well.
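
For the SRT output item in the checklist above, here is a minimal sketch of turning timed segments into SubRip text. It assumes segments arrive as `(start_seconds, end_seconds, text)` tuples. The function names are hypothetical and this is not the code in `textformatting.py`.

```python
def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as the SRT timestamp HH:MM:SS,mmm."""
    ms = round(seconds * 1000)
    hours, ms = divmod(ms, 3_600_000)
    minutes, ms = divmod(ms, 60_000)
    secs, ms = divmod(ms, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments) -> str:
    """Build SRT text: numbered cues separated by a blank line."""
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text.strip()}\n"
        )
    return "\n".join(blocks)


# Hypothetical sample segments.
srt_text = to_srt([(0.0, 2.5, "Hello there."), (2.6, 65.25, "Hi, how are you?")])
```

Writing `srt_text` to a file with an `.srt` extension should give a subtitle file that common players can load.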

	## Dev

	I used Git LFS with the Hugging Face CLI to handle large files:

	```
	git lfs install
	```

	```
	huggingface-cli lfs-enable-largefiles .
	```