---
title: MediVox - AI Doctor with Vision and Voice
emoji: 👨‍⚕️
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 4.16.0
app_file: app.py
pinned: false
---
# AI Doctor with Vision and Voice
This is an AI-powered medical assistant that can:
- Accept voice input from patients
- Analyze medical images
- Provide medical insights using RAG (Retrieval Augmented Generation)
- Respond with natural voice output
## Features
- Speech-to-Text using Whisper
- Image Analysis using LLaVA
- RAG using FAISS and medical knowledge base
- Text-to-Speech using ElevenLabs
- Context-aware responses using medical domain knowledge
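The RAG feature amounts to retrieving the knowledge-base passages closest to the question and placing them in the prompt. The sketch below illustrates only that retrieval step and is not this Space's code: a toy word-count vector with brute-force cosine similarity stands in for the sentence-transformers embeddings and the FAISS index, and `embed`, `retrieve`, and `build_prompt` are hypothetical names.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: lowercase word counts (stand-in for a real sentence embedding)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, passages, k=2):
    """Return the k passages most similar to the question (brute force, no index)."""
    q = embed(question)
    ranked = sorted(passages, key=lambda p: cosine(q, embed(p)), reverse=True)
    return ranked[:k]


def build_prompt(question, passages, k=2):
    """Prepend the retrieved passages to the question as context for the LLM."""
    context = "\n".join("- " + p for p in retrieve(question, passages, k))
    return f"Context:\n{context}\n\nQuestion: {question}"
```

In a real deployment the `embed` and `retrieve` steps would presumably be replaced by an embedding model and a vector index; the prompt-building step stays the same shape.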
## Environment Variables Required
```bash
GROQ_API_KEY=your_groq_api_key
ELEVENLABS_API_KEY=your_elevenlabs_api_key
```
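A missing key typically surfaces only when the first API call fails, so it can help to check for both variables at startup. A minimal sketch, assuming the app reads them from the process environment; `missing_env_vars` and `require_env` are hypothetical helpers, not part of this Space:

```python
import os

# The two variables this README lists as required.
REQUIRED_VARS = ("GROQ_API_KEY", "ELEVENLABS_API_KEY")


def missing_env_vars(env=None):
    """Return the names of required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


def require_env(env=None):
    """Raise a descriptive error if any required variable is missing."""
    missing = missing_env_vars(env)
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
```

Calling `require_env()` at the top of `app.py` would fail fast with a readable message instead of an authentication error later on.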
## Usage
1. Click the microphone button to record your question
2. Upload or take a picture of the medical condition
3. Wait for the AI doctor to analyze and respond
4. Listen to the voice response or read the text output
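The steps above map onto a simple three-stage flow: transcribe the recording, analyze the transcript together with the image, then synthesize speech from the answer. The sketch below shows that flow under stated assumptions; `run_consultation`, `Consultation`, and the three injected callables are hypothetical placeholders for the speech-to-text, vision, and text-to-speech calls, and the actual `app.py` may be organized differently.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Consultation:
    transcript: str            # text recognized from the recording
    answer: str                # text response shown to the user
    audio_path: Optional[str]  # path of the spoken response, if one was produced


def run_consultation(
    audio_path: Optional[str],
    image_path: Optional[str],
    transcribe: Callable[[str], str],
    analyze: Callable[[str, Optional[str]], str],
    synthesize: Callable[[str], str],
) -> Consultation:
    """Run one question through the speech -> vision/text -> speech stages."""
    # Step 1: speech-to-text (skipped when nothing was recorded).
    transcript = transcribe(audio_path) if audio_path else ""
    # Steps 2-3: answer from the transcript plus the optional image.
    answer = analyze(transcript, image_path)
    # Step 4: text-to-speech, only when there is something to say.
    speech = synthesize(answer) if answer else None
    return Consultation(transcript, answer, speech)
```

Passing the model calls in as functions keeps the flow testable with stubs and makes it easy to swap a provider without touching the orchestration.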
## Model Details
- Vision Model: Llama 3.2 90B Vision
- Speech-to-Text: Whisper Large V3
- Text Generation: LLM served through the Groq API
- Voice Generation: ElevenLabs
- Embeddings: sentence-transformers/all-MiniLM-L6-v2
## Citation
If you use this space, please cite:
```
@misc{medivoicebot2024,
author = {Gaurav Gulati},
title = {AI Doctor with Vision and Voice},
year = {2024},
publisher = {Hugging Face Spaces},
}
```