metadata
title: Diego GenAI LLM multi-model story telling fun
emoji: 🤗
sdk: gradio
sdk_version: 4.24.0
license: cc-by-nc-sa-4.0
short_description: Diego's GenAI LLM multi-model story telling fun
colorFrom: yellow
colorTo: gray
app_file: app.py
Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
Result
- Multi-models in action
- Story Telling
  - Given an image
  - Generate a caption for the image
  - Generate a background story from the caption
- Models and libraries used:
  - Salesforce/blip-image-captioning-base for image captioning
  - gpt2 for text generation
  - gTTS for text-to-speech (gTTS is a Python library and CLI tool that interfaces with Google Translate's text-to-speech API)
  - openai/whisper-large-v2 for speech recognition
  - the sentiment-analysis pipeline task for sentiment analysis of the story text
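
The steps listed above can be chained roughly as follows. This is a minimal sketch and not the code in app.py: the function names `build_pipelines`, `tell_story`, and `speak`, the `max_new_tokens` default, and the output path `story.mp3` are assumptions made for illustration. It assumes the `transformers` and `gtts` packages are installed, and it leaves out openai/whisper-large-v2 because this README does not say where speech recognition sits in the flow.

```python
from typing import Any, Callable, Dict


def build_pipelines() -> Dict[str, Callable[..., Any]]:
    """Load the models named in the list above (weights download on first use)."""
    # Imported lazily because transformers is a heavy dependency.
    from transformers import pipeline

    return {
        "caption": pipeline("image-to-text", model="Salesforce/blip-image-captioning-base"),
        "story": pipeline("text-generation", model="gpt2"),
        # No model is named in the README, so the task's default model is used.
        "sentiment": pipeline("sentiment-analysis"),
    }


def tell_story(
    image: Any,
    pipes: Dict[str, Callable[..., Any]],
    max_new_tokens: int = 60,  # assumed length limit, not taken from app.py
) -> Dict[str, Any]:
    """Caption the image, extend the caption into a story, then score its sentiment."""
    caption = pipes["caption"](image)[0]["generated_text"]
    story = pipes["story"](caption, max_new_tokens=max_new_tokens)[0]["generated_text"]
    mood = pipes["sentiment"](story)[0]
    return {
        "caption": caption,
        "story": story,
        "sentiment": mood["label"],
        "score": mood["score"],
    }


def speak(text: str, path: str = "story.mp3") -> str:
    """Save the story as speech with gTTS (needs network access) and return the file path."""
    from gtts import gTTS

    gTTS(text=text, lang="en").save(path)
    return path
```

A typical call would be `result = tell_story("photo.jpg", build_pipelines())` followed by `speak(result["story"])`, where `photo.jpg` is a placeholder for any local image path. Passing the pipelines in as a dictionary keeps the chaining logic separate from model loading, so the flow can be exercised with lightweight stand-ins.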
Result UI:
Audio Result: