---
title: Diego GenAI LLM multi-model story telling fun
emoji: 🤗
sdk: gradio
sdk_version: 4.24.0
license: cc-by-nc-sa-4.0
short_description: Diego's GenAI LLM multi-model story telling fun
colorFrom: yellow
colorTo: gray
app_file: app.py
---
Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
### Result
* Multiple models in action
* Story telling
  * Given an image
  * Generate a caption for the image
  * Generate a background story from the caption text
* Models and libraries used:
  * Salesforce/blip-image-captioning-base for image captioning
  * gpt2 for text generation
  * gTTS for text to speech; gTTS is a Python library and CLI tool that interfaces with Google Translate's text-to-speech API
  * openai/whisper-large-v2 for speech recognition
  * the pipeline sentiment-analysis task for sentiment analysis of the story text
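
The flow listed above (image, then caption, then story, then sentiment, then audio) could be wired together roughly as follows. This is a minimal sketch and not necessarily what `app.py` does: the function names `tell_story`, `build_pipelines`, and `speak` are hypothetical, the sentiment step relies on whatever default model the `sentiment-analysis` task loads, and the speech recognition step is left out.

```python
from typing import Any, Callable, Dict


def tell_story(image: Any, captioner: Callable, generator: Callable,
               classifier: Callable, max_new_tokens: int = 60) -> Dict[str, Any]:
    """Chain caption -> story -> sentiment using pipeline-style callables."""
    # Image captioning pipelines return a list like [{"generated_text": ...}].
    caption = captioner(image)[0]["generated_text"].strip()
    # Text generation pipelines return the prompt followed by its continuation.
    story = generator(caption, max_new_tokens=max_new_tokens)[0]["generated_text"].strip()
    # Sentiment pipelines return a list like [{"label": ..., "score": ...}].
    sentiment = classifier(story)[0]
    return {
        "caption": caption,
        "story": story,
        "sentiment": sentiment["label"],
        "score": sentiment["score"],
    }


def build_pipelines():
    """Load the models named above (hypothetical helper; imports are deferred
    because loading transformers and the model weights is slow)."""
    from transformers import pipeline

    captioner = pipeline("image-to-text", model="Salesforce/blip-image-captioning-base")
    generator = pipeline("text-generation", model="gpt2")
    classifier = pipeline("sentiment-analysis")  # assumption: task default model
    return captioner, generator, classifier


def speak(text: str, path: str = "audio.mp3") -> str:
    """Write the story to an MP3 file with gTTS (hypothetical helper)."""
    from gtts import gTTS

    gTTS(text=text, lang="en").save(path)
    return path
```

To try it, call `tell_story(image_path, *build_pipelines())` with a path to any local image and pass the resulting `story` to `speak`. Taking the three pipelines as arguments keeps the chaining logic separate from model loading, so it can be exercised with small stand-in functions.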
Result UI:
<img src='result.png' />
Audio Result:
<audio controls>
<source src="audio.mp3" type="audio/mpeg">
</audio>