---
title: AI Video Composer
short_description: Create videos with FFmpeg + Qwen2.5-Coder
emoji: 🏞
colorFrom: red
colorTo: yellow
sdk: gradio
sdk_version: 5.6.0
app_file: app.py
pinned: false
disable_embedding: true
models:
- Qwen/Qwen2.5-Coder-32B-Instruct
---
# 🏞 AI Video Composer
AI Video Composer turns natural language instructions into videos built from your media assets. It uses the Qwen2.5-Coder language model to generate an FFmpeg command for each request, then runs that command on your files.
## How It Works
1. **Upload Media Files**:
   - Supports multiple file formats including:
     - Images: .png, .jpg, .jpeg, .tiff, .bmp, .gif, .svg
     - Audio: .mp3, .wav, .ogg
     - Video: .mp4, .avi, .mov, .mkv, .flv, .wmv, .webm, and more
   - File size limit: 10MB per file
   - Video duration limit: 2 minutes
2. **Provide Instructions**:
   - Write natural language instructions describing how you want to process your media
   - Examples:
     - "Convert these images into a slideshow with 1 second per image"
     - "Add this audio track to the video"
     - "Make the video play 2x faster"
     - "Create a waveform visualization for this audio file"
3. **Advanced Parameters**:
   - Top-p (nucleus sampling): Controls diversity of generated commands (0-1)
   - Temperature: Controls randomness in command generation (0-5)
4. **Processing**:
   - The app analyzes your files and instructions
   - Generates an optimized FFmpeg command using Qwen2.5-Coder
   - Executes the command and returns the processed video
   - Displays the generated FFmpeg command for transparency
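
The steps above can be sketched roughly as follows. This is a minimal illustration, not the app's actual code: `build_prompt` and `parse_ffmpeg_command` are hypothetical helper names, and the prompt wording and the sample model reply are assumptions. It shows how the uploaded file names and the instruction might be combined into a prompt, and how a reply could be split into an argument list that is accepted only if it invokes `ffmpeg`.

```python
import shlex


def build_prompt(file_names, instruction):
    """Combine the uploaded file names and the user's instruction into one prompt.

    Hypothetical helper: the real prompt wording is an assumption.
    """
    listing = "\n".join(f"- {name}" for name in file_names)
    return (
        "Available files:\n"
        f"{listing}\n"
        f"Instruction: {instruction}\n"
        "Reply with a single ffmpeg command that writes output.mp4."
    )


def parse_ffmpeg_command(reply):
    """Split a model reply into arguments, rejecting anything that is not ffmpeg."""
    args = shlex.split(reply.strip())
    if not args or args[0] != "ffmpeg":
        raise ValueError("expected a command starting with 'ffmpeg'")
    return args


prompt = build_prompt(["clip.mp4"], "Make the video play 2x faster")

# A reply the model might plausibly return for that instruction (illustrative only).
reply = 'ffmpeg -i clip.mp4 -filter:v "setpts=0.5*PTS" -filter:a "atempo=2.0" output.mp4'
command = parse_ffmpeg_command(reply)
```

Keeping the command as an argument list means it can be handed to `subprocess.run(command)` without `shell=True`, so the model's output is never interpreted by a shell.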
## Features
- **Smart Command Generation**: Generates FFmpeg commands directly from natural language input
- **Error Handling**: Validates commands before execution and retries with alternative approaches if needed
- **Multiple Asset Support**: Process multiple media files in a single operation
- **Waveform Visualization**: Special support for audio visualization with customizable parameters
- **Image Sequence Processing**: Efficient handling of image sequences for slideshow creation
- **Format Conversion**: Support for various input/output format conversions
- **Example Gallery**: Built-in examples demonstrating common use cases
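
The validate-then-retry behaviour listed under Error Handling could look something like the sketch below. It is an assumption about the general shape, not the app's implementation: `generate_with_retry` and `must_write_mp4` are hypothetical names, and the `generate` callable stands in for a call to the model that receives the previous error as feedback.

```python
def generate_with_retry(generate, validate, max_attempts=3):
    """Request a command up to max_attempts times, feeding back the last error.

    generate(previous_error) returns a candidate argument list.
    validate(command) returns an error string, or None if the command is acceptable.
    Both callables are hypothetical stand-ins for the model call and the check.
    """
    error = None
    for attempt in range(1, max_attempts + 1):
        command = generate(error)
        error = validate(command)
        if error is None:
            return command, attempt
    raise RuntimeError(f"no valid command after {max_attempts} attempts: {error}")


def must_write_mp4(command):
    """Toy validator: accept only ffmpeg commands whose last argument ends in .mp4."""
    if not command or command[0] != "ffmpeg":
        return "command must start with ffmpeg"
    if not command[-1].endswith(".mp4"):
        return "output must be an .mp4 file"
    return None
```

In this sketch the validator only inspects the argument list; a fuller check might also run the command and treat a non-zero exit code as a reason to retry.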
## Technical Details
- Built with Gradio for the user interface
- Uses FFmpeg for media processing
- Powered by Qwen2.5-Coder for command generation
- Validates generated commands before running them and handles failures by retrying
- Processes files in a temporary directory for safety
- Supports both simple operations and complex media transformations
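
As a rough illustration of processing files in a temporary directory, the sketch below copies inputs into a scratch directory that is removed automatically afterwards. `stage_inputs` is a hypothetical helper and the placeholder file stands in for a real upload; the app's own staging logic may differ.

```python
import shutil
import tempfile
from pathlib import Path


def stage_inputs(paths, workdir):
    """Copy input files into workdir and return the staged paths."""
    staged = []
    for path in paths:
        target = Path(workdir) / Path(path).name
        shutil.copy(path, target)
        staged.append(target)
    return staged


with tempfile.TemporaryDirectory() as source_dir, tempfile.TemporaryDirectory() as workdir:
    sample = Path(source_dir) / "photo.png"
    sample.write_bytes(b"placeholder")  # stand-in for an uploaded file
    staged = stage_inputs([sample], workdir)
    staged_names = [p.name for p in staged]
    existed_inside = staged[0].exists()

# Both directories, including the staged copy, have been deleted at this point.
cleaned_up = not staged[0].exists()
```

Working on copies keeps the original uploads untouched, and the context manager cleans up even if processing raises an exception.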
## Limitations
- Maximum file size: 10MB per file
- Maximum video duration: 2 minutes
- Output format: Always MP4
- Processing time may vary based on input complexity
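
The size and duration limits can be expressed as a small pre-flight check. This is a sketch under stated assumptions: `check_limits` is a hypothetical helper, "10MB" is interpreted as 10 MiB, and the caller is assumed to already know the video duration in seconds.

```python
MAX_FILE_BYTES = 10 * 1024 * 1024  # assumption: "10MB" is read as 10 MiB
MAX_VIDEO_SECONDS = 2 * 60         # 2 minutes


def check_limits(size_bytes, duration_seconds=None):
    """Return a list of limit violations; an empty list means the file is acceptable.

    duration_seconds is None for images and other files without a duration.
    """
    problems = []
    if size_bytes > MAX_FILE_BYTES:
        problems.append("file exceeds 10MB")
    if duration_seconds is not None and duration_seconds > MAX_VIDEO_SECONDS:
        problems.append("video exceeds 2 minutes")
    return problems
```

For example, `check_limits(11 * 1024 * 1024, 150)` reports both violations, while a 3 MiB image with no duration passes.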
## Contributing
If you have ideas for improvements or bug fixes, please open a PR:
[![Open a Pull Request](https://huggingface.co/datasets/huggingface/badges/raw/main/open-a-pr-lg-light.svg)](https://huggingface.co/spaces/huggingface-projects/video-composer-gpt4/discussions)