docs: Add comprehensive README with detailed app description and usage guide
README.md
CHANGED
@@ -13,4 +13,66 @@ models:
|
|
13 |
- Qwen/Qwen2.5-Coder-32B-Instruct
|
14 |
---
|
15 |
|
16 |
-
13 |
- Qwen/Qwen2.5-Coder-32B-Instruct
|
14 |
---
|
15 |
|
16 |
+
# 🏞 Video Composer
|
17 |
+
|
18 |
+
Video Composer is a media processing application that turns natural language instructions into videos built from your media assets. It uses the Qwen2.5-Coder language model to generate FFmpeg commands that match your requirements.
|
19 |
+
|
20 |
+
## How It Works
|
21 |
+
|
22 |
+
1. **Upload Media Files**:
|
23 |
+
- Supports multiple file formats including:
|
24 |
+
- Images: .png, .jpg, .jpeg, .tiff, .bmp, .gif, .svg
|
25 |
+
- Audio: .mp3, .wav, .ogg
|
26 |
+
- Video: .mp4, .avi, .mov, .mkv, .flv, .wmv, .webm, and more
|
27 |
+
- File size limit: 10MB per file
|
28 |
+
- Video duration limit: 2 minutes
|
29 |
+
|
30 |
+
2. **Provide Instructions**:
|
31 |
+
- Write natural language instructions describing how you want to process your media
|
32 |
+
- Examples:
|
33 |
+
- "Convert these images into a slideshow with 1 second per image"
|
34 |
+
- "Add this audio track to the video"
|
35 |
+
- "Make the video play 2x faster"
|
36 |
+
- "Create a waveform visualization for this audio file"
|
37 |
+
|
38 |
+
3. **Advanced Parameters**:
|
39 |
+
- Top-p (nucleus sampling): Controls the diversity of generated commands (range: 0 to 1)
|
40 |
+
- Temperature: Controls the randomness of command generation (range: 0 to 5)
|
41 |
+
|
42 |
+
4. **Processing**:
|
43 |
+
- The app analyzes your files and instructions
|
44 |
+
- Generates an optimized FFmpeg command using Qwen2.5-Coder
|
45 |
+
- Executes the command and returns the processed video
|
46 |
+
- Displays the generated FFmpeg command for transparency
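
As a rough illustration of step 4, the sketch below builds the kind of FFmpeg argument list that could correspond to two of the example instructions ("1 second per image" and "2x faster"). The function names and file names are hypothetical, and the commands the app actually generates are not shown in this README and may differ.

```python
from typing import List


def slideshow_command(pattern: str, seconds_per_image: float, output: str) -> List[str]:
    """Build an FFmpeg argument list that turns an image sequence into an MP4."""
    if seconds_per_image <= 0:
        raise ValueError("seconds_per_image must be positive")
    return [
        "ffmpeg", "-y",
        # Input rate: one image every `seconds_per_image` seconds.
        "-framerate", f"{1 / seconds_per_image:g}",
        "-i", pattern,          # e.g. "img%03d.png" (hypothetical naming scheme)
        "-c:v", "libx264",
        "-r", "30",             # output frame rate
        "-pix_fmt", "yuv420p",  # widely compatible pixel format
        output,
    ]


def speed_up_command(source: str, factor: float, output: str) -> List[str]:
    """Build an FFmpeg argument list that plays video and audio `factor` times faster.

    Assumes the input has an audio stream and that `factor` is within the
    range the installed FFmpeg accepts for the `atempo` filter.
    """
    if factor <= 0:
        raise ValueError("factor must be positive")
    return [
        "ffmpeg", "-y",
        "-i", source,
        "-filter:v", f"setpts=PTS/{factor:g}",  # shrink video timestamps
        "-filter:a", f"atempo={factor:g}",      # speed up audio to match
        output,
    ]
```

Passing the arguments as a list (rather than one shell string) avoids quoting problems when file names contain spaces.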
|
47 |
+
|
48 |
+
## Features
|
49 |
+
|
50 |
+
- **Smart Command Generation**: Automatically generates optimal FFmpeg commands based on natural language input
|
51 |
+
- **Error Handling**: Validates commands before execution and retries with alternative approaches if needed
|
52 |
+
- **Multiple Asset Support**: Process multiple media files in a single operation
|
53 |
+
- **Waveform Visualization**: Special support for audio visualization with customizable parameters
|
54 |
+
- **Image Sequence Processing**: Efficient handling of image sequences for slideshow creation
|
55 |
+
- **Format Conversion**: Support for various input/output format conversions
|
56 |
+
- **Example Gallery**: Built-in examples demonstrating common use cases
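
To make the waveform feature more concrete, here is a minimal sketch of a command that renders an audio file as a video using FFmpeg's `showwaves` filter. The function name, default size, and file names are assumptions for illustration; the README does not state which filter or parameters the app uses.

```python
from typing import List

# Drawing modes accepted by FFmpeg's showwaves filter.
SHOWWAVES_MODES = {"point", "line", "p2p", "cline"}


def waveform_command(audio: str, output: str, width: int = 1280,
                     height: int = 720, mode: str = "line") -> List[str]:
    """Build an FFmpeg argument list that draws the audio waveform as video."""
    if mode not in SHOWWAVES_MODES:
        raise ValueError(f"unknown showwaves mode: {mode}")
    # Render the first input's audio as a waveform and label the result [v].
    graph = f"[0:a]showwaves=s={width}x{height}:mode={mode},format=yuv420p[v]"
    return [
        "ffmpeg", "-y",
        "-i", audio,
        "-filter_complex", graph,
        "-map", "[v]",   # generated waveform video
        "-map", "0:a",   # original audio track
        "-c:v", "libx264",
        "-c:a", "aac",
        output,
    ]
```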
|
57 |
+
|
58 |
+
## Technical Details
|
59 |
+
|
60 |
+
- Built with Gradio for the user interface
|
61 |
+
- Uses FFmpeg for media processing
|
62 |
+
- Powered by Qwen2.5-Coder for command generation
|
63 |
+
- Implements robust error handling and command validation
|
64 |
+
- Processes files in a temporary directory for safety
|
65 |
+
- Supports both simple operations and complex media transformations
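
The validation, retry, and temporary-directory points above could be combined roughly as follows. This is a sketch under stated assumptions, not the app's actual code: the helper names are hypothetical, the "validation" is only a cheap syntactic check, and the candidate commands stand in for whatever alternatives the model proposes.

```python
import shlex
import subprocess
import tempfile
from typing import Callable, Iterable, List, Optional


def is_plausible_ffmpeg_command(command: str) -> bool:
    """Cheap pre-execution check: parses as shell words and invokes ffmpeg."""
    try:
        args = shlex.split(command)
    except ValueError:  # e.g. unbalanced quotes
        return False
    return len(args) > 1 and args[0] == "ffmpeg"


def run_in_dir(args: List[str], workdir: str) -> bool:
    """Run the command without a shell inside `workdir`; True on exit code 0."""
    try:
        result = subprocess.run(args, cwd=workdir, capture_output=True)
    except OSError:  # e.g. ffmpeg is not installed
        return False
    return result.returncode == 0


def first_working_command(
    candidates: Iterable[str],
    runner: Callable[[List[str], str], bool] = run_in_dir,
) -> Optional[str]:
    """Try each candidate in a throwaway directory; return the first that succeeds."""
    with tempfile.TemporaryDirectory() as workdir:
        for command in candidates:
            if not is_plausible_ffmpeg_command(command):
                continue  # skip anything that is not an ffmpeg invocation
            if runner(shlex.split(command), workdir):
                # A real app would copy the output out of `workdir` here,
                # because the directory is deleted when the block exits.
                return command
    return None
```

The `runner` parameter exists so the retry logic can be exercised without FFmpeg installed, for example by passing a stub that records its calls.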
|
66 |
+
|
67 |
+
## Limitations
|
68 |
+
|
69 |
+
- Maximum file size: 10MB per file
|
70 |
+
- Maximum video duration: 2 minutes
|
71 |
+
- Output format: Always MP4
|
72 |
+
- Processing time may vary based on input complexity
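
The size, duration, and format limits could be checked before processing along these lines. This is a minimal sketch with assumptions: the names are hypothetical, "10MB" is read as 10 × 1024 × 1024 bytes, the extension set covers only the formats listed in this README (which says "and more" for video), and the video duration is passed in by the caller rather than probed from the file.

```python
import os
from typing import List, Optional

MAX_FILE_SIZE_BYTES = 10 * 1024 * 1024  # assumption: "10MB" means 10 MiB
MAX_VIDEO_SECONDS = 2 * 60
ALLOWED_EXTENSIONS = {
    ".png", ".jpg", ".jpeg", ".tiff", ".bmp", ".gif", ".svg",  # images
    ".mp3", ".wav", ".ogg",                                    # audio
    ".mp4", ".avi", ".mov", ".mkv", ".flv", ".wmv", ".webm",   # video
}


def check_upload(path: str, video_seconds: Optional[float] = None) -> List[str]:
    """Return a list of limit violations for one uploaded file (empty if none)."""
    problems = []
    extension = os.path.splitext(path)[1].lower()
    if extension not in ALLOWED_EXTENSIONS:
        problems.append(f"unsupported extension: {extension or '(none)'}")
    if os.path.getsize(path) > MAX_FILE_SIZE_BYTES:
        problems.append("file is larger than 10MB")
    if video_seconds is not None and video_seconds > MAX_VIDEO_SECONDS:
        problems.append("video is longer than 2 minutes")
    return problems
```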
|
73 |
+
|
74 |
+
## Contributing
|
75 |
+
|
76 |
+
If you have ideas for improvements or bug fixes, please open a PR:
|
77 |
+
|
78 |
+
[![Open a Pull Request](https://huggingface.co/datasets/huggingface/badges/raw/main/open-a-pr-lg-light.svg)](https://huggingface.co/spaces/huggingface-projects/video-composer-gpt4/discussions)
|