victor HF staff committed on
Commit
5c85be0
1 Parent(s): f1e8279

docs: Add comprehensive README with detailed app description and usage guide

  1. README.md +63 -1
README.md CHANGED
@@ -13,4 +13,66 @@ models:
  - Qwen/Qwen2.5-Coder-32B-Instruct
  ---
 
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
+ # 🏞 Video Composer
+
+ Video Composer is a media processing application that creates videos from your media assets based on natural language instructions. It uses the Qwen2.5-Coder language model to generate FFmpeg commands that match your requirements.
+
+ ## How It Works
+
+ 1. **Upload Media Files**:
+    - Supports multiple file formats, including:
+      - Images: .png, .jpg, .jpeg, .tiff, .bmp, .gif, .svg
+      - Audio: .mp3, .wav, .ogg
+      - Video: .mp4, .avi, .mov, .mkv, .flv, .wmv, .webm, and more
+    - File size limit: 10MB per file
+    - Video duration limit: 2 minutes
+
+ 2. **Provide Instructions**:
+    - Write natural language instructions describing how you want to process your media
+    - Examples:
+      - "Convert these images into a slideshow with 1 second per image"
+      - "Add this audio track to the video"
+      - "Make the video play 2x faster"
+      - "Create a waveform visualization for this audio file"
+
+ 3. **Advanced Parameters**:
+    - Top-p (nucleus sampling): Controls diversity of generated commands (0-1)
+    - Temperature: Controls randomness in command generation (0-5)
+
+ 4. **Processing**:
+    - Analyzes your files and instructions
+    - Generates an FFmpeg command using Qwen2.5-Coder
+    - Executes the command and returns the processed video
+    - Displays the generated FFmpeg command for transparency
+
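To make the steps above more concrete, here is a hand-written sketch of the kind of FFmpeg argument list that could correspond to the example instruction "Make the video play 2x faster". The function name `speed_up_command` and the file names are hypothetical, and the sketch assumes the input has an audio stream. The commands the model actually generates may look different.

```python
import shlex


def speed_up_command(src: str, dst: str, factor: float = 2.0) -> list[str]:
    """Build an FFmpeg argument list that plays `src` `factor` times faster.

    Illustrative only: this is not the command the app is known to emit.
    """
    if factor <= 0:
        raise ValueError("factor must be positive")
    return [
        "ffmpeg",
        "-y",                                 # overwrite the output file if it exists
        "-i", src,                            # input file
        "-filter:v", f"setpts=PTS/{factor}",  # compress video timestamps
        "-filter:a", f"atempo={factor}",      # speed up audio (atempo accepts a limited range)
        dst,
    ]


# Print the command as a single shell-quoted string for inspection.
print(shlex.join(speed_up_command("input.mp4", "output.mp4")))
```

Building the command as a list rather than a single string means it can be passed to `subprocess.run` without going through a shell.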
+ ## Features
+
+ - **Smart Command Generation**: Generates FFmpeg commands from natural language input
+ - **Error Handling**: Validates commands before execution and retries with alternative approaches if needed
+ - **Multiple Asset Support**: Processes multiple media files in a single operation
+ - **Waveform Visualization**: Special support for audio visualization with customizable parameters
+ - **Image Sequence Processing**: Handles image sequences for slideshow creation
+ - **Format Conversion**: Supports various input/output format conversions
+ - **Example Gallery**: Built-in examples demonstrating common use cases
+
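The README does not say how commands are validated or retried, so the following is only a minimal sketch of one possible approach. It assumes a command is acceptable when it parses cleanly, starts with `ffmpeg`, has an `-i` input, and writes an `.mp4` file. The names `is_plausible_ffmpeg_command` and `first_valid_command`, and the `generate` callable standing in for the model, are hypothetical.

```python
import shlex
from typing import Callable, Optional


def is_plausible_ffmpeg_command(command: str) -> bool:
    """Cheap structural check on a generated command string (assumed rules)."""
    try:
        args = shlex.split(command)
    except ValueError:  # for example, unbalanced quotes
        return False
    return (
        len(args) >= 4
        and args[0] == "ffmpeg"
        and "-i" in args[1:-1]
        and args[-1].lower().endswith(".mp4")
    )


def first_valid_command(
    generate: Callable[[int], str], max_attempts: int = 3
) -> Optional[str]:
    """Ask `generate` for candidates until one passes the check."""
    for attempt in range(max_attempts):
        candidate = generate(attempt)
        if is_plausible_ffmpeg_command(candidate):
            return candidate
    return None  # every attempt failed validation


# Stub generator: the first reply is malformed, the second is usable.
replies = ["ffmpeg -i 'broken", "ffmpeg -i in.mp4 out.mp4"]
print(first_valid_command(lambda attempt: replies[attempt], max_attempts=2))
```

A structural check like this only catches malformed output. It does not guarantee that FFmpeg will accept the filters or that the result matches the instruction.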
+ ## Technical Details
+
+ - Built with Gradio for the user interface
+ - Uses FFmpeg for media processing
+ - Powered by Qwen2.5-Coder for command generation
+ - Implements robust error handling and command validation
+ - Processes files in a temporary directory for safety
+ - Supports both simple operations and complex media transformations
+
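As an illustration of processing files in a temporary directory, the sketch below copies the inputs into a fresh directory and removes it when the work is done. The helper name `staged_workdir` and the directory prefix are hypothetical, and the app's own implementation may differ.

```python
import shutil
import tempfile
from contextlib import contextmanager
from pathlib import Path
from typing import Iterator


@contextmanager
def staged_workdir(input_paths: list[str]) -> Iterator[Path]:
    """Copy inputs into a fresh temporary directory and delete it afterwards."""
    with tempfile.TemporaryDirectory(prefix="video-composer-") as tmp:
        workdir = Path(tmp)
        for src in input_paths:
            # Keep only the file name so the command can use relative paths.
            shutil.copy(src, workdir / Path(src).name)
        yield workdir
    # The directory and everything in it is removed on exit.


# Typical use (not executed here):
#   with staged_workdir(["clip.mp4"]) as workdir:
#       subprocess.run(ffmpeg_args, cwd=workdir, check=True)
```

Running the command with the temporary directory as its working directory keeps generated output away from the original uploads.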
+ ## Limitations
+
+ - Maximum file size: 10MB per file
+ - Maximum video duration: 2 minutes
+ - Output format: Always MP4
+ - Processing time may vary based on input complexity
+
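The size and duration limits above could be checked along these lines. This is a sketch under assumptions: "10MB" is read as 10 MiB, the duration is supplied by the caller (for example, from a probe of the file), and `limit_violations` is a hypothetical helper name.

```python
import os
from typing import Optional

MAX_FILE_BYTES = 10 * 1024 * 1024  # assumption: "10MB" interpreted as 10 MiB
MAX_VIDEO_SECONDS = 2 * 60         # 2 minutes


def limit_violations(path: str, duration_seconds: Optional[float] = None) -> list[str]:
    """Return human-readable problems for one uploaded file (empty if none)."""
    problems = []
    name = os.path.basename(path)
    size = os.path.getsize(path)
    if size > MAX_FILE_BYTES:
        problems.append(f"{name}: {size} bytes exceeds the {MAX_FILE_BYTES}-byte limit")
    # Duration only applies when the caller knows it (video inputs).
    if duration_seconds is not None and duration_seconds > MAX_VIDEO_SECONDS:
        problems.append(
            f"{name}: {duration_seconds:.1f}s exceeds the {MAX_VIDEO_SECONDS}s limit"
        )
    return problems
```

Returning a list of messages rather than raising on the first problem lets the interface report every violation at once.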
+ ## Contributing
+
+ If you have ideas for improvements or bug fixes, please open a PR:
+
+ [![Open a Pull Request](https://huggingface.co/datasets/huggingface/badges/raw/main/open-a-pr-lg-light.svg)](https://huggingface.co/spaces/huggingface-projects/video-composer-gpt4/discussions)