Point cloud render configurations
Configuration | Default | Explanation |
---|---|---|
--mode | 'single_view_txt' | Currently, the 'single_view_txt' and 'single_view_target' modes are supported |
--traj_txt | None | Required for 'single_view_txt' mode; a txt file that specifies the camera trajectory |
--elevation | 5. | The elevation angle of the input image, in degrees. Estimate a rough value based on your visual judgment |
--center_scale | 1. | Range: (0, 2]. Scale factor for the spherical radius (r). By default, r is set to the depth value of the center pixel (H//2, W//2) of the reference image |
--d_theta | 10. | Range: [-40, 40]. Required for 'single_view_target' mode; specifies the target theta angle as (theta + d_theta) |
--d_phi | 30. | Range: [-45, 45]. Required for 'single_view_target' mode; specifies the target phi angle as (phi + d_phi) |
--d_r | -.2 | Range: [-0.5, 0.5]. Required for 'single_view_target' mode; specifies the target radius as (r + r*d_r) |
The image above illustrates the definition of the world coordinate system.
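The documented ranges in the configuration table can be checked before rendering. The following is a minimal sketch, not the project's own code: the helper names `check_render_config` and `sphere_radius` are hypothetical, the depth map is assumed to be a 2D NumPy array, and combining --center_scale with the center-pixel depth as a plain product is an assumption based on the table's description.

```python
import numpy as np

# Ranges copied from the configuration table: (low, high, low_inclusive).
RANGES = {
    "center_scale": (0.0, 2.0, False),  # (0, 2]
    "d_theta": (-40.0, 40.0, True),     # [-40, 40]
    "d_phi": (-45.0, 45.0, True),       # [-45, 45]
    "d_r": (-0.5, 0.5, True),           # [-0.5, 0.5]
}

def check_render_config(**values):
    """Raise ValueError if any given value falls outside its documented range."""
    for name, value in values.items():
        low, high, low_inclusive = RANGES[name]
        above_low = value >= low if low_inclusive else value > low
        if not (above_low and value <= high):
            raise ValueError(f"--{name}={value} is outside the documented range")

def sphere_radius(depth, center_scale=1.0):
    """Assumed: r = center_scale * depth at the center pixel (H//2, W//2)."""
    h, w = depth.shape
    return center_scale * float(depth[h // 2, w // 2])

# The documented defaults pass the check.
check_render_config(center_scale=1.0, d_theta=10.0, d_phi=30.0, d_r=-0.2)

depth = np.full((4, 6), 2.0)  # toy depth map: every pixel is 2 units away
r = sphere_radius(depth, center_scale=0.5)
```

With a --center_scale of 0.5, the radius under this assumption is half the center-pixel depth, which matches the statement that values below 1 bring the origin closer to the reference camera.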
1. Taking a single reference image as an example, first estimate an elevation angle --elevation that represents the angle at which the image was taken. A value greater than 0 indicates a top-down view; the estimate does not need to be precise.
2. By default, the origin of the world coordinate system is defined at the point in the point cloud that corresponds to the center pixel of the reference image. You can adjust the position of the origin by modifying --center_scale; a value less than 1 brings the origin closer to the reference camera.
3. We use spherical coordinates to represent the camera pose. The initial camera is located at (r, 0, 0). You can specify a target camera pose by setting --mode to 'single_view_target'. As shown in the figure above, a positive --d_phi moves the camera to the right, a positive --d_theta moves the camera down, and a negative --d_r moves the camera forward (closer to the origin). The program interpolates a smooth trajectory between the initial pose and the target pose, then renders the point cloud along that trajectory. Some examples are shown below:
--center_scale | --d_phi | --d_theta | --d_r | Render results |
---|---|---|---|---|
0.5 | 45. | 0. | 0. | |
1. | 45. | 0. | 0. | |
1. | 0. | -30. | 0. | |
1. | 0. | 0. | -0.5 | |
1. | 45. | -30. | -0.5 | |
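The relationship between the initial pose, the offsets, and the rendered trajectory can be sketched as follows. The target formulas (theta + d_theta, phi + d_phi, r + r*d_r) come from the configuration table. Everything else is an assumption for illustration: the function names are hypothetical, (r, 0, 0) is read as spherical (r, theta, phi), and plain linear interpolation stands in for the program's smoothing, which is not described here.

```python
import numpy as np

def target_pose(r, d_theta, d_phi, d_r, theta=0.0, phi=0.0):
    """Target (r, theta, phi) computed from the offsets; angles in degrees."""
    return (r + r * d_r, theta + d_theta, phi + d_phi)

def interpolate_poses(start, end, num_frames=25):
    """Linearly interpolate between two (r, theta, phi) poses.

    Linear interpolation is a stand-in; the real program may use a
    smoother scheme.
    """
    start = np.asarray(start, dtype=float)
    end = np.asarray(end, dtype=float)
    t = np.linspace(0.0, 1.0, num_frames)[:, None]  # shape (num_frames, 1)
    return (1.0 - t) * start + t * end              # shape (num_frames, 3)

r = 2.0                   # toy radius
start = (r, 0.0, 0.0)     # initial camera pose
# Offsets from the last row of the table above.
end = target_pose(r, d_theta=-30.0, d_phi=45.0, d_r=-0.5)
poses = interpolate_poses(start, end, num_frames=5)
```

Because --d_r is applied as r + r*d_r, it acts as a relative change: -0.5 halves the distance to the origin regardless of the scene's scale.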
4. You can also create a camera trajectory by specifying sequences of d_phi, d_theta, and d_r values. Set --mode to 'single_view_txt' and write the sequences in a txt file (example: loop1.txt). The first line of the txt file should contain the target d_phi sequence, the second line the target d_theta sequence, and the third line the target d_r sequence. Each sequence should start with 0 and contain between 2 and 25 values. Then pass the txt file path to --traj_txt. The program interpolates a smooth trajectory based on the sequences you provide. Some examples are shown below:
Target sequences | Trajectory visualization | Render results |
---|---|---|
d_phi: 0 -3 -15 -20 -17 -5 0; d_theta: 0 -2 -5 -10 -8 -5 0 2 5 10 8 5 0; d_r: 0 0 | | |
d_phi: 0 3 10 20 17 10 0; d_theta: 0 -2 -8 -6 0 2 8 6 0; d_r: 0 -0.02 -0.09 -0.18 -0.16 -0.09 0 | | |
d_phi: 0 40; d_theta: 0 -1 -3 -7 -6 -4 0 1 3 7 6 4 0 -1 -3 -7 -6 -4 0 1 3 7 6 4 0; d_r: 0 0 | | |
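A trajectory file in this layout can be written and checked with a few lines of Python. This is a sketch under stated assumptions: the helper names and the file name `my_traj.txt` are hypothetical, values are assumed to be whitespace-separated, and the checks only mirror the rules stated above (three lines, each starting with 0, 2 to 25 values per line). The sequences used here are illustrative and are not one of the examples in the table.

```python
import os
import tempfile

def write_traj_txt(path, d_phi, d_theta, d_r):
    """Write the d_phi, d_theta and d_r sequences on lines 1, 2 and 3."""
    for name, seq in (("d_phi", d_phi), ("d_theta", d_theta), ("d_r", d_r)):
        if not 2 <= len(seq) <= 25:
            raise ValueError(f"{name} must contain 2 to 25 values, got {len(seq)}")
        if seq[0] != 0:
            raise ValueError(f"{name} must start with 0")
    with open(path, "w") as f:
        for seq in (d_phi, d_theta, d_r):
            f.write(" ".join(str(v) for v in seq) + "\n")

def read_traj_txt(path):
    """Read the three sequences back as lists of floats."""
    with open(path) as f:
        lines = [line.split() for line in f if line.strip()]
    if len(lines) != 3:
        raise ValueError("expected exactly three non-empty lines")
    return [[float(v) for v in line] for line in lines]

# Hypothetical output location; any writable path works.
path = os.path.join(tempfile.mkdtemp(), "my_traj.txt")
write_traj_txt(path, d_phi=[0, 10, 25, 45], d_theta=[0, -5, -10], d_r=[0, 0])
d_phi, d_theta, d_r = read_traj_txt(path)
```

The resulting path is what would be passed to --traj_txt when --mode is 'single_view_txt'.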
- Tips: A sequence in which the differences between adjacent values increase in one direction results in a smoother trajectory. Ensure that these differences are not too large; otherwise, they may lead to abrupt camera movements, causing the model to produce artifacts such as content drift.
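One way to apply this tip is to inspect the differences between adjacent values before rendering. The sketch below is an illustration only: the helper names are hypothetical, and the `limit` threshold is an arbitrary value chosen for the example, since no numeric bound on step size is given here.

```python
import numpy as np

def max_step(sequence):
    """Largest absolute difference between adjacent values."""
    steps = np.diff(np.asarray(sequence, dtype=float))
    return float(np.max(np.abs(steps)))

def has_abrupt_step(sequence, limit):
    """True if any adjacent pair differs by more than `limit`.

    `limit` is a hypothetical threshold; tune it by inspecting renders.
    """
    return max_step(sequence) > limit

gentle = [0, 2, 5, 10, 17]  # steps of 2, 3, 5, 7 grow in one direction
abrupt = [0, 2, 40, 41]     # contains a jump of 38

gentle_flagged = has_abrupt_step(gentle, limit=15)
abrupt_flagged = has_abrupt_step(abrupt, limit=15)
```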