The 1024 model doesn't work well :(
I have tested the 512 version of the model and it works well, but the 1024 model only generates noise.
My script is here:
```bash
seed=123
name=inference
ckpt=ckpt/finetuned1024/timenoise.ckpt   # path to your checkpoint
config=configs/inference_1024_v1.0.yaml  # the config I copied from the original DynamiCrafter repo
prompt_dir=prompts/1024                  # directory of prompts: images and their corresponding text
res_dir="results"                        # output directory
H=576
W=1024
FS=24
M=1000
CUDA_VISIBLE_DEVICES=0 python3 -m torch.distributed.launch \
--nproc_per_node=1 --nnodes=1 --master_addr=127.0.0.1 --master_port=23459 --node_rank=0 \
scripts/evaluation/ddp_wrapper.py \
--module 'inference' \
--seed ${seed} \
--ckpt_path $ckpt \
--config $config \
--savedir $res_dir/$name \
--n_samples 1 \
--bs 1 --height ${H} --width ${W} \
--unconditional_guidance_scale 7.5 \
--ddim_steps 50 \
--ddim_eta 1.0 \
--prompt_dir $prompt_dir \
--text_input \
--video_length 16 \
--frame_stride ${FS} \
--timestep_spacing 'uniform_trailing' \
--guidance_rescale 0.7 \
--perframe_ae \
--M ${M} \
--whether_analytic_init 0 \
--analytic_init_path 'ckpt/initial_noise_1024.pt'
```
Did I get something wrong somewhere?
I also tried setting whether_analytic_init to 1 and running with M=940, but the output is still noise.
Hi @Conn3r,
Thank you for your attention to our work! The problem lies in your `inference_1024_v1.0.yaml`, which was copied from the original DynamiCrafter repo. Line 2 of this yaml should be `target: lvdm.models.ddpm3dInference.LatentVisualDiffusion` rather than `target: lvdm.models.ddpm3d.LatentVisualDiffusion`; it differs from the config in the original repo. We've updated the correct yaml in our git repo. Hoping it works for you this time!
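For reference, the top of the corrected config should look like this (a minimal sketch, assuming your yaml follows the standard DynamiCrafter layout; the `params` section stays as it is in your existing file):

```yaml
model:
  target: lvdm.models.ddpm3dInference.LatentVisualDiffusion  # was: lvdm.models.ddpm3d.LatentVisualDiffusion
  params:
    # ... keep the remaining params unchanged
```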
Thank you for your reply, it did work.
Thanks for the great work!