help!!! #66

@ltpwy

I'm on a company intranet, so I can only use models that were downloaded in advance. Why does the following error occur?

Some weights of UNet3DConditionModel were not initialized from the model checkpoint at /mnt/dolphinfs/hdd_pool/docker/user/hadoop-hmart-dprec/wangyue/ad_1/video_creative/github.com/Picsart-AI-Research/StreamingT2V-StreamingModelscope/t2v_enhanced/huggingface.co/ali-vilab/text-to-video-ms-1.7b and are newly initialized: ['cross_attention_merger_down_blocks.5.temporal_transformer.attention.to_k.weight', 'cross_attention_merger_down_blocks.9.temporal_transformer.proj_out.weight', 'cross_attention_merger_down_blocks.0.temporal_transformer.proj_in.weight', 'up_blocks.2.attentions.1.transformer_blocks.0.attn2.conv.bias', 'down_blocks.2.attentions.1.transformer_blocks.0.attn2.conv.weight', 'cross_attention_merger_down_blocks.3.temporal_transformer.norm.bias', 'up_blocks.1.attentions.2.transformer_blocks.0.attn2.conv_ln.weight', 'down_blocks.0.attentions.1.transformer_blocks.0.attn2.conv.bias', 'cross_attention_merger_down_blocks.8.temporal_transformer.norm.weight', 'up_blocks.1.attentions.2.transformer_blocks.0.attn2.alpha', 'cross_attention_merger_down_blocks.3.temporal_transformer.attention.to_k.weight', 'cross_attention_merger_down_blocks.4.temporal_transformer.proj_out.weight', 'up_blocks.1.attentions.0.transformer_blocks.0.attn2.conv.weight', 'up_blocks.3.attentions.2.transformer_blocks.0.attn2.conv.weight', 'cross_attention_merger_mid_block.temporal_transformer.attention.to_k.weight', 'cross_attention_merger_down_blocks.0.temporal_transformer.attention.to_v.weight', 'down_blocks.0.attentions.0.transformer_blocks.0.attn2.conv_ln.weight', 'cross_attention_merger_down_blocks.3.temporal_transformer.attention.to_q.weight', 'cross_attention_merger_down_blocks.9.temporal_transformer.attention.to_out.0.weight', 'cross_attention_merger_down_blocks.7.temporal_transformer.attention.to_k.weight', 'cross_attention_merger_down_blocks.6.temporal_transformer.proj_in.bias', 'cross_attention_merger_down_blocks.2.temporal_transformer.norm.bias', 
'cross_attention_merger_mid_block.temporal_transformer.proj_in.bias', 'cross_attention_merger_down_blocks.2.temporal_transformer.attention.to_v.weight', 'up_blocks.3.attentions.0.transformer_blocks.0.attn2.conv_ln.bias', 'down_blocks.0.attentions.0.transformer_blocks.0.attn2.conv.bias', 'cross_attention_merger_down_blocks.10.temporal_transformer.norm.weight', 'cross_attention_merger_down_blocks.9.temporal_transformer.proj_in.bias', 'cross_attention_merger_down_blocks.10.temporal_transformer.attention.to_q.weight', 'down_blocks.2.attentions.0.transformer_blocks.0.attn2.conv.bias', 'down_blocks.1.attentions.0.transformer_blocks.0.attn2.conv.weight', 'down_blocks.2.attentions.1.transformer_blocks.0.attn2.alpha', 'up_blocks.3.attentions.2.transformer_blocks.0.attn2.conv.bias', 'mid_block.attentions.0.transformer_blocks.0.attn2.conv.weight', 'down_blocks.1.attentions.0.transformer_blocks.0.attn2.conv_ln.bias', 'cross_attention_merger_down_blocks.1.temporal_transformer.attention.to_v.weight', 'cross_attention_merger_down_blocks.2.temporal_transformer.attention.to_q.weight', 'cross_attention_merger_down_blocks.6.temporal_transformer.norm.bias', 'cross_attention_merger_down_blocks.6.temporal_transformer.proj_out.bias', 'cross_attention_merger_down_blocks.3.temporal_transformer.attention.to_out.0.bias', 'cross_attention_merger_down_blocks.4.temporal_transformer.attention.to_v.weight', 'cross_attention_merger_down_blocks.1.temporal_transformer.proj_in.weight', 'cross_attention_merger_down_blocks.6.temporal_transformer.attention.to_q.weight', 'cross_attention_merger_down_blocks.8.temporal_transformer.attention.to_q.weight', 'cross_attention_merger_down_blocks.6.temporal_transformer.attention.to_v.weight', 'up_blocks.1.attentions.1.transformer_blocks.0.attn2.alpha', 'up_blocks.3.attentions.2.transformer_blocks.0.attn2.alpha', 'cross_attention_merger_down_blocks.9.temporal_transformer.attention.to_k.weight', 
'cross_attention_merger_down_blocks.4.temporal_transformer.attention.to_q.weight', 'cross_attention_merger_down_blocks.6.temporal_transformer.attention.to_out.0.bias', 'down_blocks.1.attentions.0.transformer_blocks.0.attn2.conv.bias', 'cross_attention_merger_down_blocks.8.temporal_transformer.norm.bias', 'cross_attention_merger_down_blocks.11.temporal_transformer.proj_out.bias', 'down_blocks.2.attentions.1.transformer_blocks.0.attn2.conv_ln.bias', 'up_blocks.2.attentions.2.transformer_blocks.0.attn2.conv_ln.weight', 'down_blocks.1.attentions.0.transformer_blocks.0.attn2.conv_ln.weight', 'cross_attention_merger_down_blocks.7.temporal_transformer.attention.to_out.0.bias', 'up_blocks.2.attentions.1.transformer_blocks.0.attn2.conv.weight', 'cross_attention_merger_mid_block.temporal_transformer.attention.to_out.0.weight', 'up_blocks.2.attentions.0.transformer_blocks.0.attn2.conv.bias', 'cross_attention_merger_down_blocks.9.temporal_transformer.proj_in.weight', 'cross_attention_merger_down_blocks.5.temporal_transformer.attention.to_q.weight', 'down_blocks.0.attentions.1.transformer_blocks.0.attn2.alpha', 'cross_attention_merger_down_blocks.9.temporal_transformer.norm.bias', 'cross_attention_merger_down_blocks.1.temporal_transformer.attention.to_k.weight', 'cross_attention_merger_down_blocks.4.temporal_transformer.proj_out.bias', 'up_blocks.3.attentions.0.transformer_blocks.0.attn2.conv_ln.weight', 'up_blocks.1.attentions.2.transformer_blocks.0.attn2.conv.weight', 'cross_attention_merger_down_blocks.4.temporal_transformer.norm.bias', 'cross_attention_merger_down_blocks.10.temporal_transformer.attention.to_k.weight', 'cross_attention_merger_down_blocks.11.temporal_transformer.norm.weight', 'cross_attention_merger_down_blocks.9.temporal_transformer.norm.weight', 'mid_block.attentions.0.transformer_blocks.0.attn2.alpha', 'cross_attention_merger_down_blocks.11.temporal_transformer.proj_in.bias', 
'cross_attention_merger_down_blocks.3.temporal_transformer.attention.to_v.weight', 'cross_attention_merger_down_blocks.7.temporal_transformer.attention.to_out.0.weight', 'up_blocks.2.attentions.0.transformer_blocks.0.attn2.conv.weight', 'cross_attention_merger_down_blocks.11.temporal_transformer.attention.to_v.weight', 'cross_attention_merger_down_blocks.1.temporal_transformer.norm.bias', 'cross_attention_merger_mid_block.temporal_transformer.norm.bias', 'up_blocks.1.attentions.0.transformer_blocks.0.attn2.alpha', 'up_blocks.3.attentions.1.transformer_blocks.0.attn2.conv.bias', 'cross_attention_merger_down_blocks.5.temporal_transformer.norm.bias', 'cross_attention_merger_down_blocks.2.temporal_transformer.proj_in.bias', 'up_blocks.3.attentions.0.transformer_blocks.0.attn2.conv.weight', 'cross_attention_merger_down_blocks.8.temporal_transformer.proj_out.weight', 'cross_attention_merger_down_blocks.3.temporal_transformer.proj_in.weight', 'down_blocks.1.attentions.1.transformer_blocks.0.attn2.conv.bias', 'cross_attention_merger_down_blocks.10.temporal_transformer.norm.bias', 'cross_attention_merger_mid_block.temporal_transformer.norm.weight', 'cross_attention_merger_down_blocks.10.temporal_transformer.proj_in.weight', 'cross_attention_merger_down_blocks.6.temporal_transformer.proj_out.weight', 'cross_attention_merger_down_blocks.8.temporal_transformer.attention.to_v.weight', 'up_blocks.3.attentions.1.transformer_blocks.0.attn2.conv_ln.weight', 'down_blocks.2.attentions.0.transformer_blocks.0.attn2.conv_ln.bias', 'up_blocks.1.attentions.0.transformer_blocks.0.attn2.conv.bias', 'cross_attention_merger_down_blocks.6.temporal_transformer.attention.to_out.0.weight', 'cross_attention_merger_mid_block.temporal_transformer.proj_in.weight', 'up_blocks.1.attentions.1.transformer_blocks.0.attn2.conv_ln.weight', 'cross_attention_merger_down_blocks.7.temporal_transformer.proj_out.weight', 'down_blocks.1.attentions.1.transformer_blocks.0.attn2.conv_ln.weight', 
'cross_attention_merger_down_blocks.11.temporal_transformer.proj_in.weight', 'cross_attention_merger_down_blocks.9.temporal_transformer.attention.to_out.0.bias', 'up_blocks.3.attentions.0.transformer_blocks.0.attn2.alpha', 'cross_attention_merger_down_blocks.10.temporal_transformer.attention.to_out.0.bias', 'cross_attention_merger_down_blocks.9.temporal_transformer.proj_out.bias', 'cross_attention_merger_down_blocks.4.temporal_transformer.attention.to_k.weight', 'cross_attention_merger_down_blocks.1.temporal_transformer.norm.weight', 'up_blocks.2.attentions.2.transformer_blocks.0.attn2.conv.bias', 'cross_attention_merger_down_blocks.0.temporal_transformer.norm.bias', 'cross_attention_merger_down_blocks.4.temporal_transformer.attention.to_out.0.weight', 'up_blocks.1.attentions.1.transformer_blocks.0.attn2.conv.weight', 'down_blocks.0.attentions.1.transformer_blocks.0.attn2.conv_ln.weight', 'cross_attention_merger_down_blocks.2.temporal_transformer.attention.to_out.0.weight', 'cross_attention_merger_down_blocks.5.temporal_transformer.proj_out.bias', 'up_blocks.3.attentions.1.transformer_blocks.0.attn2.conv_ln.bias', 'cross_attention_merger_down_blocks.1.temporal_transformer.attention.to_out.0.weight', 'cross_attention_merger_down_blocks.7.temporal_transformer.norm.bias', 'cross_attention_merger_down_blocks.0.temporal_transformer.attention.to_k.weight', 'up_blocks.1.attentions.2.transformer_blocks.0.attn2.conv_ln.bias', 'cross_attention_merger_down_blocks.9.temporal_transformer.attention.to_q.weight', 'cross_attention_merger_down_blocks.2.temporal_transformer.attention.to_k.weight', 'cross_attention_merger_down_blocks.5.temporal_transformer.norm.weight', 'cross_attention_merger_down_blocks.10.temporal_transformer.proj_out.weight', 'cross_attention_merger_down_blocks.1.temporal_transformer.proj_in.bias', 'cross_attention_merger_down_blocks.3.temporal_transformer.proj_out.weight', 'cross_attention_merger_down_blocks.7.temporal_transformer.proj_in.weight', 
'cross_attention_merger_down_blocks.0.temporal_transformer.proj_out.weight', 'cross_attention_merger_down_blocks.8.temporal_transformer.attention.to_k.weight', 'down_blocks.2.attentions.0.transformer_blocks.0.attn2.conv.weight', 'down_blocks.0.attentions.0.transformer_blocks.0.attn2.alpha', 'down_blocks.0.attentions.0.transformer_blocks.0.attn2.conv.weight', 'cross_attention_merger_mid_block.temporal_transformer.attention.to_q.weight', 'cross_attention_merger_down_blocks.6.temporal_transformer.proj_in.weight', 'cross_attention_merger_down_blocks.10.temporal_transformer.attention.to_v.weight', 'up_blocks.2.attentions.0.transformer_blocks.0.attn2.conv_ln.bias', 'cross_attention_merger_down_blocks.10.temporal_transformer.proj_out.bias', 'cross_attention_merger_mid_block.temporal_transformer.attention.to_v.weight', 'cross_attention_merger_down_blocks.0.temporal_transformer.attention.to_q.weight', 'cross_attention_merger_down_blocks.2.temporal_transformer.proj_in.weight', 'down_blocks.1.attentions.0.transformer_blocks.0.attn2.alpha', 'cross_attention_merger_down_blocks.1.temporal_transformer.attention.to_out.0.bias', 'cross_attention_merger_down_blocks.2.temporal_transformer.norm.weight', 'up_blocks.3.attentions.1.transformer_blocks.0.attn2.conv.weight', 'cross_attention_merger_down_blocks.8.temporal_transformer.proj_out.bias', 'cross_attention_merger_down_blocks.7.temporal_transformer.proj_in.bias', 'down_blocks.1.attentions.1.transformer_blocks.0.attn2.conv_ln.bias', 'cross_attention_merger_down_blocks.3.temporal_transformer.attention.to_out.0.weight', 'cross_attention_merger_down_blocks.7.temporal_transformer.proj_out.bias', 'cross_attention_merger_down_blocks.10.temporal_transformer.attention.to_out.0.weight', 'down_blocks.0.attentions.0.transformer_blocks.0.attn2.conv_ln.bias', 'cross_attention_merger_down_blocks.5.temporal_transformer.attention.to_out.0.bias', 'down_blocks.2.attentions.0.transformer_blocks.0.attn2.conv_ln.weight', 
'up_blocks.1.attentions.0.transformer_blocks.0.attn2.conv_ln.weight', 'mid_block.attentions.0.transformer_blocks.0.attn2.conv_ln.bias', 'cross_attention_merger_down_blocks.10.temporal_transformer.proj_in.bias', 'cross_attention_merger_down_blocks.11.temporal_transformer.proj_out.weight', 'cross_attention_merger_down_blocks.0.temporal_transformer.norm.weight', 'down_blocks.2.attentions.1.transformer_blocks.0.attn2.conv.bias', 'cross_attention_merger_down_blocks.11.temporal_transformer.attention.to_out.0.weight', 'up_blocks.3.attentions.0.transformer_blocks.0.attn2.conv.bias', 'down_blocks.0.attentions.1.transformer_blocks.0.attn2.conv_ln.bias', 'up_blocks.3.attentions.2.transformer_blocks.0.attn2.conv_ln.bias', 'cross_attention_merger_down_blocks.8.temporal_transformer.proj_in.bias', 'up_blocks.1.attentions.1.transformer_blocks.0.attn2.conv.bias', 'cross_attention_merger_down_blocks.11.temporal_transformer.attention.to_q.weight', 'cross_attention_merger_down_blocks.1.temporal_transformer.attention.to_q.weight', 'cross_attention_merger_down_blocks.0.temporal_transformer.attention.to_out.0.weight', 'cross_attention_merger_down_blocks.7.temporal_transformer.attention.to_v.weight', 'cross_attention_merger_down_blocks.7.temporal_transformer.attention.to_q.weight', 'cross_attention_merger_down_blocks.4.temporal_transformer.attention.to_out.0.bias', 'cross_attention_merger_down_blocks.2.temporal_transformer.attention.to_out.0.bias', 'cross_attention_merger_down_blocks.7.temporal_transformer.norm.weight', 'cross_attention_merger_down_blocks.0.temporal_transformer.proj_in.bias', 'mid_block.attentions.0.transformer_blocks.0.attn2.conv_ln.weight', 'cross_attention_merger_down_blocks.11.temporal_transformer.attention.to_k.weight', 'cross_attention_merger_down_blocks.11.temporal_transformer.norm.bias', 'cross_attention_merger_down_blocks.3.temporal_transformer.proj_in.bias', 'cross_attention_merger_down_blocks.1.temporal_transformer.proj_out.bias', 
'down_blocks.1.attentions.1.transformer_blocks.0.attn2.conv.weight', 'cross_attention_merger_down_blocks.6.temporal_transformer.attention.to_k.weight', 'up_blocks.3.attentions.1.transformer_blocks.0.attn2.alpha', 'cross_attention_merger_down_blocks.1.temporal_transformer.proj_out.weight', 'cross_attention_merger_down_blocks.5.temporal_transformer.proj_in.weight', 'cross_attention_merger_down_blocks.11.temporal_transformer.attention.to_out.0.bias', 'cross_attention_merger_down_blocks.3.temporal_transformer.norm.weight', 'up_blocks.3.attentions.2.transformer_blocks.0.attn2.conv_ln.weight', 'cross_attention_merger_down_blocks.3.temporal_transformer.proj_out.bias', 'up_blocks.2.attentions.0.transformer_blocks.0.attn2.conv_ln.weight', 'mid_block.attentions.0.transformer_blocks.0.attn2.conv.bias', 'cross_attention_merger_mid_block.temporal_transformer.proj_out.weight', 'cross_attention_merger_down_blocks.2.temporal_transformer.proj_out.bias', 'up_blocks.1.attentions.0.transformer_blocks.0.attn2.conv_ln.bias', 'cross_attention_merger_down_blocks.5.temporal_transformer.proj_out.weight', 'cross_attention_merger_down_blocks.6.temporal_transformer.norm.weight', 'cross_attention_merger_down_blocks.9.temporal_transformer.attention.to_v.weight', 'cross_attention_merger_down_blocks.2.temporal_transformer.proj_out.weight', 'cross_attention_merger_down_blocks.4.temporal_transformer.norm.weight', 'down_blocks.2.attentions.1.transformer_blocks.0.attn2.conv_ln.weight', 'up_blocks.2.attentions.2.transformer_blocks.0.attn2.alpha', 'cross_attention_merger_down_blocks.5.temporal_transformer.proj_in.bias', 'up_blocks.2.attentions.1.transformer_blocks.0.attn2.conv_ln.weight', 'up_blocks.1.attentions.2.transformer_blocks.0.attn2.conv.bias', 'cross_attention_merger_mid_block.temporal_transformer.attention.to_out.0.bias', 'cross_attention_merger_down_blocks.4.temporal_transformer.proj_in.bias', 'up_blocks.2.attentions.0.transformer_blocks.0.attn2.alpha', 
'cross_attention_merger_down_blocks.5.temporal_transformer.attention.to_out.0.weight', 'down_blocks.1.attentions.1.transformer_blocks.0.attn2.alpha', 'up_blocks.2.attentions.2.transformer_blocks.0.attn2.conv_ln.bias', 'cross_attention_merger_mid_block.temporal_transformer.proj_out.bias', 'up_blocks.2.attentions.2.transformer_blocks.0.attn2.conv.weight', 'cross_attention_merger_down_blocks.4.temporal_transformer.proj_in.weight', 'cross_attention_merger_down_blocks.5.temporal_transformer.attention.to_v.weight', 'up_blocks.2.attentions.1.transformer_blocks.0.attn2.alpha', 'cross_attention_merger_down_blocks.8.temporal_transformer.attention.to_out.0.weight', 'cross_attention_merger_down_blocks.8.temporal_transformer.proj_in.weight', 'down_blocks.0.attentions.1.transformer_blocks.0.attn2.conv.weight', 'up_blocks.1.attentions.1.transformer_blocks.0.attn2.conv_ln.bias', 'cross_attention_merger_down_blocks.0.temporal_transformer.proj_out.bias', 'down_blocks.2.attentions.0.transformer_blocks.0.attn2.alpha', 'up_blocks.2.attentions.1.transformer_blocks.0.attn2.conv_ln.bias', 'cross_attention_merger_down_blocks.0.temporal_transformer.attention.to_out.0.bias', 'cross_attention_merger_down_blocks.8.temporal_transformer.attention.to_out.0.bias']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
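
To see which parts of the network the message refers to, the listed parameter names can be grouped by module. The following is only a minimal sketch: `summarize_missing_keys` is a hypothetical helper (not part of diffusers or StreamingT2V), and `sample` holds just a handful of names copied from the list above rather than the full list.

```python
from collections import Counter


def summarize_missing_keys(keys):
    """Count parameter names by coarse module group (hypothetical helper)."""
    counts = Counter()
    for key in keys:
        if key.startswith("cross_attention_merger_"):
            group = "cross_attention_merger"
        elif ".attn2." in key:
            # e.g. "...attn2.conv.weight" -> "attn2.conv"
            group = "attn2." + key.split(".attn2.", 1)[1].split(".")[0]
        else:
            group = "other"
        counts[group] += 1
    return dict(counts)


# A few names copied from the message above (not the complete list).
sample = [
    "cross_attention_merger_down_blocks.5.temporal_transformer.attention.to_k.weight",
    "cross_attention_merger_mid_block.temporal_transformer.proj_in.bias",
    "up_blocks.2.attentions.1.transformer_blocks.0.attn2.conv.bias",
    "down_blocks.0.attentions.0.transformer_blocks.0.attn2.conv_ln.weight",
    "up_blocks.1.attentions.2.transformer_blocks.0.attn2.alpha",
]

print(summarize_missing_keys(sample))
```

Running the same helper on the full list would show whether any name falls into the `other` group; the names sampled here all belong to either `cross_attention_merger_*` or an `attn2.conv`, `attn2.conv_ln`, or `attn2.alpha` sub-module.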
