Table 1. Version compatibility
| Component | Version | Environment setup guide |
|---|---|---|
| Python | 3.11.14 | - |
| torch | 2.9.0 | - |
| CANN | 8.5.0 | - |
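
The versions in Table 1 can be checked before installing anything else. The following is a minimal sketch and not part of the original guide: `check_version` is a hypothetical helper that compares version strings literally, and the CANN version is not queried because the way to read it depends on the installation.

```shell
#!/usr/bin/env bash
# Hypothetical helper: compare an installed version string with the expected one from Table 1.
check_version() {
    local name="$1" expected="$2" actual="$3"
    if [ "$actual" = "$expected" ]; then
        echo "$name: OK ($actual)"
    else
        echo "$name: expected $expected, found ${actual:-nothing}"
        return 1
    fi
}

# Query the interpreter and torch; empty output means the component is missing.
py_ver="$(python3 --version 2>/dev/null | awk '{print $2}')"
torch_ver="$(python3 -c 'import torch; print(torch.__version__)' 2>/dev/null)"
# Drop a local build suffix such as "+cpu" before comparing.
torch_ver="${torch_ver%%+*}"

check_version "Python" "3.11.14" "$py_ver" || true
check_version "torch" "2.9.0" "$torch_ver" || true
```

A mismatch is only reported, not treated as fatal, because nearby patch versions may still work; that is an assumption to confirm on the target machine.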
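- Supported devices: Atlas 800I/800T A2 (8\*64G) and Atlas 800I/800T A3 (8\*64G) inference devices. The minimum supported card count is 8 cards on A2 or 4 cards on A3.
- Atlas 800I/800T A2(8*64G)
- Environment setup guide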
pip3 install -r requirements_env.txt
pip install phonemizer-fork==3.3.2
# Install long-context-attention
git clone https://github.com/feifeibear/long-context-attention.git
cd long-context-attention/
pip install .
# Install xDiT
git clone https://github.com/xdit-project/xDiT.git
cd xDiT/
pip install -e .
# Download the ffmpeg package
wget https://ffmpeg.org/releases/ffmpeg-4.2.1.tar.gz
# Install ffmpeg
tar -zxvf ffmpeg-4.2.1.tar.gz
cd ffmpeg-4.2.1
./configure --enable-shared --prefix=/usr/local/ffmpeg
make -j
make install
# Add the following three export lines to ~/.bashrc, then reload it
vi ~/.bashrc
export FFMPEG_PATH=/usr/local/ffmpeg/
export PATH=$FFMPEG_PATH/bin:$PATH
export LD_LIBRARY_PATH=$FFMPEG_PATH/lib:$LD_LIBRARY_PATH
source ~/.bashrc
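
Editing `~/.bashrc` by hand is easy to repeat by accident when the setup is re-run. As an alternative, the sketch below appends each ffmpeg line only once. It assumes ffmpeg was installed to `/usr/local/ffmpeg` as configured above; `append_once` and `add_ffmpeg_env` are hypothetical helper names, not part of any tool used in this guide.

```shell
# Hypothetical helper: append a line to a file only if that exact line is not already present.
append_once() {
    local file="$1" line="$2"
    touch "$file"
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Write the three ffmpeg environment lines from this guide into the given file.
# Single quotes keep $FFMPEG_PATH unexpanded so it is resolved when the file is sourced.
add_ffmpeg_env() {
    local file="$1"
    append_once "$file" 'export FFMPEG_PATH=/usr/local/ffmpeg/'
    append_once "$file" 'export PATH=$FFMPEG_PATH/bin:$PATH'
    append_once "$file" 'export LD_LIBRARY_PATH=$FFMPEG_PATH/lib:$LD_LIBRARY_PATH'
}
```

Calling `add_ffmpeg_env ~/.bashrc` followed by `source ~/.bashrc` has the same effect as the manual edit, and calling it a second time leaves the file unchanged.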
# Install decord
git clone --recursive https://github.com/dmlc/decord.git
cd decord
mkdir build && cd build
cmake .. -DFFMPEG_DIR=/usr/local/ffmpeg
make
cd ../python
pwd=$PWD
echo "export PYTHONPATH=$PYTHONPATH:$pwd" >> ~/.bashrc
source ~/.bashrc
python3 setup.py install --user
# Build and install MindIE-SD from source
git clone https://gitcode.com/Ascend/MindIE-SD.git && cd MindIE-SD
python setup.py bdist_wheel
cd dist
pip install mindiesd-*.whl
apt-get update
apt-get install -y libgl1-mesa-glx libglib2.0-0

Run `hostname` to get the current host name, then append the following entry to `/etc/hosts`:
{local IP} {host name}

| Models | Download Link | Notes |
|---|---|---|
| Wan2.1-I2V-14B-480P | 🤗 Huggingface | Base model |
| Wan21_I2V_14B_lightx2v_cfg_step_distill_lora_rank64 | 🤗 Huggingface | wan lora weights |
| chinese-wav2vec2-base | 🤗 Huggingface | Audio encoder |
| MeiGen-InfiniteTalk | 🤗 Huggingface | Our audio condition weights |
Download models using huggingface-cli:
huggingface-cli download Wan-AI/Wan2.1-I2V-14B-480P --local-dir ./weights/Wan2.1-I2V-14B-480P
huggingface-cli download lgylgy/Wan21_I2V_14B_lightx2v_cfg_step_distill_lora_rank64 --local-dir ./weights/Wan21_I2V_14B_lightx2v_cfg_step_distill_lora_rank64
huggingface-cli download TencentGameMate/chinese-wav2vec2-base --local-dir ./weights/chinese-wav2vec2-base
huggingface-cli download TencentGameMate/chinese-wav2vec2-base model.safetensors --revision refs/pr/1 --local-dir ./weights/chinese-wav2vec2-base
huggingface-cli download MeiGen-AI/InfiniteTalk --local-dir ./weights/InfiniteTalk
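
Interrupted downloads are a common cause of later load failures, so it can help to confirm that every target directory exists and is not empty before running inference. This is a sketch under the assumption that the `--local-dir` paths above were used unchanged; `check_weights` is a hypothetical helper, and it does not verify file integrity.

```shell
# Hypothetical helper: report which expected weight directories are missing or empty.
# Usage: check_weights ROOT NAME [NAME...]; returns 1 if any directory is missing or empty.
check_weights() {
    local root="$1" missing=0 name
    shift
    for name in "$@"; do
        if [ -d "$root/$name" ] && [ -n "$(ls -A "$root/$name")" ]; then
            echo "found:   $name"
        else
            echo "missing: $name"
            missing=1
        fi
    done
    return "$missing"
}

# Directory names match the --local-dir targets used in the download commands.
check_weights ./weights \
    Wan2.1-I2V-14B-480P \
    Wan21_I2V_14B_lightx2v_cfg_step_distill_lora_rank64 \
    chinese-wav2vec2-base \
    InfiniteTalk \
    || echo "Some weights are missing; re-run the matching huggingface-cli command."
```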
git clone https://github.com/Eco-Sphere/InfiniteTalk.git

Run the command (with LA sparsity enabled):
MINDIESD_PATH=/usr/local/python3.11.14/lib/python3.11/site-packages/mindiesd
export ASCEND_CUSTOM_OPP_PATH=$MINDIESD_PATH/ops/vendors/customize:$MINDIESD_PATH/ops/vendors/aie_ascendc:
NPU_NUM=8
export HCCL_CONNECT_TIMEOUT=3600
export ASCEND_RT_VISIBLE_DEVICES=0,1,2,3,4,5,6,7
export TASK_QUEUE_ENABLE=2
export LD_PRELOAD=/usr/local/Ascend/cann-8.5.0/aarch64-linux/lib64/libjemalloc.so:$LD_PRELOAD
export CPU_AFFINITY_CONF=2
torchrun --nproc_per_node=$NPU_NUM --standalone generate_infinitetalk.py \
--ckpt_dir /data/z00823791/weight/Wan2.1-I2V-14B-480P \
--wav2vec_dir /data/z00823791/weight/chinese-wav2vec2-base \
--infinitetalk_dir /data/z00823791/weight/InfiniteTalk-single/single/infinitetalk.safetensors \
--ulysses_size=$NPU_NUM \
--input_json examples/single_example_image.json \
--size infinitetalk-480 \
--t5_fsdp \
--sample_steps 4 \
--lora_dir /data/z00823791/weight/Wan21_I2V_14B_lightx2v_cfg_step_distill_lora_rank64.safetensors \
--mode streaming \
--motion_frame 9 \
--sample_text_guide_scale 1.0 \
--sample_audio_guide_scale 1.0 \
--lora_scale 1.0 \
--sample_shift 11 \
--use_rainfusion \
--sparsity 0.85 \
--sparse_start_step 1 \
--rainfusion_type "v2" \
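--save_file infinitetalk_sigle

Parameters:
- NPU_NUM: number of NPU cards to use
- size: size of the generated video
- sample_steps: number of sampling steps used to generate the video
- use_rainfusion: enables LA sparsity
- sparsity: LA sparsity ratio; the larger the value, the greater the accuracy loss
- sparse_start_step: the step at which sparsity is turned on
- rainfusion_type: sparsity version; only v2 is currently supported

Note: enabling LA sparsity causes some accuracy loss. A higher sparsity ratio gives a larger performance gain and a larger accuracy loss; measure the actual loss on your own workload.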
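
In the command above, `NPU_NUM`, `ASCEND_RT_VISIBLE_DEVICES`, `--nproc_per_node`, and `--ulysses_size` all describe the same set of cards, so they need to change together when moving from 8 cards to 4. The sketch below derives the device list from `NPU_NUM`. It assumes the cards to use are numbered contiguously from 0; `device_list` is a hypothetical helper.

```shell
# Hypothetical helper: build the comma-separated device list 0..N-1 for N cards.
device_list() {
    seq 0 $(( $1 - 1 )) | paste -sd, -
}

# Default to 8 cards (the A2 minimum); set NPU_NUM=4 beforehand for the A3 minimum.
NPU_NUM="${NPU_NUM:-8}"
export ASCEND_RT_VISIBLE_DEVICES="$(device_list "$NPU_NUM")"
echo "ASCEND_RT_VISIBLE_DEVICES=$ASCEND_RT_VISIBLE_DEVICES"
```

With the list derived this way, the `torchrun` invocation can keep using `$NPU_NUM` for both `--nproc_per_node` and `--ulysses_size`, as it already does.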
Run the command (without LA sparsity):
MINDIESD_PATH=/usr/local/python3.11.14/lib/python3.11/site-packages/mindiesd
export ASCEND_CUSTOM_OPP_PATH=$MINDIESD_PATH/ops/vendors/customize:$MINDIESD_PATH/ops/vendors/aie_ascendc:
NPU_NUM=8
export HCCL_CONNECT_TIMEOUT=3600
export ASCEND_RT_VISIBLE_DEVICES=0,1,2,3,4,5,6,7
export TASK_QUEUE_ENABLE=2
export LD_PRELOAD=/usr/local/Ascend/cann-8.5.0/aarch64-linux/lib64/libjemalloc.so:$LD_PRELOAD
export CPU_AFFINITY_CONF=2
torchrun --nproc_per_node=$NPU_NUM --standalone generate_infinitetalk.py \
--ckpt_dir /data/z00823791/weight/Wan2.1-I2V-14B-480P \
--wav2vec_dir /data/z00823791/weight/chinese-wav2vec2-base \
--infinitetalk_dir /data/z00823791/weight/InfiniteTalk-single/single/infinitetalk.safetensors \
--ulysses_size=$NPU_NUM \
--input_json examples/single_example_image.json \
--size infinitetalk-480 \
--t5_fsdp \
--sample_steps 4 \
--lora_dir /data/z00823791/weight/Wan21_I2V_14B_lightx2v_cfg_step_distill_lora_rank64.safetensors \
--mode streaming \
--motion_frame 9 \
--sample_text_guide_scale 1.0 \
--sample_audio_guide_scale 1.0 \
--lora_scale 1.0 \
--sample_shift 11 \
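--save_file infinitetalk_sigle

Parameters:
- NPU_NUM: number of NPU cards to use
- size: size of the generated video
- sample_steps: number of sampling steps used to generate the video

Online quantization of the Wan2.1 and LoRA weights causes a large accuracy loss and yields little performance gain, so quantization is not recommended.
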
| Model | Hardware | Cards | Resolution | LA sparsity | 4-step E2E time (s) |
|---|---|---|---|---|---|
| InfiniteTalk | A2 910B3 | 8 | 480P | Off | 82 |
| InfiniteTalk | A2 910B3 | 8 | 480P | 0.85 | 78 |
| InfiniteTalk | 800T A3 | 4 | 480P | Off | 64 |
| InfiniteTalk | 800T A3 | 4 | 480P | 0.85 | 59 |
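
To put the table above in relative terms, the E2E time reduction from LA sparsity can be computed as (baseline - sparse) / baseline. The sketch below uses only the four timings from the table; `reduction_pct` is a hypothetical helper.

```shell
# Relative E2E time reduction in percent: (baseline - sparse) / baseline * 100.
reduction_pct() {
    awk -v base="$1" -v sparse="$2" 'BEGIN { printf "%.1f\n", (base - sparse) / base * 100 }'
}

echo "A2, 8 cards: $(reduction_pct 82 78)%"   # prints "A2, 8 cards: 4.9%"
echo "A3, 4 cards: $(reduction_pct 64 59)%"   # prints "A3, 4 cards: 7.8%"
```

At a sparsity ratio of 0.85 the saving is therefore roughly 5% on A2 and 8% on A3, which is the gain to weigh against the accuracy loss.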
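
- The datasets and models mentioned in this repository are examples only and are provided for non-commercial use. If you use them to run the examples, take particular care to comply with the License of each dataset and model. Huawei accepts no liability for any infringement dispute arising from your use of these datasets or models.
- If you find any problem while using this repository (including but not limited to functional or compliance issues), please submit an issue in this repository. We will review it and respond promptly.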