MoviePy shouldn't be this slow

Asked 2025-04-14 18:19

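I'm building an illustrated audiobook from a JSON script. The script keeps the same image on screen for seconds at a time, often many seconds, yet MoviePy still seems to generate every single frame.
There should be a faster way to render this; I've even considered switching to a different library or tool. Every time I make a small change, rendering a one-hour audiobook takes several hours!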

Python version: 3.11.5 | packaged by Anaconda, Inc. | (main, Sep 11 2023, 13:26:23) [MSC v.1916 64 bit (AMD64)]

My code

!pip install python-ffmpeg
# remember to restart the kernel if this wasn't installed before
!pip install moviepy
# ImageMagick is not a pip package; install it with its own installer and enable 'legacy tools' (TextClip needs it)

mp3_f = "d:/bookcontent/mp3s/"
image_f = "d:/bookcontent/images/"
json_file = "d:/bookcontent/script.json"
output_file = "d:/desktop/render_debug.mp4"

from moviepy.editor import ImageClip, AudioFileClip, concatenate_audioclips, concatenate_videoclips, TextClip, CompositeVideoClip
import json
import time
import matplotlib.pyplot as plt

def render_movie(mp3_folder, images_folder, json_file, output_file, debug_mode=False, thread_count=14):
    start_time = time.time()  # Start timing
    with open(json_file, 'r', encoding="UTF-8") as file:
        data = json.load(file)

    clips = []
    current_img_clip = None
    audio_duration = 0

    for item in data:
        if 'img' in item:
            if current_img_clip is not None:
                # Finalize the previous clip before starting a new one
                clips.append(current_img_clip.set_duration(audio_duration))

            img_path = f"{images_folder}/{item['img']}"
            current_img_clip = ImageClip(img_path)
            audio_duration = 0  # Reset audio duration for the new img clip

            if debug_mode:
                debug_text = f"Image: {item['img']}"

        elif 'mp3' in item:
            mp3_path = f"{mp3_folder}/{item['mp3']}"
            audio_clip = AudioFileClip(mp3_path)
            audio_duration += audio_clip.duration

            if current_img_clip.audio is None:
                current_img_clip = current_img_clip.set_audio(audio_clip)
            else:
                current_img_clip.audio = concatenate_audioclips([current_img_clip.audio, audio_clip])

            if debug_mode:
                debug_text += f" | MP3: {item['mp3']}"

    # Append the last image clip if it exists, with any pending audio adjustments
    if current_img_clip is not None:
        clips.append(current_img_clip.set_duration(audio_duration))

    # Apply debug mode text to all clips if debug_mode is True
    if debug_mode:
        for i, clip in enumerate(clips):
            txt_clip = TextClip(debug_text, fontsize=20, color='white', bg_color='black').set_position('bottom').set_duration(clip.duration)
            clips[i] = CompositeVideoClip([clip, txt_clip])

    # Concatenate all clips into one video
    final_clip = concatenate_videoclips(clips, method="compose")

    # Export the video
    
    fps_num = 24
    if debug_mode:
        fps_num = 4
    
    final_clip.write_videofile(output_file, fps=24, codec="libx264", audio_codec="aac", bitrate="4000k", threads=thread_count)


    end_time = time.time()  # End timing
    return end_time - start_time  # Return the duration of the render operation

duration = render_movie(mp3_f, image_f, json_file, output_file, debug_mode=True)


Sample JSON data

[
    {
        "img": "maintitle.png"
    },
    {
        "mp3": "intro_music.mp3"
    },
    {
        "mp3": "10000300_7095387230477472453.mp3"
    },
    {
        "img": "CH1.png"
    },
    {
        "mp3": "10000200_13107803339676511791.mp3"
    }
]
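
For reference, here is a minimal stdlib-only sketch of how a script like this maps onto render segments: each mp3 entry is shown with the most recent img entry before it. `plan_segments` is a hypothetical helper name, not part of MoviePy, and the sample is inlined without trailing commas because `json.loads` rejects them.

```python
import json

# The sample script from above, inlined as strict JSON (no trailing commas)
SAMPLE = """
[
    {"img": "maintitle.png"},
    {"mp3": "intro_music.mp3"},
    {"mp3": "10000300_7095387230477472453.mp3"},
    {"img": "CH1.png"},
    {"mp3": "10000200_13107803339676511791.mp3"}
]
"""

def plan_segments(items):
    """Pair every mp3 entry with the image that was most recently named."""
    segments = []
    current_img = None
    for item in items:
        if "img" in item:
            current_img = item["img"]
        elif "mp3" in item:
            if current_img is None:
                raise ValueError("an mp3 entry appears before any img entry")
            segments.append((current_img, item["mp3"]))
    return segments

for img, mp3 in plan_segments(json.loads(SAMPLE)):
    print(f"{img} <- {mp3}")
```

Running the plan before rendering is a cheap way to catch a malformed script without waiting on an encode.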

1 Answer


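I sped this up a lot. My earlier code had a bug: the debug-mode frame rate never took effect, because fps_num was computed but write_videofile was still called with fps=24. I think trying to "merge" the audio onto one clip, rather than simply reusing the image for each mp3, also slowed things down. The text overlay was only there for debugging, but I think it slowed things down too. Processing time dropped from hours to minutes. If I want the text overlay again, I'll modify the image first and then create the ImageClip from the modified image, rather than having the text rendered on every frame.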
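
That last idea, baking the overlay into the image, is not part of the revised code that follows. A minimal sketch of it, assuming Pillow is installed; `burn_label` is a hypothetical helper name, and the 24-pixel strip is an arbitrary size chosen to fit Pillow's default font:

```python
from PIL import Image, ImageDraw

def burn_label(image_path, label, out_path):
    """Save a copy of the image with `label` drawn on a black strip at the bottom."""
    img = Image.open(image_path).convert("RGB")
    draw = ImageDraw.Draw(img)
    strip_height = 24  # assumption: tall enough for Pillow's built-in default font
    top = img.height - strip_height
    # Black background strip across the full width, then white text on top of it
    draw.rectangle([0, top, img.width, img.height], fill="black")
    draw.text((4, top + 4), label, fill="white")
    img.save(out_path)
    return out_path
```

The labelled file can then be passed to ImageClip in place of the original, so the text is drawn once per image instead of being composited into every frame.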

from moviepy.editor import ImageClip, AudioFileClip, concatenate_videoclips
import json

def render_movie(mp3_folder, images_folder, json_file, output_file, debug_mode=False, thread_count=14):
    print("Rendering")
    image_clips = []
    image_path = None  # most recent image named by the script

    with open(json_file, 'r', encoding="UTF-8") as file:
        data = json.load(file)
    for item in data:
        if 'img' in item:
            # Remember the image; every following mp3 reuses it
            image_path = f"{images_folder}/{item['img']}"
        elif 'mp3' in item:
            if image_path is None:
                raise ValueError("script must name an image before the first mp3")
            audio_path = f"{mp3_folder}/{item['mp3']}"
            audio = AudioFileClip(audio_path)
            # One still-image clip per mp3, lasting exactly as long as its audio
            image_clips.append(ImageClip(image_path).set_audio(audio).set_duration(audio.duration))
    print("Concatenating")
    final_clip = concatenate_videoclips(image_clips, method="compose")
    print("Finalizing")
    # Debug renders use 4 fps, so far fewer frames are encoded
    final_clip.write_videofile(output_file, fps=(4 if debug_mode else 30), threads=thread_count, verbose=False)
    print("Done")


# mp3_f, image_f, json_file and output_file are the paths defined in the question
render_movie(mp3_f, image_f, json_file, output_file, debug_mode=True)
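
Since the question mentions possibly switching tools: one option is to call ffmpeg directly for each image/mp3 pair. The sketch below only builds the command line and does not run it. It assumes an ffmpeg binary with libx264 is on the PATH; `still_image_cmd` is a hypothetical helper name, and I have not benchmarked this against the MoviePy version above.

```python
def still_image_cmd(image_path, audio_path, out_path, fps=4):
    """Build an ffmpeg argv that shows one image for the length of one audio file."""
    return [
        "ffmpeg", "-y",                       # overwrite the output if it exists
        "-loop", "1", "-framerate", str(fps), "-i", image_path,  # repeat the still image
        "-i", audio_path,
        "-c:v", "libx264", "-tune", "stillimage", "-pix_fmt", "yuv420p",
        "-c:a", "aac",
        "-shortest",                          # stop when the audio stream ends
        out_path,
    ]

cmd = still_image_cmd("CH1.png", "chapter1.mp3", "segment_001.mp4")
print(" ".join(cmd))
```

Each command can be executed with `subprocess.run(cmd, check=True)`, and the resulting segments joined afterwards, for example with ffmpeg's concat demuxer.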
