MoviePy shouldn't be this slow

Question

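I'm building an illustrated audiobook from a JSON script. The script keeps the same image on screen for many seconds, sometimes even for more than 1 second. But I think moviepy is still generating every single frame.
I feel there should be a faster way to render this; I'm even considering switching to another library or tool. Every time I make a small change, rendering a one-hour audiobook takes several hours!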
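
To put a rough number on that suspicion: if every frame really is generated and encoded, the work grows with duration times frame rate. Below is a minimal sketch of that arithmetic, assuming a one-hour book and the 24 fps and 4 fps values that appear in my code. The helper name frames_to_encode is hypothetical and is not part of MoviePy.

```python
def frames_to_encode(duration_s: float, fps: float) -> int:
    """Number of frames handed to the encoder if every frame is generated."""
    return round(duration_s * fps)


one_hour = 60 * 60  # seconds

full_rate = frames_to_encode(one_hour, 24)   # normal render
debug_rate = frames_to_encode(one_hour, 4)   # debug render

print(full_rate, debug_rate)  # 86400 14400
```

Since the picture only changes at image boundaries, nearly all of those frames are identical to the one before them.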

Python version: 3.11.5 | packaged by Anaconda, Inc. | (main, Sep 11 2023, 13:26:23) [MSC v.1916 64 bit (AMD64)]

My code

!pip install python-ffmpeg
# remember to restart the kernel if this wasn't installed before
!pip install moviepy
# ImageMagick is not a pip package: install it separately (TextClip needs it)
# and enable the 'legacy tools' option in its installer for this to work

mp3_f = "d:/bookcontent/mp3s/"
image_f = "d:/bookcontent/images/"
json_file = "d:/bookcontent/script.json"
output_file = "d:/desktop/render_debug.mp4"

from moviepy.editor import ImageClip, AudioFileClip, concatenate_audioclips, concatenate_videoclips, TextClip, CompositeVideoClip
import json
import time
import matplotlib.pyplot as plt

def render_movie(mp3_folder, images_folder, json_file, output_file, debug_mode=False, thread_count=14):
    start_time = time.time()  # Start timing
    with open(json_file, 'r', encoding="UTF-8") as file:
        data = json.load(file)

    clips = []
    debug_texts = []  # One debug label per finished clip
    current_img_clip = None
    audio_duration = 0
    debug_text = ""

    for item in data:
        if 'img' in item:
            if current_img_clip is not None:
                # Finalize the previous clip before starting a new one
                clips.append(current_img_clip.set_duration(audio_duration))
                debug_texts.append(debug_text)

            img_path = f"{images_folder}/{item['img']}"
            current_img_clip = ImageClip(img_path)
            audio_duration = 0  # Reset audio duration for the new img clip
            debug_text = f"Image: {item['img']}"  # Start a fresh label for this image

        elif 'mp3' in item:
            mp3_path = f"{mp3_folder}/{item['mp3']}"
            audio_clip = AudioFileClip(mp3_path)
            audio_duration += audio_clip.duration

            # Attach the first audio clip, or append to the audio already attached
            if current_img_clip.audio is None:
                current_img_clip = current_img_clip.set_audio(audio_clip)
            else:
                current_img_clip.audio = concatenate_audioclips([current_img_clip.audio, audio_clip])

            debug_text += f" | MP3: {item['mp3']}"

    # Append the last image clip if it exists, with any pending audio adjustments
    if current_img_clip is not None:
        clips.append(current_img_clip.set_duration(audio_duration))
        debug_texts.append(debug_text)

    # In debug mode, overlay each clip's own label at the bottom of the frame
    if debug_mode:
        for i, (clip, text) in enumerate(zip(clips, debug_texts)):
            txt_clip = TextClip(text, fontsize=20, color='white', bg_color='black').set_position('bottom').set_duration(clip.duration)
            clips[i] = CompositeVideoClip([clip, txt_clip])

    # Concatenate all clips into one video
    final_clip = concatenate_videoclips(clips, method="compose")

    # Export the video (debug mode uses a lower frame rate for quicker test renders)
    fps_num = 24
    if debug_mode:
        fps_num = 4

    final_clip.write_videofile(output_file, fps=fps_num, codec="libx264", audio_codec="aac", bitrate="4000k", threads=thread_count)


    end_time = time.time()  # End timing
    return end_time - start_time  # Return the duration of the render operation

duration = render_movie(mp3_f, image_f, json_file, output_file, debug_mode=True)

Sample JSON data

[
    {
        "img": "maintitle.png"
    },
    {
        "mp3": "intro_music.mp3"
    },
    {
        "mp3": "10000300_7095387230477472453.mp3"
    },
    {
        "img": "CH1.png"
    },
    {
        "mp3": "10000200_13107803339676511791.mp3"
    }
]
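
For clarity, here is how the script is meant to be read: each "img" entry opens a segment, and every "mp3" entry that follows belongs to that image until the next "img" appears. The sketch below shows only that grouping, with no MoviePy involved. The function name group_script is hypothetical, and raising an error for an mp3 that comes before any image is my assumption about the desired behavior.

```python
import json


def group_script(items):
    """Group a flat script into (image, [mp3, ...]) segments.

    Each 'img' entry opens a segment; following 'mp3' entries are attached
    to it until the next 'img' entry.
    """
    segments = []
    for item in items:
        if "img" in item:
            segments.append((item["img"], []))
        elif "mp3" in item:
            if not segments:
                # Assumption: audio without a preceding image is an error
                raise ValueError("script has an mp3 entry before any img entry")
            segments[-1][1].append(item["mp3"])
    return segments


sample = json.loads("""
[
    {"img": "maintitle.png"},
    {"mp3": "intro_music.mp3"},
    {"mp3": "10000300_7095387230477472453.mp3"},
    {"img": "CH1.png"},
    {"mp3": "10000200_13107803339676511791.mp3"}
]
""")

for image, mp3s in group_script(sample):
    print(image, mp3s)
```

With the sample data this yields two segments: maintitle.png with two audio files, and CH1.png with one.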

