How do I configure inference settings when generating images with the Stable Diffusion XL pipeline?

Asked 2025-04-12 23:55

I am using the Stable Diffusion XL (SDXL) model from Hugging Face's diffusers library, and I want to set the following inference parameters:

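  • width: the width of the image, in pixels.
  • height: the height of the image, in pixels.
  • steps: the number of inference steps performed when generating the image.
  • cfg_scale: how strictly the diffusion process adheres to the prompt text (higher values keep the image closer to the prompt).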

Here is a minimal example of my current implementation:

import os
import datetime

from diffusers import DiffusionPipeline
import torch

if __name__ == "__main__":
    output_dir = "output_images"
    os.makedirs(output_dir, exist_ok=True)

    pipe = DiffusionPipeline.from_pretrained(
        # https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0
        "stabilityai/stable-diffusion-xl-base-1.0",
        torch_dtype=torch.float16,
        use_safetensors=True,
        variant="fp16",
    )
    pipe.to("cuda")
    # enabling xformers for memory efficiency
    pipe.enable_xformers_memory_efficient_attention()

    prompt = "Extreme close up of a slice a lemon with splashing green cocktail, alcohol,  healthy food photography"

    images = pipe(prompt=prompt).images
    timestamp = datetime.datetime.now().strftime("%Y%m%d_%H%M%S")
    image_path = os.path.join(output_dir, f"output_{timestamp}.jpg")
    images[0].save(image_path)

    print(f"Image saved at: {image_path}")

How can I set these inference parameters?

1 Answer

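Here is my solution: pass the settings as keyword arguments when calling the pipeline. In diffusers, steps corresponds to num_inference_steps and cfg_scale corresponds to guidance_scale.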

import os
import datetime

from diffusers import DiffusionPipeline
import torch

if __name__ == "__main__":
    output_dir = "output_images"
    os.makedirs(output_dir, exist_ok=True)

    pipe = DiffusionPipeline.from_pretrained(
        # https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0
        "stabilityai/stable-diffusion-xl-base-1.0",
        torch_dtype=torch.float16,
        use_safetensors=True,
        variant="fp16",
    )
    pipe.to("cuda")
    # enabling xformers for memory efficiency
    pipe.enable_xformers_memory_efficient_attention()

    prompt = "Extreme close up of a slice a lemon with splashing green cocktail, alcohol,  healthy food photography"

    # Example values; tune these for your use case.
    guidance_scale = 7.0
    num_inference_steps = 30

    images = pipe(
        prompt=prompt,
        negative_prompt="",
        width=1024,  # Width of the image in pixels.
        height=1024,  # Height of the image in pixels.
        guidance_scale=guidance_scale,  # How strictly the diffusion process adheres to the prompt text (higher values keep your image closer to your prompt).
        num_inference_steps=num_inference_steps,  # Number of inference steps performed during image generation.
        num_images_per_prompt=1,
    ).images
    timestamp = datetime.datetime.now().strftime("%Y%m%d_%H%M%S")
    image_path = os.path.join(output_dir, f"output_{timestamp}.jpg")
    images[0].save(image_path)

    print(f"Image saved at: {image_path}")
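
If you would rather keep the names from the question (width, height, steps, cfg_scale), one option is a small helper that translates them into the pipeline's keyword arguments. This is only a sketch: InferenceSettings and to_pipeline_kwargs are hypothetical names that are not part of diffusers, and the default values are assumptions that, as far as I know, match the SDXL pipeline defaults. The multiple-of-8 check mirrors the size validation the SDXL pipeline performs on height and width.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InferenceSettings:
    """Hypothetical container for the four settings named in the question."""

    width: int = 1024        # assumed default
    height: int = 1024       # assumed default
    steps: int = 50          # assumed default for num_inference_steps
    cfg_scale: float = 5.0   # assumed default for guidance_scale

    def to_pipeline_kwargs(self) -> dict:
        """Return keyword arguments using the names the pipeline call expects."""
        # Fail early on sizes that are not positive multiples of 8.
        for name, value in (("width", self.width), ("height", self.height)):
            if value <= 0 or value % 8 != 0:
                raise ValueError(
                    f"{name} must be a positive multiple of 8, got {value}"
                )
        if self.steps < 1:
            raise ValueError(f"steps must be at least 1, got {self.steps}")
        return {
            "width": self.width,
            "height": self.height,
            "num_inference_steps": self.steps,
            "guidance_scale": self.cfg_scale,
        }


if __name__ == "__main__":
    settings = InferenceSettings(width=1216, height=832, steps=30, cfg_scale=7.0)
    print(settings.to_pipeline_kwargs())
```

With a loaded pipeline, the call then becomes images = pipe(prompt=prompt, **settings.to_pipeline_kwargs()).images, which keeps the generation script free of hard-coded values.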
