How do I avoid generating duplicate image files?

Posted 2024-04-25 03:49:18


How can I avoid generating duplicate image files in Python?

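I have a project that uses PyDocX (a Python module for docx conversion) to convert MS Word documents to basic HTML. The code mostly works as expected, except for the part that writes image files to disk.

I use a combination of a random-key function, an image-name function, and urlretrieve. My requirement is to write/generate unique, custom file names.

Here is my code: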

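import codecs
import glob
import os
import random
import string
from urllib.request import urlretrieve

from pydocx.export import PyDocXHTMLExporter
from pydocx.export.html import HtmlTag

# IMAGE_LOCATION is not shown in the post; it is assumed to be defined elsewhere


def random_key(length):
    # Build a string of `length` random digits
    return ''.join(random.choice(string.digits) for _ in range(length))


# Function to generate random image names
def image_name():
    return os.path.join(IMAGE_LOCATION, random_key(4))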

# The post omits the class statement; the PyDocXHTMLExporter base class is an assumption
class PyDocXHTMLExporterImageOut(PyDocXHTMLExporter):

    def get_image_tag(self, image, width=None, height=None, rotate=None,
                      alt=None, caption=None):
        image_src = self.get_image_source(image)

        # Get the file extension from the base64 data URI
        # https://matthewdaly.co.uk/blog/2015/07/04/handling-images-as-base64-strings-with-django-rest-framework/
        header, imag = image_src.split(';base64,')
        # Guess the file extension
        ext = header.split('/')[-1]

        # Capture the generated filename with the proper extension to use
        # in the img src attribute
        image_src_new = 'doc_img_' + image_name() + '.' + ext

        # Code that is generating duplicate images from the same base64
        # source string: convert the base64 string to an image with urlretrieve
        urlretrieve(image_src, './source/output/' + image_src_new)

        # Set the image source to the newly created filename
        attrs = {'src': image_src_new}
        if rotate:
            attrs['style'] = 'transform: rotate(%sdeg);' % rotate
        if alt:
            attrs['alt'] = alt

        return HtmlTag('img', allow_self_closing=True, allow_whitespace=True,
                       **attrs)

# List files with glob using filter
source_files = glob.glob('./source/mydocument.docx')

for file in source_files:
    html = PyDocXHTMLExporterImageOut(file).export()

    # Get the actual filename, excluding the parent directory
    # (the original file.split('/')[3] raises IndexError for this path)
    file_name = os.path.basename(file)
    # Get the filename without the extension
    no_ext_file_name = os.path.splitext(file_name)[0]

    # Use codecs to write clean HTML content as UTF-16
    output_path = ('./source/output/' + no_ext_file_name).lower() + '.html'
    with codecs.open(output_path, 'w', 'utf-16') as output:
        output.write(html)

print('Done converting source word files to html')

Thanks


1 answer
User
#1 · Posted 2024-04-25 03:49:18

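I'm guessing you mean the occasional duplicate random name. I would wrap an attempt to read the file with the generated name in a try/except block: if an exception is raised, the file does not exist yet, so the generated name is unique.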
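
A minimal sketch of that idea, with one variation: instead of trying to read the file first, it opens the file in exclusive-create mode ('xb'), which raises FileExistsError when the name is already taken, so the check and the write happen in one step. The helper name save_unique_image, its parameters, the direct base64 decoding in place of urlretrieve, and the use of secrets.token_hex for the random part are assumptions for illustration, not part of the code in the question.

```python
import base64
import os
import secrets


def save_unique_image(data_uri, out_dir, prefix="doc_img_", attempts=100):
    """Decode a base64 data URI and write it under a name not yet used in out_dir.

    Hypothetical helper: returns the file name that was written.
    """
    header, encoded = data_uri.split(";base64,")
    ext = header.split("/")[-1]  # "data:image/png" -> "png"
    payload = base64.b64decode(encoded)
    os.makedirs(out_dir, exist_ok=True)

    for _ in range(attempts):
        name = "{}{}.{}".format(prefix, secrets.token_hex(8), ext)
        path = os.path.join(out_dir, name)
        try:
            # "x" creates the file and fails if it already exists
            with open(path, "xb") as handle:
                handle.write(payload)
            return name
        except FileExistsError:
            continue  # name already taken, draw another one
    raise RuntimeError("could not find an unused file name")
```

Inside get_image_tag, something like save_unique_image(image_src, './source/output') could replace the urlretrieve call, with the returned name used as the src attribute. This only guarantees that names are unique; it does not detect two images with identical content.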
