Dynamically converting HTML to PDF with Python and HTMLDOC

Question

About a year ago I built a Django application for a client. He has since resold it to a government agency so secretive that they won't even tell me its name.

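Part of the application generates PDF files on the fly using a Python library called xhtml2pdf (also known as pisa). The government doesn't like this library, though, and won't say why; they only say that I must use HTMLDOC to generate the PDFs instead.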

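There isn't much documentation for HTMLDOC, but judging from the PHP examples I've seen, it can be driven from the command line, so it should be usable from Python as well. However, I'm having trouble passing HTML to HTMLDOC. It appears to accept only files, but I need to pass dynamically generated HTML as a string (or write the HTML string to a temporary file and pass that file to HTMLDOC).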
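The temporary-file route mentioned above could look roughly like the following. This is only a minimal sketch: the function name `html_to_pdf` and its `command` parameter are hypothetical, the flags are the ones already used in this question, and it assumes that `htmldoc` writes the PDF to standard output when no output file is given. Error handling is left out.

```python
import os
import subprocess
import tempfile


def html_to_pdf(html, command=("htmldoc", "-t", "pdf", "--quiet")):
    """Write `html` to a temporary file, run `command` on it, return stdout bytes.

    `html_to_pdf` and `command` are illustrative names, not part of HTMLDOC.
    """
    # delete=False lets us close (and flush) the file before the child reads it.
    tmp = tempfile.NamedTemporaryFile(suffix=".html", delete=False)
    try:
        try:
            tmp.write(html.encode("utf-8"))
        finally:
            tmp.close()
        # Pass the environment variable to the child explicitly.
        env = dict(os.environ, HTMLDOC_NOCGI="1")
        # An argument list (not one string) avoids shell quoting problems.
        proc = subprocess.Popen(
            list(command) + [tmp.name],
            stdout=subprocess.PIPE,
            env=env,
        )
        output, _ = proc.communicate()
        return output
    finally:
        # Remove the temporary file whether or not the command succeeded.
        os.unlink(tmp.name)
```

In a Django view, the returned bytes would then be handed to `HttpResponse` in the same way as `pdf` in the code further down.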

I thought StringIO would solve this, but I'm getting an error. Here is my code:

def render_to_pdf(template_src, context_dict):
    template = get_template(template_src)
    context = Context(context_dict)
    html  = template.render(context)
    result = StringIO.StringIO(html.encode("utf-8"))
    os.putenv("HTMLDOC_NOCGI", "1")

    # This line raises "[Errno 2] No such file or directory"
    htmldoc = subprocess.Popen("htmldoc -t pdf --quiet '%s'" % result, stdout=subprocess.PIPE).communicate()

    pdf = htmldoc[0]
    result.close()
    return HttpResponse(pdf, mimetype='application/pdf')

Any ideas, suggestions, or help would be greatly appreciated.

Thanks.

Update

Error traceback:

Environment:


Request Method: GET
Request URL: (redacted)

Django Version: 1.3 alpha 1 SVN-14921
Python Version: 2.6.5
Installed Applications:
['django.contrib.auth',
 'django.contrib.contenttypes',
 'django.contrib.sessions',
 'django.contrib.sites',
 'django.contrib.messages',
 'django.contrib.admin',
 'application']
Installed Middleware:
('django.middleware.common.CommonMiddleware',
 'django.contrib.sessions.middleware.SessionMiddleware',
 'django.middleware.csrf.CsrfViewMiddleware',
 'django.contrib.auth.middleware.AuthenticationMiddleware',
 'django.contrib.messages.middleware.MessageMiddleware')


Traceback:
File "/usr/local/lib/python2.6/dist-packages/django/core/handlers/base.py" in get_response

  111. response = callback(request, *callback_args, **callback_kwargs)

File "/usr/local/lib/python2.6/dist-packages/django/contrib/auth/decorators.py" in _wrapped_view

  23. return view_func(request, *args, **kwargs)

File "/home/ascgov/application/views/pdf.py" in application_pdf

  90. 'user':owner})

File "/home/ascgov/application/views/pdf.py" in render_to_pdf

  53. htmldoc = subprocess.Popen("/usr/bin/htmldoc -t pdf --quiet '%s'" % result, stdout=subprocess.PIPE).communicate()

File "/usr/lib/python2.6/subprocess.py" in __init__

  633. errread, errwrite)

File "/usr/lib/python2.6/subprocess.py" in _execute_child

  1139. raise child_exception

Exception Type: OSError at /pdf/application/feed-filtr/
Exception Value: [Errno 2] No such file or directory
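One plausible reading of this traceback, offered as an assumption and not something the question confirms: `subprocess.Popen` is given the whole command as a single string without `shell=True`, so on POSIX systems the entire string is treated as the name of the program to run. In addition, formatting a file-like object with `%s` inserts its repr, not its contents. The sketch below illustrates both effects; `spawn_errno` and the program name `no-such-program-xyz` are hypothetical and used only for the demonstration.

```python
import errno
import io
import shlex
import subprocess

# Formatting a file-like object with %s inserts its repr, not the HTML it holds.
buffer = io.BytesIO(b"<p>hello</p>")
formatted = "htmldoc -t pdf --quiet '%s'" % buffer
contains_html = "<p>hello</p>" in formatted  # the markup never reaches the command


def spawn_errno(args):
    """Return the errno raised while starting `args`, or None if it started."""
    try:
        subprocess.Popen(
            args, stdout=subprocess.PIPE, stderr=subprocess.PIPE
        ).communicate()
    except OSError as exc:
        return exc.errno
    return None


# A single string without shell=True is looked up as one program name.
# "no-such-program-xyz" is a made-up name chosen so that it will not exist.
whole_string_errno = spawn_errno("no-such-program-xyz -t pdf --quiet page.html")

# Splitting the command yields the argument list that Popen expects.
argv = shlex.split("htmldoc -t pdf --quiet 'page.html'")
```

Here `whole_string_errno` equals `errno.ENOENT`, the same "[Errno 2] No such file or directory" seen above, and `argv` is the list form that can be passed to `Popen` directly.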

