<blockquote>
<p>AFAIK, you will have to at least create a temp file so that you can
perform your process.</p>
</blockquote>
<p>You can use the following code to read a PDF file and convert it to text.
It uses PDFMiner and Python 3.7.</p>
<pre><code>from pdfminer.pdfinterp import PDFResourceManager, PDFPageInterpreter
from pdfminer.converter import HTMLConverter, TextConverter, XMLConverter
from pdfminer.layout import LAParams
from pdfminer.pdfpage import PDFPage
import io


def convert(case, fname, pages=None):
    # `case` is not used in this snippet; it is kept so existing calls still work.
    # An empty set makes PDFPage.get_pages process every page.
    if not pages:
        pagenums = set()
    else:
        pagenums = set(pages)

    manager = PDFResourceManager()
    codec = 'utf-8'
    caching = True
    output = io.StringIO()
    converter = TextConverter(manager, output, codec=codec, laparams=LAParams())
    interpreter = PDFPageInterpreter(manager, converter)

    # Open the PDF in binary mode and feed each selected page to the interpreter.
    with open(fname, 'rb') as infile:
        for page in PDFPage.get_pages(infile, pagenums, caching=caching, check_extractable=True):
            interpreter.process_page(page)

    convertedPDF = output.getvalue()
    print(convertedPDF)
    converter.close()
    output.close()
    return convertedPDF
</code></pre>
<p>Call the function above from your main routine:</p>
^{pr2}$
<p>Of course, you can tweak this further, and there is probably room for improvement, but it will definitely work.</p>
<blockquote>
<p>Just make sure you provide a temp PDF file directly instead of a
PDF folder.</p>
</blockquote>
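<p>If your PDF only exists as bytes in memory (for example, from an upload or a download), one way to get a temp file path is sketched below. This is a minimal sketch using only the standard library; the helper name <code>write_temp_pdf</code> is hypothetical and not part of PDFMiner.</p>

```python
import os
import tempfile


def write_temp_pdf(data: bytes) -> str:
    """Write raw PDF bytes to a temporary .pdf file and return its path.

    Hypothetical helper for illustration; not part of PDFMiner.
    """
    # delete=False keeps the file on disk after the handle is closed,
    # so it can be reopened by path. The caller must remove it afterwards.
    with tempfile.NamedTemporaryFile(suffix=".pdf", delete=False) as tmp:
        tmp.write(data)
        return tmp.name


if __name__ == "__main__":
    # Placeholder bytes only; in practice these would be real PDF contents.
    path = write_temp_pdf(b"%PDF-1.4\n")
    try:
        print(path)  # pass this path as `fname` to convert(...)
    finally:
        os.remove(path)  # clean up the temp file when done
```

<p>Pass the returned path as <code>fname</code> to <code>convert</code>, and delete the file once you have the text.</p>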
<p>Hope this helps... Happy coding!</p>