UnicodeDecodeError('charmap') when inserting my data into a MySQL database

Posted 2024-05-14 22:46:34

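I am scraping some websites. I extract some text and insert it into my database. The scraped text can be in any language, including English. I want to keep all of the text in the database without losing any of it. Sometimes I get an exception when inserting into the database. The exceptions are all the same. One example looks like this: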

Arguments: (UnicodeDecodeError('charmap', b'Overview :\nGeneral\n1.\nEligibility\nThis \xe2\x80\x9cCall for Expression of Interest\xe2\x80\x9d is open for Individual Consultants.\n2. \nPurpose\nThe aim of the Call is to establish a qualified pool of consultants who can deliver professional services on a short-term basis for the conduct of various activities including but not limited to provision of consultancy services

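I understand that this is probably caused by some undefined characters. How can I avoid them and make sure that no UnicodeDecodeError is raised and that my data gets inserted into my database?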

This is what I have tried so far:

final_text = scraped_text.encode('utf-8', 'ignore')

# "Op" is a placeholder for the mapped model class; the question does not show its real name
op = Op(final_text=final_text)
session.add(op)
session.commit()
session.flush()

This is my database table schema (image not included).

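P.S. It is very important that the data gets inserted into the database, and note that the scraped text is not always in English. It can be French, Dutch, and so on. Thank you very much for your help.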

Sample text that caused the exception:

Message: 'commit exception UNDP procurement' Arguments: (UnicodeDecodeError('charmap', b'Overview :\nRequest for Proposal (RFP) for a consultancy service to conduct a review of routine existing data collection tools and supplementary survey manuals from a gender perspective and publish a standard basic data collection tools and supplementary manuals for surveys and administrative data.\nThe United Nations Entity for Gender Equality and the Empowerment of Women (UN Women) plans to procure a consultancy service a consultancy service to conduct a review of routine existing data collection tools and supplementary survey manuals from a gender perspective and publish a standard basic data collection tools and supplementary manuals for surveys, and administrative dataas described in this Request for Proposal and its related annexes. UN Women now invites sealed proposals from qualified proposers for providing the requirements as defined in these documents.\nIn order to prepare a responsive proposal, you must carefully review, and understand the contents of the following documents:\nThis letter (and the included Proposal Instruction Sheet (PIS)\nInstructions to Proposers (Annex I)available from this link:http://www.unwomen.org/-/media/headquarters/attachments/sections/about%20us/procurement/un-women-procurement-rfp-instructions-en.pdf?la=en&vs=3939\nTerms of Reference (TOR) (Annex 2)\nEvaluation Methodology and Criteria (Annex 3)\nFormat of Technical Proposal (Annex 4)\nFormat of Financial Proposal (Annex 5)\nProposal Submission Form (Annex 6)\nVoluntary Agreement to Promote Gender Equality and Women\xe2\x80\x99s Empowerment (Annex 7)\nUN Women Model Forms of Contract (Annex 8)\nGeneral Conditions of Contract (Annex 8)\nJoint Venture/Consortium/Association Information Form (Annex 9)\nSubmission Checklist (Annex 10)\nThe Proposal Instruction Sheet (PIS) -below- provides the requisite information (with cross reference numbers) which is further detailed in theInstructions to 
Proposers (Annex I)\nDetailed Instruction governing below listed summary of the \xe2\x80\x9cinstructions to proposers\xe2\x80\x9d are available in the Annex I (\xe2\x80\x9cInstruction to Proposers\xe2\x80\x9d) accessible from this link:\nDeadline for submission : Date and Time:Monday 06 July 2020 5:00 PM(EAT)\nAddress of proposal submission : \xe2\x98\x92 Electronic submission of Quotations:
https://ungm.in-tend.co.uk/unwomen/aspx/Home \nPlease note that proposers should attach the below additional documents :\nCompany Registration Certificate \xe2\x80\x93 mandatory\nAudited Financial Statement\nTestimonial/Previous Track Record.\nContact address for requesting clarifications on the solicitation documents: \nhttps://ungm.in-tend.co.uk/unwomen/aspx/Home\n ', 1993, 1994, 'character maps to <undefined>'),)

Process finished with exit code -1

1 answer
User
#1 · Posted 2024-05-14 22:46:34

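You need to use

final_text = scraped_text.decode('utf-8', 'ignore')

not encode. The theory here is that an 8-bit string is "encoded" and has to be "decoded" to turn it into a Unicode string. The string you posted decodes just fine as UTF-8. The special characters in it (\xe2\x80\x9c, \xe2\x80\x9d, \xe2\x80\x99) are UTF-8 encoded "smart quotes".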
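As a rough illustration of the point above, the sketch below decodes a fragment of the bytes from the error message. It assumes the failing decode used a Windows code page such as cp1252, which the question does not state, and `to_text` is a hypothetical helper name, not part of any library.

```python
# A fragment of the byte string shown in the error message.
raw = b'This \xe2\x80\x9cCall for Expression of Interest\xe2\x80\x9d is open'

# Decoding as UTF-8 succeeds; the three-byte sequences become curly quotes.
text = raw.decode('utf-8')

# Assumption: the failure came from a charmap-based codec such as cp1252,
# in which byte 0x9d (the last byte of the closing quote) has no mapping.
try:
    raw.decode('cp1252')
    failed_codec, bad_byte = None, None
except UnicodeDecodeError as exc:
    failed_codec = exc.encoding           # name of the codec that failed
    bad_byte = raw[exc.start:exc.end]     # the byte that could not be mapped


def to_text(value, encoding='utf-8'):
    """Hypothetical helper: return a str, decoding only if given bytes."""
    if isinstance(value, bytes):
        # errors='ignore' silently drops undecodable bytes, as in the answer.
        return value.decode(encoding, errors='ignore')
    return value  # already a str, nothing to decode


print(to_text(raw))
print(failed_codec, bad_byte)
```

On Python 3 a `str` has no `decode` method, so the helper passes `str` values through unchanged and decodes only `bytes`. Whether the scraper returns `bytes` or `str` is not shown in the question, which is why the sketch handles both.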
