Retrieving the full text data from a cell (the cell contains multiple font colors/styles)

1 Answer

User
#1 · Posted on 2024-04-29 14:49:56

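I ran into the same problem: I needed to find the red text spans in some rich-text cells. After digging into the source code of openpyxl (v3.0.9), I found that it does parse the rich-text tags, but the formatting is stripped by the reader, because the read_string_table function uses the content property of the Text object.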
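To make concrete what is lost when a shared string is flattened to plain text, here is a minimal sketch using only the standard library, not openpyxl. It assumes the usual SpreadsheetML layout in which a rich-text shared string is an `si` element containing `r` runs, each with an optional `rPr` (run properties) and a `t` (text). The sample XML and the helper name `runs_with_color` are made up for illustration.

```python
import xml.etree.ElementTree as ET

NS = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"

# Hypothetical shared-string entry: "Hello " in the default font, "world" in red.
SI_XML = (
    f'<si xmlns="{NS}">'
    '<r><t xml:space="preserve">Hello </t></r>'
    '<r><rPr><color rgb="FFFF0000"/></rPr><t>world</t></r>'
    '</si>'
)


def runs_with_color(si):
    """Return a (text, rgb-or-None) pair for each <r> run of an <si> element."""
    runs = []
    for r in si.findall(f"{{{NS}}}r"):
        t = r.find(f"{{{NS}}}t")
        color = r.find(f"{{{NS}}}rPr/{{{NS}}}color")
        text = (t.text or "") if t is not None else ""
        rgb = color.get("rgb") if color is not None else None
        runs.append((text, rgb))
    return runs


si = ET.fromstring(SI_XML)
runs = runs_with_color(si)

# Joining the run texts is all that a plain-text reader keeps.
plain = "".join(text for text, _ in runs)

# Keeping the per-run properties is what makes the red spans findable.
red = [text for text, rgb in runs if rgb == "FFFF0000"]

print(plain)
print(red)
```

Joining the runs discards the per-run color, which is why a reader that stores only the joined text cannot report which span was red.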
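So I wrote a simple patch script that overrides the read_string_table function so that it returns the raw Text object whenever formatted text is present. The modified read_string_table function looks like this: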
    # Imports assumed to match those used by openpyxl's own reader/strings.py
    from openpyxl.cell.text import Text
    from openpyxl.xml.constants import SHEET_MAIN_NS
    from openpyxl.xml.functions import iterparse


    def read_string_table(xml_source):
        """Read in all shared strings in the table.

        If a shared string has formatted snippets, the raw Text object is
        appended to the returned list. Otherwise, only the plain text content
        of the shared string is appended to the list.
        """
        strings = []
        STRING_TAG = '{%s}si' % SHEET_MAIN_NS
        for _, node in iterparse(xml_source):
            if node.tag == STRING_TAG:
                text_obj = Text.from_tree(node)
                if text_obj.formatted:
                    text = text_obj  # return raw Text object
                else:  # original processing
                    text = text_obj.content
                    text = text.replace('x005F_', '')
                node.clear()  # free the parsed element
                strings.append(text)
        return strings
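The complete patch script was linked here in the original answer. Import it and call its patch_read_string_table function before importing any openpyxl module directly. Once the patch is applied, the value of a rich-text cell is a Text object that carries all the style information you need.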
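Depending on your use case, this may not be the best solution, but it shows where the formatting gets stripped and how to recover it. I hope a more elegant solution can be proposed and eventually merged into the official code.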
