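AttributeError: 'Document' object has no attribute 'get_doc_id'
My application: load CSV files into a knowledge graph (using KnowledgeGraphIndex) and use a large language model (HuggingFaceH4/zephyr-7b-beta) to get answers from the graph store (SimpleGraphStore).
My problem: I want to pass multiple CSV files into the knowledge graph. I am using CSVLoader, and when I run KnowledgeGraphIndex I get this error: AttributeError: 'Document' object has no attribute 'get_doc_id'.
This is how I load the CSV: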
`from langchain.document_loaders import CSVLoader
from langchain.text_splitter import CharacterTextSplitter
# Load one CSV file; each row becomes a LangChain Document
csv_loader = CSVLoader("/content/Train-Set.csv")
data = csv_loader.load()
# Split the loaded rows into chunks of at most 500 characters
splitter = CharacterTextSplitter(
    separator="\n",
    chunk_size=500,
    chunk_overlap=0,
    length_function=len,
)
documents = splitter.split_documents(data)`
This is my KnowledgeGraphIndex:
`index = KnowledgeGraphIndex.from_documents(
    documents,
    storage_context=storage_context,
    include_embeddings=True,
    max_triplets_per_chunk=2,
    embed_model=embed_model,
)`
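A quick way to confirm what the index is actually receiving is to check which objects in the list lack a `get_doc_id` method. The sketch below is only an illustration of that check: `find_incompatible` and `FakeLangChainDocument` are hypothetical names made up here, not part of LangChain or LlamaIndex, and the stand-in class only mimics the `page_content` and `metadata` attributes.

```python
from typing import Any, Iterable, List


def find_incompatible(documents: Iterable[Any], required: str = "get_doc_id") -> List[str]:
    """Return the sorted type names of objects that have no callable `required` method."""
    return sorted(
        {type(d).__name__ for d in documents if not callable(getattr(d, required, None))}
    )


# Hypothetical stand-in for a LangChain-style document: it carries
# page_content and metadata, but no get_doc_id method.
class FakeLangChainDocument:
    def __init__(self, page_content: str) -> None:
        self.page_content = page_content
        self.metadata = {}


chunks = [FakeLangChainDocument("a,b,c"), FakeLangChainDocument("d,e,f")]
print(find_incompatible(chunks))  # → ['FakeLangChainDocument']
```

Running the same helper on the real `documents` list before calling `KnowledgeGraphIndex.from_documents` would show whether every chunk is of a type the index can use.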
1 Answer
Try using Document from langchain.docstore, then create the documents with that class:
`from langchain.document_loaders import CSVLoader
from langchain.text_splitter import CharacterTextSplitter
from langchain.docstore.document import Document
from llama_index.core import KnowledgeGraphIndex
csv_loader = CSVLoader("/content/Train-Set.csv")
data = csv_loader.load()
splitter = CharacterTextSplitter(
    separator="\n",
    chunk_size=500,
    chunk_overlap=0,
    length_function=len,
)
docs = splitter.split_documents(data)
# Rebuild each split chunk as a Document (iterate over docs, not the empty list)
documents = []
for doc in docs:
    new_doc = Document(page_content=doc.page_content)
    documents.append(new_doc)
index = KnowledgeGraphIndex.from_documents(
    documents,
    include_embeddings=True,
    max_triplets_per_chunk=2,
)`
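If the error persists with LangChain's Document class, one possibility is that KnowledgeGraphIndex expects LlamaIndex's own Document type, which exposes `get_doc_id`. Under that assumption, a minimal sketch of the conversion could look like the following. `convert_chunks`, `StandInChunk`, and `StandInDocument` are hypothetical names used only for illustration; the stand-ins replace the real classes so the snippet runs without either library installed.

```python
import uuid
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, Iterable, List


def convert_chunks(lc_docs: Iterable[Any], document_cls: Callable[..., Any]) -> List[Any]:
    """Rebuild LangChain-style chunks as instances of `document_cls`.

    Assumes `document_cls` accepts `text=` and `metadata=` keyword arguments.
    """
    return [
        document_cls(text=d.page_content, metadata=dict(d.metadata or {}))
        for d in lc_docs
    ]


# Hypothetical stand-in for a LangChain chunk (page_content + metadata).
@dataclass
class StandInChunk:
    page_content: str
    metadata: Dict[str, Any] = field(default_factory=dict)


# Hypothetical stand-in for a document type that provides get_doc_id().
@dataclass
class StandInDocument:
    text: str
    metadata: Dict[str, Any] = field(default_factory=dict)
    doc_id: str = field(default_factory=lambda: str(uuid.uuid4()))

    def get_doc_id(self) -> str:
        return self.doc_id


chunks = [
    StandInChunk("id: 1\nname: alpha", {"row": 0}),
    StandInChunk("id: 2\nname: beta", {"row": 1}),
]
documents = convert_chunks(chunks, StandInDocument)
print(len(documents), documents[0].metadata)
```

With the real libraries, the call would presumably be `convert_chunks(docs, Document)` with `Document` imported from `llama_index.core`, and the result passed to `KnowledgeGraphIndex.from_documents`. LlamaIndex also seems to provide a `Document.from_langchain_format` classmethod for this purpose; check that it exists in your installed version before relying on it.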