AttributeError: 'Document' object has no attribute 'get_doc_id'

Asked 2025-04-13 16:51

My application: load CSV files into a knowledge graph (using KnowledgeGraphIndex) and use a large language model (HuggingFaceH4/zephyr-7b-beta) to retrieve answers from the graph store (SimpleGraphStore).

My problem: I want to pass multiple CSV files into the knowledge graph. I am using CSVLoader, and when I run KnowledgeGraphIndex I get this error: AttributeError: 'Document' object has no attribute 'get_doc_id'.

This is how I load the CSV:

`from langchain.document_loaders import CSVLoader
from langchain.text_splitter import CharacterTextSplitter
csv_loader = CSVLoader("/content/Train-Set.csv")
data = csv_loader.load()

splitter = CharacterTextSplitter(separator = "\n",
                                chunk_size=500, 
                                chunk_overlap=0,
                                length_function=len)
documents = splitter.split_documents(data)`

This is my KnowledgeGraphIndex:

`index = KnowledgeGraphIndex.from_documents(
    documents,
    storage_context=storage_context,
    include_embeddings=True,
    max_triplets_per_chunk=2,
    embed_model=embed_model,
)`
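
Note: LlamaIndex's `from_documents` calls `get_doc_id()` on every object it receives, and the objects returned by LangChain's CSVLoader and CharacterTextSplitter are LangChain `Document` instances, which do not define that method. That would explain the traceback. A minimal sketch for confirming which objects are affected before indexing; `find_missing_method` is a hypothetical helper, not part of either library:

```python
from typing import Any, Iterable, List, Tuple


def find_missing_method(
    documents: Iterable[Any], method: str = "get_doc_id"
) -> List[Tuple[int, str]]:
    """Return (index, class path) for every object that lacks a callable `method`."""
    missing = []
    for i, doc in enumerate(documents):
        if not callable(getattr(doc, method, None)):
            cls = type(doc)
            missing.append((i, f"{cls.__module__}.{cls.__qualname__}"))
    return missing
```

Calling `find_missing_method(documents)` on the split chunks should list a LangChain class path for each entry if the framework mix-up is the cause, and an empty list if the objects are already LlamaIndex documents.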

1 Answer


Try importing Document from langchain.docstore.document, then use that class to create the documents:

from langchain.document_loaders import CSVLoader
from langchain.text_splitter import CharacterTextSplitter
from langchain.docstore.document import Document
from llama_index.core import KnowledgeGraphIndex

csv_loader = CSVLoader("/content/Train-Set.csv")
data = csv_loader.load()
documents = []
splitter = CharacterTextSplitter(separator="\n",
                                 chunk_size=500,
                                 chunk_overlap=0,
                                 length_function=len)
docs = splitter.split_documents(data)

# Rebuild each split chunk as a Document (iterate over docs, the splitter output)
for doc in docs:
    new_doc = Document(page_content=doc.page_content)
    documents.append(new_doc)

index = KnowledgeGraphIndex.from_documents(
    documents,
    include_embeddings=True,
    max_triplets_per_chunk=2,
)
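
The class imported above is LangChain's own `Document`, which has no `get_doc_id` method, so if the error persists the chunks probably need to be rebuilt as LlamaIndex documents instead. A minimal sketch of that conversion, assuming the target constructor accepts `text=`, `metadata=` and `id_=` keyword arguments (as `llama_index.core.Document` is understood to). `LangChainLikeDocument`, `convert_documents` and the `csv-chunk-N` ID scheme are hypothetical names used only for illustration:

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Iterable, List


@dataclass
class LangChainLikeDocument:
    """Stand-in for a LangChain Document: only page_content and metadata."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def convert_documents(
    lc_docs: Iterable[Any], make_document: Callable[..., Any]
) -> List[Any]:
    """Rebuild each LangChain-style chunk with the target framework's constructor.

    Assumption: make_document accepts text=, metadata= and id_= keyword arguments.
    """
    converted = []
    for i, doc in enumerate(lc_docs):
        converted.append(
            make_document(
                text=doc.page_content,
                metadata=dict(doc.metadata),  # copy so the source chunk is not mutated
                id_=f"csv-chunk-{i}",  # hypothetical ID scheme
            )
        )
    return converted
```

With LlamaIndex installed, `documents = convert_documents(docs, Document)` after `from llama_index.core import Document` should produce objects that `KnowledgeGraphIndex.from_documents` can accept. The constructor is passed in as a parameter so that the sketch runs without either library installed.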
