docverifyrag / backend /ingest.py
Carlos Salgado
update backend files, ignore pycache
d665e88
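from langchain_community.document_loaders import UnstructuredPDFLoader
from langchain_text_splitters import CharacterTextSplitter


def ingest_pdf(path):
    # Load the PDF at `path` into LangChain Document objects.
    loader = UnstructuredPDFLoader(path)
    # Split the loaded documents into chunks of about 1000 characters, no overlap.
    text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
    data = text_splitter.split_documents(loader.load())
    return data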