mistralai
groq
requests
beautifulsoup4
docx2txt
python-docx
textract
openpyxl==3.1.0
sentence-transformers
anthropic
pdfminer
pypdf
langchain
unstructured[docx]
transformers