Spaces: aadi8anant/DocBot


aadi8anant committed on Jun 13

Commit 5c9a23e • 1 Parent(s): 6ad2654

Upload 2 files


Files changed (2)

README.md +59 -12
requirements.txt +8 -0

README.md CHANGED

@@ -1,12 +1,59 @@
----
-title: DocBot
-emoji: 🐢
-colorFrom: indigo
-colorTo: pink
-sdk: streamlit
-sdk_version: 1.35.0
-app_file: app.py
-pinned: false
----
-Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

+# 🤖 DocBot: Smart Document ChatBot
+DocBot is an intelligent document processing application with a chatbot interface. It can process various types of documents, including PDFs and images, extract essential information, and enable user interaction through a chat interface.
+## ⭐️ Features
+- **Document Upload**: Upload PDF, PNG, JPG, or JPEG files for processing.
+- **Text Extraction**: Extract text content from uploaded documents.
+- **Image Processing**: Convert PDF documents to images and extract text from images.
+- **Chatbot Interface**: Interact with the document through a chatbot interface powered by Groq.
+- **Natural Language Understanding**: Utilizes spaCy for natural language processing.
+- **Dynamic Progress Bar**: Visual feedback on document processing progress.
+- **Error Handling**: Provides error messages for any processing failures.
+## ⚙️ Installation
+1. Clone the repository:
+    ```bash
+    git clone https://github.com/yourusername/docbot.git
+    ```
+2. Install the required Python packages:
+    ```bash
+    pip install -r requirements.txt
+    ```
+3. Set up the environment variables:
+    Create a `.env` file in the root directory and add the following:
+    ```dotenv
+    GROQ_API_KEY='your_groq_api_key'
+    ```
+4. Run the Streamlit app:
+    ```bash
+    streamlit run app.py
+    ```
+## 🚀 Usage
+1. Run the Streamlit app using the provided installation instructions.
+2. Upload your document using the file uploader.
+3. Wait for the document to be processed.
+4. Interact with the document by asking questions in the chatbot interface.
+## 💻 Technologies Used
+- [Streamlit](https://streamlit.io/) - For building the interactive web application.
+- [PyPDF2](https://pythonhosted.org/PyPDF2/) - For PDF document processing.
+- [pdf2image](https://github.com/Belval/pdf2image) - For converting PDFs to images.
+- [PyMuPDF](https://pypi.org/project/PyMuPDF/) - For PDF document rendering.
+- [Tesseract OCR](https://github.com/tesseract-ocr/tesseract) - For extracting text from images.
+- [spaCy](https://spacy.io/) - For natural language processing.
+- [Groq](https://github.com/groq/groq-python) - For AI-powered chatbot interaction.
+- [Pillow](https://python-pillow.org/) - For image processing.
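
Review note: the README's upload and extraction features imply a dispatch on file type, with PDFs going to a PDF text extractor and images going to OCR. The following is a minimal sketch of that routing, not the code in `app.py` (which is not part of this commit). The names `route_upload` and `extract_text` are hypothetical, and the PDF branch assumes the PyPDF2 1.x API that matches the pinned `PyPDF2==1.26.0`.

```python
from pathlib import Path

# Extensions listed under "Document Upload" in the README.
PDF_EXTENSIONS = {".pdf"}
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg"}


def route_upload(filename: str) -> str:
    """Return "pdf" or "image" for a supported upload, else raise ValueError."""
    suffix = Path(filename).suffix.lower()
    if suffix in PDF_EXTENSIONS:
        return "pdf"
    if suffix in IMAGE_EXTENSIONS:
        return "image"
    raise ValueError(f"Unsupported file type: {suffix or '(none)'}")


def extract_text(path: str) -> str:
    """Hypothetical extractor. Third-party imports are deferred so that
    route_upload stays usable without the OCR dependencies installed."""
    if route_upload(path) == "pdf":
        # Assumption: PyPDF2 1.x, which exposes PdfFileReader and extractText.
        from PyPDF2 import PdfFileReader

        with open(path, "rb") as fh:
            reader = PdfFileReader(fh)
            pages = (reader.getPage(i) for i in range(reader.getNumPages()))
            return "\n".join(page.extractText() for page in pages)

    # Image branch: pytesseract needs the Tesseract binary on the system.
    import pytesseract
    from PIL import Image

    with Image.open(path) as img:
        return pytesseract.image_to_string(img)
```

Rejecting unsupported extensions up front gives the "Error Handling" feature a single place to produce a user-facing message before any heavy processing starts.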

requirements.txt ADDED

@@ -0,0 +1,8 @@

+streamlit==1.11.0
+PyPDF2==1.26.0
+pdf2image==1.16.0
+pytesseract==0.3.9
+Pillow==9.2.0
+spacy==3.3.1
+transformers==4.21.1
+requests==2.28.1
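
Review note: the pins above can be cross-checked against the README's "Technologies Used" list. The sketch below is illustrative only. The mapping from README names to PyPI package names (for example Tesseract OCR to `pytesseract`, Groq to `groq`) is an assumption, and the helper names `parse_pins` and `missing_from_requirements` are hypothetical.

```python
# Contents of requirements.txt as added in this commit.
REQUIREMENTS = """\
streamlit==1.11.0
PyPDF2==1.26.0
pdf2image==1.16.0
pytesseract==0.3.9
Pillow==9.2.0
spacy==3.3.1
transformers==4.21.1
requests==2.28.1
"""

# Assumed PyPI names for the libraries named in the README.
README_PACKAGES = [
    "streamlit", "PyPDF2", "pdf2image", "PyMuPDF",
    "pytesseract", "spacy", "groq", "Pillow",
]


def parse_pins(text: str) -> dict:
    """Map lowercased package name -> pinned version for `name==version` lines."""
    pins = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or "==" not in line:
            continue
        name, version = line.split("==", 1)
        pins[name.strip().lower()] = version.strip()
    return pins


def missing_from_requirements(packages, requirements_text):
    """Return the packages that have no pin in the requirements text."""
    pins = parse_pins(requirements_text)
    return [pkg for pkg in packages if pkg.lower() not in pins]


print(missing_from_requirements(README_PACKAGES, REQUIREMENTS))
# prints ['PyMuPDF', 'groq']
```

Under that assumed mapping, PyMuPDF and Groq are listed in the README but not pinned here, while `transformers` and `requests` are pinned but not mentioned in the README.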