aadi8anant committed on
Commit
4adab6f
•
1 Parent(s): 5c9a23e

Update README.md

  1. README.md +70 -59
README.md CHANGED
@@ -1,59 +1,70 @@
- # 🤖 DocBot: Smart Document ChatBot
-
- DocBot is an intelligent document processing application with a chatbot interface. It can process various types of documents, including PDFs and images, extract essential information, and enable user interaction through a chat interface.
-
- ## ⭐️ Features
-
- - **Document Upload**: Upload PDF, PNG, JPG, or JPEG files for processing.
- - **Text Extraction**: Extract text content from uploaded documents.
- - **Image Processing**: Convert PDF documents to images and extract text from images.
- - **Chatbot Interface**: Interact with the document through a chatbot interface powered by Groq.
- - **Natural Language Understanding**: Utilizes spaCy for natural language processing.
- - **Dynamic Progress Bar**: Visual feedback on document processing progress.
- - **Error Handling**: Provides error messages for any processing failures.
-
- ## ⚙️ Installation
-
- 1. Clone the repository:
-
- ```bash
- git clone https://github.com/yourusername/docbot.git
- ```
-
- 2. Install the required Python packages:
-
- ```bash
- pip install -r requirements.txt
- ```
-
- 3. Set up the environment variables:
-
- Create a `.env` file in the root directory and add the following:
-
- ```dotenv
- GROQ_API_KEY='your_groq_api_key'
- ```
-
- 4. Run the Streamlit app:
-
- ```bash
- streamlit run app.py
- ```
-
- ## 🚀 Usage
-
- 1. Run the Streamlit app using the provided installation instructions.
- 2. Upload your document using the file uploader.
- 3. Wait for the document to be processed.
- 4. Interact with the document by asking questions in the chatbot interface.
-
- ## 💻 Technologies Used
-
- - [Streamlit](https://streamlit.io/) - For building the interactive web application.
- - [PyPDF2](https://pythonhosted.org/PyPDF2/) - For PDF document processing.
- - [pdf2image](https://github.com/Belval/pdf2image) - For converting PDFs to images.
- - [PyMuPDF](https://pypi.org/project/PyMuPDF/) - For PDF document rendering.
- - [Tesseract OCR](https://github.com/tesseract-ocr/tesseract) - For extracting text from images.
- - [spaCy](https://spacy.io/) - For natural language processing.
- - [Groq](https://github.com/groq/groq-py) - For AI-powered chatbot interaction.
- - [Pillow](https://python-pillow.org/) - For image processing.
+ ---
+ title: "DocBot: Smart Document ChatBot"
+ emoji: 🤖
+ colorFrom: indigo
+ colorTo: purple
+ sdk: streamlit
+ sdk_version: "0.87.0"
+ app_file: app.py
+ pinned: false
+ ---
+
+ # 🤖 DocBot: Smart Document ChatBot
+
+ DocBot is an intelligent document processing application with a chatbot interface. It can process various types of documents, including PDFs and images, extract essential information, and enable user interaction through a chat interface.
+
+ ## ⭐️ Features
+
+ - **Document Upload**: Upload PDF, PNG, JPG, or JPEG files for processing.
+ - **Text Extraction**: Extract text content from uploaded documents.
+ - **Image Processing**: Convert PDF documents to images and extract text from images.
+ - **Chatbot Interface**: Interact with the document through a chatbot interface powered by Groq.
+ - **Natural Language Understanding**: Utilizes spaCy for natural language processing.
+ - **Dynamic Progress Bar**: Visual feedback on document processing progress.
+ - **Error Handling**: Provides error messages for any processing failures.
+
+ ## ⚙️ Installation
+
+ 1. Clone the repository:
+
+ ```bash
+ git clone https://github.com/yourusername/docbot.git
+ ```
+
+ 2. Install the required Python packages:
+
+ ```bash
+ pip install -r requirements.txt
+ ```
+
+ 3. Set up the environment variables:
+
+ Create a `.env` file in the root directory and add the following:
+
+ ```dotenv
+ GROQ_API_KEY='your_groq_api_key'
+ ```
+
+ 4. Run the Streamlit app:
+
+ ```bash
+ streamlit run app.py
+ ```
+
+ ## 🚀 Usage
+
+ 1. Run the Streamlit app using the provided installation instructions.
+ 2. Upload your document using the file uploader.
+ 3. Wait for the document to be processed.
+ 4. Interact with the document by asking questions in the chatbot interface.
+
+ ## 💻 Technologies Used
+
+ - [Streamlit](https://streamlit.io/) - For building the interactive web application.
+ - [PyPDF2](https://pythonhosted.org/PyPDF2/) - For PDF document processing.
+ - [pdf2image](https://github.com/Belval/pdf2image) - For converting PDFs to images.
+ - [PyMuPDF](https://pypi.org/project/PyMuPDF/) - For PDF document rendering.
+ - [Tesseract OCR](https://github.com/tesseract-ocr/tesseract) - For extracting text from images.
+ - [spaCy](https://spacy.io/) - For natural language processing.
+ - [Groq](https://github.com/groq/groq-py) - For AI-powered chatbot interaction.
+ - [Pillow](https://python-pillow.org/) - For image processing.
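
Installation step 3 in the README stores `GROQ_API_KEY` in a `.env` file. The README does not show how `app.py` reads it, so the following is only a minimal sketch of one way it could be done using the Python standard library. The helper names `load_env_file` and `require_groq_api_key` are hypothetical, and the actual app may well rely on a package such as python-dotenv instead.

```python
import os
from pathlib import Path
from typing import Dict


def load_env_file(path: str = ".env") -> Dict[str, str]:
    """Parse KEY=value lines from a dotenv-style file (hypothetical helper).

    Parsed values are returned and also exported to os.environ, without
    overwriting variables that are already set in the environment.
    """
    loaded: Dict[str, str] = {}
    env_path = Path(path)
    if not env_path.exists():
        return loaded
    for raw_line in env_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and anything that is not KEY=value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        key, value = key.strip(), value.strip()
        # Drop one pair of matching quotes, as in GROQ_API_KEY='...'.
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "'\"":
            value = value[1:-1]
        loaded[key] = value
        os.environ.setdefault(key, value)
    return loaded


def require_groq_api_key() -> str:
    """Return the key, failing early if it is missing or still the placeholder."""
    key = os.environ.get("GROQ_API_KEY", "")
    if not key or key == "your_groq_api_key":
        raise RuntimeError("GROQ_API_KEY is not set; add it to .env or the environment")
    return key
```

Calling `load_env_file()` once at startup and then `require_groq_api_key()` before creating the chat client would surface a missing key as a clear error instead of a failed API call later on.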