SpacyModelCreator / README2.md
WebashalarForML's picture
Update README2.md
e31d2da verified
AI App Icon

Spacy Model Creator



Overview:

This project is a comprehensive Resume Parsing tool built using Python, integrating the Mistral-Nemo-Instruct-2407 model for primary parsing.

Installation Guide:

  1. Create and Activate a Virtual Environment python -m venv venv source venv/bin/activate # For Linux/Mac

    or

    venv\Scripts\activate # For Windows

    NOTE: If the virtual environment (venv) is already created, you can skip the creation step and just activate.

     - For Linux/Mac:
         source venv/bin/activate
     - For Windows:
         venv\Scripts\activate
    
  2. Install Required Libraries pip install -r requirements.txt

    Ensure the following dependencies are included:

    • Flask
    • spaCy
    • huggingface_hub
    • PyMuPDF
    • python-docx
    • Tesseract-OCR (for image-based parsing)

; NOTE : If any model or library is not installed, you can install it using: pip install Replace with the specific model or library you need to install

  1. Set up Hugging Face Token
    • Add your Hugging Face token to the .env file as: HF_TOKEN=

File Structure Overview:

Spacy_Model_creator/
β”‚
β”œβ”€β”€ Models/
β”‚   └── ner_model_05_3  # Pretrained spaCy model directory for resume parsing
β”‚    
β”œβ”€β”€ data/
β”‚   └── Json_data.json 
β”‚   └── resume_text.txt
β”‚   └── Spacy_data.spacy
β”‚
β”œβ”€β”€ templates/
β”‚   β”œβ”€β”€ anoter.html  
β”‚   └── result.html   
β”‚   └── guide.html
β”‚   └── savejson.html
β”‚   └── savespacy.html
β”‚   └── text.html
β”‚   └── upload.html
β”‚   └── data_files.html
β”‚
β”œβ”€β”€ JSON/ 
β”‚   └── Json_data.json 
β”‚
β”œβ”€β”€ utils/
β”‚   β”œβ”€β”€ model.py  # Code for calling Mistral API and handling responses
β”‚   β”œβ”€β”€ json_to_spacy.py  # spaCy fallback model for parsing resumes
β”‚   β”œβ”€β”€ anoter_to_json.py  # Error handling utilities
β”‚   └── file_To_text.py  # Functions to extract text from different file formats (PDF, DOCX, etc.)
β”‚
β”œβ”€β”€ venv/  # Virtual environment
β”‚
β”œβ”€β”€ .env  # Environment variables file (contains Hugging Face token)
β”‚
β”œβ”€β”€ app.py  # Flask app handling API routes for uploading and processing resumes
β”‚
└── requirements.txt  # Dependencies required for the project

References: