---
title: LLMServer
emoji: 👹
colorFrom: indigo
colorTo: purple
sdk: docker
pinned: false
---

# LLM Server

This repository contains a FastAPI-based server that serves open-source Large Language Models from Hugging Face.

## Getting Started

These instructions will help you set up and run the project on your local machine.

### Prerequisites

- Python 3.10 – 3.12 (Python 3.13 is not yet supported; see Known Issues below)
- Git

### Cloning the Repository

Choose one of the following methods to clone the repository:

#### HTTPS

```bash
git clone https://huggingface.co/spaces/TeamGenKI/LLMServer
cd LLMServer
```

#### SSH

```bash
git clone git@hf.co:spaces/TeamGenKI/LLMServer
cd LLMServer
```

### Setting Up the Virtual Environment

#### Windows

```bash
# Create virtual environment
python -m venv myenv

# Activate virtual environment
myenv\Scripts\activate

# Install dependencies
pip install -r requirements.txt
```

#### Linux

```bash
# Create virtual environment
python -m venv myenv

# Activate virtual environment
source myenv/bin/activate

# Install dependencies
pip install -r requirements.txt
```

#### macOS

```bash
# Create virtual environment
python3 -m venv myenv

# Activate virtual environment
source myenv/bin/activate

# Install dependencies
pip3 install -r requirements.txt
```

### Running the Application

Once you have set up your environment and installed the dependencies, you can start the FastAPI application:

```bash
uvicorn main.app:app --reload
```

The API will be available at `http://localhost:8000` (uvicorn's default port).

### API Documentation

Once the application is running, you can access:

- Interactive API documentation (Swagger UI) at `http://localhost:8000/docs`
- Alternative API documentation (ReDoc) at `http://localhost:8000/redoc`
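### Example Requests

The model-management routes are defined in `main/routes.py`. Below is a minimal client sketch, assuming the server is running locally on port 8000; the two endpoint paths are taken from the request log in the Known Issues section, while everything else (the `requests` client, the error handling) is illustrative:

```python
import requests

BASE_URL = "http://localhost:8000"  # assumption: local uvicorn default port
MODEL = "microsoft/Phi-3.5-mini-instruct"

# Download the model weights into main/models/
r = requests.post(f"{BASE_URL}/api/v1/model/download", params={"model_name": MODEL})
r.raise_for_status()

# Initialize (load) the downloaded model for generation
r = requests.post(f"{BASE_URL}/api/v1/model/initialize", params={"model_name": MODEL})
r.raise_for_status()
```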
### Deactivating the Virtual Environment

When you're done working on the project, you can deactivate the virtual environment:

```bash
deactivate
```

## Contributing

[Add contributing guidelines here]

## License

[Add license information here]

## Project Structure

```
.
├── Dockerfile
├── main
│   ├── api.py
│   ├── app.py
│   ├── config.yaml
│   ├── env_template
│   ├── __init__.py
│   ├── logs
│   │   └── llm_api.log
│   ├── models
│   ├── routes.py
│   ├── test_locally.py
│   └── utils
│       ├── errors.py
│       ├── helpers.py
│       ├── __init__.py
│       ├── logging.py
│       └── validation.py
├── README.md
└── requirements.txt
```

## Known Issues

Initializing a quantized model currently fails on Python 3.13. A representative log:

```
INFO:     127.0.0.1:60874 - "POST /api/v1/model/download?model_name=microsoft%2FPhi-3.5-mini-instruct HTTP/1.1" 200 OK
2025-01-13 16:18:45,409 - api_routes - INFO - Received request to initialize model: microsoft/Phi-3.5-mini-instruct
2025-01-13 16:18:45,409 - llm_api - INFO - Initializing generation model: microsoft/Phi-3.5-mini-instruct
2025-01-13 16:18:45,412 - llm_api - INFO - Loading model from local path: main/models/Phi-3.5-mini-instruct
The `load_in_4bit` and `load_in_8bit` arguments are deprecated and will be removed in the future versions. Please, pass a `BitsAndBytesConfig` object in `quantization_config` argument instead.
Could not find the bitsandbytes CUDA binary at PosixPath('/home/aurelio/Desktop/Projects/LLMServer/myenv/lib/python3.13/site-packages/bitsandbytes/libbitsandbytes_cuda124.so')
g++ (GCC) 14.2.1 20240910
Copyright (C) 2024 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
2025-01-13 16:18:45,982 - llm_api - ERROR - Failed to initialize generation model microsoft/Phi-3.5-mini-instruct: Failed to import transformers.integrations.bitsandbytes because of the following error (look up to see its traceback):
Dynamo is not supported on Python 3.13+
2025-01-13 16:18:45,982 - api_routes - ERROR - Error initializing model: Failed to import transformers.integrations.bitsandbytes because of the following error (look up to see its traceback):
Dynamo is not supported on Python 3.13+
INFO:     127.0.0.1:38330 - "POST /api/v1/model/initialize?model_name=microsoft%2FPhi-3.5-mini-instruct HTTP/1.1" 500 Internal Server Error
```

The log shows two distinct problems:

- The fatal one: `transformers.integrations.bitsandbytes` fails to import because torch Dynamo does not support Python 3.13+. Running the server under Python 3.10 – 3.12 avoids this.
- A deprecation warning: the `load_in_4bit`/`load_in_8bit` arguments should be replaced with a `BitsAndBytesConfig` passed via `quantization_config`, as in the sketch below.

Note also that bitsandbytes could not find its CUDA binary in this environment, so GPU quantization would not work here even on a supported Python version.
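The deprecation warning can be addressed in the model-loading code. A minimal sketch of the replacement pattern, where the local path is taken from the log above and the quantization settings are assumptions, not this repository's actual configuration:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Pass quantization settings through BitsAndBytesConfig instead of the
# deprecated load_in_4bit=/load_in_8bit= keyword arguments.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                     # assumption: 4-bit quantization
    bnb_4bit_compute_dtype=torch.float16,  # assumption: fp16 compute dtype
)

model = AutoModelForCausalLM.from_pretrained(
    "main/models/Phi-3.5-mini-instruct",   # local path from the log above
    quantization_config=bnb_config,
)
```

This only resolves the deprecation warning; the 500 error itself comes from the Python 3.13 / Dynamo incompatibility and requires running on an older interpreter.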