---
title: LLMServer
emoji: πΉ
colorFrom: indigo
colorTo: purple
sdk: docker
pinned: false
---
# LLM Server
This repository contains a FastAPI-based server that serves open-source Large Language Models from Hugging Face.
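The routing and application wiring live in `main/routes.py` and `main/app.py` (see the project structure below). As a rough, hypothetical sketch of the shape of such a server (not the project's actual code; the handler body here is a stand-in):

```python
from fastapi import APIRouter, FastAPI

# Hypothetical sketch only -- the real implementation lives in main/app.py
# and main/routes.py. The endpoint path matches one seen in the server logs.
router = APIRouter(prefix="/api/v1")

@router.post("/model/download")
def download_model(model_name: str):
    # The real server downloads the model weights from Hugging Face here.
    return {"status": "downloaded", "model": model_name}

app = FastAPI(title="LLM Server")
app.include_router(router)
```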
## Getting Started
These instructions will help you set up and run the project on your local machine.
### Prerequisites

- Python 3.10 or higher (note: model initialization currently fails on Python 3.13; see the known error at the end of this README)
- Git
### Cloning the Repository

Choose one of the following methods to clone the repository:

**HTTPS**

```bash
git clone https://huggingface.co/spaces/TeamGenKI/LLMServer
cd LLMServer
```

**SSH**

```bash
git clone git@hf.co:spaces/TeamGenKI/LLMServer
cd LLMServer
```
### Setting Up the Virtual Environment

**Windows**

```bash
# Create virtual environment
python -m venv myenv

# Activate virtual environment
myenv\Scripts\activate

# Install dependencies
pip install -r requirements.txt
```

**Linux**

```bash
# Create virtual environment
python -m venv myenv

# Activate virtual environment
source myenv/bin/activate

# Install dependencies
pip install -r requirements.txt
```

**macOS**

```bash
# Create virtual environment
python3 -m venv myenv

# Activate virtual environment
source myenv/bin/activate

# Install dependencies
pip3 install -r requirements.txt
```
## Running the Application

Once you have set up your environment and installed the dependencies, you can start the FastAPI application:

```bash
uvicorn main.app:app --reload
```

The API will be available at http://localhost:8000 (uvicorn's default port).
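A quick way to confirm the server is up is to request the OpenAPI schema that FastAPI serves at `/openapi.json` by default (a sketch using the third-party `requests` package, which you may need to install separately):

```python
import requests

# FastAPI exposes its OpenAPI schema at /openapi.json by default.
resp = requests.get("http://localhost:8000/openapi.json", timeout=5)
resp.raise_for_status()
print(resp.json()["info"]["title"])  # prints the API title if the server is up
```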
## API Documentation

Once the application is running, you can access:

- Interactive API documentation (Swagger UI) at http://localhost:8000/docs
- Alternative API documentation (ReDoc) at http://localhost:8000/redoc
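The server log at the end of this README shows two model-management endpoints, `POST /api/v1/model/download` and `POST /api/v1/model/initialize`, each taking a `model_name` query parameter. Below is a client sketch based only on what those logs show; the response bodies are not documented, so only the status codes are printed:

```python
import requests

BASE_URL = "http://localhost:8000/api/v1"
MODEL = "microsoft/Phi-3.5-mini-instruct"

# Download the weights first, then initialize the model for generation.
for step in ("download", "initialize"):
    resp = requests.post(f"{BASE_URL}/model/{step}", params={"model_name": MODEL})
    print(f"{step}: HTTP {resp.status_code}")
```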
## Deactivating the Virtual Environment

When you're done working on the project, you can deactivate the virtual environment:

```bash
deactivate
```
## Contributing

[Add contributing guidelines here]

## License

[Add license information here]
## Project Structure

```
.
├── Dockerfile
├── main
│   ├── api.py
│   ├── app.py
│   ├── config.yaml
│   ├── env_template
│   ├── __init__.py
│   ├── logs
│   │   └── llm_api.log
│   ├── models
│   ├── __pycache__
│   │   ├── api.cpython-39.pyc
│   │   ├── app.cpython-39.pyc
│   │   ├── __init__.cpython-39.pyc
│   │   └── routes.cpython-39.pyc
│   ├── routes.py
│   ├── test_locally.py
│   └── utils
│       ├── errors.py
│       ├── helpers.py
│       ├── __init__.py
│       ├── logging.py
│       ├── __pycache__
│       │   ├── helpers.cpython-39.pyc
│       │   ├── __init__.cpython-39.pyc
│       │   ├── logging.cpython-39.pyc
│       │   └── validation.cpython-39.pyc
│       └── validation.py
├── README.md
└── requirements.txt
```
## Known Error

Initializing microsoft/Phi-3.5-mini-instruct fails when the server runs under Python 3.13: transformers' bitsandbytes integration cannot be imported because Dynamo does not support Python 3.13+, and the initialize request returns a 500. The full server log:
```
INFO:     127.0.0.1:60874 - "POST /api/v1/model/download?model_name=microsoft%2FPhi-3.5-mini-instruct HTTP/1.1" 200 OK
2025-01-13 16:18:45,409 - api_routes - INFO - Received request to initialize model: microsoft/Phi-3.5-mini-instruct
2025-01-13 16:18:45,409 - llm_api - INFO - Initializing generation model: microsoft/Phi-3.5-mini-instruct
2025-01-13 16:18:45,412 - llm_api - INFO - Loading model from local path: main/models/Phi-3.5-mini-instruct
The load_in_4bit and load_in_8bit arguments are deprecated and will be removed in the future versions. Please, pass a BitsAndBytesConfig object in quantization_config argument instead.
Could not find the bitsandbytes CUDA binary at PosixPath('/home/aurelio/Desktop/Projects/LLMServer/myenv/lib/python3.13/site-packages/bitsandbytes/libbitsandbytes_cuda124.so')
g++ (GCC) 14.2.1 20240910
Copyright (C) 2024 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
2025-01-13 16:18:45,982 - llm_api - ERROR - Failed to initialize generation model microsoft/Phi-3.5-mini-instruct: Failed to import transformers.integrations.bitsandbytes because of the following error (look up to see its traceback): Dynamo is not supported on Python 3.13+
2025-01-13 16:18:45,982 - api_routes - ERROR - Error initializing model: Failed to import transformers.integrations.bitsandbytes because of the following error (look up to see its traceback): Dynamo is not supported on Python 3.13+
INFO:     127.0.0.1:38330 - "POST /api/v1/model/initialize?model_name=microsoft%2FPhi-3.5-mini-instruct HTTP/1.1" 500 Internal Server Error
```
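Two things stand out in this log. First, the deprecation warning already names the fix for the quantization arguments: pass a `BitsAndBytesConfig` object via `quantization_config` instead of the bare `load_in_4bit`/`load_in_8bit` flags. A minimal sketch of that change (whether the server loads models exactly this way is an assumption; the local path is taken from the log):

```python
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Replaces the deprecated load_in_4bit=True argument, as the warning advises.
bnb_config = BitsAndBytesConfig(load_in_4bit=True)

model = AutoModelForCausalLM.from_pretrained(
    "main/models/Phi-3.5-mini-instruct",  # local path seen in the log above
    quantization_config=bnb_config,
)
```

Second, the 500 error itself comes from running under Python 3.13 (note `myenv/lib/python3.13` in the log): Dynamo does not support Python 3.13+, so recreating the virtual environment with Python 3.10-3.12 is the likely workaround.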