---
title: LLMServer
emoji: 👹
colorFrom: indigo
colorTo: purple
sdk: docker
pinned: false
---

# LLM Server

This repository contains a FastAPI-based server that downloads and serves open-source Large Language Models from Hugging Face.

## Getting Started

These instructions will help you set up and run the server on your local machine.

### Prerequisites

- Python 3.10 or higher (3.13 currently breaks quantized model loading, because PyTorch Dynamo does not yet support it; see the error log at the bottom of this README)
- Git

### Cloning the Repository

Choose one of the following methods to clone the repository:

#### HTTPS

```bash
git clone https://huggingface.co/spaces/TeamGenKI/LLMServer
cd LLMServer
```

#### SSH

```bash
git clone git@hf.co:spaces/TeamGenKI/LLMServer
cd LLMServer
```

### Setting Up the Virtual Environment

#### Windows

```
# Create virtual environment
python -m venv myenv

# Activate virtual environment
myenv\Scripts\activate

# Install dependencies
pip install -r requirements.txt
```

#### Linux

```bash
# Create virtual environment
python -m venv myenv

# Activate virtual environment
source myenv/bin/activate

# Install dependencies
pip install -r requirements.txt
```

#### macOS

```bash
# Create virtual environment
python3 -m venv myenv

# Activate virtual environment
source myenv/bin/activate

# Install dependencies
pip3 install -r requirements.txt
```

## Running the Application

Once you have set up your environment and installed the dependencies, you can start the FastAPI application:

```bash
uvicorn main.app:app --reload
```

The API will be available at http://localhost:8000 (uvicorn's default port; pass `--port` to change it).
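
For a quick smoke test once the server is up, you can exercise the model endpoints that appear in the log at the bottom of this README: `POST /api/v1/model/download` and `POST /api/v1/model/initialize`, both taking a `model_name` query parameter. Below is a minimal sketch using `requests`; it assumes the default port 8000 and JSON responses, and the exact request/response shapes live in `main/routes.py` and the interactive docs:

```python
import requests

BASE = "http://localhost:8000/api/v1"
MODEL = "microsoft/Phi-3.5-mini-instruct"

# Download the model weights to the local models directory.
r = requests.post(f"{BASE}/model/download", params={"model_name": MODEL})
r.raise_for_status()
print(r.json())

# Load the downloaded model into memory for generation.
r = requests.post(f"{BASE}/model/initialize", params={"model_name": MODEL})
r.raise_for_status()
print(r.json())
```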

## API Documentation

Once the application is running, you can access:

- Interactive API documentation (Swagger UI) at http://localhost:8000/docs
- Alternative API documentation (ReDoc) at http://localhost:8000/redoc
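
FastAPI also serves the machine-readable OpenAPI schema at `/openapi.json` by default, which is handy for listing routes or generating clients. A small sketch, assuming the server is running locally:

```python
import requests

# Fetch the OpenAPI schema FastAPI exposes out of the box.
schema = requests.get("http://localhost:8000/openapi.json").json()

# Print every route and the HTTP methods it supports.
for path, methods in schema["paths"].items():
    print(path, sorted(methods))
```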

## Deactivating the Virtual Environment

When you're done working on the project, you can deactivate the virtual environment:

```bash
deactivate
```

## Contributing

[Add contributing guidelines here]

## License

[Add license information here]

## Project Structure

```
.
├── Dockerfile
├── main
│   ├── api.py
│   ├── app.py
│   ├── config.yaml
│   ├── env_template
│   ├── __init__.py
│   ├── logs
│   │   └── llm_api.log
│   ├── models
│   ├── __pycache__
│   │   ├── api.cpython-39.pyc
│   │   ├── app.cpython-39.pyc
│   │   ├── __init__.cpython-39.pyc
│   │   └── routes.cpython-39.pyc
│   ├── routes.py
│   ├── test_locally.py
│   └── utils
│       ├── errors.py
│       ├── helpers.py
│       ├── __init__.py
│       ├── logging.py
│       ├── __pycache__
│       │   ├── helpers.cpython-39.pyc
│       │   ├── __init__.cpython-39.pyc
│       │   ├── logging.cpython-39.pyc
│       │   └── validation.cpython-39.pyc
│       └── validation.py
├── README.md
└── requirements.txt
```

## Known Error

Model initialization currently fails on Python 3.13: importing `transformers.integrations.bitsandbytes` pulls in PyTorch Dynamo, which does not support Python 3.13+. The captured log:

```
INFO:     127.0.0.1:60874 - "POST /api/v1/model/download?model_name=microsoft%2FPhi-3.5-mini-instruct HTTP/1.1" 200 OK
2025-01-13 16:18:45,409 - api_routes - INFO - Received request to initialize model: microsoft/Phi-3.5-mini-instruct
2025-01-13 16:18:45,409 - llm_api - INFO - Initializing generation model: microsoft/Phi-3.5-mini-instruct
2025-01-13 16:18:45,412 - llm_api - INFO - Loading model from local path: main/models/Phi-3.5-mini-instruct
The load_in_4bit and load_in_8bit arguments are deprecated and will be removed in the future versions. Please, pass a BitsAndBytesConfig object in quantization_config argument instead.
Could not find the bitsandbytes CUDA binary at PosixPath('/home/aurelio/Desktop/Projects/LLMServer/myenv/lib/python3.13/site-packages/bitsandbytes/libbitsandbytes_cuda124.so')
g++ (GCC) 14.2.1 20240910
Copyright (C) 2024 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

2025-01-13 16:18:45,982 - llm_api - ERROR - Failed to initialize generation model microsoft/Phi-3.5-mini-instruct: Failed to import transformers.integrations.bitsandbytes because of the following error (look up to see its traceback): Dynamo is not supported on Python 3.13+
2025-01-13 16:18:45,982 - api_routes - ERROR - Error initializing model: Failed to import transformers.integrations.bitsandbytes because of the following error (look up to see its traceback): Dynamo is not supported on Python 3.13+
INFO:     127.0.0.1:38330 - "POST /api/v1/model/initialize?model_name=microsoft%2FPhi-3.5-mini-instruct HTTP/1.1" 500 Internal Server Error
```
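
Two hints are visible in the log. First, running the server on Python 3.10–3.12 avoids the Dynamo import failure. Second, the deprecation warning asks for quantization options to be passed via a `BitsAndBytesConfig` object rather than the bare `load_in_4bit`/`load_in_8bit` keyword arguments. Below is a minimal sketch of the loading pattern the warning asks for; the server's actual loading code lives under `main/` and may differ, and the 4-bit settings and compute dtype here are illustrative assumptions:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Pass quantization options through BitsAndBytesConfig instead of the
# deprecated load_in_4bit=/load_in_8bit= keyword arguments.
quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,  # assumed dtype; adjust to your hardware
)

# "main/models/Phi-3.5-mini-instruct" is the local path from the log above.
# device_map="auto" requires the accelerate package.
model = AutoModelForCausalLM.from_pretrained(
    "main/models/Phi-3.5-mini-instruct",
    quantization_config=quant_config,
    device_map="auto",
)
```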