AurelioAguirre

Update README.md

11c65a0 verified about 2 months ago

	---
	title: LLMServer
	emoji: 👹
	colorFrom: indigo
	colorTo: purple
	sdk: docker
	pinned: false
	---

	# LLM Server

	This repository contains a FastAPI-based server that serves open-source Large Language Models from Hugging Face.

	## Getting Started

	These instructions will help you set up and run the project on your local machine.

	### Prerequisites

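	- Python 3.10, 3.11, or 3.12 (model initialization is known to fail on Python 3.13; see the error log at the end of this README)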
	- Git
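
Because the log at the end of this README shows a failure under Python 3.13, it can help to check the interpreter version before creating the virtual environment. The sketch below is an illustration and not part of the repository; the helper name `is_supported_python` and the upper bound of 3.12 are assumptions drawn from that log.

```python
import sys

# Assumed supported range: 3.10 (the README minimum) through 3.12
# (the error log in this README shows a failure on 3.13).
MIN_VERSION = (3, 10)
MAX_VERSION = (3, 12)


def is_supported_python(version=None):
    """Return True if (major, minor) falls inside the assumed supported range."""
    major, minor = (version or sys.version_info)[:2]
    return MIN_VERSION <= (major, minor) <= MAX_VERSION


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    verdict = "looks supported" if is_supported_python() else "may not work with this project"
    print(f"Python {current} {verdict}")
```

Running the script with the interpreter you plan to use for the virtual environment prints a one-line verdict.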

	### Cloning the Repository

	Choose one of the following methods to clone the repository:

	#### HTTPS
	```bash
	git clone https://huggingface.co/spaces/TeamGenKI/LLMServer
	cd LLMServer
	```

	#### SSH
	```bash
	git clone git@hf.co:spaces/TeamGenKI/LLMServer
	cd LLMServer
	```

	### Setting Up the Virtual Environment

	#### Windows
	```powershell
	# Create virtual environment
	python -m venv myenv

	# Activate virtual environment
	myenv\Scripts\activate

	# Install dependencies
	pip install -r requirements.txt
	```

	#### Linux
	```bash
	# Create virtual environment
	python -m venv myenv

	# Activate virtual environment
	source myenv/bin/activate

	# Install dependencies
	pip install -r requirements.txt
	```

	#### macOS
	```bash
	# Create virtual environment
	python3 -m venv myenv

	# Activate virtual environment
	source myenv/bin/activate

	# Install dependencies
	pip3 install -r requirements.txt
	```
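
On any of the three platforms, a quick way to confirm that the activation step worked is to ask the interpreter itself. This is a minimal sketch; `in_virtualenv` is a hypothetical helper name and is not part of the repository.

```python
import sys


def in_virtualenv():
    """Return True when the interpreter is running inside a venv.

    Inside an environment created with `python -m venv`, sys.prefix points
    at the environment while sys.base_prefix points at the base install.
    """
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("virtual environment active:", in_virtualenv())
    print("interpreter:", sys.executable)
```

If the first line prints `False`, `pip install` would target the system interpreter rather than `myenv`.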

	### Running the Application

	Once you have set up your environment and installed the dependencies, you can start the FastAPI application:

	```bash
	uvicorn main.app:app --reload
	```

	The API will be available at `http://localhost:8000` (uvicorn's default port).

	### API Documentation

	Once the application is running, you can access:
	- Interactive API documentation (Swagger UI) at `http://localhost:8000/docs`
	- Alternative API documentation (ReDoc) at `http://localhost:8000/redoc`
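
The request log at the end of this README shows two endpoints being called with a `model_name` query parameter: `POST /api/v1/model/download` and `POST /api/v1/model/initialize`. The sketch below shows one way to call them using only the standard library. The base URL, the empty request body, the timeout, and the helper names `build_model_url` and `post` are assumptions; check the Swagger UI for the actual request and response schemas.

```python
import urllib.error
import urllib.parse
import urllib.request

BASE_URL = "http://localhost:8000"  # assumption: uvicorn's default port


def build_model_url(action, model_name, base_url=BASE_URL):
    """Build /api/v1/model/<action>?model_name=... with the name URL-encoded."""
    query = urllib.parse.urlencode({"model_name": model_name})
    return f"{base_url}/api/v1/model/{action}?{query}"


def post(url, timeout=600):
    """Send a POST with an empty body and return (status_code, body_text)."""
    request = urllib.request.Request(url, data=b"", method="POST")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status, response.read().decode("utf-8")
    except urllib.error.HTTPError as error:
        # 4xx/5xx responses raise HTTPError; return them instead of crashing.
        return error.code, error.read().decode("utf-8")
```

With the server running, `post(build_model_url("download", "microsoft/Phi-3.5-mini-instruct"))` followed by the same call with `"initialize"` mirrors the two requests in the log. The slash in the model name is encoded as `%2F`, as it is there.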

	### Deactivating the Virtual Environment

	When you're done working on the project, you can deactivate the virtual environment:

	```bash
	deactivate
	```

	## Contributing

	[Add contributing guidelines here]

	## License

	[Add license information here]

	## Project Structure

	```
	.
	├── Dockerfile
	├── main
	│   ├── api.py
	│   ├── app.py
	│   ├── config.yaml
	│   ├── env_template
	│   ├── __init__.py
	│   ├── logs
	│   │   └── llm_api.log
	│   ├── models
	│   ├── __pycache__
	│   │   ├── api.cpython-39.pyc
	│   │   ├── app.cpython-39.pyc
	│   │   ├── __init__.cpython-39.pyc
	│   │   └── routes.cpython-39.pyc
	│   ├── routes.py
	│   ├── test_locally.py
	│   └── utils
	│       ├── errors.py
	│       ├── helpers.py
	│       ├── __init__.py
	│       ├── logging.py
	│       ├── __pycache__
	│       │   ├── helpers.cpython-39.pyc
	│       │   ├── __init__.cpython-39.pyc
	│       │   ├── logging.cpython-39.pyc
	│       │   └── validation.cpython-39.pyc
	│       └── validation.py
	├── README.md
	└── requirements.txt
	```

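	## Known Issues

	Model initialization currently fails when the virtual environment uses Python 3.13. In the log below, the download request succeeds, but `POST /api/v1/model/initialize` returns 500 because `transformers.integrations.bitsandbytes` cannot be imported (`Dynamo is not supported on Python 3.13+`). The log also shows a deprecation warning for the `load_in_4bit` and `load_in_8bit` arguments and a missing bitsandbytes CUDA binary.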

	INFO: 127.0.0.1:60874 - "POST /api/v1/model/download?model_name=microsoft%2FPhi-3.5-mini-instruct HTTP/1.1" 200 OK
	2025-01-13 16:18:45,409 - api_routes - INFO - Received request to initialize model: microsoft/Phi-3.5-mini-instruct
	2025-01-13 16:18:45,409 - llm_api - INFO - Initializing generation model: microsoft/Phi-3.5-mini-instruct
	2025-01-13 16:18:45,412 - llm_api - INFO - Loading model from local path: main/models/Phi-3.5-mini-instruct
	The `load_in_4bit` and `load_in_8bit` arguments are deprecated and will be removed in the future versions. Please, pass a `BitsAndBytesConfig` object in `quantization_config` argument instead.
	Could not find the bitsandbytes CUDA binary at PosixPath('/home/aurelio/Desktop/Projects/LLMServer/myenv/lib/python3.13/site-packages/bitsandbytes/libbitsandbytes_cuda124.so')
	g++ (GCC) 14.2.1 20240910
	Copyright (C) 2024 Free Software Foundation, Inc.
	This is free software; see the source for copying conditions. There is NO
	warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

	2025-01-13 16:18:45,982 - llm_api - ERROR - Failed to initialize generation model microsoft/Phi-3.5-mini-instruct: Failed to import transformers.integrations.bitsandbytes because of the following error (look up to see its traceback):
	Dynamo is not supported on Python 3.13+
	2025-01-13 16:18:45,982 - api_routes - ERROR - Error initializing model: Failed to import transformers.integrations.bitsandbytes because of the following error (look up to see its traceback):
	Dynamo is not supported on Python 3.13+
	INFO: 127.0.0.1:38330 - "POST /api/v1/model/initialize?model_name=microsoft%2FPhi-3.5-mini-instruct HTTP/1.1" 500 Internal Server Error
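
The deprecation warning in the log asks for a `BitsAndBytesConfig` object passed through `quantization_config`. Below is a minimal sketch of that change, assuming the model is loaded with `AutoModelForCausalLM.from_pretrained` (the repository's actual loading code is not shown in this README). `quantization_flags` and `load_quantized` are hypothetical helper names, and the choice of 4-bit as the default is an assumption. This addresses only the deprecation warning; the Python 3.13 import failure still requires an older interpreter.

```python
def quantization_flags(bits):
    """Map a bit width to the matching BitsAndBytesConfig keyword argument."""
    if bits == 4:
        return {"load_in_4bit": True}
    if bits == 8:
        return {"load_in_8bit": True}
    raise ValueError(f"bitsandbytes quantization supports 4 or 8 bits, got {bits}")


def load_quantized(model_path, bits=4):
    """Load a causal LM with quantization passed via quantization_config."""
    # Imported lazily so quantization_flags works without transformers installed.
    from transformers import AutoModelForCausalLM, BitsAndBytesConfig

    config = BitsAndBytesConfig(**quantization_flags(bits))
    # Replaces the deprecated from_pretrained(..., load_in_4bit=True) style.
    return AutoModelForCausalLM.from_pretrained(model_path, quantization_config=config)
```

For the model in the log, the call would be `load_quantized("main/models/Phi-3.5-mini-instruct", bits=4)`, which needs a working CUDA build of bitsandbytes.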