jayakody2000lk
/

SummLlama3.2-3B-Q5_K_M-GGUF

Inference Endpoints

Model card Files Files and versions Community

SummLlama3.2-3B-Q5_K_M-GGUF / README.md

jayakody2000lk's picture

Upload README.md with huggingface_hub

3166078 verified 3 months ago

|

2.88 kB

	---
	library_name: transformers
	base_model: DISLab/SummLlama3.2-3B
	pipeline_tag: summarization
	widget:
	- text: '<\|begin_of_text\|><\|start_header_id\|>user<\|end_header_id\|>

	Below is an instruction that describes a task. Write a response that appropriately
	completes the request.


	### Instruction:

	Please summarize the input documnet.


	### Input:

	The tower is 324 metres (1,063 ft) tall, about the same height as an 81-storey
	building, and the tallest structure in Paris. Its base is square, measuring 125
	metres (410 ft) on each side. During its construction, the Eiffel Tower surpassed
	the Washington Monument to become the tallest man-made structure in the world,
	a title it held for 41 years until the Chrysler Building in New York City was
	finished in 1930. It was the first structure to reach a height of 300 metres.
	Due to the addition of a broadcasting aerial at the top of the tower in 1957,
	it is now taller than the Chrysler Building by 5.2 metres (17 ft). Excluding transmitters,
	the Eiffel Tower is the second tallest free-standing structure in France after
	the Millau Viaduct.


	### Response:<\|eot_id\|>'
	tags:
	- llama-cpp
	- gguf-my-repo
	---

	# jayakody2000lk/SummLlama3.2-3B-Q5_K_M-GGUF
	This model was converted to GGUF format from [`DISLab/SummLlama3.2-3B`](https://huggingface.co/DISLab/SummLlama3.2-3B) using llama.cpp via the ggml.ai's [GGUF-my-repo](https://huggingface.co/spaces/ggml-org/gguf-my-repo) space.
	Refer to the [original model card](https://huggingface.co/DISLab/SummLlama3.2-3B) for more details on the model.

	## Use with llama.cpp
	Install llama.cpp through brew (works on Mac and Linux)

	```bash
	brew install llama.cpp

	```
	Invoke the llama.cpp server or the CLI.

	### CLI:
	```bash
	llama-cli --hf-repo jayakody2000lk/SummLlama3.2-3B-Q5_K_M-GGUF --hf-file summllama3.2-3b-q5_k_m.gguf -p "The meaning to life and the universe is"
	```

	### Server:
	```bash
	llama-server --hf-repo jayakody2000lk/SummLlama3.2-3B-Q5_K_M-GGUF --hf-file summllama3.2-3b-q5_k_m.gguf -c 2048
	```

	Note: You can also use this checkpoint directly through the [usage steps](https://github.com/ggerganov/llama.cpp?tab=readme-ov-file#usage) listed in the Llama.cpp repo as well.

	Step 1: Clone llama.cpp from GitHub.
	```
	git clone https://github.com/ggerganov/llama.cpp
	```

	Step 2: Move into the llama.cpp folder and build it with `LLAMA_CURL=1` flag along with other hardware-specific flags (for ex: LLAMA_CUDA=1 for Nvidia GPUs on Linux).
	```
	cd llama.cpp && LLAMA_CURL=1 make
	```

	Step 3: Run inference through the main binary.
	```
	./llama-cli --hf-repo jayakody2000lk/SummLlama3.2-3B-Q5_K_M-GGUF --hf-file summllama3.2-3b-q5_k_m.gguf -p "The meaning to life and the universe is"
	```
	or
	```
	./llama-server --hf-repo jayakody2000lk/SummLlama3.2-3B-Q5_K_M-GGUF --hf-file summllama3.2-3b-q5_k_m.gguf -c 2048
	```