	---
	license: apache-2.0
	---

	# Kexer models

	Kexer models are a collection of open-source generative text models fine-tuned on the Kotlin Exercises dataset.
	This repository contains the fine-tuned CodeLlama-7b model in the Hugging Face Transformers format.

	# Model use

	```python
	from transformers import AutoModelForCausalLM, AutoTokenizer

	# Load pre-trained model and tokenizer
	model_name = 'JetBrains/CodeLlama-7B-Kexer' # Replace with the desired model name
	tokenizer = AutoTokenizer.from_pretrained(model_name)
	model = AutoModelForCausalLM.from_pretrained(model_name).cuda()

	# Encode input text
	input_text = """This function takes an integer n and returns factorial of a number:
	fun factorial(n: Int): Int {"""
	input_ids = tokenizer.encode(input_text, return_tensors='pt').to('cuda')

	# Generate text
	output = model.generate(input_ids, max_length=150, num_return_sequences=1, no_repeat_ngram_size=2)

	# Decode and print the generated text
	generated_text = tokenizer.decode(output[0], skip_special_tokens=True)
	print(generated_text)
	```
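
Because generation above is capped by `max_length` rather than by the end of the function, the decoded text may continue past the closing brace of `factorial`. The following is a minimal sketch of one way to trim it, assuming that cutting at the first balanced closing brace is acceptable. The helper name `truncate_to_first_function` and the sample string are hypothetical and not part of the model's API, and the brace counting ignores braces inside string literals and comments.

```python
def truncate_to_first_function(generated_text: str) -> str:
    """Return the text up to the brace that closes the first opened block."""
    depth = 0
    for index, char in enumerate(generated_text):
        if char == "{":
            depth += 1
        elif char == "}":
            depth -= 1
            if depth == 0:
                # The first top-level block has just been closed.
                return generated_text[: index + 1]
    # No balanced closing brace was found: return the text unchanged.
    return generated_text


# Hypothetical model output, used only to illustrate the helper.
sample = (
    "fun factorial(n: Int): Int {\n"
    "    return if (n <= 1) 1 else n * factorial(n - 1)\n"
    "}\n"
    "fun unrelated() {"
)
print(truncate_to_first_function(sample))
```

In the snippet above, the helper would be applied to `generated_text` before printing it.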

	# Training setup

	The model was trained on one A100 GPU with the following hyperparameters:

	| Hyperparameter | Value |
	|:---------------------------:|:----------------------------------------:|
	| `warmup` | 10% |
	| `max_lr` | 1e-4 |
	| `scheduler` | linear |
	| `total_batch_size` | 256 (~130K tokens per step) |
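
To make the schedule in the table concrete, here is a rough sketch of a linear schedule with 10% warmup and a peak learning rate of 1e-4. It assumes the rate rises linearly from 0 during warmup and then decays linearly to 0, which is the shape produced by `get_linear_schedule_with_warmup` in Transformers; whether the authors used that function is not stated. The total step count of 1000 is a hypothetical value chosen only for illustration.

```python
def linear_warmup_decay_lr(
    step: int,
    total_steps: int,
    max_lr: float = 1e-4,           # `max_lr` from the table
    warmup_fraction: float = 0.10,  # `warmup` from the table
) -> float:
    """Learning rate at `step` for linear warmup followed by linear decay to 0."""
    warmup_steps = max(1, round(total_steps * warmup_fraction))
    if step < warmup_steps:
        # Warmup phase: scale linearly from 0 up to max_lr.
        return max_lr * step / warmup_steps
    # Decay phase: scale linearly from max_lr down to 0 at total_steps.
    decay_steps = max(1, total_steps - warmup_steps)
    return max_lr * max(0.0, (total_steps - step) / decay_steps)


TOTAL_STEPS = 1000  # hypothetical; the actual number of steps is not given
for step in (0, 50, 100, 550, 1000):
    print(step, f"{linear_warmup_decay_lr(step, TOTAL_STEPS):.2e}")

# Rough sequence length implied by the batch size row of the table:
# about 130K tokens per step spread over 256 sequences.
approx_tokens_per_sequence = 130_000 / 256
print(f"~{approx_tokens_per_sequence:.0f} tokens per sequence")
```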


	# Fine-tuning data

	For this model, we used 15K examples from the Kotlin Exercises dataset {TODO: link!}. For more information about the dataset, follow the link.

	# Evaluation

	For evaluation, we used Kotlin HumanEval (more information here).

	Results for the base and fine-tuned models:

	| Model name | Kotlin HumanEval Pass Rate | Kotlin Completion |
	|:---------------------------:|:----------------------------------------:|:----------------------------------------:|
	| `base model` | 26.89 | 0.388 |
	| `fine-tuned model` | 42.24 | 0.344 |
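
As a reading aid for the table, the sketch below shows how a pass rate of this kind is commonly computed and how the two rows compare. It assumes the pass rate is the percentage of benchmark problems whose generated solution passes all tests; the card does not spell out the exact definition. The `hypothetical_results` list is made up for illustration, while the four metric values are copied from the table.

```python
from typing import Sequence


def pass_rate(passed_flags: Sequence[bool]) -> float:
    """Percentage of problems whose generated solution passed its tests."""
    if not passed_flags:
        raise ValueError("passed_flags must not be empty")
    return 100.0 * sum(passed_flags) / len(passed_flags)


# Hypothetical per-problem outcomes, not real benchmark data.
hypothetical_results = [True, False, True, True]
print(pass_rate(hypothetical_results))  # 75.0

# Values copied from the evaluation table.
base_pass_rate, tuned_pass_rate = 26.89, 42.24
base_completion, tuned_completion = 0.388, 0.344

absolute_gain = tuned_pass_rate - base_pass_rate
relative_gain = absolute_gain / base_pass_rate
completion_change = tuned_completion - base_completion

print(f"Pass rate change: {absolute_gain:+.2f} points ({relative_gain:+.1%} relative)")
print(f"Kotlin Completion change: {completion_change:+.3f}")
```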