---
license: apache-2.0
base_model: BEE-spoke-data/smol_llama-220M-GQA
datasets:
  - BEE-spoke-data/pypi_clean-deduped
  - bigcode/the-stack-smol-xl
  - EleutherAI/proof-pile-2
language:
  - en
tags:
  - python
  - codegen
  - markdown
  - smol_llama
metrics:
  - accuracy
inference:
  parameters:
    max_new_tokens: 64
    min_new_tokens: 8
    do_sample: true
    epsilon_cutoff: 0.0008
    temperature: 0.3
    top_p: 0.9
    repetition_penalty: 1.02
    no_repeat_ngram_size: 8
    renormalize_logits: true
widget:
  - text: |
      def add_numbers(a, b):
          return
    example_title: Add Numbers Function
  - text: |
      class Car:
          def __init__(self, make, model):
              self.make = make
              self.model = model

          def display_car(self):
    example_title: Car Class
  - text: |
      import pandas as pd
      data = {'Name': ['Tom', 'Nick', 'John'], 'Age': [20, 21, 19]}
      df = pd.DataFrame(data).convert_dtypes()
      # eda
    example_title: Pandas DataFrame
  - text: |
      def factorial(n):
          if n == 0:
              return 1
          else:
    example_title: Factorial Function
  - text: |
      def fibonacci(n):
          if n <= 0:
              raise ValueError("Incorrect input")
          elif n == 1:
              return 0
          elif n == 2:
              return 1
          else:
    example_title: Fibonacci Function
  - text: |
      import matplotlib.pyplot as plt
      import numpy as np
      x = np.linspace(0, 10, 100)
      # simple plot
    example_title: Matplotlib Plot
  - text: |
      def reverse_string(s: str) -> str:
          return
    example_title: Reverse String Function
  - text: |
      def is_palindrome(word: str) -> bool:
          return
    example_title: Palindrome Function
  - text: |
      def bubble_sort(lst: list):
          n = len(lst)
          for i in range(n):
              for j in range(0, n-i-1):
    example_title: Bubble Sort Function
  - text: |
      def binary_search(arr, low, high, x):
          if high >= low:
              mid = (high + low) // 2
              if arr[mid] == x:
                  return mid
              elif arr[mid] > x:
    example_title: Binary Search Function
pipeline_tag: text-generation
---

# BEE-spoke-data/beecoder-220M-python

This is `BEE-spoke-data/smol_llama-220M-GQA` fine-tuned for code generation on:

- a filtered version of `stack-smol-XL`
- a deduplicated version of the 'algebraic stack' from proof-pile-2
- cleaned and deduplicated PyPI (last dataset)

Both this model and the base model were trained with a context length of 2048.

## examples

> An example script for inference testing is available [here](https://gist.github.com/pszemraj/c7738f664a64b935a558974d23a7aa8c).

At 220M parameters the model has clear limitations, but it seems decent for single-line or docstring completion, and/or as a draft model for speculative decoding in such workflows.

![image/png](https://cdn-uploads.huggingface.co/production/uploads/60bccec062080d33f875cd0c/bLrtpr7Vi_MPvtF7mozDN.png)

The screenshot above was taken on a laptop CPU.

---
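For reference, the widget defaults in this card's metadata can be reproduced locally with the `transformers` text-generation pipeline. This is a minimal sketch, not an official usage script; the prompt is one of the widget examples, and the generation settings are copied from the `inference.parameters` block above:

```python
# Generation settings copied from this card's `inference.parameters` metadata.
GENERATION_KWARGS = {
    "max_new_tokens": 64,
    "min_new_tokens": 8,
    "do_sample": True,
    "epsilon_cutoff": 0.0008,
    "temperature": 0.3,
    "top_p": 0.9,
    "repetition_penalty": 1.02,
    "no_repeat_ngram_size": 8,
    "renormalize_logits": True,
}

if __name__ == "__main__":
    # Requires `transformers` and `torch` to be installed.
    from transformers import pipeline

    pipe = pipeline(
        "text-generation",
        model="BEE-spoke-data/beecoder-220M-python",
    )
    prompt = "def reverse_string(s: str) -> str:\n    return"
    print(pipe(prompt, **GENERATION_KWARGS)[0]["generated_text"])
```

For quick experiments it is usually enough to vary `temperature` and `max_new_tokens`; the remaining settings match the hosted widget's sampling behavior.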