---
library_name: transformers
license: apache-2.0
datasets:
- TokenBender/code_instructions_122k_alpaca_style
- nickrosh/Evol-Instruct-Code-80k-v1
metrics:
- accuracy
language:
- en
widget:
- text: |
    ### Instruction: write a neural network complete code in Keras
    ### Input: Use cifar dataset
    ### Response:
  output:
    text: >-
      from keras.datasets import cifar10 import numpy as np from keras.models
      import Sequential from keras.layers import Dense, Dropout, Activation,
      Flatten from keras.layers import Conv2D, MaxPooling2D from keras.utils
      import to_categorical from keras.optimizers import Adam


      cifar10.load_data()


      X_train.astype('float32') / 255 X_test = X_test.astype('float32') / 255


      to_categorical(y_train, num_classes=10) y_test = to_categorical(y_test,
      num_classes=10)


      input_shape=(32, 32, 3))) model.add(Activation('relu'))
      model.add(MaxPooling2D(pool_size=(2, 2)))

      model.add(Conv2D(64, (3, 3))) model.add(Activation('relu'))
      model.add(MaxPooling2D(pool_size=(2, 2)))

      model.add(Flatten()) model.add(Dense(64)) model.add(Activation('relu'))
      model.add(Dropout(0.5)) model.add(Dense(10))
      model.add(Activation('softmax'))


      optimizer=Adam(), metrics=['accuracy'])


      validation_split=0.2)
pipeline_tag: text-generation
base_model: codellama/CodeLlama-13b-Instruct-hf
---

<p align="center" style="font-size:34px;"><b>Panda-Coder 🐼</b></p>

# Panda Coder-13B vLLM Inference: [Open in Colab](https://colab.research.google.com/drive/1yP-11PWqLrDn5ymKDWMfz9r6jLpTcTAH?usp=sharing)

Panda Coder is a state-of-the-art LLM that generates code from natural language instructions.

## Model description

🤖 Model Description: Panda-Coder is a state-of-the-art LLM, fine-tuned from `codellama/CodeLlama-13b-Instruct-hf`, that is specifically designed to generate code from natural language instructions. It is the result of meticulous fine-tuning, with the goal of making coding easier and more accessible for everyone.

## Inference

> Hardware requirements:
>
> 30GB VRAM - A100 Preferred
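
The card does not say how the 30GB figure was derived. As a rough sanity check, the sketch below estimates the memory taken by the weights alone, assuming about 13 billion parameters (inferred from the "13B" in the model name, not stated in the card). `estimate_weight_memory_gb` is a hypothetical helper, not part of any library.

```python
# Back-of-the-envelope estimate of weight memory only.
# Assumption: ~13e9 parameters, inferred from the "13B" in the model name.
# KV cache, activations, and framework overhead are NOT included.

def estimate_weight_memory_gb(num_params: float, bytes_per_param: float) -> float:
    """Return the approximate weight size in GB (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9

NUM_PARAMS = 13e9  # assumed parameter count

fp16_gb = estimate_weight_memory_gb(NUM_PARAMS, 2)    # 16-bit weights
nf4_gb = estimate_weight_memory_gb(NUM_PARAMS, 0.5)   # 4-bit weights, ignoring quantization metadata

print(f"16-bit weights: ~{fp16_gb:.1f} GB")
print(f"4-bit weights:  ~{nf4_gb:.1f} GB")
```

Under these assumptions the 16-bit weights alone come to roughly 26 GB, which is in line with the 30GB recommendation once runtime overhead is added. Loading the weights in 4-bit should need considerably less memory for the weights themselves.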

### vLLM - For Faster Inference

#### Installation

```
!pip install vllm
```

**Implementation**:

```python
from vllm import LLM, SamplingParams

# Load the model; max_model_len caps prompt plus generated tokens per request.
llm = LLM(model='aiplanet/panda-coder-13B', gpu_memory_utilization=0.95, max_model_len=4096)

# Each prompt follows the Instruction / Input / Response template.
prompts = ["""### Instruction: Write a Java code to add 15 numbers randomly generated.
### Input: [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15]
### Response:
""",
"### Instruction: write a neural network complete code in Keras ### Input: Use cifar dataset ### Response:"
]

sampling_params = SamplingParams(temperature=0.1, top_p=0.95, repetition_penalty=1.1, max_tokens=1000)

outputs = llm.generate(prompts, sampling_params)

# Print the completion for each prompt.
for output in outputs:
    prompt = output.prompt
    generated_text = output.outputs[0].text
    print(generated_text)
    print("\n\n")
```

### Transformers - Basic Implementation

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# 4-bit NF4 quantization with double quantization; compute runs in bfloat16.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16
)

model = "aiplanet/panda-coder-13B"

base_model = AutoModelForCausalLM.from_pretrained(model, quantization_config=bnb_config, device_map="cuda")

tokenizer = AutoTokenizer.from_pretrained(model, trust_remote_code=True)
tokenizer.pad_token = tokenizer.eos_token
tokenizer.padding_side = "right"

prompt = """### Instruction:
Below is an instruction that describes a task. Write a response that appropriately completes the request.

Write a Python quickstart script to get started with TensorFlow

### Input:

### Response:
"""

input_ids = tokenizer(prompt, return_tensors="pt", truncation=True).input_ids.cuda()
outputs = base_model.generate(input_ids=input_ids, max_new_tokens=512, do_sample=True, top_p=0.9, temperature=0.1, repetition_penalty=1.1)

# Decode, then drop the echoed prompt so only the completion is printed.
print(f"Output:\n{tokenizer.batch_decode(outputs.detach().cpu().numpy(), skip_special_tokens=True)[0][len(prompt):]}")
```

Output

```
Output:
import tensorflow as tf

# Create a constant tensor
hello_constant = tf.constant('Hello, World!')

# Print the value of the constant
print(hello_constant)
```

## Prompt Template for Panda Coder 13B

```
### Instruction:
{<add your instruction here>}

### Input:
{<can be empty>}

### Response:
```
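
To avoid hand-assembling this template for each request, a small helper can fill it in. The sketch below is one way to do it: `build_prompt` is a hypothetical name, and the blank-line layout follows the template above rather than any documented requirement of the model.

```python
def build_prompt(instruction: str, input_text: str = "") -> str:
    """Fill the Panda Coder prompt template; input_text may be empty."""
    return (
        "### Instruction:\n"
        f"{instruction.strip()}\n\n"
        "### Input:\n"
        f"{input_text.strip()}\n\n"
        "### Response:\n"
    )

# Example: an instruction with no additional input.
prompt = build_prompt("Write a Python function that reverses a string.")
print(prompt)
```

The returned string can be used as an element of `prompts` in the vLLM example or as `prompt` in the Transformers example.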

## Key Features:

NLP-Based Coding: With Panda-Coder, you can turn plain-text instructions into functional code. No need to grapple with syntax and semantics - it understands your language.

🎯 Precision and Efficiency: The model is tuned for accuracy, aiming for code that is not just functional but also efficient.

✨ Unleash Creativity: Whether you're a novice or an expert coder, Panda-Coder is here to support your coding journey, offering creative solutions to your programming challenges.

Evol Instruct Code: It is fine-tuned on the Evol Instruct Code 80k-v1 dataset, chosen to support high-quality code generation.

📢 What's Next?: We believe in continuous improvement and are excited to announce that in our next release, Panda-Coder will be enhanced with a custom dataset. This dataset will not only expand language support but also include hardware programming languages like MATLAB, Embedded C, and Verilog. 🧰💡

## Get in Touch

You can schedule a 1:1 meeting with our DevRel & Community Team to get started with AI Planet Open Source LLMs and GenAI Stack. Schedule the call here: [https://calendly.com/jaintarun](https://calendly.com/jaintarun)

Stay tuned for more updates and be a part of the coding evolution. Join us on this exciting journey as we make AI accessible to all at AI Planet!

### Framework versions

- Transformers 4.33.3
- PyTorch 2.0.1+cu118
- Datasets 2.14.5
- Tokenizers 0.13.3

### Citation

```
@misc {lucifertrj,
    author       = { {Tarun Jain} },
    title        = { Panda Coder-13B by AI Planet},
    year         = 2023,
    url          = { https://huggingface.co/aiplanet/panda-coder-13B },
    publisher    = { Hugging Face }
}
```