Model Card for StarCoderCodeQ&A

This is a version of the StarCoder model fine-tuned on grammatically corrected texts.

Model Details

Model Description

  • Model type: GPT-2
  • Number of Parameters: 15.5B
  • Supported Programming Language: Python
  • Finetuned from model: StarCoder
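
The parameter count above gives a rough lower bound on the memory needed just to hold the weights. The sketch below is back-of-the-envelope arithmetic, assuming the usual bytes-per-parameter for common dtypes. It ignores activations, the key-value cache, and framework overhead, and the function name `weight_memory_gb` is hypothetical.

```python
# Bytes used to store one parameter in each dtype.
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "bfloat16": 2, "int8": 1}

NUM_PARAMS = 15.5e9  # parameter count stated in this model card


def weight_memory_gb(num_params: float, dtype: str) -> float:
    """Approximate weight storage in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


if __name__ == "__main__":
    for dtype in BYTES_PER_PARAM:
        print(f"{dtype}: ~{weight_memory_gb(NUM_PARAMS, dtype):.1f} GB")
```

At 32-bit precision the weights alone come to roughly 62 GB, so a single consumer GPU is unlikely to be enough without a lower-precision dtype or sharding across devices.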

Model Sources

  • Repository: GitHub Repo
  • Paper: "Leveraging Large Language Models in Code Question Answering: Baselines and Issues" by Georgy Andryushchenko, Vladimir V. Ivanov, Vladimir Makharev, Elizaveta Tukhtina, and Aidar Valeev

How to Get Started with the Model

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the tokenizer and the model from the same checkpoint
checkpoint = 'datapaf/StarCoderCodeQnA'
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint, device_map="cuda")

code = ... # Your Python code snippet here
question = ... # Your question regarding the snippet here

# Build the prompt from the question and the code snippet
prompt_template = "Question: {question}\n\nCode: {code}\n\nAnswer:"
prompt = prompt_template.format(question=question, code=code)

# Generate the answer; the decoded text includes the prompt followed by the answer
inputs = tokenizer.encode(prompt, return_tensors="pt").to('cuda')
outputs = model.generate(inputs, max_new_tokens=512, pad_token_id=tokenizer.eos_token_id)
text = tokenizer.decode(outputs[0])
print(text)
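
The decoded output of `generate` contains the prompt followed by the model's continuation, so the answer usually has to be separated from the echoed prompt. The following is a minimal sketch under the prompt template used in this card. The helper names `build_prompt` and `extract_answer` are hypothetical and not part of the model's repository, and the `<|endoftext|>` end-of-sequence marker is an assumption based on the StarCoder tokenizer.

```python
PROMPT_TEMPLATE = "Question: {question}\n\nCode: {code}\n\nAnswer:"
EOS_MARKER = "<|endoftext|>"  # assumed end-of-sequence token text


def build_prompt(question: str, code: str) -> str:
    """Fill the prompt template with a question and a code snippet."""
    return PROMPT_TEMPLATE.format(question=question, code=code)


def extract_answer(generated_text: str, prompt: str, eos_marker: str = EOS_MARKER) -> str:
    """Strip the echoed prompt and any end-of-sequence marker from decoded text."""
    if generated_text.startswith(prompt):
        # Common case: the decoded text begins with the exact prompt.
        answer = generated_text[len(prompt):]
    else:
        # Fallback: keep everything after the first "Answer:" marker,
        # or the whole text if the marker is missing.
        _, sep, tail = generated_text.partition("Answer:")
        answer = tail if sep else generated_text
    # Cut at the first end-of-sequence marker, if one is present.
    return answer.split(eos_marker, 1)[0].strip()


if __name__ == "__main__":
    prompt = build_prompt("What does this function return?", "def f():\n    return 1")
    # Stand-in for tokenizer.decode(outputs[0]); no model is loaded here.
    decoded = prompt + " It returns 1." + EOS_MARKER
    print(extract_answer(decoded, prompt))
```

An alternative that avoids string matching is to decode only the newly generated tokens, for example `tokenizer.decode(outputs[0][inputs.shape[1]:], skip_special_tokens=True)`, which drops the prompt tokens before decoding.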