---
language:
- th
- en
license: mit
base_model: aisingapore/sea-lion-7b-instruct
datasets:
- AIAT/Optimizer-datasetfinal
pipeline_tag: text-generation
---
# Sea-lion2pandas

Fine-tuned from [sea-lion-7b-instruct](https://huggingface.co/aisingapore/sea-lion-7b-instruct) on question–pandas expression pairs: given a natural-language question about a dataframe, the model generates the corresponding pandas expression.
## How to use
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import pandas as pd

# Load the fine-tuned model and its tokenizer.
tokenizer = AutoTokenizer.from_pretrained("AIAT/Optimizer-sealion2pandas", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("AIAT/Optimizer-sealion2pandas", trust_remote_code=True)

df = pd.read_csv("your_file.csv")  # load your own CSV here

prompt_template = "### USER:\n{human_prompt}\n\n### RESPONSE:\n"

prompt = """\
You are working with a pandas dataframe in Python.
The name of the dataframe is `df`.
This is the result of `print(df.head())`:
{df_str}

Follow these instructions:
1. Convert the query to executable Python code using Pandas.
2. The final line of code should be a Python expression that can be called with the `eval()` function.
3. The code should represent a solution to the query.
4. PRINT ONLY THE EXPRESSION.
5. Do not quote the expression.
Query: {query_str}"""

def create_prompt(query_str, df):
    # Fill in the dataframe preview and the query, then wrap the result
    # in the USER/RESPONSE chat template the model was trained with.
    text = prompt.format(df_str=str(df.head()), query_str=query_str)
    text = prompt_template.format(human_prompt=text)
    return text

full_prompt = create_prompt("Find test ?", df)  # replace with your own question about df

tokens = tokenizer(full_prompt, return_tensors="pt")
output = model.generate(tokens["input_ids"], max_new_tokens=20, eos_token_id=tokenizer.eos_token_id)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```
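Since the prompt instructs the model to emit a single `eval()`-able expression, you will usually want to extract that expression from the decoded output and run it against your dataframe. A minimal sketch (the helper name `run_generated_expression` is our own, and it assumes the decoded text echoes the prompt, so everything after the final `### RESPONSE:` marker is the generated expression):

```python
import pandas as pd

def run_generated_expression(decoded, df):
    # The decoded output echoes the prompt, so take everything after the
    # last "### RESPONSE:" marker and treat it as the pandas expression.
    expression = decoded.split("### RESPONSE:")[-1].strip()
    # eval() executes arbitrary code: only run output you have inspected
    # or that comes from data you trust.
    return eval(expression, {"df": df, "pd": pd})

# Example with a mocked model output:
df = pd.DataFrame({"score": [3, 1, 2]})
decoded = "### USER:\n...\n\n### RESPONSE:\ndf['score'].max()"
print(run_generated_expression(decoded, df))  # -> 3
```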