WIP

This model is a fine-tuned version of BSC-LT/salamandra-7b, a European LLM trained in Spain. Both supervised fine-tuning (SFT) and direct preference optimization (DPO) used Spectrum to identify the relevant layers for training.

Model Details

Model Description

  • Developed by: Matthias Uhlig
  • Model type: Decoder-Only Transformer with Llama-Architecture
  • Language(s) (NLP): German, English (used in fine-tuning)
  • License: Apache 2.0
  • Finetuned from model: BSC-LT/salamandra-7b

How to Use

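import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the model in bfloat16 and move it to the GPU so it sits on the same device as the inputs.
model = AutoModelForCausalLM.from_pretrained("DRXD1000/Atlas-7B", torch_dtype=torch.bfloat16).to("cuda")
tokenizer = AutoTokenizer.from_pretrained("DRXD1000/Atlas-7B")

messages = [
    {
        "role": "system",
        "content": "You are a helpful assistant",
    },
    {"role": "user", "content": "Explain AI"},
]
# Render the conversation with the model's chat template and append the assistant prompt.
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to("cuda")
# max_new_tokens=256 is an arbitrary cap chosen for this example; adjust as needed.
outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=False))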

Tool Usage

The model has been trained to produce tool calls when prompted with the following template:

conversations = [
    {
      "role": "system",
      "content": "You are a function calling AI model. You are provided with function signatures within <tools> </tools> XML tags. You may call one or more functions to assist with the user query. Don't make assumptions about what values to plug into functions.\n<tools>\n[{'type': 'function', 'function': {'name': 'search_flights', 'description': 'Searches for flights based on departure and destination cities, dates, class, and other preferences.', 'parameters': {'type': 'object', 'properties': {'departure_city': {'type': 'string', 'description': 'The city from which the flight will depart.'}, 'destination_city': {'type': 'string', 'description': 'The destination city for the flight.'}, 'departure_date': {'type': 'string', 'description': 'The departure date for the flight.', 'format': 'date'}, 'return_date': {'type': 'string', 'description': 'The return date for the flight.', 'format': 'date'}, 'class': {'type': 'string', 'description': 'The class of the flight ticket.', 'enum': ['economy', 'business', 'first']}, 'flexible_cancellation': {'type': 'boolean', 'description': 'Indicates if the search should filter for flights with flexible cancellation policies.'}}, 'required': ['departure_city', 'destination_city', 'departure_date', 'return_date', 'class']}}}]\n</tools>\nFor each function call return a json object with function name and arguments within <tool_call> </tool_call> tags with the following schema:\n<tool_call>\n{'arguments': <args-dict>, 'name': <function-name>}\n</tool_call>\n"
    },
    {
      "role": "user",
      "content": "I'm planning a kayaking trip and looking to book flights from Los Angeles to Auckland. My departure is scheduled for July 10th, 2023, and I intend to return on July 24th, 2023. I would prefer to travel in economy class and would also like the option to have flexible cancellation policies for the tickets due to the uncertain nature of outdoor activities. Could you please search for flights that meet these criteria and provide me with the available options?"
    }]
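
Once the model has answered a conversation like the one above, the reply still has to be turned into something a program can act on. The following is a minimal sketch of one way to do that, assuming the model follows the schema in the system prompt and wraps each call in <tool_call> </tool_call> tags. The helper name extract_tool_calls and the sample output string are hypothetical and written by hand for illustration; they are not taken from an actual model run. Because the schema in the prompt uses single-quoted, Python-style dictionaries, the sketch tries JSON first and falls back to ast.literal_eval.

```python
import ast
import json
import re

# Matches the payload between <tool_call> and </tool_call>, across newlines.
TOOL_CALL_PATTERN = re.compile(r"<tool_call>\s*(.*?)\s*</tool_call>", re.DOTALL)


def extract_tool_calls(text):
    """Return every well-formed tool call found in text as a dict.

    Hypothetical helper: payloads that cannot be parsed, or that lack the
    'name' or 'arguments' keys, are skipped rather than raising.
    """
    calls = []
    for payload in TOOL_CALL_PATTERN.findall(text):
        try:
            # Strict JSON (double quotes, true/false).
            call = json.loads(payload)
        except json.JSONDecodeError:
            try:
                # Python-literal style (single quotes, True/False), as in the prompt schema.
                call = ast.literal_eval(payload)
            except (ValueError, SyntaxError):
                continue
        if isinstance(call, dict) and "name" in call and "arguments" in call:
            calls.append(call)
    return calls


# Hypothetical model output, written by hand for illustration only.
sample_output = (
    "<tool_call>\n"
    "{'arguments': {'departure_city': 'Los Angeles', 'destination_city': 'Auckland', "
    "'departure_date': '2023-07-10', 'return_date': '2023-07-24', "
    "'class': 'economy', 'flexible_cancellation': True}, 'name': 'search_flights'}\n"
    "</tool_call>"
)

for call in extract_tool_calls(sample_output):
    print(call["name"], call["arguments"]["destination_city"])
```

ast.literal_eval only evaluates Python literals, so it is a safer fallback than eval for untrusted model output. The arguments should still be validated against the tool's parameter schema before any function is actually called.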

Evaluation

Not yet evaluated (currently queued for the Open LLM Leaderboard).

Disclaimer: Toxic Content

This Large Language Model (LLM) may generate content that is inappropriate, offensive, or harmful. While the dataset has been filtered to minimize such outputs, the model may still produce text that is biased or toxic due to the large scale and diverse nature of the data.

Out-of-Scope Use

The model is not intended for use in math and coding tasks.

Bias, Risks, and Limitations

Atlas-7B has been trained with alignment data, but it is not free from biases and hallucinations and can still produce inappropriate or harmful content.

Training Details

Infrastructure

Both SFT and DPO were run on a 4xA100 80GB instance.
