Model Card for syedzaidi-kiwi/Llama-2-7b-chat-finetune

This model is a fine-tuned version of Meta's Llama 2 7B variant for enhanced chat functionalities.

This modelcard aims to be a base template for new models. It has been generated using this raw template.

Model Details

Model Description

Developed by: Syed Asad
Model type: Fine-tuned Llama 2 7B variant
Language(s) (NLP): English
License: Apache-2.0
Finetuned from model: NousResearch/Llama-2-7b-chat-hf

Model Sources

Repository: syedzaidi-kiwi/Llama-2-7b-chat-finetune
Paper: [https://ai.meta.com/research/publications/llama-2-open-foundation-and-fine-tuned-chat-models/]

Uses

Direct Use

The model is intended for direct use in applications requiring conversational responses, such as chatbots or virtual assistants.

Out-of-Scope Use

The model is not designed for tasks outside of conversational AI, such as document summarization or translation.

Bias, Risks, and Limitations

Users should be aware of potential biases in the training data and limitations in the model's understanding of nuanced human language. Further evaluation is recommended for specific use cases.

How to Get Started with the Model

from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("syedzaidi-kiwi/Llama-2-7b-chat-finetune")
model = AutoModelForCausalLM.from_pretrained("syedzaidi-kiwi/Llama-2-7b-chat-finetune")

inputs = tokenizer("Hello, how are you?", return_tensors="pt")
response = model.generate(**inputs)
print(tokenizer.decode(response[0], skip_special_tokens=True))

Training Details

Training Data

The model was fine-tuned using the dataset mlabonne/guanaco-llama2-1k.

Link: https://huggingface.co/datasets/mlabonne/guanaco-llama2-1k

Training Procedure

Training Hyperparameters

Training regime:

The model was fine-tuned using a mix of precision training techniques to balance training speed and model performance effectively.

While the exact precision format (e.g., fp32, fp16, bf16) utilized depends on the compute capabilities available, an emphasis was placed on leveraging mixed precision (fp16) training to accelerate the training process on compatible hardware. This approach allowed for faster computation and reduced memory usage without significant loss in training quality.

Users are encouraged to adjust the precision settings based on their hardware specifications to optimize performance further.

Speeds, Sizes, Times

To be tested by the KiwiTech Team

Evaluation

Testing Data, Factors & Metrics

Testing Data

The model's performance was evaluated on a held-out test set from the mlabonne/guanaco-llama2-1k dataset.

This dataset comprises diverse conversational contexts to assess the model's generalization and robustness across various topics. [https://huggingface.co/datasets/mlabonne/guanaco-llama2-1k]

Factors

Evaluation focused on several key factors to ensure the model's versatility and reliability in conversational AI applications:

Context understanding: The model's ability to maintain context and coherence over long conversations. Diversity of responses: The variety in the model's responses to similar prompts, indicating its creative and dynamic conversational capabilities. Safety and bias: Monitoring for any unintended biases in responses or generation of inappropriate content.

Metrics

To comprehensively assess the model's performance, the following metrics were utilized:

Perplexity (PPL): Lower perplexity scores indicate better understanding and generation of the text. BLEU Score: For measuring the similarity between the model's generated responses and a set of reference responses, indicating the model's accuracy in reproducing human-like answers. F1 Score: Evaluating the balance between precision and recall in the model's responses, useful for assessing conversational relevance. Safety and Bias Evaluation: Custom metrics were developed to quantify the model's performance in generating safe, unbiased content.

Results

To be Evaulated, will be updated in this section.

Summary

The fine-tuned model demonstrates significant improvements in generating coherent, diverse, and contextually appropriate responses across various conversational settings.

It represents a step forward in developing conversational AI systems that are both efficient and effective.

Continuous evaluation and monitoring are advised to further enhance and maintain the model's performance standards.

Technical Specifications

Model Architecture and Objective

Transformers

Compute Infrastructure

T4 GPU

Hardware

Fine Tuned on Apple M3 Pro (Silicon Chip)

Software

Google Colab Notebook Used

Citation

OriginalLlama2Citation Title: Llama 2: Open Foundation and Fine-Tuned Chat Models}, Authors: Hugo Touvron∗ Louis Martin† Kevin Stone† Peter Albert Amjad Almahairi Yasmine Babaei Nikolay Bashlykov Soumya Batra Prajjwal Bhargava Shruti Bhosale Dan Bikel Lukas Blecher Cristian Canton Ferrer Moya Chen Guillem Cucurull David Esiobu Jude Fernandes Jeremy Fu Wenyin Fu Brian Fuller Cynthia Gao Vedanuj Goswami Naman Goyal Anthony Hartshorn Saghar Hosseini Rui Hou Hakan Inan Marcin Kardas Viktor Kerkez Madian Khabsa Isabel Kloumann Artem Korenev Punit Singh Koura Marie-Anne Lachaux Thibaut Lavril Jenya Lee Diana Liskovich Yinghai Lu Yuning Mao Xavier Martinet Todor Mihaylov Pushkar Mishra Igor Molybog Yixin Nie Andrew Poulton Jeremy Reizenstein Rashi Rungta Kalyan Saladi Alan Schelten Ruan Silva Eric Michael Smith Ranjan Subramanian Xiaoqing Ellen Tan Binh Tang Ross Taylor Adina Williams Jian Xiang Kuan Puxin Xu Zheng Yan Iliyan Zarov Yuchen Zhang Angela Fan Melanie Kambadur Sharan Narang Aurelien Rodriguez Robert Stojnic Sergey Edunov Thomas Scialom

Journal: Gen AI, Meta Year: 2023

Link to Research Paper: https://ai.meta.com/research/publications/llama-2-open-foundation-and-fine-tuned-chat-models/

Model Card Authors

Syed Asad

Model Card Contact

Syed Asad (syed.asad@kiwitech.com)

syedzaidi-kiwi
/

Llama-2-7b-chat-finetune