zebra-Llama/zebra-Llama-v0.2

Zebra-Llama v0.2 is a specialized version of the Llama-3.1-8b-instruct model, fine-tuned with data specific to the rare disease Ehlers-Danlos Syndrome (EDS) - a rare connective tissue disorder. We utilized textual information from over 4,000 EDS papers from PubMed, more than 8,000 Reddit posts about EDS, and over 5,000 posts from the Inspire forum to gather real-world concerns/questions related to EDS, which were used to fine-tune the model. As a result, this model is adept at providing accurate responses to questions regarding EDS.

The model is trained using a specialized approach called "context-aware training," where we provided context for each question from a custom vector database during the training phase. This approach enabled the model to demonstrate high precision and recall during the inference phase when utilizing the RAG context. Additionally, the model showed a higher likelihood of generating correct citations compared to the base model.

What is new in this version of zebra-Llama?

  • Compared to the previous version (zebraLLAMA/zebra-Llama-v0.1), the latest Zebra-Llama model delivers more comprehensive and in-depth explanations for questions about the rare disease Ehlers-Danlos Syndrome.

  • The latest version has a greater ability to provide citations consistently compared to the previous version.

  • In addition to improved citation ability, it has also been benchmarked against the base model (meta-llama/Llama-3.1-8B-Instruct) and demonstrates superior text generation capabilities in terms of thoroughness, accuracy, and clarity, based on expert evaluation.

  • From a modeling perspective, the latest version utilizes "meta-llama/Llama-3.1-8B-Instruct" as its base model, while the earlier version (v0.1) was built on "meta-llama/Meta-Llama-3-8B-Instruct".

Model Details

Base model : meta-llama/Llama-3.1-8B-Instruct

Model Diagram

Model Sources

Repository: https://github.com/karthiksoman/zebra-Llama

Custom built RAG API for rare diseases (focused on EDS):

• Base URL: https://zebra-llama-rag.onrender.com

• Endpoint: /search

Jupyter Notebook Demo of Zebra-Llama:

https://github.com/karthiksoman/zebra-Llama/blob/main/code/notebook/zebra_llama_v0.2_demo.ipynb

Uses

Zebra-Llama can be used to generate answers related to EDS questions.

Out-of-Scope Use

This Language Model is intended for academic and research purposes only. It is not for clinical use or medical decision-making. Consult a healthcare professional for medical advice.

Training Details

Fine tuning method : LoRA

LoRA rank : 16

LoRA alpha : 16

LORA dropout : 0.01

LORA target modules : ["q_proj", "k_proj", "v_proj"]

Train epochs : 2

Learning rate : 1e-4

LR scheduler type : constant

Max grad norm : 1

BATCH_SIZE_PER_GPU_FOR_TRAINING : 2

GRADIENT_ACCUMULATION_STEPS : 1

Citation

@misc{soman2024zebrallamacontextawarelargelanguage,
      title={Zebra-Llama: A Context-Aware Large Language Model for Democratizing Rare Disease Knowledge}, 
      author={Karthik Soman and Andrew Langdon and Catalina Villouta and Chinmay Agrawal and Lashaw Salta and Braian Peetoom and Gianmarco Bellucci and Orion J Buske},
      year={2024},
      eprint={2411.02657},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2411.02657}, 
}

Contact

Dr. Karthik Soman - karthi.soman@gmail.com

Andrew Langdon - andrewlngdn@gmail.com

Chinmay Agrawal - chag7212@colorado.edu

Catalina Villouta - catalina.villouta.r@gmail.com

Dr. Orion Buske - orion@phenotips.com

Lashaw Salta - lashawsalta@gmail.com

Downloads last month
22
Safetensors
Model size
8.03B params
Tensor type
FP16
·
Inference Examples
This model does not have enough activity to be deployed to Inference API (serverless) yet. Increase its social visibility and check back later, or deploy to Inference Endpoints (dedicated) instead.