llama3-dnapretrain-kaniwa

This is a LoRA adapter for continued pretraining on DNA sequence data.

The base model is the long-context Llama-3-8B-Instruct developed by Gradient and Crusoe: gradientai/Llama-3-8B-Instruct-262k
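A minimal loading sketch with transformers + peft, assuming the model IDs above; the dtype and device settings are illustrative choices, not taken from this card.

```python
# Sketch: attach the LoRA adapter to the long-context base model.
# Assumes transformers, peft, and accelerate are installed.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "gradientai/Llama-3-8B-Instruct-262k"
adapter_id = "monsoon-nlp/llama3-dnapretrain-kaniwa"

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(base, adapter_id)  # load the LoRA weights on top
```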

The dataset was drawn from part of BYU's 2019 kaniwa (Chenopodium pallidicaule) genome, available at https://genomevolution.org/coge/GenomeInfo.pl?gid=53872

The adapter was fine-tuned for 3 hours on an A100. The data was split into ~20k nucleotide snippets with an Alpaca-like message format, sketched below.
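A rough sketch of that preparation step, assuming snippets of roughly 20,000 nucleotides and a FASTA input file; the file name, snippet length, and exact field wording are assumptions for illustration and are not taken from the training notebook.

```python
# Sketch: chunk the genome FASTA into snippets and wrap each one in the
# Alpaca-like message format used for fine-tuning. Requires Biopython.
from Bio import SeqIO

SNIPPET_LEN = 20_000  # assumed snippet length ("~20k nucleotide snippets")

records = []
for contig in SeqIO.parse("kaniwa_genome.fasta", "fasta"):  # hypothetical path
    seq = str(contig.seq)
    for start in range(0, len(seq), SNIPPET_LEN):
        snippet = seq[start:start + SNIPPET_LEN]
        records.append(
            "Write information about the nucleotide sequence.\n\n"
            f"### Sequence:\n{snippet}\n\n"
            "### Annotation:\n"
            f"Information about location in the kaniwa chromosome: >{contig.id}\n"
        )
```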

Training Notebook: https://colab.research.google.com/drive/1XZcCYGFQGtz3_AKSR4F67WYXl6DIwP4R

Sample message:

Write information about the nucleotide sequence.

### Sequence:
GCCTATAGTGTGTAGCTAATGAGCCTAGGTTATCGACCCTAATCT...

### Annotation:
Information about location in the kaniwa chromosome: >lcl|Cp5
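A short prompting sketch that mirrors the sample message, assuming the `model` and `tokenizer` objects from the loading example above; the sequence and generation settings are placeholders.

```python
# Sketch: query the adapter with the Alpaca-like prompt format shown above.
prompt = (
    "Write information about the nucleotide sequence.\n\n"
    "### Sequence:\n"
    "GCCTATAGTGTGTAGCTAATGAGCCTAGGTTATCGACCCTAATCT\n\n"
    "### Annotation:\n"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=False)

# Decode only the newly generated annotation text.
generated = outputs[0][inputs["input_ids"].shape[1]:]
print(tokenizer.decode(generated, skip_special_tokens=True))
```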

This Llama model was trained 2x faster with Unsloth and Hugging Face's TRL library.

Genome Citation

Mangelson H, et al. The genome of Chenopodium pallidicaule: an emerging Andean super grain. Appl. Plant Sci. 2019;7:e11300. doi:10.1002/aps3.11300
