
Model Information

Model Details

Model Description

Llama3-ViettelSolutions-8B is a variant of Meta Llama-3-8B that underwent continued pre-training on a curated Vietnamese dataset, followed by supervised fine-tuning on 5 million samples of Vietnamese instruction data.

  • Developed by: Viettel Solutions
  • Funded by: NVIDIA
  • Model type: Autoregressive transformer model
  • Language(s) (NLP): Vietnamese, English
  • License: Llama 3 Community License
  • Finetuned from model: meta-llama/Meta-Llama-3-8B

Uses

Example snippet for usage with Transformers:

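import torch
import transformers

model_id = "VTSNLP/Llama3-ViettelSolutions-8B"

# Load the model in bfloat16 and let Accelerate place it on the available devices.
pipeline = transformers.pipeline(
    "text-generation",
    model=model_id,
    model_kwargs={"torch_dtype": torch.bfloat16},
    device_map="auto",
)

# Generate a continuation for a Vietnamese greeting ("Hello!") and print it.
outputs = pipeline("Xin chào!")
print(outputs[0]["generated_text"])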

Training Details

Training Data

Training Procedure

Preprocessing

[More Information Needed]

Training Hyperparameters

  • Training regime: bf16 mixed precision
  • Data sequence length: 8192
  • Tensor model parallel size: 4
  • Pipeline model parallel size: 1
  • Context parallel size: 1
  • Micro batch size: 1
  • Global batch size: 512
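
As a rough sanity check on the hyperparameters above, the following sketch derives the data-parallel size, the number of gradient-accumulation steps, and the tokens consumed per optimizer step. It assumes the usual Megatron-style relation (world size = tensor × pipeline × context × data parallel) and a world size of 4 GPUs, taken from the hardware listed under Technical Specifications. The card does not confirm that all training ran on exactly 4 GPUs, and the function name `derive_batch_layout` is hypothetical.

```python
def derive_batch_layout(world_size, tp, pp, cp, micro_batch, global_batch, seq_len):
    """Derive data-parallel size, accumulation steps, and tokens per step.

    Assumes world_size = tp * pp * cp * dp (Megatron-style layout).
    """
    model_parallel = tp * pp * cp
    if world_size % model_parallel != 0:
        raise ValueError("world_size must be divisible by tp * pp * cp")
    dp = world_size // model_parallel

    samples_per_micro_step = micro_batch * dp
    if global_batch % samples_per_micro_step != 0:
        raise ValueError("global_batch must be divisible by micro_batch * dp")
    grad_accum_steps = global_batch // samples_per_micro_step

    return {
        "data_parallel_size": dp,
        "grad_accum_steps": grad_accum_steps,
        "tokens_per_step": global_batch * seq_len,
    }


# world_size=4 is an assumption (4 x A100); the other values come from the list above.
layout = derive_batch_layout(
    world_size=4, tp=4, pp=1, cp=1,
    micro_batch=1, global_batch=512, seq_len=8192,
)
print(layout)
```

Under these assumptions the four GPUs hold a single model replica, so the global batch of 512 would be reached purely through gradient accumulation, and each optimizer step would see at most about 4.2 million tokens if every sequence is fully packed.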

Evaluation

Testing Data, Factors & Metrics

Testing Data

[More Information Needed]

Factors

[More Information Needed]

Metrics

[More Information Needed]

Results

[More Information Needed]

Summary

[More Information Needed]

Technical Specifications

  • Compute Infrastructure: NVIDIA DGX

  • Hardware: 4 x A100 80GB

  • Software: NeMo Framework
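
To put the hardware in context, here is a back-of-the-envelope estimate of the memory taken by the model weights alone. It assumes roughly 8.03 billion parameters stored in bf16 (2 bytes each) and an even split across 4 tensor-parallel ranks. Real training memory is considerably higher because of gradients, optimizer states, fp32 master weights, and activations, none of which are modelled here. The helper `weight_gib` is hypothetical.

```python
GIB = 2 ** 30  # bytes per GiB


def weight_gib(num_params, bytes_per_param, tensor_parallel=1):
    """Approximate weight memory per rank in GiB, assuming an even shard."""
    return num_params * bytes_per_param / tensor_parallel / GIB


NUM_PARAMS = 8.03e9  # assumption: approximate parameter count of the 8B model
BYTES_BF16 = 2       # bf16 uses 2 bytes per parameter

total = weight_gib(NUM_PARAMS, BYTES_BF16)
per_gpu = weight_gib(NUM_PARAMS, BYTES_BF16, tensor_parallel=4)
print(f"weights: {total:.1f} GiB total, {per_gpu:.1f} GiB per GPU with TP=4")
```

The even split is an approximation: some parameters, such as normalization weights, are typically replicated rather than sharded across tensor-parallel ranks.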

Citation

BibTeX:

[More Information Needed]

APA:

[More Information Needed]

More Information

[More Information Needed]

Model Card Authors

[More Information Needed]

Model Card Contact

[More Information Needed]
