SmolLM2 -score0_mix_rephrased_from_beginning-600B (Version: main)

Model Details

  • Architecture: SmolLM2
  • Parameters: 1.7B

Training Configuration

optimizer:
  class_path: torch.optim.AdamW
  init_args:
    lr: 0.0005
    weight_decay: 0.01
precision: bf16-mixed
seed: 42
train:
  global_batch_size: 1024
  max_seq_length: 2048
  max_tokens: 600000000000
  micro_batch_size: 8
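As a sanity check on these numbers, the gradient-accumulation factor and total optimizer steps implied by the config can be derived from the batch-size and token-budget fields (a minimal sketch; the single-device assumption is mine, since the config does not state the data-parallel world size):

```python
# Values taken from the training config above.
global_batch_size = 1024      # sequences per optimizer step
micro_batch_size = 8          # sequences per forward/backward pass
max_seq_length = 2048         # tokens per sequence
max_tokens = 600_000_000_000  # total training budget (600B tokens)

# Gradient-accumulation steps per optimizer step (assuming a single
# device; with N data-parallel ranks this would be divided by N).
accumulation_steps = global_batch_size // micro_batch_size

# Tokens consumed per optimizer step, and the implied total step count.
tokens_per_step = global_batch_size * max_seq_length
total_steps = max_tokens // tokens_per_step

print(accumulation_steps)  # 128
print(tokens_per_step)     # 2097152
print(total_steps)         # 286102
```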

Model Loading and Revision System

This repository hosts multiple revisions of the model. To load a specific revision, pass the revision argument to from_pretrained. For example:

from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("locuslab/-score0_mix_rephrased_from_beginning-600B", revision="final")
tokenizer = AutoTokenizer.from_pretrained("locuslab/-score0_mix_rephrased_from_beginning-600B", revision="final")

Replace "final" with the desired revision.

Checkpoint Details

  • Format: Safetensors
  • Model size: 1.81B params
  • Tensor type: F32