This model is randomly initialized using the config from meta-llama/Meta-Llama-3-8B-Instruct, scaled down to a much smaller size. Note that the weights are stored in bfloat16.

"yujiepan/llama-3-tiny-random" and "yujiepan/meta-llama-3-tiny-random" contain exactly the same files; only the repo name differs.
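
For readers who only want to consume the checkpoint, loading it might look like the sketch below. The helper name `smoke_test` and the prompt are illustrative assumptions, not part of this repo. The imports are deferred into the function so that defining it needs neither the libraries nor network access; calling it needs both. As in the generation script, the model is cast to float32 before calling `generate`.

```python
def smoke_test(repo_id="yujiepan/llama-3-tiny-random", max_length=16):
    """Load the tiny random checkpoint and run a short generation.

    `smoke_test` is a hypothetical helper name used only for illustration.
    """
    # Deferred imports: calling this function requires torch, transformers,
    # and access to the Hugging Face Hub.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    # Cast to float32 before generating, mirroring the script on this card.
    model = AutoModelForCausalLM.from_pretrained(repo_id).float()

    inputs = tokenizer("Hello", return_tensors="pt")
    with torch.no_grad():
        output_ids = model.generate(**inputs, max_length=max_length)

    # The weights are random, so the decoded text is expected to be gibberish.
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Because the weights are random, a helper like this is only useful for checking that a pipeline runs end to end, not for judging output quality.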

Code:

import transformers
import torch
import os
from huggingface_hub import create_repo, upload_folder
import accelerate

source_model_id = 'meta-llama/Meta-Llama-3-8B-Instruct'
save_path = '/tmp/yujiepan/meta-llama-3-tiny-random'
repo_id = 'yujiepan/meta-llama-3-tiny-random'

os.system(f'rm -rf {save_path}')

config = transformers.AutoConfig.from_pretrained(
    source_model_id,
    trust_remote_code=True,
)
config._name_or_path = source_model_id
config.hidden_size = 4
config.intermediate_size = 14
config.num_attention_heads = 2
config.num_key_value_heads = 1
config.num_hidden_layers = 2
config.torch_dtype = "bfloat16"

model = transformers.AutoModelForCausalLM.from_config(
    config,
    trust_remote_code=True,
)

# Reuse the source model's generation config. Loading generation_config.json
# directly avoids instantiating the full 8B source model.
model.generation_config = transformers.GenerationConfig.from_pretrained(source_model_id)

model = model.to(torch.bfloat16)
model.save_pretrained(save_path)

tokenizer = transformers.AutoTokenizer.from_pretrained(
    source_model_id,
    trust_remote_code=True,
)
tokenizer.save_pretrained(save_path)

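# Smoke test: generate from dummy token ids in float32.
# The bfloat16 checkpoint has already been saved above, so this cast does not affect it.
model.float().generate(torch.tensor([[1, 2, 3]]).long(), max_length=16)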

os.system(f'ls -alh {save_path}')
# os.system(f'rm -rf {save_path}/model.safetensors')
# Upload the same folder to both repos, creating each one if it does not exist yet.
for target_repo_id in (repo_id, 'yujiepan/llama-3-tiny-random'):
    create_repo(target_repo_id, exist_ok=True)
    upload_folder(repo_id=target_repo_id, folder_path=save_path)

Model size: 1.03M params (Safetensors). Tensor type: BF16.
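
The reported 1.03M parameters can be sanity-checked from the config values set in the script. The sketch below is a back-of-the-envelope count, not output from the library. It assumes the Llama 3 vocabulary size of 128256, untied input and output embeddings, and bias-free linear layers, none of which are stated on this card.

```python
# Values set in the script on this card.
hidden_size = 4
intermediate_size = 14
num_attention_heads = 2
num_key_value_heads = 1
num_hidden_layers = 2

# Assumptions (not stated on this card): Llama 3 vocabulary size, untied
# embed_tokens / lm_head, and no bias terms in the linear layers.
vocab_size = 128256

head_dim = hidden_size // num_attention_heads

attention = (
    hidden_size * num_attention_heads * head_dim          # q_proj
    + 2 * hidden_size * num_key_value_heads * head_dim    # k_proj and v_proj
    + num_attention_heads * head_dim * hidden_size        # o_proj
)
mlp = 3 * hidden_size * intermediate_size  # gate_proj, up_proj, down_proj
norms = 2 * hidden_size                    # two RMSNorm weight vectors per layer
per_layer = attention + mlp + norms

embeddings = 2 * vocab_size * hidden_size  # embed_tokens + lm_head
total = embeddings + num_hidden_layers * per_layer + hidden_size  # + final norm

print(per_layer)  # 224
print(total)      # 1026500
print(total * 2)  # 2053000 bytes of weights at 2 bytes per bfloat16 value
```

Under these assumptions almost all parameters sit in the two embedding matrices; the two transformer layers contribute only 448 parameters in total.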