---
license: apache-2.0
metrics:
  - accuracy
base_model:
  - meta-llama/Llama-3.1-8B-Instruct
---

# Model Card for Llama8b-NNetNav-WA

Llama8b-NNetNav-WA is a Llama-3.1-8B model instruct-tuned on NNetNav data, collected through unsupervised exploration of WebArena websites with a larger Llama-3.1-70B model.

Further details about this model can be found in our paper: *NNetNav: Unsupervised Learning of Browser Agents Through Environment Interaction in the Wild*.

*Figure: an example trajectory from NNetNav-WA.*
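
As a quick-start sketch, the model can be loaded like any Hugging Face causal LM. Note that the repository id below is an assumption inferred from this card's location, not a confirmed path:

```python
# Minimal loading sketch; the repository id is an assumption, not confirmed.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "stanfordnlp/llama8b-nnetnav-wa"  # hypothetical id: adjust to the hosting repo

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # Llama-3.1 checkpoints are typically bf16
    device_map="auto",
)
```

Because the model was tuned on browser-agent trajectories, prompts should follow the agent format described in the paper rather than free-form chat.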

## Table of Contents

- Model Details
- Model Description
- Uses
- Bias, Risks, and Limitations
- How to Get Started with the Model


## Training Details

### Training Data

This model was trained on the NNetNav-WA corpus.
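
A hedged sketch of pulling that corpus with the `datasets` library; the dataset id here is hypothetical and should be replaced with the corpus's actual Hub location:

```python
# Hypothetical dataset id; replace with the actual NNetNav-WA location.
from datasets import load_dataset

nnetnav_wa = load_dataset("stanfordnlp/nnetnav-wa", split="train")
print(nnetnav_wa[0])  # inspect one exploration demonstration
```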

### Training Procedure

This model was trained for 2 epochs (roughly 4k gradient steps) with an effective batch size of 128 and a maximum sequence length of 20,000 tokens.
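
As a rough illustration of those numbers (this is not the authors' Open-Instruct configuration, and the batch-size split across the 4 GPUs is an assumption), the setup maps onto standard `transformers.TrainingArguments` like so:

```python
# Illustrative sketch only; the actual run used Open-Instruct, and the
# per-device/accumulation split below is an assumption.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="llama8b-nnetnav-wa",
    num_train_epochs=2,             # ~4k gradient steps total
    per_device_train_batch_size=4,  # assumed: 4 sequences per GPU
    gradient_accumulation_steps=8,  # 4 GPUs x 4 x 8 = effective batch size 128
    bf16=True,                      # assumed mixed-precision setting for H100s
)

MAX_SEQ_LENGTH = 20_000  # maximum sequence length, enforced at tokenization time
```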

## Environmental Impact

- **Hardware Type:** 4x NVIDIA H100 GPUs (80 GB)
- **Hours used:** Roughly 48 hours (2 days)
- **Cloud Provider:** Stanford compute
- **Compute Region:** Stanford energy grid

## Model Architecture and Objective

Llama8b-NNetNav-WA uses the Llama-3.1-8B-Instruct architecture (a decoder-only transformer) and was fine-tuned with a standard supervised language-modeling objective on NNetNav demonstrations.

## Compute Infrastructure

This model was trained on a Slurm cluster.

### Hardware

This model was trained on 4 NVIDIA H100 GPUs.

### Software

This model was fine-tuned with Open-Instruct.

## Citation

BibTeX:


## Model Card Authors

Shikhar Murty

## Model Card Contact

smurty@cs.stanford.edu, shikhar.murty@gmail.com