---
license: apache-2.0
metrics:
  - accuracy
base_model:
  - meta-llama/Llama-3.1-8B-Instruct
---

# Model Card for Llama8b-NNetNav-WA

Llama8b-NNetNav-WA is a Llama-3.1-8B model instruct-tuned on NNetNav data, collected through unsupervised exploration of WebArena websites with a larger Llama-3.1-70B model.

Further details about this model can be found in our paper: *NNetNav: Unsupervised Learning of Browser Agents Through Environment Interaction in the Wild*.

*Figure: an example trajectory from NNetNav-WA.*
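
As a quick-start sketch, the model can be loaded like any Hugging Face causal LM. Note that the repository id below is an assumption inferred from this card's location, not a confirmed path:

```python
# Minimal loading sketch; the repository id is an assumption, not confirmed.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "stanfordnlp/llama8b-nnetnav-wa"  # hypothetical id: adjust to the hosting repo

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # Llama-3.1 checkpoints are typically bf16
    device_map="auto",
)
```

Because the model was tuned on browser-agent trajectories, prompts should follow the agent format described in the paper rather than free-form chat.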

## Table of Contents

- Model Details
- Model Description
- Uses
- Bias, Risks, and Limitations
- How to Get Started with the Model


## Training Details

### Training Data

This model was trained on the NNetNav-WA corpus.
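
A hedged sketch of pulling that corpus with the `datasets` library; the dataset id here is hypothetical and should be replaced with the corpus's actual Hub location:

```python
# Hypothetical dataset id; replace with the actual NNetNav-WA location.
from datasets import load_dataset

nnetnav_wa = load_dataset("stanfordnlp/nnetnav-wa", split="train")
print(nnetnav_wa[0])  # inspect one exploration demonstration
```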

### Training Procedure

This model was trained for 2 epochs (roughly 4k gradient steps) with an effective batch size of 128 and a maximum sequence length of 20,000 tokens.
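
As a rough illustration of those numbers (this is not the authors' Open-Instruct configuration, and the batch-size split across the 4 GPUs is an assumption), the setup maps onto standard `transformers.TrainingArguments` like so:

```python
# Illustrative sketch only; the actual run used Open-Instruct, and the
# per-device/accumulation split below is an assumption.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="llama8b-nnetnav-wa",
    num_train_epochs=2,             # ~4k gradient steps total
    per_device_train_batch_size=4,  # assumed: 4 sequences per GPU
    gradient_accumulation_steps=8,  # 4 GPUs x 4 x 8 = effective batch size 128
    bf16=True,                      # assumed mixed-precision setting for H100s
)

MAX_SEQ_LENGTH = 20_000  # maximum sequence length, enforced at tokenization time
```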

## Environmental Impact

- **Hardware Type:** 4x NVIDIA H100 GPUs (80 GB)
- **Hours used:** Roughly 48 hours (2 days)
- **Cloud Provider:** Stanford compute
- **Compute Region:** Stanford energy grid

## Model Architecture and Objective

Llama8b-NNetNav-WA uses the Llama-3.1-8B-Instruct architecture (a decoder-only transformer) and was fine-tuned with a standard supervised language-modeling objective on NNetNav demonstrations.

## Compute Infrastructure

This model was trained on a Slurm cluster.

### Hardware

This model was trained on 4 NVIDIA H100 GPUs.

### Software

This model was fine-tuned with Open-Instruct.

## Citation

BibTeX:


## Model Card Authors

Shikhar Murty

## Model Card Contact

smurty@cs.stanford.edu, shikhar.murty@gmail.com