smurty committed · verified
Commit cd9d5b6 · 1 Parent(s): 76d38ec

Update README.md

Files changed (1): README.md (+8 −26)
README.md CHANGED
@@ -21,27 +21,22 @@ Most details about this model along with details can be found in our paper: [NNe
  - [Model Card for Llama8b-NNetNav-WA](#model-card-for--model_id-)
  - [Table of Contents](#table-of-contents)
  - [Model Details](#model-details)
- - [Model Description](#model-description)
- - [Uses](#uses)
+ - [Results on Web-Agent Benchmarks](#results-on-benchmarks)
  - [Bias, Risks, and Limitations](#bias-risks-and-limitations)
  - [Training Details](#training-details)
  - [Training Data](#training-data)
  - [Training Procedure](#training-procedure)
  - [Environmental Impact](#environmental-impact)
- - [Technical Specifications [optional]](#technical-specifications-optional)
- - [Model Architecture and Objective](#model-architecture-and-objective)
- - [Compute Infrastructure](#compute-infrastructure)
- - [Hardware](#hardware)
- - [Software](#software)
- - [Citation](#citation)
+ - [Technical Specifications](#technical-specifications)
+ - [Hardware](#hardware)
+ - [Software](#software)
  - [Model Card Authors [optional]](#model-card-authors-optional)
  - [Model Card Contact](#model-card-contact)
  - [How to Get Started with the Model](#how-to-get-started-with-the-model)
 
  ## Model Details
- This model is intended to be used as a **web-agent** i.e. given an instruction such as "Upvote the post by user smurty123 on subreddit r/LocalLLaMA", and a web-url "reddit.com", the model can perform the task by executing a sequence of actions.
+ This model is intended to be used as a **web-agent** i.e. given an instruction such as _Upvote the post by user smurty123 on subreddit r/LocalLLaMA_, and a web-url _reddit.com_, the model can perform the task by executing a sequence of actions.
 
- ### Action Space
  <!-- Provide a longer summary of what this model is/does. -->
  The action space of the model is as follows:
  ```plaintext
@@ -88,25 +83,20 @@ TODO
 
  ### Training Data
 
- This model was trained on the [NNetnav-WA](https://huggingface.co/datasets/stanfordnlp/nnetnav-wa) dataset, which is comprised of synthetic demonstrations entirely from self-hosted websites.
+ This model was trained with SFT on the [NNetnav-WA](https://huggingface.co/datasets/stanfordnlp/nnetnav-wa) dataset, which is comprised of synthetic demonstrations entirely from self-hosted websites.
 
  ### Training Procedure
 
  This model was trained for 2 epochs (roughly 4k gradient steps) with a batch size of 128, and a maximum sequence length of 20000.
 
- ### Environmental Impact
+ ## Environmental Impact
 
  - **Hardware Type:** 4 H100 GPUs (80G)
  - **Hours used:** Roughly 2 days.
  - **Cloud Provider:** Stanford compute.
  - **Compute Region:** Stanford energy grid.
 
- ### Model Architecture and Objective
-
-
- ### Compute Infrastructure
-
- This model was trained on a slurm cluster.
+ ## Technical Specifications
 
  ### Hardware
 
@@ -116,14 +106,6 @@ This model was trained on 4 H100s.
 
  This model was fine-tuned with [Open-Instruct](https://github.com/allenai/open-instruct/tree/main)
 
- ## Citation
-
- **BibTeX:**
-
- ```
-
- ```
-
 
  ## Model Card Authors [optional]
 
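The model card above describes the model completing a task by emitting a sequence of actions from a fixed action space (the action list itself is truncated in this diff). As an illustration only — assuming WebArena-style action strings such as `click [id]` or `type [id] [text]`, which is an assumption since the full action space is not shown here — a client consuming the model's output might parse each predicted action into a structured form like this; `parse_action` is a hypothetical helper, not part of the released code:

```python
import re

# Matches an action name followed by zero or more bracketed arguments,
# e.g. "click [1234]" or "type [49] [Hello] [1]".
ACTION_RE = re.compile(r"^(?P<name>\w+)\s*(?P<args>(\[[^\]]*\]\s*)*)$")

def parse_action(action_str: str) -> dict:
    """Parse one WebArena-style action string into {"name": ..., "args": [...]}."""
    m = ACTION_RE.match(action_str.strip())
    if m is None:
        raise ValueError(f"unrecognized action: {action_str!r}")
    args = re.findall(r"\[([^\]]*)\]", m.group("args"))
    return {"name": m.group("name"), "args": args}
```

A browser controller would then dispatch on `name` (click, type, scroll, …) to drive the page, feeding the resulting observation back to the model for the next step.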