Update README.md
README.md CHANGED
````diff
@@ -21,27 +21,22 @@ Most details about this model can be found in our paper: [NNe
 - [Model Card for Llama8b-NNetNav-WA](#model-card-for--model_id-)
 - [Table of Contents](#table-of-contents)
 - [Model Details](#model-details)
-
-- [Uses](#uses)
+- [Results on Web-Agent Benchmarks](#results-on-benchmarks)
 - [Bias, Risks, and Limitations](#bias-risks-and-limitations)
 - [Training Details](#training-details)
 - [Training Data](#training-data)
 - [Training Procedure](#training-procedure)
 - [Environmental Impact](#environmental-impact)
-- [Technical Specifications
-- [
-- [
-- [Hardware](#hardware)
-- [Software](#software)
-- [Citation](#citation)
+- [Technical Specifications](#technical-specifications)
+- [Hardware](#hardware)
+- [Software](#software)
 - [Model Card Authors [optional]](#model-card-authors-optional)
 - [Model Card Contact](#model-card-contact)
 - [How to Get Started with the Model](#how-to-get-started-with-the-model)
 
 ## Model Details
-This model is intended to be used as a **web-agent** i.e. given an instruction such as
+This model is intended to be used as a **web-agent**: given an instruction such as _Upvote the post by user smurty123 on subreddit r/LocalLLaMA_ and a web URL such as _reddit.com_, the model can perform the task by executing a sequence of actions.
 
-### Action Space
 <!-- Provide a longer summary of what this model is/does. -->
 The action space of the model is as follows:
 ```plaintext
@@ -88,25 +83,20 @@ TODO
 
 ### Training Data
 
-This model was trained on the [NNetnav-WA](https://huggingface.co/datasets/stanfordnlp/nnetnav-wa) dataset, which is comprised of synthetic demonstrations entirely from self-hosted websites.
+This model was trained with SFT on the [NNetnav-WA](https://huggingface.co/datasets/stanfordnlp/nnetnav-wa) dataset, which consists of synthetic demonstrations collected entirely from self-hosted websites.
 
 ### Training Procedure
 
 This model was trained for 2 epochs (roughly 4k gradient steps) with a batch size of 128 and a maximum sequence length of 20000.
 
-
+## Environmental Impact
 
 - **Hardware Type:** 4 H100 GPUs (80G)
 - **Hours used:** Roughly 2 days.
 - **Cloud Provider:** Stanford compute.
 - **Compute Region:** Stanford energy grid.
 
-
-
-
-### Compute Infrastructure
-
-This model was trained on a slurm cluster.
+## Technical Specifications
 
 ### Hardware
 
@@ -116,14 +106,6 @@ This model was trained on 4 H100s.
 
 This model was fine-tuned with [Open-Instruct](https://github.com/allenai/open-instruct/tree/main).
 
-## Citation
-
-**BibTeX:**
-
-```
-
-```
-
 ## Model Card Authors [optional]
 
````
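The action-space listing in the card is truncated in this diff (the ```` ```plaintext ```` block is cut off). As a rough illustration only — assuming the model emits WebArena-style action strings such as `click [1234]`, `type [49] [hello world]`, or `stop [answer]`, which is an assumption about the elided block, not something the diff shows — a minimal sketch of parsing such an action string might look like:

```python
import re

# Hypothetical parser for WebArena-style action strings such as
# "click [1234]" or "type [49] [hello world]". The exact grammar for
# this model is the (elided) plaintext block in the model card; adjust
# the patterns to match it.
ACTION_RE = re.compile(r"^(?P<name>\w+)\s*(?P<args>(\[[^\]]*\]\s*)*)$")

def parse_action(text: str):
    """Split an action string into its name and a list of bracketed args."""
    match = ACTION_RE.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognized action: {text!r}")
    args = re.findall(r"\[([^\]]*)\]", match.group("args"))
    return match.group("name"), args
```

In an agent loop, a parser like this would sit between the model's raw generation and the browser-automation layer that executes the action.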
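The training-procedure numbers in the card (2 epochs, roughly 4k gradient steps, batch size 128) imply a rough dataset scale. A quick sanity check, assuming the 4k steps are the total across both epochs (the card does not say explicitly):

```python
# Back-of-the-envelope check of the training-procedure numbers:
# ~4k total gradient steps at batch size 128 over 2 epochs implies
# roughly 256k demonstrations per epoch.
gradient_steps = 4_000   # "roughly 4k gradient steps" (assumed total)
batch_size = 128
epochs = 2

examples_seen = gradient_steps * batch_size   # examples processed overall
examples_per_epoch = examples_seen // epochs  # approximate dataset size
```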