smurty committed · verified
Commit cd9d5b6 · 1 Parent(s): 76d38ec

Update README.md

Files changed (1): README.md (+8 −26)
README.md CHANGED
@@ -21,27 +21,22 @@ Most details about this model along with details can be found in our paper: [NNe
  - [Model Card for Llama8b-NNetNav-WA](#model-card-for--model_id-)
  - [Table of Contents](#table-of-contents)
  - [Model Details](#model-details)
- - [Model Description](#model-description)
- - [Uses](#uses)
+ - [Results on Web-Agent Benchmarks](#results-on-benchmarks)
  - [Bias, Risks, and Limitations](#bias-risks-and-limitations)
  - [Training Details](#training-details)
  - [Training Data](#training-data)
  - [Training Procedure](#training-procedure)
  - [Environmental Impact](#environmental-impact)
- - [Technical Specifications [optional]](#technical-specifications-optional)
- - [Model Architecture and Objective](#model-architecture-and-objective)
- - [Compute Infrastructure](#compute-infrastructure)
- - [Hardware](#hardware)
- - [Software](#software)
- - [Citation](#citation)
+ - [Technical Specifications](#technical-specifications)
+ - [Hardware](#hardware)
+ - [Software](#software)
  - [Model Card Authors [optional]](#model-card-authors-optional)
  - [Model Card Contact](#model-card-contact)
  - [How to Get Started with the Model](#how-to-get-started-with-the-model)
 
  ## Model Details
- This model is intended to be used as a **web-agent** i.e. given an instruction such as "Upvote the post by user smurty123 on subreddit r/LocalLLaMA", and a web-url "reddit.com", the model can perform the task by executing a sequence of actions.
+ This model is intended to be used as a **web-agent** i.e. given an instruction such as _Upvote the post by user smurty123 on subreddit r/LocalLLaMA_, and a web-url _reddit.com_, the model can perform the task by executing a sequence of actions.
 
- ### Action Space
  <!-- Provide a longer summary of what this model is/does. -->
  The action space of the model is as follows:
  ```plaintext
@@ -88,25 +83,20 @@ TODO
 
  ### Training Data
 
- This model was trained on the [NNetnav-WA](https://huggingface.co/datasets/stanfordnlp/nnetnav-wa) dataset, which is comprised of synthetic demonstrations entirely from self-hosted websites.
+ This model was trained with SFT on the [NNetnav-WA](https://huggingface.co/datasets/stanfordnlp/nnetnav-wa) dataset, which is comprised of synthetic demonstrations entirely from self-hosted websites.
 
  ### Training Procedure
 
  This model was trained for 2 epochs (roughly 4k gradient steps) with a batch size of 128, and a maximum sequence length of 20000.
 
- ### Environmental Impact
+ ## Environmental Impact
 
  - **Hardware Type:** 4 H100 GPUs (80G)
  - **Hours used:** Roughly 2 days.
  - **Cloud Provider:** Stanford compute.
  - **Compute Region:** Stanford energy grid.
 
- ### Model Architecture and Objective
-
-
- ### Compute Infrastructure
-
- This model was trained on a slurm cluster.
+ ## Technical Specifications
 
  ### Hardware
 
@@ -116,14 +106,6 @@ This model was trained on 4 H100s.
 
  This model was fine-tuned with [Open-Instruct](https://github.com/allenai/open-instruct/tree/main)
 
- ## Citation
-
- **BibTeX:**
-
- ```
-
- ```
-
 
  ## Model Card Authors [optional]
 
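The model card above describes the model completing a task by emitting a sequence of actions from a fixed action space (the action list itself is truncated in this diff). As an illustration only — assuming WebArena-style action strings such as `click [id]` or `type [id] [text]`, which is an assumption since the full action space is not shown here — a client consuming the model's output might parse each predicted action into a structured form like this; `parse_action` is a hypothetical helper, not part of the released code:

```python
import re

# Matches an action name followed by zero or more bracketed arguments,
# e.g. "click [1234]" or "type [49] [Hello] [1]".
ACTION_RE = re.compile(r"^(?P<name>\w+)\s*(?P<args>(\[[^\]]*\]\s*)*)$")

def parse_action(action_str: str) -> dict:
    """Parse one WebArena-style action string into {"name": ..., "args": [...]}."""
    m = ACTION_RE.match(action_str.strip())
    if m is None:
        raise ValueError(f"unrecognized action: {action_str!r}")
    args = re.findall(r"\[([^\]]*)\]", m.group("args"))
    return {"name": m.group("name"), "args": args}
```

A browser controller would then dispatch on `name` (click, type, scroll, …) to drive the page, feeding the resulting observation back to the model for the next step.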