ledmands committed on
Commit 35a7301
1 Parent(s): ebcd7c3

Updated and formatted README

Files changed (1)
  1. README.md +15 -20
README.md CHANGED
@@ -26,8 +26,8 @@ model-index:
  
  # *Agent using DQN to play ALE/Pacman-v5*
  
- >## Update 20 May 2024: Latest DQN model is version 2.8
- >***NOTE:** Video preview is the best model of version 2.8 playing for 10,000 steps. Evaluation metrics are self-reported based on 10 episodes of evaluation. Can be found in `agents/dqn_v2-8/evals.txt`*
+ ## Update 20 May 2024: Latest DQN model is version 2.8
+ ***NOTE:** Video preview is the best model of version 2.8 playing for 10,000 steps. Evaluation metrics are self-reported based on 10 episodes of evaluation. Can be found in `agents/dqn_v2-8/evals.txt`*
  
  This is an agent that is trained using Stable Baselines3 as part of the capstone project for South Hills School in Spring 2024. The goal of this project is to gain familiarity with reinforcement learning concepts and tools, and to train an agent to score up into the 400-500 point range in Pac-Man.
 
@@ -50,7 +50,7 @@ After cloning the repository to your local device, run:
  ```bash
  pip install -r requirements.txt
  ```
- >***NOTE:** The `requirements.txt` file will install all the extra dependencies for Stable Baselines and the entire version of TensorFlow. This is for ease of use for Stable Baselines and to ensure that extra data points and tools are available in TensorBoard. If you wish to install dependencies as needed, you can simply skip the `requirements.txt` file and install packages via `pip` as desired.*
+ ***NOTE:** The `requirements.txt` file will install all the extra dependencies for Stable Baselines and the entire version of TensorFlow. This is for ease of use for Stable Baselines and to ensure that extra data points and tools are available in TensorBoard. If you wish to install dependencies as needed, you can simply skip the `requirements.txt` file and install packages via `pip` as desired.*
  
  ---
  
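The NOTE in the hunk above mentions skipping `requirements.txt` and installing packages via `pip` as needed. A minimal sketch of that route follows; the package names are assumptions on my part, and the repository's actual pinned list lives in `requirements.txt`:

```bash
# Install-as-needed route described in the NOTE above.
# Package names are assumptions, not the repository's pinned requirements.
pip install "stable-baselines3[extra]"  # Stable Baselines3 with its optional extras
pip install tensorflow                  # full TensorFlow, so TensorBoard has its complete tooling
```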
@@ -153,42 +153,37 @@ python <script_name> --help
  
  ##### *watch_agent.py*
  
- >This will render the specified agent in real-time. Does not save any evaluation information.
+ - This will render the specified agent in real-time. Does not save any evaluation information.
  
  ##### *evaluate_agent.py*
  
- >This will evaluate a specified agent and append the results to a specified log file.
+ - This will evaluate a specified agent and append the results to a specified log file.
  
  ##### *get_config.py*
  
- >This will pull configuration information from the specified agent and save it in JSON format. The data is pulled from the data file in the agent's zip file and strips out the serialized data to make the data more human-readable. The default save file will save to the directory from which the command is run. Best practice is to save the file to the agent's directory.
+ - This will pull configuration information from the specified agent and save it in JSON format. The data is pulled from the data file in the agent's zip file and strips out the serialized data to make the data more human-readable. The default save file will save to the directory from which the command is run. Best practice is to save the file to the agent's directory.
  
  ##### *plot_improvement.py*
  
- >This plots the average score and standard deviation of the `dqn_v2` agent over all evaluation episodes during a training run as a bar graph with each training run shown as one bar. Removes the lowest and highest episode scores from each evaluation.
+ - This plots the average score and standard deviation of the `dqn_v2` agent over all evaluation episodes during a training run as a bar graph with each training run shown as one bar. Removes the lowest and highest episode scores from each evaluation.
  
  ##### *record_video.py*
  
- >This will record a video of a specified agent being evaluated. Does not save any evaluation information. *Currently in major development. Currently located in development branch.*
+ - This will record a video of a specified agent being evaluated. Does not save any evaluation information. *Currently in major development. Currently located in development branch.*
  
  ##### *plot_evaluations.py*
  
- >This will plot the evaluation data that was gathered during the training run of the specified agent using MatPlotLib. Charts can be saved to a directory of the user's choosing. *Currently in major development. Currently located in development branch.*
+ - This will plot the evaluation data that was gathered during the training run of the specified agent using MatPlotLib. Charts can be saved to a directory of the user's choosing. *Currently in major development. Currently located in development branch.*
  
  ---
  
  ## *External References*
  
- - [Foundations of Deep RL -- 6-lecture series by Pieter Abbeel](https://www.youtube.com/playlist?list=PLwRJQ4m4UJjNymuBM9RdmB3Z9N5-0IlY0)
-   - This is an excellent introduction to some of the concepts behind Deep RL Algorithms. Pieter Abbeel is a machine learning and robotics researcher at UC Berkeley.
- - [Training AI to Play Pokemon with Reinforcement Learning](https://www.youtube.com/watch?v=DcYLT37ImBY)
-   - Peter Whidden's video of using Proximal Policy Optimization was a major inspiration for this project and has some fantastic visualizations of the agent learning.
- - [Frame Skipping and Pre-Processing for Deep Q-Networks on Atari 2600 Games](https://danieltakeshi.github.io/2016/11/25/frame-skipping-and-preprocessing-for-deep-q-networks-on-atari-2600-games/)
-   - Daniel Takeshi wrote an excellent post that helped me better understand some of the terminology around frame skipping.
- - [Playing Atari with Deep Reinforcement Learning](https://arxiv.org/abs/1312.5602)
-   - This paper on Deep Q Networks is a landmark in the field of reinforcement learning.
- - [Hugging Face Deep Reinforcement Learning Course](https://huggingface.co/learn/deep-rl-course/unit0/introduction)
-   - Another inspiration for this project and a great place to get hands-on experience.
+ - [Foundations of Deep RL -- 6-lecture series by Pieter Abbeel](https://www.youtube.com/playlist?list=PLwRJQ4m4UJjNymuBM9RdmB3Z9N5-0IlY0). *This is an excellent introduction to some of the concepts behind Deep RL Algorithms. Pieter Abbeel is a machine learning and robotics researcher at UC Berkeley.*
+ - [Training AI to Play Pokemon with Reinforcement Learning](https://www.youtube.com/watch?v=DcYLT37ImBY). *Peter Whidden's video of using Proximal Policy Optimization was a major inspiration for this project and has some fantastic visualizations of the agent learning.*
+ - [Frame Skipping and Pre-Processing for Deep Q-Networks on Atari 2600 Games](https://danieltakeshi.github.io/2016/11/25/frame-skipping-and-preprocessing-for-deep-q-networks-on-atari-2600-games/). *Daniel Takeshi wrote an excellent post that helped me better understand some of the terminology around frame skipping.*
+ - [Playing Atari with Deep Reinforcement Learning](https://arxiv.org/abs/1312.5602). *This paper on Deep Q Networks is a landmark in the field of reinforcement learning.*
+ - [Hugging Face Deep Reinforcement Learning Course](https://huggingface.co/learn/deep-rl-course/unit0/introduction). *Another inspiration for this project and a great place to get hands-on experience.*
  - [Stable Baselines3](https://stable-baselines3.readthedocs.io/en/master/)
  - [RL Zoo](https://rl-baselines3-zoo.readthedocs.io/en/master/)
  - [Gymnasium](https://gymnasium.farama.org/)
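For the scripts described in the hunk above, the README documents `python <script_name> --help` as the way to inspect each script's options. A short sketch; the flags in the last line are hypothetical, shown only to illustrate the evaluate-then-log workflow:

```bash
# Documented pattern for discovering each script's real interface:
python watch_agent.py --help
python evaluate_agent.py --help

# Hypothetical invocation (flag names are illustrative only; check --help first):
python evaluate_agent.py --agent agents/dqn_v2-8 --log agents/dqn_v2-8/evals.txt
```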
@@ -197,4 +192,4 @@ python <script_name> --help
  
  ## *Contact*
  
- Please feel free to contact me on [Twitter](https://x.com/ledmands) or [LinkedIn](https://linkedin.com/in/lucasedmands) or in in the Discussion section on the Community tab of this repository!
+ Please feel free to contact me on [Twitter](https://x.com/ledmands) or [LinkedIn](https://linkedin.com/in/lucasedmands) or in the Discussion section on the Community tab of this repository!
 