devdharpatel committed
Commit cef654c
1 Parent(s): e0c2dbd

Update README

Files changed (1)
  1. README.md +88 -3
README.md CHANGED
---
license: bsd-3-clause
tags:
- InvertedPendulum-v2
- reinforcement-learning
- decisions
- TLA
- deep-reinforcement-learning
model-index:
- name: TLA
  results:
  - metrics:
    - type: mean_reward
      value: 1000.00
      name: mean_reward
    - type: Action Repetition
      value: 0.8882
      name: Action Repetition
    - type: Average Decisions
      value: 111.79
      name: Average Decisions
    task:
      type: OpenAI Gym
      name: OpenAI Gym
    dataset:
      name: InvertedPendulum-v2
      type: InvertedPendulum-v2
Paper: https://arxiv.org/abs/2305.18701
Code: https://github.com/dee0512/Temporally-Layered-Architecture
---
# Temporally Layered Architecture: InvertedPendulum-v2

These are 10 trained models, one per **seed (0-9)**, of the **[Temporally Layered Architecture (TLA)](https://github.com/dee0512/Temporally-Layered-Architecture)** agent playing **InvertedPendulum-v2**.

## Model Sources

- **Repository:** [https://github.com/dee0512/Temporally-Layered-Architecture](https://github.com/dee0512/Temporally-Layered-Architecture)
- **Paper:** [https://doi.org/10.1162/neco_a_01718](https://doi.org/10.1162/neco_a_01718)
- **Arxiv:** [arxiv.org/abs/2305.18701](https://arxiv.org/abs/2305.18701)

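To work with these models locally, first clone the code repository linked above (a minimal sketch; installing the repository's dependencies, e.g. Gym/MuJoCo, follows the repository's own instructions):

```
git clone https://github.com/dee0512/Temporally-Layered-Architecture
cd Temporally-Layered-Architecture
```
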
# Training Details:
Using the repository:

```
python main.py --env_name <environment> --seed <seed>
```

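For example, the 10 models in this card correspond to runs like the following (assuming the repository's default hyperparameters):

```
for seed in 0 1 2 3 4 5 6 7 8 9; do
    python main.py --env_name InvertedPendulum-v2 --seed $seed
done
```
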
# Evaluation:

Download the models folder and place it in the same directory as the cloned repository.
Using the repository:

```
python eval.py --env_name <environment>
```

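For example, for the InvertedPendulum-v2 checkpoints in this card (the Hugging Face repo id below is a placeholder; substitute this model page's id):

```
# Fetch the trained models (placeholder repo id).
git clone https://huggingface.co/<this-model-repo>

# Place the models folder as described above, then evaluate the InvertedPendulum-v2 agents:
python eval.py --env_name InvertedPendulum-v2
```
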
## Metrics:

- **mean_reward:** Mean reward over the 10 seeds
- **action_repetition:** Percentage of actions that are equal to the previous action
- **mean_decisions:** Average number of decisions required (neural network/model forward passes)

# Citation

The paper can be cited with the following BibTeX entry:

## BibTeX:

```
@article{10.1162/neco_a_01718,
    author = {Patel, Devdhar and Sejnowski, Terrence and Siegelmann, Hava},
    title = "{Optimizing Attention and Cognitive Control Costs Using Temporally Layered Architectures}",
    journal = {Neural Computation},
    pages = {1-30},
    year = {2024},
    month = {10},
    issn = {0899-7667},
    doi = {10.1162/neco_a_01718},
    url = {https://doi.org/10.1162/neco\_a\_01718},
    eprint = {https://direct.mit.edu/neco/article-pdf/doi/10.1162/neco\_a\_01718/2474695/neco\_a\_01718.pdf},
}
```

## APA:
```
Patel, D., Sejnowski, T., & Siegelmann, H. (2024). Optimizing Attention and Cognitive Control Costs Using Temporally Layered Architectures. Neural Computation, 1-30.
```