metadata
license: bsd-3-clause
tags:
  - Pendulum-v1
  - reinforcement-learning
  - decisions
  - TLA
  - deep-reinforcement-learning
model-index:
  - name: TLA
    results:
      - metrics:
          - type: mean_reward
            value: -154.92
            name: mean_reward
          - type: action_repetition
            value: 0.7032
            name: action_repetition
          - type: mean_decisions
            value: 62.31
            name: mean_decisions
        task:
          type: reinforcement-learning
          name: reinforcement-learning
        dataset:
          name: Pendulum-v1
          type: Pendulum-v1
    Paper: https://arxiv.org/abs/2305.18701
    Code: https://github.com/dee0512/Temporally-Layered-Architecture

Temporally Layered Architecture: Pendulum-v1

This repository contains 10 trained models, one per seed (0-9), of the Temporally Layered Architecture (TLA) agent playing Pendulum-v1.

Model Sources

Repository: https://github.com/dee0512/Temporally-Layered-Architecture
Paper: https://doi.org/10.1162/neco_a_01718
arXiv: https://arxiv.org/abs/2305.18701

Training Details:

To train a model, run the following from the cloned repository:

python main.py --env_name <environment> --seed <seed>
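
To reproduce all 10 seeds, the command above can be repeated once per seed. The following is a minimal sketch, not part of the repository: the helper names `build_train_commands` and `run_all` are hypothetical, and it assumes that `Pendulum-v1` is the value `main.py` expects for `--env_name`. As written it only prints the commands; call `run_all(commands)` from inside the cloned repository to execute them.

```python
import shlex
import subprocess


def build_train_commands(env_name, seeds):
    """Return one `python main.py ...` command (as an argument list) per seed."""
    return [
        ["python", "main.py", "--env_name", env_name, "--seed", str(seed)]
        for seed in seeds
    ]


def run_all(commands):
    """Run each command in order, stopping at the first failure."""
    for cmd in commands:
        subprocess.run(cmd, check=True)


# Assumption: "Pendulum-v1" is accepted as the --env_name value.
commands = build_train_commands("Pendulum-v1", range(10))

# Print the commands for inspection instead of running them.
for cmd in commands:
    print(shlex.join(cmd))
```

Training runs are independent across seeds, so the printed commands can also be launched in parallel on separate machines or GPUs.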

Evaluation:

Download the models folder and place it in the same directory as the cloned repository. Then run the following from the repository:

python eval.py --env_name <environment>

Metrics:

mean_reward: Mean reward over 10 seeds
action_repetition: Percentage of actions that are equal to the previous action
mean_decisions: Number of decisions required (neural network/model forward passes)
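
The definitions above can be illustrated with a short sketch. It is not the repository's evaluation code: the function names, the toy data, and the choice to count repeats over consecutive action pairs are assumptions made for illustration, and the repository may normalise differently. The function returns a fraction between 0 and 1; multiply by 100 for a percentage.

```python
from statistics import mean


def action_repetition(actions, tol=0.0):
    """Fraction of consecutive action pairs where the action is unchanged.

    Assumes scalar actions (Pendulum-v1 has a one-dimensional action).
    """
    if len(actions) < 2:
        return 0.0
    repeats = sum(
        1 for prev, cur in zip(actions, actions[1:]) if abs(cur - prev) <= tol
    )
    return repeats / (len(actions) - 1)


def mean_over_seeds(per_seed_values):
    """Average a per-seed metric (for example, episode reward) across seeds."""
    return mean(per_seed_values)


# Toy data, not taken from the trained models.
toy_actions = [0.5, 0.5, 0.5, -1.0, -1.0]
print(action_repetition(toy_actions))     # 3 of 4 transitions repeat: 0.75
print(mean_over_seeds([-150.0, -160.0]))  # -155.0
```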

Citation

The paper can be cited with the following BibTeX entry:

BibTeX:

@article{10.1162/neco_a_01718,
    author = {Patel, Devdhar and Sejnowski, Terrence and Siegelmann, Hava},
    title = "{Optimizing Attention and Cognitive Control Costs Using Temporally Layered Architectures}",
    journal = {Neural Computation},
    pages = {1-30},
    year = {2024},
    month = {10},
    issn = {0899-7667},
    doi = {10.1162/neco_a_01718},
    url = {https://doi.org/10.1162/neco\_a\_01718},
    eprint = {https://direct.mit.edu/neco/article-pdf/doi/10.1162/neco\_a\_01718/2474695/neco\_a\_01718.pdf},
}

APA:

Patel, D., Sejnowski, T., & Siegelmann, H. (2024). Optimizing Attention and Cognitive Control Costs Using Temporally Layered Architectures. Neural Computation, 1-30.