Yuan (Cyrus) Chiang committed
Commit da724dc · unverified · 1 parent: 419b35b

Major cleanup (#51)

* clean up; add mattersim combustion; add mace-mpa

* add app test

* only push to hf after test passes on main

* add streamlit to test deps

* add test badge

* update readme

.github/README.md CHANGED
@@ -1,7 +1,8 @@
1
  <div align="center">
2
  <h1>MLIP Arena</h1>
 
3
  <a href="https://zenodo.org/doi/10.5281/zenodo.13704399"><img src="https://zenodo.org/badge/776930320.svg" alt="DOI"></a>
4
- <a href="https://huggingface.co/spaces/atomind/mlip-arena"><img src="https://huggingface.co/datasets/huggingface/brand-assets/resolve/main/hf-logo-with-title.svg" style="height: 20px; background-color: white;" alt="Hugging Face"></a>
5
  <!-- <a href="https://discord.gg/W8WvdQtT8T"><img alt="Discord" src="https://img.shields.io/discord/1299613474820984832?logo=discord"> -->
6
  </a>
7
  </div>
@@ -107,8 +108,8 @@ streamlit run serve/app.py
107
  > - [Prefect molecular dynamics (MD)](../mlip_arena/tasks/md.py)
108
  > - [Prefect equation of states (EOS)](../mlip_arena/tasks/eos.py)
109
 
110
- 1. Follow the task template to implement the task class and upload the script along with metadata to the MLIP Arena [here](../mlip_arena/tasks/README.md).
111
- 2. Code a benchmark script to evaluate the performance of your model on the task. The script should be able to load the model and the dataset, and output the evaluation metrics.
112
 
113
  ### Add new MLIP models
114
 
@@ -129,12 +130,10 @@ If you have pretrained MLIP models that you would like to contribute to the MLIP
129
  2. Follow the template to code the I/O interface for your model [here](../mlip_arena/models/README.md).
130
  3. Update model [registry](../mlip_arena/models/registry.yaml) with metadata
131
 
132
- > [!NOTE]
133
- > CPU benchmarking will be performed automatically. Due to the limited amount GPU compute, if you would like to be considered for GPU benchmarking, please create a pull request to demonstrate the offline performance of your model (published paper or preprint). We will review and select the models to be benchmarked on GPU.
134
 
135
-
136
-
137
- ### Add new datasets
138
 
139
  The "ultimate" goal is to compile the copies of all the open data in a unified format for lifelong learning with [Hugging Face Auto-Train](https://huggingface.co/docs/hub/webhooks-guide-auto-retrain).
140
 
@@ -150,4 +149,4 @@ The "ultimate" goal is to compile the copies of all the open data in a unified f
150
  #### Molecular dynamics calculations
151
 
152
  - [ ] [MD17](http://www.sgdml.org/#datasets)
153
- - [ ] [MD22](http://www.sgdml.org/#datasets)
 
1
  <div align="center">
2
  <h1>MLIP Arena</h1>
3
+ <a href="https://github.com/atomind-ai/mlip-arena/actions"><img alt="GitHub Actions Workflow Status" src="https://img.shields.io/github/actions/workflow/status/atomind-ai/mlip-arena/test.yaml"></a>
4
  <a href="https://zenodo.org/doi/10.5281/zenodo.13704399"><img src="https://zenodo.org/badge/776930320.svg" alt="DOI"></a>
5
+ <a href="https://huggingface.co/spaces/atomind/mlip-arena"><img src="https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Space-blue" alt="Hugging Face"></a>
6
  <!-- <a href="https://discord.gg/W8WvdQtT8T"><img alt="Discord" src="https://img.shields.io/discord/1299613474820984832?logo=discord"> -->
7
  </a>
8
  </div>
 
108
  > - [Prefect molecular dynamics (MD)](../mlip_arena/tasks/md.py)
109
  > - [Prefect equation of states (EOS)](../mlip_arena/tasks/eos.py)
110
 
111
+ <!-- 1. Follow the task template to implement the task class and upload the script along with metadata to the MLIP Arena [here](../mlip_arena/tasks/README.md).
112
+ 2. Code a benchmark script to evaluate the performance of your model on the task. The script should be able to load the model and the dataset, and output the evaluation metrics. -->
113
 
114
  ### Add new MLIP models
115
 
 
130
  2. Follow the template to code the I/O interface for your model [here](../mlip_arena/models/README.md).
131
  3. Update model [registry](../mlip_arena/models/registry.yaml) with metadata
132
 
133
+ <!-- > [!NOTE]
134
+ > CPU benchmarking will be performed automatically. Due to the limited amount GPU compute, if you would like to be considered for GPU benchmarking, please create a pull request to demonstrate the offline performance of your model (published paper or preprint). We will review and select the models to be benchmarked on GPU. -->
135
 
136
+ <!-- ### Add new datasets
 
 
137
 
138
  The "ultimate" goal is to compile the copies of all the open data in a unified format for lifelong learning with [Hugging Face Auto-Train](https://huggingface.co/docs/hub/webhooks-guide-auto-retrain).
139
 
 
149
  #### Molecular dynamics calculations
150
 
151
  - [ ] [MD17](http://www.sgdml.org/#datasets)
152
+ - [ ] [MD22](http://www.sgdml.org/#datasets) -->
.github/workflows/sync-hf.yaml CHANGED
@@ -3,6 +3,7 @@ name: Sync to Hugging Face hub
3
  on:
4
  workflow_run:
5
  workflows: [Python Test]
 
6
  types: [completed]
7
  workflow_dispatch:
8
 
 
3
  on:
4
  workflow_run:
5
  workflows: [Python Test]
6
+ branches: [main]
7
  types: [completed]
8
  workflow_dispatch:
9
 
.github/workflows/test.yaml CHANGED
@@ -61,4 +61,4 @@ jobs:
61
  PREFECT_API_KEY: ${{ secrets.PREFECT_API_KEY }}
62
  PREFECT_API_URL: ${{ secrets.PREFECT_API_URL }}
63
  run: |
64
- pytest --dist=loadscope -vra tests -n 5
 
61
  PREFECT_API_KEY: ${{ secrets.PREFECT_API_KEY }}
62
  PREFECT_API_URL: ${{ secrets.PREFECT_API_URL }}
63
  run: |
64
+ pytest -vra -n 5 tests
.gitignore CHANGED
@@ -1,11 +1,11 @@
1
  *.out
2
- *.ipynb
3
  *.extxyz
4
  *.traj
5
  mlip_arena/tasks/*/
6
  examples/
7
  lab/
8
  manuscripts/
 
9
 
10
  # Byte-compiled / optimized / DLL files
11
  __pycache__/
 
1
  *.out
 
2
  *.extxyz
3
  *.traj
4
  mlip_arena/tasks/*/
5
  examples/
6
  lab/
7
  manuscripts/
8
+ datasets/
9
 
10
  # Byte-compiled / optimized / DLL files
11
  __pycache__/
mlip_arena/models/externals/mace-mp.py CHANGED
@@ -37,3 +37,33 @@ class MACE_MP_Medium(MACECalculator):
37
  super().__init__(
38
  model_paths=model, device=device, default_dtype=default_dtype, **kwargs
39
  )
 
37
  super().__init__(
38
  model_paths=model, device=device, default_dtype=default_dtype, **kwargs
39
  )
40
+
41
+ class MACE_MPA(MACECalculator):
42
+ def __init__(
43
+ self,
44
+ checkpoint="https://github.com/ACEsuit/mace-mp/releases/download/mace_mpa_0/mace-mpa-0-medium.model",
45
+ device: str | None = None,
46
+ default_dtype="float32",
47
+ **kwargs,
48
+ ):
49
+ cache_dir = Path.home() / ".cache" / "mace"
50
+ checkpoint_url_name = "".join(
51
+ c for c in os.path.basename(checkpoint) if c.isalnum() or c in "_"
52
+ )
53
+ cached_model_path = f"{cache_dir}/{checkpoint_url_name}"
54
+ if not os.path.isfile(cached_model_path):
55
+ import urllib
56
+
57
+ os.makedirs(cache_dir, exist_ok=True)
58
+ _, http_msg = urllib.request.urlretrieve(checkpoint, cached_model_path)
59
+ if "Content-Type: text/html" in http_msg:
60
+ raise RuntimeError(
61
+ f"Model download failed, please check the URL {checkpoint}"
62
+ )
63
+ model = cached_model_path
64
+
65
+ device = device or str(get_freer_device())
66
+
67
+ super().__init__(
68
+ model_paths=model, device=device, default_dtype=default_dtype, **kwargs
69
+ )
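Review note on the `MACE_MPA` download path above. `urllib.request.urlretrieve` returns an `http.client.HTTPMessage`, and the `in` operator on that object matches header names, not header lines, so `"Content-Type: text/html" in http_msg` likely never fires. In addition, `import urllib` on its own does not guarantee that the `urllib.request` submodule is loaded. The following is a minimal sketch of the same caching idea under those observations. The helper names `cache_name`, `looks_like_html` and `fetch_checkpoint` are hypothetical and are not part of mlip-arena or mace.

```python
import os
import urllib.request
from email.message import Message
from pathlib import Path


def cache_name(checkpoint: str) -> str:
    """Mirror the diff: keep only alphanumerics and underscores of the basename."""
    return "".join(c for c in os.path.basename(checkpoint) if c.isalnum() or c in "_")


def looks_like_html(headers: Message) -> bool:
    """True when the response headers declare an HTML body (e.g. an error page)."""
    return headers.get_content_type() == "text/html"


def fetch_checkpoint(checkpoint: str, cache_dir: Path) -> Path:
    """Download `checkpoint` into `cache_dir` once and return the cached path."""
    cached = cache_dir / cache_name(checkpoint)
    if not cached.is_file():
        cache_dir.mkdir(parents=True, exist_ok=True)
        _, headers = urllib.request.urlretrieve(checkpoint, cached)
        if looks_like_html(headers):
            # Assumption: an HTML payload means the download failed, so do not
            # leave it in the cache where it would be mistaken for a model file.
            cached.unlink(missing_ok=True)
            raise RuntimeError(
                f"Model download failed, please check the URL {checkpoint}"
            )
    return cached
```

Nothing here touches the network until `fetch_checkpoint` is called, so the two small helpers can be checked in isolation.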
mlip_arena/models/registry.yaml CHANGED
@@ -2,7 +2,7 @@ MACE-MP(M):
2
  module: externals
3
  class: MACE_MP_Medium
4
  family: mace-mp
5
- package: mace-torch==0.3.4
6
  checkpoint: 2023-12-03-mace-128-L1_epoch-199.model
7
  username: cyrusyc
8
  last-update: 2024-03-25T14:30:00
@@ -60,6 +60,7 @@ M3GNet:
60
  gpu-tasks:
61
  - homonuclear-diatomics
62
  - combustion
 
63
  github: https://github.com/materialsvirtuallab/matgl
64
  doi: https://doi.org/10.1038/s43588-022-00349-3
65
  date: 2022-02-05
@@ -85,6 +86,7 @@ MatterSim:
85
  gpu-tasks:
86
  - homonuclear-diatomics
87
  - stability
 
88
  github: https://github.com/microsoft/mattersim
89
  doi: https://arxiv.org/abs/2405.04967
90
  date: 2024-12-05
@@ -167,6 +169,28 @@ eqV2(OMat):
167
  doi: https://arxiv.org/abs/2410.12771
168
  license: Modified Apache-2.0 (Meta)
169
 
170
 
171
  EquiformerV2(OC22):
172
  module: externals
@@ -237,7 +261,7 @@ MACE-OFF(M):
237
  module: externals
238
  class: MACE_OFF_Medium
239
  family: mace-off
240
- package: mace-torch==0.3.4
241
  checkpoint: MACE-OFF23_medium.model
242
  username: cyrusyc
243
  last-update: 2024-03-25T14:30:00
@@ -272,7 +296,7 @@ ANI2x:
272
  date: 2024-05-23
273
  prediction: EFS
274
  nvt: true
275
- npt: false
276
  license: MIT
277
 
278
  ALIGNN:
 
2
  module: externals
3
  class: MACE_MP_Medium
4
  family: mace-mp
5
+ package: mace-torch==0.3.9
6
  checkpoint: 2023-12-03-mace-128-L1_epoch-199.model
7
  username: cyrusyc
8
  last-update: 2024-03-25T14:30:00
 
60
  gpu-tasks:
61
  - homonuclear-diatomics
62
  - combustion
63
+ - stability
64
  github: https://github.com/materialsvirtuallab/matgl
65
  doi: https://doi.org/10.1038/s43588-022-00349-3
66
  date: 2022-02-05
 
86
  gpu-tasks:
87
  - homonuclear-diatomics
88
  - stability
89
+ - combustion
90
  github: https://github.com/microsoft/mattersim
91
  doi: https://arxiv.org/abs/2405.04967
92
  date: 2024-12-05
 
169
  doi: https://arxiv.org/abs/2410.12771
170
  license: Modified Apache-2.0 (Meta)
171
 
172
+ MACE-MPA:
173
+ module: externals
174
+ class: MACE_MPA
175
+ family: mace-mp
176
+ package: mace-torch==0.3.9
177
+ checkpoint: mace-mpa-0-medium.model
178
+ username:
179
+ last-update: 2025-11-19T00:00:00
180
+ datetime: 2024-12-09T00:00:00 # TODO: Fake datetime
181
+ datasets:
182
+ - MPTrj # TODO: fake HF dataset repo
183
+ - Alexandria
184
+ gpu-tasks:
185
+ - homonuclear-diatomics
186
+ - stability
187
+ github: https://github.com/ACEsuit/mace
188
+ doi:
189
+ date: 2024-12-09
190
+ prediction: EFS
191
+ nvt: true
192
+ npt: true
193
+ license: MIT
194
 
195
  EquiformerV2(OC22):
196
  module: externals
 
261
  module: externals
262
  class: MACE_OFF_Medium
263
  family: mace-off
264
+ package: mace-torch==0.3.9
265
  checkpoint: MACE-OFF23_medium.model
266
  username: cyrusyc
267
  last-update: 2024-03-25T14:30:00
 
296
  date: 2024-05-23
297
  prediction: EFS
298
  nvt: true
299
+ npt: true
300
  license: MIT
301
 
302
  ALIGNN:
mlip_arena/tasks/README.md CHANGED
@@ -1,8 +1,8 @@
1
- ## Note on task registration
2
 
3
  1. Use `ast` to parse task classes from the uploaded script.
4
  2. Add the classes and their supported tasks to the task registry file `registry.yaml`.
5
  3. Run tests on HF Space to ensure the task is working as expected.
6
  4. [Push task script to the Space](https://huggingface.co/docs/huggingface_hub/guides/upload) and sync with github repository.
7
  5. Create task folder in [mlip-arena](https://huggingface.co/datasets/atomind/mlip-arena) HF Dataset.
8
- 6.
 
1
+ <!-- ## Note on task registration
2
 
3
  1. Use `ast` to parse task classes from the uploaded script.
4
  2. Add the classes and their supported tasks to the task registry file `registry.yaml`.
5
  3. Run tests on HF Space to ensure the task is working as expected.
6
  4. [Push task script to the Space](https://huggingface.co/docs/huggingface_hub/guides/upload) and sync with github repository.
7
  5. Create task folder in [mlip-arena](https://huggingface.co/datasets/atomind/mlip-arena) HF Dataset.
8
+ 6. -->
mlip_arena/tasks/combustion/mattersim/hydrogen.json ADDED
@@ -0,0 +1,3 @@
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:00c0c38af5321151ff4a3fc64935df168689030ba31cad0be2589379360b333b
3
+ size 226556
mlip_arena/tasks/combustion/water.ipynb CHANGED
@@ -4,30 +4,23 @@
4
  "cell_type": "code",
5
  "execution_count": null,
6
  "metadata": {},
7
- "outputs": [
8
- {
9
- "name": "stdout",
10
- "output_type": "stream",
11
- "text": [
12
- "No module named 'deepmd'\n"
13
- ]
14
- }
15
- ],
16
  "source": [
17
  "from pathlib import Path\n",
18
  "\n",
19
- "from ase import units, Atoms\n",
20
- "from ase.build import molecule\n",
21
- "from ase.io import read, write\n",
22
  "from dask.distributed import Client\n",
23
  "from dask_jobqueue import SLURMCluster\n",
 
 
 
24
  "from prefect import flow\n",
25
  "from prefect_dask import DaskTaskRunner\n",
26
- "from pymatgen.core import Molecule\n",
27
- "from pymatgen.io.packmol import PackmolBoxGen\n",
28
  "\n",
29
- "from mlip_arena.models import REGISTRY, MLIPEnum\n",
30
- "from mlip_arena.tasks.md import run as MD"
 
 
 
31
  ]
32
  },
33
  {
@@ -42,7 +35,7 @@
42
  },
43
  {
44
  "cell_type": "code",
45
- "execution_count": 3,
46
  "metadata": {},
47
  "outputs": [],
48
  "source": [
@@ -57,7 +50,7 @@
57
  },
58
  {
59
  "cell_type": "code",
60
- "execution_count": 4,
61
  "metadata": {},
62
  "outputs": [],
63
  "source": [
@@ -68,7 +61,7 @@
68
  },
69
  {
70
  "cell_type": "code",
71
- "execution_count": 5,
72
  "metadata": {},
73
  "outputs": [],
74
  "source": [
@@ -88,15 +81,7 @@
88
  "cell_type": "code",
89
  "execution_count": null,
90
  "metadata": {},
91
- "outputs": [
92
- {
93
- "name": "stdout",
94
- "output_type": "stream",
95
- "text": [
96
- "Atoms(symbols='H256O128', pbc=True, cell=[30.0, 30.0, 30.0])\n"
97
- ]
98
- }
99
- ],
100
  "source": [
101
  "tolerance = 2.0\n",
102
  "input_gen = PackmolBoxGen(\n",
@@ -132,19 +117,11 @@
132
  },
133
  {
134
  "cell_type": "code",
135
- "execution_count": 2,
136
  "metadata": {
137
  "tags": []
138
  },
139
- "outputs": [
140
- {
141
- "name": "stdout",
142
- "output_type": "stream",
143
- "text": [
144
- "Atoms(symbols='H256O128', pbc=True, cell=[30.0, 30.0, 30.0])\n"
145
- ]
146
- }
147
- ],
148
  "source": [
149
  "atoms = read(\"H256O128.extxyz\")\n",
150
  "print(atoms)"
@@ -154,40 +131,7 @@
154
  "cell_type": "code",
155
  "execution_count": null,
156
  "metadata": {},
157
- "outputs": [
158
- {
159
- "name": "stdout",
160
- "output_type": "stream",
161
- "text": [
162
- "#!/bin/bash\n",
163
- "\n",
164
- "#SBATCH -A matgen\n",
165
- "#SBATCH --mem=0\n",
166
- "#SBATCH -t 02:00:00\n",
167
- "#SBATCH -J combustion-water\n",
168
- "#SBATCH -q regular\n",
169
- "#SBATCH -N 1\n",
170
- "#SBATCH -C gpu\n",
171
- "#SBATCH -G 4\n",
172
- "#SBATCH --exclusive\n",
173
- "source ~/.bashrc\n",
174
- "module load python\n",
175
- "source activate /pscratch/sd/c/cyrusyc/.conda/mlip-arena\n",
176
- "/pscratch/sd/c/cyrusyc/.conda/mlip-arena/bin/python -m distributed.cli.dask_worker tcp://128.55.64.15:38781 --name dummy-name --nthreads 1 --memory-limit 59.60GiB --nanny --death-timeout 86400\n",
177
- "\n"
178
- ]
179
- },
180
- {
181
- "name": "stderr",
182
- "output_type": "stream",
183
- "text": [
184
- "/pscratch/sd/c/cyrusyc/.conda/mlip-arena/lib/python3.11/site-packages/distributed/node.py:187: UserWarning: Port 8787 is already in use.\n",
185
- "Perhaps you already have a cluster running?\n",
186
- "Hosting the HTTP server on port 44831 instead\n",
187
- " warnings.warn(\n"
188
- ]
189
- }
190
- ],
191
  "source": [
192
  "nodes_per_alloc = 1\n",
193
  "gpus_per_alloc = 4\n",
@@ -197,8 +141,8 @@
197
  " cores=1,\n",
198
  " memory=\"64 GB\",\n",
199
  " shebang=\"#!/bin/bash\",\n",
200
- " account=\"matgen\",\n",
201
- " walltime=\"02:00:00\",\n",
202
  " job_mem=\"0\",\n",
203
  " job_script_prologue=[\n",
204
  " \"source ~/.bashrc\",\n",
@@ -208,7 +152,7 @@
208
  " job_directives_skip=[\"-n\", \"--cpus-per-task\", \"-J\"],\n",
209
  " job_extra_directives=[\n",
210
  " \"-J combustion-water\",\n",
211
- " \"-q regular\",\n",
212
  " f\"-N {nodes_per_alloc}\",\n",
213
  " \"-C gpu\",\n",
214
  " f\"-G {gpus_per_alloc}\",\n",
@@ -221,7 +165,7 @@
221
  "\n",
222
  "\n",
223
  "print(cluster.job_script())\n",
224
- "cluster.adapt(minimum_jobs=2, maximum_jobs=2)\n",
225
  "client = Client(cluster)"
226
  ]
227
  },
@@ -236,18 +180,23 @@
236
  " futures = []\n",
237
  "\n",
238
  " for model in MLIPEnum:\n",
 
 
 
239
  " future = MD.submit(\n",
240
  " atoms=atoms,\n",
241
- " calculator_name=model,\n",
242
- " calculator_kwargs=None,\n",
 
 
243
  " ensemble=\"nvt\",\n",
244
  " dynamics=\"nose-hoover\",\n",
245
  " time_step=None,\n",
246
- " ase_md_kwargs=dict(ttime=25 * units.fs, pfactor=None),\n",
247
  " total_time=1000_000,\n",
248
  " temperature=[300, 3000, 3000, 300],\n",
249
  " pressure=None,\n",
250
- " md_velocity_seed=0,\n",
251
  " traj_file=Path(REGISTRY[model.name][\"family\"])\n",
252
  " / f\"{model.name}_{atoms.get_chemical_formula()}.traj\",\n",
253
  " traj_interval=1000,\n",
@@ -269,6 +218,54 @@
269
  "source": [
270
  "results = combustion(atoms)"
271
  ]
272
  }
273
  ],
274
  "metadata": {
 
4
  "cell_type": "code",
5
  "execution_count": null,
6
  "metadata": {},
7
+ "outputs": [],
8
  "source": [
9
  "from pathlib import Path\n",
10
  "\n",
 
 
 
11
  "from dask.distributed import Client\n",
12
  "from dask_jobqueue import SLURMCluster\n",
13
+ "from mlip_arena.models import REGISTRY, MLIPEnum\n",
14
+ "from mlip_arena.tasks.md import run as MD\n",
15
+ "from mlip_arena.tasks.utils import get_calculator\n",
16
  "from prefect import flow\n",
17
  "from prefect_dask import DaskTaskRunner\n",
 
 
18
  "\n",
19
+ "from ase import Atoms, units\n",
20
+ "from ase.build import molecule\n",
21
+ "from ase.io import read, write\n",
22
+ "from pymatgen.core import Molecule\n",
23
+ "from pymatgen.io.packmol import PackmolBoxGen"
24
  ]
25
  },
26
  {
 
35
  },
36
  {
37
  "cell_type": "code",
38
+ "execution_count": null,
39
  "metadata": {},
40
  "outputs": [],
41
  "source": [
 
50
  },
51
  {
52
  "cell_type": "code",
53
+ "execution_count": null,
54
  "metadata": {},
55
  "outputs": [],
56
  "source": [
 
61
  },
62
  {
63
  "cell_type": "code",
64
+ "execution_count": null,
65
  "metadata": {},
66
  "outputs": [],
67
  "source": [
 
81
  "cell_type": "code",
82
  "execution_count": null,
83
  "metadata": {},
84
+ "outputs": [],
85
  "source": [
86
  "tolerance = 2.0\n",
87
  "input_gen = PackmolBoxGen(\n",
 
117
  },
118
  {
119
  "cell_type": "code",
120
+ "execution_count": null,
121
  "metadata": {
122
  "tags": []
123
  },
124
+ "outputs": [],
125
  "source": [
126
  "atoms = read(\"H256O128.extxyz\")\n",
127
  "print(atoms)"
 
131
  "cell_type": "code",
132
  "execution_count": null,
133
  "metadata": {},
134
+ "outputs": [],
135
  "source": [
136
  "nodes_per_alloc = 1\n",
137
  "gpus_per_alloc = 4\n",
 
141
  " cores=1,\n",
142
  " memory=\"64 GB\",\n",
143
  " shebang=\"#!/bin/bash\",\n",
144
+ " account=\"m4282\",\n",
145
+ " walltime=\"00:30:00\",\n",
146
  " job_mem=\"0\",\n",
147
  " job_script_prologue=[\n",
148
  " \"source ~/.bashrc\",\n",
 
152
  " job_directives_skip=[\"-n\", \"--cpus-per-task\", \"-J\"],\n",
153
  " job_extra_directives=[\n",
154
  " \"-J combustion-water\",\n",
155
+ " \"-q debug\",\n",
156
  " f\"-N {nodes_per_alloc}\",\n",
157
  " \"-C gpu\",\n",
158
  " f\"-G {gpus_per_alloc}\",\n",
 
165
  "\n",
166
  "\n",
167
  "print(cluster.job_script())\n",
168
+ "cluster.adapt(minimum_jobs=1, maximum_jobs=1)\n",
169
  "client = Client(cluster)"
170
  ]
171
  },
 
180
  " futures = []\n",
181
  "\n",
182
  " for model in MLIPEnum:\n",
183
+ " if model.name != \"MatterSim\":\n",
184
+ " continue\n",
185
+ "\n",
186
  " future = MD.submit(\n",
187
  " atoms=atoms,\n",
188
+ " calculator=get_calculator(\n",
189
+ " calculator_name=model,\n",
190
+ " calculator_kwargs=None,\n",
191
+ " ),\n",
192
  " ensemble=\"nvt\",\n",
193
  " dynamics=\"nose-hoover\",\n",
194
  " time_step=None,\n",
195
+ " dynamics_kwargs=dict(ttime=25 * units.fs, pfactor=None),\n",
196
  " total_time=1000_000,\n",
197
  " temperature=[300, 3000, 3000, 300],\n",
198
  " pressure=None,\n",
199
+ " velocity_seed=0,\n",
200
  " traj_file=Path(REGISTRY[model.name][\"family\"])\n",
201
  " / f\"{model.name}_{atoms.get_chemical_formula()}.traj\",\n",
202
  " traj_interval=1000,\n",
 
218
  "source": [
219
  "results = combustion(atoms)"
220
  ]
221
+ },
222
+ {
223
+ "cell_type": "code",
224
+ "execution_count": null,
225
+ "metadata": {},
226
+ "outputs": [],
227
+ "source": [
228
+ "def combustion(atoms: Atoms):\n",
229
+ " futures = []\n",
230
+ "\n",
231
+ " for model in MLIPEnum:\n",
232
+ " if model.name != \"MatterSim\":\n",
233
+ " continue\n",
234
+ "\n",
235
+ " future = MD(\n",
236
+ " atoms=atoms,\n",
237
+ " calculator=get_calculator(\n",
238
+ " calculator_name=model,\n",
239
+ " calculator_kwargs=None,\n",
240
+ " ),\n",
241
+ " ensemble=\"nvt\",\n",
242
+ " dynamics=\"nose-hoover\",\n",
243
+ " time_step=None,\n",
244
+ " dynamics_kwargs=dict(ttime=25 * units.fs, pfactor=None),\n",
245
+ " total_time=1000_000,\n",
246
+ " temperature=[300, 3000, 3000, 300],\n",
247
+ " pressure=None,\n",
248
+ " velocity_seed=0,\n",
249
+ " traj_file=Path(REGISTRY[model.name][\"family\"])\n",
250
+ " / f\"{model.name}_{atoms.get_chemical_formula()}.traj\",\n",
251
+ " traj_interval=1000,\n",
252
+ " restart=True,\n",
253
+ " )\n",
254
+ "\n",
255
+ " futures.append(future)\n",
256
+ "\n",
257
+ " return [future.result() for future in futures]\n",
258
+ "\n",
259
+ "\n",
260
+ "results = combustion(atoms)"
261
+ ]
262
+ },
263
+ {
264
+ "cell_type": "code",
265
+ "execution_count": null,
266
+ "metadata": {},
267
+ "outputs": [],
268
+ "source": []
269
  }
270
  ],
271
  "metadata": {
mlip_arena/tasks/diatomics/ani/homonuclear-diatomics.json CHANGED
The diff for this file is too large to render.
 
mlip_arena/tasks/diatomics/mace-mp/homonuclear-diatomics.json CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:1f6f4a2d4f36071625db988dde933674cdf5478951cf227e6eacc0d818c13a1f
3
- size 1915573
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:9ad34875760232ee25a34b7d6e8a54d75e3b8ddd38efaf5de3aa7a3d6e19474d
3
+ size 3837066
mlip_arena/tasks/diatomics/run.ipynb CHANGED
@@ -14,17 +14,15 @@
14
  "\n",
15
  "import numpy as np\n",
16
  "import pandas as pd\n",
17
- "from ase import Atom, Atoms\n",
18
- "from ase.data import chemical_symbols, covalent_radii, vdw_alvarez\n",
19
- "from ase.io import read, write\n",
20
- "from pymatgen.core import Element\n",
21
  "from scipy import stats\n",
22
- "from scipy.interpolate import splrep, BSpline, CubicSpline, UnivariateSpline\n",
23
  "from tqdm.auto import tqdm\n",
24
  "\n",
25
- "from mlip_arena.models import MLIPEnum, REGISTRY\n",
26
- "\n",
27
- "%matplotlib inline"
 
 
28
  ]
29
  },
30
  {
@@ -48,18 +46,16 @@
48
  "outputs": [],
49
  "source": [
50
  "for model in MLIPEnum:\n",
51
- " \n",
52
  " model_name = model.name\n",
53
- " \n",
54
- " if model_name != 'MatterSim':\n",
55
  " continue\n",
56
- " \n",
57
  " print(f\"========== {model_name} ==========\")\n",
58
  "\n",
59
  " calc = MLIPEnum[model_name].value()\n",
60
  "\n",
61
  " for symbol in tqdm(chemical_symbols[1:]):\n",
62
- "\n",
63
  " s = set([symbol])\n",
64
  "\n",
65
  " if \"X\" in s:\n",
@@ -68,13 +64,14 @@
68
  " try:\n",
69
  " atom = Atom(symbol)\n",
70
  " rmin = 0.9 * covalent_radii[atom.number]\n",
71
- " rvdw = vdw_alvarez.vdw_radii[atom.number] if atom.number < len(vdw_alvarez.vdw_radii) else np.nan\n",
72
  " rmax = 3.1 * rvdw if not np.isnan(rvdw) else 6\n",
73
  " rstep = 0.01\n",
74
- "\n",
75
- " a = 2 * rmax\n",
76
- "\n",
77
- " npts = int((rmax - rmin)/rstep)\n",
78
  "\n",
79
  " rs = np.linspace(rmin, rmax, npts)\n",
80
  " es = np.zeros_like(rs)\n",
@@ -92,15 +89,15 @@
92
  " m = element.valence[1]\n",
93
  " if element.valence == (0, 2):\n",
94
  " m = 0\n",
95
- " except:\n",
96
  " m = 0\n",
97
  "\n",
98
- "\n",
99
  " r = rs[0]\n",
100
  "\n",
101
  " positions = [\n",
102
- " [a/2-r/2, a/2, a/2],\n",
103
- " [a/2+r/2, a/2, a/2],\n",
104
  " ]\n",
105
  "\n",
106
  " traj_fpath = out_dir / f\"{model_name}.extxyz\"\n",
@@ -115,8 +112,8 @@
115
  " da,\n",
116
  " positions=positions,\n",
117
  " # magmoms=magmoms,\n",
118
- " cell=[a, a+0.001, a+0.002],\n",
119
- " pbc=True\n",
120
  " )\n",
121
  "\n",
122
  " print(atoms)\n",
@@ -124,13 +121,12 @@
124
  " atoms.calc = calc\n",
125
  "\n",
126
  " for i, r in enumerate(tqdm(rs)):\n",
127
- "\n",
128
  " if i < skip:\n",
129
  " continue\n",
130
  "\n",
131
  " positions = [\n",
132
- " [a/2-r/2, a/2, a/2],\n",
133
- " [a/2+r/2, a/2, a/2],\n",
134
  " ]\n",
135
  "\n",
136
  " # atoms.set_initial_magnetic_moments(magmoms)\n",
@@ -162,48 +158,47 @@
162
  },
163
  "outputs": [],
164
  "source": [
165
- "\n",
166
- "\n",
167
  "for model in MLIPEnum:\n",
168
- " \n",
169
  " model_name = model.name\n",
170
- " \n",
171
  " # if model_name != \"MatterSim\":\n",
172
  " # continue\n",
173
  "\n",
174
  " print(f\"========== {model_name} ==========\")\n",
175
- " \n",
176
- " df = pd.DataFrame(columns=[\n",
177
- " \"name\", \n",
178
- " \"method\", \n",
179
- " \"R\", \"E\", \"F\", \"S^2\",\n",
180
- " \"force-flip-times\",\n",
181
- " \"force-total-variation\",\n",
182
- " \"force-jump\",\n",
183
- " \"energy-diff-flip-times\",\n",
184
- " \"energy-grad-norm-max\",\n",
185
- " \"energy-jump\",\n",
186
- " \"energy-total-variation\",\n",
187
- " \"tortuosity\",\n",
188
- " \"conservation-deviation\",\n",
189
- " \"spearman-descending-force\",\n",
190
- " \"spearman-ascending-force\",\n",
191
- " \"spearman-repulsion-energy\",\n",
192
- " \"spearman-attraction-energy\",\n",
193
- " \"pbe-energy-mae\",\n",
194
- " \"pbe-force-mae\"\n",
195
- " ])\n",
196
- " \n",
197
  "\n",
198
- " for symbol in tqdm(chemical_symbols[1:]):\n",
199
  "\n",
 
200
  " da = symbol + symbol\n",
201
  "\n",
202
  " out_dir = Path(REGISTRY[model_name][\"family\"]) / da\n",
203
  "\n",
204
  " traj_fpath = out_dir / f\"{model_name}.extxyz\"\n",
205
  "\n",
206
- "\n",
207
  " if traj_fpath.exists():\n",
208
  " traj = read(traj_fpath, index=\":\")\n",
209
  " else:\n",
@@ -211,11 +206,10 @@
211
  "\n",
212
  " Rs, Es, Fs, S2s = [], [], [], []\n",
213
  " for atoms in traj:\n",
214
- "\n",
215
  " vec = atoms.positions[1] - atoms.positions[0]\n",
216
  " r = np.linalg.norm(vec)\n",
217
  " e = atoms.get_potential_energy()\n",
218
- " f = np.inner(vec/r, atoms.get_forces()[1])\n",
219
  " # s2 = np.mean(np.power(atoms.get_magnetic_moments(), 2))\n",
220
  "\n",
221
  " Rs.append(r)\n",
@@ -243,33 +237,36 @@
243
  "\n",
244
  " # avoid numerical sensitity close to zero\n",
245
  " rounded_fs = np.copy(fs)\n",
246
- " rounded_fs[np.abs(rounded_fs) < 1e-2] = 0 # 10meV/A\n",
247
  " fs_sign = np.sign(rounded_fs)\n",
248
  " mask = fs_sign != 0\n",
249
  " rounded_fs = rounded_fs[mask]\n",
250
  " fs_sign = fs_sign[mask]\n",
251
  " f_flip = np.diff(fs_sign) != 0\n",
252
- " \n",
253
  " fdiff = np.diff(fs)\n",
254
  " fdiff_sign = np.sign(fdiff)\n",
255
  " mask = fdiff_sign != 0\n",
256
  " fdiff = fdiff[mask]\n",
257
  " fdiff_sign = fdiff_sign[mask]\n",
258
  " fdiff_flip = np.diff(fdiff_sign) != 0\n",
259
- " fjump = np.abs(fdiff[:-1][fdiff_flip]).sum() + np.abs(fdiff[1:][fdiff_flip]).sum()\n",
260
- " \n",
 
261
  "\n",
262
  " ediff = np.diff(es)\n",
263
- " ediff[np.abs(ediff) < 1e-3] = 0 # 1meV\n",
264
  " ediff_sign = np.sign(ediff)\n",
265
  " mask = ediff_sign != 0\n",
266
  " ediff = ediff[mask]\n",
267
  " ediff_sign = ediff_sign[mask]\n",
268
  " ediff_flip = np.diff(ediff_sign) != 0\n",
269
- " ejump = np.abs(ediff[:-1][ediff_flip]).sum() + np.abs(ediff[1:][ediff_flip]).sum()\n",
270
- " \n",
 
 
271
  " try:\n",
272
- " pbe_traj = read(f'./vasp/{da}/PBE.extxyz', index=\":\")\n",
273
  "\n",
274
  " pbe_rs, pbe_es, pbe_fs = [], [], []\n",
275
  "\n",
@@ -278,7 +275,7 @@
278
  " r = np.linalg.norm(vec)\n",
279
  " pbe_rs.append(r)\n",
280
  " pbe_es.append(atoms.get_potential_energy())\n",
281
- " pbe_fs.append(np.inner(vec/r, atoms.get_forces()[1]))\n",
282
  "\n",
283
  " pbe_rs = np.array(pbe_rs)\n",
284
  " pbe_es = np.array(pbe_es)\n",
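Review note on the flip and jump metrics in the hunk above. The notebook thresholds small values to zero, drops zero-sign entries, and then looks at where the sign changes. The sketch below isolates that logic with NumPy only. The function names are illustrative, and treating the flip count as the sum of the boolean flip mask is an assumption, since the diff does not show the line that stores it.

```python
import numpy as np


def count_sign_flips(values, threshold):
    """Count sign changes after zeroing entries smaller than `threshold`."""
    rounded = np.array(values, dtype=float)
    rounded[np.abs(rounded) < threshold] = 0  # avoid noise close to zero
    signs = np.sign(rounded)
    signs = signs[signs != 0]  # drop the zeroed entries
    return int(np.sum(np.diff(signs) != 0))


def jump(values):
    """Sum |step| on both sides of every turning point of `values`."""
    diff = np.diff(np.asarray(values, dtype=float))
    sign = np.sign(diff)
    mask = sign != 0
    diff, sign = diff[mask], sign[mask]
    flip = np.diff(sign) != 0  # True where consecutive steps change direction
    return float(np.abs(diff[:-1][flip]).sum() + np.abs(diff[1:][flip]).sum())


# A force curve that crosses zero once, with one value below the 10 meV/A cutoff.
forces = [5.0, 1.0, 0.005, -1.0, -0.5, -0.1]
print(count_sign_flips(forces, threshold=1e-2))

# A curve with two turning points (at 3.0 and at 2.0).
print(jump([0.0, 1.0, 3.0, 2.0, 2.5]))
```

A smooth single-well curve has one force sign flip, so larger counts or larger jump values point to non-physical wiggles in the predicted curve.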
@@ -302,43 +299,9 @@
302
  " print(e)\n",
303
  " pbe_energy_mae = None\n",
304
  " pbe_force_mae = None\n",
305
- " \n",
306
- " \n",
307
- "# edged_es = np.convolve(es, [1, -2, 1], mode='valid')\n",
308
- "# # edged_es[np.abs(edged_es) < 0.1] = 0\n",
309
- "# prob = np.exp(-es[1:-1]) / np.sum(np.exp(-es[1:-1]))\n",
310
- "# edged_es *= prob\n",
311
- "# # edged_es /= np.abs(es[1:-1])\n",
312
- "# ejump = np.linalg.norm(edged_es)\n",
313
- "# ejump = np.abs(edged_es).sum() / 2.0\n",
314
- " \n",
315
- "# edged_fs = np.convolve(fs, [1, -2, 1], mode='valid')\n",
316
- "# # edged_fs[np.abs(edged_fs) < 0.1] = 0\n",
317
- "# edged_fs *= prob\n",
318
- "# fjump = np.linalg.norm(edged_fs)\n",
319
- " # fjump = np.abs(edged_fs).sum() / 2.0\n",
320
- " \n",
321
- "# fig, axes = plt.subplot_mosaic(\n",
322
- "# \"\"\"\n",
323
- "# ac\n",
324
- "# bd\n",
325
- "# \"\"\",\n",
326
- "# constrained_layout=True\n",
327
- "# )\n",
328
- " \n",
329
- "\n",
330
- "# axes['a'].plot(rs, es)\n",
331
- "# axes['b'].plot(rs[1:-1], edged_es)\n",
332
- "# # axes['b'].plot(0.5*(rs[1:] + rs[:-1]), np.diff(es))\n",
333
- "# axes['b'].text(0.7, 0.7, f\"{ejump:.3e}\", transform=axes['b'].transAxes)\n",
334
- " \n",
335
- "# axes['c'].plot(rs, fs)\n",
336
- "# axes['d'].plot(rs[1:-1], edged_fs)\n",
337
- "# axes['d'].text(0.7, 0.7, f\"{fjump:.3e}\", transform=axes['d'].transAxes)\n",
338
- " \n",
339
  "\n",
340
  " conservation_deviation = np.mean(np.abs(fs + de_dr))\n",
341
- " \n",
342
  " etv = np.sum(np.abs(np.diff(es)))\n",
343
  "\n",
344
  " data = {\n",
@@ -358,12 +321,20 @@
358
  " \"energy-total-variation\": etv,\n",
359
  " \"tortuosity\": etv / (abs(es[0] - es.min()) + (es[-1] - es.min())),\n",
360
  " \"conservation-deviation\": conservation_deviation,\n",
361
- " \"spearman-descending-force\": stats.spearmanr(rs[iminf:], fs[iminf:]).statistic,\n",
362
- " \"spearman-ascending-force\": stats.spearmanr(rs[:iminf], fs[:iminf]).statistic,\n",
363
- " \"spearman-repulsion-energy\": stats.spearmanr(rs[imine:], es[imine:]).statistic,\n",
364
- " \"spearman-attraction-energy\": stats.spearmanr(rs[:imine], es[:imine]).statistic,\n",
365
  " \"pbe-energy-mae\": pbe_energy_mae,\n",
366
- " \"pbe-force-mae\": pbe_force_mae\n",
367
  " }\n",
368
  "\n",
369
  " df = pd.concat([df, pd.DataFrame([data])], ignore_index=True)\n",
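Review note on the `tortuosity` entry above. The expression divides the total variation of the energy curve by the descent from the first point to the minimum plus the ascent from the minimum to the last point. For a curve with a single well the two are equal and the ratio is 1, and extra bumps push it above 1. A small worked sketch follows. The function name is illustrative and the energies are made-up numbers, not benchmark data.

```python
import numpy as np


def tortuosity(es):
    """Total variation divided by the ideal descent plus ascent, as in the diff."""
    es = np.asarray(es, dtype=float)
    etv = np.sum(np.abs(np.diff(es)))  # energy total variation
    return float(etv / (abs(es[0] - es.min()) + (es[-1] - es.min())))


single_well = [4.0, 1.0, 0.0, 0.5, 1.0]  # monotone down, then monotone up
bumpy_well = [4.0, 1.0, 2.0, 0.0, 1.0]   # extra bump before the minimum

print(tortuosity(single_well))
print(tortuosity(bumpy_well))
```

Note that the denominator vanishes if the first and last energies both equal the minimum, so a perfectly flat curve would need separate handling.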
@@ -373,37 +344,17 @@
373
  " if json_fpath.exists():\n",
374
  " df0 = pd.read_json(json_fpath)\n",
375
  " df = pd.concat([df0, df], ignore_index=True)\n",
376
- " df.drop_duplicates(inplace=True, subset=[\"name\", \"method\"], keep='last')\n",
377
  "\n",
378
  " df.to_json(json_fpath, orient=\"records\")"
379
  ]
380
- },
381
- {
382
- "cell_type": "code",
383
- "execution_count": null,
384
- "id": "e0dd4367-3dca-440f-a7a9-7fdd84183f2c",
385
- "metadata": {
386
- "tags": []
387
- },
388
- "outputs": [],
389
- "source": [
390
- "df"
391
- ]
392
- },
393
- {
394
- "cell_type": "code",
395
- "execution_count": null,
396
- "id": "4e6ae884-89f3-43f2-8fd9-19bf00c91566",
397
- "metadata": {},
398
- "outputs": [],
399
- "source": []
400
  }
401
  ],
402
  "metadata": {
403
  "kernelspec": {
404
- "display_name": "mlip-arena",
405
  "language": "python",
406
- "name": "mlip-arena"
407
  },
408
  "language_info": {
409
  "codemirror_mode": {
 
14
  "\n",
15
  "import numpy as np\n",
16
  "import pandas as pd\n",
17
  "from scipy import stats\n",
18
+ "from scipy.interpolate import UnivariateSpline\n",
19
  "from tqdm.auto import tqdm\n",
20
  "\n",
21
+ "from ase import Atom, Atoms\n",
22
+ "from ase.data import chemical_symbols, covalent_radii, vdw_alvarez\n",
23
+ "from ase.io import read, write\n",
24
+ "from mlip_arena.models import REGISTRY, MLIPEnum\n",
25
+ "from pymatgen.core import Element"
26
  ]
27
  },
28
  {
 
46
  "outputs": [],
47
  "source": [
48
  "for model in MLIPEnum:\n",
 
49
  " model_name = model.name\n",
50
+ "\n",
51
+ " if model_name != \"MACE-MPA\":\n",
52
  " continue\n",
53
+ "\n",
54
  " print(f\"========== {model_name} ==========\")\n",
55
  "\n",
56
  " calc = MLIPEnum[model_name].value()\n",
57
  "\n",
58
  " for symbol in tqdm(chemical_symbols[1:]):\n",
 
59
  " s = set([symbol])\n",
60
  "\n",
61
  " if \"X\" in s:\n",
 
64
  " try:\n",
65
  " atom = Atom(symbol)\n",
66
  " rmin = 0.9 * covalent_radii[atom.number]\n",
67
+ " rvdw = (\n",
68
+ " vdw_alvarez.vdw_radii[atom.number]\n",
69
+ " if atom.number < len(vdw_alvarez.vdw_radii)\n",
70
+ " else np.nan\n",
71
+ " )\n",
72
  " rmax = 3.1 * rvdw if not np.isnan(rvdw) else 6\n",
73
  " rstep = 0.01\n",
74
+ " npts = int((rmax - rmin) / rstep)\n",
 
 
 
75
  "\n",
76
  " rs = np.linspace(rmin, rmax, npts)\n",
77
  " es = np.zeros_like(rs)\n",
 
89
  " m = element.valence[1]\n",
90
  " if element.valence == (0, 2):\n",
91
  " m = 0\n",
92
+ " except Exception:\n",
93
  " m = 0\n",
94
  "\n",
95
+ " a = 2 * rmax\n",
96
  " r = rs[0]\n",
97
  "\n",
98
  " positions = [\n",
99
+ " [a / 2 - r / 2, a / 2, a / 2],\n",
100
+ " [a / 2 + r / 2, a / 2, a / 2],\n",
101
  " ]\n",
102
  "\n",
103
  " traj_fpath = out_dir / f\"{model_name}.extxyz\"\n",
 
112
  " da,\n",
113
  " positions=positions,\n",
114
  " # magmoms=magmoms,\n",
115
+ " cell=[a, a + 0.001, a + 0.002],\n",
116
+ " pbc=True,\n",
117
  " )\n",
118
  "\n",
119
  " print(atoms)\n",
 
121
  " atoms.calc = calc\n",
122
  "\n",
123
  " for i, r in enumerate(tqdm(rs)):\n",
 
124
  " if i < skip:\n",
125
  " continue\n",
126
  "\n",
127
  " positions = [\n",
128
+ " [a / 2 - r / 2, a / 2, a / 2],\n",
129
+ " [a / 2 + r / 2, a / 2, a / 2],\n",
130
  " ]\n",
131
  "\n",
132
  " # atoms.set_initial_magnetic_moments(magmoms)\n",
 
158
  },
159
  "outputs": [],
160
  "source": [
 
 
161
  "for model in MLIPEnum:\n",
 
162
  " model_name = model.name\n",
163
+ "\n",
164
  " # if model_name != \"MatterSim\":\n",
165
  " # continue\n",
166
  "\n",
167
  " print(f\"========== {model_name} ==========\")\n",
168
  "\n",
169
+ " df = pd.DataFrame(\n",
170
+ " columns=[\n",
171
+ " \"name\",\n",
172
+ " \"method\",\n",
173
+ " \"R\",\n",
174
+ " \"E\",\n",
175
+ " \"F\",\n",
176
+ " \"S^2\",\n",
177
+ " \"force-flip-times\",\n",
178
+ " \"force-total-variation\",\n",
179
+ " \"force-jump\",\n",
180
+ " \"energy-diff-flip-times\",\n",
181
+ " \"energy-grad-norm-max\",\n",
182
+ " \"energy-jump\",\n",
183
+ " \"energy-total-variation\",\n",
184
+ " \"tortuosity\",\n",
185
+ " \"conservation-deviation\",\n",
186
+ " \"spearman-descending-force\",\n",
187
+ " \"spearman-ascending-force\",\n",
188
+ " \"spearman-repulsion-energy\",\n",
189
+ " \"spearman-attraction-energy\",\n",
190
+ " \"pbe-energy-mae\",\n",
191
+ " \"pbe-force-mae\",\n",
192
+ " ]\n",
193
+ " )\n",
194
  "\n",
195
+ " for symbol in tqdm(chemical_symbols[1:]):\n",
196
  " da = symbol + symbol\n",
197
  "\n",
198
  " out_dir = Path(REGISTRY[model_name][\"family\"]) / da\n",
199
  "\n",
200
  " traj_fpath = out_dir / f\"{model_name}.extxyz\"\n",
201
  "\n",
 
202
  " if traj_fpath.exists():\n",
203
  " traj = read(traj_fpath, index=\":\")\n",
204
  " else:\n",
 
206
  "\n",
207
  " Rs, Es, Fs, S2s = [], [], [], []\n",
208
  " for atoms in traj:\n",
 
209
  " vec = atoms.positions[1] - atoms.positions[0]\n",
210
  " r = np.linalg.norm(vec)\n",
211
  " e = atoms.get_potential_energy()\n",
212
+ " f = np.inner(vec / r, atoms.get_forces()[1])\n",
213
  " # s2 = np.mean(np.power(atoms.get_magnetic_moments(), 2))\n",
214
  "\n",
215
  " Rs.append(r)\n",
 
237
  "\n",
238
  "        # avoid numerical sensitivity close to zero\n",
239
  " rounded_fs = np.copy(fs)\n",
240
+ " rounded_fs[np.abs(rounded_fs) < 1e-2] = 0 # 10meV/A\n",
241
  " fs_sign = np.sign(rounded_fs)\n",
242
  " mask = fs_sign != 0\n",
243
  " rounded_fs = rounded_fs[mask]\n",
244
  " fs_sign = fs_sign[mask]\n",
245
  " f_flip = np.diff(fs_sign) != 0\n",
246
+ "\n",
247
  " fdiff = np.diff(fs)\n",
248
  " fdiff_sign = np.sign(fdiff)\n",
249
  " mask = fdiff_sign != 0\n",
250
  " fdiff = fdiff[mask]\n",
251
  " fdiff_sign = fdiff_sign[mask]\n",
252
  " fdiff_flip = np.diff(fdiff_sign) != 0\n",
253
+ " fjump = (\n",
254
+ " np.abs(fdiff[:-1][fdiff_flip]).sum() + np.abs(fdiff[1:][fdiff_flip]).sum()\n",
255
+ " )\n",
256
  "\n",
257
  " ediff = np.diff(es)\n",
258
+ " ediff[np.abs(ediff) < 1e-3] = 0 # 1meV\n",
259
  " ediff_sign = np.sign(ediff)\n",
260
  " mask = ediff_sign != 0\n",
261
  " ediff = ediff[mask]\n",
262
  " ediff_sign = ediff_sign[mask]\n",
263
  " ediff_flip = np.diff(ediff_sign) != 0\n",
264
+ " ejump = (\n",
265
+ " np.abs(ediff[:-1][ediff_flip]).sum() + np.abs(ediff[1:][ediff_flip]).sum()\n",
266
+ " )\n",
267
+ "\n",
268
  " try:\n",
269
+ " pbe_traj = read(f\"./vasp/{da}/PBE.extxyz\", index=\":\")\n",
270
  "\n",
271
  " pbe_rs, pbe_es, pbe_fs = [], [], []\n",
272
  "\n",
 
275
  " r = np.linalg.norm(vec)\n",
276
  " pbe_rs.append(r)\n",
277
  " pbe_es.append(atoms.get_potential_energy())\n",
278
+ " pbe_fs.append(np.inner(vec / r, atoms.get_forces()[1]))\n",
279
  "\n",
280
  " pbe_rs = np.array(pbe_rs)\n",
281
  " pbe_es = np.array(pbe_es)\n",
 
299
  " print(e)\n",
300
  " pbe_energy_mae = None\n",
301
  " pbe_force_mae = None\n",
302
  "\n",
303
  " conservation_deviation = np.mean(np.abs(fs + de_dr))\n",
304
+ "\n",
305
  " etv = np.sum(np.abs(np.diff(es)))\n",
306
  "\n",
307
  " data = {\n",
 
321
  " \"energy-total-variation\": etv,\n",
322
  " \"tortuosity\": etv / (abs(es[0] - es.min()) + (es[-1] - es.min())),\n",
323
  " \"conservation-deviation\": conservation_deviation,\n",
324
+ " \"spearman-descending-force\": stats.spearmanr(\n",
325
+ " rs[iminf:], fs[iminf:]\n",
326
+ " ).statistic,\n",
327
+ " \"spearman-ascending-force\": stats.spearmanr(\n",
328
+ " rs[:iminf], fs[:iminf]\n",
329
+ " ).statistic,\n",
330
+ " \"spearman-repulsion-energy\": stats.spearmanr(\n",
331
+ " rs[imine:], es[imine:]\n",
332
+ " ).statistic,\n",
333
+ " \"spearman-attraction-energy\": stats.spearmanr(\n",
334
+ " rs[:imine], es[:imine]\n",
335
+ " ).statistic,\n",
336
  " \"pbe-energy-mae\": pbe_energy_mae,\n",
337
+ " \"pbe-force-mae\": pbe_force_mae,\n",
338
  " }\n",
339
  "\n",
340
  " df = pd.concat([df, pd.DataFrame([data])], ignore_index=True)\n",
 
344
  " if json_fpath.exists():\n",
345
  " df0 = pd.read_json(json_fpath)\n",
346
  " df = pd.concat([df0, df], ignore_index=True)\n",
347
+ " df.drop_duplicates(inplace=True, subset=[\"name\", \"method\"], keep=\"last\")\n",
348
  "\n",
349
  " df.to_json(json_fpath, orient=\"records\")"
350
  ]
351
  }
352
  ],
353
  "metadata": {
354
  "kernelspec": {
355
+ "display_name": "Python 3",
356
  "language": "python",
357
+ "name": "python3"
358
  },
359
  "language_info": {
360
  "codemirror_mode": {
mlip_arena/tasks/md.py CHANGED
@@ -143,7 +143,7 @@ def _get_ensemble_schedule(
143
  isinstance(pressure, np.ndarray) and pressure.ndim == 1
144
  ):
145
  p_schedule = _interpolate_quantity(pressure, n_steps)
146
- elif isinstance(pressure, np.ndarray) and pressure.ndim == 4:
147
  p_schedule = interp1d(np.arange(n_steps + 1), pressure, kind="linear")
148
  assert isinstance(p_schedule, np.ndarray)
149
  else:
 
143
  isinstance(pressure, np.ndarray) and pressure.ndim == 1
144
  ):
145
  p_schedule = _interpolate_quantity(pressure, n_steps)
146
+ elif isinstance(pressure, np.ndarray) and pressure.ndim == 3:
147
  p_schedule = interp1d(np.arange(n_steps + 1), pressure, kind="linear")
148
  assert isinstance(p_schedule, np.ndarray)
149
  else:
pyproject.toml CHANGED
@@ -63,7 +63,8 @@ test = [
63
  "pytest",
64
  "pytest-xdist",
65
  "prefect==3.1.11",
66
- "pymatgen>=2025.1.9"
 
67
  ]
68
  mace = [
69
  "mace-torch==0.3.9",
 
63
  "pytest",
64
  "pytest-xdist",
65
  "prefect==3.1.11",
66
+ "pymatgen>=2025.1.9",
67
+ "streamlit==1.38.0"
68
  ]
69
  mace = [
70
  "mace-torch==0.3.9",
serve/leaderboard.py CHANGED
@@ -119,16 +119,15 @@ for task in TASKS:
119
 
120
  # Call the function from the imported module
121
  if hasattr(task_module, "render"):
122
  task_module.render()
123
  # if st.button(f"Go to task page"):
124
  # st.switch_page(f"tasks/{TASKS[task]['task-page']}.py")
125
  else:
126
  st.write(
127
- "Rank metrics are not available yet but the task has been implemented. Please see the following task page for more information."
128
  )
129
-
130
- st.page_link(
131
- f"tasks/{TASKS[task]['task-page']}.py",
132
- label="Go to the associated task page",
133
- icon=":material/link:",
134
- )
 
119
 
120
  # Call the function from the imported module
121
  if hasattr(task_module, "render"):
122
+ st.page_link(
123
+ f"tasks/{TASKS[task]['task-page']}.py",
124
+ label="Go to the associated task page",
125
+ icon=":material/link:",
126
+ )
127
  task_module.render()
128
  # if st.button(f"Go to task page"):
129
  # st.switch_page(f"tasks/{TASKS[task]['task-page']}.py")
130
  else:
131
  st.write(
132
+ "Rank metrics are not available yet but the task has been implemented. Please see the task page for more information."
133
  )
serve/ranks/homonuclear-diatomics.py CHANGED
@@ -173,3 +173,4 @@ def render():
173
  - **Force flips**: The number of force direction changes.
174
  """
175
  )
 
 
173
  - **Force flips**: The number of force direction changes.
174
  """
175
  )
176
+ st.info('PBE energies and forces are provided __only__ for reference. Because plane-wave DFT has known convergence issues for diatomic molecules, and because the models may have been trained on different datasets, comparing models against PBE is not rigorous, so these metrics are excluded from rank aggregation.', icon=":material/warning:")
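The "Force flips" metric described in the rank page above, together with the total-variation and tortuosity metrics computed in the notebook cell earlier in this diff, can be sketched in isolation. This is a minimal pure-Python sketch that mirrors the notebook's NumPy logic rather than reproducing it; the function names `count_sign_flips`, `total_variation`, and `tortuosity` are hypothetical, and the threshold of `1e-2` is taken from the notebook's force rounding (10 meV/Å).

```python
def count_sign_flips(values, threshold=0.0):
    """Count sign changes, ignoring entries whose magnitude is below threshold."""
    # Entries treated as zero are dropped first, so a near-zero value
    # between two values of the same sign does not register as a flip.
    signs = [1 if v > 0 else -1 for v in values if v != 0 and abs(v) >= threshold]
    return sum(1 for a, b in zip(signs, signs[1:]) if a != b)


def total_variation(values):
    """Sum of absolute differences between consecutive samples."""
    return sum(abs(b - a) for a, b in zip(values, values[1:]))


def tortuosity(energies):
    """Total variation divided by the variation of an ideal single-well curve."""
    e_min = min(energies)
    ideal = abs(energies[0] - e_min) + (energies[-1] - e_min)
    return total_variation(energies) / ideal


# Forces along a scan: two entries are below the 1e-2 threshold and are ignored.
forces = [5.0, 2.0, 0.005, -1.0, -0.5, 0.001, 0.3]
print(count_sign_flips(forces, threshold=1e-2))  # 2

# A single-well curve goes down once and up once, so its tortuosity is 1.
single_well = [3.0, 1.0, 0.0, 1.0, 2.0]
print(tortuosity(single_well))  # 1.0

# An extra bump adds variation without changing the end points or minimum.
wiggly = [3.0, 1.0, 2.0, 0.0, 1.0]
print(tortuosity(wiggly))  # 1.5
```

A tortuosity of 1 corresponds to a curve with a single minimum and no spurious wiggles; larger values indicate extra oscillations. The notebook applies the same flip counting to thresholded energy differences (1 meV) for its `energy-diff-flip-times` column.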
serve/tasks/combustion.py CHANGED
@@ -6,7 +6,6 @@ import plotly.colors as pcolors
6
  import plotly.express as px
7
  import plotly.graph_objects as go
8
  import streamlit as st
9
-
10
  from mlip_arena.models import REGISTRY as MODELS
11
 
12
  DATA_DIR = Path("mlip_arena/tasks/combustion")
@@ -36,6 +35,7 @@ models = container.multiselect(
36
  "ORBv2",
37
  "EquiformerV2(OC20)",
38
  "eSCN(OC20)",
 
39
  ],
40
  )
41
 
@@ -64,7 +64,9 @@ if not models:
64
  def get_data(models):
65
  # List comprehension for concise looping and filtering
66
  dfs = [
67
- pd.read_json(DATA_DIR / MODELS[str(model)]["family"].lower() / "hydrogen.json")[lambda df: df["method"] == model]
 
 
68
  for model in models
69
  ]
70
  # Concatenate all filtered DataFrames
@@ -177,8 +179,8 @@ st.plotly_chart(fig)
177
 
178
  # Energy
179
 
180
- exp_ref = -68.3078 # kcal/mol
181
- factor = 23.0609
182
  nh2os = 128
183
 
184
  fig = go.Figure()
@@ -205,10 +207,12 @@ target_steps = df["target_steps"].iloc[0]
205
  fig.add_shape(
206
  go.layout.Shape(
207
  type="line",
208
- x0=0, x1=target_steps,
209
- y0=exp_ref, y1=exp_ref, # y-values for the horizontal line
 
 
210
  line=dict(color="Red", width=2, dash="dash"),
211
- layer="below"
212
  )
213
  )
214
 
@@ -281,28 +285,36 @@ st.plotly_chart(fig)
281
  fig = go.Figure()
282
 
283
 
284
- df["reaction_energy"] = df["energies"].apply(lambda x: x[-1] - x[0]) / nh2os * factor # kcal/mol
 
 
285
 
286
  df["reaction_energy_abs_err"] = np.abs(df["reaction_energy"] - exp_ref)
287
 
288
  df.sort_values("reaction_energy_abs_err", inplace=True)
289
 
290
- fig.add_traces([
291
- go.Bar(
292
- x=df["method"],
293
- y=df["reaction_energy"],
294
- marker=dict(color=[method_color_mapping[method] for method in df["method"]]),
295
- text=[f"{y:.2f}" for y in df["reaction_energy"]],
296
- ),
297
- ])
 
 
 
 
298
 
299
  fig.add_shape(
300
  go.layout.Shape(
301
  type="line",
302
- x0=-0.5, x1=len(df["method"]) - 0.5, # range covering the bars
303
- y0=exp_ref, y1=exp_ref, # y-values for the horizontal line
 
 
304
  line=dict(color="Red", width=2, dash="dash"),
305
- layer="below"
306
  )
307
  )
308
 
@@ -356,7 +368,7 @@ fig.add_trace(
356
  fig.update_layout(
357
  title="Reaction yield (2H2 + O2 -> 2H2O, 64 units)",
358
  xaxis_title="Yield (%)",
359
- yaxis_title="Method"
360
  )
361
 
362
  st.plotly_chart(fig)
@@ -433,7 +445,6 @@ for method in df_exploded["method"].unique():
433
  ),
434
  marker=dict(color=method_color_mapping[method], size=3),
435
  showlegend=True,
436
-
437
  ),
438
  )
439
 
@@ -564,5 +575,4 @@ st.markdown("""
564
  [1] Hasche, A., Navid, A., Krause, H., & Eckart, S. (2023). Experimental and numerical assessment of the effects of hydrogen admixtures on premixed methane-oxygen flames. Fuel, 352, 128964.
565
 
566
  [2] Lide, D. R. (Ed.). (2004). CRC handbook of chemistry and physics (Vol. 85). CRC press.
567
- """
568
- )
 
6
  import plotly.express as px
7
  import plotly.graph_objects as go
8
  import streamlit as st
 
9
  from mlip_arena.models import REGISTRY as MODELS
10
 
11
  DATA_DIR = Path("mlip_arena/tasks/combustion")
 
35
  "ORBv2",
36
  "EquiformerV2(OC20)",
37
  "eSCN(OC20)",
38
+ "MatterSim",
39
  ],
40
  )
41
 
 
64
  def get_data(models):
65
  # List comprehension for concise looping and filtering
66
  dfs = [
67
+ pd.read_json(DATA_DIR / MODELS[str(model)]["family"].lower() / "hydrogen.json")[
68
+ lambda df: df["method"] == model
69
+ ]
70
  for model in models
71
  ]
72
  # Concatenate all filtered DataFrames
 
179
 
180
  # Energy
181
 
182
+ exp_ref = -68.3078 # kcal/mol
183
+ factor = 23.0609
184
  nh2os = 128
185
 
186
  fig = go.Figure()
 
207
  fig.add_shape(
208
  go.layout.Shape(
209
  type="line",
210
+ x0=0,
211
+ x1=target_steps,
212
+ y0=exp_ref,
213
+ y1=exp_ref, # y-values for the horizontal line
214
  line=dict(color="Red", width=2, dash="dash"),
215
+ layer="below",
216
  )
217
  )
218
 
 
285
  fig = go.Figure()
286
 
287
 
288
+ df["reaction_energy"] = (
289
+ df["energies"].apply(lambda x: x[-1] - x[0]) / nh2os * factor
290
+ ) # kcal/mol
291
 
292
  df["reaction_energy_abs_err"] = np.abs(df["reaction_energy"] - exp_ref)
293
 
294
  df.sort_values("reaction_energy_abs_err", inplace=True)
295
 
296
+ fig.add_traces(
297
+ [
298
+ go.Bar(
299
+ x=df["method"],
300
+ y=df["reaction_energy"],
301
+ marker=dict(
302
+ color=[method_color_mapping[method] for method in df["method"]]
303
+ ),
304
+ text=[f"{y:.2f}" for y in df["reaction_energy"]],
305
+ ),
306
+ ]
307
+ )
308
 
309
  fig.add_shape(
310
  go.layout.Shape(
311
  type="line",
312
+ x0=-0.5,
313
+ x1=len(df["method"]) - 0.5, # range covering the bars
314
+ y0=exp_ref,
315
+ y1=exp_ref, # y-values for the horizontal line
316
  line=dict(color="Red", width=2, dash="dash"),
317
+ layer="below",
318
  )
319
  )
320
 
 
368
  fig.update_layout(
369
  title="Reaction yield (2H2 + O2 -> 2H2O, 64 units)",
370
  xaxis_title="Yield (%)",
371
+ yaxis_title="Method",
372
  )
373
 
374
  st.plotly_chart(fig)
 
445
  ),
446
  marker=dict(color=method_color_mapping[method], size=3),
447
  showlegend=True,
 
448
  ),
449
  )
450
 
 
575
  [1] Hasche, A., Navid, A., Krause, H., & Eckart, S. (2023). Experimental and numerical assessment of the effects of hydrogen admixtures on premixed methane-oxygen flames. Fuel, 352, 128964.
576
 
577
  [2] Lide, D. R. (Ed.). (2004). CRC handbook of chemistry and physics (Vol. 85). CRC press.
578
+ """)
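The reaction-energy bar chart in the combustion page above turns a total energy change into a per-molecule value and compares it with the reference `exp_ref`. A minimal sketch of that arithmetic follows. It assumes the trajectory energies are in eV (the constant 23.0609 is close to the eV to kcal/mol conversion factor) and that `nh2os = 128` counts the H2O molecules formed from 64 units of 2H2 + O2; the function names are hypothetical and not part of the page.

```python
EXP_REF = -68.3078  # kcal/mol, reference value used in the page
FACTOR = 23.0609    # assumed eV -> kcal/mol conversion, as in the page
N_H2O = 128         # assumed number of H2O molecules formed


def reaction_energy(energies, n_h2o=N_H2O, factor=FACTOR):
    """Energy change per H2O molecule between the last and first frame."""
    return (energies[-1] - energies[0]) / n_h2o * factor


def abs_error(energies, exp_ref=EXP_REF):
    """Absolute deviation of the per-molecule reaction energy from the reference."""
    return abs(reaction_energy(energies) - exp_ref)


# Made-up trajectory energies: a total drop of 128 eV is 1 eV per H2O.
energies = [0.0, -64.0, -128.0]
print(reaction_energy(energies))
print(abs_error(energies))
```

Sorting models by `abs_error` corresponds to the `df.sort_values("reaction_energy_abs_err", ...)` call in the page.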
 
serve/tasks/homonuclear-diatomics.py CHANGED
@@ -5,11 +5,10 @@ import pandas as pd
5
  import plotly.colors as pcolors
6
  import plotly.graph_objects as go
7
  import streamlit as st
8
- from ase.data import chemical_symbols
9
  from plotly.subplots import make_subplots
10
- from scipy.interpolate import CubicSpline
11
 
12
- from mlip_arena.models import REGISTRY
13
 
14
  st.markdown(
15
  """
@@ -30,10 +29,24 @@ valid_models = [
30
  mlip_methods = container.multiselect(
31
  "MLIPs",
32
  valid_models,
33
- ["MACE-MP(M)", "CHGNet", "M3GNet", "MatterSim", "SevenNet", "ORBv2", "eqV2(OMat)", "ANI2x"],
34
  )
35
  dft_methods = container.multiselect("DFT Methods", ["PBE"], ["PBE"])
36
37
  st.markdown("### Settings")
38
  vis = st.container(border=True)
39
  energy_plot = vis.checkbox("Show energy curves", value=True)
@@ -119,11 +132,10 @@ def get_plots(df, energy_plot: bool, force_plot: bool, method_color_mapping: dic
119
  rs = rs[ind]
120
  es = es[ind]
121
  fs = fs[ind]
122
-
123
  # if method not in ["PBE"]:
124
  es = es - es[-1]
125
 
126
-
127
  # if method in ["PBE"]:
128
  # xs = np.linspace(rs.min() * 0.99, rs.max() * 1.01, int(5e2))
129
  # else:
 
5
  import plotly.colors as pcolors
6
  import plotly.graph_objects as go
7
  import streamlit as st
8
+ from mlip_arena.models import REGISTRY
9
  from plotly.subplots import make_subplots
 
10
 
11
+ from ase.data import chemical_symbols
12
 
13
  st.markdown(
14
  """
 
29
  mlip_methods = container.multiselect(
30
  "MLIPs",
31
  valid_models,
32
+ [
33
+ "MACE-MP(M)",
34
+ "CHGNet",
35
+ "M3GNet",
36
+ "MatterSim",
37
+ "SevenNet",
38
+ "ORBv2",
39
+ "eqV2(OMat)",
40
+ "ANI2x",
41
+ ],
42
  )
43
  dft_methods = container.multiselect("DFT Methods", ["PBE"], ["PBE"])
44
 
45
+ container.info(
46
+ "PBE energies and forces are provided __only__ for reference. Because plane-wave DFT has known convergence issues for diatomic molecules, and because the models may have been trained on different datasets, comparing models against PBE is not rigorous, so these metrics are excluded from rank aggregation.",
47
+ icon=":material/warning:",
48
+ )
49
+
50
  st.markdown("### Settings")
51
  vis = st.container(border=True)
52
  energy_plot = vis.checkbox("Show energy curves", value=True)
 
132
  rs = rs[ind]
133
  es = es[ind]
134
  fs = fs[ind]
135
+
136
  # if method not in ["PBE"]:
137
  es = es - es[-1]
138
 
 
139
  # if method in ["PBE"]:
140
  # xs = np.linspace(rs.min() * 0.99, rs.max() * 1.01, int(5e2))
141
  # else:
tests/test_app.py ADDED
@@ -0,0 +1,27 @@
1
+ import streamlit as st
2
+ from streamlit.testing.v1 import AppTest
3
+ import pytest
4
+ from pathlib import Path
5
+
6
+ path = Path(__file__).parents[1] / "serve"
7
+
8
+ @pytest.fixture
9
+ def home():
10
+ at = AppTest.from_file(str(path / "app.py"), default_timeout=60)
11
+ at.run()
12
+ assert not at.exception
13
+ return at
14
+
15
+ def test_leaderboard(home):
16
+ # Test the leaderboard page by simulating navigation.
17
+ at = home.switch_page(str(path / "leaderboard.py")).run()
18
+ assert not at.exception
19
+
20
+ def test_task_pages(home):
21
+ # Test each task page using the TASKS registry.
22
+ from mlip_arena.tasks import REGISTRY as TASKS
23
+
24
+ for task, details in TASKS.items():
25
+ page_path = str(path / f"tasks/{details['task-page']}.py")
26
+ at = home.switch_page(page_path).run()
27
+ assert not at.exception