	# Model documentation & parameters

	Algorithm Version: Which model version to use.

	Target binding energy: The desired binding energy. The optimal range determined in [literature](https://doi.org/10.1039/C8SC01949E) is between -31.1 and -23.0 kcal/mol.

	Primer SMILES: A SMILES string is used to prime the generation.

	Maximal sequence length: The maximal number of tokens in the generated molecule.

	Number of points: Number of points to sample with the Gaussian Process.

	Number of steps: Number of optimization steps in the Gaussian Process optimization.

	Number of samples: How many samples should be generated (between 1 and 50).
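As a rough illustration of how the parameters above fit together, the sketch below collects them in a small container and checks the two constraints stated in the descriptions: the 1 to 50 limit on the number of samples and the literature range for the binding energy. The class name `GenerationRequest`, its field names, the example values, and the positivity checks are all hypothetical and are not part of GT4SD.

```python
from dataclasses import dataclass

# Optimal binding-energy window quoted in the parameter description (kcal/mol).
OPTIMAL_ENERGY_RANGE = (-31.1, -23.0)


@dataclass(frozen=True)
class GenerationRequest:
    """Hypothetical container mirroring the documented parameters."""

    algorithm_version: str
    target_binding_energy: float  # kcal/mol
    primer_smiles: str
    max_sequence_length: int
    number_of_points: int
    number_of_steps: int
    number_of_samples: int

    def __post_init__(self) -> None:
        # Documented constraint: between 1 and 50 samples.
        if not 1 <= self.number_of_samples <= 50:
            raise ValueError("number_of_samples must be between 1 and 50")
        # Assumption (not stated in the card): counts and lengths are positive.
        for name in ("max_sequence_length", "number_of_points", "number_of_steps"):
            if getattr(self, name) < 1:
                raise ValueError(f"{name} must be a positive integer")

    def in_optimal_range(self) -> bool:
        """True if the target lies in the range reported in the literature."""
        low, high = OPTIMAL_ENERGY_RANGE
        return low <= self.target_binding_energy <= high


# Example values are placeholders chosen for illustration only.
request = GenerationRequest(
    algorithm_version="v0",
    target_binding_energy=-27.0,
    primer_smiles="C",
    max_sequence_length=100,
    number_of_points=10,
    number_of_steps=5,
    number_of_samples=3,
)
print(request.in_optimal_range())
```

A target outside the literature range is not rejected here, only flagged, because the description presents that range as optimal rather than mandatory.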


	# Model card -- AdvancedManufacturing

	Model Details: AdvancedManufacturing is a sequence-based molecular generator tuned to generate catalysts. The model relies on a recurrent Variational Autoencoder with a binding-energy predictor trained on the latent code. The framework uses Gaussian Processes for generating targeted molecules.
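The idea of steering generation with a Gaussian Process over the latent code can be sketched with a toy example. The code below is not the GT4SD implementation: `toy_energy_predictor` is an invented stand-in for the learned binding-energy predictor, the latent space is two-dimensional, the decoder back to SMILES or SELFIES is omitted, and the search greedily picks the candidate whose predicted energy is closest to the target instead of using a proper acquisition function. The `n_points` and `n_steps` arguments only loosely mirror the "Number of points" and "Number of steps" parameters.

```python
import numpy as np


def rbf_kernel(a, b, length_scale=1.0):
    """Squared-exponential kernel between two sets of latent points."""
    sq_dist = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    return np.exp(-0.5 * np.maximum(sq_dist, 0.0) / length_scale**2)


def gp_posterior_mean(train_z, train_y, query_z, noise=1e-4):
    """Posterior mean of a zero-mean GP fitted to (train_z, train_y)."""
    k_train = rbf_kernel(train_z, train_z) + noise * np.eye(len(train_z))
    k_query = rbf_kernel(query_z, train_z)
    return k_query @ np.linalg.solve(k_train, train_y)


def toy_energy_predictor(z):
    """Invented stand-in for the binding-energy predictor (kcal/mol)."""
    return -27.0 + 4.0 * np.sin(z[:, 0]) + 2.0 * z[:, 1]


def search_latent(target, n_points=20, n_steps=10, n_candidates=500, dim=2, seed=0):
    """Greedy GP-guided search for a latent point with energy near `target`."""
    rng = np.random.default_rng(seed)
    z = rng.normal(size=(n_points, dim))
    # The GP models the signed deviation from the target energy.
    y = toy_energy_predictor(z) - target
    for _ in range(n_steps):
        candidates = rng.normal(size=(n_candidates, dim))
        predicted_dev = gp_posterior_mean(z, y, candidates)
        best = candidates[np.argmin(np.abs(predicted_dev))][None, :]
        # Evaluate the chosen point and add it to the GP training set.
        z = np.vstack([z, best])
        y = np.append(y, toy_energy_predictor(best) - target)
    idx = int(np.argmin(np.abs(y)))
    return z[idx], float(toy_energy_predictor(z[idx][None, :])[0])


best_z, best_energy = search_latent(target=-27.0)
print(best_z, best_energy)
```

In the actual model, the selected latent point would then be passed through the decoder of the Variational Autoencoder to obtain a candidate molecule.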

	Developers: Oliver Schilter and colleagues from IBM Research.

	Distributors: Original authors' code integrated into GT4SD.

	Model date: Not yet published. Manuscript accepted.

	Model version: Different model types are available, each trained on 7054 data points represented either as SMILES or SELFIES. Data augmentation was used to broaden the scope of the training data.

	Model type: A sequence-based molecular generator tuned to generate catalysts. The model relies on a recurrent Variational Autoencoder with a binding-energy predictor trained on the latent code. The framework uses Gaussian Processes for generating targeted molecules.

	Information about training algorithms, parameters, fairness constraints or other applied approaches, and features:
	N.A.

	Paper or other resources for more information:


	License: MIT

	Where to send questions or comments about the model: Open an issue on [GT4SD repository](https://github.com/GT4SD/gt4sd-core).

	Intended Use. Use cases that were envisioned during development: Chemical research, in particular, to discover new Suzuki cross-coupling catalysts.

	Primary intended uses/users: Researchers and computational chemists using the model for research exploration purposes.

	Out-of-scope use cases: Production-level inference, producing molecules with harmful properties.

	Metrics: N.A.

	Datasets: Data used for training was provided through the NCCR and can be found [here](https://doi.org/10.24435/materialscloud:2018.0014/v1) and [here](https://doi.org/10.24435/materialscloud:2019.0007/v3).

	Ethical Considerations: Unclear. Please consult the original authors in case of questions.

	Caveats and Recommendations: Unclear. Please consult the original authors in case of questions.

	Model card prototype inspired by [Mitchell et al. (2019)](https://dl.acm.org/doi/abs/10.1145/3287560.3287596).

	## Citation
	Please cite:
	```bib
	@article{manica2023accelerating,
	title={Accelerating material design with the generative toolkit for scientific discovery},
	author={Manica, Matteo and Born, Jannis and Cadow, Joris and Christofidellis, Dimitrios and Dave, Ashish and Clarke, Dean and Teukam, Yves Gaetan Nana and Giannone, Giorgio and Hoffman, Samuel C and Buchan, Matthew and others},
	journal={npj Computational Materials},
	volume={9},
	number={1},
	pages={69},
	year={2023},
	publisher={Nature Publishing Group UK London}
	}
	```
