Number of cells recommended for fine tuning

#335
by schhina - opened

I am following the cell classification Jupyter notebook (https://huggingface.co/ctheodoris/Geneformer/blob/main/examples/cell_classification.ipynb) and was wondering whether there is a recommended number of cells to use for the fine-tuning step. Thanks!

Thank you for your question! There is no universal recommendation. In general, the more fine-tuning data you have, the better the model will learn the domain. However, if the fine-tuning data is not diverse or representative, overfitting may occur. It is likely best to start with the number of cells you have in your data and evaluate performance on a held-out set to confirm generalizability.
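As a rough illustration of the held-out check described above, here is a minimal sketch in plain Python. It does not use the Geneformer API; the function names `stratified_split` and `training_subsets`, the toy labels, and the 20% held-out fraction are all assumptions made for the example. The idea is to set aside a held-out set once, then fine-tune on progressively larger subsets of the remaining cells and compare held-out performance across sizes.

```python
import random
from collections import defaultdict


def stratified_split(labels, held_out_fraction=0.2, seed=0):
    """Split cell indices into train and held-out lists, keeping label proportions.

    Hypothetical helper for illustration; not part of Geneformer.
    """
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for idx, label in enumerate(labels):
        by_label[label].append(idx)

    train, held_out = [], []
    for label in sorted(by_label):
        idxs = by_label[label]
        rng.shuffle(idxs)
        # Hold out at least one cell per label so every class is evaluated.
        n_held = max(1, round(len(idxs) * held_out_fraction))
        held_out.extend(idxs[:n_held])
        train.extend(idxs[n_held:])
    return train, held_out


def training_subsets(train_indices, sizes, seed=0):
    """Yield nested random subsets of the training indices, one per requested size."""
    rng = random.Random(seed)
    shuffled = list(train_indices)
    rng.shuffle(shuffled)
    for size in sizes:
        yield shuffled[: min(size, len(shuffled))]


# Toy example: 1,000 cells with an imbalanced two-class label (assumed data).
labels = ["healthy"] * 800 + ["disease"] * 200
train_idx, held_out_idx = stratified_split(labels, held_out_fraction=0.2)

# Fine-tune once per subset and score each model on the same held_out_idx.
subsets = list(training_subsets(train_idx, sizes=[100, 400, len(train_idx)]))
```

If held-out performance is still improving at the largest subset, more cells would probably help. If it has plateaued, or training performance is much higher than held-out performance, the limit is more likely data diversity than cell count.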

ctheodoris changed discussion status to closed
