DragonAI-Python-SmolLM2_model.py???

#1
by MartialTerran - opened

Can you provide a functional PyTorch model.py and train.py that support at least inference with this model, using the hyperparameters in config.json? There is no way to do model-architecture research or educational experimentation through "autotransformer", which conceals the actual Python scripts. Further, Hugging Face sometimes vandalizes the hidden autotransformer model.py, making it inoperable. For example, the SmolLM2 weights are already useless for research because a change in "autotransformers" now causes a size mismatch in the k and v projections, apparently because the k and v projection matrices are re-used 3x. See, e.g., https://huggingface.co/HuggingFaceTB/SmolLM2-360M/discussions Without a fixed and definite model.py and train.py documenting how to implement the unusual config hyperparameters, the published SmolLM2 weights have become unusable and unstudyable. Please help keep models usable and help the independent research community by providing working PyTorch model.py and train.py files for each of your model variants.
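For readers trying to pin down the k/v size mismatch mentioned above, here is a minimal sketch of how the projection shapes would follow from the config hyperparameters, assuming the model uses Llama-style grouped-query attention (the post only says "apparently", so this is an assumption). The function name `attention_projection_shapes` is hypothetical, and the example values (hidden size 960, 15 query heads, 5 key/value heads) are illustrative assumptions chosen to match the 3x re-use described; check them against the actual config.json.

```python
def attention_projection_shapes(config):
    """Expected (out_features, in_features) weight shapes, as torch.nn.Linear
    stores them, under the ASSUMPTION of Llama-style grouped-query attention."""
    hidden = config["hidden_size"]
    n_heads = config["num_attention_heads"]
    n_kv_heads = config["num_key_value_heads"]

    if hidden % n_heads != 0:
        raise ValueError("hidden_size must be divisible by num_attention_heads")
    if n_heads % n_kv_heads != 0:
        raise ValueError("num_attention_heads must be divisible by num_key_value_heads")

    head_dim = hidden // n_heads
    return {
        # Queries get one head_dim-wide slice per attention head.
        "q_proj": (n_heads * head_dim, hidden),
        # Keys and values are projected to fewer heads, so these are narrower.
        "k_proj": (n_kv_heads * head_dim, hidden),
        "v_proj": (n_kv_heads * head_dim, hidden),
        "o_proj": (hidden, n_heads * head_dim),
        # Each k/v head is shared by this many query heads.
        "kv_repeat": n_heads // n_kv_heads,
    }


# Illustrative values only (assumed, not read from the repository).
example_config = {
    "hidden_size": 960,
    "num_attention_heads": 15,
    "num_key_value_heads": 5,
}

shapes = attention_projection_shapes(example_config)
for name, shape in shapes.items():
    print(name, shape)
```

Under these assumptions, a standalone model.py that builds k_proj and v_proj with the same width as q_proj would fail to load the checkpoint with a size mismatch, which is one plausible reading of the error described.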

Hi, this repo only contains GGUF files, not a transformers model. You probably meant to ask this on the original model's page.

mradermacher changed discussion status to closed

Yes, I already asked there (on the original model) for Hugging Face to publish a standalone model.py (PyTorch only, not "autotransformers"), but the request has been ignored. This behavior hinders research, innovation, and model optimization. The mentality seems to be that the only interest should be in training new models and deploying them for application development, not in modifying, optimizing, or porting the underlying model.py code. Here is an audio discussion about the disappointing direction of the current fixation on training and scaling without re-examining model architecture: https://huggingface.co/MartialTerran/Toy_GPTs_LLMs_for_CPU_Educational
Specifically:
https://huggingface.co/MartialTerran/Toy_GPTs_LLMs_for_CPU_Educational/blob/main/The%20AI%20Revolution_%20A%20Debate.wav

Unfortunately, you will not reach Hugging Face either there or here; these are just pages for a specific model.
