Trick or ResNet Treat
Yeah, it's been working out well in runs so far, but as is often the case with new optimizers or optimizer enhancements, mileage can vary depending on many variables, so I'm curious to know how it works for your case. Case in point: I had some great fine-tune results with adopt, but in this mini-imagenet case it rather flopped. MARS, however, is actually doing really well here, and MARS w/ caution even better, so it's very hard to cover all ground with new optimizers. MARS results to be added soon, though.
timm release, v 1.0.12, with a focus on optimizers. The optimizer factory has been refactored: there's now a timm.optim.list_optimizers() and a new way to register optimizers and their attributes. As always, you can use a timm optimizer like a torch one, just replace torch.optim with timm.optim
adafactorbv
adopt / adoptw (decoupled decay)
mars
laprop
c as well as cadamw, cnadamw, csgdw, clamb, crmsproptf
timm, OpenCLIP, and hopefully more.
timm scripts soon: timm/plant-pathology-2021
timm support for object detection, eventually segmentation, is finally under development :O
timm model to use before committing to download or training with a large dataset? Try mini-imagenet: timm/mini-imagenet
diffusers 🧨
bitsandbytes as the official backend but using others like torchao is already very simple.
enable_model_cpu_offload()
timm release (1.0.11) is out now. I also wrote an article on one of the included models: https://huggingface.co/blog/rwightman/mambaout
pip install gradio --pre
sdk_version to be 5.0.0b3 in the README.md file on Spaces.
progress=gr.Progress(track_tqdm=True)