Hugging Face
Alan Tseng
agentlans
6 followers · 4 following
agentlans
AI & ML interests
Small data, boring AI
Recent Activity
Updated a model 37 minutes ago: agentlans/Gemma2-9B-AdvancedFuse
Reacted to grimjim's post about 3 hours ago
I'm (finally) releasing a Python script that trims excess weights in Gemma2 full-weight models that were bloated by ~1B parameters due to an early mergekit bug. https://github.com/jim-plus/Gemma2-mergekit-remediation I'd noticed something was off when merges of Gemma2 9B models ended up having ~10B parameters. The current mergekit package is fine, but there are still bloated models on HF that could stand to be fixed. The script assumes that it will be run from the same directory as the model weights, and will trim the unnecessary lm_head.weight tensor and corresponding index entry.
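The trimming step described in the post can be sketched roughly as follows. This is not the code from the linked repository, only a minimal standard-library illustration under stated assumptions: the checkpoint is sharded safetensors, the index file is named `model.safetensors.index.json`, and each shard fits in memory. The function names `drop_tensor_from_shard` and `trim_model` are hypothetical.

```python
import json
import struct
from pathlib import Path

TENSOR = "lm_head.weight"               # tensor name taken from the post
INDEX = "model.safetensors.index.json"  # assumed index file name


def drop_tensor_from_shard(path: Path, name: str) -> int:
    """Rewrite one .safetensors file without `name`; return bytes removed."""
    raw = path.read_bytes()
    # Layout: 8-byte little-endian header length, JSON header, raw tensor bytes.
    (header_len,) = struct.unpack("<Q", raw[:8])
    header = json.loads(raw[8:8 + header_len])
    data = raw[8 + header_len:]
    if name not in header:
        return 0
    begin, end = header.pop(name)["data_offsets"]
    # Repack the remaining tensors in their original order with new offsets.
    entries = sorted(
        ((k, v) for k, v in header.items() if k != "__metadata__"),
        key=lambda kv: kv[1]["data_offsets"][0],
    )
    new_data = bytearray()
    for _, info in entries:
        b, e = info["data_offsets"]
        start = len(new_data)
        new_data += data[b:e]
        info["data_offsets"] = [start, start + (e - b)]
    new_header = json.dumps(header, separators=(",", ":")).encode("utf-8")
    new_header += b" " * (-len(new_header) % 8)  # space-pad header to 8 bytes
    path.write_bytes(
        struct.pack("<Q", len(new_header)) + new_header + bytes(new_data)
    )
    return end - begin


def trim_model(model_dir: str = ".") -> int:
    """Drop TENSOR from its shard and from the index; return bytes removed."""
    root = Path(model_dir)
    index_path = root / INDEX
    index = json.loads(index_path.read_text())
    shard = index["weight_map"].pop(TENSOR, None)
    if shard is None:
        return 0  # nothing to trim
    removed = drop_tensor_from_shard(root / shard, TENSOR)
    meta = index.get("metadata", {})
    if "total_size" in meta:
        meta["total_size"] -= removed
    index_path.write_text(json.dumps(index, indent=2))
    return removed
```

Calling `trim_model(".")` from the model directory mirrors the post's assumption about where the script runs. Because the sketch overwrites files in place, it would be prudent to work on a backup copy of the weights.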
Replied to grimjim's post about 3 hours ago
Organizations
None yet
agentlans's activity
Likes
Liked 2 models 23 days ago
microsoft/deberta-v3-xsmall • Fill-Mask • Updated Sep 26, 2022 • 64.9k • 42
google/flan-t5-small • Text2Text Generation • Updated Oct 10, 2023 • 560k • 296
Liked a Space 23 days ago
🐾🐾 PawMatchAI • Running on Zero • 77
Smart Dog Breed Detection, Comparison, and Matching Tool
Liked a model 2 months ago
ZeroXClem/Llama3.1-DarkStorm-Aspire-8B • Text Generation • Updated Oct 24, 2024 • 28 • 3