
This repo contains a sparsity report for each of the pruned models in the table below.

Each report (CSV) shows layer-wise sparsity, sparsity per 128x16 tile, and column- and row-wise sparsity computed globally within each layer.
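
For reference, the statistics in each report can be reproduced from a 2-D weight tensor roughly as sketched below. This is a minimal, illustrative sketch (the function name and exact aggregation are assumptions, not the code that generated the shipped CSVs):

import torch

def sparsity_stats(weight: torch.Tensor, tile_shape=(128, 16)):
    # 1.0 where a weight has been pruned to exactly zero
    zero = (weight == 0).float()
    rows, cols = zero.shape
    th, tw = tile_shape
    # layer-wise sparsity: fraction of zeroed weights in the whole matrix
    layer_sparsity = zero.mean().item()
    # per-tile sparsity: regroup into (row-block, col-block, 128, 16) tiles,
    # assuming the layer dimensions divide evenly by the tile shape
    tiles = zero.reshape(rows // th, th, cols // tw, tw).permute(0, 2, 1, 3)
    tile_sparsity = tiles.mean(dim=(2, 3))      # shape: (rows // th, cols // tw)
    # column- and row-wise sparsity computed over the whole layer
    col_sparsity = zero.mean(dim=0)
    row_sparsity = zero.mean(dim=1)
    return layer_sparsity, tile_sparsity, col_sparsity, row_sparsity

The summary columns in the reports (e.g. tile_min / tile_med / tile_max, col_*, row_*) are presumably min / median / max reductions over these per-tile, per-column and per-row values.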

Perplexity over Sparsity

Pruning meta-llama/Meta-Llama-3.1-8B with Wanda

Weight Target Sparsity (%)    Perplexity (lower is better)
0 (dense, baseline)           5.8393
10                            5.8781
20                            6.0102
30                            6.3076
40                            7.0094
50                            9.0642
60                            20.2265
70                            103.5209
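
For a quick visual of the trend, the table above can be plotted directly. A minimal sketch (matplotlib is assumed to be available; it is not part of the install line below):

import matplotlib.pyplot as plt

# values copied from the table above
sparsity = [0, 10, 20, 30, 40, 50, 60, 70]          # target weight sparsity in %
ppl = [5.8393, 5.8781, 6.0102, 6.3076, 7.0094, 9.0642, 20.2265, 103.5209]

plt.plot(sparsity, ppl, marker="o")
plt.yscale("log")                                   # perplexity grows sharply past 50%
plt.xlabel("Target weight sparsity (%)")
plt.ylabel("Perplexity (lower is better)")
plt.title("meta-llama/Meta-Llama-3.1-8B pruned with Wanda")
plt.show()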

For a more granular sparsity report within a given tile, please continue below.

Install

pip install torch ipython pandas

Interactively look up a specific tile of a layer

# please make sure git-lfs is installed on your machine
git clone https://huggingface.co/vuiseng9/24-0830-wanda-llama3.1-8B
cd 24-0830-wanda-llama3.1-8B
./interactive_sparsity.sh

The expected outcome is as follows: an IPython console opens with the needed functionality loaded.

$ ./interactive_sparsity.sh 
Python 3.11.9 | packaged by conda-forge | (main, Apr 19 2024, 18:36:13) [GCC 12.3.0]
Type 'copyright', 'credits' or 'license' for more information
IPython 8.26.0 -- An enhanced Interactive Python. Type '?' for help.

- Help ------------------

h = SparseBlob("path to sparsity blob")

SparseBlob.preview:
        preview sparsity dataframe, intend to show row id, short id for look up
        eg. h.preview()
        

SparseBlob.ls_layers:
        list all available layer ids for look up
        eg. h.ls_layers()
        

SparseBlob.get_sparsity_by_short_id:
        return a sparsity stats of a layer via short_id lookup.
        eg. h.get_sparsity_by_short_id('tx.0.attn.v')
        

SparseBlob.get_sparsity_by_row_id:
        return a sparsity stats of a layer via row id lookup.
        eg. h.get_sparsity_by_row_id(36)
        

SparseBlob.get_sparsity_of_tile:
        zoom into a specific layer and a specific tile,
        return the sparsity stats of the tile down to col, row granularity
        eg. h.get_sparsity_of_tile(36, (5, 6))
        

SparseBlob.show_help:
        print help for available function of SparseBlob
        eg. h.show_help()
        

- End of Help ------------------
In [1]: 

Sample usage:

In [1]: ls blob*
blob.sparsity._Meta-Llama-3.1-8B-wanda-unstructured-0.0
blob.sparsity._Meta-Llama-3.1-8B-wanda-unstructured-0.1
blob.sparsity._Meta-Llama-3.1-8B-wanda-unstructured-0.2
blob.sparsity._Meta-Llama-3.1-8B-wanda-unstructured-0.3
blob.sparsity._Meta-Llama-3.1-8B-wanda-unstructured-0.4
blob.sparsity._Meta-Llama-3.1-8B-wanda-unstructured-0.5
blob.sparsity._Meta-Llama-3.1-8B-wanda-unstructured-0.6
blob.sparsity._Meta-Llama-3.1-8B-wanda-unstructured-0.7

In [2]: h = SparseBlob("blob.sparsity._Meta-Llama-3.1-8B-wanda-unstructured-0.5")


In [3]: h.preview()
                             layer_id        short_id  ... row_med row_max
0     model.layers.0.self_attn.q_proj     tx.0.attn.q  ...  0.5000  1.0000
1     model.layers.0.self_attn.k_proj     tx.0.attn.k  ...  0.5000  1.0000
2     model.layers.0.self_attn.v_proj     tx.0.attn.v  ...  0.5000  1.0000
3     model.layers.0.self_attn.o_proj     tx.0.attn.o  ...  0.5000  1.0000
4        model.layers.0.mlp.gate_proj   tx.0.mlp.gate  ...  0.5000  1.0000
5          model.layers.0.mlp.up_proj     tx.0.mlp.up  ...  0.5000  1.0000
6        model.layers.0.mlp.down_proj   tx.0.mlp.down  ...  0.5000  1.0000
7     model.layers.1.self_attn.q_proj     tx.1.attn.q  ...  0.5000  1.0000
8     model.layers.1.self_attn.k_proj     tx.1.attn.k  ...  0.5000  1.0000
9     model.layers.1.self_attn.v_proj     tx.1.attn.v  ...  0.5000  1.0000
10    model.layers.1.self_attn.o_proj     tx.1.attn.o  ...  0.5000  1.0000
11       model.layers.1.mlp.gate_proj   tx.1.mlp.gate  ...  0.5000  1.0000
.
.
.
222       model.layers.31.mlp.up_proj    tx.31.mlp.up  ...  0.5000  1.0000
223     model.layers.31.mlp.down_proj  tx.31.mlp.down  ...  0.5000  1.0000
224                           lm_head         lm_head  ...  0.0000  0.0000

[225 rows x 23 columns]


In [4]: h.get_sparsity_by_row_id(10)
Out[4]: 
layer_id        model.layers.1.self_attn.o_proj
short_id                            tx.1.attn.o
layer_type                               Linear
param_type                               weight
shape                              [4096, 4096]
nparam                                 16777216
nnz                                     8388608
sparsity                                 0.5000
tile_shape                            (128, 16)
n_tile                                 32 x 256
n_tile_total                               8192
tile_avg                                 0.5000
tile_min                                 0.2197
tile_med                                 0.5073
tile_max                                 0.9678
col_avg                                  0.5000
col_min                                  0.0312
col_med                                  0.4609
col_max                                  1.0000
row_avg                                  0.5000
row_min                                  0.0000
row_med                                  0.5000
row_max                                  1.0000
Name: 10, dtype: object
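
As a quick sanity check of the tile counts: this layer is [4096, 4096] and is tiled by (128, 16), giving 4096 / 128 = 32 tile rows and 4096 / 16 = 256 tile columns, i.e. 32 x 256 = 8192 tiles, which matches n_tile and n_tile_total above.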


In [5]: h.get_sparsity_of_tile(10, (30, 245))
                               (30, 245) : tile_id
         model.layers.1.self_attn.o_proj : layer_id
                               (128, 16) : tiled by
                                  0.2861 : tile_sparsity
                                      16 : col_count
                                  0.2861 : col_avg
                                  0.2266 : col_min
                                  0.2734 : col_med
                                  0.3594 : col_max
                                     128 : row_count
                                  0.2861 : row_avg
                                  0.0000 : row_min
                                  0.2500 : row_med
                                  0.6250 : row_max
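
Assuming tile_id is indexed as (tile-row, tile-column), tile (30, 245) of this [4096, 4096] layer covers weight rows 3840-3967 (starting at 30 * 128) and weight columns 3920-3935 (starting at 245 * 16). The col_* and row_* statistics here are computed within that 128x16 window, hence col_count = 16 and row_count = 128.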

Internal notes

See the patched wanda branch here (the link is visible in the raw version of this card).
