Upload folder using huggingface_hub

by sharpenb - opened 14 days ago

base: refs/heads/main ← from: refs/pr/2

Files changed (4)

README.md +5 -5
config.json +1 -1
model.safetensors +1 -1
smash_config.json +1 -1

README.md CHANGED

@@ -1,6 +1,6 @@
 ---
 thumbnail: "https://assets-global.website-files.com/646b351987a8d8ce158d1940/64ec9e96b4334c0e1ac41504_Logo%20with%20white%20text.svg"
-base_model: ORIGINAL_REPO_NAME
 metrics:
 - memory_disk
 - memory_inference
@@ -52,7 +52,7 @@ tags:
 You can run the smashed model with these steps:
-0. Check requirements from the original repo ORIGINAL_REPO_NAME installed. In particular, check python, cuda, and transformers versions.
 1. Make sure that you have installed quantization related packages.
     ```bash
     pip install transformers accelerate bitsandbytes>0.37.0
@@ -63,7 +63,7 @@ You can run the smashed model with these steps:
    model = AutoModelForCausalLM.from_pretrained("PrunaAI/neeleshg23-jamba-1.9b-7-bnb-8bit-smashed", trust_remote_code=True, device_map='auto')
-   tokenizer = AutoTokenizer.from_pretrained("ORIGINAL_REPO_NAME")
    input_ids = tokenizer("What is the color of prunes?,", return_tensors='pt').to(model.device)["input_ids"]
@@ -77,9 +77,9 @@ The configuration info are in `smash_config.json`.
 ## Credits & License
-The license of the smashed model follows the license of the original model. Please check the license of the original model ORIGINAL_REPO_NAME before using this model which provided the base model. The license  of the `pruna-engine` is [here](https://pypi.org/project/pruna-engine/) on Pypi.
 ## Want to compress other models?
 - Contact us and tell us which model to compress next [here](https://www.pruna.ai/contact).
-- Request access to easily compress your own AI models [here](https://z0halsaff74.typeform.com/pruna-access?typeform-source=www.pruna.ai).

 ---
 thumbnail: "https://assets-global.website-files.com/646b351987a8d8ce158d1940/64ec9e96b4334c0e1ac41504_Logo%20with%20white%20text.svg"
+base_model: neeleshg23/jamba-1.9b-7
 metrics:
 - memory_disk
 - memory_inference
 You can run the smashed model with these steps:
+0. Check requirements from the original repo neeleshg23/jamba-1.9b-7 installed. In particular, check python, cuda, and transformers versions.
 1. Make sure that you have installed quantization related packages.
     ```bash
     pip install transformers accelerate bitsandbytes>0.37.0
    model = AutoModelForCausalLM.from_pretrained("PrunaAI/neeleshg23-jamba-1.9b-7-bnb-8bit-smashed", trust_remote_code=True, device_map='auto')
+   tokenizer = AutoTokenizer.from_pretrained("neeleshg23/jamba-1.9b-7")
    input_ids = tokenizer("What is the color of prunes?,", return_tensors='pt').to(model.device)["input_ids"]
 ## Credits & License
+The license of the smashed model follows the license of the original model. Please check the license of the original model neeleshg23/jamba-1.9b-7 before using this model which provided the base model. The license  of the `pruna-engine` is [here](https://pypi.org/project/pruna-engine/) on Pypi.
 ## Want to compress other models?
 - Contact us and tell us which model to compress next [here](https://www.pruna.ai/contact).
+- Do it by yourself [here](https://docs.pruna.ai/en/latest/setup/pip.html).
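
The README change above replaces the `ORIGINAL_REPO_NAME` placeholder with `neeleshg23/jamba-1.9b-7`, so the tokenizer is loaded from the base repo while the weights come from the smashed repo. A minimal sketch of that loading step, assuming `transformers`, `accelerate`, and `bitsandbytes` are installed and a CUDA device is available. The helper names and the `max_new_tokens` value are illustrative assumptions, not taken from the README.

```python
# Repo IDs as they appear in the diff above.
SMASHED_REPO = "PrunaAI/neeleshg23-jamba-1.9b-7-bnb-8bit-smashed"
BASE_REPO = "neeleshg23/jamba-1.9b-7"


def load_smashed_model(smashed_repo=SMASHED_REPO, base_repo=BASE_REPO):
    """Load the 8-bit smashed weights and the tokenizer of the base model."""
    # Imported lazily so this module can be imported without transformers.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model = AutoModelForCausalLM.from_pretrained(
        smashed_repo, trust_remote_code=True, device_map="auto"
    )
    # The smashed repo relies on the tokenizer published with the base model.
    tokenizer = AutoTokenizer.from_pretrained(base_repo)
    return model, tokenizer


def generate(model, tokenizer, prompt, max_new_tokens=32):
    """Hypothetical helper: max_new_tokens=32 is an assumed value."""
    input_ids = tokenizer(prompt, return_tensors="pt").to(model.device)["input_ids"]
    outputs = model.generate(input_ids, max_new_tokens=max_new_tokens)
    return tokenizer.decode(outputs[0])
```

Calling `load_smashed_model()` downloads roughly 2.4 GB of weights, so nothing is executed at import time.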

config.json CHANGED

@@ -1,5 +1,5 @@
 {
-    "_name_or_path": "/covalent/.cache/models/tmp0n_hx_3e9y8ksise",
     "architectures": [
         "JambaForCausalLM"
     ],

 {
+    "_name_or_path": "/covalent/.cache/models/tmp_52zyzai_lzq9dm2",
     "architectures": [
         "JambaForCausalLM"
     ],

model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:0391bbf2ec4f9b3f9c88e358b8d0bfc6f4e066cdf4aef7a96060582f8b5ab5e5
 size 2425351143

 version https://git-lfs.github.com/spec/v1
+oid sha256:7be7cf1530fd026f370bd5a3955e8760139d14c19157325ae38f67e9a4520095
 size 2425351143
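
The two Git LFS pointers above differ only in their `oid`; the recorded size stays at 2425351143 bytes. A minimal sketch of how such pointer files could be parsed and compared, using the pointer text from this diff. The function name `parse_lfs_pointer` is an illustrative assumption, not part of any library.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse a Git LFS pointer file into its version, hash and size fields."""
    fields = {}
    for line in text.strip().splitlines():
        # Each pointer line has the form "<key> <value>".
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


OLD_POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:0391bbf2ec4f9b3f9c88e358b8d0bfc6f4e066cdf4aef7a96060582f8b5ab5e5
size 2425351143
"""

NEW_POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:7be7cf1530fd026f370bd5a3955e8760139d14c19157325ae38f67e9a4520095
size 2425351143
"""

old = parse_lfs_pointer(OLD_POINTER)
new = parse_lfs_pointer(NEW_POINTER)

# Names of the fields whose values differ between the two pointers.
changed = sorted(key for key in old if old[key] != new[key])
print(changed)
```

An identical size with a different digest means the file contents changed without changing the byte count.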

smash_config.json CHANGED

@@ -28,7 +28,7 @@
     "quant_llm-int8_weight_bits": 8,
     "max_batch_size": 1,
     "device": "cuda",
-    "cache_dir": "/covalent/.cache/models/tmp0n_hx_3e",
     "task": "",
     "save_load_fn": "bitsandbytes",
     "save_load_fn_args": {}

     "quant_llm-int8_weight_bits": 8,
     "max_batch_size": 1,
     "device": "cuda",
+    "cache_dir": "/covalent/.cache/models/tmp_52zyzai",
     "task": "",
     "save_load_fn": "bitsandbytes",
     "save_load_fn_args": {}