Updated FA info
README.md CHANGED

@@ -65,7 +65,7 @@ model = AutoModelForCausalLM.from_pretrained(
     model_name,
     torch_dtype=torch.bfloat16,
     device_map="auto",
-    attn_implementation="flash_attention_2", # Use "
+    attn_implementation="flash_attention_2", # Use "eager" (or omit) if flash_attn is not installed
     use_cache=True,
     trust_remote_code=True,
 )
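
The updated comment's advice can also be applied programmatically. Below is a minimal sketch (not part of the commit) that passes attn_implementation="flash_attention_2" only when the flash_attn package is importable, and otherwise omits the argument so transformers falls back to its default attention implementation (SDPA or eager); model_name is a placeholder, since the actual checkpoint id lies outside this hunk.

import importlib.util

import torch
from transformers import AutoModelForCausalLM

model_name = "..."  # placeholder: the real checkpoint id is not shown in the hunk

# Request FlashAttention 2 only if flash_attn is actually installed;
# otherwise leave the kwarg out and let transformers choose its default.
kwargs = {}
if importlib.util.find_spec("flash_attn") is not None:
    kwargs["attn_implementation"] = "flash_attention_2"

model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    use_cache=True,
    trust_remote_code=True,
    **kwargs,
)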