dacorvo HF staff committed on
Commit
b41e94c
1 Parent(s): b1279f9

Rename inference-cache-config/Llama-3.1-70B.json to inference-cache-config/Llama3.1-70B.json

inference-cache-config/Llama-3.1-70B.json DELETED
@@ -1 +0,0 @@
- meta-llama/Llama-3.1-70B

inference-cache-config/Llama3.1-70B.json ADDED
@@ -0,0 +1,16 @@
+ {
+   "meta-llama/Llama-3.1-70B": [
+     {
+       "batch_size": 1,
+       "sequence_length": 4096,
+       "num_cores": 24,
+       "auto_cast_type": "bf16"
+     },
+     {
+       "batch_size": 4,
+       "sequence_length": 4096,
+       "num_cores": 24,
+       "auto_cast_type": "bf16"
+     }
+   ]
+ }
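
For readers who want to check the added file by hand, here is a minimal Python sketch that parses the JSON from the diff above and lists each configuration. It assumes every entry carries the four keys seen in this commit; the names `CONFIG_TEXT`, `REQUIRED_KEYS`, and `list_configurations` are hypothetical helpers written for this illustration and are not part of any library or of the repository's tooling.

```python
import json

# Contents of the added file, copied from the diff above.
CONFIG_TEXT = """
{
  "meta-llama/Llama-3.1-70B": [
    {"batch_size": 1, "sequence_length": 4096, "num_cores": 24, "auto_cast_type": "bf16"},
    {"batch_size": 4, "sequence_length": 4096, "num_cores": 24, "auto_cast_type": "bf16"}
  ]
}
"""

# Assumption: these are the keys every entry should have, inferred only from this commit.
REQUIRED_KEYS = {"batch_size", "sequence_length", "num_cores", "auto_cast_type"}


def list_configurations(text):
    """Return (model_id, entry) pairs; raise ValueError if an entry lacks a key."""
    config = json.loads(text)
    pairs = []
    for model_id, entries in config.items():
        for entry in entries:
            missing = REQUIRED_KEYS - entry.keys()
            if missing:
                raise ValueError(f"{model_id}: missing keys {sorted(missing)}")
            pairs.append((model_id, entry))
    return pairs


if __name__ == "__main__":
    for model_id, entry in list_configurations(CONFIG_TEXT):
        print(model_id, entry["batch_size"], entry["sequence_length"],
              entry["num_cores"], entry["auto_cast_type"])
```

Note that the deleted file held only the bare model id on a single line, which is not valid JSON, so a check like this would have rejected it with a `json.JSONDecodeError`.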