Update README.md
README.md CHANGED

@@ -31,6 +31,9 @@ It achieves the following results on the evaluation set:
 
 ## usage
 
+> [!TIP]
+> BART supports several speedups for inference on GPU, including [flash-attention2](https://huggingface.co/docs/transformers/perf_infer_gpu_one#flashattention-2) and [torch SDPA](https://huggingface.co/docs/transformers/perf_infer_gpu_one#pytorch-scaled-dot-product-attention)
+
 an example of aggregating summaries from chunks of a long document:
 
 ```py
@@ -66,6 +69,7 @@ res = pipe(
 print(res[0]["generated_text"])
 ```
 
+
 ## Training procedure
 
 ### Training hyperparameters
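The "aggregating summaries from chunks" workflow the README references can be sketched independently of the model call (which the diff truncates): split the document into overlapping word windows, summarize each window, and join the partial summaries. The `chunk_text` and `aggregate_summaries` helpers below are illustrative, not part of the card, and the window sizes are assumptions.

```python
def chunk_text(text: str, max_words: int = 600, overlap: int = 50) -> list[str]:
    """Split a long document into overlapping word windows so each
    chunk fits within the summarizer's input limit (sizes are assumed)."""
    words = text.split()
    step = max_words - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break
    return chunks


def aggregate_summaries(text: str, summarize) -> str:
    """Summarize each chunk independently, then join the partial
    summaries into one aggregated draft."""
    partials = [summarize(chunk) for chunk in chunk_text(text)]
    return " ".join(partials)


# With the card's pipeline, `summarize` would wrap it, e.g.:
# summarize = lambda c: pipe(c)[0]["generated_text"]
```

For very long inputs, the joined draft can itself be passed back through the summarizer for a second, shorter pass.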