retrieva-jp/bert-1.3b


Add SDPA attention

#2

by Katsumata420 - opened Jul 8, 2024

base: refs/heads/main ← from: refs/pr/2


Retrieva, Inc. org Jul 8, 2024

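Adding SDPA attention

Loading the model as shown below switches the attention computation from torch.matmul to PyTorch's SDPA (scaled dot-product attention). If attn_implementation is not specified, eager is used.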

from transformers import AutoModel

model = AutoModel.from_pretrained("retrieva-jp/bert-1.3b", trust_remote_code=True, attn_implementation="sdpa")
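To illustrate what switching from torch.matmul to SDPA means, here is a minimal sketch on random tensors. It is not the code from this PR: the function names `eager_attention` and `sdpa_attention`, the tensor shapes, and the additive-mask convention are assumptions chosen for illustration. It requires PyTorch 2.0 or later.

```python
import math

import torch
import torch.nn.functional as F


def eager_attention(q, k, v, mask=None):
    """Explicit attention: matmul -> scale -> softmax -> matmul (hypothetical helper)."""
    scores = torch.matmul(q, k.transpose(-2, -1)) / math.sqrt(q.size(-1))
    if mask is not None:
        scores = scores + mask  # assumes an additive float mask
    weights = torch.softmax(scores, dim=-1)
    return torch.matmul(weights, v)


def sdpa_attention(q, k, v, mask=None):
    """Same computation through PyTorch's fused SDPA function (hypothetical helper)."""
    return F.scaled_dot_product_attention(q, k, v, attn_mask=mask)


torch.manual_seed(0)
# Toy shapes: (batch, heads, seq_len, head_dim). These are illustrative values only.
q, k, v = (torch.randn(2, 4, 16, 32) for _ in range(3))

out_eager = eager_attention(q, k, v)
out_sdpa = sdpa_attention(q, k, v)

# Largest element-wise difference between the two paths.
max_diff = (out_eager - out_sdpa).abs().max().item()
```

Both paths compute the same mathematical result, so `max_diff` should be at the level of floating-point rounding. The fused function can dispatch to more efficient kernels depending on hardware and dtype.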

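SDPA attention verification results

Verified that the outputs do not differ significantly between SDPA attention and the previous attention implementation (eager).

Confirmed that enabling SDPA attention improves metrics such as the number of tokens processed per second.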
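A throughput comparison like the one mentioned above could be measured along the following lines. This is a rough sketch, not the benchmark used for this PR: the helper `tokens_per_second`, its parameters, and the stand-in workload `dummy_forward` are all hypothetical.

```python
import time


def tokens_per_second(run_batch, tokens_per_batch, num_batches=10, warmup=2):
    """Average tokens processed per second by `run_batch` (hypothetical helper).

    run_batch: zero-argument callable that processes one batch.
    tokens_per_batch: number of tokens in each batch (batch_size * seq_len).
    """
    # Warm-up iterations are excluded from timing.
    for _ in range(warmup):
        run_batch()
    start = time.perf_counter()
    for _ in range(num_batches):
        run_batch()
    elapsed = time.perf_counter() - start
    return tokens_per_batch * num_batches / elapsed


def dummy_forward():
    # Stand-in workload so the sketch runs without downloading a model.
    sum(i * i for i in range(10_000))


rate = tokens_per_second(dummy_forward, tokens_per_batch=8 * 128)
```

For a real measurement, `run_batch` would wrap a model forward pass under `torch.no_grad()`, once for a model loaded with `attn_implementation="eager"` and once with `"sdpa"`. On a GPU, calling `torch.cuda.synchronize()` at the end of `run_batch` makes the timing reflect completed work.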

Add SDPA attention fd90ef8d

Retrieva, Inc. org Jul 9, 2024

We also confirmed internally that the output does not change with or without SDPA, so we are merging this.

Katsumata420 changed pull request status to merged Jul 9, 2024
