Troubleshooting Tensor Dimension Mismatch
Hello,
When I run the example from the model card, I get a tensor size mismatch error.
modeling_phi.py, line 313:
    padding_mask.masked_fill_(key_padding_mask, 0.0)
RuntimeError: The size of tensor a (749) must match the size of tensor b (750) at non-singleton dimension 1.
The error indicates a tensor size mismatch: tensors of size 749 and 750 are being combined along dimension 1, where their sizes must match. Any ideas why this is happening? Thank you.
Operating system: Windows 11 Home
Operating system version: 10.0.22631
Python version: 3.12.2
PyTorch version: 2.2.1+cu121
Torchvision version: 2.2.1+cu121
CUDA version: 12.1
CUDNN version: 8801
Current CUDA device: 0
Number of CUDA devices available: 1
Name of current CUDA device: NVIDIA GeForce RTX 3070 Ti
Check if CUDA is available: True
This is caused by a backward-incompatible change to the KV cache introduced in the 4.38.0 release of transformers. You have three options:
- Use moondream2, where the issue is fixed.
- Downgrade transformers to 4.37.2.
- Try the patch mentioned here.
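
To see whether your environment is on an affected release before choosing an option, a quick version check can help. This is only a rough sketch: it assumes every transformers version from 4.38.0 onward carries the KV cache change (the reply only says the change was introduced in 4.38.0), and the helper names `parse_version`, `is_affected`, and `check_installed` are made up for this illustration.

```python
from importlib import metadata

# Assumption taken from the reply above: the KV cache change first shipped in 4.38.0.
FIRST_AFFECTED = (4, 38, 0)


def parse_version(version: str) -> tuple:
    """Turn '4.38.0' or '4.38.0.dev0' into a comparable tuple of three ints."""
    parts = []
    for piece in version.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break  # stop at suffixes such as 'rc1'
            digits += ch
        parts.append(int(digits) if digits else 0)
    while len(parts) < 3:
        parts.append(0)  # pad short versions such as '4.40'
    return tuple(parts)


def is_affected(version: str) -> bool:
    """True if the version is at or after the first affected release."""
    return parse_version(version) >= FIRST_AFFECTED


def check_installed() -> str:
    """Report whether the installed transformers version may hit the mismatch."""
    try:
        installed = metadata.version("transformers")
    except metadata.PackageNotFoundError:
        return "transformers is not installed"
    if is_affected(installed):
        return f"transformers {installed}: at or after 4.38.0, the mismatch may occur"
    return f"transformers {installed}: before 4.38.0"


if __name__ == "__main__":
    print(check_installed())
```

If the check reports 4.38.0 or later, the downgrade option corresponds to `pip install transformers==4.37.2`.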