Error when running CPM-Bee with Hugging Face transformers
Error type 1 — installing transformers from the code in your forked GitHub repo:
/CPMBee-fork-transformer/transformers/src/transformers/models/cpmbee/modeling_cpmbee.py:572 in forward
│ │
│ 569 │ │ self.inv_freq = inv_freq.to(config.torch_dtype) │
│ 570 │ │
│ 571 │ def forward(self, x: torch.Tensor, x_pos: torch.Tensor): │
│ ❱ 572 │ │ inv_freq = self.inv_freq.to(device=x.device, dtype=self.dtype) │
│ 573 │ │ │
│ 574 │ │ x_pos = x_pos * self.distance_scale │
│ 575 │ │ freqs = x_pos[..., None].to(self.dtype) * inv_freq[None, :] # (..., dim/2) │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
RuntimeError: CUDA error: device-side assert triggered
Error type 2 — using the code in this Hugging Face repo with `trust_remote_code=True`:
modeling_cpmbee.py:787 in forward
│ │
│ 784 │ │ │ │ + segment_rel_offset[:, :, None], │
│ 785 │ │ │ │ ~( │
│ 786 │ │ │ │ │ (sample_ids[:, :, None] == sample_ids[:, None, :]) │
│ ❱ 787 │ │ │ │ │ & (span[:, None, :] == span[:, :, None]) │
│ 788 │ │ │ │ ), # not in the same span or sample │
│ 789 │ │ │ │ 0, # avoid torch.gather overflow │
│ 790 │ │ │ ).view(batch, seqlen * seqlen) │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
TypeError: 'NoneType' object is not subscriptable
You should not use the forked GitHub code; please follow the example in the model card instead.
For error type 2, please use `model.generate()`. If you call `model.forward()` directly, you must first preprocess the data with `tokenizer.prepare_for_finetune()`.
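To illustrate the advice above, here is a minimal sketch of the `model.generate()` path. It assumes the `openbmb/cpm-bee-10b` checkpoint, a GPU, and the dict-style sample format (`"input"` plus an empty `"<ans>"` slot) shown in the model card; the exact `generate` signature of the remote code may differ, so treat the details as assumptions rather than a definitive recipe.

```python
def build_sample(text: str) -> dict:
    """CPM-Bee consumes structured dict samples; "<ans>" marks the slot to fill."""
    return {"input": text, "<ans>": ""}


def generate_answer(text: str):
    # Hypothetical end-to-end call; requires `pip install transformers`,
    # network access, and a CUDA GPU, so it is defined but not run here.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(
        "openbmb/cpm-bee-10b", trust_remote_code=True
    )
    model = AutoModelForCausalLM.from_pretrained(
        "openbmb/cpm-bee-10b", trust_remote_code=True
    ).cuda()
    # The model card passes the sample dict and the tokenizer to generate();
    # this avoids calling forward() on unprepared inputs, which triggers
    # the errors shown above.
    return model.generate(build_sample(text), tokenizer)
```

The key point is that `generate()` handles the sample bookkeeping (spans, sample ids) internally, whereas a raw `forward()` call only works on data that `tokenizer.prepare_for_finetune()` has already laid out.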