Fix RMSNorm weight-initialization bug
#59 opened by Shan1990
Use torch.ones to initialize the RMSNorm weight. torch.empty returns an uninitialized tensor whose contents are arbitrary memory, so the values may even fall outside the valid float range.
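A minimal sketch of the fix, assuming a standard RMSNorm layer; the class shape and `eps` default are illustrative, not the repository's exact code:

```python
import torch
import torch.nn as nn

class RMSNorm(nn.Module):
    """RMSNorm sketch showing the initialization fix."""

    def __init__(self, hidden_size: int, eps: float = 1e-6):
        super().__init__()
        # Before the fix: nn.Parameter(torch.empty(hidden_size)) left the
        # scale uninitialized (arbitrary memory, possibly huge or non-finite).
        # After the fix: start from the identity scale of 1.0.
        self.weight = nn.Parameter(torch.ones(hidden_size))
        self.eps = eps

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Normalize by the root-mean-square of the last dimension,
        # then apply the learned per-channel scale.
        variance = x.pow(2).mean(-1, keepdim=True)
        x = x * torch.rsqrt(variance + self.eps)
        return self.weight * x
```

With `torch.ones`, a freshly constructed layer applies a pure RMS normalization (scale 1) until training updates the weight, whereas `torch.empty` could produce non-finite activations on the very first forward pass.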
@zRzRzRzRzRzRzR please review this PR, thanks.
Checking now.
zRzRzRzRzRzRzR changed pull request status to merged