This PR updates the layernorm epsilon value to reflect the original implementation.
This was tested with @ydshieh, and the integration tests are not affected (the logits still match).
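For context, a minimal sketch of why a small epsilon change usually leaves logits within tolerance. The epsilon values (`1e-12` and `1e-5`), the input vector, and the helper names below are hypothetical and chosen only for illustration. They are not the values or code from this PR.

```python
import math


def layer_norm(x, eps):
    """Normalize a 1-D list of floats to zero mean and (approximately) unit variance.

    Uses the biased variance (divide by N), as layer normalization does.
    """
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]


def max_abs_diff(a, b):
    """Largest element-wise absolute difference between two equal-length lists."""
    return max(abs(p - q) for p, q in zip(a, b))


# Hypothetical hidden state; real activations come from the model.
hidden = [0.5, -1.2, 3.3, 0.0, 2.1, -0.7]

# Hypothetical "before" and "after" epsilon values, for illustration only.
out_old = layer_norm(hidden, eps=1e-12)
out_new = layer_norm(hidden, eps=1e-5)

diff = max_abs_diff(out_old, out_new)
print(f"max abs difference: {diff:.3e}")
```

When the variance of the activations is much larger than either epsilon, the two outputs differ only by a tiny relative amount, which is consistent with integration tests that compare logits up to a tolerance.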