MartialTerran committed
Update README.md

README.md CHANGED
@@ -104,7 +104,7 @@ Neural Network Language Model with Dynamic Output Vocabulary Filtering
 104    1st Independent Claim (subject to change) in Patent Application for an Apparatus and Method for Context-Aware Logit Computation in Transformer-Based Language Models
 105    A method for generating text using a neural network language model comprising a plurality of layers including a final layer configured to output a hidden state representation, and an output layer comprising an unembedding matrix and a bias vector, the method comprising:
 106    a) receiving an input sequence of tokens;
-107    b) identifying a set of deactivated tokens based on at least one of the input sequence, a hidden state representation generated from processing the input sequence through one or more layers of the neural network language model, and a persistent memory;
+107    b) identifying a set of deactivated tokens based on at least one of: a hyperparameter, a vocab file, a tokenizer file, a control signal, a user command, a keyboard press, a mouse click, a special token, the input sequence, a hidden state representation generated from processing the input sequence through one or more layers of the neural network language model, and a persistent memory;
 108    c) processing the input sequence through the plurality of layers to generate the hidden state representation;
 109    d) generating a reduced unembedding matrix and a reduced bias vector by selecting rows of the unembedding matrix and elements of the bias vector corresponding to tokens not present in the set of deactivated tokens;
 110    e) calculating logits for the tokens not present in the set of deactivated tokens by performing a matrix multiplication of a hidden state representation and the reduced unembedding matrix and adding the reduced bias vector;
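Steps b) through e) of the claim can be sketched in a few lines of NumPy. This is a minimal illustration of the reduced-unembedding logit computation, not the applicant's implementation; the function name `filtered_logits` and all variable names are mine, and the deactivated set here is simply passed in rather than derived from the context signals the claim enumerates.

```python
import numpy as np

def filtered_logits(hidden_state, unembedding, bias, deactivated):
    """Compute logits only for tokens not in the deactivated set.

    hidden_state: (d_model,) final-layer hidden state
    unembedding:  (vocab_size, d_model) unembedding matrix
    bias:         (vocab_size,) output bias vector
    deactivated:  iterable of token ids to exclude
    Returns (active_ids, logits), where logits[i] is the logit of active_ids[i].
    """
    vocab_size = unembedding.shape[0]
    mask = np.ones(vocab_size, dtype=bool)
    mask[list(deactivated)] = False          # step b): set of deactivated tokens
    active_ids = np.nonzero(mask)[0]
    w_reduced = unembedding[mask]            # step d): reduced unembedding matrix
    b_reduced = bias[mask]                   # step d): reduced bias vector
    logits = w_reduced @ hidden_state + b_reduced  # step e): reduced logit computation
    return active_ids, logits
```

Because the matrix multiply runs over only the active rows, its cost scales with the number of active tokens rather than the full vocabulary size, which is the point of the reduction in steps d) and e).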