MartialTerran committed
Update README.md

README.md CHANGED
@@ -104,7 +104,7 @@ Neural Network Language Model with Dynamic Output Vocabulary Filtering
 104    1st Independent Claim (subject to change) in Patent Application for an Apparatus and Method for Context-Aware Logit Computation in Transformer-Based Language Models
 105    A method for generating text using a neural network language model comprising a plurality of layers including a final layer configured to output a hidden state representation, and an output layer comprising an unembedding matrix and a bias vector, the method comprising:
 106    a) receiving an input sequence of tokens;
-107    b) identifying a set of deactivated tokens based on at least one of the input sequence, a hidden state representation generated from processing the input sequence through one or more layers of the neural network language model, and a persistent memory;
+107    b) identifying a set of deactivated tokens based on at least one of: a hyperparameter, a vocab file, a tokenizer file, a control signal, a user command, a keyboard press, a mouse click, a special token, the input sequence, a hidden state representation generated from processing the input sequence through one or more layers of the neural network language model, and a persistent memory;
 108    c) processing the input sequence through the plurality of layers to generate the hidden state representation;
 109    d) generating a reduced unembedding matrix and a reduced bias vector by selecting rows of the unembedding matrix and elements of the bias vector corresponding to tokens not present in the set of deactivated tokens;
 110    e) calculating logits for the tokens not present in the set of deactivated tokens by performing a matrix multiplication of a hidden state representation and the reduced unembedding matrix and adding the reduced bias vector;
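Steps b) through e) of the claim can be sketched in a few lines of NumPy. This is a minimal illustration of the reduced-unembedding logit computation, not the applicant's implementation; the function name `filtered_logits` and all variable names are mine, and the deactivated set here is simply passed in rather than derived from the context signals the claim enumerates.

```python
import numpy as np

def filtered_logits(hidden_state, unembedding, bias, deactivated):
    """Compute logits only for tokens not in the deactivated set.

    hidden_state: (d_model,) final-layer hidden state
    unembedding:  (vocab_size, d_model) unembedding matrix
    bias:         (vocab_size,) output bias vector
    deactivated:  iterable of token ids to exclude
    Returns (active_ids, logits), where logits[i] is the logit of active_ids[i].
    """
    vocab_size = unembedding.shape[0]
    mask = np.ones(vocab_size, dtype=bool)
    mask[list(deactivated)] = False          # step b): set of deactivated tokens
    active_ids = np.nonzero(mask)[0]
    w_reduced = unembedding[mask]            # step d): reduced unembedding matrix
    b_reduced = bias[mask]                   # step d): reduced bias vector
    logits = w_reduced @ hidden_state + b_reduced  # step e): reduced logit computation
    return active_ids, logits
```

Because the matrix multiply runs over only the active rows, its cost scales with the number of active tokens rather than the full vocabulary size, which is the point of the reduction in steps d) and e).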