MartialTerran committed on
Commit d88c91d · verified · 1 Parent(s): 582b308

Update README.md

Files changed (1): README.md (+1 -1)
README.md CHANGED
@@ -104,7 +104,7 @@ Neural Network Language Model with Dynamic Output Vocabulary Filtering
  1st Independent Claim (subject to change) in Patent Application for an Apparatus and Method for Context-Aware Logit Computation in Transformer-Based Language Models
  A method for generating text using a neural network language model comprising a plurality of layers including a final layer configured to output a hidden state representation, and an output layer comprising an unembedding matrix and a bias vector, the method comprising:
  a) receiving an input sequence of tokens;
- b) identifying a set of deactivated tokens based on at least one of the input sequence, a hidden state representation generated from processing the input sequence through one or more layers of the neural network language model, and a persistent memory;
+ b) identifying a set of deactivated tokens based on at least one of: a hyperparameter, a vocab file, a tokenizer file, a control signal, a user command, a keyboard press, a mouse click, a special token, the input sequence, a hidden state representation generated from processing the input sequence through one or more layers of the neural network language model, and a persistent memory;
  c) processing the input sequence through the plurality of layers to generate the hidden state representation;
  d) generating a reduced unembedding matrix and a reduced bias vector by selecting rows of the unembedding matrix and elements of the bias vector corresponding to tokens not present in the set of deactivated tokens;
  e) calculating logits for the tokens not present in the set of deactivated tokens by performing a matrix multiplication of a hidden state representation and the reduced unembedding matrix and adding the reduced bias vector;
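Steps b) through e) of the claim can be sketched in NumPy as follows. This is a minimal illustration, not the claimed implementation: the function name `filtered_logits`, the toy dimensions, and the choice to pass the deactivated-token ids in as a plain Python set (rather than deriving them from a hyperparameter, vocab file, control signal, or hidden state, as the amended step b enumerates) are all assumptions made for the example.

```python
import numpy as np

def filtered_logits(hidden_state, unembedding, bias, deactivated):
    """Hypothetical sketch of steps d)-e): compute logits only for tokens
    not present in the set of deactivated tokens.

    hidden_state: (d_model,) final-layer hidden state (step c)
    unembedding:  (vocab_size, d_model) unembedding matrix
    bias:         (vocab_size,) bias vector
    deactivated:  set of deactivated token ids (step b, assumed given here)
    """
    vocab_size = unembedding.shape[0]
    # Step d: select the rows of the unembedding matrix and the elements
    # of the bias vector for tokens NOT in the deactivated set.
    active = np.array([t for t in range(vocab_size) if t not in deactivated])
    reduced_w = unembedding[active]   # reduced unembedding matrix
    reduced_b = bias[active]          # reduced bias vector
    # Step e: matrix multiplication with the hidden state, plus reduced bias.
    logits = reduced_w @ hidden_state + reduced_b
    return active, logits
```

With a toy 4-token vocabulary and tokens 1 and 3 deactivated, the logits array has only two entries, aligned with the surviving token ids returned in `active`; a sampler would then map its choice back through `active` to recover the original token id.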