NetBERT 📶




NetBERT is a BERT-base model further pre-trained on a large corpus of computer networking text (~23 GB).



Usage

You can use the raw model for masked language modeling (MLM), but it is mostly intended to be fine-tuned on a downstream task, especially one that uses the whole sentence to make decisions, such as text classification, extractive question answering, or semantic search.

You can use this model directly with a pipeline for masked language modeling:

from transformers import pipeline

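# Load a fill-mask pipeline backed by the NetBERT checkpoint.
unmasker = pipeline('fill-mask', model='antoinelouis/netbert')
# Returns the top candidate tokens for [MASK], each with a score.
unmasker("The nodes of a computer network may include [MASK].")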
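
For an input with a single [MASK], the fill-mask pipeline returns a list of dicts with the keys `score`, `token`, `token_str`, and `sequence`. The sketch below shows one way to reduce that list to the top candidates. The helper name `top_predictions` is hypothetical, and the `sample` entries are invented placeholders that only mimic the output shape; they are not real NetBERT predictions.

```python
def top_predictions(results, k=3):
    """Return the k highest-scoring (token_str, score) pairs from fill-mask output."""
    ranked = sorted(results, key=lambda r: r["score"], reverse=True)
    return [(r["token_str"], round(r["score"], 4)) for r in ranked[:k]]


# Placeholder entries shaped like fill-mask output. Tokens, ids, and scores
# are made up for illustration and are NOT real model predictions.
sample = [
    {"score": 0.3, "token": 1, "token_str": "routers", "sequence": "..."},
    {"score": 0.5, "token": 2, "token_str": "switches", "sequence": "..."},
    {"score": 0.2, "token": 3, "token_str": "servers", "sequence": "..."},
]

for token_str, score in top_predictions(sample, k=2):
    print(f"{token_str}: {score}")
```

With the real pipeline, the same helper would be called as `top_predictions(unmasker("The nodes of a computer network may include [MASK]."))`.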

You can also use this model to extract the features of a given text:

from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained('antoinelouis/netbert')
model = AutoModel.from_pretrained('antoinelouis/netbert')

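text = "Replace me by any text you'd like."
# Tokenize into PyTorch tensors (input_ids, attention_mask, ...).
encoded_input = tokenizer(text, return_tensors='pt')
# output.last_hidden_state holds one 768-dimensional vector per token.
output = model(**encoded_input)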
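
The model returns one vector per token, so a task such as semantic search usually needs a single vector per text. A minimal sketch of one common approach follows: mask-aware mean pooling followed by cosine similarity. This is an assumption for illustration, not a method this model card prescribes. The helpers `mean_pool` and `cosine_similarity` are hypothetical names, and plain Python lists are used so the sketch runs without extra dependencies; the toy vectors stand in for real model output.

```python
import math


def mean_pool(token_vectors, attention_mask):
    """Average token vectors, skipping positions where the mask is 0 (padding)."""
    kept = [vec for vec, keep in zip(token_vectors, attention_mask) if keep]
    if not kept:
        raise ValueError("attention_mask selects no tokens")
    dim = len(kept[0])
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy stand-in for one sequence: three 2-dimensional token vectors,
# where the last position is padding and must be ignored.
toy_vectors = [[1.0, 0.0], [3.0, 2.0], [9.0, 9.0]]
toy_mask = [1, 1, 0]

sentence_vector = mean_pool(toy_vectors, toy_mask)
print(sentence_vector)
print(cosine_similarity(sentence_vector, [2.0, 1.0]))
```

With the real model, the inputs could be obtained as `output.last_hidden_state[0].tolist()` and `encoded_input['attention_mask'][0].tolist()`. For batches of any size, the same pooling is better done directly on tensors.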

Documentation

Detailed documentation on the pre-trained model, its implementation, and the data can be found on GitHub.

Citation

For attribution in academic contexts, please cite this work as:

@mastersthesis{louis2020netbert,
    title={NetBERT: A Pre-trained Language Representation Model for Computer Networking},
    author={Louis, Antoine},
    year={2020},
    school={University of Liege}
}