metadata
license: cc-by-4.0
base_model:
  - Salesforce/codet5-large

AsserT5: Test Assertion Generation Using a Fine-Tuned Code Language Model

Part of the replication package for our paper at AST 2025 (preprint: https://arxiv.org/abs/2502.02708). To be used in combination with our main replication package at https://doi.org/10.5281/zenodo.14703162.

AsserT5 is a fine-tuned CodeT5 model that generates assertion statements for Java JUnit test cases. It was trained on an extended variant of the methods2test dataset.

Structure

  • The top-level number indicates the maximum number of assertions allowed per test case in the training dataset.
  • The next level below indicates the model variant:
    • abstract: Identifiers in the data are replaced with abstract tokens.
    • raw: The source code is tokenised as-is.
    • test-method: The model is trained only on the test case code rather than test case + focal method pairs.
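
The layout above can be turned into a small lookup helper. The following is a minimal sketch, not part of the replication package: it assumes that the two levels are literal directory names (for example `1/raw`), that the repository id is `se2p/AsserT5`, and that each variant directory contains both the model weights and the tokenizer files. The function names `variant_subfolder` and `load_variant` are hypothetical. Check the actual file listing before relying on these paths.

```python
from pathlib import PurePosixPath

# Variant names as listed in the Structure section.
VARIANTS = ("abstract", "raw", "test-method")


def variant_subfolder(max_assertions: int, variant: str) -> str:
    """Build the assumed '<max_assertions>/<variant>' subfolder path."""
    if max_assertions < 1:
        raise ValueError("max_assertions must be a positive integer")
    if variant not in VARIANTS:
        raise ValueError(f"unknown variant {variant!r}, expected one of {VARIANTS}")
    return str(PurePosixPath(str(max_assertions)) / variant)


def load_variant(max_assertions: int, variant: str, repo_id: str = "se2p/AsserT5"):
    """Load tokenizer and model for one variant (needs transformers and network access).

    Assumption: the checkpoint and tokenizer files live directly in the subfolder.
    """
    # Imported lazily so the path helper works without transformers installed.
    from transformers import AutoTokenizer, T5ForConditionalGeneration

    subfolder = variant_subfolder(max_assertions, variant)
    tokenizer = AutoTokenizer.from_pretrained(repo_id, subfolder=subfolder)
    model = T5ForConditionalGeneration.from_pretrained(repo_id, subfolder=subfolder)
    return tokenizer, model
```

For example, `load_variant(1, "raw")` would attempt to load the variant trained on raw source code with at most one assertion per test case, provided that directory exists in the repository.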