link arXiv preprint in readme
README.md
CHANGED
@@ -1,22 +1,22 @@
----
-license: cc-by-4.0
-base_model:
-- Salesforce/codet5-large
----
-
-# AsserT5: Test Assertion Generation Using a Fine-Tuned Code Language Model
-
-Part of the replication package for our paper at AST 2025 (
-To be used in combination with our main replication package at https://doi.org/10.5281/zenodo.14703162.
-
-AsserT5 is a fine-tuned [CodeT5](https://huggingface.co/Salesforce/codet5-large) trained to generate assertion statements for Java JUnit test cases.
-It was trained on an extended variant of the [methods2test](https://github.com/microsoft/methods2test) dataset.
-
-
-## Structure
-
-- The top-level number indicates the maximum number of assertions allowed per test case in the training dataset.
-- The next level below indicates the model variant:
-- `abstract`: Identifiers in the data are replaced with abstract tokens.
-- `raw`: The source code is tokenised as-is.
-- `test-method`: The model is trained only on the test case code rather than test case + focal method pairs.
+---
+license: cc-by-4.0
+base_model:
+- Salesforce/codet5-large
+---
+
+# AsserT5: Test Assertion Generation Using a Fine-Tuned Code Language Model
+
+Part of the replication package for our paper at AST 2025 (preprint: https://arxiv.org/abs/2502.02708).
+To be used in combination with our main replication package at https://doi.org/10.5281/zenodo.14703162.
+
+AsserT5 is a fine-tuned [CodeT5](https://huggingface.co/Salesforce/codet5-large) trained to generate assertion statements for Java JUnit test cases.
+It was trained on an extended variant of the [methods2test](https://github.com/microsoft/methods2test) dataset.
+
+
+## Structure
+
+- The top-level number indicates the maximum number of assertions allowed per test case in the training dataset.
+- The next level below indicates the model variant:
+- `abstract`: Identifiers in the data are replaced with abstract tokens.
+- `raw`: The source code is tokenised as-is.
+- `test-method`: The model is trained only on the test case code rather than test case + focal method pairs.
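
The Structure section of the README describes a two-level layout: a top-level number (the maximum assertions per test case) and, below it, a variant name. The following is a minimal sketch of how such a path might be composed. It assumes the directories are named exactly after the number and the variant; the helper `variant_subfolder` and the example value `1` are hypothetical and do not come from the repository.

```python
from pathlib import PurePosixPath

# Variant names as listed in the README's Structure section.
VARIANTS = ("abstract", "raw", "test-method")


def variant_subfolder(max_assertions: int, variant: str) -> str:
    """Build the '<max_assertions>/<variant>' path described in the README.

    Assumption: the top-level directory is the plain decimal number and the
    second level is named exactly like the variant.
    """
    if max_assertions < 1:
        raise ValueError("max_assertions must be a positive integer")
    if variant not in VARIANTS:
        raise ValueError(f"unknown variant {variant!r}; expected one of {VARIANTS}")
    return str(PurePosixPath(str(max_assertions)) / variant)


# Hypothetical example: the 'raw' variant trained with at most 1 assertion.
print(variant_subfolder(1, "raw"))
```

If the weights sit directly inside that directory, the resulting string could presumably be passed as the `subfolder` argument of `from_pretrained` in the `transformers` library. That has not been verified against this repository.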