|
I have run some simple tests for qiqi, as follows:
|
First, I added three LoRA variants (a rough config sketch follows the list):
|
1. Same training set, with the regularization dataset removed.
|
2. Same training set, regularization dataset removed, and dim raised to 32.
|
3. The qiqi LoRA I trained myself last time, as a control group (250 images, 1000 steps at batch size 8, which is definitely enough training).
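
For reference, here is a rough sketch of the three variants as I understand them. The keys and values are illustrative only, not the exact training arguments; dim=4 for variant 1 is my reading of the discussion below, and the control LoRA's dim is not stated.

```python
# Rough sketch of the three LoRA variants (illustrative placeholders only,
# not the actual training command). dim=4 for variant 1 is an assumption
# based on the discussion below; the control LoRA's dim is not stated.
variants = {
    "1_no_reg":       {"train_set": "qiqi", "reg_set": None, "network_dim": 4},
    "2_no_reg_dim32": {"train_set": "qiqi", "reg_set": None, "network_dim": 32},
    "3_control":      {"train_set": "qiqi_250_imgs", "steps": 1000, "batch_size": 8},
}
```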
|
|
|
I first tested the original version, with a weight of 0.8. |
|
Then I changed the weight to 1 and tested epoch 000018.
|
Finally, I added "ofuda" at the beginning of the prompts. |
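
In short, the three test passes looked roughly like this. The prompt contents and model names are placeholders, and the `<lora:NAME:WEIGHT>` notation is just the usual WebUI-style syntax to show where the weight goes, not necessarily what the evaluation pipeline uses.

```python
# Sketch of the three test passes (placeholder prompts and model names;
# <lora:NAME:WEIGHT> is WebUI-style syntax used only to show the weight).
test_passes = [
    {"lora": "<lora:qiqi:0.8>",        "prompt": "qiqi, 1girl, ..."},         # pass 1: original, weight 0.8
    {"lora": "<lora:qiqi-000018:1.0>", "prompt": "qiqi, 1girl, ..."},         # pass 2: weight 1, epoch 000018
    {"lora": "<lora:qiqi-000018:1.0>", "prompt": "ofuda, qiqi, 1girl, ..."},  # pass 3: "ofuda" prepended
]
```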
|
|
|
Please refer to the Excel file for details. |
|
|
|
In summary, I believe this phenomenon has several contributing causes.
|
|
|
1. 4dim by itself is hard to overfit; when it is also applied at a weight below 1, the model looks underfitted.
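
The intuition, roughly: the LoRA delta is scaled by alpha/rank and then again by the inference weight, so a rank-4 LoRA at weight 0.8 injects an even smaller update. This is a generic sketch of the standard LoRA formulation with made-up shapes, not this project's code.

```python
import numpy as np

# Generic LoRA forward sketch (not this project's code): the learned delta
# B @ A is scaled by alpha/rank, and the inference weight scales it again,
# so a rank-4 LoRA at weight 0.8 contributes a noticeably smaller update
# than the same LoRA at weight 1.
def lora_linear(x, W0, A, B, alpha=4, rank=4, weight=0.8):
    delta = (B @ A) * (alpha / rank)      # low-rank update learned by the LoRA
    return x @ (W0 + weight * delta).T    # inference weight scales the update again

rng = np.random.default_rng(0)
x  = rng.normal(size=(1, 16))
W0 = rng.normal(size=(8, 16))
A  = rng.normal(size=(4, 16))   # rank-4 down-projection
B  = rng.normal(size=(8, 4))    # rank-4 up-projection
print(lora_linear(x, W0, A, B, weight=0.8).shape)  # (1, 8)
```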
|
|
|
2. The number of epochs assigned to the 1000-2000 image range may be too small.
|
Without a regularization set, the best epoch is around 18/20, which is already at the edge of the range, and it is unclear whether that is a one-off outlier.
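
For context, a quick back-of-the-envelope step count. The repeats and batch size below are placeholders I picked for illustration, not the actual training configuration.

```python
# Back-of-the-envelope step count; repeats and batch_size are illustrative
# placeholders, not the real training settings.
def total_steps(num_images, epochs, repeats=1, batch_size=8):
    return num_images * repeats * epochs // batch_size

for imgs in (1000, 2000):
    for ep in (18, 20, 24, 26):
        print(f"{imgs} images, {ep} epochs -> {total_steps(imgs, ep)} steps")
```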
|
|
|
3. On top of 2, a large regularization dataset was added, which slows down fitting and requires more epochs. However, the empirical epoch formula was derived from training without a reg dataset.
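
One crude way to picture it: if part of each epoch's steps is spent on regularization images, the target concept gets fewer useful updates per epoch, so more epochs are needed to reach the same fit. The reg-share factor below is purely illustrative, not a measured property of the trainer.

```python
# Rough sketch of the reasoning only: if a fraction of each epoch's steps goes
# to regularization images, scale the epoch count up to compensate. The 0.5
# default is an illustrative assumption, not a measured value.
def adjusted_epochs(base_epochs, reg_share=0.5):
    return round(base_epochs / (1.0 - reg_share))

print(adjusted_epochs(18))        # 36 with an assumed 50% reg share
print(adjusted_epochs(18, 0.25))  # 24 with an assumed 25% reg share
```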
|
|
|
4.The "ofuda" label should be a core tag, but it was not deleted, so if the ofuda prompt is not used, the ccip score will decrease. |
|
|
|
The observed results are the combined effect of 1, 2, 3, and 4.
|
|
|
|
|
To address this issue, I suggest the following (summarized in the sketch after the list):
|
|
|
1. Change the testing weight to 1.
|
|
|
2. Increase the number of epochs for 1000-2000 images; consider raising it to 24 or even 26.
|
|
|
3. Further increase the number of epochs when regularization datasets are introduced. The exact amount needs testing (I have not used regularization datasets extensively, so I cannot be sure).
|
|
|
4. I don't think 4dim is insufficient, but if necessary, a slight increase to 6dim/8dim could be considered.
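
Putting the suggestions together, the revised settings would look roughly like this. The keys are placeholders for readability, not an actual config schema used by the project.

```python
# Combined sketch of the suggested changes (placeholder keys, not a real
# config schema).
suggested = {
    "test_lora_weight": 1.0,          # 1. test at weight 1 instead of 0.8
    "epochs_1000_2000_images": 24,    # 2. raise to 24, or even 26
    "epochs_with_reg_dataset": None,  # 3. higher still; exact value needs testing
    "network_dim": 4,                 # 4. keep 4dim; 6/8 only if it proves insufficient
}
```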
|
|