NebulaeWis committed on
Commit
5eb6f11
1 Parent(s): b42cc8e

Create readme

I have run some simple tests for qiqi, covering the following:

First, I added three LoRA training runs:
1. Same training set, regularization datasets removed.
2. Same training set, regularization datasets removed + dim changed to 32.
3. The qiqi LoRA I trained myself last time, as a control group (250 images, 1000 steps * 8 batch size, which is definitely enough training).

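As a back-of-envelope check of why the control group counts as "enough training", the steps-based run can be converted into effective epochs. This is just my own arithmetic sketch; the function name `effective_epochs` is mine, not from any training script:

```python
def effective_epochs(steps: int, batch_size: int, num_images: int) -> float:
    """Effective epochs = total images seen during training / dataset size."""
    return steps * batch_size / num_images

# Control group: 250 images, 1000 steps at batch size 8
print(effective_epochs(1000, 8, 250))  # -> 32.0
```

At roughly 32 passes over 250 images, the control group sits well past the epoch counts discussed below, which is why I treat it as fully trained.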
I first tested the original version with a weight of 0.8.
Then I re-tested epoch 000018 with the weight changed to 1.
Finally, I added "ofuda" at the beginning of the prompts.

Please refer to the Excel file for details.

In summary, I believe this phenomenon has multiple causes:

1. 4dim itself does not overfit easily. If a weight below 1 is used at the same time, the model will look underfitted.

2. The number of epochs assigned to the 1000-2000 image range may be too small. Without a regularization set, the best epoch may be 18/20, which is already at a critical point, and it is uncertain whether that is a one-off outlier.

3. On top of 2, a large number of regularization images were added, which slows down fitting and requires more epochs. However, the empirical epoch formula is based on training without regularization datasets.

4. The "ofuda" label should be a core tag, but it was not deleted, so when the ofuda prompt is not used, the CCIP score drops.

The results come from the combined effects of 1, 2, 3, and 4.

To address this issue, I suggest:

1. Change the testing weight to 1.

2. Increase the number of epochs for the 1000-2000 image range; consider raising it to 24 or even 26.

3. Further increase the number of epochs when regularization datasets are introduced. The exact amount needs to be tested (I have not used regularization training datasets extensively, so I cannot be sure).

4. I don't think 4dim is insufficient, but if necessary, a slight increase to 6dim/8dim could be considered.
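For suggestion 3, one heuristic would be to scale the epoch count by how much of each batch the regularization images take up, since character images then get proportionally fewer updates per epoch. This is my own assumption, not a verified formula, and the exact factor still needs the testing mentioned above:

```python
import math

def scaled_epochs(base_epochs: int, train_images: int, reg_images: int) -> int:
    """Scale the baseline epoch count by the ratio of total images
    (character + regularization) to character images alone."""
    scale = (train_images + reg_images) / train_images
    return math.ceil(base_epochs * scale)

# e.g. a baseline of 20 epochs, 1500 character images plus 1500 reg images
print(scaled_epochs(20, 1500, 1500))  # -> 40
```

With no regularization images the formula reduces to the baseline, which matches the observation that the existing empirical formula was derived without a regularization set.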