Inconsistent data formatting
#2, opened by XuyaoWang
While using the dataset, I found that the data format is not fully consistent.
- According to the example, the text dialogue is in ShareGPT format, but the entry with id "education_164" contains the key "instruction", which belongs to the Alpaca format. This is confusing.
- For the data with id "multi_concept_gen_64", the number of tags does not match the number of image paths.
These two entries are only examples. Many other entries show the same two problems.
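
In case it helps with triage, here is a rough sketch of how both problems could be scanned for across the whole dataset. It is only an illustration under assumptions: the field names `conversations` and `images`, the `<image>` placeholder, and the helper name `find_format_issues` are my guesses and not confirmed by the dataset card, so they are exposed as parameters to adjust.

```python
# Alpaca-style keys that should not appear in ShareGPT-style records.
ALPACA_KEYS = {"instruction", "input", "output"}


def find_format_issues(records, text_key="conversations",
                       image_key="images", image_tag="<image>"):
    """Return a list of (id, issue_name, detail) tuples.

    ASSUMPTION: each record is a dict with an "id", a list of turn dicts
    under `text_key` (each with a "value" string), and a list of image
    paths under `image_key`. These names are hypothetical.
    """
    issues = []
    for rec in records:
        rec_id = rec.get("id", "<no id>")
        turns = [t for t in (rec.get(text_key) or []) if isinstance(t, dict)]

        # Issue 1: Alpaca keys at the record level or inside any turn.
        found = set(ALPACA_KEYS.intersection(rec))
        for turn in turns:
            found |= ALPACA_KEYS.intersection(turn)
        if found:
            issues.append((rec_id, "alpaca_keys", sorted(found)))

        # Issue 2: number of image tags differs from number of image paths.
        tag_count = sum(str(t.get("value", "")).count(image_tag) for t in turns)
        image_count = len(rec.get(image_key) or [])
        if tag_count != image_count:
            issues.append((rec_id, "tag_image_mismatch", (tag_count, image_count)))
    return issues
```

Running this over the loaded records (for example a list produced by `json.load`) should give a list of affected ids per issue type, which would make it easier to see how widespread each problem is.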