Inconsistent data formatting

#2
by XuyaoWang - opened

When using the data, it was discovered that the data format is not completely uniform.

  1. According to the example, the text dialogue is in ShareGPT Format, but the data with id "education_164" contains the keyword "instruction", which is a keyword for Alpaca format data, causing confusion.
  2. For the data with id "multi_concept_gen_64", the number of tags does not match the number of image paths.

The above two are just examples. Many data entries exhibit these two situations.

Sign up or log in to comment