Getting DatasetGenerationError: An error occurred while generating the dataset
Please help me resolve the following error, which occurs while downloading the dataset:
/root/.cache/huggingface/modules/datasets_modules/datasets/common_voice/220833898d6a60c50f621126e51fb22eb2dfe5244392c70dccd8e6e2f055f4bf/common_voice.py:634: FutureWarning:
This version of the Common Voice dataset is deprecated.
You can download the latest one with
>>> load_dataset("mozilla-foundation/common_voice_11_0", "en")
warnings.warn(
Generating train split: 0% 0/2009 [00:00<?, ? examples/s]
ReadError Traceback (most recent call last)
/usr/local/lib/python3.10/dist-packages/datasets/builder.py in _prepare_split_single(self, gen_kwargs, fpath, file_format, max_shard_size, split_info, check_duplicate_keys, job_id)
1749 _time = time.time()
-> 1750 for key, record in generator:
1751 if max_shard_size is not None and writer._num_bytes > max_shard_size:
13 frames
ReadError: truncated header
The above exception was the direct cause of the following exception:
DatasetGenerationError Traceback (most recent call last)
/usr/local/lib/python3.10/dist-packages/datasets/builder.py in _prepare_split_single(self, gen_kwargs, fpath, file_format, max_shard_size, split_info, check_duplicate_keys, job_id)
1784 if isinstance(e, SchemaInferenceError) and e.context is not None:
1785 e = e.context
-> 1786 raise DatasetGenerationError("An error occurred while generating the dataset") from e
1787
1788 yield job_id, True, (total_num_examples, total_num_bytes, writer._features, num_shards, shard_lengths)
DatasetGenerationError: An error occurred while generating the dataset
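For context, `ReadError: truncated header` looks like the message Python's standard `tarfile` module raises when an archive ends before its first 512-byte header is complete, which would suggest that a downloaded archive in the cache is incomplete. I have not confirmed that this is the cause. Below is a minimal sketch for checking whether a given archive can be read to the end. The helper name `is_readable_tar` and the path passed to it are hypothetical and are not part of `datasets`.

```python
import tarfile


def is_readable_tar(archive_path):
    """Return True if every member header in the archive can be read.

    Hypothetical helper for illustration; not part of the datasets library.
    """
    try:
        # Mode "r" lets tarfile detect the compression (none, gzip, bz2, xz).
        with tarfile.open(archive_path) as archive:
            # Walking the members forces tarfile to parse each header in turn.
            for _member in archive:
                pass
        return True
    except (tarfile.ReadError, EOFError):
        # ReadError covers truncated or unrecognized archives;
        # EOFError can surface when a compressed stream is cut short.
        return False
```

Calling `is_readable_tar("/path/to/cached/archive.tar")` with the real location of the cached file (the path here is a placeholder) returns `False` for an archive that was cut off during download.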