"dedup" result errors when tokenizing
#471, opened by ZYSK-huggingface
Hi!
When tokenizing raw loom data with the parameter 'collapse_gene_ids' set to True, the code generates '_dedup' files. However, these '_dedup' files do not have an 'ensembl_id' column, which causes errors. Is this fixed in the latest version?
Thank you for your question. In these intermediate files the attribute is named "ensembl_id_collapsed" rather than "ensembl_id". You are welcome to rename the attribute, but we use the different name to mark files that have gone through the deduplication process.
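For anyone who does want to rename the attribute, here is a minimal sketch of one way to do it. It is not part of the tokenizer itself. The helper `rename_attribute` is a hypothetical name, and it assumes the attribute container supports dict-style access (`in`, item get/set, `del`), which I believe loompy's row-attribute manager (`ds.ra`) does. The demo below uses a plain dict with made-up values so it runs without any loom file.

```python
def rename_attribute(attrs, old="ensembl_id_collapsed", new="ensembl_id"):
    """Move attrs[old] to attrs[new] on any dict-like attribute container.

    Hypothetical helper for illustration only. It refuses to overwrite an
    existing attribute so an original 'ensembl_id' is never lost silently.
    """
    if old not in attrs:
        raise KeyError(f"attribute {old!r} not found")
    if new in attrs:
        raise ValueError(f"attribute {new!r} already exists")
    attrs[new] = attrs[old]  # copy the values under the new name
    del attrs[old]           # then drop the old name
    return attrs


# Stand-in for a loom file's row attributes (illustrative values only).
row_attrs = {"ensembl_id_collapsed": ["ENSG00000000003", "ENSG00000000005"]}
rename_attribute(row_attrs)
print(sorted(row_attrs))  # ['ensembl_id']

# Assumed usage with loompy (the path is a placeholder, not a real file):
# import loompy
# with loompy.connect("path/to/your_dedup_file.loom") as ds:
#     rename_attribute(ds.ra)
```

Note that renaming removes the marker that distinguishes deduplicated files, so it may be safer to do this on a copy of the file.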
ctheodoris changed discussion status to closed