"dedup" result errors when tokenizing

#471
by ZYSK-huggingface - opened

Hi!

When tokenizing raw loom data with the parameter 'collapse_gene_ids'=True, the code generates '_dedup' files. However, these '_dedup' files do not have an 'ensembl_id' column, which causes errors. Is this fixed in the latest version?

Thank you for your question. In these intermediate files, the attribute is named "ensembl_id_collapsed" rather than "ensembl_id". You are welcome to rename the attribute, but we use the different name to distinguish files that went through the deduplication process.
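For anyone whose downstream code expects "ensembl_id", here is a minimal sketch of the rename described above. It is not part of the tokenizer: the helper names `resolve_gene_id_key` and `with_original_key` are hypothetical, and a plain dict stands in for the gene-level attributes of a '_dedup' file, so the same idea would need to be adapted to however the file is actually loaded.

```python
from typing import Dict, Mapping, Sequence

# Attribute names as given in this thread; the helpers below are hypothetical.
ORIGINAL_KEY = "ensembl_id"
COLLAPSED_KEY = "ensembl_id_collapsed"


def resolve_gene_id_key(attrs: Mapping[str, Sequence[str]]) -> str:
    """Return the name of the attribute that holds the Ensembl gene IDs."""
    # Deduplicated intermediate files carry the collapsed name, so check it first.
    if COLLAPSED_KEY in attrs:
        return COLLAPSED_KEY
    if ORIGINAL_KEY in attrs:
        return ORIGINAL_KEY
    raise KeyError(
        f"neither {COLLAPSED_KEY!r} nor {ORIGINAL_KEY!r} found in attributes"
    )


def with_original_key(attrs: Mapping[str, Sequence[str]]) -> Dict[str, Sequence[str]]:
    """Return a copy of attrs with the gene IDs stored under 'ensembl_id'."""
    renamed = dict(attrs)  # shallow copy; the input mapping is left untouched
    if COLLAPSED_KEY in renamed:
        # The collapsed IDs replace any existing 'ensembl_id' entry.
        renamed[ORIGINAL_KEY] = renamed.pop(COLLAPSED_KEY)
    return renamed


# Stand-in for the gene-level attributes of a '_dedup' file (made-up example IDs).
dedup_attrs = {"ensembl_id_collapsed": ["ENSG00000000003", "ENSG00000000005"]}
key = resolve_gene_id_key(dedup_attrs)
fixed = with_original_key(dedup_attrs)
```

Note that renaming removes the marker that identifies a file as deduplicated, so looking up whichever name is present (as `resolve_gene_id_key` does) may be the safer choice.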

ctheodoris changed discussion status to closed
