Hi again,
I have been thinking again about the ERCC spike-ins. Have you tested how spike-ins behave in any of the datasets for the paper? It might be more natural to treat them separately: since they are non-biological, they cannot be assumed to follow the same distribution as the endogenous transcripts. On the other hand, they also need to be transformed to be usable downstream, and because they are so few compared to the genes, they might not have a major impact, even though they are on average much more highly expressed than a typical transcript. What would you recommend? For now, I will compute qUMI counts for the ERCC spike-ins with all other transcripts included, and then exclude the spike-ins when calculating the qUMIs for the true transcripts.
Best regards
Jakob
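
P.S. A minimal sketch of the two-pass plan described above, assuming a genes-by-cells count matrix and spike-ins identified by an "ERCC-" name prefix. The names `two_pass_transform` and `relative_abundance` are hypothetical placeholders: `relative_abundance` is only a stand-in so the example runs, and the actual qUMI normalization would be passed in as `transform` instead.

```python
import numpy as np


def two_pass_transform(counts, feature_names, transform, spike_prefix="ERCC-"):
    """Apply `transform` twice: once on all features, once without spike-ins.

    counts        : 2-D array, features (rows) x cells (columns)
    feature_names : one name per row of `counts`
    transform     : callable taking and returning a features x cells array
                    (assumption: a per-cell normalization such as qUMI)
    spike_prefix  : assumption that spike-in names start with this prefix
    """
    names = np.asarray(feature_names, dtype=str)
    is_spike = np.char.startswith(names, spike_prefix)

    # Pass 1: transform everything together, keep only the spike-in rows.
    spike_values = transform(counts)[is_spike]

    # Pass 2: drop the spike-ins first, then transform the true transcripts.
    gene_values = transform(counts[~is_spike])

    return spike_values, gene_values


def relative_abundance(counts):
    """Stand-in transform (NOT qUMI): per-cell fraction of total counts."""
    return counts / counts.sum(axis=0, keepdims=True)


# Toy data: two genes and one spike-in, measured in two cells.
counts = np.array([[10.0, 0.0],
                   [5.0, 5.0],
                   [85.0, 95.0]])
feature_names = ["GeneA", "GeneB", "ERCC-00002"]

spike_values, gene_values = two_pass_transform(
    counts, feature_names, relative_abundance
)
```

With this stand-in, the spike-in values come from the pass that includes every feature, while the gene values are computed as if the spike-ins had never been in the matrix, which is the separation I am proposing.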