I want to use dna2vec on the E. coli genome.
When I set 2<=k<=8, I got a dimension of (86479,100).
When I set 3<=k<=8, I got (86614,100), but the correct dimension should be (87360,100), since $87360=4^3+4^4+4^5+4^6+4^7+4^8$ and $87360+16=4^2+4^3+4^4+4^5+4^6+4^7+4^8$.
So I don't know why I got 2 different results, neither of which matches the expected dimension.
I also checked each k from 2 to 8 separately and found that the dimension is correct for k from 2 to 7.
However, for k=8, the dimension is (64450,100) rather than (65536,100), and $65536-64450 \neq 87360-86614$.
This is frustrating! None of these differences match up.
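
For reference, here is a minimal sketch of the arithmetic I am using for the expected row counts. It assumes that every possible k-mer over A/C/G/T should appear in the vocabulary, and that the second run covers k = 3 through 8 inclusive. The helper name `expected_rows` is my own and is not part of dna2vec.

```python
def expected_rows(k_low, k_high):
    """Count all possible k-mers over {A, C, G, T} for k_low <= k <= k_high."""
    return sum(4 ** k for k in range(k_low, k_high + 1))


# Row counts I observed, keyed by the (k_low, k_high) range of each run.
observed = {(2, 8): 86479, (3, 8): 86614, (8, 8): 64450}

for (k_low, k_high), rows in observed.items():
    full = expected_rows(k_low, k_high)
    print(f"{k_low}<=k<={k_high}: expected {full}, observed {rows}, missing {full - rows}")

# 2<=k<=8: expected 87376, observed 86479, missing 897
# 3<=k<=8: expected 87360, observed 86614, missing 746
# 8<=k<=8: expected 65536, observed 64450, missing 1086
```

The three "missing" counts (897, 746, 1086) all differ, which is the mismatch I cannot explain.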