Description
GraphBin fails to recognize any contigs in the assembly graph I produced from MEGAHIT output. Since MEGAHIT does not output a .gfa file directly, I used megahit_toolkit to convert the intermediate contigs to .fastg format. However, GraphBin reports 0 contigs and 0 edges, followed by an "Unexpected 'k141_126935'" error.
It seems the IDs in the generated .fastg do not match the IDs in the final .fa file.
Commands and steps executed
Convert to FASTG:
/path/to/megahit_toolkit contig2fastg 141 intermediate_contigs/k141.contigs.fa > k141.fastg
Run GraphBin:
graphbin --assembler megahit --graph k141.fastg --contigs final.contigs.fa --binned cluster.csv --output graphbin
Error Log
2026-01-20 16:09:37,988 - INFO - Total number of contigs available: 0
2026-01-20 16:09:38,066 - INFO - Total number of edges in the assembly graph: 0
...
2026-01-20 16:09:38,071 - ERROR - Unexpected 'k141_126935'
2026-01-20 16:09:38,071 - ERROR - Please make sure that you have provided the correct assembler type and the correct path to the binning result file in the correct format.
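To see which binned IDs cannot be found among the graph's sequence headers, a rough check along the following lines may help. It is only a sketch: `missing_binned_ids` is a hypothetical helper, it assumes cluster.csv is comma-separated with the contig ID in the first column, and it assumes IDs are compared verbatim against the first whitespace-delimited token of each `>` header. GraphBin's own parsing may differ.

```shell
# Hypothetical helper (not part of GraphBin or MEGAHIT).
# Usage: missing_binned_ids <binned.csv> <graph-or-contigs file>
# Prints every ID from column 1 of the CSV that does not appear as a
# header ID (text after '>' up to the first whitespace) in the second file.
missing_binned_ids() {
    awk -F',' '
        FILENAME == ARGV[1] {
            # First awk argument: the FASTA/FASTG file. Collect header IDs.
            if ($0 ~ /^>/) {
                id = substr($0, 2)
                sub(/[[:space:]].*$/, "", id)
                seen[id] = 1
            }
            next
        }
        # Second awk argument: the CSV. Report IDs that were never seen.
        NF > 0 && !($1 in seen) { print $1 }
    ' "$2" "$1"
}

# Example with the file names from the commands above:
#   missing_binned_ids cluster.csv k141.fastg | head
#   missing_binned_ids cluster.csv final.contigs.fa | head
```

If the first call lists nearly every binned ID while the second lists none, that would be consistent with the graph file using a different naming scheme than the binning result.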
My Findings
After manual inspection, the contig IDs in k141.fastg (generated by the toolkit) do not correspond to the IDs in final.contigs.fa. For example:
In .fa: >k141_126935
In .fastg: [Describe what it looks like, e.g., it might have extra information or different naming]
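The manual inspection can be reproduced roughly as follows. This is a minimal sketch: `list_ids` and `count_shared_ids` are hypothetical helpers, and they treat the first whitespace-delimited token after `>` as the ID, which is an assumption about both files rather than a documented format.

```shell
# Hypothetical helper: print the ID of every header line in a FASTA/FASTG
# file (first token after '>'), sorted and de-duplicated.
list_ids() {
    awk '/^>/ { sub(/^>/, "", $1); print $1 }' "$1" | sort -u
}

# Hypothetical helper: count the IDs that appear verbatim in both files.
count_shared_ids() {
    ids_a=$(mktemp)
    ids_b=$(mktemp)
    list_ids "$1" > "$ids_a"
    list_ids "$2" > "$ids_b"
    # comm -12 keeps only lines common to both sorted inputs.
    comm -12 "$ids_a" "$ids_b" | wc -l | tr -d ' '
    rm -f "$ids_a" "$ids_b"
}

# Example with the file names from the commands above:
#   list_ids final.contigs.fa | head -n 3
#   list_ids k141.fastg | head -n 3
#   count_shared_ids final.contigs.fa k141.fastg
```

A shared count of 0 would confirm that no header in the .fastg matches an ID in the .fa verbatim.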
Questions
What is the recommended workflow for using MEGAHIT output with GraphBin, given that MEGAHIT doesn't produce GFA?
Is megahit_toolkit contig2fastg the correct way to prepare the graph for GraphBin?
How should I resolve the ID mismatch between the assembly graph and the contigs file?