inner -> outer join with imputation #5

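Right now the paired counts are inner joined (the default behavior of `merge`), so any sgRNA that is missing from either the initial or the final counts is dropped from the output.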

https://github.com/sheltzer-lab/crispr-screening/blob/1a6f8c1cbe94433e4abfc02d47247ba92c21ade4/bin/extract-reads.py#L75-L85

However, the downstream analysis (i.e., MAGeCK mle) should be able to handle the merged counts properly if we do a full (outer) join and then impute the missing counts with 1s.

	finalDf = initialDf.merge(
	    finalDf, on=["id", "gene"], suffixes=["_initial", "_final"]
	).rename(
	    columns={
	        "id": "sgRNA",
	        "gene": "Gene",
	        "count_initial": sample + ".initial",
	        "count_final": sample + ".final",
	    }
	)
	finalDf.to_csv("count-" + sample + "-i.f.csv", index=False)
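
A minimal sketch of what the proposed change could look like, assuming both input frames carry `id`, `gene`, and `count` columns as the snippet above implies. The `merge_counts` helper and the demo data are hypothetical and not part of the repository; the only substantive changes relative to the current code are `how="outer"` and the `fillna(1)` step.

```python
import pandas as pd


def merge_counts(initial_df: pd.DataFrame, final_df: pd.DataFrame, sample: str) -> pd.DataFrame:
    """Outer-join paired sgRNA counts and impute missing counts with 1.

    Hypothetical helper for illustration; column names follow the snippet above.
    """
    count_cols = [sample + ".initial", sample + ".final"]
    merged = initial_df.merge(
        final_df,
        on=["id", "gene"],
        how="outer",  # keep sgRNAs that appear in only one of the two samples
        suffixes=("_initial", "_final"),
    ).rename(
        columns={
            "id": "sgRNA",
            "gene": "Gene",
            "count_initial": count_cols[0],
            "count_final": count_cols[1],
        }
    )
    # After the outer join, NaN marks an sgRNA absent from one side.
    # Impute those with 1 and cast back to integers.
    merged[count_cols] = merged[count_cols].fillna(1).astype(int)
    return merged


# Made-up demo data: sg1 is only in the initial sample, sg3 only in the final one.
initial_df = pd.DataFrame({"id": ["sg1", "sg2"], "gene": ["A", "B"], "count": [10, 20]})
final_df = pd.DataFrame({"id": ["sg2", "sg3"], "gene": ["B", "C"], "count": [5, 7]})
merged = merge_counts(initial_df, final_df, "demo")
```

With an inner join only `sg2` would survive here; with the outer join all three guides are kept, and the missing side of `sg1` and `sg3` is filled with 1. The cast back to `int` is there because introducing NaN during the join promotes the count columns to floats.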
