Reece's figures and tables#15

Open: reeceaa wants to merge 89 commits into main from reece_analysis

reeceaa (Collaborator) commented May 18, 2026

Creation of table S1:
https://github.com/PlatigLab/ENCODE-RNA-Binding-Protein-Network-Modeling/blob/reece_analysis/analysis/07_reece_biological_validations/1_Crispr_CTRL_SHAP/Analyses/supp_table_S1.ipynb

Creation of Figure 2B
https://github.com/PlatigLab/ENCODE-RNA-Binding-Protein-Network-Modeling/blob/reece_analysis/analysis/07_reece_biological_validations/1_Crispr_CTRL_SHAP/Figures/figure_2_B_generation.ipynb

CRISPR analysis: HepG2 example
Slurm scripts:
https://github.com/PlatigLab/ENCODE-RNA-Binding-Protein-Network-Modeling/blob/reece_analysis/analysis/07_reece_biological_validations/1_Crispr_CTRL_SHAP/HepG2/preprocess_HepG2.sh

https://github.com/PlatigLab/ENCODE-RNA-Binding-Protein-Network-Modeling/blob/reece_analysis/analysis/07_reece_biological_validations/1_Crispr_CTRL_SHAP/HepG2/HepG2_slurm.sh

Precompute the BAT -> https://github.com/PlatigLab/ENCODE-RNA-Binding-Protein-Network-Modeling/blob/reece_analysis/analysis/07_reece_biological_validations/1_Crispr_CTRL_SHAP/HepG2/HepG2_precompute_rmats_matches.py

Then run this to get the SHAP and dPSI results -> https://github.com/PlatigLab/ENCODE-RNA-Binding-Protein-Network-Modeling/blob/reece_analysis/analysis/07_reece_biological_validations/1_Crispr_CTRL_SHAP/HepG2/HepG2_Crispr_KD.py

Creation of Figure 4B
https://github.com/PlatigLab/ENCODE-RNA-Binding-Protein-Network-Modeling/blob/reece_analysis/analysis/07_reece_biological_validations/2_RBP_Activity/figure_4_B.ipynb

Creation of Figure 4C
https://github.com/PlatigLab/ENCODE-RNA-Binding-Protein-Network-Modeling/blob/reece_analysis/analysis/07_reece_biological_validations/2_RBP_Activity/figure_4_C.ipynb

Creation of Supp. Fig S4
https://github.com/PlatigLab/ENCODE-RNA-Binding-Protein-Network-Modeling/blob/reece_analysis/analysis/07_reece_biological_validations/2_RBP_Activity/S4_RBP_activity_counts.ipynb

reeceaa closed this May 19, 2026
reeceaa reopened this May 19, 2026
reeceaa changed the title from "Reece analysis" to "Reece's figures and tables" May 19, 2026

YogiOnBioinformatics (Member) commented May 20, 2026

Across Repo:

  • Replace Crispr with CRISPR in all cases.
  • Do not split things by cell line unless absolutely necessary.
    • Consolidate them into one script or notebook and just iterate across cell lines.
  • Remove the INTERACTION_DATA_PATH.txt and DATA_PATH.txt files from the top-level folder of the repo.
    • Replace those paths with the relative paths you need within each of your scripts.
  • Save all figures that you make as PDFs.
    • Run the code below before plt.show():
    • plt.savefig(f"{figure_name}.pdf", bbox_inches='tight')
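
As a rough sketch of the consolidation and relative-path points above: a single loop over cell lines that builds every path relative to one base folder. The cell line list, the `data`/`figures` subfolders, the file names, and the names `ANALYSIS_DIR`, `CELL_LINES`, and `paths_for` are assumptions for illustration, not taken from the repo.

```python
from pathlib import Path

# Hypothetical base folder. In a real script this could instead be
# Path(__file__).resolve().parent, so no top-level DATA_PATH.txt is needed.
ANALYSIS_DIR = Path("analysis") / "07_reece_biological_validations"

# Assumed list of cell lines to iterate over.
CELL_LINES = ["HepG2", "K562"]


def paths_for(cell_line: str, base: Path = ANALYSIS_DIR) -> dict:
    """Return the (hypothetical) input and figure paths for one cell line."""
    return {
        "input": base / "data" / f"{cell_line}_input.csv",
        "figure": base / "figures" / f"{cell_line}_CRISPR_dPSI_vs_CTRL_SHAP.pdf",
    }


# One script and one loop, instead of one script per cell line.
all_paths = {cell_line: paths_for(cell_line) for cell_line in CELL_LINES}
for cell_line, paths in all_paths.items():
    print(cell_line, paths["input"], paths["figure"])
```

Adding a cell line then only means extending `CELL_LINES`, rather than copying a per-cell-line script.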

Specific Portions:

Figure 2B

  • Use vertical_relaxed instead of diagonal_relaxed.
  • Do not write the index column when outputting the final file.
  • Name the final MCC calculation file CRISPR_dPSI_vs_CTRL_SHAP_MCC.csv.
    • The file that’s currently named this contains all the data, not the summary table.
  • The significance threshold you wrote in the paper is wrong: it is less than or equal to, but you wrote less than.
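
To make the threshold point concrete, here is a minimal sketch of how the two comparisons differ for a value sitting exactly on the cutoff. The cutoff of 0.05 and the p-values are made up for illustration; the actual threshold is not stated in this review.

```python
# Assumed cutoff for illustration only.
ALPHA = 0.05

# Made-up p-values, including one exactly on the cutoff.
p_values = [0.01, 0.05, 0.20]

strict = [p for p in p_values if p < ALPHA]      # "less than"
inclusive = [p for p in p_values if p <= ALPHA]  # "less than or equal to"

print(strict)     # [0.01]
print(inclusive)  # [0.01, 0.05]
```

The two only disagree on boundary values, which is why the wording in the paper should match whichever comparison the code uses.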

CRISPR_KD.py

  • For the sake of accuracy and precision, update line 142 so the string is _CTRL- (with the leading underscore and trailing hyphen).
  • NO CHANGES NEEDED BUT IMPORTANT TO NOTE:
    • It seems harder and potentially trickier to precompute all the control row tables for each RBP when you could just build the ctrl row reference once. (Unless I’m missing something.)
  • It does not look like you do the sorting, but I remember you fixed that. Can you point me to it?
  • Your write_result path input does not seem to include the _individual_results folder, which the later concatenation step reads from. Am I mistaken?
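
A small sketch of the two points above: why matching on `_CTRL-` is more precise than matching on a bare `CTRL`, and how the writer and the concatenation step could share one output folder. The sample names, the `results` folder, the `RESULTS_DIR` constant, and the `result_path` helper are hypothetical; only `_CTRL-` and `_individual_results` come from the comments above.

```python
from pathlib import Path

# Hypothetical sample names; the real naming scheme may differ.
# "CTRLX" is a made-up target whose name happens to contain "CTRL".
samples = ["HepG2_CTRL-1", "HepG2_PTBP1_KD-1", "HepG2_CTRLX_KD-1"]

loose = [s for s in samples if "CTRL" in s]      # also catches the CTRLX knockdown
precise = [s for s in samples if "_CTRL-" in s]  # only the true control sample

# A single shared constant keeps the writer and the concatenation step in sync.
RESULTS_DIR = Path("results") / "_individual_results"


def result_path(rbp: str, results_dir: Path = RESULTS_DIR) -> Path:
    """Path a per-RBP result would be written to and later read back from."""
    return results_dir / f"{rbp}.csv"


print(loose)
print(precise)
print(result_path("PTBP1"))
```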

Figure 4B

  • Potential bug: when filtering on Sample Name you need to match _KD-, or you need to filter on RBP_KD_Target != "CTRL".

  • The significance threshold you wrote in the paper is wrong: it is less than or equal to, but you wrote less than.

  • Tricky and to discuss in-person: you use in-silico KD rows but don’t check for those cases where we had a control as well.

  • You need to sort a final dataframe with the output for the Activity score calculated because I need to use that dataframe to create a supplementary table that also includes that metric.

  • For the line that goes:

        # Ensure identical RBP order
        hepg2_data = hepg2_data.reindex(k562_data.index)
    • Is this line necessary given the lines above it?
  • Leave a massive comment where you set the y ticks and labels manually to make it clear that this is dangerous and may need to be modified in the future.

  • Remove the “RBP” x-axis label because John removed it for my heatmap, so we’ll stick with that.

  • You are saving two different PDFs for the figure at the end which doesn’t make sense. Keep the more descriptive name of the two.

  • There’s no need to return axes when you don’t use that later.
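
For the filtering point at the top of this list, a minimal sketch of the two suggested filters, using plain dictionaries as a stand-in for the dataframe. The rows are invented for illustration; only the column names `Sample Name` and `RBP_KD_Target` and the strings `_KD-` and `CTRL` come from the comment.

```python
# Hypothetical rows standing in for a dataframe with these two columns.
rows = [
    {"Sample Name": "K562_CTRL-1", "RBP_KD_Target": "CTRL"},
    {"Sample Name": "K562_PTBP1_KD-1", "RBP_KD_Target": "PTBP1"},
    {"Sample Name": "K562_HNRNPK_KD-2", "RBP_KD_Target": "HNRNPK"},
]

# Option 1: keep rows whose sample name contains the knockdown marker.
by_name = [r for r in rows if "_KD-" in r["Sample Name"]]

# Option 2: keep rows whose target is anything other than the control.
by_target = [r for r in rows if r["RBP_KD_Target"] != "CTRL"]

print([r["RBP_KD_Target"] for r in by_name])
print(by_name == by_target)
```

On a pandas dataframe `df`, the equivalents would presumably be `df[df["Sample Name"].str.contains("_KD-", regex=False)]` and `df[df["RBP_KD_Target"] != "CTRL"]`.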

Figure 4C

  • Potential bug: when filtering on Sample Name you need to match _KD-, or you need to filter on RBP_KD_Target != "CTRL".
  • The significance threshold you wrote in the paper is wrong: it is less than or equal to, but you wrote less than.
  • Tricky and to discuss in-person: you use in-silico KD rows but don’t check for those cases where we had a control as well.
  • Remove all the code that recalculates the Activity score and load the score from the saved file instead.
  • Avoid getting the glossary_data using globals() and pass it into the function or access it based on its path (whichever is easier).
  • Do not set the x limit manually since the data range can vary; remove the plt.xlim(-1.5, 1.3) line.
  • Bold the x and y axis titles and move the x and y axis titles ever so slightly closer to the plot.
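
A sketch of the `glossary_data` suggestion above: pass the glossary into the function explicitly instead of reaching for it through `globals()`. The dictionary shape, its contents, and the `label_for` helper are assumptions; the real `glossary_data` may be a dataframe or something else.

```python
def label_for(rbp: str, glossary_data: dict) -> str:
    """Look up a display label from an explicitly passed glossary.

    Falls back to the RBP name itself when there is no glossary entry.
    """
    return glossary_data.get(rbp, rbp)


# Hypothetical glossary contents.
glossary_data = {"PTBP1": "PTBP1 (example label)"}

labels = [label_for(rbp, glossary_data) for rbp in ["PTBP1", "HNRNPK"]]
print(labels)
```

With the glossary as a parameter, the function no longer depends on a notebook-level variable happening to exist when it is called.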
