
DEXSeq_comparison.R -- request cleaner numeric formats in dpsi.csv #20

@SolKatzman

junctionCounts version 1.0.0

The dpsi.csv output file from DEXSeq_comparison.R appears to be written with R's default write.csv() function.

Unfortunately, for numerical fields, this has some drawbacks for human readers:

  1. excessive (and variable) digits to the right of the decimal point
  2. mix of exponential and floating point formats for the same field depending on value

I note that the format of outputs in the psi.tsv files from junctionCounts.py is much nicer.

I propose the following C-style formats for the fields in dpsi.csv:

%s    event_id       
%4.3f dpsi           
%4.3e event_qval     
%4.3f cond_mean_psi 
%d    cond_mean_ijc 
%d    cond_mean_ejc 
%s    event_type     
%s    chr            
%d    start          
%d    end            
%s    strand         
%s    gene
%d    sig
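
To make the proposal concrete, here is a minimal Python sketch of applying these formats to raw field values. It is an illustration only, not the code in DEXSeq_comparison.R. The `FORMATS` mapping and the `format_field` helper are hypothetical names, "cond" is taken to stand for the actual condition name, and two behaviours are my own assumptions: `%d` fields are rounded to the nearest integer before printing, and values that do not parse as numbers (such as R's `NA`) are passed through unchanged.

```python
# Hypothetical mapping of dpsi.csv fields to the C-style formats proposed above.
FORMATS = {
    "event_id": "%s",
    "dpsi": "%4.3f",
    "event_qval": "%4.3e",
    "cond_mean_psi": "%4.3f",
    "cond_mean_ijc": "%d",
    "cond_mean_ejc": "%d",
    "event_type": "%s",
    "chr": "%s",
    "start": "%d",
    "end": "%d",
    "strand": "%s",
    "gene": "%s",
    "sig": "%d",
}


def format_field(name, raw):
    """Format one raw text value according to the format of its field."""
    fmt = FORMATS.get(name, "%s")  # unknown fields are treated as strings
    if fmt == "%s":
        return raw
    try:
        value = float(raw)
    except ValueError:
        # Assumption: non-numeric text such as "NA" is left as-is.
        return raw
    if fmt == "%d":
        # Assumption: round to the nearest integer rather than truncate.
        return fmt % int(round(value))
    return fmt % value


if __name__ == "__main__":
    for name, raw in [("dpsi", "0.123456789"), ("event_qval", "0.00012"),
                      ("start", "1.5e3"), ("gene", "NA")]:
        print(name, format_field(name, raw))
```

With these formats a given field always uses the same notation and the same number of digits, whatever its value.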

I have implemented this after converting the dpsi.csv file to a dpsi.tsv file. I also added an "abs_dpsi" field to make it easy to sort on the absolute value of dpsi, and I rearranged the order of the fields, putting "gene" at the end because it is inherently variable length. This makes everything more readable. Finally, I sorted the lines by qval (primary) and abs_dpsi (secondary), because that is of most interest to the biologists. (I did not include the sig field, so that a user can decide later what parameters to use to determine significance.)
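
The conversion described above could look roughly like the following Python sketch. It is a reconstruction under assumptions, not the script actually used: the function names are hypothetical, the exact column order (abs_dpsi placed right after dpsi, gene last) is a guess, larger absolute dpsi is assumed to sort first within equal qval, and rows with a missing (`NA`) qval are assumed to sort last.

```python
import csv
import io
import math


def to_float(text, missing):
    """Parse a numeric field; unparseable text such as "NA" maps to `missing`."""
    try:
        value = float(text)
    except ValueError:
        return missing
    return missing if math.isnan(value) else value


def csv_to_sorted_tsv(csv_text):
    """Convert dpsi.csv text to TSV text: add abs_dpsi, reorder, sort, drop sig."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    fields = reader.fieldnames or []

    for row in rows:
        dpsi = to_float(row["dpsi"], None)
        row["abs_dpsi"] = "NA" if dpsi is None else "%4.3f" % abs(dpsi)

    # Primary key: qval ascending (NA last). Secondary key: |dpsi| descending.
    rows.sort(key=lambda r: (to_float(r["event_qval"], math.inf),
                             -abs(to_float(r["dpsi"], 0.0))))

    # Assumed column order: abs_dpsi after dpsi, gene last, sig omitted.
    columns = []
    for name in fields:
        if name in ("gene", "sig"):
            continue
        columns.append(name)
        if name == "dpsi":
            columns.append("abs_dpsi")
    if "gene" in fields:
        columns.append("gene")

    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=columns, delimiter="\t",
                            lineterminator="\n", extrasaction="ignore")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


if __name__ == "__main__":
    sample = ("event_id,dpsi,event_qval,gene,sig\n"
              "A,0.1,0.05,G1,0\n"
              "B,-0.4,0.001,G2,1\n")
    print(csv_to_sorted_tsv(sample), end="")
```

Keeping the variable-length gene field in the last column means the fixed-width numeric columns line up when the file is viewed in a terminal or spreadsheet.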

Here is a sample:

SE_control_u2muca_dpsi.tsv

Sol Katzman
UC Santa Cruz Genomics Institute
