Commit 0ed2508

Published multiple files
Author: Quartz Syncer
1 parent c9df7c0 commit 0ed2508

1 file changed

Lines changed: 45 additions & 46 deletions

content/dscode.md

@@ -31,50 +31,49 @@ However, they do not address the single or double strandedness of DNA.
3131

3232
The dscode alphabet is a superset of the IUPAC alphabet. The symbols take on a different meaning, as each symbol represents a base pair (a base in one DNA strand and its complementary base on the other strand) instead of a single base.
3333

34-
The alphabet uses an additional ten symbols to represent single stranded regions where there is no complementary base (see table below). Dscode remains 100% backward compatible with the IUPAC alphabet. [link](https://docs.google.com/document/d/1QAjGeCByWemjVZnva7ap3sg7e8wIwVUK4-pIjt4GIZ0/edit?usp=sharing)
35-
36-
| Alphabet | Symbol | Complement | Bases | dsIUPAC extended meaning |
37-
| ---------- | ------ | ---------- | ------------------------------------------- | ------------------------ |
38-
| IUPAC | G | C | G | G/C |
39-
| " | A | T | A | A/T |
40-
| " | T | A | T | T/A |
41-
| " | C | G | C | C/G |
42-
| " | R | Y | G or A | R/Y |
43-
| " | Y | R | T or C | Y/R |
44-
| " | M | K | A or C | M/K |
45-
| " | K | M | G or T | K/M |
46-
| " | S | S | G or C | S/S |
47-
| " | W | W | A or T | W/W |
48-
| " | H | D | A or C or T | H/D |
49-
| " | B | V | G or T or C | B/V |
50-
| " | V | B | G or C or A | V/B |
51-
| " | D | H | G or A or T | D/H |
52-
| " | N | N | G or A or T or C | N/N |
53-
| **dscode** | U | O | U in top strand, A in complementary strand | U/A |
54-
| " | O | U | A in top strand, U in complementary strand | A/U |
55-
| " | E | F | A in top strand, complementary strand empty | A/◻ |
56-
| **"** | I | J | C " | C/◻ |
57-
| **"** | P | Q | G " | G/◻ |
58-
| **"** | X | Z | T " | T/◻ |
59-
| **"** | Z | X | A in complementary strand, top strand empty | ◻/A |
60-
| **"** | Q | P | C " | ◻/C |
61-
| **"** | J | I | G " | ◻/G |
62-
| **"** | F | E | T " | ◻/T |
63-
| " | ! | A | A in upper strand A in lower strand | A/A |
64-
| " | # | C | A in upper strand C in lower strand | A/C |
65-
| **"** | $ | G | A in upper strand G in lower strand | A/G |
66-
| **"** | % | A | C in upper strand A in lower strand | C/A |
67-
| **"** | & | C | C in upper strand C in lower strand | C/C |
68-
| " | * | T | C in upper strand T in lower strand | C/T |
69-
| " | ( | A | G in upper strand A in lower strand | G/A |
70-
| **"** | ) | G | G in upper strand G in lower strand | G/G |
71-
| **"** | < | T | G in upper strand T in lower strand | G/T |
72-
| **"** | > | C | T in upper strand C in lower strand | T/C |
73-
| **"** | @ | G | T in upper strand G in lower strand | T/G |
74-
| **"** | : | T | T in upper strand T in lower strand | T/T |
75-
| **"** | ? | G | U in upper strand G in lower strand | U/G |
76-
| **"** | [ | C | U in upper strand C in lower strand | U/C |
77-
| **"** | ] | T | U in upper strand T in lower strand | U/T |
34+
35+
| Alphabet | Symbol | Complement | Bases | dscode meaning |
36+
| ---------- | ------ | ---------- | ------------------------------------------- | -------------- |
37+
| IUPAC | G | C | G | G/C |
38+
| " | A | T | A | A/T |
39+
| " | T | A | T | T/A |
40+
| " | C | G | C | C/G |
41+
| " | R | Y | G or A | R/Y |
42+
| " | Y | R | T or C | Y/R |
43+
| " | M | K | A or C | M/K |
44+
| " | K | M | G or T | K/M |
45+
| " | S | S | G or C | S/S |
46+
| " | W | W | A or T | W/W |
47+
| " | H | D | A or C or T | H/D |
48+
| " | B | V | G or T or C | B/V |
49+
| " | V | B | G or C or A | V/B |
50+
| " | D | H | G or A or T | D/H |
51+
| " | N | N | G or A or T or C | N/N |
52+
| **dscode** | U | O | U in top strand, A in complementary strand | U/A |
53+
| " | O | U | A in top strand, U in complementary strand | A/U |
54+
| " | E | F | A in top strand, complementary strand empty | A/◻ |
55+
| **"** | I | J | C " | C/◻ |
56+
| **"** | P | Q | G " | G/◻ |
57+
| **"** | X | Z | T " | T/◻ |
58+
| **"** | Z | X | A in complementary strand, top strand empty | ◻/A |
59+
| **"** | Q | P | C " | ◻/C |
60+
| **"** | J | I | G " | ◻/G |
61+
| **"** | F | E | T " | ◻/T |
62+
| " | ! | A | A in upper strand A in lower strand | A/A |
63+
| " | # | C | A in upper strand C in lower strand | A/C |
64+
| **"** | $ | G | A in upper strand G in lower strand | A/G |
65+
| **"** | % | A | C in upper strand A in lower strand | C/A |
66+
| **"** | & | C | C in upper strand C in lower strand | C/C |
67+
| " | * | T | C in upper strand T in lower strand | C/T |
68+
| " | ( | A | G in upper strand A in lower strand | G/A |
69+
| **"** | ) | G | G in upper strand G in lower strand | G/G |
70+
| **"** | < | T | G in upper strand T in lower strand | G/T |
71+
| **"** | > | C | T in upper strand C in lower strand | T/C |
72+
| **"** | @ | G | T in upper strand G in lower strand | T/G |
73+
| **"** | : | T | T in upper strand T in lower strand | T/T |
74+
| **"** | ? | G | U in upper strand G in lower strand | U/G |
75+
| **"** | [ | C | U in upper strand C in lower strand | U/C |
76+
| **"** | ] | T | U in upper strand T in lower strand | U/T |
7877

7978
The symbols PEXI and QFZJ, which are not used by the extended IUPAC alphabet, were adopted to denote single-stranded DNA on either
8079
strand where no complementary base exists.
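
The table can be read as a lookup from each symbol to a (top base, bottom base) pair. The following is a minimal sketch of that reading, not part of dscode itself: the names `PAIRS` and `expand` are hypothetical, `-` is used here as an assumed placeholder for an empty position, and carrying lowercase symbols through to lowercase bases is an assumption based on the ad-hoc example in this document. Only the unambiguous and single-stranded symbols are included.

```python
# Symbol -> (top-strand base, bottom-strand base), taken from the table.
# "-" marks an empty position (an assumption made for readability).
PAIRS = {
    "G": ("G", "C"), "A": ("A", "T"), "T": ("T", "A"), "C": ("C", "G"),
    "U": ("U", "A"), "O": ("A", "U"),
    "E": ("A", "-"), "I": ("C", "-"), "P": ("G", "-"), "X": ("T", "-"),
    "Z": ("-", "A"), "Q": ("-", "C"), "J": ("-", "G"), "F": ("-", "T"),
}


def expand(dscode):
    """Return (top, bottom) strands for a dscode string, read left to right."""
    top, bottom = [], []
    for symbol in dscode:
        upper, lower = PAIRS[symbol.upper()]
        if symbol.islower():
            # Keep the case of the input symbol (assumed convention).
            upper, lower = upper.lower(), lower.lower()
        top.append(upper)
        bottom.append(lower)
    return "".join(top), "".join(bottom)


top, bottom = expand("PEXIaUaOaQFZJ")
print(top)     # GATCaUaAa----
print(bottom)  # ----tAtUtCTAG
```

Both strands are written left to right in the same order as the dscode string, so the bottom strand is not reversed.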
@@ -84,7 +83,7 @@ GATCaUaAa ad-hoc representation
8483
tAtUtCTAG
8584
8685
87-
PEXIaUaOaQFZJ representation using dsIUPAC
86+
PEXIaUaOaQFZJ representation using dscode
8887
```
8988

9089
The choice of symbols for the dscode extension facilitates intuitive recognition of compatible single-stranded regions, i.e. sticky ends. The symbols that can anneal are adjacent in the alphabet, e.g. `Q-P`, `E-F`, `I-J`; the only exception is `X-Z`, which is unavoidable because Y is part of the IUPAC alphabet.
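
As an illustration of the pairing described above, here is a small sketch that checks whether two single-stranded ends are compatible. The names `PARTNER` and `can_anneal` are hypothetical. The check assumes both ends are written in the same left-to-right orientation and overlaid position by position, so that a top-strand-only symbol must meet its bottom-strand-only partner from the Complement column.

```python
# Single-stranded symbols and their annealing partners (Complement column).
PARTNER = {
    "E": "F", "F": "E",
    "I": "J", "J": "I",
    "P": "Q", "Q": "P",
    "X": "Z", "Z": "X",
}


def can_anneal(end_a, end_b):
    """True if two equally long single-stranded ends pair at every position."""
    if not end_a or len(end_a) != len(end_b):
        return False
    # PARTNER.get returns None for symbols that are not single stranded,
    # which never equals a symbol, so such positions fail the check.
    return all(PARTNER.get(a) == b for a, b in zip(end_a, end_b))


print(can_anneal("QFZJ", "PEXI"))  # True
print(can_anneal("QFZJ", "PEXP"))  # False

# Alphabet distance between partners: 1 everywhere except X-Z.
for a, b in [("E", "F"), ("I", "J"), ("P", "Q"), ("X", "Z")]:
    print(a, b, ord(b) - ord(a))
```

The last loop prints a distance of 1 for `E-F`, `I-J` and `P-Q`, and 2 for `X-Z`, which is the gap left by Y.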
@@ -111,7 +110,7 @@ CTAGttt CTAGttt
111110
```
112111
ASCII CAPS = ABCDEFGHIJKLMNOPQRSTUVWXYZ
113112
IUPAC = ABCD GH K MN RST VW Y
114-
dsIUPAC = EF IJ L OPQ U X Z + IUPAC
113+
dscode = EF IJ L OPQ U X Z + IUPAC
115114
116115
punctuation = ! # $ % & * + ( ) < = > @ /: ' , - . ; ? [ \ ] ^ _ ` { | } ~ "
117116
```
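
The listing above can be reproduced with a set difference. This is only a sketch for checking the letter inventory; the variable names are hypothetical, and the set of letters defined in the table is typed in by hand from the table rows.

```python
import string

# The 15 capital letters used by the IUPAC nucleotide alphabet.
IUPAC = set("ABCDGHKMNRSTVWY")

# Capital letters left over for the dscode extension.
free = sorted(set(string.ascii_uppercase) - IUPAC)
print("".join(free))  # EFIJLOPQUXZ

# Letters that the table actually assigns a meaning to.
defined_in_table = set("UOEIPXZQJF")
unassigned = [c for c in free if c not in defined_in_table]
print(unassigned)  # ['L']
```

The leftover letters match the dscode line of the listing; `L` appears there but has no row in the table.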
