Commit b04cc1d

more extensive documentation
Raktim Mitra committed
1 parent da7b89c commit b04cc1d

2 files changed

Lines changed: 256 additions & 3 deletions

models/rfd3/docs/examples/atom23_design.json

Lines changed: 28 additions & 3 deletions
@@ -1,4 +1,9 @@
 {
+    "multipolymer": {
+        "contig": "40-50R,/0,10-20D,/0,80-110",
+        "length": "130-180",
+        "input": "../input_pdbs/AMP.pdb"
+    },
     "W05": {
         "ss_dbn": ".(((((((((((((((((((..[[[[[[.)))))(((....)))(((....)))))))))))))))))((((((..]]]]]].)))))).",
         "select_fixed_atoms": false,
@@ -66,8 +71,28 @@
             "C1-4": "ALL",
             "C79-86": "ALL"
         }
-    }
-
-
+    },
+    "dict_input_ss": {
+        "ss_dbn_dict": {
+            "A6-25":"(((..)))....(((..)))",
+            "B1-20":"((((..))))...((...))"
+        },
+        "contig":"30-30R,/0,30-30R",
+        "length":"60-60",
+        "input":"../input_pdbs/AMP.pdb"
+    },
+    "paired_region_input_ss": {
+        "paired_region_list": ["A20-25,B10-15"],
+        "loop_region_list":["A10-19","B20-30"],
+        "contig":"50-50R,/0,50-50R",
+        "length":"100-100",
+        "input":"../input_pdbs/AMP.pdb"
+    },
+    "paired_position_input_ss": {
+        "paired_position_list": ["A3,B3","A5,B5","A7,B7","A9,B9","A11,B11","A13,B13","A15,B15","A17,B17","A19,B19"],
+        "contig":"20-20R,/0,20-20R",
+        "length":"40-40",
+        "input":"../input_pdbs/AMP.pdb"
 
+    }
 }
Lines changed: 228 additions & 0 deletions
@@ -0,0 +1,228 @@
# RNA / DNA Design in RFdiffusion3

This guide describes extensions to RFdiffusion3 for nucleic acid and hybrid RNA–protein design, including:

- RNA/DNA-aware contigs (`R` / `D` suffix)
- Ligand-conditioned aptamer design
- Secondary structure (SS) conditioning
- Base-pair constraints (region- and position-level)
- Partial structure fixing and unindexing

---

## 1. Contig Syntax for RNA/DNA

Contigs now support nucleic acid specification through a suffix on each segment:

- `R` → RNA segment
- `D` → DNA segment
- No suffix → protein (default)

### Example

```json
{
    "contig": "40-50R,/0,10-20D,/0,80-110"
}
```
This corresponds to a 40–50 nt RNA, a chain break (`/0`), a 10–20 nt DNA, a second chain break, and an 80–110 aa protein.

### Multipolymer Design

The same contig in a complete design entry. Here `length` (130–180) matches the sum of the three segment ranges.

```json
{
    "multipolymer": {
        "contig": "40-50R,/0,10-20D,/0,80-110",
        "length": "130-180",
        "input": "../input_pdbs/AMP.pdb"
    }
}
```

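To make the suffix convention concrete, the following is a minimal sketch of how such a contig string could be split into typed segments. It is not the RFdiffusion3 parser: `Segment`, `parse_contig`, and `total_length_range` are hypothetical helper names, and the sketch only handles `min-max` tokens with an optional `R`/`D` suffix plus `/0` chain breaks.

```python
import re
from typing import NamedTuple


class Segment(NamedTuple):
    polymer: str  # "RNA", "DNA", or "protein"
    min_len: int
    max_len: int


# Assumption: suffix meanings as described in the guide above.
_SUFFIX = {"R": "RNA", "D": "DNA", "": "protein"}
_RANGE = re.compile(r"^(\d+)-(\d+)([RD]?)$")


def parse_contig(contig: str) -> list[Segment]:
    """Split a contig into designed segments, skipping `/0` chain breaks."""
    segments = []
    for token in contig.split(","):
        token = token.strip()
        if token in ("", "/0"):
            continue
        match = _RANGE.match(token)
        if match is None:
            # Fixed-residue tokens such as "C1-4" are out of scope for this sketch.
            raise ValueError(f"unsupported contig token: {token!r}")
        lo, hi, suffix = match.groups()
        segments.append(Segment(_SUFFIX[suffix], int(lo), int(hi)))
    return segments


def total_length_range(segments: list[Segment]) -> tuple[int, int]:
    """Sum the minimum and maximum lengths over all segments."""
    return sum(s.min_len for s in segments), sum(s.max_len for s in segments)


segments = parse_contig("40-50R,/0,10-20D,/0,80-110")
print(segments)
print(total_length_range(segments))  # (130, 180), matching "length": "130-180"
```

A check like this can catch a `length` field that is inconsistent with the contig before a run is launched.
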
## 2. Secondary Structure Conditioning
### 2.1 Dot-Bracket Notation (Global)
```json
{
    "W05": {
        "ss_dbn": ".(((((((((((((((((((..[[[[[[.)))))(((....)))(((....)))))))))))))))))((((((..]]]]]].)))))).",
        "select_fixed_atoms": false,
        "contig": "90-90R",
        "length": "90-90",
        "input": "../input_pdbs/AMP.pdb"
    }
}
```
`ss_dbn` specifies the full RNA secondary structure in dot-bracket notation. It is applied to the first L tokens, where L is the length of the `ss_dbn` string.

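Because a malformed `ss_dbn` string is easy to write by hand, a small balance check can help. The sketch below assumes the common extended dot-bracket convention in which `()` and `[]` are matched independently (the second bracket type marking a pseudoknot); `dbn_pairs` is a hypothetical helper, and whether RFdiffusion3 validates the string this way is not stated in the guide.

```python
# Assumption: "(" pairs with ")", "[" pairs with "]", "." is unpaired.
CLOSERS = {")": "(", "]": "["}


def dbn_pairs(dbn: str) -> list[tuple[int, int]]:
    """Return sorted 0-based (i, j) base pairs; raise ValueError if unbalanced."""
    stacks = {"(": [], "[": []}
    pairs = []
    for i, ch in enumerate(dbn):
        if ch in stacks:
            stacks[ch].append(i)
        elif ch in CLOSERS:
            stack = stacks[CLOSERS[ch]]
            if not stack:
                raise ValueError(f"unmatched {ch!r} at position {i}")
            pairs.append((stack.pop(), i))
        elif ch != ".":
            raise ValueError(f"unexpected character {ch!r} at position {i}")
    leftover = sorted(i for stack in stacks.values() for i in stack)
    if leftover:
        raise ValueError(f"unclosed brackets at positions {leftover}")
    return sorted(pairs)


# A short toy string with a pseudoknot, not the W05 structure above.
print(dbn_pairs("((..[[..))..]]"))  # [(0, 9), (1, 8), (4, 13), (5, 12)]
```

Comparing `len(ss_dbn)` with the fixed contig length (90 in the `W05` entry) is a natural companion check.
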
### 2.2 Dictionary-Based SS Input

Specify secondary structure for subsections, keyed by chain and residue range:
```json
{
    "ss_dbn_dict": {
        "A6-25": "(((..)))....(((..)))",
        "B1-20": "((((..))))...((...))"
    }
}
```
Used in:
```json
{
    "dict_input_ss": {
        "ss_dbn_dict": {
            "A6-25": "(((..)))....(((..)))",
            "B1-20": "((((..))))...((...))"
        },
        "contig": "30-30R,/0,30-30R",
        "length": "60-60",
        "input": "../input_pdbs/AMP.pdb"
    }
}
```
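
Each key in `ss_dbn_dict` spans as many residues as its dot-bracket string has characters (20 in both entries above). A minimal sketch of that check follows; it assumes residue ranges are inclusive, and `check_ss_dbn_dict` is a hypothetical helper rather than part of RFdiffusion3.

```python
import re

# Assumption: keys look like "<chain letter><start>-<end>", inclusive.
_KEY = re.compile(r"^([A-Za-z])(\d+)-(\d+)$")


def check_ss_dbn_dict(ss_dbn_dict: dict[str, str]) -> dict[str, int]:
    """Return {key: span length}; raise ValueError on a length mismatch."""
    spans = {}
    for key, dbn in ss_dbn_dict.items():
        match = _KEY.match(key)
        if match is None:
            raise ValueError(f"unrecognised key: {key!r}")
        start, end = int(match.group(2)), int(match.group(3))
        span = end - start + 1
        if span != len(dbn):
            raise ValueError(
                f"{key}: range covers {span} residues but string has {len(dbn)}"
            )
        spans[key] = span
    return spans


example = {
    "A6-25": "(((..)))....(((..)))",
    "B1-20": "((((..))))...((...))",
}
print(check_ss_dbn_dict(example))  # {'A6-25': 20, 'B1-20': 20}
```
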
## 3. Base-Pair Region Conditioning
### 3.1 Paired Regions

Define paired and loop regions:
```json
{
    "paired_region_list": ["A20-25,B10-15"],
    "loop_region_list": ["A10-19","B20-30"]
}
```
This enforces pairing between the residue ranges in `paired_region_list` and loop propensity for the ranges in `loop_region_list` during sampling.

Used in:
```json
{
    "paired_region_input_ss": {
        "paired_region_list": ["A20-25,B10-15"],
        "loop_region_list": ["A10-19","B20-30"],
        "contig": "50-50R,/0,50-50R",
        "length": "100-100",
        "input": "../input_pdbs/AMP.pdb"
    }
}
```

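The region lists can be sanity-checked before sampling. The sketch below expands each range into residues and applies two checks that are assumptions of this example, not documented RFdiffusion3 rules: the two halves of a paired region have equal length, and no residue is both paired and in a loop. `parse_range` and `check_regions` are hypothetical names.

```python
import re

_RANGE = re.compile(r"^([A-Za-z])(\d+)-(\d+)$")


def parse_range(text: str) -> set[tuple[str, int]]:
    """Expand e.g. "A20-25" into {("A", 20), ..., ("A", 25)} (inclusive)."""
    match = _RANGE.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognised range: {text!r}")
    chain, start, end = match.group(1), int(match.group(2)), int(match.group(3))
    return {(chain, i) for i in range(start, end + 1)}


def check_regions(paired_region_list, loop_region_list) -> tuple[int, int]:
    """Return (number of paired residues, number of loop residues)."""
    paired = set()
    for entry in paired_region_list:
        left, right = (parse_range(part) for part in entry.split(","))
        if len(left) != len(right):  # assumed requirement, see lead-in
            raise ValueError(f"paired halves differ in length: {entry!r}")
        paired |= left | right
    loops = set()
    for entry in loop_region_list:
        loops |= parse_range(entry)
    overlap = paired & loops
    if overlap:  # assumed requirement, see lead-in
        raise ValueError(f"residues both paired and loop: {sorted(overlap)}")
    return len(paired), len(loops)


print(check_regions(["A20-25,B10-15"], ["A10-19", "B20-30"]))  # (12, 21)
```
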
### 3.2 Explicit Base Pair Positions

For fine-grained control, list individual base pairs:

```json
{
    "paired_position_list": [
        "A3,B3","A5,B5","A7,B7","A9,B9","A11,B11",
        "A13,B13","A15,B15","A17,B17","A19,B19"
    ]
}
```
Used in:
```json
{
    "paired_position_input_ss": {
        "paired_position_list": [
            "A3,B3","A5,B5","A7,B7","A9,B9","A11,B11",
            "A13,B13","A15,B15","A17,B17","A19,B19"
        ],
        "contig": "20-20R,/0,20-20R",
        "length": "40-40",
        "input": "../input_pdbs/AMP.pdb"
    }
}
```
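
Each entry names two residues as `<chain><number>,<chain><number>`. A minimal sketch of turning the list into explicit pairs follows; `parse_paired_positions` is a hypothetical helper, and rejecting a residue that appears in more than one pair is an assumption of this example, not a documented rule.

```python
def parse_paired_positions(paired_position_list):
    """Return a list of ((chain, number), (chain, number)) tuples."""
    pairs = []
    seen = set()
    for entry in paired_position_list:
        parts = [part.strip() for part in entry.split(",")]
        if len(parts) != 2:
            raise ValueError(f"expected two residues in {entry!r}")
        residues = []
        for part in parts:
            chain, number = part[:1], part[1:]
            if not chain.isalpha() or not number.isdigit():
                raise ValueError(f"unrecognised residue: {part!r}")
            residues.append((chain, int(number)))
        for residue in residues:
            if residue in seen:  # assumed requirement, see lead-in
                raise ValueError(f"residue listed twice: {residue}")
            seen.add(residue)
        pairs.append(tuple(residues))
    return pairs


pairs = parse_paired_positions(["A3,B3", "A5,B5", "A7,B7"])
print(pairs)  # [(('A', 3), ('B', 3)), (('A', 5), ('B', 5)), (('A', 7), ('B', 7))]
```
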
Note: Most of the JSON examples above do not actually read the `input` field; it is kept as a dummy value for the `inference3_engine`.

## 4. Ligand-Conditioned Aptamer Design

Supports the design of RNA that binds small molecules.

### AMP Aptamer Example
```json
{
    "AMP_aptamer": {
        "input": "../input_pdbs/AMP.pdb",
        "ligand": "AMP",
        "contig": "40-50R",
        "length": "40-50",
        "ori_jitter": 1,
        "select_buried": {"AMP": "ALL"},
        "select_hbond_acceptor": {
            "AMP": "N7,O4',O1P,O2P,O3P,N3,N1"
        },
        "select_hbond_donor": {
            "AMP": "N6,O3',O2'"
        }
    }
}
```
### Key Options

- `ligand`: ligand name in the input PDB
- `select_buried`: enforce burial of ligand atoms
- `select_hbond_acceptor` / `select_hbond_donor`: suggest ligand atoms for hydrogen-bond interactions
- `ori_jitter`: small random perturbation of the ori token (from the ligand center of mass)

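Since the selection fields are keyed by ligand name, one easy mistake is referring to a ligand that `ligand` does not declare. The sketch below checks for that and splits the comma-separated atom names; `atom_names` and `undeclared_ligands` are hypothetical helpers, and treating `ligand` as a comma-separated list is an assumption based on the `"MG,PO4"` value used elsewhere in this guide.

```python
def atom_names(names: str) -> list[str]:
    """Split a comma-separated atom list such as "N6,O3',O2'"."""
    return [name.strip() for name in names.split(",") if name.strip()]


def undeclared_ligands(spec: dict) -> set[str]:
    """Ligand names used by selection fields but missing from `ligand`."""
    declared = set(atom_names(spec.get("ligand", "")))
    referenced = set()
    for field in ("select_buried", "select_hbond_acceptor", "select_hbond_donor"):
        referenced |= set(spec.get(field, {}))
    return referenced - declared


spec = {
    "input": "../input_pdbs/AMP.pdb",
    "ligand": "AMP",
    "contig": "40-50R",
    "length": "40-50",
    "ori_jitter": 1,
    "select_buried": {"AMP": "ALL"},
    "select_hbond_acceptor": {"AMP": "N7,O4',O1P,O2P,O3P,N3,N1"},
    "select_hbond_donor": {"AMP": "N6,O3',O2'"},
}

print(undeclared_ligands(spec))  # set()
print(atom_names(spec["select_hbond_donor"]["AMP"]))  # ['N6', "O3'", "O2'"]
```
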
## 5. Hybrid RNA–Protein Design with Constraints
### RNase P Active Site Example

```json
{
    "unindexed_rnasep": {
        "input": "../input_pdbs/rnase_p_3q1q_active_site_small.pdb",
        "contig": "50-80R,/0,100-120,/0,C1-4,C79-86",
        "length": "162-212",
        "ligand": "MG,PO4",
        "unindex": "B49,B50,B51,B52,B321,/0,A56-58,/0",
        "select_fixed_atoms": {
            "B49": "ALL",
            "B50": "ALL",
            "B51": "ALL",
            "B52": "ALL",
            "B321": "ALL",
            "A56-58": "ALL",
            "C1-4": "ALL",
            "C79-86": "ALL"
        }
    }
}
```
### Key Features

- Mixed RNA + protein + fixed fragments
- `unindex`: removes residues from positional indexing
- `select_fixed_atoms`: freezes specified atoms
- Ligands (MG, PO4) included in design context
- Useful for catalytic residues or structural motifs

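Two properties of the RNase P entry can be checked mechanically: the contig tokens alone sum to 162–212 residues, matching `length`, and every `select_fixed_atoms` key appears either in `unindex` or as a fixed fragment in `contig`. The sketch below illustrates both checks; the helper names are hypothetical, and the observation that unindexed residues are not counted in `length` is drawn only from this one example.

```python
import re

_DESIGNED = re.compile(r"^(\d+)-(\d+)[RD]?$")      # e.g. "50-80R", "100-120"
_FIXED = re.compile(r"^[A-Za-z](\d+)(?:-(\d+))?$")  # e.g. "C1-4", "B321"


def tokens(text: str) -> set[str]:
    """Comma-separated tokens, ignoring `/0` chain breaks."""
    return {t.strip() for t in text.split(",") if t.strip() not in ("", "/0")}


def contig_length_range(contig: str) -> tuple[int, int]:
    """Sum designed ranges and fixed-fragment sizes over the contig."""
    lo_total = hi_total = 0
    for token in tokens(contig):
        designed = _DESIGNED.match(token)
        fixed = _FIXED.match(token)
        if designed:
            lo, hi = int(designed.group(1)), int(designed.group(2))
        elif fixed:
            start = int(fixed.group(1))
            end = int(fixed.group(2) or start)
            lo = hi = end - start + 1  # inclusive residue range
        else:
            raise ValueError(f"unsupported contig token: {token!r}")
        lo_total += lo
        hi_total += hi
    return lo_total, hi_total


contig = "50-80R,/0,100-120,/0,C1-4,C79-86"
unindex = "B49,B50,B51,B52,B321,/0,A56-58,/0"
fixed_keys = {"B49", "B50", "B51", "B52", "B321", "A56-58", "C1-4", "C79-86"}

print(contig_length_range(contig))  # (162, 212)

motif_tokens = tokens(unindex) | {t for t in tokens(contig) if _FIXED.match(t)}
print(fixed_keys - motif_tokens)  # set()
```
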
## 6. Summary of New Features

- `R` / `D` suffix → RNA / DNA specification in contigs
- `ss_dbn` → global secondary structure constraint (optional)
- `ss_dbn_dict` → local secondary structure constraints (optional)
- `paired_region_list` → helix-level pairing constraints (optional)
- `paired_position_list` → base-level pairing constraints (optional)
- `ligand` + selection options → aptamer design
- `unindex` → remove residues from indexing
- `select_fixed_atoms` → freeze structural elements

---
