Commit 79b31db

Update RFD3 docs: clarify input specs and file formats
Expanded and clarified the documentation for RFdiffusion3 input specifications, including more detailed explanations of the 'contig' string, input file types, and example YAML/JSON formats. Improved the intro to inference calculations to better explain the structure and usage of settings files, and updated descriptions for job configuration and output files. Added a placeholder for configuration options documentation.
1 parent c3969f9 commit 79b31db

3 files changed: +78 -69 lines changed

models/rfd3/docs/configuration_options.md

Whitespace-only changes.

models/rfd3/docs/input.md

Lines changed: 66 additions & 56 deletions
@@ -1,13 +1,12 @@
11
# RFdiffusion3 — Input specification (dialect **2**)
22

33
> **TL;DR**
4-
> Inputs are now defined with a single `InputSpecification` class.
4+
> Inputs are now defined with a single `InputSpecification` class; see [`rfd3/src/rfd3/inference/input_parsing.py`](https://github.com/RosettaCommons/foundry/blob/rac_docs/models/rfd3/src/rfd3/inference/input_parsing.py) for all possible inputs.
55
> Selections like “what’s fixed?”, “what’s sequence-free?”, “which atoms are donors/acceptors?” are all expressed with the same **InputSelection** mini-language.
6-
> Everything is reproducibly logged back out alongside your generation.
6+
> Everything is reproducibly logged back out alongside your generation: each design creates an output JSON file with all settings defined.
77
88
---
99

10-
- [What changed (high level)](#what-changed-high-level)
1110
- [Quick start](#quick-start)
1211
- [The `InputSelection` mini-language](#the-inputselection-mini-language)
1312
- [Full schema: `InputSpecification`](#full-schema-inputspecification)
@@ -23,77 +22,88 @@
2322

2423
---
2524

26-
## How it works (high level)
27-
28-
- **Unified selections.** All per-residue/atom choices now use **InputSelection**:
29-
- You can pass `true`/`false`, a **contig string** (`"A1-10,B5-8"`), or a **dictionary** (`{"A1-10": "ALL", "B5": "N,CA,C,O"}`).
30-
- Selection fields include: `select_fixed_atoms`, `select_unfixed_sequence`, `select_buried`, `select_partially_buried`, `select_exposed`, `select_hbond_donor`, `select_hbond_acceptor`, `select_hotspots`.
31-
- **Clearer unindexing.** For **unindexed** motifs you typically either fix `"ALL"` atoms or explicitly choose subsets such as `"TIP"`/`"BKBN"`/explicit atom lists via a **dictionary** (see examples).
32-
When using `unindex`, only **the atoms you mark as fixed** are carried over from the input.
33-
- **Reproducibility.** The exact specification and the **sampled contig** are logged back into the output JSON. We also log useful counts (atoms, residues, chains).
34-
- **Safer parsing.** You’ll now get early, informative errors if:
35-
- You pass unknown keys,
36-
- A selection doesn’t match any atoms,
37-
- Indexed and unindexed motifs overlap,
38-
- Mutually exclusive selections overlap (e.g., two RASA bins for the same atom).
39-
- **Backwards compatible.** Add `"dialect": 1` to keep your old configs running while you migrate. (Deprecated.)
40-
41-
---
42-
4325
## InputSpecification
44-
45-
| Field | Type | Description |
46-
| -------------------------------------------------------------- | ----------------- | --------------------------------------------------------------------- |
47-
| `input` | `str?` | Path to input **PDB/CIF**. Required if you provide contig+length. |
48-
| `atom_array_input` | internal | Pre-loaded `AtomArray` (not recommended). |
49-
| `contig` | `InputSelection?` | Indexed motif specification, e.g., `"A1-80,10,\0,B5-12"`. |
50-
| `unindex` | `InputSelection?` | Unindexed motif components (unknown sequence placement). |
51-
| `length` | `str?` | Total design length constraint; `"min-max"` or int. |
52-
| `ligand` | `str?` | Ligand(s) by resname or index. |
53-
| `cif_parser_args` | `dict?` | Optional args to CIF loader. |
54-
| `extra` | `dict` | Extra metadata (e.g., logs). |
55-
| `dialect` | `int` | `2`=new (default), `1`=legacy. |
56-
| `select_fixed_atoms` | `InputSelection?` | Atoms with fixed coordinates. |
57-
| `select_unfixed_sequence` | `InputSelection?` | Where sequence can change. |
58-
| `select_buried` / `select_partially_buried` / `select_exposed` | `InputSelection?` | RASA bins 0/1/2 (mutually exclusive). |
59-
| `select_hbond_donor` / `select_hbond_acceptor` | `InputSelection?` | Atom-wise donor/acceptor flags. |
60-
| `select_hotspots` | `InputSelection?` | Atom-level or token-level hotspots. |
61-
| `redesign_motif_sidechains` | `bool` | Fixed backbone, redesigned sidechains for motifs. |
62-
| `symmetry` | `SymmetryConfig?` | See `docs/symmetry.md`. |
63-
| `ori_token` | `list[float]?` | `[x,y,z]` origin override to control COM placement |
64-
| `infer_ori_strategy` | `str?` | `"com"` or `"hotspots"`. |
65-
| `plddt_enhanced` | `bool` | Default `true`. |
66-
| `is_non_loopy` | `bool` | Default `true`. |
67-
| `partial_t` | `float?` | Noise (Å) for partial diffusion, enables partial diffusion |
26+
Here are some of the inference settings in RFdiffusion3 (RFD3):
27+
* For inputs of type `InputSelection`, see [The InputSelection mini-language](#the-inputselection-mini-language) for more details.
28+
29+
| Field | Type | Description |
30+
| -------------------------------------------------------------- | ----------------- | --------------------------------------------------------------------------------------- |
31+
| `input` | `str` | Path to and file name of input **PDB/CIF**. Required if you provide `contig`+`length`. |
32+
| `atom_array_input` | `AtomArray` | Pre-loaded `AtomArray` ([class from Biotite](https://www.biotite-python.org/latest/apidoc/biotite.structure.AtomArray.html)) (not recommended). |
33+
| `contig` | `InputSelection` | Indexed motif specification, e.g., `"A1-80,10,\0,B5-12"`. More details in [next section](#contig) |
34+
| `unindex` | `InputSelection` | Unindexed motif components (unknown sequence placement). Example: `A15-20,B6-10` <!-- TO DO test out dictionary specification for this--> |
35+
| `length` | `str?` | Total design length constraint; `"min-max"` or int. |
36+
| `ligand` | `str?` | Ligand(s) by resname or index. |
37+
| `cif_parser_args` | `dict?` | Optional args to CIF loader. |
38+
| `extra` | `dict` | Extra metadata (e.g., logs). |
39+
| `dialect` | `int` | `2`=new (default), `1`=legacy. |
40+
| `select_fixed_atoms` | `InputSelection?` | Atoms with fixed coordinates. |
41+
| `select_unfixed_sequence` | `InputSelection?` | Where sequence can change. |
42+
| `select_buried` / `select_partially_buried` / `select_exposed` | `InputSelection?` | RASA bins 0/1/2 (mutually exclusive). |
43+
| `select_hbond_donor` / `select_hbond_acceptor` | `InputSelection?` | Atom-wise donor/acceptor flags. |
44+
| `select_hotspots` | `InputSelection?` | Atom-level or token-level hotspots. |
45+
| `redesign_motif_sidechains` | `bool` | Fixed backbone, redesigned sidechains for motifs. |
46+
| `symmetry` | `SymmetryConfig?` | See `docs/symmetry.md`. |
47+
| `ori_token` | `list[float]?` | `[x,y,z]` origin override to control COM placement |
48+
| `infer_ori_strategy` | `str?` | `"com"` or `"hotspots"`. |
49+
| `plddt_enhanced` | `bool` | Default `true`. |
50+
| `is_non_loopy` | `bool` | Default `true`. |
51+
| `partial_t` | `float?` | Noise (Å) for partial diffusion, enables partial diffusion |
6852

6953

7054
## Quick start
7155

72-
### Minimal JSON example
56+
### `contig`
57+
The 'contig string' is one way to specify which portions of your final structure come from your input PDB/CIF and which are designed by RFD3. Here are a few guidelines for writing a `contig` string:
58+
- Different portions of the string should be comma separated
59+
- `\0` denotes a chain break: no peptide bond is specified between the residues on either side of the break, but the gap itself can be as large or small as makes sense for the rest of the design
60+
- Any portion of the string that starts with a letter (e.g. `A1-80`) comes from the input PDB/CIF; the letter corresponds to the chain label in the input file
61+
- Any portions of the string that do **not** start with a letter are going to be designed by RFD3
62+
- If a range is specified for a designed segment (e.g., `100-150`), the length of the designed region is sampled uniformly at random from that range, inclusive.
63+
- The order of the `contig` string is followed in the design
64+
65+
> **Example**
66+
>
67+
> `A1-80,10-20,A100-120,B25-50,\0,C43-56,40-60`
68+
>
69+
> The resulting design would have:
70+
> - Residues 1-80 from chain A in the input PDB/CIF
71+
> - 10 to 20 designed residues that connect to residue A80
72+
> - Residues 100-120 from chain A in the input PDB/CIF, connected to the last residue in the designed region
73+
> - Residues 25-50 from chain B in the input PDB/CIF, connected to A120, even if this connection did not exist in the input PDB/CIF
74+
> - A chain break
75+
> - Residues 43-56 from chain C in the input PDB/CIF not connected to the previous chain
76+
> - 40-60 designed residues that connect to residue C56
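
The guidelines above can be sketched in a few lines of Python. This is only an illustrative reading of the rules in this section, not RFD3's own parser: the function name `describe_contig`, the tuple layout, and the assumption of single-letter chain labels are all hypothetical.

```python
import random
import re

# Hypothetical helper for illustration only; not part of the RFD3 API.
def describe_contig(contig, rng=None):
    """Split a contig string into ordered (kind, value) segments."""
    rng = rng or random.Random()
    segments = []
    for part in contig.split(","):
        part = part.strip()
        if part == "\\0":
            # Chain break token, written as a backslash followed by zero.
            segments.append(("chain_break", None))
        elif part[:1].isalpha():
            # Segment copied from the input file: chain letter + residue range.
            # Assumes single-letter chain labels.
            match = re.fullmatch(r"([A-Za-z])(\d+)(?:-(\d+))?", part)
            if match is None:
                raise ValueError(f"unrecognized motif segment: {part!r}")
            chain, start, end = match.group(1), int(match.group(2)), match.group(3)
            segments.append(("input", (chain, start, int(end) if end else start)))
        else:
            # Designed segment: a fixed length or an inclusive "min-max" range.
            low, _, high = part.partition("-")
            low = int(low)
            high = int(high) if high else low
            segments.append(("designed", rng.randint(low, high)))
    return segments


example = r"A1-80,10-20,A100-120,B25-50,\0,C43-56,40-60"
for kind, value in describe_contig(example, random.Random(0)):
    print(kind, value)
```

The segments come back in the same order as the string, and `random.randint` includes both endpoints, which matches the inclusive uniform sampling described for designed ranges.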
77+
78+
### Input File Types
79+
For more detailed information about these file types, see {doc}`intro_inference_calculations`.
80+
81+
#### Minimal JSON example
7382

7483
```json
7584
{
76-
"": {
85+
"calculation_label": {
7786
"input": "path/to/template.pdb",
7887
"contig": "A1-80",
7988
"length": "150-180",
8089
"select_fixed_atoms": true,
8190
"select_unfixed_sequence": "A20-35",
8291
"ligand": "HAX,OAA",
8392
"dialect": 2
84-
}
93+
}
8594
}
8695
```
87-
### Mininmal YAML example
88-
```
89-
input: path/to/template.pdb
90-
contig: A1-80
91-
length: 150-180
92-
select_fixed_atoms: true
93-
select_unfixed_sequence: A20-35
94-
ligand: HAX,OAA
95-
dialect: 2
9696

97+
#### Minimal YAML example
98+
```yaml
99+
calculation_label:
100+
  input: path/to/template.pdb
101+
  contig: A1-80
102+
  length: 150-180
103+
  select_fixed_atoms: true
104+
  select_unfixed_sequence: A20-35
105+
  ligand: HAX,OAA
106+
  dialect: 2
97107
```
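
Both minimal examples share one shape: a top-level label maps to a single set of `InputSpecification` fields. As a rough sketch of that structure (this uses only the standard library and is not the RFD3 Python API; the variable names are illustrative), the JSON form could be generated like this:

```python
import json

# One entry per calculation; the key is a user-chosen label and the value
# holds the InputSpecification fields from the minimal example.
settings = {
    "calculation_label": {
        "input": "path/to/template.pdb",
        "contig": "A1-80",
        "length": "150-180",
        "select_fixed_atoms": True,
        "select_unfixed_sequence": "A20-35",
        "ligand": "HAX,OAA",
        "dialect": 2,
    }
}

# Serialize to the text you would save as the JSON settings file.
settings_text = json.dumps(settings, indent=2)
print(settings_text)
```

Building the dictionary in code makes it easy to add further labelled calculations to the same file by adding more top-level keys.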
98108
99109
### Python API
