Merged
11 changes: 6 additions & 5 deletions pyrosettacluster/README.md
@@ -130,17 +130,17 @@ Please refer to the following table to select _one_ environment file extraction
| --- | --- | --- | --- |
| `.pdb` | Decoy | Read file → Copy → Paste into new file | Run `dump_env_file.py` helper |
| `.pdb.bz2` | Decoy | Unzip with `bzip2` → Read file → Copy → Paste into new file | Run `dump_env_file.py` helper |
| `.pkl_pose`, `.pkl_pose.bz2`, `.b64_pose`, `.b64_pose.bz2` | Decoy | | Run `dump_env_file.py` helper |
| `.pkl_pose`, `.pkl_pose.bz2`, `.b64_pose`, `.b64_pose.bz2` | Decoy | | Run `dump_env_file.py` helper _(requires an identical PyRosetta build signature to that used to save the original file)_ |
| `.json` | Full-record scorefile | Read file → Copy → Paste into new file | Run `dump_env_file.py` helper |
| Pickled `pandas.DataFrame`<br>(`.gz`, `.xz`, `.tar`, etc.) | Full-record scorefile | | Run `dump_env_file.py` helper |
| `.init`, `.init.bz2` | PyRosetta initialization file | | Run `dump_env_file.py` helper |
| `.init`, `.init.bz2` | PyRosetta initialization file | | Run `dump_env_file.py` helper _(requires an identical PyRosetta build signature to that used to save the original file)_ |

> [!NOTE]
> **Extraction method #1:** If copy/pasting into a new file, the environment file string is located in the `record["instance"]["environment"]` nested key value of the PyRosettaCluster full record. Please paste it into one of the following file names (as expected in the next step) in a new folder, depending on the environment manager you're using to recreate the environment:
> | Environment manager | New file name |
> | --- | --- |
> | `pixi` | `pixi.lock` |
> | `uv` | `requirements.txt` |
> | `uv` | `uv.lock` |
> | `conda` | `environment.yml` |
> | `mamba` | `environment.yml` |
>
@@ -153,7 +153,7 @@ Please refer to the following table to select _one_ environment file extraction
> Also note the `record["instance"]["sha1"]` nested key value holding the GitHub commit SHA1 required to [reproduce the PyRosettaCluster simulation](#clone-original-repository)!
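
For extraction method #1, the copy/paste step can also be scripted. The following is a minimal sketch, not part of this change: it assumes the full record has already been loaded into a Python `dict`, and the helper name `write_env_file` is hypothetical. The file names come from the table above, using the `uv.lock` name that this change introduces for `uv`.

```python
from pathlib import Path

# File names expected by the next step, copied from the table above.
ENV_FILE_NAMES = {
    "pixi": "pixi.lock",
    "uv": "uv.lock",
    "conda": "environment.yml",
    "mamba": "environment.yml",
}


def write_env_file(record: dict, manager: str, out_dir: str) -> Path:
    """Write the environment string of a full record into a new folder.

    Hypothetical helper: `record` is assumed to be an already-loaded
    PyRosettaCluster full record with the nested keys described above.
    """
    if manager not in ENV_FILE_NAMES:
        raise ValueError(f"Unsupported environment manager: {manager!r}")
    env_text = record["instance"]["environment"]
    target_dir = Path(out_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / ENV_FILE_NAMES[manager]
    target.write_text(env_text)
    # The commit SHA1 is needed later to reproduce the simulation.
    print("GitHub commit SHA1:", record["instance"]["sha1"])
    return target
```

How the record is loaded depends on the scorefile format, so that step is left out of the sketch.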

> [!NOTE]
> **Extraction method #2:** If running `dump_env_file.py`, the `pyrosetta` package (with version `>=2025.47`) and the [PyPI pyrosetta-distributed](https://pypi.org/project/pyrosetta-distributed/) package (for the `pyrosetta.distributed` framework dependencies) must be installed in any existing virtual environment, and that virtual environment's python interpreter used to run the script.
> **Extraction method #2:** If running `dump_env_file.py`, the `pyrosetta` package (with version `>=2025.47`) and the [PyPI pyrosetta-distributed](https://pypi.org/project/pyrosetta-distributed/) package (for the `pyrosetta.distributed` framework dependencies) must be installed in any existing virtual environment, and that virtual environment's python interpreter used to run the script. If extracting from a `.pkl_pose`, `.pkl_pose.bz2`, `.b64_pose`, `.b64_pose.bz2`, `.init` or `.init.bz2` file, the PyRosetta build signature _must be identical_ to that used to save the original decoy file or initialization file, otherwise an exception or segmentation fault may occur.
>
> Also note the printed GitHub commit SHA1 required to [reproduce the PyRosettaCluster simulation](#clone-original-repository)!

@@ -168,7 +168,7 @@ Run `python recreate_env.py` to recreate the virtual environment.
> This script runs a subprocess with one of the following commands:<br>
> - `conda env create ...`: when using the `conda` environment manager<br>
> - `mamba env create ...`: when using the `mamba` environment manager<br>
> - `uv pip sync ...`: when using the `uv` environment manager<br>
> - `uv sync ...`: when using the `uv` environment manager<br>
> - `pixi install ...`: when using the `pixi` environment manager<br>
> Installing certain packages may not be secure, so please only run with an input environment file you trust!<br>
> Learn more about [PyPI security](https://pypi.org/security) and [conda security](https://www.anaconda.com/docs/reference/security).
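
To illustrate the mapping from environment manager to subprocess command, here is a rough sketch. It is not the code in `recreate_env.py`: the function name `pick_base_command` is hypothetical, and the arguments elided as `...` in the list above are deliberately not filled in. It checks that the tool is an executable on `PATH` via `shutil.which`, matching the "when '...' is an executable" wording in the script's docstring.

```python
import shutil
from typing import Callable, List, Optional

# Base commands from the warning above; the elided trailing arguments
# are intentionally omitted because the source does not state them.
BASE_COMMANDS = {
    "conda": ["conda", "env", "create"],
    "mamba": ["mamba", "env", "create"],
    "uv": ["uv", "sync"],
    "pixi": ["pixi", "install"],
}


def pick_base_command(
    manager: str,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> List[str]:
    """Return the base command for `manager` if its executable is found.

    Hypothetical helper: `which` is injectable so the lookup can be
    exercised without the tools being installed.
    """
    if manager not in BASE_COMMANDS:
        raise ValueError(f"Unsupported environment manager: {manager!r}")
    if which(manager) is None:
        raise FileNotFoundError(f"'{manager}' is not an executable on PATH")
    return list(BASE_COMMANDS[manager])
```

Returning an argument list rather than a shell string lets the result be passed to `subprocess.run` without `shell=True`, which avoids shell interpretation of an untrusted environment file path.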
@@ -240,3 +240,4 @@ if __name__ == "__main__":




37 changes: 28 additions & 9 deletions pyrosettacluster/dump_env_file.py
@@ -40,6 +40,7 @@ def main(
    env_dir: Optional[str],
    pyrosetta_init_flags: Optional[str],
    pandas_secure: bool,
    pyarrow_secure: bool,
) -> None:
    """
    Dump the PyRosettaCluster environment file(s) based on metadata from a PyRosettaCluster result.
@@ -49,16 +50,20 @@ def main(
        and input_file.endswith((".pkl_pose", ".pkl_pose.bz2", ".b64_pose", ".b64_pose.bz2"))
    ):
        if pyrosetta_init_flags:
            pyrosetta.distributed.init(pyrosetta_init_flags)
            pyrosetta.init(options="", extra_options=pyrosetta_init_flags, silent=True)
        else:
            pyrosetta.distributed.init()
            pyrosetta.init(options="", extra_options="-run:constant_seed 1 -out:level 200", silent=True)

    if (
        isinstance(scorefile, str)
        and not scorefile.endswith(".json")
    ):
        if pandas_secure:
            pyrosetta.secure_unpickle.add_secure_package("pandas")
            print(f"[INFO] Added 'pandas' as a secure package to the PyRosetta unpickle-allowed list.")
            if pyarrow_secure:
                pyrosetta.secure_unpickle.add_secure_package("pyarrow")
                print(f"[INFO] Added 'pyarrow' as a secure package to the PyRosetta unpickle-allowed list.")
        else:
            raise RuntimeError(
                "Please also pass the `--pandas_secure` flag to retrieve data from a `pandas.DataFrame`. "
@@ -188,7 +193,10 @@ def ensure_env_dir(path: Optional[str]) -> str:
"Path to a PyRosettaCluster output decoy file (a '.pdb', '.pdb.bz2', '.pkl_pose', "
"'.pkl_pose.bz2', '.b64_pose', or '.b64_pose.bz2' file) or an output PyRosetta "
"initialization file ('.init' or '.init.bz2' file). If provided, the `--scorefile` "
"and `--decoy_name` flags are ignored."
"and `--decoy_name` flags are ignored. If using a '.pkl_pose', '.pkl_pose.bz2', "
"'.b64_pose', '.b64_pose.bz2', '.init' or '.init.bz2' file, please ensure that "
"the current PyRosetta version is identical to that used to save the decoy file "
"or initialization file."
),
)

@@ -262,15 +270,26 @@ def ensure_env_dir(path: Optional[str]) -> str:
"comes from a trusted source."
),
)
parser.add_argument(
"--pyarrow_secure",
action="store_true",
default=False,
help=(
"Allow loading a pickled `pandas.DataFrame` scorefile with PyArrow-backed datatypes, "
"which may be required if the scorefile was saved with `pandas` version `>=3.0.0`. "
"This flag only has an effect if the `--pandas_secure` flag is also passed."
),
)

args = parser.parse_args()
args.env_dir = ensure_env_dir(args.env_dir)

main(
input_file=args.input_file,
scorefile=args.scorefile,
decoy_name=args.decoy_name,
env_dir=args.env_dir,
pyrosetta_init_flags=args.pyrosetta_init_flags,
pandas_secure=args.pandas_secure,
args.input_file,
args.scorefile,
args.decoy_name,
args.env_dir,
args.pyrosetta_init_flags,
args.pandas_secure,
args.pyarrow_secure,
)
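
The interaction between the two flags can be sketched in isolation. This is an illustrative stand-in rather than the script's code: `build_parser` and `packages_to_allow` are hypothetical names, and the sketch only computes which package names would be allowed instead of calling `pyrosetta.secure_unpickle.add_secure_package`. It mirrors the help text above, where `--pyarrow_secure` has an effect only when `--pandas_secure` is also passed.

```python
import argparse
from typing import List, Optional


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical, reduced parser with only the flags discussed here."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--scorefile", type=str, default=None)
    parser.add_argument("--pandas_secure", action="store_true", default=False)
    parser.add_argument("--pyarrow_secure", action="store_true", default=False)
    return parser


def packages_to_allow(
    scorefile: Optional[str],
    pandas_secure: bool,
    pyarrow_secure: bool,
) -> List[str]:
    """Return the package names that would be added to the allowed list."""
    # JSON scorefiles (or no scorefile) need no unpickling at all.
    if not isinstance(scorefile, str) or scorefile.endswith(".json"):
        return []
    if not pandas_secure:
        raise RuntimeError(
            "Please also pass the `--pandas_secure` flag to retrieve data "
            "from a `pandas.DataFrame`."
        )
    packages = ["pandas"]
    # PyArrow is only considered once pandas has been explicitly allowed.
    if pyarrow_secure:
        packages.append("pyarrow")
    return packages
```

For example, parsing `["--scorefile", "scores.gz", "--pandas_secure", "--pyarrow_secure"]` and passing the three values through yields `["pandas", "pyarrow"]`, while passing `--pyarrow_secure` alone with a pickled scorefile raises the `RuntimeError`.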
2 changes: 1 addition & 1 deletion pyrosettacluster/recreate_env.py
@@ -3,7 +3,7 @@
*Warning*: This script runs a subprocess with one of the following commands:
- `conda env create ...`: when 'conda' is an executable
- `mamba env create ...`: when 'mamba' is an executable
- `uv pip sync ...`: when 'uv' is an executable
- `uv sync ...`: when 'uv' is an executable
- `pixi install ...`: when 'pixi' is an executable
Installing certain packages may not be secure, so please only run with input files you trust.
Learn more about PyPI security at <https://pypi.org/security> and conda security