Commit cd92ca5

authored
proteins readme
1 parent a8823c8 commit cd92ca5

1 file changed

+146
-2
lines changed

README.md

Lines changed: 146 additions & 2 deletions
@@ -1,2 +1,146 @@
1-
# python-chebai-proteins
2-
Protein-related extension of the chebai framework
1+
2+
# 🧪 ChEB-AI Proteins
3+
4+
The `python-chebai-proteins` repository provides protein prediction and classification, built on top of the [`python-chebai`](https://github.com/ChEB-AI/python-chebai) codebase.
5+
6+
7+
## 🔧 Installation
8+
9+
10+
To install, follow these steps:
11+
12+
1. Clone the repository:
13+
```
14+
git clone https://github.com/ChEB-AI/python-chebai-proteins.git
15+
```
16+
17+
2. Install the package:
18+
19+
```
20+
cd python-chebai-proteins
21+
pip install .
22+
```
23+
24+
## 🗂 Recommended Folder Structure
25+
26+
To combine configuration files from both `python-chebai` and `python-chebai-proteins`, structure your project like this:
27+
28+
```
29+
my_projects/
30+
├── python-chebai/
31+
│ ├── chebai/
32+
│ ├── configs/
33+
│ └── ...
34+
└── python-chebai-proteins/
35+
├── chebai_proteins/
36+
├── configs/
37+
└── ...
38+
```
39+
40+
This setup enables shared access to data and model configurations.
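
A quick way to confirm the layout before training is a small check like the one below. This is only a sketch: `missing_layout_dirs` is a hypothetical helper (not part of either package), and the expected sub-folders are simply copied from the tree above.

```python
from pathlib import Path

# Sub-folders expected under the project root, copied from the tree above.
EXPECTED_DIRS = (
    "python-chebai/chebai",
    "python-chebai/configs",
    "python-chebai-proteins/chebai_proteins",
    "python-chebai-proteins/configs",
)


def missing_layout_dirs(project_root):
    """Return the expected sub-folders that do not exist under project_root."""
    root = Path(project_root)
    return [rel for rel in EXPECTED_DIRS if not (root / rel).is_dir()]
```

Calling `missing_layout_dirs("my_projects")` returns an empty list when both repositories sit side by side as recommended, and otherwise lists the folders that could not be found.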
41+
42+
43+
44+
## 🚀 Training & Pretraining Guide
45+
46+
### ⚠️ Important Setup Instructions
47+
48+
Before running any training scripts, ensure the environment is correctly configured:
49+
50+
* Either:
51+
52+
* Install the `python-chebai` repository as a package using:
53+
54+
```bash
55+
pip install .
56+
```
57+
* **OR**
58+
59+
* Manually set the `PYTHONPATH` environment variable if working across multiple directories (`python-chebai` and `python-chebai-proteins`):
60+
61+
* If your current working directory is `python-chebai-proteins`, set:
62+
63+
```bash
64+
export PYTHONPATH=path/to/python-chebai
65+
```
66+
or vice versa.
67+
68+
* If you work in both repositories at once, or run into `ModuleNotFoundError`, we **recommend adding both directories**:
69+
70+
```bash
71+
# Linux/macOS
72+
export PYTHONPATH=path/to/python-chebai:path/to/python-chebai-proteins
73+
74+
# Windows (use semicolon instead of colon)
75+
set PYTHONPATH=path\to\python-chebai;path\to\python-chebai-proteins
76+
```
77+
78+
> 🔎 See the [PYTHONPATH Explained](#-pythonpath-explained) section below for more details.
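
If you set the variable from a script rather than by hand, the separator difference between platforms can be handled with `os.pathsep`. The snippet below is a minimal sketch; `build_pythonpath` is a hypothetical helper name, and the paths are the same placeholders used in the shell examples above.

```python
import os


def build_pythonpath(*repo_dirs):
    """Join directories into a PYTHONPATH value using the OS-specific separator."""
    # os.pathsep is ":" on Linux/macOS and ";" on Windows.
    return os.pathsep.join(os.path.abspath(d) for d in repo_dirs)


# Placeholder paths; replace them with the real locations of your checkouts.
value = build_pythonpath("path/to/python-chebai", "path/to/python-chebai-proteins")
print(value)
```

The resulting string can be passed to a child process through its environment, for example `subprocess.run(..., env={**os.environ, "PYTHONPATH": value})`.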
79+
80+
81+
### 📊 SCOPE hierarchy prediction
82+
83+
Assuming your current working directory is `python-chebai-proteins`, run the following command to start training:
84+
```bash
85+
python -m chebai fit --trainer=../configs/training/default_trainer.yml --trainer.callbacks=../configs/training/default_callbacks.yml --trainer.logger.init_args.name=scope50 --trainer.accumulate_grad_batches=4 --trainer.logger=../configs/training/wandb_logger.yml --trainer.min_epochs=100 --trainer.max_epochs=100 --data=configs/data/scope/scope50.yml --data.init_args.batch_size=32 --data.init_args.num_workers=10 --model=../configs/model/electra.yml --model.train_metrics=../configs/metrics/micro-macro-f1.yml --model.test_metrics=../configs/metrics/micro-macro-f1.yml --model.val_metrics=../configs/metrics/micro-macro-f1.yml --model.pass_loss_kwargs=false --model.criterion=../configs/loss/bce.yml --model.criterion.init_args.beta=0.99
86+
```
87+
88+
The same command can be used for **DeepGO** by changing only the data config path (`--data=...`).
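
Because the command is long, it can help to assemble it from a list so that only the data config and run name change between experiments. The sketch below reuses exactly the flags from the command above; `build_fit_command` and its parameter names are hypothetical, and the `../configs` prefix is taken verbatim from that command, so adjust it if your `python-chebai` configs live elsewhere.

```python
def build_fit_command(data_config, run_name, chebai_configs="../configs"):
    """Assemble the `python -m chebai fit` call shown above as an argument list."""
    metrics = f"{chebai_configs}/metrics/micro-macro-f1.yml"
    return [
        "python", "-m", "chebai", "fit",
        f"--trainer={chebai_configs}/training/default_trainer.yml",
        f"--trainer.callbacks={chebai_configs}/training/default_callbacks.yml",
        f"--trainer.logger.init_args.name={run_name}",
        "--trainer.accumulate_grad_batches=4",
        f"--trainer.logger={chebai_configs}/training/wandb_logger.yml",
        "--trainer.min_epochs=100",
        "--trainer.max_epochs=100",
        f"--data={data_config}",
        "--data.init_args.batch_size=32",
        "--data.init_args.num_workers=10",
        f"--model={chebai_configs}/model/electra.yml",
        f"--model.train_metrics={metrics}",
        f"--model.test_metrics={metrics}",
        f"--model.val_metrics={metrics}",
        "--model.pass_loss_kwargs=false",
        f"--model.criterion={chebai_configs}/loss/bce.yml",
        "--model.criterion.init_args.beta=0.99",
    ]


# Rebuilds the SCOPE command; pass a different data config for another dataset.
command = build_fit_command("configs/data/scope/scope50.yml", run_name="scope50")
print(" ".join(command))
```

To launch training from Python, the list can be handed to `subprocess.run(command, check=True)` while the working directory is `python-chebai-proteins`.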
89+
90+
91+
92+
93+
94+
95+
96+
## 🧭 PYTHONPATH Explained
97+
98+
### What is `PYTHONPATH`?
99+
100+
`PYTHONPATH` is an environment variable that tells Python where to search for modules that are neither installed via `pip` nor located in your current working directory.
101+
102+
### Why You Need It
103+
104+
If your config refers to a custom module like:
105+
106+
```yaml
107+
class_path: chebai_proteins.preprocessing.datasets.scope.scope.SCOPe50
108+
```
109+
110+
...and you're running the code from `python-chebai`, Python won't be able to import `chebai_proteins` (which lives in the separate `python-chebai-proteins/` repo) unless that repo's directory is on `PYTHONPATH` or the package is installed.
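
To see why the import fails, it helps to picture how a dotted `class_path` is typically turned into a class: the module part is imported first, then the class is looked up on it. The following is an illustrative sketch only, not the actual loader used by `chebai`; `resolve_class_path` and the missing package name are hypothetical.

```python
import importlib


def resolve_class_path(class_path):
    """Import the module part of a dotted path and return the named attribute."""
    module_name, _, class_name = class_path.rpartition(".")
    # import_module raises ModuleNotFoundError if the package is not on sys.path.
    module = importlib.import_module(module_name)
    return getattr(module, class_name)


# Works for anything importable, for example a standard-library class.
print(resolve_class_path("collections.OrderedDict"))

# A package that is not on the search path fails at the import step.
try:
    resolve_class_path("not_an_installed_package.module.SomeClass")
except ModuleNotFoundError as err:
    print("import failed:", err)
```

A `class_path` that points into `chebai_proteins` fails in the same way whenever that package is neither installed nor reachable through `PYTHONPATH`.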
111+
112+
113+
### How Python Finds Modules
114+
115+
Python looks for imports in this order:
116+
117+
1. The directory of the script being run (or the current directory when using `python -m` or `python -c`)
118+
2. Paths in `PYTHONPATH`
119+
3. Standard library
120+
4. Installed packages (`site-packages`)
121+
122+
You can inspect the full search paths:
123+
124+
```bash
125+
python -c "import sys; print(sys.path)"
126+
```
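
To check how a `PYTHONPATH` entry ends up on the search path, you can start a fresh interpreter with the variable set and read back its `sys.path`. The snippet below is a small sketch using only the standard library; `child_sys_path` is a hypothetical helper, and a temporary directory stands in for a real repository checkout.

```python
import json
import os
import subprocess
import sys
import tempfile


def child_sys_path(pythonpath):
    """Start a fresh interpreter with the given PYTHONPATH and return its sys.path."""
    env = dict(os.environ, PYTHONPATH=pythonpath)
    result = subprocess.run(
        [sys.executable, "-c", "import json, sys; print(json.dumps(sys.path))"],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(result.stdout)


# The temporary directory is a stand-in for e.g. path/to/python-chebai-proteins.
with tempfile.TemporaryDirectory() as extra_dir:
    paths = child_sys_path(extra_dir)
    print("on sys.path:", extra_dir in paths)
```

Only the child process sees the modified variable here; the environment of the calling shell is left unchanged.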
127+
128+
129+
130+
### ✅ Setting `PYTHONPATH`
131+
132+
#### 🐧 Linux / macOS
133+
134+
```bash
135+
export PYTHONPATH=/path/to/python-chebai-proteins
136+
echo $PYTHONPATH
137+
```
138+
139+
#### 🪟 Windows CMD
140+
141+
```cmd
142+
set PYTHONPATH=C:\path\to\python-chebai-proteins
143+
echo %PYTHONPATH%
144+
```
145+
146+
> 💡 Note: This is temporary for your terminal session. To make it permanent, add it to your system environment variables.
