Commit 4aef6f3

committed
Task 2
Task 2 completed (expect test) Images added Documented
1 parent d4837d9 commit 4aef6f3

9 files changed: +303 -0 lines changed

PROJECT.md

Lines changed: 227 additions & 0 deletions
@@ -58,3 +58,230 @@ My_Ganga/
│ ├── output_pages/ # Stores individual PDF pages
│── gangadir/ # Ganga workspace
```

***

## **Interfacing Ganga Job**

### **Description**

This task demonstrates programmatic interaction with a Large Language Model (LLM) to generate and execute a Ganga job. The generated job calculates an approximation of π using the accept-reject (Monte Carlo) simulation method. A total of 1 million simulations is split into 1000 subjobs, each performing 1000 simulations.
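
The estimator behind the accept-reject method can be written out explicitly. As a sketch, assume points are drawn uniformly from the square $[-1, 1]^2$ (as the generated simulation script does) and that $N_{\text{in}}$ of the $N$ points fall inside the unit circle. The symbols $N$, $N_{\text{in}}$, $\hat{\pi}$ and $\sigma_{\hat{\pi}}$ are notation introduced here for illustration, not taken from the project:

```latex
\begin{align}
P\left(x^2 + y^2 \le 1\right) &= \frac{\text{area of unit circle}}{\text{area of square}} = \frac{\pi}{4}, \\
\hat{\pi} &= 4\,\frac{N_{\text{in}}}{N}, \\
\sigma_{\hat{\pi}} &= 4\sqrt{\frac{\tfrac{\pi}{4}\left(1 - \tfrac{\pi}{4}\right)}{N}}
  = \sqrt{\frac{\pi\,(4 - \pi)}{N}} \approx \frac{1.64}{\sqrt{N}}.
\end{align}
```

With $N = 1000 \times 1000 = 10^6$ samples the expected statistical error is about $0.0016$, so the combined estimate should typically agree with π to roughly two decimal places.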

### **Workflow Execution**

#### **Step 1: LLM Setup & Communication**

- Initialize interaction with the chosen LLM (deepseek-coder-v2:latest, run through the Ollama CLI in this case).
- Install Ollama and the deepseek-coder-v2:latest model:
  - To install Ollama on Linux, execute:
```sh
curl -fsSL https://ollama.com/install.sh | sh
```
  - To download and run deepseek-coder-v2:latest from the Ollama CLI, execute:
```sh
ollama run deepseek-coder-v2:latest
```
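
The interaction above goes through the interactive CLI. For fully programmatic access, one option is Ollama's local HTTP API. The following is a minimal sketch, assuming an Ollama server is running on its default port (11434) and the model has already been pulled; `build_request` and `ask_llm` are hypothetical helper names, not part of this project:

```python
import json
import urllib.request

# Assumption: a local Ollama server listening on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"
MODEL = "deepseek-coder-v2:latest"


def build_request(prompt, model=MODEL, url=OLLAMA_URL):
    """Build a non-streaming POST request for Ollama's /api/generate endpoint."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask_llm(prompt, timeout=600):
    """Send the prompt and return the generated text (needs a running server)."""
    with urllib.request.urlopen(build_request(prompt), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))["response"]
```

Calling `ask_llm("...")` with the prompts from Step 2 would return the model's reply as a single string, which can then be saved or parsed instead of being copied from the terminal.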

#### **Step 2: Prompting the LLM for Code Generation**

- Send an initial prompt asking the LLM to generate a structured prompt.

```
generate me a prompt to get me a plan for "code that can execute a job in Ganga that will
calculate an approximation to the number pi using an accept-reject simulation method will one
million simulations. The job should be split into a number of subjobs that each do thousand
simulations."
```

![Generating prompt for LLM Screenshot](images/Input_prompt_2.png)

- Send the structured prompt back to the LLM to generate the code.

```
Create a Python script that will:

1. Set up and submit a job in Ganga to perform one million simulations using the Monte Carlo accept-reject method for calculating an approximation of π.
2. Split this job into multiple subjobs, where each subjob conducts 1000 independent simulations.
3. Collect the results from all subjobs and compute the final approximation of π.

Please provide a detailed breakdown of:

1. A job submission script that sets up and submits the Ganga job.
2. A simulation script that performs the Monte Carlo estimation for π, executed within each subjob.
3. A result aggregation script that gathers outputs from all subjobs and computes the final approximation of π.
4. Any necessary parallelization strategies to ensure efficient execution across distributed resources.
5. Best practices for running distributed computations in Ganga, including resource allocation and monitoring.
```

![Generating Ganga code](images/Input_prompt_1.png)

#### **Step 3: The LLM Output**

To implement the task you've outlined using Python and Ganga, we need to break down the process into three main parts: the job submission script, the simulation script, and the result aggregation script. Below is a detailed guide on how these components can be implemented along with some best practices for running distributed computations in Ganga.

### 1. Job Submission Script
The job submission script (`create_pi_job.py`) will set up and submit the Ganga job to run multiple subjobs, each performing 1000 simulations independently. Here's how you can do it:

```python
from GangaCore.GPI import Job, ArgSplitter, Local, Executable

# Number of subjobs and simulations per subjob
num_subjobs = 1000
sims_per_subjob = 1000

# Path to the simulation script
simulation_script = "/path/to/project/scripts/pi_simulation.py"

# Create the job
j = Job(
    application=Executable(
        exe="/path/to/python"
    ),
    backend=Local(),
    splitter=ArgSplitter(
        # One [script, sample_count] argument list per subjob
        args=[[simulation_script, str(sims_per_subjob)]] * num_subjobs
    )
)

# Submit the job
j.submit()
```

### 2. Simulation Script
The simulation script (`pi_simulation.py`) will perform one instance of the Monte Carlo method to approximate π:

```python
import sys
import random

def monte_carlo_pi(num_samples):
    inside_circle = 0

    for _ in range(num_samples):
        x, y = random.uniform(-1, 1), random.uniform(-1, 1)
        if x**2 + y**2 <= 1:
            inside_circle += 1

    return (inside_circle / num_samples) * 4  # π approximation

if __name__ == "__main__":
    num_samples = int(sys.argv[1])
    pi_estimate = monte_carlo_pi(num_samples)

    # Print result to stdout (Ganga will capture this)
    print(pi_estimate)
```

### 3. Result Aggregation Script
The result aggregation script (`pi_merger.py`) will collect the results from all subjobs and compute the final approximation of π:

```python
import os

total_pi = 0
subjob_count = 0

# Base path where Ganga stores job output
base_dir = "/path/to/gangadir/workspace/user/LocalXML"

# Change this to your actual job ID
job_id = __  # Make sure this is correct!

# Path to the job's output directory
job_output_dir = os.path.join(base_dir, str(job_id))

# Iterate over all subjobs
for subjob_id in range(1000):
    output_file = os.path.join(job_output_dir, str(subjob_id), "output", "stdout")

    if os.path.exists(output_file):
        with open(output_file, "r") as f:
            try:
                pi_value = float(f.read().strip())
                total_pi += pi_value
                subjob_count += 1
            except ValueError:
                print(f"Skipping subjob {subjob_id}: Invalid output")
    else:
        print(f"Skipping subjob {subjob_id}: stdout file not found at {output_file}")

# Compute final approximation
if subjob_count > 0:
    final_pi = total_pi / subjob_count
    print(f"Final π approximation: {final_pi}")
else:
    print("No valid results found!")
```

### Best Practices for Running Distributed Computations in Ganga
- **Efficient Subjob Splitting**: Ensure that the subjobs are balanced and do not overload the system by creating too many small jobs. You can adjust `num_subjobs` and `sims_per_subjob` based on your resources.
- **Error Handling**: Implement error handling to manage failed subjobs gracefully. This can help in debugging and ensuring data integrity.
- **Output Data Management**: Store or log results from each simulation for later reference, which is especially important if the simulations are resource-intensive or time-consuming.
- **Resource Allocation**: Make sure that your computational resources (such as CPU and memory) are appropriately allocated to avoid overloading the system.
- **Monitoring and Logging**: Use Ganga's built-in tools to monitor job status and logs for debugging purposes. This is crucial when dealing with distributed computations.

This setup should help you efficiently run a large number of Monte Carlo simulations to approximate π using Ganga, following best practices in parallel computing and resource management.

#### **Step 4: Parsing the output code**
Create three Python files containing the code provided above:

1. create_pi_job.py
2. pi_merger.py
3. pi_simulation.py

**Make sure `job_id` in pi_merger.py is set to the ID of the submitted job.**

**Make sure the paths in the code are correct.**
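
Copying the three scripts by hand can also be automated. The sketch below assumes the LLM reply is available as one string and that it contains exactly one `python`-tagged fenced block per script, in the order shown in Step 3. `extract_python_blocks` and `save_blocks` are hypothetical helpers written for this illustration:

```python
import re
from pathlib import Path

# Built this way so the example can itself sit inside a Markdown fence.
FENCE = "`" * 3

# Matches a fenced block tagged "python" and captures its body.
CODE_BLOCK = re.compile(FENCE + r"python[ \t]*\n(.*?)" + FENCE, re.DOTALL)


def extract_python_blocks(llm_output):
    """Return the bodies of all python-tagged fenced blocks, in order."""
    return [body.rstrip() + "\n" for body in CODE_BLOCK.findall(llm_output)]


def save_blocks(llm_output, filenames, out_dir="."):
    """Write the n-th extracted block to the n-th filename; return the paths."""
    blocks = extract_python_blocks(llm_output)
    if len(blocks) != len(filenames):
        raise ValueError(f"expected {len(filenames)} code blocks, found {len(blocks)}")
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for name, body in zip(filenames, blocks):
        path = out / name
        path.write_text(body, encoding="utf-8")
        paths.append(path)
    return paths
```

For the output in Step 3 the call would be `save_blocks(reply, ["create_pi_job.py", "pi_simulation.py", "pi_merger.py"])`. The placeholder paths and the job ID inside the generated code still have to be edited by hand afterwards.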

#### **Step 5: Run the code**

- To run *create_pi_job.py* in Ganga, run:

```sh
ganga path/to/create_pi_job.py
```

![Running create_pi_job.py in ganga ](images/Job_submit.png)

- To run *pi_merger.py* with Python once all subjobs have completed, run:

```sh
python path/to/pi_merger.py
```

![Running pi_merger.py in python ](images/Pi_merge.png)
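
The merger script hard-codes both the job ID and the number of subjobs. One way to relax this is to discover the subjob outputs on disk. The sketch below assumes the same directory layout that pi_merger.py uses (`<job_output_dir>/<subjob_id>/output/stdout`); `read_estimates` and `combine` are hypothetical names for this illustration:

```python
import sys
from pathlib import Path


def read_estimates(job_output_dir):
    """Collect one float per subjob from <job_output_dir>/<subjob_id>/output/stdout."""
    estimates = []
    for stdout_file in sorted(Path(job_output_dir).glob("*/output/stdout")):
        try:
            estimates.append(float(stdout_file.read_text().strip()))
        except ValueError:
            # Empty or malformed output: report it and carry on.
            print(f"Skipping {stdout_file}: invalid output", file=sys.stderr)
    return estimates


def combine(estimates):
    """Average the per-subjob estimates.

    A plain mean is valid here because every subjob uses the same sample count.
    """
    if not estimates:
        raise ValueError("no valid subjob results found")
    return sum(estimates) / len(estimates)
```

With this in place, `combine(read_estimates("/path/to/gangadir/workspace/user/LocalXML/<job_id>"))` gives the final approximation for any number of subjobs, and only the job directory needs to be supplied.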

## **Current Status**
- [x] LLM-based code generation tested with Ollama.
- [x] Job submission, execution, and result aggregation working.
- [x] Debugging and improvements made to dynamically set job IDs.
- [ ] Automated test for execution success in progress.
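
A possible starting point for the pending automated test is sketched below. It covers only the simulation function, not the Ganga submission itself, and it assumes the estimator is refactored to accept an injectable random generator (the `rng` parameter is an addition made for this illustration, not part of the committed pi_simulation.py):

```python
import math
import random


def monte_carlo_pi(num_samples, rng=random):
    """Same accept-reject estimator as pi_simulation.py, with an injectable RNG."""
    inside_circle = 0
    for _ in range(num_samples):
        x, y = rng.uniform(-1, 1), rng.uniform(-1, 1)
        if x**2 + y**2 <= 1:
            inside_circle += 1
    return (inside_circle / num_samples) * 4


def test_estimate_is_close_to_pi():
    # A fixed seed keeps the test deterministic; a tolerance of 0.1 is
    # roughly six standard errors for 10,000 samples.
    estimate = monte_carlo_pi(10_000, rng=random.Random(12345))
    assert abs(estimate - math.pi) < 0.1


def test_estimate_is_within_bounds():
    # The estimator is 4 * (a fraction in [0, 1]), so it cannot leave [0, 4].
    assert 0.0 <= monte_carlo_pi(100, rng=random.Random(0)) <= 4.0
```

The functions follow the `test_*` naming convention, so a runner such as pytest would pick them up without extra configuration.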

images/Input_prompt_1.png (173 KB)
images/Input_prompt_2.png (161 KB)
images/Job_submit.png (77 KB)
images/LLM_CLI.png (138 KB)
images/Pi_merge.png (113 KB)

create_pi_job.py

Lines changed: 22 additions & 0 deletions
@@ -0,0 +1,22 @@
from GangaCore.GPI import Job, ArgSplitter, Local, Executable

# Number of subjobs and simulations per subjob
num_subjobs = 1000
sims_per_subjob = 1000

# Path to the simulation script
simulation_script = "/home/uverma/Documents/code/My_Ganga/my_code/Interfacing_Ganga/pi_simulation.py"

# Create the job
j = Job(
    application=Executable(
        exe="/home/uverma/Documents/code/My_Ganga/p_env/bin/python"
    ),
    backend=Local(),
    splitter=ArgSplitter(
        args=[[simulation_script, str(sims_per_subjob)]] * num_subjobs  # Correctly format the args
    )
)

# Submit the job
j.submit()

pi_merger.py

Lines changed: 35 additions & 0 deletions
@@ -0,0 +1,35 @@
import os

total_pi = 0
subjob_count = 0

# Base path where Ganga stores job output
base_dir = "/home/uverma/gangadir/workspace/uverma/LocalXML"

# Change this to your actual job ID
job_id = 137  # Make sure this is correct!

# Path to the job's output directory
job_output_dir = os.path.join(base_dir, str(job_id))

# Iterate over all subjobs
for subjob_id in range(1000):
    output_file = os.path.join(job_output_dir, str(subjob_id), "output", "stdout")

    if os.path.exists(output_file):
        with open(output_file, "r") as f:
            try:
                pi_value = float(f.read().strip())
                total_pi += pi_value
                subjob_count += 1
            except ValueError:
                print(f"Skipping subjob {subjob_id}: Invalid output")
    else:
        print(f"Skipping subjob {subjob_id}: stdout file not found at {output_file}")

# Compute final approximation
if subjob_count > 0:
    final_pi = total_pi / subjob_count
    print(f"Final π approximation: {final_pi}")
else:
    print("No valid results found!")

pi_simulation.py

Lines changed: 19 additions & 0 deletions
@@ -0,0 +1,19 @@
import sys
import random

def monte_carlo_pi(num_samples):
    inside_circle = 0

    for _ in range(num_samples):
        x, y = random.uniform(-1, 1), random.uniform(-1, 1)
        if x**2 + y**2 <= 1:
            inside_circle += 1

    return (inside_circle / num_samples) * 4  # π approximation

if __name__ == "__main__":
    num_samples = int(sys.argv[1])
    pi_estimate = monte_carlo_pi(num_samples)

    # Print result to stdout (Ganga will capture this)
    print(pi_estimate)
