Commit bbdee59

docs: Update README with installation command and usage examples for named sessions
1 parent a411876 commit bbdee59

1 file changed: +17 -5 lines changed

README.md

Lines changed: 17 additions & 5 deletions
@@ -39,7 +39,7 @@ in your code using the builder API:
 1. Install the latest version of Dataproc Spark Connect:

 ```sh
-pip install -U dataproc_spark_connect
+!pip install -U --pre dataproc-spark-connect
 ```

 2. Add the required imports into your PySpark application or notebook and start
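
A note on the hunk above: the switch from `dataproc_spark_connect` to `dataproc-spark-connect` should not change which package pip resolves, because Python packaging tools compare project names after normalization (runs of `-`, `_`, and `.` collapse to a single `-`, and case is ignored). The behavioral changes are the `--pre` flag, which lets pip consider pre-release versions, and the leading `!`, which is notebook shell-escape syntax. The sketch below illustrates the normalization rule only; `normalize_project_name` is a hypothetical helper name, not part of pip or of this repository.

```python
import re


def normalize_project_name(name: str) -> str:
    """Normalize a project name the way packaging tools compare names.

    Hypothetical helper for illustration: collapses runs of '-', '_' and '.'
    into a single '-' and lowercases the result.
    """
    return re.sub(r"[-_.]+", "-", name).lower()


old_name = "dataproc_spark_connect"   # spelling removed by the commit
new_name = "dataproc-spark-connect"   # spelling added by the commit

# Both spellings normalize to the same project name.
same_project = normalize_project_name(old_name) == normalize_project_name(new_name)
```
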
@@ -75,19 +75,31 @@ Named sessions allow you to share a single Spark session across multiple noteboo

 To create or connect to a named session:

-1. Specify a custom session ID when creating the session:
+1. Create a session with a custom ID in your first notebook:

 ```python
 from google.cloud.dataproc_spark_connect import DataprocSparkSession
 session_id = 'my-ml-pipeline-session'
 spark = DataprocSparkSession.builder.dataprocSessionId(session_id).runtimeVersion('3.0').getOrCreate()
+df = spark.createDataFrame([(1, 'data')], ['id', 'value'])
+df.show()
 ```

-2. Session IDs must be 4-63 characters long, start with a lowercase letter, contain only lowercase letters, numbers, and hyphens, and not end with a hyphen.
+2. Reuse the same session in another notebook by specifying the same session ID:

-3. Named sessions persist until explicitly terminated or reach their configured TTL.
+```python
+from google.cloud.dataproc_spark_connect import DataprocSparkSession
+session_id = 'my-ml-pipeline-session'
+spark = DataprocSparkSession.builder.dataprocSessionId(session_id).runtimeVersion('3.0').getOrCreate()
+df = spark.createDataFrame([(2, 'more-data')], ['id', 'value'])
+df.show()
+```
+
+3. Session IDs must be 4-63 characters long, start with a lowercase letter, contain only lowercase letters, numbers, and hyphens, and not end with a hyphen.
+
+4. Named sessions persist until explicitly terminated or reach their configured TTL.

-4. A session with a given ID that is in a TERMINATED state cannot be reused. It must be deleted before a new session with the same ID can be created.
+5. A session with a given ID that is in a TERMINATED state cannot be reused. It must be deleted before a new session with the same ID can be created.

 ### Using Spark SQL Magic Commands (Jupyter Notebooks)
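
The session ID rules quoted in this README change (4-63 characters, starts with a lowercase letter, only lowercase letters, digits, and hyphens, does not end with a hyphen) can be checked locally before calling `dataprocSessionId`. The following is a minimal sketch under the assumption that those rules are complete; `is_valid_session_id` and the regular expression are hypothetical and derived only from the README text, so the library's own validation may differ.

```python
import re

# Assumed pattern, built from the README's stated rules:
#   first character: one lowercase letter
#   middle:          2 to 61 lowercase letters, digits, or hyphens
#   last character:  one lowercase letter or digit (so no trailing hyphen)
# Total length is therefore between 4 and 63 characters.
_SESSION_ID_RE = re.compile(r"[a-z][a-z0-9-]{2,61}[a-z0-9]")


def is_valid_session_id(session_id: str) -> bool:
    """Return True if session_id satisfies the documented naming rules."""
    return _SESSION_ID_RE.fullmatch(session_id) is not None


# The ID used in the README examples passes the check.
example_ok = is_valid_session_id("my-ml-pipeline-session")
```

Running a check like this in each notebook before `getOrCreate()` may help catch a mistyped ID early, since a different ID would refer to a different session rather than the shared one.
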
