:imagesdir: ../../../images

[id="creating-data-science-project"]
= First AI Demo

In this demo, you will configure a Jupyter notebook server using a specified image within a Data Science project, customizing it to meet your specific requirements.

.Procedure

. Click *Red Hat OpenShift AI* in the nines menu on the OpenShift Console.

. Click *Log in with OpenShift*.

. Click on the *Data Science Projects* tab.

. Click *Create project*.

.. Enter a name for the project in the *Name* field and click *Create*.

. Click *Create a workbench*. Now you are ready to define the workbench.

.. Give the *workbench* a name.

.. From the *image selection* dropdown, select *Standard Data Science* as the notebook image.

.. Set the container size to *Small* under *Deployment size*.

.. Scroll down to the *Cluster storage* section and enter a name for the new persistent storage that will be created.

.. Set the *persistent storage size* to 10 Gi.

.. Click the *Create workbench* button at the bottom of the page.
+
After the workbench is created successfully, its status changes to *Running*.

.. Click the *Open↗* button, located beside the status.

.. Authorize access to the OpenShift cluster by clicking *Allow selected permissions*. After granting permissions, you are directed to the Jupyter Notebook page.

== Accessing the current data science project within Jupyter Notebook

The Jupyter Notebook provides functionality to fetch or clone existing GitHub repositories, similar to any other standard IDE. Therefore, in this section, you will clone an existing, simple AI/ML application into the notebook using the following instructions.

. From the top, click the *Git clone* icon.

. In the popup window, enter the URL of the GitHub repository in the *Git Repository URL* field:
+
[source,text]
----
https://github.com/redhat-developer-demos/openshift-ai.git
----

. Click the *Clone* button.

. After the GitHub repository is fetched, the project appears in the directory section on the left side of the notebook. The same clone can also be scripted, as shown below.
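+
If you prefer the command line, a minimal equivalent is to run the same clone from a JupyterLab terminal (or, prefixed with `!`, from a notebook cell):
+
[source,terminal]
----
$ git clone https://github.com/redhat-developer-demos/openshift-ai.git
----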

. Expand the */openshift-ai/first-app/* directory.

. Open the *openshift-ai-test.ipynb* file.
+
You will be presented with the view of a Jupyter Notebook.

== Running code in a Jupyter notebook

In the previous section, you imported and opened the notebook. To run the code within the notebook, start by clicking the *Run* icon located at the top of the interface. This initiates the execution of the code in the currently selected cell.

After you click *Run*, the notebook automatically moves to the next cell. This is part of the design of Jupyter notebooks, where scripts or code snippets are divided into multiple cells. Each cell can be run independently, allowing you to test specific sections of code in isolation. This structure greatly aids both incremental development and debugging, because you can pinpoint errors and test solutions cell by cell.

For instance, as shown in Figure 15, after executing a cell, you can immediately see the output just below it. This immediate feedback loop is invaluable for iterative testing and refining of code.
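
For example, a short cell like the following, a hypothetical stand-in rather than the actual contents of *openshift-ai-test.ipynb*, prints part of its output and displays the value of its last expression directly below the cell:

[source,python]
----
# Every statement in the cell runs when the cell is executed; print output
# and the value of the final expression both appear below the cell.
import sys

print(f"Running on Python {sys.version_info.major}.{sys.version_info.minor}")

numbers = [2, 4, 6, 8]
sum(numbers) / len(numbers)  # displayed as the cell's output: 5.0
----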

[id="interactive-classification-project"]
= Performing an interactive classification with Jupyter notebook

In this section, you will perform an interactive image classification using a Jupyter notebook.

.Procedure

. Click on the *Data Science Projects* tab.

. Click *Create project*.

.. Enter a name for the project in the *Name* field and click *Create*.

. Click *Create a workbench*. Now you are ready to define the workbench.

.. Give the *workbench* a name, for example *interactive_classification*.

.. From the *image selection* dropdown, select *TensorFlow* as the notebook image.

.. Set the container size to *Medium* under *Deployment size*.

.. Scroll down to the *Cluster storage* section and enter a name for the new persistent storage that will be created.

.. Set the *persistent storage size* to 20 Gi.

.. Click the *Create workbench* button at the bottom of the page.
+
After the workbench is created successfully, its status changes to *Running*.

.. Click the *Open↗* button, located beside the status.

.. Authorize access to the OpenShift cluster by clicking *Allow selected permissions*. After granting permissions, you are directed to the Jupyter Notebook page.

== Obtaining and preparing the dataset

Simplify data preparation in AI projects by automating dataset fetching with Kaggle's API, following these steps:

. Navigate to the Kaggle website and log in with your account credentials.

. Click your profile icon at the top right corner of the page, then select *Account* from the dropdown menu.

. Scroll down to the section labeled *API* and click the *Create New Token* button.

. A file named `kaggle.json` is downloaded to your local machine. This file contains your Kaggle API credentials.

. Upload the `kaggle.json` file to your JupyterLab environment. You can drag and drop the file into the JupyterLab file browser. This step might look different depending on your operating system and desktop user interface. Once the token is uploaded, the download itself can be scripted, as sketched below.
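+
As a minimal sketch of that scripted route, the following cell moves the token to where the Kaggle client expects it, authenticates, and downloads a dataset. It assumes the `kaggle` Python package is installed in the workbench image, and the dataset slug is a placeholder; substitute the one your notebook actually uses.
+
[source,python]
----
import os
import shutil

# The Kaggle client reads its token from ~/.kaggle/kaggle.json.
kaggle_dir = os.path.expanduser("~/.kaggle")
os.makedirs(kaggle_dir, exist_ok=True)
shutil.copy("kaggle.json", os.path.join(kaggle_dir, "kaggle.json"))
# Tighten permissions; the client warns on world-readable tokens.
os.chmod(os.path.join(kaggle_dir, "kaggle.json"), 0o600)

# Import after the token is in place, then authenticate and download.
from kaggle.api.kaggle_api_extended import KaggleApi

api = KaggleApi()
api.authenticate()
# "<owner>/<dataset>" is a placeholder slug, for example a cats-and-dogs dataset.
api.dataset_download_files("<owner>/<dataset>", path="data", unzip=True)
----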

. Clone the Interactive Image Classification project from the GitHub repository using the following instructions:

.. At the top of the JupyterLab interface, click the *Git Clone* icon.

.. In the popup window, enter the URL of the GitHub repository in the *Git Repository URL* field:
+
[source,text]
----
https://github.com/redhat-developer-demos/openshift-ai.git
----

.. Click the *Clone* button.

.. After cloning, navigate to the *openshift-ai/2_interactive_classification* directory within the cloned repository.

. Open the Python notebook in the JupyterLab interface.
+
After you upload `kaggle.json` and clone the `openshift-ai` repository, the JupyterLab file browser on the left shows both `openshift-ai` and `kaggle.json`.

. Open `Interactive_Image_Classification_Notebook.ipynb` in the `openshift-ai` directory and run the notebook. The notebook contains all necessary instructions and is self-documented.

. Run the cells in the Python notebook as follows:

.. Start by executing each cell in order by pressing the play button or using the keyboard shortcut Shift+Enter.

.. Once you run the cell in Step 4, you should see an output as shown in Figure 12 below.

.. Running the cell in Step 5 produces an output of two images, one of a cat and one of a dog, with their respective predictions labeled as "Cat" and "Dog", as shown in Figure 14 below.

.. Once the code in the cell in Step 6 is executed, a *Predict* button appears, as shown in Figure 15 above. The interactive session displays images with their predicted labels in real time as you click the *Predict* button. This dynamic interaction helps you understand how well the model performs across a random set of images and provides insights into potential improvements for model training. A sketch of how such a button can be wired follows below.
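+
The following cell is a rough, self-contained illustration of that wiring, not the notebook's actual code: an ipywidgets *Predict* button drives a classifier, with a tiny untrained model and random images standing in for the trained model and test set that the real notebook builds in earlier cells.
+
[source,python]
----
import numpy as np
import ipywidgets as widgets
import matplotlib.pyplot as plt
import tensorflow as tf
from IPython.display import display

# Stand-ins for objects the real notebook prepares in earlier cells:
# a trained classifier and a preprocessed test set.
model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(64, 64, 3)),
    tf.keras.layers.Dense(2, activation="softmax"),
])
test_images = np.random.rand(8, 64, 64, 3).astype("float32")
class_names = ["Cat", "Dog"]  # assumed label order

button = widgets.Button(description="Predict")
output = widgets.Output()

def on_predict_clicked(_):
    # Pick a random image, classify it, and render the image with its label.
    with output:
        output.clear_output(wait=True)
        img = test_images[np.random.randint(len(test_images))]
        probs = model.predict(img[np.newaxis, ...], verbose=0)
        plt.imshow(img)
        plt.title(f"Predicted: {class_names[int(np.argmax(probs))]}")
        plt.axis("off")
        plt.show()

button.on_click(on_predict_clicked)
display(button, output)
----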