Commit ddaf0ba

Minor updates (#157)
1 parent 6f45d03 commit ddaf0ba

2 files changed: +47 -31 lines


AI_Postdoc_Workshop/module1/2-langchain.ipynb

Lines changed: 43 additions & 27 deletions
@@ -49,7 +49,7 @@
  },
  {
  "cell_type": "code",
- "execution_count": 1,
+ "execution_count": null,
  "id": "8dcf656b",
  "metadata": {},
  "outputs": [],
@@ -64,7 +64,7 @@
  "source": [
  "### String PromptTemplates\n",
  "\n",
- "The [String PromptTemplates](https://python.langchain.com/v0.2/docs/concepts/#string-prompttemplates) is used to format a string input. By default, the template takes Python `f-string` format. There are currently 2 choices of `template_format` available: `f-string` and `jinja2`. Later we will see the use of `jinja2` format. In the example below, we will use the `f-string` format."
+ "The [String PromptTemplates](https://python.langchain.com/v0.2/docs/concepts/#string-prompttemplates) are used to format a string input. By default, the templates take Python's `f-string` format. There are currently 2 choices of `template_format` available: `f-string` and `jinja2`. Later we will see the use of `jinja2` format. In the example below, we will use the `f-string` format."
  ]
  },
  {
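
A minimal sketch of the `f-string` template behavior the cell above describes, assuming `langchain-core` is installed (the "topic" variable name is illustrative, not from the notebook):

from langchain_core.prompts import PromptTemplate

# template_format defaults to "f-string"; {topic} becomes an input variable
template = PromptTemplate.from_template("Tell me a fact about {topic}.")
print(template.format(topic="galaxies"))  # -> "Tell me a fact about galaxies."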
@@ -91,7 +91,7 @@
  },
  {
  "cell_type": "code",
- "execution_count": 3,
+ "execution_count": null,
  "id": "13cf3124",
  "metadata": {},
  "outputs": [],
@@ -124,7 +124,7 @@
  },
  {
  "cell_type": "code",
- "execution_count": 5,
+ "execution_count": null,
  "id": "9",
  "metadata": {},
  "outputs": [],
@@ -163,7 +163,7 @@
  },
  {
  "cell_type": "code",
- "execution_count": 8,
+ "execution_count": null,
  "id": "12",
  "metadata": {},
  "outputs": [],
@@ -192,14 +192,14 @@
  "source": [
  "#### Your turn 😎\n",
  "\n",
- "Create a `StringPromptTemplate` that outputs some text generation prompt, for example, \"Sun is part of galaxy ...\".\n",
+ "Create a `String PromptTemplate` that outputs some text generation prompt, for example, \"Sun is part of galaxy ...\".\n",
  "\n",
  "Feel free to experiment with the built in [Python `f-string` ](https://docs.python.org/3.11/tutorial/inputoutput.html#formatted-string-literals) for the `prompt` input argument to the model."
  ]
  },
  {
  "cell_type": "code",
- "execution_count": 10,
+ "execution_count": null,
  "id": "b0cdc634",
  "metadata": {},
  "outputs": [],
@@ -220,12 +220,12 @@
  "id": "8e67e77a",
  "metadata": {},
  "source": [
- "LangChain have implemented a [`Runnable`](https://api.python.langchain.com/en/stable/runnables/langchain_core.runnables.base.Runnable.html#langchain_core.runnables.base.Runnable) protocol that allows us to create custom \"chains\".\n",
+ "LangChain has implemented a [`Runnable`](https://api.python.langchain.com/en/stable/runnables/langchain_core.runnables.base.Runnable.html#langchain_core.runnables.base.Runnable) protocol that allows us to create custom \"chains\".\n",
  "This protocol has a standard interface for defining and invoking various LLMs, PromptTemplates, and other components, enabling reusability.\n",
  "For more details, go to LangChain's [Runnable documentation](https://python.langchain.com/v0.2/docs/concepts/#runnable-interface).\n",
  "\n",
  "```{note}\n",
- "In this tutorial, you will see the use of `.invoke` method on various LangChain's object.\n",
+ "In this tutorial, you will see the use of the `.invoke` method on various LangChain objects.\n",
  "This is essentially using that standard interface for the `Runnable` protocol.\n",
  "```"
  ]
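
A short sketch of what the `Runnable` protocol provides, assuming `langchain-core` is installed; the pipe composition shown in the comments is LangChain's standard chain-building syntax, though the notebook's own components may differ:

from langchain_core.prompts import PromptTemplate
from langchain_core.output_parsers import StrOutputParser

prompt = PromptTemplate.from_template("What are {objects}?")

# Every Runnable exposes the same .invoke interface:
print(prompt.invoke({"objects": "stars"}))

# Runnables also compose with | into a new Runnable (llm stands in for
# whichever model object the notebook defines):
# chain = prompt | llm | StrOutputParser()
# chain.invoke({"objects": "stars"})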
@@ -240,7 +240,7 @@
  },
  {
  "cell_type": "code",
- "execution_count": 11,
+ "execution_count": null,
  "id": "16",
  "metadata": {},
  "outputs": [],
@@ -250,7 +250,7 @@
  },
  {
  "cell_type": "code",
- "execution_count": 12,
+ "execution_count": null,
  "id": "17",
  "metadata": {},
  "outputs": [],
@@ -290,7 +290,7 @@
  },
  {
  "cell_type": "code",
- "execution_count": 14,
+ "execution_count": null,
  "id": "864d5266",
  "metadata": {},
  "outputs": [],
@@ -313,7 +313,7 @@
  "id": "8b730d9e",
  "metadata": {},
  "source": [
- "If you'd like to access the base object `Llama` object from the `llama-cpp-python` package, you can access it via the `.client` attribute of the `LlamaCpp` object."
+ "If you'd like to access the base `Llama` object from the `llama-cpp-python` package, you can access it via the `.client` attribute of the `LlamaCpp` object."
  ]
  },
  {
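
A hedged sketch of that `.client` access, assuming `langchain-community` and `llama-cpp-python` are installed; the model path is a placeholder, not the tutorial's actual file:

from langchain_community.llms import LlamaCpp

llm = LlamaCpp(model_path="path/to/model.gguf")  # placeholder path

# .client holds the underlying llama_cpp.Llama instance
base_llama = llm.client
print(type(base_llama))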
@@ -338,7 +338,7 @@
  },
  {
  "cell_type": "code",
- "execution_count": 17,
+ "execution_count": null,
  "id": "18",
  "metadata": {},
  "outputs": [],
@@ -400,19 +400,19 @@
  "metadata": {},
  "source": [
  "As we can see above, the template reads as follows:\n",
- "- `eos_token` is a string that is added at the top of the resulting string after prompt is formatted.\n",
+ "- `eos_token` is a string that is added at the top of the resulting string after the prompt is formatted.\n",
  "You can also see that `eos_token` is used to append `content` string values from an `assistant` `role`.\n",
  "You can find this value by going to the Model's [`tokenizer_config.json`](https://huggingface.co/allenai/OLMo-7B-Instruct-hf/blob/main/tokenizer_config.json#L233) file and looking for the `eos_token` key. *Unfornately, this is currently the only way to get this information, you can go to https://github.com/ggerganov/llama.cpp/issues/5040 for more details.* In our case, the `eos_token` is `<|endoftext|>`.\n",
- "- `messages` is a list of dictionary that is iterated over. As you can see that this dictionary should contain a `role` and `content` key.\n",
- "- `add_generation_prompt` is a boolean that is used to determine whether to add a generation prompt or not. In this case, when it's the last message and `add_generation_prompt` is `True`, it will add `<|assistant|>` string to the end of the prompt."
+ "- `messages` is a list of dictionary that is iterated over. As you can see these dictionaries should contain `role` and `content` keys.\n",
+ "- `add_generation_prompt` is a boolean that is used to determine whether to add a generation prompt or not. In this case, when it's the last message and `add_generation_prompt` is `True`, it will add the `<|assistant|>` string to the end of the prompt."
  ]
  },
  {
  "cell_type": "markdown",
  "id": "f1ad5f5c",
  "metadata": {},
  "source": [
- "Now that we know what the template expects we can create the final prompt string by passing in the expected input variables, this time, instead of using the `.format` method, let's see what happens if we use the `.invoke` method on the `PromptTemplate` object."
+ "Now that we know what the template expects we can create the final prompt string by passing in the expected input variables. This time, instead of using the `.format` method, let's see what happens if we use the `.invoke` method on the `PromptTemplate` object."
  ]
  },
  {
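
To make that walk-through concrete, an illustrative `messages` value of the shape described above; with `add_generation_prompt=True` the formatted string would end in `<|assistant|>`, and each assistant turn would be closed with the `<|endoftext|>` `eos_token`:

# Illustrative input for the chat template; the content strings are made up
messages = [
    {"role": "user", "content": "What are stars?"},
    {"role": "assistant", "content": "Stars are luminous spheres of plasma."},
    {"role": "user", "content": "What is a moon?"},
]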
@@ -464,7 +464,7 @@
  },
  {
  "cell_type": "code",
- "execution_count": 21,
+ "execution_count": null,
  "id": "4caf24cf",
  "metadata": {},
  "outputs": [],
@@ -488,7 +488,7 @@
  "id": "6d33e0d4",
  "metadata": {},
  "source": [
- "You can see below that we get [`StringPromptValue`](https://api.python.langchain.com/en/latest/prompt_values/langchain_core.prompt_values.StringPromptValue.html) object this time as the output rather than pure string. But we can still get the string value by calling the `.to_string` method on the `StringPromptValue` object."
+ "You can see below that we get a [`StringPromptValue`](https://api.python.langchain.com/en/latest/prompt_values/langchain_core.prompt_values.StringPromptValue.html) object this time as the output rather than a pure string. But we can still get the string value by calling the `.to_string` method on the `StringPromptValue` object."
  ]
  },
  {
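
A minimal sketch of the `.invoke` vs `.format` difference this cell describes (the template text is illustrative):

from langchain_core.prompts import PromptTemplate

prompt = PromptTemplate.from_template("What are {objects}?")

prompt_value = prompt.invoke({"objects": "stars and moon"})
print(type(prompt_value).__name__)  # StringPromptValue
print(prompt_value.to_string())     # What are stars and moon?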
@@ -560,7 +560,7 @@
  "STEP 2: Prompt Template reads the variables to form the prompt text as output - \"What are stars and moon?\" \n",
  "STEP 3: The prompt is given as input to the LLM model. \n",
  "STEP 4: LLM Model produces output. \n",
- "STEP 5: The output goes through StrOutputParser that parses it into string and gives the result. "
+ "STEP 5: The output goes through StrOutputParser that parses it into a string and gives the result. "
  ]
  },
  {
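
Putting STEPs 1 through 5 together, a hedged end-to-end sketch; the `LlamaCpp` model path is a placeholder, and the notebook's actual chain may name its components differently:

from langchain_core.prompts import PromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_community.llms import LlamaCpp

prompt = PromptTemplate.from_template("What are {objects}?")  # STEPs 1-2
llm = LlamaCpp(model_path="path/to/model.gguf")               # STEPs 3-4, placeholder path
chain = prompt | llm | StrOutputParser()                      # STEP 5 parses to a string

print(chain.invoke({"objects": "stars and moon"}))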
@@ -573,7 +573,7 @@
  },
  {
  "cell_type": "code",
- "execution_count": 23,
+ "execution_count": null,
  "id": "25",
  "metadata": {},
  "outputs": [],
@@ -667,7 +667,7 @@
  },
  {
  "cell_type": "code",
- "execution_count": 28,
+ "execution_count": null,
  "id": "28",
  "metadata": {},
  "outputs": [],
@@ -682,7 +682,7 @@
  },
  {
  "cell_type": "code",
- "execution_count": 29,
+ "execution_count": null,
  "id": "29",
  "metadata": {},
  "outputs": [],
@@ -705,7 +705,7 @@
  },
  {
  "cell_type": "code",
- "execution_count": 30,
+ "execution_count": null,
  "id": "31",
  "metadata": {},
  "outputs": [],
@@ -748,7 +748,7 @@
  "source": [
  "#### Your turn 😎\n",
  "\n",
- "Try different messages value(s) and see how the output changes. But remember to follow the template structure.\n",
+ "Try different message values and see how the output changes. But remember to follow the template structure.\n",
  "The dictionary keys must contain `role` and `content` and the allowed `role` values are only `user` and `assistant`."
  ]
  },
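
One hypothetical alternative `messages` value for this exercise, keeping to the allowed `user`/`assistant` roles (the `llm_chain` name assumes the chain built earlier in the notebook):

messages = [
    {"role": "user", "content": "Name a constellation."},
    {"role": "assistant", "content": "Orion is a well-known constellation."},
    {"role": "user", "content": "Which stars form its belt?"},
]
# llm_chain.invoke({"messages": messages})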
@@ -761,6 +761,22 @@
  "source": [
  "# Write your llm_chain.invoke code here, feel free to also, create your own template and try partial_variables"
  ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "daade3a0",
+ "metadata": {},
+ "outputs": [],
+ "source": []
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "e20c605d-ecd3-400a-9ee7-cd1c9fa9d486",
+ "metadata": {},
+ "outputs": [],
+ "source": []
  }
  ],
  "metadata": {
@@ -779,7 +795,7 @@
  "name": "python",
  "nbconvert_exporter": "python",
  "pygments_lexer": "ipython3",
- "version": "3.11.9"
+ "version": "3.11.11"
  }
  },
  "nbformat": 4,

AI_Postdoc_Workshop/module1/setup.md

Lines changed: 4 additions & 4 deletions
@@ -7,7 +7,7 @@ During the tutorial, to follow along, we recommend using
  to worry about setting up the compute environment. However, if you would like to
  set up the tutorial on your local machine, you can use [**Conda**](#conda).
  
- [**GitHub Codespaces**](#github-codespaces), which is a cloud-based development
+ [**GitHub Codespaces**](#github-codespaces), is a cloud-based development
  environment that's hosted in the cloud. This option is available indefinitely,
  but you will be limited in the free resources you can use with GitHub
  Codespaces.
@@ -29,7 +29,7 @@ GitHub Codespaces and you need to have a GitHub account to use GitHub
  Codespaces.
  
  A codespace is a development environment that's hosted in the cloud. You are
- able to chose from various Dev container configuration, for this specific
+ able to chose from various Dev container configurations. For this specific
  workshop, please ensure that `Scipy2024` is selected. GitHub currently gives
  every user
  [120 vCPU hours per month for free](https://docs.github.com/en/billing/managing-billing-for-github-codespaces/about-billing-for-github-codespaces#monthly-included-storage-and-core-hours-for-personal-accounts),
@@ -51,14 +51,14 @@ You can set up the tutorial locally using a Conda environment. Here's how:
  
  0. Downloading and Installing Conda
  
- If you don't have Conda installed, we recommend following the instruction to
+ If you don't have Conda installed, we recommend following the instructions to
  download and install the
  [Miniforge distribution](https://github.com/conda-forge/miniforge) >=
  `Miniforge3-22.3.1-0` of Conda. This distribution is a minimal installer for
  conda specifically optimized for [conda-forge](https://conda-forge.org/)
  (Community-led recipes, infrastructure and distributions for conda.).
  
- 1. Create a new Conda environment called `ssec-scipy2024` with
+ 1. Create a new Conda environment called `ssec-scipy2024` with the
  [`conda-lock`](https://github.com/conda/conda-lock) package installed. This
  package is used to install the exact versions of the packages in the
  [`conda-lock.yml`](https://raw.githubusercontent.com/uw-ssec/docker-images/main/tutorial-scipy-2024/conda-lock.yml)
