|
742 | 742 | " questions: list[str],\n", |
743 | 743 | " links: list[str],\n", |
744 | 744 | " ) -> dict[str, Any]:\n", |
745 | | - " \"\"\"Structred data extraction from image analysis.\"\"\"\n", |
| 745 | + " \"\"\"Structured data extraction from image analysis.\"\"\"\n", |
746 | 746 | " return {\n", |
747 | 747 | " 'title': title,\n", |
748 | 748 | " 'key_words': key_words,\n", |
|
774 | 774 | " function_declarations=[glm.FunctionDeclaration(\n", |
775 | 775 | " name=\"structured_data_extraction\",\n", |
776 | 776 | " description=textwrap.dedent(\"\"\"\\\n", |
777 | | - " Structred data extraction from image analysis.\n", |
| 777 | + " Structured data extraction from image analysis.\n", |
778 | 778 | " \"\"\"),\n", |
779 | 779 | " parameters=glm.Schema(\n", |
780 | 780 | " type=glm.Type.OBJECT,\n", |
|
815 | 815 | " \"\"\"Extracts metadata from the image provided and returns it in a structured dict.\"\"\"\n", |
816 | 816 | " prompt = textwrap.dedent(f\"\"\"\n", |
817 | 817 | " You are an expert image analyzer. Given an image of a PDF page, your job is to write the following for each and every image.\n", |
818 | | - " 1. Generate key-words that matches the content from the image. (at most 10.)\n", |
| 818 | + " 1. Generate key-words that match the content from the image. (at most 10.)\n", |
819 | 819 | " 2. Suggest a one-word title for the image.\n", |
820 | 820 | " 3. Generate 1-2 short questions from the image.\n", |
821 | 821 | " 4. Extract links that are present in the image.\n", |
822 | 822 | "\n", |
823 | 823 | " Your answer should follow the following format.\n", |
824 | 824 | "    **1. Key-words**\n", |
825 | | - " [list of relevant key-words to descibe the content of the image]\n", |
| 825 | + " [list of relevant key-words to describe the content of the image]\n", |
826 | 826 | "\n", |
827 | 827 | " **2. Title**\n", |
828 | 828 | " Suggest a one-word title based on the content here.\n", |
829 | 829 | "\n", |
830 | 830 | " **3. Questions**\n", |
831 | | - " [lst of generated questions here...]\n", |
| 831 | + " [list of generated questions here...]\n", |
832 | 832 | " ....\n", |
833 | 833 | "\n", |
834 | 834 | " **4. Links**\n", |
|
952 | 952 | "id": "-1q_v21t2E94" |
953 | 953 | }, |
954 | 954 | "source": [ |
955 | | - "Neat! The models were successfuly able to extract your custom metadata from the given information sources!" |
| 955 | + "Neat! The models were successfully able to extract your custom metadata from the given information sources!" |
956 | 956 | ] |
957 | 957 | }, |
958 | 958 | { |
|
1019 | 1019 | " is_separator_regex=False,\n", |
1020 | 1020 | " )\n", |
1021 | 1021 | "\n", |
1022 | | - " # iter through all PDF files.\n", |
| 1022 | + " # iterate through all PDF files.\n", |
1023 | 1023 | " for filename, file_bytes in pdfs.items():\n", |
1024 | 1024 | "        print(f\"Extracting data from file: {filename}\")\n", |
1025 | 1025 | "\n", |
|
1239 | 1239 | "id": "vZl-A8EMVCZu" |
1240 | 1240 | }, |
1241 | 1241 | "source": [ |
1242 | | - "`relevant_chunks` has chunks that matched our search results. Each chunk returned has a `chunk_relevance_score` and `chunk`. Where `chunk_relevance_score` deontes the degree to which the `user_query` is semantically similar to the contents from `chunk`." |
 | 1242 | + "`relevant_chunks` has chunks that matched our search results. Each chunk returned has a `chunk_relevance_score` and `chunk`, where `chunk_relevance_score` denotes the degree to which the `user_query` is semantically similar to the contents from `chunk`." |
1243 | 1243 | ] |
1244 | 1244 | }, |
1245 | 1245 | { |
|
0 commit comments