
Commit 7b9262c

ThiloteE authored and gitbook-bot committed
GITBOOK-276: ThiloteE's Minor fixes in add entry using reference text and AI functionality
1 parent 78a1fdc commit 7b9262c

4 files changed: +26 additions, -26 deletions
Binary image file added (30.2 KB; preview not rendered)

en/ai/local-llm.md

Lines changed: 3 additions & 3 deletions
```diff
@@ -3,8 +3,8 @@
 Notice:

 1. This tutorial is intended for expert users
-2. (Local) LLMs requires a lot of computational power
-3. Smaller models (in terms of parameter size) typically respond qualitatively worse than bigger ones, but they are faster and need less memory.
+2. (Local) LLMs require a lot of computational power
+3. Smaller models (in terms of parameter size) typically respond qualitatively worse than bigger ones, but they are faster, need less memory and might already be sufficient for your use case.

 ## High-level explanation

@@ -42,5 +42,5 @@ The following steps guide you on how to use `GPT4All` to download and run local LLMs
 3. Open JabRef, go to "File" > "Preferences" > "AI"
 4. Set the "AI provider" to "GPT4All"
 5. Set the "Chat model" to the name (including the `.gguf` part) of the model you have downloaded in GPT4All.
-6. Set the "API base URL" in "Expert Settings" to \`http://localhost:4891/v1/chat/completions\`.
+6. Set the "API base URL" in "Expert Settings" to `http://localhost:4891/v1/chat/completions`.
```
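Step 6 of the touched tutorial points JabRef at GPT4All's local server, which exposes an OpenAI-compatible chat endpoint. As a quick sanity check that the URL is reachable before configuring JabRef, here is a minimal sketch, assuming the GPT4All app is running with its local API server enabled; the model name is a hypothetical placeholder for whatever `.gguf` file you downloaded:

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class Gpt4AllSmokeTest {
    public static void main(String[] args) throws Exception {
        // Hypothetical model name: replace with the .gguf file you downloaded in GPT4All.
        String body = """
                {"model": "Phi-3-mini-4k-instruct.Q4_0.gguf",
                 "messages": [{"role": "user", "content": "Say hello."}]}""";
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:4891/v1/chat/completions"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        // Status 200 plus a JSON "choices" array means the "API base URL" will work in JabRef.
        System.out.println(response.statusCode());
        System.out.println(response.body());
    }
}
```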

en/ai/preferences.md

Lines changed: 21 additions & 21 deletions
```diff
@@ -4,17 +4,17 @@

 ## General settings

-- "Enable AI functionality in JabRef": by default it is turned off, so you need to check this option if you want to use the new AI features
-- "Automatically generate embeddings for new entries": when this check box is switched on, for every new entry in the library, JabRef will automatically start an embeddings generation task. (If you do not know what are the embeddings, take a look at ["How does the AI functionality work?"](https://docs.jabref.org/ai#how-does-the-ai-functionality-work)).
-- "Automatically generate summaries for new entries": when this check box is switched on, for every new entry in the library, JabRef will automatically generate a summary.
+* "Enable AI functionality in JabRef": by default it is turned off, so you need to check this option if you want to use the new AI features
+* "Automatically generate embeddings for new entries": when this check box is switched on, for every new entry in the library, JabRef will automatically start an embeddings generation task. (If you do not know what are the embeddings, take a look at ["How does the AI functionality work?"](https://docs.jabref.org/ai#how-does-the-ai-functionality-work)).
+* "Automatically generate summaries for new entries": when this check box is switched on, for every new entry in the library, JabRef will automatically generate a summary.

 If you import a lot of entries at a time, we recommend you to switch off options "Automatically generate embeddings for new entries" and "Automatically generate summaries for new entries", because this may slow down your computer, and you may reach the usage limit of the AI provider.

 ## Connection settings

-- "AI provider": you can choose either OpenAI, Mistral AI, or Hugging Face
-- "Chat model": choose the model you like (for OpenAI we recommend `gpt-4o-mini`, as to date, it is the cheapest and fastest, though we also recommend to look up the prices periodically, as they are subject to change)
-- "API token": enter your API token here
+* "AI provider": you can choose between [various providers](https://docs.jabref.org/ai/ai-providers-and-api-keys#what-is-an-ai-provider).
+* "Chat model": choose the model you like.
+* "API key": enter your API key here.

 ## Expert settings

```
```diff
@@ -34,7 +34,7 @@ You do not have to set this parameter manually or remember all the addresses. JabRef

 The embedding model transforms a document (or a piece of text) into a vector (an ordered collection of numbers). This transformation provides the AI with relevant information for your questions.

-Different embedding models have varying performance, including accuracy and the speed of computing embeddings. The `_q` at the end of the model name usually denotes *quantized* (meaning reduced or simplified). These models are faster and smaller than their original counterparts but provide slightly less accuracy.
+Different embedding models have varying performance, including accuracy and the speed of computing embeddings. The `_q` at the end of the model name usually denotes _quantized_ (meaning reduced or simplified). These models are faster and smaller than their original counterparts but provide slightly less accuracy.

 Currently, only local embedding models are supported. This means you do not need to provide a new API key, as all the processing will be done on your machine.

```
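To make the vector idea in the touched paragraph concrete: an embeddings search ranks stored text chunks by how close their vectors are to the vector of the question, commonly via cosine similarity. A toy sketch of that ranking step, illustrative only and not JabRef's actual implementation:

```java
public class EmbeddingSearchToy {
    // Cosine similarity: values near 1.0 mean the two vectors point the same way.
    static double cosine(double[] a, double[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        double[] question = {0.9, 0.1, 0.3};  // embedding of the user question
        double[] chunkA   = {0.8, 0.2, 0.3};  // embedding of a relevant chunk
        double[] chunkB   = {0.1, 0.9, 0.7};  // embedding of an unrelated chunk
        System.out.println(cosine(question, chunkA)); // high score: chunk gets retrieved
        System.out.println(cosine(question, chunkB)); // low score: chunk gets skipped
    }
}
```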
```diff
@@ -112,27 +112,27 @@ To use the templates, we employ the [Apache Velocity](https://velocity.apache.org)

 There are four templates that JabRef uses:

-- **System Message for Chatting**: This template constructs the system message (also known as the instruction) for every AI chat in JabRef (whether chatting with an entry or with a group).
-- **User Message for Chatting**: This template is also used in chats and is responsible for forming a request to AI with document embeddings. The user message created by this template is sent to AI; however, only the plain user question will be saved in the chat history.
-- **Summarization Chunk**: In cases where the chat model does not have enough context window to fit the entire document in one message, our algorithm will split the document into chunks. This template is used to summarize a single chunk of a document.
-- **Summarization Combine**: This template is used only when the document size exceeds the context window of a chat model. It combines the summarized chunks into one piece of text.
+* **System Message for Chatting**: This template constructs the system message (also known as the instruction) for every AI chat in JabRef (whether chatting with an entry or with a group).
+* **User Message for Chatting**: This template is also used in chats and is responsible for forming a request to AI with document embeddings. The user message created by this template is sent to AI; however, only the plain user question will be saved in the chat history.
+* **Summarization Chunk**: In cases where the chat model does not have enough context window to fit the entire document in one message, our algorithm will split the document into chunks. This template is used to summarize a single chunk of a document.
+* **Summarization Combine**: This template is used only when the document size exceeds the context window of a chat model. It combines the summarized chunks into one piece of text.

 You can create any template you want, but we advise starting from the default template, as it has been carefully designed and includes special syntax from Apache Velocity.

 ### Contexts for Templates

 For each template, there is a context that holds all necessary variables used in the template. In this section, we will show you the available variables for each template and their structure.

-- **System Message for Chatting**: There is a single variable, `entries`, which is a list of BIB entries. You can use `CanonicalBibEntry.getCanonicalRepresentation(BibEntry entry)` to format the entries.
-- **User Message for Chatting**: There are two variables: `message` (the user question) and `excerpts` (pieces of information found in documents through the embeddings search). Each object in `excerpts` is of type `PaperExcerpt`, which has two fields: `citationKey` and `text`.
-- **Summarization Chunk**: There is only the `text` variable, which contains the chunk.
-- **Summarization Combine**: There is only the `chunks` variable, which contains a list of summarized chunks.
+* **System Message for Chatting**: There is a single variable, `entries`, which is a list of BIB entries. You can use `CanonicalBibEntry.getCanonicalRepresentation(BibEntry entry)` to format the entries.
+* **User Message for Chatting**: There are two variables: `message` (the user question) and `excerpts` (pieces of information found in documents through the embeddings search). Each object in `excerpts` is of type `PaperExcerpt`, which has two fields: `citationKey` and `text`.
+* **Summarization Chunk**: There is only the `text` variable, which contains the chunk.
+* **Summarization Combine**: There is only the `chunks` variable, which contains a list of summarized chunks.

 ## Further literature

-- [Visual representation of samplers (Temperature, Top-P, Min-P, ...) by Artefact2](https://artefact2.github.io/llm-sampling/index.xhtml)
-- [What is a Context Window?](https://www.techtarget.com/whatis/definition/context-window)
-- [Is temperature the creativity of Large Language Models?](https://arxiv.org/abs/2405.00492)
-- [The Effect of Sampling Temperature on Problem Solving in Large Language Models](https://arxiv.org/abs/2402.05201)
-- [Min P Sampling: Balancing Creativity and Coherence at High Temperature](https://arxiv.org/abs/2407.01082)
-- [Challenges in Deploying Long-Context Transformers: A Theoretical Peak Performance Analysis](https://arxiv.org/abs/2405.08944)
+* [Visual representation of samplers (Temperature, Top-P, Min-P, ...) by Artefact2](https://artefact2.github.io/llm-sampling/index.xhtml)
+* [What is a Context Window?](https://www.techtarget.com/whatis/definition/context-window)
+* [Is temperature the creativity of Large Language Models?](https://arxiv.org/abs/2405.00492)
+* [The Effect of Sampling Temperature on Problem Solving in Large Language Models](https://arxiv.org/abs/2402.05201)
+* [Min P Sampling: Balancing Creativity and Coherence at High Temperature](https://arxiv.org/abs/2407.01082)
+* [Challenges in Deploying Long-Context Transformers: A Theoretical Peak Performance Analysis](https://arxiv.org/abs/2405.08944)
```
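To see how the "User Message for Chatting" context described in this hunk plays out, here is a sketch that renders a made-up template with Apache Velocity's Java API. The template text is an illustration, not JabRef's default template, and plain `Map`s stand in for the real `PaperExcerpt` objects, since a Velocity reference like `$excerpt.citationKey` resolves against map keys:

```java
import java.io.StringWriter;
import java.util.List;
import java.util.Map;
import org.apache.velocity.VelocityContext;
import org.apache.velocity.app.VelocityEngine;

public class TemplateContextDemo {
    public static void main(String[] args) {
        VelocityEngine engine = new VelocityEngine();
        engine.init();

        // The two variables of the "User Message for Chatting" context.
        VelocityContext context = new VelocityContext();
        context.put("message", "What problem does this paper solve?");
        context.put("excerpts", List.of(
                Map.of("citationKey", "Kopp2018",
                       "text", "We introduce Markdown Architectural Decision Records ...")));

        // Made-up template illustrating the #foreach loop and $variable syntax.
        String template = """
                Answer the question using only these excerpts:
                #foreach($excerpt in $excerpts)
                [$excerpt.citationKey]: $excerpt.text
                #end
                Question: $message""";

        StringWriter out = new StringWriter();
        engine.evaluate(context, out, "user-message", template);
        System.out.println(out); // the rendered user message sent to the AI
    }
}
```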

en/collect/newentryfromplaintext.md

Lines changed: 2 additions & 2 deletions
```diff
@@ -39,7 +39,7 @@ O. Kopp, A. Armbruster, und O. Zimmermann, "Markdown Architectural Decision Records



-<figure><picture><source srcset="../.gitbook/assets/Bild 5 - rule based result is selected in entry table - dark mode.png" media="(prefers-color-scheme: dark)"><img src="../.gitbook/assets/new-entry-from-plain-text-step-4 (1).png" alt=""></picture><figcaption></figcaption></figure>
+<figure><picture><source srcset="../.gitbook/assets/Bild 5 - rule based result is selected in entry table - dark mode.png" media="(prefers-color-scheme: dark)"><img src="../.gitbook/assets/Bild 5 - rule based result is selected in entry table - light mode.png" alt=""></picture><figcaption></figcaption></figure>

 ### Parser Explanation

@@ -51,7 +51,7 @@ This is the default parser. It does not require any extensive setups, nor does it

 JabRef uses the technology offered by [Grobid](https://github.com/kermitt2/grobid), a machine learning software project with decades of experience and development dedicated to bibliographic metadata extraction. The Grobid parser usually tends to achieve better results than the rule-based parser. Since JabRef runs Grobid on a remote instance, users will have to confirm sending data to JabRef's online service in the preferences (_File > Preferences > Web search > Remote Services_). Sending data is disabled by default. It cannot be guaranteed that JabRef's Grobid instance will always be up and running, but it is possible for you to set up your [own Grobid Instance](https://grobid.readthedocs.io/en/latest/Grobid-docker/).

-<figure><picture><source srcset="../.gitbook/assets/Bild 6 - Grobid Preferences - dark mode.png" media="(prefers-color-scheme: dark)"><img src="../.gitbook/assets/Bild 6 - Grobid Preferences - dark mode.png" alt=""></picture><figcaption><p>Grobid related preference section in JabRef</p></figcaption></figure>
+<figure><picture><source srcset="../.gitbook/assets/Bild 6 - Grobid Preferences - dark mode.png" media="(prefers-color-scheme: dark)"><img src="../.gitbook/assets/Bild 6 - Grobid Preferences - light mode.png" alt=""></picture><figcaption><p>Grobid related preference section in JabRef</p></figcaption></figure>

 #### LLM

```
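For anyone who does set up their own Grobid instance, as the touched paragraph suggests, here is a minimal sketch of calling it directly. This assumes Grobid's default port 8070 and the `processCitation` route from the Grobid service documentation; check your instance's docs if either differs:

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

public class GrobidCitationCheck {
    public static void main(String[] args) throws Exception {
        // A raw reference string like the one in the tutorial, sent as a form field.
        String citation = "O. Kopp, A. Armbruster, und O. Zimmermann, "
                + "\"Markdown Architectural Decision Records\"";
        String form = "citations=" + URLEncoder.encode(citation, StandardCharsets.UTF_8);
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:8070/api/processCitation"))
                .header("Content-Type", "application/x-www-form-urlencoded")
                .POST(HttpRequest.BodyPublishers.ofString(form))
                .build();
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.body()); // TEI XML with the parsed bibliographic fields
    }
}
```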