
Commit 7b9262c

ThiloteE authored and gitbook-bot committed
GITBOOK-276: ThiloteE's Minor fixes in add entry using reference text and AI functionality
1 parent 78a1fdc commit 7b9262c

4 files changed: +26 additions, -26 deletions
Binary image file added (30.2 KB; preview not rendered)

en/ai/local-llm.md

Lines changed: 3 additions & 3 deletions
```diff
@@ -3,8 +3,8 @@
 Notice:

 1. This tutorial is intended for expert users
-2. (Local) LLMs requires a lot of computational power
-3. Smaller models (in terms of parameter size) typically respond qualitatively worse than bigger ones, but they are faster and need less memory.
+2. (Local) LLMs require a lot of computational power
+3. Smaller models (in terms of parameter size) typically respond qualitatively worse than bigger ones, but they are faster, need less memory and might already be sufficient for your use case.

 ## High-level explanation

@@ -42,5 +42,5 @@ The following steps guide you on how to use `GPT4All` to download and run local LLMs
 3. Open JabRef, go to "File" > "Preferences" > "AI"
 4. Set the "AI provider" to "GPT4All"
 5. Set the "Chat model" to the name (including the `.gguf` part) of the model you have downloaded in GPT4All.
-6. Set the "API base URL" in "Expert Settings" to \`http://localhost:4891/v1/chat/completions\`.
+6. Set the "API base URL" in "Expert Settings" to `http://localhost:4891/v1/chat/completions`.
```
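Step 6 of the touched tutorial points JabRef at GPT4All's local server, which exposes an OpenAI-compatible chat endpoint. As a quick sanity check that the URL is reachable before configuring JabRef, here is a minimal sketch, assuming the GPT4All app is running with its local API server enabled; the model name is a hypothetical placeholder for whatever `.gguf` file you downloaded:

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class Gpt4AllSmokeTest {
    public static void main(String[] args) throws Exception {
        // Hypothetical model name: replace with the .gguf file you downloaded in GPT4All.
        String body = """
                {"model": "Phi-3-mini-4k-instruct.Q4_0.gguf",
                 "messages": [{"role": "user", "content": "Say hello."}]}""";
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:4891/v1/chat/completions"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        // Status 200 plus a JSON "choices" array means the "API base URL" will work in JabRef.
        System.out.println(response.statusCode());
        System.out.println(response.body());
    }
}
```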

en/ai/preferences.md

Lines changed: 21 additions & 21 deletions
```diff
@@ -4,17 +4,17 @@

 ## General settings

-- "Enable AI functionality in JabRef": by default it is turned off, so you need to check this option if you want to use the new AI features
-- "Automatically generate embeddings for new entries": when this check box is switched on, for every new entry in the library, JabRef will automatically start an embeddings generation task. (If you do not know what are the embeddings, take a look at ["How does the AI functionality work?"](https://docs.jabref.org/ai#how-does-the-ai-functionality-work)).
-- "Automatically generate summaries for new entries": when this check box is switched on, for every new entry in the library, JabRef will automatically generate a summary.
+* "Enable AI functionality in JabRef": by default it is turned off, so you need to check this option if you want to use the new AI features
+* "Automatically generate embeddings for new entries": when this check box is switched on, for every new entry in the library, JabRef will automatically start an embeddings generation task. (If you do not know what are the embeddings, take a look at ["How does the AI functionality work?"](https://docs.jabref.org/ai#how-does-the-ai-functionality-work)).
+* "Automatically generate summaries for new entries": when this check box is switched on, for every new entry in the library, JabRef will automatically generate a summary.

 If you import a lot of entries at a time, we recommend you to switch off options "Automatically generate embeddings for new entries" and "Automatically generate summaries for new entries", because this may slow down your computer, and you may reach the usage limit of the AI provider.

 ## Connection settings

-- "AI provider": you can choose either OpenAI, Mistral AI, or Hugging Face
-- "Chat model": choose the model you like (for OpenAI we recommend `gpt-4o-mini`, as to date, it is the cheapest and fastest, though we also recommend to look up the prices periodically, as they are subject to change)
-- "API token": enter your API token here
+* "AI provider": you can choose between [various providers](https://docs.jabref.org/ai/ai-providers-and-api-keys#what-is-an-ai-provider).
+* "Chat model": choose the model you like.
+* "API key": enter your API key here.

 ## Expert settings

```
```diff
@@ -34,7 +34,7 @@ You do not have to set this parameter manually or remember all the addresses. JabRef

 The embedding model transforms a document (or a piece of text) into a vector (an ordered collection of numbers). This transformation provides the AI with relevant information for your questions.

-Different embedding models have varying performance, including accuracy and the speed of computing embeddings. The `_q` at the end of the model name usually denotes *quantized* (meaning reduced or simplified). These models are faster and smaller than their original counterparts but provide slightly less accuracy.
+Different embedding models have varying performance, including accuracy and the speed of computing embeddings. The `_q` at the end of the model name usually denotes _quantized_ (meaning reduced or simplified). These models are faster and smaller than their original counterparts but provide slightly less accuracy.

 Currently, only local embedding models are supported. This means you do not need to provide a new API key, as all the processing will be done on your machine.

```
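To make the vector idea in the touched paragraph concrete: an embeddings search ranks stored text chunks by how close their vectors are to the vector of the question, commonly via cosine similarity. A toy sketch of that ranking step, illustrative only and not JabRef's actual implementation:

```java
public class EmbeddingSearchToy {
    // Cosine similarity: values near 1.0 mean the two vectors point the same way.
    static double cosine(double[] a, double[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        double[] question = {0.9, 0.1, 0.3};  // embedding of the user question
        double[] chunkA   = {0.8, 0.2, 0.3};  // embedding of a relevant chunk
        double[] chunkB   = {0.1, 0.9, 0.7};  // embedding of an unrelated chunk
        System.out.println(cosine(question, chunkA)); // high score: chunk gets retrieved
        System.out.println(cosine(question, chunkB)); // low score: chunk gets skipped
    }
}
```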
```diff
@@ -112,27 +112,27 @@ To use the templates, we employ the [Apache Velocity](https://velocity.apache.org)

 There are four templates that JabRef uses:

-- **System Message for Chatting**: This template constructs the system message (also known as the instruction) for every AI chat in JabRef (whether chatting with an entry or with a group).
-- **User Message for Chatting**: This template is also used in chats and is responsible for forming a request to AI with document embeddings. The user message created by this template is sent to AI; however, only the plain user question will be saved in the chat history.
-- **Summarization Chunk**: In cases where the chat model does not have enough context window to fit the entire document in one message, our algorithm will split the document into chunks. This template is used to summarize a single chunk of a document.
-- **Summarization Combine**: This template is used only when the document size exceeds the context window of a chat model. It combines the summarized chunks into one piece of text.
+* **System Message for Chatting**: This template constructs the system message (also known as the instruction) for every AI chat in JabRef (whether chatting with an entry or with a group).
+* **User Message for Chatting**: This template is also used in chats and is responsible for forming a request to AI with document embeddings. The user message created by this template is sent to AI; however, only the plain user question will be saved in the chat history.
+* **Summarization Chunk**: In cases where the chat model does not have enough context window to fit the entire document in one message, our algorithm will split the document into chunks. This template is used to summarize a single chunk of a document.
+* **Summarization Combine**: This template is used only when the document size exceeds the context window of a chat model. It combines the summarized chunks into one piece of text.

 You can create any template you want, but we advise starting from the default template, as it has been carefully designed and includes special syntax from Apache Velocity.

 ### Contexts for Templates

 For each template, there is a context that holds all necessary variables used in the template. In this section, we will show you the available variables for each template and their structure.

-- **System Message for Chatting**: There is a single variable, `entries`, which is a list of BIB entries. You can use `CanonicalBibEntry.getCanonicalRepresentation(BibEntry entry)` to format the entries.
-- **User Message for Chatting**: There are two variables: `message` (the user question) and `excerpts` (pieces of information found in documents through the embeddings search). Each object in `excerpts` is of type `PaperExcerpt`, which has two fields: `citationKey` and `text`.
-- **Summarization Chunk**: There is only the `text` variable, which contains the chunk.
-- **Summarization Combine**: There is only the `chunks` variable, which contains a list of summarized chunks.
+* **System Message for Chatting**: There is a single variable, `entries`, which is a list of BIB entries. You can use `CanonicalBibEntry.getCanonicalRepresentation(BibEntry entry)` to format the entries.
+* **User Message for Chatting**: There are two variables: `message` (the user question) and `excerpts` (pieces of information found in documents through the embeddings search). Each object in `excerpts` is of type `PaperExcerpt`, which has two fields: `citationKey` and `text`.
+* **Summarization Chunk**: There is only the `text` variable, which contains the chunk.
+* **Summarization Combine**: There is only the `chunks` variable, which contains a list of summarized chunks.

 ## Further literature

-- [Visual representation of samplers (Temperature, Top-P, Min-P, ...) by Artefact2](https://artefact2.github.io/llm-sampling/index.xhtml)
-- [What is a Context Window?](https://www.techtarget.com/whatis/definition/context-window)
-- [Is temperature the creativity of Large Language Models?](https://arxiv.org/abs/2405.00492)
-- [The Effect of Sampling Temperature on Problem Solving in Large Language Models](https://arxiv.org/abs/2402.05201)
-- [Min P Sampling: Balancing Creativity and Coherence at High Temperature](https://arxiv.org/abs/2407.01082)
-- [Challenges in Deploying Long-Context Transformers: A Theoretical Peak Performance Analysis](https://arxiv.org/abs/2405.08944)
+* [Visual representation of samplers (Temperature, Top-P, Min-P, ...) by Artefact2](https://artefact2.github.io/llm-sampling/index.xhtml)
+* [What is a Context Window?](https://www.techtarget.com/whatis/definition/context-window)
+* [Is temperature the creativity of Large Language Models?](https://arxiv.org/abs/2405.00492)
+* [The Effect of Sampling Temperature on Problem Solving in Large Language Models](https://arxiv.org/abs/2402.05201)
+* [Min P Sampling: Balancing Creativity and Coherence at High Temperature](https://arxiv.org/abs/2407.01082)
+* [Challenges in Deploying Long-Context Transformers: A Theoretical Peak Performance Analysis](https://arxiv.org/abs/2405.08944)
```
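To see how the "User Message for Chatting" context described in this hunk plays out, here is a sketch that renders a made-up template with Apache Velocity's Java API. The template text is an illustration, not JabRef's default template, and plain `Map`s stand in for the real `PaperExcerpt` objects, since a Velocity reference like `$excerpt.citationKey` resolves against map keys:

```java
import java.io.StringWriter;
import java.util.List;
import java.util.Map;
import org.apache.velocity.VelocityContext;
import org.apache.velocity.app.VelocityEngine;

public class TemplateContextDemo {
    public static void main(String[] args) {
        VelocityEngine engine = new VelocityEngine();
        engine.init();

        // The two variables of the "User Message for Chatting" context.
        VelocityContext context = new VelocityContext();
        context.put("message", "What problem does this paper solve?");
        context.put("excerpts", List.of(
                Map.of("citationKey", "Kopp2018",
                       "text", "We introduce Markdown Architectural Decision Records ...")));

        // Made-up template illustrating the #foreach loop and $variable syntax.
        String template = """
                Answer the question using only these excerpts:
                #foreach($excerpt in $excerpts)
                [$excerpt.citationKey]: $excerpt.text
                #end
                Question: $message""";

        StringWriter out = new StringWriter();
        engine.evaluate(context, out, "user-message", template);
        System.out.println(out); // the rendered user message sent to the AI
    }
}
```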

en/collect/newentryfromplaintext.md

Lines changed: 2 additions & 2 deletions
```diff
@@ -39,7 +39,7 @@ O. Kopp, A. Armbruster, und O. Zimmermann, "Markdown Architectural Decision Records



-<figure><picture><source srcset="../.gitbook/assets/Bild 5 - rule based result is selected in entry table - dark mode.png" media="(prefers-color-scheme: dark)"><img src="../.gitbook/assets/new-entry-from-plain-text-step-4 (1).png" alt=""></picture><figcaption></figcaption></figure>
+<figure><picture><source srcset="../.gitbook/assets/Bild 5 - rule based result is selected in entry table - dark mode.png" media="(prefers-color-scheme: dark)"><img src="../.gitbook/assets/Bild 5 - rule based result is selected in entry table - light mode.png" alt=""></picture><figcaption></figcaption></figure>

 ### Parser Explanation

@@ -51,7 +51,7 @@ This is the default parser. It does not require any extensive setups, nor does it

 JabRef uses the technology offered by [Grobid](https://github.com/kermitt2/grobid), a machine learning software project with decades of experience and development dedicated to bibliographic metadata extraction. The Grobid parser usually tends to achieve better results than the rule-based parser. Since JabRef runs Grobid on a remote instance, users will have to confirm sending data to JabRef's online service in the preferences (_File > Preferences > Web search > Remote Services_). Sending data is disabled by default. It cannot be guaranteed that JabRef's Grobid instance will always be up and running, but it is possible for you to set up your [own Grobid Instance](https://grobid.readthedocs.io/en/latest/Grobid-docker/).

-<figure><picture><source srcset="../.gitbook/assets/Bild 6 - Grobid Preferences - dark mode.png" media="(prefers-color-scheme: dark)"><img src="../.gitbook/assets/Bild 6 - Grobid Preferences - dark mode.png" alt=""></picture><figcaption><p>Grobid related preference section in JabRef</p></figcaption></figure>
+<figure><picture><source srcset="../.gitbook/assets/Bild 6 - Grobid Preferences - dark mode.png" media="(prefers-color-scheme: dark)"><img src="../.gitbook/assets/Bild 6 - Grobid Preferences - light mode.png" alt=""></picture><figcaption><p>Grobid related preference section in JabRef</p></figcaption></figure>

 #### LLM

```
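For anyone who does set up their own Grobid instance, as the touched paragraph suggests, here is a minimal sketch of calling it directly. This assumes Grobid's default port 8070 and the `processCitation` route from the Grobid service documentation; check your instance's docs if either differs:

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

public class GrobidCitationCheck {
    public static void main(String[] args) throws Exception {
        // A raw reference string like the one in the tutorial, sent as a form field.
        String citation = "O. Kopp, A. Armbruster, und O. Zimmermann, "
                + "\"Markdown Architectural Decision Records\"";
        String form = "citations=" + URLEncoder.encode(citation, StandardCharsets.UTF_8);
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:8070/api/processCitation"))
                .header("Content-Type", "application/x-www-form-urlencoded")
                .POST(HttpRequest.BodyPublishers.ofString(form))
                .build();
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.body()); // TEI XML with the parsed bibliographic fields
    }
}
```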