This repository was archived by the owner on Jul 22, 2025. It is now read-only.
generated from discourse/discourse-plugin-skeleton
FEATURE: RAG search within tools #802
Merged
12 commits
- e091526 FEATURE: RAG search within tools (SamSaffron)
- be6ad6c lint (SamSaffron)
- 2f2b771 moving rag UI to a central spot (SamSaffron)
- 5c4fb04 Move more rag stuff into reusable components (SamSaffron)
- 1b8df46 Mostly working now, just needs more polish and testing. (SamSaffron)
- 593abad indexing status is a bit wonky work around edge cases (SamSaffron)
- 9d38e51 syntax tree (SamSaffron)
- 63470a9 work in progress add options (SamSaffron)
- f40355c refactor options code so it is common (SamSaffron)
- 1b6b7d8 ensure fragments regenerated on save (SamSaffron)
- 13d32d9 lint (SamSaffron)
- 3387a71 address comment, ensure file is actually uploaded. (SamSaffron)
75 changes: 75 additions & 0 deletions
app/controllers/discourse_ai/admin/rag_document_fragments_controller.rb
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,75 @@ | ||
| # frozen_string_literal: true | ||
| | ||
| module DiscourseAi | ||
| module Admin | ||
| class RagDocumentFragmentsController < ::Admin::AdminController | ||
| requires_plugin ::DiscourseAi::PLUGIN_NAME | ||
| | ||
| def indexing_status_check | ||
| if params[:target_type] == "AiPersona" | ||
| @target = AiPersona.find(params[:target_id]) | ||
| elsif params[:target_type] == "AiTool" | ||
| @target = AiTool.find(params[:target_id]) | ||
| else | ||
| raise Discourse::InvalidParameters.new("Invalid target type") | ||
| end | ||
| | ||
| render json: RagDocumentFragment.indexing_status(@target, @target.uploads) | ||
| end | ||
| | ||
| def upload_file | ||
| file = params[:file] || params[:files].first | ||
| | ||
| if !SiteSetting.ai_embeddings_enabled? | ||
| raise Discourse::InvalidAccess.new("Embeddings not enabled") | ||
| end | ||
| | ||
| validate_extension!(file.original_filename) | ||
| validate_file_size!(file.tempfile.size) | ||
| | ||
| hijack do | ||
| upload = | ||
| UploadCreator.new( | ||
| file.tempfile, | ||
| file.original_filename, | ||
| type: "discourse_ai_rag_upload", | ||
| skip_validations: true, | ||
| ).create_for(current_user.id) | ||
| | ||
| if upload.persisted? | ||
| render json: UploadSerializer.new(upload) | ||
| else | ||
| render json: failed_json.merge(errors: upload.errors.full_messages), status: 422 | ||
| end | ||
| end | ||
| end | ||
| | ||
| private | ||
| | ||
| def validate_extension!(filename) | ||
| extension = File.extname(filename)[1..-1] || "" | ||
| authorized_extensions = %w[txt md] | ||
| if !authorized_extensions.include?(extension) | ||
| raise Discourse::InvalidParameters.new( | ||
| I18n.t( | ||
| "upload.unauthorized", | ||
| authorized_extensions: authorized_extensions.join(" "), | ||
| ), | ||
| ) | ||
| end | ||
| end | ||
| | ||
| def validate_file_size!(filesize) | ||
| max_size_bytes = 20.megabytes | ||
| if filesize > max_size_bytes | ||
| raise Discourse::InvalidParameters.new( | ||
| I18n.t( | ||
| "upload.attachments.too_large_humanized", | ||
| max_size: ActiveSupport::NumberHelper.number_to_human_size(max_size_bytes), | ||
| ), | ||
| ) | ||
| end | ||
| end | ||
| end | ||
| end | ||
| end |
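
The two private checks in the controller above can be tried outside Rails. The following is a minimal plain-Ruby sketch, not the plugin's code: `validate_upload!`, `extension_of`, `InvalidUpload`, and both constants are hypothetical names introduced for illustration, and the byte limit is written out as `20 * 1024 * 1024` on the assumption that it matches ActiveSupport's `20.megabytes`.

```ruby
# Hypothetical stand-ins for the controller's validations (plain Ruby, no Rails).
AUTHORIZED_EXTENSIONS = %w[txt md].freeze
MAX_SIZE_BYTES = 20 * 1024 * 1024 # assumed equal to 20.megabytes

class InvalidUpload < StandardError; end

# Returns the extension without the leading dot, or "" when there is none.
def extension_of(filename)
  File.extname(filename).delete_prefix(".")
end

# Raises InvalidUpload when the extension or size is not acceptable.
def validate_upload!(filename, size_bytes)
  ext = extension_of(filename)
  unless AUTHORIZED_EXTENSIONS.include?(ext)
    raise InvalidUpload,
          "unauthorized extension #{ext.inspect}; allowed: #{AUTHORIZED_EXTENSIONS.join(" ")}"
  end
  if size_bytes > MAX_SIZE_BYTES
    raise InvalidUpload, "file is larger than #{MAX_SIZE_BYTES} bytes"
  end
  true
end
```

As in the controller, the extension comparison is case-sensitive, so a file named `NOTES.MD` would be rejected by this sketch, and a file with no extension yields an empty string that is not in the allowed list.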
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -7,6 +7,10 @@ class AiTool < ActiveRecord::Base | ||
| validates :script, presence: true, length: { maximum: 100_000 } | ||
| validates :created_by_id, presence: true | ||
| belongs_to :created_by, class_name: "User" | ||
| has_many :rag_document_fragments, dependent: :destroy, as: :target | ||
| has_many :upload_references, as: :target, dependent: :destroy | ||
| has_many :uploads, through: :upload_references | ||
| before_update :regenerate_rag_fragments | ||
| | ||
| def signature | ||
| { name: name, description: description, parameters: parameters.map(&:symbolize_keys) } | ||
| @@ -28,6 +32,82 @@ def bump_persona_cache | ||
| AiPersona.persona_cache.flush! | ||
| end | ||
| | ||
| def regenerate_rag_fragments | ||
| if rag_chunk_tokens_changed? || rag_chunk_overlap_tokens_changed? | ||
| RagDocumentFragment.where(target: self).delete_all | ||
| end | ||
| end | ||
| | ||
| def self.preamble | ||
| <<~JS | ||
| /** | ||
| * Tool API Quick Reference | ||
| * | ||
| * Entry Functions | ||
| * | ||
| * invoke(parameters): Main function. Receives parameters (Object). Must return a JSON-serializable value. | ||
| * Example: | ||
| * function invoke(parameters) { return "result"; } | ||
| * | ||
| * details(): Optional. Returns a string describing the tool. | ||
| * Example: | ||
| * function details() { return "Tool description."; } | ||
| * | ||
| * Provided Objects | ||
| * | ||
| * 1. http | ||
| * http.get(url, options?): Performs an HTTP GET request. | ||
| * Parameters: | ||
| * url (string): The request URL. | ||
| * options (Object, optional): | ||
| * headers (Object): Request headers. | ||
| * Returns: | ||
| * { status: number, body: string } | ||
| * | ||
| * http.post(url, options?): Performs an HTTP POST request. | ||
| * Parameters: | ||
| * url (string): The request URL. | ||
| * options (Object, optional): | ||
| * headers (Object): Request headers. | ||
| * body (string): Request body. | ||
| * Returns: | ||
| * { status: number, body: string } | ||
| * | ||
| * Note: Max 20 HTTP requests per execution. | ||
| * | ||
| * 2. llm | ||
| * llm.truncate(text, length): Truncates text to a specified token length. | ||
| * Parameters: | ||
| * text (string): Text to truncate. | ||
| * length (number): Max tokens. | ||
| * Returns: | ||
| * Truncated string. | ||
| * | ||
| * 3. index | ||
| * index.search(query, options?): Searches indexed documents. | ||
| * Parameters: | ||
| * query (string): Search query. | ||
| * options (Object, optional): | ||
| * filenames (Array): Limit search to specific files. | ||
| * limit (number): Max fragments (up to 200). | ||
| * Returns: | ||
| * Array of { fragment: string, metadata: string } | ||
| * | ||
| * Constraints | ||
| * | ||
| * Execution Time: ≤ 2000ms | ||
| * Memory: ≤ 10MB | ||
| * HTTP Requests: ≤ 20 per execution | ||
| * Exceeding limits will result in errors or termination. | ||
| * | ||
| * Security | ||
| * | ||
| * Sandboxed Environment: No access to system or global objects. | ||
| * No File System Access: Cannot read or write files. | ||
| */ | ||
| JS | ||
| end | ||
| | ||
| def self.presets | ||
| [ | ||
| { | ||
| @@ -38,6 +118,7 @@ def self.presets | ||
| { name: "url", type: "string", required: true, description: "The URL to browse" }, | ||
| ], | ||
| script: <<~SCRIPT, | ||
| #{preamble} | ||
| let url; | ||
| function invoke(p) { | ||
| url = p.url; | ||
| @@ -70,6 +151,7 @@ def self.presets | ||
| { name: "amount", type: "number", description: "Amount to convert eg: 123.45" }, | ||
| ], | ||
| script: <<~SCRIPT, | ||
| #{preamble} | ||
| // note: this script uses the open.er-api.com service, it is only updated | ||
| // once every 24 hours, for more up to date rates see: https://www.exchangerate-api.com | ||
| function invoke(params) { | ||
| @@ -118,6 +200,7 @@ def self.presets | ||
| }, | ||
| ], | ||
| script: <<~SCRIPT, | ||
| #{preamble} | ||
| function invoke(params) { | ||
| const apiKey = 'YOUR_ALPHAVANTAGE_API_KEY'; // Replace with your actual API key | ||
| const url = `https://www.alphavantage.co/query?function=GLOBAL_QUOTE&symbol=${params.symbol}&apikey=${apiKey}`; | ||
| @@ -154,6 +237,7 @@ def self.presets | ||
| summary: "Get real-time stock quotes using AlphaVantage API", | ||
| }, | ||
| { preset_id: "empty_tool", script: <<~SCRIPT }, | ||
| #{preamble} | ||
| function invoke(params) { | ||
| // logic here | ||
| return params; | ||
| @@ -173,14 +257,16 @@ def self.presets | ||
| # | ||
| # Table name: ai_tools | ||
| # | ||
| # id :bigint not null, primary key | ||
| # name :string not null | ||
| # description :string not null | ||
| # summary :string not null | ||
| # parameters :jsonb not null | ||
| # script :text not null | ||
| # created_by_id :integer not null | ||
| # enabled :boolean default(TRUE), not null | ||
| # created_at :datetime not null | ||
| # updated_at :datetime not null | ||
| # id :bigint not null, primary key | ||
| # name :string not null | ||
| # description :string not null | ||
| # summary :string not null | ||
| # parameters :jsonb not null | ||
| # script :text not null | ||

Comment on lines +261 to +265
Contributor: should we limit the size of these fields?
Member (Author): yeah probably but a bit unrelated to this commit, it was there from before

| # created_by_id :integer not null | ||
| # enabled :boolean default(TRUE), not null | ||
| # created_at :datetime not null | ||
| # updated_at :datetime not null | ||
| # rag_chunk_tokens :integer default(374), not null | ||
| # rag_chunk_overlap_tokens :integer default(10), not null | ||
| # | ||
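
The `before_update :regenerate_rag_fragments` callback above deletes a tool's stored fragments whenever `rag_chunk_tokens` or `rag_chunk_overlap_tokens` changes, since fragments cut with the old settings no longer match. The plain-Ruby sketch below illustrates that invalidation rule only. `ToolSketch` and its methods are hypothetical, and the sliding-window chunker is an assumed illustration of what a chunk size and an overlap could mean; it is not the plugin's chunking or re-indexing code, which this diff does not show.

```ruby
# Hypothetical stand-in for the AiTool model; no ActiveRecord involved.
class ToolSketch
  attr_reader :chunk_tokens, :overlap_tokens, :fragments

  # Defaults mirror the column defaults in the schema annotation (374 and 10).
  def initialize(chunk_tokens: 374, overlap_tokens: 10)
    @chunk_tokens = chunk_tokens
    @overlap_tokens = overlap_tokens
    @fragments = []
  end

  # Stores fragments for an array of tokens (assumed chunking strategy).
  def index!(tokens)
    @fragments = chunk(tokens)
  end

  # Drops stored fragments only when a chunking setting actually changes.
  # Returns true when the fragments were invalidated.
  def update(chunk_tokens: @chunk_tokens, overlap_tokens: @overlap_tokens)
    changed = chunk_tokens != @chunk_tokens || overlap_tokens != @overlap_tokens
    @chunk_tokens = chunk_tokens
    @overlap_tokens = overlap_tokens
    @fragments = [] if changed
    changed
  end

  private

  # Sliding window: each chunk shares `overlap_tokens` tokens with the previous one.
  def chunk(tokens)
    return [] if tokens.empty?
    step = @chunk_tokens - @overlap_tokens
    raise ArgumentError, "overlap must be smaller than chunk size" unless step.positive?

    chunks = []
    start = 0
    loop do
      chunks << tokens[start, @chunk_tokens]
      break if start + @chunk_tokens >= tokens.length
      start += step
    end
    chunks
  end
end

tool = ToolSketch.new(chunk_tokens: 4, overlap_tokens: 1)
tool.index!((1..10).to_a)    # three chunks of four tokens, each sharing one token with the previous
tool.update(chunk_tokens: 4) # unchanged setting: fragments are kept
tool.update(overlap_tokens: 2) # changed setting: fragments are dropped
```

Deleting rather than patching fragments keeps the rule simple: any change to either setting makes every stored chunk boundary stale, so the whole set has to be rebuilt.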