Context Compressor: A Recursive Ternary Embedding Document Search Algorithm #15985

SuiltaPico · 2024-01-13T10:25:14Z

This algorithm compresses a document by recursively dividing it into three groups. Suppose we have an array of text segments [1, 2, 3, 4, 5, 6]. We first divide it into three groups, [1, 2, 3], [3, 4, 5], and [4, 5, 6], and embed each group. Next, we use vector similarity to find the two groups most similar to the search query. Suppose the comparison shows that the group [4, 5, 6] has the lowest similarity. We then discard it and apply the same method to the remaining text segment array [1, 2, 3, 4]...
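To make the idea concrete, here is a minimal Python sketch of one possible reading of it. Several things in it are my assumptions rather than part of the proposal: similarity is measured against a search query, the three groups are contiguous and non-overlapping (a simplification of the overlapping groups in the example), `toy_embed` is a bag-of-words stand-in for a real embedding model, and `min_segments` is a placeholder stopping rule, since the termination condition is still undecided. All function names are hypothetical.

```python
import math
from collections import Counter
from typing import Callable, List, Sequence


def toy_embed(text: str) -> Counter:
    """Placeholder embedding (assumption): bag-of-words counts, not a real model."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def split_in_three(segments: Sequence[str]) -> List[List[str]]:
    """Split into three contiguous groups (assumption: no overlap between groups)."""
    base, extra = divmod(len(segments), 3)
    groups, start = [], 0
    for i in range(3):
        size = base + (1 if i < extra else 0)
        groups.append(list(segments[start:start + size]))
        start += size
    return groups


def compress(
    segments: Sequence[str],
    query: str,
    min_segments: int = 2,  # assumed stopping rule; the post leaves this open
    embed: Callable[[str], Counter] = toy_embed,
) -> List[str]:
    """Recursively drop the group least similar to the query."""
    segments = list(segments)
    # At least three segments are needed to form three non-empty groups.
    if len(segments) <= max(min_segments, 2):
        return segments
    query_vec = embed(query)
    groups = split_in_three(segments)
    scores = [cosine(embed(" ".join(group)), query_vec) for group in groups]
    drop = min(range(3), key=scores.__getitem__)  # index of the lowest-scoring group
    kept = [seg for i, group in enumerate(groups) if i != drop for seg in group]
    return compress(kept, query, min_segments, embed)


segments = [
    "cats purr", "cats sleep",
    "dogs bark", "dogs fetch",
    "stocks fell", "bonds rose",
]
result = compress(segments, query="cats dogs", min_segments=4)
```

With these toy inputs, the third group shares no words with the query, so it is the one dropped, and the recursion stops at four segments. A real embedding model could be passed in through the `embed` parameter, together with a similarity function suited to dense vectors.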

However, this is just a preliminary idea, and I haven't had time to put it into practice. I also haven't determined the termination condition for the recursion. I hope the idea is useful to others and that we can discuss its feasibility together.

Replies: 0 comments
