Merge pull request #1255 from oracle-devrel/oracle-vector-search
Add AI Vector Search content
2 parents 016fde8 + efd4026 commit 6e56fff

25 files changed: +1977 -0 lines changed
@@ -0,0 +1,35 @@
1+
Copyright (c) 2024 Oracle and/or its affiliates.
2+
3+
The Universal Permissive License (UPL), Version 1.0
4+
5+
Subject to the condition set forth below, permission is hereby granted to any
6+
person obtaining a copy of this software, associated documentation and/or data
7+
(collectively the "Software"), free of charge and under any and all copyright
8+
rights in the Software, and any and all patent rights owned or freely
9+
licensable by each licensor hereunder covering either (i) the unmodified
10+
Software as contributed to or provided by such licensor, or (ii) the Larger
11+
Works (as defined below), to deal in both
12+
13+
(a) the Software, and
14+
(b) any piece of software and/or hardware listed in the lrgrwrks.txt file if
15+
one is included with the Software (each a "Larger Work" to which the Software
16+
is contributed by such licensors),
17+
18+
without restriction, including without limitation the rights to copy, create
19+
derivative works of, display, perform, and distribute the Software and make,
20+
use, sell, offer for sale, import, export, have made, and have sold the
21+
Software and the Larger Work(s), and to sublicense the foregoing rights on
22+
either these or other terms.
23+
24+
This license is subject to the following condition:
25+
The above copyright notice and either this complete permission notice or at
26+
a minimum a reference to the UPL must be included in all copies or
27+
substantial portions of the Software.
28+
29+
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
30+
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
31+
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
32+
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
33+
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
34+
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
35+
SOFTWARE.
Lines changed: 26 additions & 0 deletions
@@ -0,0 +1,26 @@
1+
# Vector Search using OCI GenAI Embeddings
2+
3+
This document covers creating embeddings for Vector Search using the OCI GenAI embedding model.
4+
5+
Reviewed: 2024.09.06
6+
7+
8+
# When to use this asset?
9+
10+
In this asset we explore how to use the `oracledb` Python SDK to interact with Oracle Database 23ai EE and its AI Vector Search capabilities. We have created a table, `simple_demo_cohere`, in our Oracle Database and loaded 89 rows of sample text data into it. Please refer to the SQL script `simple-demo-cohere-dataloading.sql` for the table creation and data loading statements, which can be executed against the database via SQL*Plus or SQL Developer.
11+
12+
The notebook reads in the sample data, then uses the LangChain integration to call OCI Generative AI and embed the text with the `cohere.embed-english-v3.0` model (each text is passed in its own embed API call), and stores the resulting vectors back into the table via an update statement. We can then run a similarity search against the vector column to find the records that most closely match our input.
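
The embed-and-update step described above can be sketched as follows. This is a minimal illustration, not the notebook's actual code: the helper names (`build_update_rows`, `make_oci_embedder`, `store_embeddings`) are hypothetical, the service endpoint and compartment ID are placeholders you would supply, and binding `array.array` values to a `VECTOR` column assumes a recent `python-oracledb` release with vector support.

```python
import array
from typing import Callable, Iterable, List, Sequence, Tuple

# Positional binds: the vector first, then the primary key of the row to update.
UPDATE_SQL = "update simple_demo_cohere set text_v = :1 where id = :2"


def build_update_rows(
    rows: Iterable[Tuple[int, str]],
    embed: Callable[[str], Sequence[float]],
) -> List[Tuple[array.array, int]]:
    """Embed each text with its own call and pair the vector with the row id."""
    return [(array.array("f", embed(text)), row_id) for row_id, text in rows]


def make_oci_embedder(service_endpoint: str, compartment_id: str) -> Callable[[str], List[float]]:
    """Return a function that embeds one text via OCI Generative AI (assumed setup)."""
    # Imported lazily so the other helpers work without LangChain installed.
    from langchain_community.embeddings import OCIGenAIEmbeddings

    client = OCIGenAIEmbeddings(
        model_id="cohere.embed-english-v3.0",
        service_endpoint=service_endpoint,  # placeholder: your regional endpoint
        compartment_id=compartment_id,      # placeholder: your compartment OCID
    )
    return client.embed_query


def store_embeddings(connection, update_rows: List[Tuple[array.array, int]]) -> None:
    """Write the vectors back with a single batched update, then commit."""
    with connection.cursor() as cursor:
        cursor.executemany(UPDATE_SQL, update_rows)
    connection.commit()
```

With a live connection, the pieces would be combined roughly as `store_embeddings(conn, build_update_rows(rows, make_oci_embedder(endpoint, compartment)))`, where `rows` holds the `(id, text)` pairs selected from `simple_demo_cohere`.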
13+
14+
15+
# How to use this asset?
16+
17+
This asset is provided as general purpose material. Please tailor the content according to your context and needs.
18+
19+
20+
# License
21+
22+
Copyright (c) 2024 Oracle and/or its affiliates.
23+
24+
Licensed under the Universal Permissive License (UPL), Version 1.0.
25+
26+
See [LICENSE](https://github.com/oracle-devrel/technology-engineering/blob/main/LICENSE) for more details.

data-platform/data-science-vector-ml/oracle-vector-search/code-assets/python-create-embeddings-cohere/files/python-sdk-create-embeddings-cohere.ipynb

Lines changed: 489 additions & 0 deletions
Large diffs are not rendered by default.
@@ -0,0 +1,128 @@
1+
------------------------------------------------------
2+
-- Simple Demo - Data Loading
3+
------------------------------------------------------
4+
5+
-- Drop Table
6+
drop table if exists simple_demo_cohere;
7+
8+
-- Create Table
9+
create table simple_demo_cohere (
10+
id number primary key,
11+
text varchar2(128),
12+
text_v vector
13+
);
14+
15+
-- Truncate Table
16+
truncate table simple_demo_cohere;
17+
18+
-- Insert Records into Table
19+
insert into simple_demo_cohere values (1, 'San Francisco is in California.', null);
20+
insert into simple_demo_cohere values (2, 'San Jose is in California.', null);
21+
insert into simple_demo_cohere values (3, 'Los Angeles is in California.', null);
22+
insert into simple_demo_cohere values (4, 'Buffalo is in New York.', null);
23+
insert into simple_demo_cohere values (5, 'Brooklyn is in New York.', null);
24+
insert into simple_demo_cohere values (6, 'Queens is in New York.', null);
25+
insert into simple_demo_cohere values (7, 'Harlem is in New York.', null);
26+
insert into simple_demo_cohere values (8, 'The Bronx is in New York.', null);
27+
insert into simple_demo_cohere values (9, 'Manhattan is in New York.', null);
28+
insert into simple_demo_cohere values (10, 'Staten Island is in New York.', null);
29+
insert into simple_demo_cohere values (11, 'Miami is in Florida.', null);
30+
insert into simple_demo_cohere values (12, 'Tampa is in Florida.', null);
31+
insert into simple_demo_cohere values (13, 'Orlando is in Florida.', null);
32+
insert into simple_demo_cohere values (14, 'Dallas is in Texas.', null);
33+
insert into simple_demo_cohere values (15, 'Houston is in Texas.', null);
34+
insert into simple_demo_cohere values (16, 'Austin is in Texas.', null);
35+
insert into simple_demo_cohere values (17, 'Phoenix is in Arizona.', null);
36+
insert into simple_demo_cohere values (18, 'Las Vegas is in Nevada.', null);
37+
insert into simple_demo_cohere values (19, 'Portland is in Oregon.', null);
38+
insert into simple_demo_cohere values (20, 'New Orleans is in Louisiana.', null);
39+
insert into simple_demo_cohere values (21, 'Atlanta is in Georgia.', null);
40+
insert into simple_demo_cohere values (22, 'Chicago is in Illinois.', null);
41+
insert into simple_demo_cohere values (23, 'Cleveland is in Ohio.', null);
42+
insert into simple_demo_cohere values (24, 'Boston is in Massachusetts.', null);
43+
insert into simple_demo_cohere values (25, 'Baltimore is in Maryland.', null);
44+
45+
insert into simple_demo_cohere values (100, 'Ferraris are often red.', null);
46+
insert into simple_demo_cohere values (101, 'Teslas are electric.', null);
47+
insert into simple_demo_cohere values (102, 'Mini coopers are small.', null);
48+
insert into simple_demo_cohere values (103, 'Fiat 500 are small.', null);
49+
insert into simple_demo_cohere values (104, 'Dodge Vipers are wide.', null);
50+
insert into simple_demo_cohere values (105, 'Ford 150 are popular.', null);
51+
insert into simple_demo_cohere values (106, 'Alfa Romeos are fun but unreliable.', null);
52+
insert into simple_demo_cohere values (107, 'Volvos are safe.', null);
53+
insert into simple_demo_cohere values (108, 'Toyotas are reliable.', null);
54+
insert into simple_demo_cohere values (109, 'Hondas are reliable.', null);
55+
insert into simple_demo_cohere values (110, 'Porsches are fast and reliable.', null);
56+
insert into simple_demo_cohere values (111, 'Nissan GTR are great', null);
57+
insert into simple_demo_cohere values (112, 'NISMO is awesome', null);
58+
59+
insert into simple_demo_cohere values (200, 'Bananas are yellow.', null);
60+
insert into simple_demo_cohere values (201, 'Kiwis are green inside.', null);
61+
insert into simple_demo_cohere values (202, 'Kiwis are brown on the outside.', null);
62+
insert into simple_demo_cohere values (203, 'Kiwis are birds.', null);
63+
insert into simple_demo_cohere values (204, 'Kiwis taste good.', null);
64+
insert into simple_demo_cohere values (205, 'Ripe strawberries are red.', null);
65+
insert into simple_demo_cohere values (206, 'Apples can be green, yellow or red.', null);
66+
insert into simple_demo_cohere values (207, 'Ripe cherries are red.', null);
67+
insert into simple_demo_cohere values (208, 'Pears can be green, yellow or brown.', null);
68+
insert into simple_demo_cohere values (209, 'Oranges are orange.', null);
69+
insert into simple_demo_cohere values (210, 'Peaches can be yellow, orange or red.', null);
70+
insert into simple_demo_cohere values (211, 'Peaches can be fuzzy.', null);
71+
insert into simple_demo_cohere values (212, 'Grapes can be green, red or purple.', null);
72+
insert into simple_demo_cohere values (213, 'Watermelons are green on the outside.', null);
73+
insert into simple_demo_cohere values (214, 'Watermelons are red on the inside.', null);
74+
insert into simple_demo_cohere values (215, 'Blueberries are blue.', null);
75+
insert into simple_demo_cohere values (216, 'Limes are green.', null);
76+
insert into simple_demo_cohere values (217, 'Lemons are yellow.', null);
77+
insert into simple_demo_cohere values (218, 'Ripe tomatoes are red.', null);
78+
insert into simple_demo_cohere values (219, 'Unripe tomatoes are green.', null);
79+
insert into simple_demo_cohere values (220, 'Ripe raspberries are red.', null);
80+
81+
insert into simple_demo_cohere values (300, 'Tigers have stripes.', null);
82+
insert into simple_demo_cohere values (301, 'Lions are big.', null);
83+
insert into simple_demo_cohere values (302, 'Mice are small.', null);
84+
insert into simple_demo_cohere values (303, 'Cats do not care.', null);
85+
insert into simple_demo_cohere values (304, 'Dogs are loyal.', null);
86+
insert into simple_demo_cohere values (305, 'Bears are hairy.', null);
87+
insert into simple_demo_cohere values (306, 'Pandas are black and white.', null);
88+
insert into simple_demo_cohere values (307, 'Zebras are black and white.', null);
89+
insert into simple_demo_cohere values (308, 'Penguins can be black and white.', null);
90+
insert into simple_demo_cohere values (309, 'Puffins can be black and white.', null);
91+
insert into simple_demo_cohere values (310, 'Giraffes have long necks.', null);
92+
insert into simple_demo_cohere values (311, 'Elephants have trunks.', null);
93+
insert into simple_demo_cohere values (312, 'Horses have four legs.', null);
94+
insert into simple_demo_cohere values (313, 'Birds can fly.', null);
95+
insert into simple_demo_cohere values (314, 'Birds lay eggs.', null);
96+
insert into simple_demo_cohere values (315, 'Fish can swim.', null);
97+
insert into simple_demo_cohere values (316, 'Sharks have lots of teeth.', null);
98+
insert into simple_demo_cohere values (317, 'Flies can fly.', null);
99+
100+
insert into simple_demo_cohere values (400, 'Ibaraki is in Kanto.', null);
101+
insert into simple_demo_cohere values (401, 'Tochigi is in Kanto.', null);
102+
insert into simple_demo_cohere values (402, 'Gunma is in Kanto.', null);
103+
insert into simple_demo_cohere values (403, 'Saitama is in Kanto.', null);
104+
insert into simple_demo_cohere values (404, 'Chiba is in Kanto.', null);
105+
insert into simple_demo_cohere values (405, 'Tokyo is in Kanto.', null);
106+
insert into simple_demo_cohere values (406, 'Kanagawa is in Kanto.', null);
107+
108+
insert into simple_demo_cohere values (500, 'Eggs are egg shaped.', null);
109+
insert into simple_demo_cohere values (501, 'Tokyo is in Japan.', null);
110+
insert into simple_demo_cohere values (502, 'To be, or not to be, that is the question.', null);
111+
insert into simple_demo_cohere values (503, '640K ought to be enough for anybody.', null);
112+
insert into simple_demo_cohere values (504, 'Man overboard.', null);
113+
114+
-- Commit Changes
115+
commit;
116+
117+
-- Count Records
118+
select count(*) from simple_demo_cohere; -- 89 Records
119+
120+
-- Preview Data
121+
select * from simple_demo_cohere order by id;
122+
123+
124+
------------------------------------------------------
125+
-- End of Script
126+
------------------------------------------------------
127+
128+
Lines changed: 26 additions & 0 deletions
@@ -0,0 +1,26 @@
1+
# Vector Search using Sentence Transformer Embeddings
2+
3+
This document covers creating embeddings for Vector Search using a Sentence Transformer embedding model.
4+
5+
Reviewed: 2024.09.06
6+
7+
8+
# When to use this asset?
9+
10+
In this asset we explore how to use the `oracledb` Python SDK to interact with Oracle Database 23ai and its AI Vector Search capabilities. We have created a table, `simple_demo`, in our Oracle Database and loaded 89 rows of sample text data into it. Please refer to the SQL script `simple-demo-dataloading.sql` for the table creation and data loading statements, which can be executed against the database via SQL*Plus or SQL Developer.
11+
12+
The notebook reads in the sample data, uses the open-source `all-MiniLM-L6-v2` Sentence Transformer to convert each text into a vector embedding, and stores the vectors back into the table via an update statement. We can then run a similarity search against the vector column to find the records that most closely match our input.
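
The similarity search step can be sketched as follows. This is an illustrative outline under stated assumptions rather than the notebook's code: the helper names (`cosine_distance`, `rank_locally`, `embed_text`, `search`) are hypothetical, cosine distance is assumed as the metric, and the query assumes the `text_v` column has already been populated. `rank_locally` is a pure-Python stand-in that shows what the `order by vector_distance(...)` query computes.

```python
import array
import math
from typing import List, Sequence, Tuple

# Rank rows by cosine distance to the bound query vector and keep the top k.
SEARCH_SQL = """
    select id, text, vector_distance(text_v, :qv, COSINE) as distance
    from simple_demo
    order by distance
    fetch first :k rows only
"""


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """1 - cosine similarity; assumes both vectors are non-zero and equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


def rank_locally(
    query_vec: Sequence[float],
    rows: Sequence[Tuple[int, str, Sequence[float]]],
    k: int = 3,
) -> List[Tuple[int, str, float]]:
    """In-memory equivalent of the SQL ranking, useful for checking results."""
    scored = [(row_id, text, cosine_distance(query_vec, vec)) for row_id, text, vec in rows]
    return sorted(scored, key=lambda item: item[2])[:k]


def embed_text(text: str) -> List[float]:
    """Embed one text with all-MiniLM-L6-v2 (loads the model on every call)."""
    # Imported lazily so the other helpers work without sentence-transformers.
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer("all-MiniLM-L6-v2")
    return model.encode(text).tolist()


def search(connection, query_vec: Sequence[float], k: int = 3):
    """Run the ranking in the database and return (id, text, distance) rows."""
    with connection.cursor() as cursor:
        cursor.execute(SEARCH_SQL, {"qv": array.array("f", query_vec), "k": k})
        return cursor.fetchall()
```

With a live connection, a call such as `search(conn, embed_text("Which cars are dependable?"))` would be expected to return the closest rows first. In practice the model would be loaded once and reused rather than reloaded per call.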
13+
14+
15+
# How to use this asset?
16+
17+
This asset is provided as general purpose material. Please tailor the content according to your context and needs.
18+
19+
20+
# License
21+
22+
Copyright (c) 2024 Oracle and/or its affiliates.
23+
24+
Licensed under the Universal Permissive License (UPL), Version 1.0.
25+
26+
See [LICENSE](https://github.com/oracle-devrel/technology-engineering/blob/main/LICENSE) for more details.

data-platform/data-science-vector-ml/oracle-vector-search/code-assets/python-create-embeddings-sentence-transformer/files/python-sdk-create-embedding-sentence-transformer.ipynb

Lines changed: 523 additions & 0 deletions
Large diffs are not rendered by default.
