Commit dfc166d

Update sparse vector and bigint tests and documentation
1 parent 850d986 commit dfc166d

File tree

8 files changed

+2066
-37
lines changed

doc/src/api_manual/connection.rst

Lines changed: 5 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1237,13 +1237,18 @@ Connection Methods
12371237
- ``fetchType``: One of the :ref:`Node-oracledb Type Constant <oracledbconstantsnodbtype>` values.
12381238
- ``isJson``: Indicates if the column is known to contain JSON data. This will be ``true`` for JSON columns (from Oracle Database 21c) and for LOB and VARCHAR2 columns where "IS JSON" constraint is enabled (from Oracle Database 19c). This property will be ``false`` for all the other columns. It will also be ``false`` for any column when Oracle Client 18c or earlier is used in Thick mode or the Oracle Database version is earlier than 19c.
12391239
- ``isOson``: Indicates if the column is known to contain binary encoded OSON data. This attribute will be ``true`` in Thin mode and while using Oracle Client version 21c (or later) in Thick mode when the "IS JSON FORMAT OSON" check constraint is enabled on BLOB and RAW columns. It will be set to ``false`` for all other columns. It will also be set to ``false`` for any column when the Thick mode uses Oracle Client versions earlier than 21c. Note that the "IS JSON FORMAT OSON" check constraint is available from Oracle Database 19c onwards.
1240+
- ``isSparseVector``: Indicates if the column is known to contain a sparse vector. This will be ``true`` for vector columns containing sparse vectors.
12401241
- ``name``: The column name follows Oracle’s standard name-casing rules. It will commonly be uppercase, since most applications create tables using unquoted, case-insensitive names.
12411242
- ``nullable``: Indicates whether ``NULL`` values are permitted for this column.
12421243
- ``precision``: Set only for ``oracledb.DB_TYPE_NUMBER``, ``oracledb.DB_TYPE_TIMESTAMP``, ``oracledb.DB_TYPE_TIMESTAMP_TZ``, and ``oracledb.DB_TYPE_TIMESTAMP_LTZ`` columns.
12431244
- ``scale``: Set only for ``oracledb.DB_TYPE_NUMBER`` columns.
12441245
- ``vectorDimensions``: The number of dimensions of the VECTOR column. If the column is not a VECTOR column or allows for any number of dimensions, then the value of this property is *undefined*.
12451246
- ``vectorFormat``: The storage format of each dimension value in the VECTOR column. If the column is not a VECTOR column or allows for any storage format, then the value of this property is *undefined*.
12461247

1248+
.. versionchanged:: 6.8
1249+
1250+
The ``isSparseVector`` information attribute was added.
1251+
12471252
.. versionchanged:: 6.5
12481253

12491254
The ``vectorDimensions`` and ``vectorFormat`` information attributes were added.

doc/src/api_manual/oracledb.rst

Lines changed: 45 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -3661,6 +3661,51 @@ Oracledb Methods
36613661
* - Error ``error``
36623662
- If ``startup()`` succeeds, ``error`` is NULL. If an error occurs, then ``error`` contains the :ref:`error message <errorobj>`.
36633663

3664+
.. _oracledbsparsevector:
3665+
3666+
Oracledb SparseVector Class
3667+
===========================
3668+
3669+
.. versionadded:: 6.8
3670+
3671+
The SparseVector class stores information about a sparse vector. This class
3672+
represents an object that accepts one of the following types in its
3673+
constructor: typed array, JavaScript array, object, or string. See
3674+
:ref:`sparsevectors` for more information.
3675+
3676+
.. _sparsevectorproperties:
3677+
3678+
SparseVector Properties
3679+
-----------------------
3680+
3681+
.. attribute:: SparseVector.indices
3682+
3683+
This property is a JavaScript array or a 32-bit unsigned integer
3684+
(Uint32Array) TypedArray that specifies the indices (zero-based) of
3685+
non-zero values in the vector.
3686+
3687+
.. attribute:: SparseVector.numDimensions
3688+
3689+
This property is an integer that specifies the number of dimensions of the
3690+
vector.
3691+
3692+
.. attribute:: SparseVector.values
3693+
3694+
This property is a JavaScript array or TypedArray that specifies the
3695+
non-zero values stored in the vector.
3696+
3697+
SparseVector Methods
3698+
--------------------
3699+
3700+
.. method:: SparseVector.dense()
3701+
3702+
Converts a sparse vector to a dense vector and returns a TypedArray of
3703+
8-bit signed integers, 32-bit floating-point numbers, or 64-bit
3704+
floating-point numbers depending on the storage format of the sparse
3705+
vector column's non-zero values in Oracle Database.
3706+
3707+
This method is best used with sparse vectors read from Oracle Database.
3708+
36643709
.. _oracledbfuture:
36653710

36663711
Oracledb Future Object

doc/src/user_guide/vector_data_type.rst

Lines changed: 186 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -11,6 +11,12 @@ The VECTOR data type is a homogeneous array of 8-bit signed integers, 8-bit
1111
unsigned integers, 32-bit floating-point numbers, or 64-bit floating-point
1212
numbers.
1313

14+
There are two vector types that can be stored in Oracle Database, DENSE
15+
vectors and SPARSE vectors. A dense vector is a vector where each dimension is
16+
physically stored, including zero-values. This is the default vector type used
17+
in Oracle Database. A sparse vector is a vector whose dimension values are
18+
mostly zero; only the non-zero values are physically stored.
19+
1420
With the VECTOR data type, you can define the number of dimensions for the
1521
data and the storage format for each dimension value in the vector. The
1622
possible storage formats include:
@@ -43,8 +49,8 @@ To create a table with three columns for vector data, for example:
4349
)
4450
4551
In this example, each column can store vector data of three dimensions where
46-
each dimension value is of the specified storage format. This example is used
47-
in subsequent sections.
52+
each dimension value is of the specified storage format. The vector type of
53+
the three columns is dense. This example is used in subsequent sections.
4854

4955
.. _insertvector:
5056

@@ -322,3 +328,181 @@ See `vectortype1.js <https://github.com/oracle/node-oracledb/tree/
322328
main/examples/vectortype1.js>`__ and `vectortype2.js <https://github.com/
323329
oracle/node-oracledb/tree/main/examples/vectortype2.js>`__ for runnable
324330
examples.
331+
332+
.. _sparsevectors:
333+
334+
Using SPARSE Vectors
335+
====================
336+
337+
A sparse vector is a vector which has zero value for most of its dimensions.
338+
This vector physically stores only the non-zero values. For more information
339+
about using sparse vectors in Oracle Database, see the `Oracle AI Vector Search
340+
User's Guide <https://www.oracle.com/pls/topic/lookup?ctx=dblatest&id=GUID-
341+
6015566C-3277-4A3C-8DD0-08B346A05478>`__.
342+
343+
Sparse vectors are supported when you are using Oracle Database 23.7 or later.
344+
Sparse vector support was added in node-oracledb 6.8.
345+
346+
The storage formats that can be used with sparse vectors are FLOAT32, FLOAT64,
347+
and INT8. Note that the BINARY storage format cannot be used with sparse
348+
vectors.
349+
350+
You can define a column for a sparse vector using the following format::
351+
352+
VECTOR(number_of_dimensions, dimension_storage_format, SPARSE)
353+
354+
For example, to create a table with a sparse vector column that specifies all three properties:
355+
356+
.. code-block:: sql
357+
358+
CREATE TABLE vecSparseTable (
359+
SPARSECOL64 VECTOR(4, FLOAT64, SPARSE)
360+
)
361+
362+
In this example, the SPARSECOL64 column can store sparse vector data of 4
363+
dimensions where each dimension value is a 64-bit floating-point number. This
364+
example is used in subsequent sections.
365+
366+
The sparse vector format is::
367+
368+
[Total Dimension Count, [Dimension Index Array], [Dimension Value Array]]
369+
370+
The three key components of sparse vectors are:
371+
372+
- The total number of dimensions of the vector, which includes zero and
373+
non-zero values.
374+
375+
- An array that contains the zero-based indices of the non-zero dimensions.
376+
377+
- An array containing the non-zero values of the dimensions at the specified
378+
indices.
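
As an illustration of this format, the following plain JavaScript sketch reads
individual dimensions from the three components. It does not use
node-oracledb: the ``valueAt`` helper and the sample data are hypothetical and
only mirror the layout described above.

```javascript
// Hypothetical helper (not part of node-oracledb): read one dimension
// from the three components of a sparse vector.
function valueAt(numDimensions, indices, values, dim) {
  if (!Number.isInteger(dim) || dim < 0 || dim >= numDimensions) {
    throw new RangeError(`dimension ${dim} is outside 0..${numDimensions - 1}`);
  }
  const pos = indices.indexOf(dim);     // position of dim in the index array
  return pos === -1 ? 0 : values[pos];  // dimensions that are not listed are zero
}

// [Total Dimension Count, [Dimension Index Array], [Dimension Value Array]]
const [numDimensions, indices, values] = [10, [1, 3, 5], [1.5, 3.5, 7.7]];

console.log(valueAt(numDimensions, indices, values, 3)); // 3.5
console.log(valueAt(numDimensions, indices, values, 4)); // 0
```

Only three values are held for the ten dimensions; every other dimension is
implied to be zero.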
379+
380+
To create sparse vectors in node-oracledb, you can use :ref:`Oracledb
381+
SparseVector objects <oracledbsparsevector>`.
382+
383+
.. _insertsparsevector:
384+
385+
Inserting SPARSE Vectors
386+
------------------------
387+
388+
With node-oracledb, sparse vectors can be inserted using
389+
:ref:`SparseVector objects <oracledbsparsevector>`. You can specify the
390+
following properties in this object:
391+
392+
- The number of dimensions of the vector which includes zero and non-zero
393+
values.
394+
395+
- The indices of the dimensions with a JavaScript array or a 32-bit unsigned
396+
integer (Uint32Array) TypedArray.
397+
398+
- The non-zero values of the dimensions with a JavaScript array or TypedArray.
399+
400+
If the array of indices is not a JavaScript array or a Uint32Array TypedArray,
401+
then the error ``NJS-158: SPARSE VECTOR indices is not Uint32Array or an Array`` is
402+
raised. See :ref:`sparsevectorproperties` for more information.
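
An application may want to check its data before building a SparseVector
object. The sketch below is one possible pre-check in plain JavaScript. The
``checkSparseComponents`` helper is hypothetical, and the length and range
checks are assumptions about what an application might want rather than a
description of the driver's own validation; only the indices type rule comes
from the text above.

```javascript
// Hypothetical application-side check (not the driver's own validation).
function checkSparseComponents({ numDimensions, indices, values }) {
  const problems = [];
  if (!(Array.isArray(indices) || indices instanceof Uint32Array)) {
    problems.push('indices must be a JavaScript array or a Uint32Array');
    return problems; // the remaining checks need a usable index array
  }
  if (indices.length !== values.length) {
    problems.push('indices and values must have the same length');
  }
  for (const dim of indices) {
    if (!Number.isInteger(dim) || dim < 0 || dim >= numDimensions) {
      problems.push(`index ${dim} is outside 0..${numDimensions - 1}`);
    }
  }
  return problems;
}

const ok = checkSparseComponents(
  { numDimensions: 4, indices: [1, 3], values: [39, -65] });
const bad = checkSparseComponents(
  { numDimensions: 4, indices: new Int32Array([1, 3]), values: [39, -65] });

console.log(ok.length, bad.length); // 0 1
```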
403+
404+
The SparseVector object can be defined in the following ways:
405+
406+
- A string which is a JSON array that contains the number of dimensions, an
407+
array of indices (zero-based), and an array of values. For example::
408+
409+
new oracledb.SparseVector('[10, [1, 3, 5], [1.5, 3.5, 7.7]]')
410+
411+
This creates a sparse vector of 10 dimensions with the non-zero values of
412+
[1.5, 3.5, 7.7] at indices [1, 3, 5].
413+
414+
- An object that contains an array of values, an array of indices (zero-based),
415+
and the number of dimensions. For example::
416+
417+
new oracledb.SparseVector({values: [1.5, 3.5, 7.7], indices: [1, 3, 5], numDimensions: 10})
418+
419+
- A dense array which can be a JavaScript array or a TypedArray. For example::
420+
421+
new oracledb.SparseVector([1.5, 0, 3.5, 7.7])
422+
423+
You can insert a sparse vector as a string, an object, or as an array.
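
To show how the three input forms describe the same data, the following plain
JavaScript sketch reduces each of them to a number of dimensions, an index
array, and a value array. The ``toComponents`` helper is hypothetical and is
not how node-oracledb is implemented; it only illustrates the relationship
between the forms.

```javascript
// Hypothetical helper (not part of node-oracledb): reduce the three input
// forms to { numDimensions, indices, values }.
function toComponents(input) {
  if (typeof input === 'string') {
    // JSON array: [numDimensions, [indices], [values]]
    const [numDimensions, indices, values] = JSON.parse(input);
    return { numDimensions, indices, values };
  }
  if (Array.isArray(input) || ArrayBuffer.isView(input)) {
    // Dense array: keep the non-zero entries and remember their positions
    const indices = [];
    const values = [];
    input.forEach((v, i) => {
      if (v !== 0) {
        indices.push(i);
        values.push(v);
      }
    });
    return { numDimensions: input.length, indices, values };
  }
  // Object form: already holds the three components
  const { numDimensions, indices, values } = input;
  return { numDimensions, indices, values };
}

const fromString = toComponents('[4, [0, 2, 3], [1.5, 3.5, 7.7]]');
const fromObject = toComponents(
  { values: [1.5, 3.5, 7.7], indices: [0, 2, 3], numDimensions: 4 });
const fromDense = toComponents([1.5, 0, 3.5, 7.7]);

console.log(JSON.stringify(fromDense));
// {"numDimensions":4,"indices":[0,2,3],"values":[1.5,3.5,7.7]}
```

All three calls produce the same components, so the choice of form depends on
which representation the application already holds.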
424+
425+
The example below inserts a sparse vector as an object using a
426+
:ref:`SparseVector object <oracledbsparsevector>`:
427+
428+
.. code-block:: javascript
429+
430+
const sparseVec = new oracledb.SparseVector({values: [39, -65],
431+
indices: [1, 3], numDimensions: 4});
432+
433+
await connection.execute(
434+
`INSERT INTO vecSparseTable (SPARSECOL64) VALUES (:vec64)`,
435+
{ vec64: sparseVec }
436+
);
437+
438+
.. _fetchsparsevector:
439+
440+
Fetching SPARSE Vectors
441+
-----------------------
442+
443+
With node-oracledb, vector columns are fetched as :ref:`SparseVector objects
444+
<oracledbsparsevector>` if the VECTOR column in Oracle Database contains
445+
sparse vector data. To query a sparse vector column, for example:
446+
447+
.. code-block:: javascript
448+
449+
const result = await connection.execute(
450+
`SELECT SPARSECOL64 FROM vecSparseTable`
451+
);
452+
const vecs = result.rows[0].SPARSECOL64;
453+
const val = JSON.stringify(vecs); // change JavaScript object to a JSON string
454+
console.log(val);
455+
456+
This prints the following output::
457+
458+
{"SPARSECOL64":{"numDimensions":4,"indices":{"0":1,"1":3},"values":{"0":39,"1":-65}}}
459+
460+
The :ref:`vectorDimensions <execmetadata>`, :ref:`vectorFormat <execmetadata>`,
461+
and :ref:`isSparseVector <execmetadata>` attributes in the metadata returned
462+
by a query contain the number of dimensions of the vector column, the storage
463+
format of each dimension value in the vector column, and whether the column
464+
contains a sparse vector, respectively. To fetch these attributes, you can use:
465+
466+
.. code-block:: javascript
467+
468+
const vecDimensions = result.metadata[0].vectorDimensions;
469+
const vecStorageFormat = result.metadata[0].vectorFormat;
470+
const vecSparseVector = result.metadata[0].isSparseVector;
471+
console.log('Vector dimensions for the SPARSECOL64 column:', vecDimensions);
472+
console.log('Vector storage format for the SPARSECOL64 column:', vecStorageFormat);
473+
console.log('Sparse vector available in the SPARSECOL64 column:', vecSparseVector);
474+
475+
This prints the following output::
476+
477+
Vector dimensions for the SPARSECOL64 column: 4
478+
Vector storage format for the SPARSECOL64 column: 3
479+
Sparse vector available in the SPARSECOL64 column: true
480+
481+
This output indicates that the SPARSECOL64 column in vecSparseTable is a
482+
4-dimensional SPARSE vector with FLOAT64 storage format.
483+
484+
See `vectorSparse.js <https://github.com/oracle/node-oracledb/tree/
485+
main/examples/vectorSparse.js>`__ for a runnable example.
486+
487+
.. _convertsparsevector:
488+
489+
Converting Sparse Vectors to Dense Vectors
490+
------------------------------------------
491+
492+
You can convert a sparse vector to a dense vector using
493+
:meth:`SparseVector.dense()`. This method returns a TypedArray of 8-bit signed
494+
integers, 32-bit floating-point numbers, or 64-bit floating-point numbers
495+
depending on the storage format of the sparse vector column's non-zero values
496+
in Oracle Database. To convert a sparse vector to a dense vector, for example:
497+
498+
.. code-block:: javascript
499+
500+
const denseArray = sparseVec.dense();
501+
console.log('Dense vector:', denseArray);
502+
503+
This prints an output such as::
504+
505+
Dense vector: Float64Array(4) [ 0, 39, 0, -65 ]
506+
507+
A Float64Array TypedArray is returned in this example since the vector storage
508+
format of the sparseVec sparse vector is FLOAT64.
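
Conceptually, the conversion places each stored value at its index and leaves
every other dimension at zero. The plain JavaScript sketch below illustrates
this for the FLOAT64 case. The ``expandToFloat64`` function is a hypothetical
stand-in and not the driver's implementation of ``dense()``.

```javascript
// Hypothetical stand-in for the conversion (not the driver's implementation):
// expand sparse components into a Float64Array.
function expandToFloat64(numDimensions, indices, values) {
  const dense = new Float64Array(numDimensions); // every element starts at 0
  indices.forEach((dim, pos) => {
    dense[dim] = values[pos]; // place each non-zero value at its dimension
  });
  return dense;
}

const dense = expandToFloat64(4, [1, 3], [39, -65]);
console.log(Array.from(dense)); // [ 0, 39, 0, -65 ]
```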
