| 1 | +<!--- Hugo front matter used to generate the website version of this page: |
| 2 | +linkTitle: Retrieval |
| 3 | +---> |
| 4 | + |
| 5 | +## Retrieval Spans |
| 6 | + |
| 7 | +<!-- semconv span.db.retrieval.client --> |
| 8 | +<!-- NOTE: THIS TEXT IS AUTOGENERATED. DO NOT EDIT BY HAND. --> |
| 9 | +<!-- see templates/registry/markdown/snippet.md.j2 --> |
| 10 | +<!-- prettier-ignore-start --> |
| 11 | +<!-- markdownlint-capture --> |
| 12 | +<!-- markdownlint-disable --> |
| 13 | + |
| 14 | +**Status:**  |
| 15 | + |
| 16 | +Spans representing retrieval calls adhere to the general [Semantic Conventions for Database Client Spans](/docs/database/database-spans.md). |
| 17 | + |
| 18 | +Retrieval calls can be made to vector databases, search engines, and other systems optimized |
| 19 | +for similarity search and retrieval operations. These systems are commonly used in |
| 20 | +Retrieval-Augmented Generation (RAG) applications and semantic search. |
| 21 | + |
| 22 | +`db.system.name` SHOULD be set to the specific database system being used (e.g., `"pinecone"`, |
| 23 | +`"weaviate"`, `"qdrant"`, `"chroma"`, `"milvus"`) and SHOULD be provided **at span creation time**. |
| 24 | + |
| 25 | +**Span name** SHOULD follow the general [database span name convention](/docs/database/database-spans.md#name). |
| 26 | +For retrieval operations, the span name SHOULD be `{db.operation.name} {db.collection.name}` when both |
| 27 | +are available, or `{db.operation.name}` otherwise. Common operation names include `search`, `query`, |
| 28 | +`retrieve`, or database-specific operation names. |
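
The naming rule above can be sketched as a small helper. This is a minimal illustration rather than part of the convention: the function name `retrieval_span_name` is hypothetical, and it assumes the operation name is always known while the collection name may be missing.

```python
from typing import Optional


def retrieval_span_name(operation: str, collection: Optional[str] = None) -> str:
    """Build a span name from db.operation.name and db.collection.name.

    Hypothetical helper: uses "{operation} {collection}" when both values
    are available and falls back to "{operation}" otherwise.
    """
    if collection:
        return f"{operation} {collection}"
    return operation


# "articles" is an illustrative collection name, not a value from the convention.
name_with_collection = retrieval_span_name("search", "articles")
name_without_collection = retrieval_span_name("query")
```

Here `name_with_collection` is `"search articles"` and `name_without_collection` is `"query"`.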
| 29 | + |
| 30 | +**Span kind** SHOULD be `CLIENT`. It MAY be set to `INTERNAL` on spans representing |
| 31 | +in-memory retrieval operations. |
| 32 | + |
| 33 | +**Span status** SHOULD follow the [Recording Errors](/docs/general/recording-errors.md) document. |
| 34 | + |
| 35 | +| Attribute | Type | Description | Examples | [Requirement Level](https://opentelemetry.io/docs/specs/semconv/general/attribute-requirement-level/) | Stability | |
| 36 | +|---|---|---|---|---|---| |
| 37 | +| [`db.system.name`](/docs/registry/attributes/db.md) | string | The database management system (DBMS) product as identified by the client instrumentation. [1] | `other_sql`; `softwareag.adabas`; `actian.ingres` | `Required` |  | |
| 38 | +| [`db.operation.name`](/docs/registry/attributes/db.md) | string | The name of the operation or command being executed. [2] | `findAndModify`; `HMSET`; `SELECT` | `Conditionally Required` [3] |  | |
| 39 | +| [`db.response.status_code`](/docs/registry/attributes/db.md) | string | Database response status code. [4] | `102`; `ORA-17002`; `08P01`; `404` | `Conditionally Required` [5] |  | |
| 40 | +| [`error.type`](/docs/registry/attributes/error.md) | string | Describes a class of error the operation ended with. [6] | `timeout`; `java.net.UnknownHostException`; `server_certificate_invalid`; `500` | `Conditionally Required` If and only if the operation failed. |  | |
| 41 | +| [`server.port`](/docs/registry/attributes/server.md) | int | Server port number. [7] | `80`; `8080`; `443` | `Conditionally Required` [8] |  | |
| 42 | +| [`db.operation.batch.size`](/docs/registry/attributes/db.md) | int | The number of queries included in a batch operation. [9] | `2`; `3`; `4` | `Recommended` |  | |
| 43 | +| [`db.query.text`](/docs/registry/attributes/db.md) | string | The query text or vector representation used for retrieval. [10] | `[0.1, 0.2, 0.3, ...]`; `search term`; `semantic query text` | `Recommended` [11] |  | |
| 44 | +| [`db.retrieval.documents_retrieved`](/docs/registry/attributes/db.md) | int | The actual number of documents retrieved. [12] | `5`; `10`; `15` | `Recommended` |  | |
| 45 | +| [`db.retrieval.top_k`](/docs/registry/attributes/db.md) | int | The maximum number of results requested. [13] | `5`; `10`; `20` | `Recommended` |  | |
| 46 | +| [`db.retrieval.type`](/docs/registry/attributes/db.md) | string | The type of retrieval operation being performed. [14] | `similarity`; `hybrid`; `keyword` | `Recommended` |  | |
| 47 | +| [`server.address`](/docs/registry/attributes/server.md) | string | Name of the database host. [15] | `example.com`; `10.1.2.80`; `/tmp/my.sock` | `Recommended` |  | |
| 48 | + |
| 49 | +**[1] `db.system.name`:** For retrieval databases, this should be set to the specific system name (e.g., `"pinecone"`, `"weaviate"`, `"qdrant"`, `"chroma"`, `"milvus"`). |
| 50 | + |
| 51 | +**[2] `db.operation.name`:** It is RECOMMENDED to capture the value as provided by the application |
| 52 | +without attempting to do any case normalization. |
| 53 | + |
| 54 | +The operation name SHOULD NOT be extracted from `db.query.text` |
| 55 | +when the database system supports query text with multiple operations |
| 56 | +in non-batch operations. |
| 57 | + |
| 58 | +If spaces can occur in the operation name, multiple consecutive spaces |
| 59 | +SHOULD be normalized to a single space. |
| 60 | + |
| 61 | +For batch operations, if the individual operations are known to have the same operation name, |
| 62 | +then that operation name SHOULD be used, prefixed with `BATCH `; |
| 63 | +otherwise, `db.operation.name` SHOULD be `BATCH` or some other |
| 64 | +database-system-specific term if more applicable. |
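
The space normalization and batch rules in note [2] could be implemented roughly as follows. Both function names are hypothetical, and raising an error for fewer than two operations is an assumption of this sketch (the convention only says such calls are not batches).

```python
import re
from typing import Sequence


def normalize_operation_name(name: str) -> str:
    """Collapse consecutive spaces into one; letter case is left untouched."""
    return re.sub(r" {2,}", " ", name)


def batch_operation_name(operation_names: Sequence[str]) -> str:
    """Derive db.operation.name for a batch of two or more operations."""
    names = [normalize_operation_name(n) for n in operation_names]
    if len(names) < 2:
        # Assumption of this sketch: reject non-batches instead of guessing.
        raise ValueError("a batch contains two or more operations")
    if all(n == names[0] for n in names):
        # Every operation shares one name: prefix it with "BATCH ".
        return f"BATCH {names[0]}"
    # Mixed operations: fall back to the generic term.
    return "BATCH"
```

With these definitions, `batch_operation_name(["search", "search"])` gives `"BATCH search"`, while a mixed batch such as `["search", "upsert"]` gives `"BATCH"`.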
| 65 | + |
| 66 | +**[3] `db.operation.name`:** If readily available and if there is a single operation name that describes the database call. |
| 67 | + |
| 68 | +**[4] `db.response.status_code`:** The status code returned by the database. Usually it represents an error code, but may also represent partial success, warning, or differentiate between various types of successful outcomes. |
| 69 | +Semantic conventions for individual database systems SHOULD document what `db.response.status_code` means in the context of that system. |
| 70 | + |
| 71 | +**[5] `db.response.status_code`:** If the operation failed and status code is available. |
| 72 | + |
| 73 | +**[6] `error.type`:** The `error.type` SHOULD match the `db.response.status_code` returned by the database or the client library, or the canonical name of the exception that occurred. |
| 74 | +When using the canonical exception type name, instrumentation SHOULD make a best effort to report the most relevant type. For example, if the original exception is wrapped in a generic one, the original exception SHOULD be preferred. |
| 75 | +Instrumentations SHOULD document how `error.type` is populated. |
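
One way to populate `error.type` along the lines of note [6] is sketched below. The helper names are hypothetical, and two choices are assumptions of this sketch rather than requirements: wrapped exceptions are unwrapped by following `__cause__`, and the `builtins.` module prefix is dropped from built-in exception names.

```python
from typing import Dict, Optional


def canonical_type_name(exc: BaseException) -> str:
    """Return the qualified class name, omitting the module for built-ins."""
    cls = type(exc)
    if cls.__module__ == "builtins":
        return cls.__qualname__
    return f"{cls.__module__}.{cls.__qualname__}"


def error_attributes(exc: BaseException, status_code: Optional[str] = None) -> Dict[str, str]:
    """Choose error.type for a failed retrieval call."""
    if status_code is not None:
        # A status code from the database or client library takes priority.
        return {"db.response.status_code": status_code, "error.type": status_code}
    root = exc
    # Prefer the original exception over a generic wrapper (bounded walk).
    for _ in range(16):
        if root.__cause__ is None:
            break
        root = root.__cause__
    return {"error.type": canonical_type_name(root)}
```

For example, a `RuntimeError` raised `from` a `TimeoutError` would be reported with `error.type` set to `TimeoutError`, because the wrapped original is preferred.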
| 76 | + |
| 77 | +**[7] `server.port`:** When observed from the client side, and when communicating through an intermediary, `server.port` SHOULD represent the server port behind any intermediaries, for example proxies, if it's available. |
| 78 | + |
| 79 | +**[8] `server.port`:** If using a port other than the default port for this DBMS and if `server.address` is set. |
| 80 | + |
| 81 | +**[9] `db.operation.batch.size`:** Operations are only considered batches when they contain two or more operations, and so `db.operation.batch.size` SHOULD never be `1`. |
| 82 | + |
| 83 | +**[10] `db.query.text`:** For vector similarity searches, this may contain the query vector representation or a textual description of the query. For hybrid searches, it may contain the keyword query portion. The value should be sanitized to remove sensitive information. |
| 84 | + |
| 85 | +**[11] `db.query.text`:** Should be collected when available and after sanitization to exclude sensitive data. |
| 86 | + |
| 87 | +**[12] `db.retrieval.documents_retrieved`:** This represents the count of documents/results actually returned by the database, which may be less than `db.retrieval.top_k` if fewer matching results were found. |
| 88 | + |
| 89 | +**[13] `db.retrieval.top_k`:** This represents the limit parameter or top_k value specified in the retrieval query, indicating how many results the client requested. The actual number of results returned may be captured in `db.retrieval.documents_retrieved`. |
| 90 | + |
| 91 | +**[14] `db.retrieval.type`:** This attribute describes the retrieval strategy used by the database or vector store. Common types include similarity search, hybrid search (combining multiple strategies), keyword search, or other database-specific retrieval methods. |
| 92 | + |
| 93 | +**[15] `server.address`:** When observed from the client side, and when communicating through an intermediary, `server.address` SHOULD represent the server address behind any intermediaries, for example proxies, if it's available. |
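
The conditional rules in the notes above (omit `server.port` when it is the default or when `server.address` is unset, never report a batch size of `1`, and record both the requested and the returned result counts) can be combined into one attribute builder. This is a sketch under stated assumptions: `retrieval_attributes` and all of its parameters are hypothetical, and the caller is assumed to know the default port of its database system.

```python
from typing import Dict, Optional, Union

AttributeValue = Union[str, int]


def retrieval_attributes(
    system: str,
    operation: str,
    top_k: int,
    documents_retrieved: int,
    retrieval_type: Optional[str] = None,
    server_address: Optional[str] = None,
    server_port: Optional[int] = None,
    default_port: Optional[int] = None,
    batch_size: Optional[int] = None,
) -> Dict[str, AttributeValue]:
    """Assemble span attributes for one retrieval call."""
    attrs: Dict[str, AttributeValue] = {
        "db.system.name": system,
        "db.operation.name": operation,
        "db.retrieval.top_k": top_k,  # results requested
        "db.retrieval.documents_retrieved": documents_retrieved,  # results returned
    }
    if retrieval_type is not None:
        attrs["db.retrieval.type"] = retrieval_type
    if server_address is not None:
        attrs["server.address"] = server_address
        # The port is only set when the address is set and the port is non-default.
        if server_port is not None and server_port != default_port:
            attrs["server.port"] = server_port
    # Batches contain two or more operations, so a size of 1 is never reported.
    if batch_size is not None and batch_size >= 2:
        attrs["db.operation.batch.size"] = batch_size
    return attrs
```

A call that asks for 10 results and receives 7 would carry `db.retrieval.top_k = 10` and `db.retrieval.documents_retrieved = 7`, which makes under-filled result sets visible in traces.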
| 94 | + |
| 95 | +The following attributes can be important for making sampling decisions |
| 96 | +and SHOULD be provided **at span creation time** (if provided at all): |
| 97 | + |
| 98 | +* [`db.operation.name`](/docs/registry/attributes/db.md) |
| 99 | +* [`db.query.text`](/docs/registry/attributes/db.md) |
| 100 | +* [`db.system.name`](/docs/registry/attributes/db.md) |
| 101 | +* [`server.address`](/docs/registry/attributes/server.md) |
| 102 | +* [`server.port`](/docs/registry/attributes/server.md) |
| 103 | + |
| 104 | +--- |
| 105 | + |
| 106 | +`db.system.name` has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used. |
| 107 | + |
| 108 | +| Value | Description | Stability | |
| 109 | +|---|---|---| |
| 110 | +| `actian.ingres` | [Actian Ingres](https://www.actian.com/databases/ingres/) |  | |
| 111 | +| `aws.dynamodb` | [Amazon DynamoDB](https://aws.amazon.com/pm/dynamodb/) |  | |
| 112 | +| `aws.redshift` | [Amazon Redshift](https://aws.amazon.com/redshift/) |  | |
| 113 | +| `azure.cosmosdb` | [Azure Cosmos DB](https://learn.microsoft.com/azure/cosmos-db) |  | |
| 114 | +| `cassandra` | [Apache Cassandra](https://cassandra.apache.org/) |  | |
| 115 | +| `clickhouse` | [ClickHouse](https://clickhouse.com/) |  | |
| 116 | +| `cockroachdb` | [CockroachDB](https://www.cockroachlabs.com/) |  | |
| 117 | +| `couchbase` | [Couchbase](https://www.couchbase.com/) |  | |
| 118 | +| `couchdb` | [Apache CouchDB](https://couchdb.apache.org/) |  | |
| 119 | +| `derby` | [Apache Derby](https://db.apache.org/derby/) |  | |
| 120 | +| `elasticsearch` | [Elasticsearch](https://www.elastic.co/elasticsearch) |  | |
| 121 | +| `firebirdsql` | [Firebird](https://www.firebirdsql.org/) |  | |
| 122 | +| `gcp.spanner` | [Google Cloud Spanner](https://cloud.google.com/spanner) |  | |
| 123 | +| `geode` | [Apache Geode](https://geode.apache.org/) |  | |
| 124 | +| `h2database` | [H2 Database](https://h2database.com/) |  | |
| 125 | +| `hbase` | [Apache HBase](https://hbase.apache.org/) |  | |
| 126 | +| `hive` | [Apache Hive](https://hive.apache.org/) |  | |
| 127 | +| `hsqldb` | [HyperSQL Database](https://hsqldb.org/) |  | |
| 128 | +| `ibm.db2` | [IBM Db2](https://www.ibm.com/db2) |  | |
| 129 | +| `ibm.informix` | [IBM Informix](https://www.ibm.com/products/informix) |  | |
| 130 | +| `ibm.netezza` | [IBM Netezza](https://www.ibm.com/products/netezza) |  | |
| 131 | +| `influxdb` | [InfluxDB](https://www.influxdata.com/) |  | |
| 132 | +| `instantdb` | [Instant](https://www.instantdb.com/) |  | |
| 133 | +| `intersystems.cache` | [InterSystems Caché](https://www.intersystems.com/products/cache/) |  | |
| 134 | +| `mariadb` | [MariaDB](https://mariadb.org/) |  | |
| 135 | +| `memcached` | [Memcached](https://memcached.org/) |  | |
| 136 | +| `microsoft.sql_server` | [Microsoft SQL Server](https://www.microsoft.com/sql-server) |  | |
| 137 | +| `mongodb` | [MongoDB](https://www.mongodb.com/) |  | |
| 138 | +| `mysql` | [MySQL](https://www.mysql.com/) |  | |
| 139 | +| `neo4j` | [Neo4j](https://neo4j.com/) |  | |
| 140 | +| `opensearch` | [OpenSearch](https://opensearch.org/) |  | |
| 141 | +| `oracle.db` | [Oracle Database](https://www.oracle.com/database/) |  | |
| 142 | +| `other_sql` | Some other SQL database. Fallback only. |  | |
| 143 | +| `postgresql` | [PostgreSQL](https://www.postgresql.org/) |  | |
| 144 | +| `redis` | [Redis](https://redis.io/) |  | |
| 145 | +| `sap.hana` | [SAP HANA](https://www.sap.com/products/technology-platform/hana/what-is-sap-hana.html) |  | |
| 146 | +| `sap.maxdb` | [SAP MaxDB](https://maxdb.sap.com/) |  | |
| 147 | +| `softwareag.adabas` | [Adabas (Adaptable Database System)](https://documentation.softwareag.com/?pf=adabas) |  | |
| 148 | +| `sqlite` | [SQLite](https://www.sqlite.org/) |  | |
| 149 | +| `teradata` | [Teradata](https://www.teradata.com/) |  | |
| 150 | +| `trino` | [Trino](https://trino.io/) |  | |
| 151 | + |
| 152 | +--- |
| 153 | + |
| 154 | +`error.type` has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used. |
| 155 | + |
| 156 | +| Value | Description | Stability | |
| 157 | +|---|---|---| |
| 158 | +| `_OTHER` | A fallback error value to be used when the instrumentation doesn't define a custom value. |  | |
| 159 | + |
| 160 | +<!-- markdownlint-restore --> |
| 161 | +<!-- prettier-ignore-end --> |
| 162 | +<!-- END AUTOGENERATED TEXT --> |
| 163 | +<!-- endsemconv --> |