17 changes: 14 additions & 3 deletions admin_guide/upgrading_citus.rst
@@ -77,17 +77,28 @@ After installing the new package and restarting the database, run the extension
  -- you should see the upgraded Citus version
  SELECT * FROM citus_version();

  -- Run this procedure on the coordinator node in either of these upgrade paths:
  --   * upgrading to Citus 11 or later from a pre-11 version, OR
  --   * upgrading to Citus 14 or later from a pre-14 version.
  CALL citus_finish_citus_upgrade();

.. note::

   If upgrading to Citus 11.x from an earlier major version, the
   citus_finish_citus_upgrade() procedure ensures that all worker nodes
   have the right schema and metadata. It may take several minutes to run,
   depending on how much metadata needs to be synced.

.. note::

   If upgrading to Citus 14.x from an earlier major version, the
   citus_finish_citus_upgrade() procedure fixes any colocation groups whose
   registered collation does not match that of the distribution columns of
   the tables in the group. No data is moved in this process, but the
   procedure adjusts the pg_dist_colocation and pg_dist_partition metadata
   tables on all nodes where needed.
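
To see which collation each colocation group has registered, before or after the upgrade, the metadata can be inspected directly. The query below is only a sketch: it assumes the ``distributioncolumntype`` and ``distributioncolumncollation`` columns of ``pg_dist_colocation`` as found in recent Citus releases, and it reads metadata without changing anything.

.. code-block:: postgresql

  -- list each colocation group with its registered distribution column
  -- type and collation, plus the tables that belong to the group
  SELECT c.colocationid,
         c.distributioncolumntype::regtype AS column_type,
         c.distributioncolumncollation,
         array_agg(p.logicalrelid ORDER BY p.logicalrelid::text) AS tables
  FROM pg_dist_colocation c
  JOIN pg_dist_partition p USING (colocationid)
  GROUP BY c.colocationid, c.distributioncolumntype, c.distributioncolumncollation
  ORDER BY c.colocationid;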

.. note::

During a major version upgrade, from the moment of yum installing a new
20 changes: 20 additions & 0 deletions develop/api_guc.rst
Original file line number Diff line number Diff line change
Expand Up @@ -251,6 +251,26 @@ $$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$

Sampling rate for new tenants in citus_stat_tenants. The rate must be between ``0.0`` and ``1.0``. The default is ``1.0``, meaning that 100% of queries from untracked tenants are sampled. With a lower value, tenants that are already tracked still have 100% of their queries sampled, but tenants that are currently untracked are sampled only at the given rate.

.. _enable_stat_counters:

citus.enable_stat_counters (boolean)
$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$

When enabled, Citus maintains statistics counters that track various operations including connection management and query execution. These statistics are available through the :ref:`citus_stat_counters <citus_stat_counters>` view and can be reset using the :ref:`citus_stat_counters_reset() <citus_stat_counters_reset>` function.

Important notes:

* Statistics are not persisted and are lost on server shutdown.
* The default value is false (disabled).
* Requires superuser privileges to change.
* When disabled, no statistics are collected, but existing statistics remain queryable.

.. code-block:: postgresql

  -- enable stat counters collection
  ALTER SYSTEM SET citus.enable_stat_counters = true;
  SELECT pg_reload_conf();
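
To confirm that the setting took effect, the current value can be read back. This is a minimal sketch that relies only on standard PostgreSQL facilities (``SHOW`` and the ``pg_settings`` view); the ``context`` column indicates which privilege level is needed to change the setting.

.. code-block:: postgresql

  -- current value as seen by this session
  SHOW citus.enable_stat_counters;

  -- current value, built-in default, and required privilege context
  SELECT name, setting, boot_val, context
  FROM pg_settings
  WHERE name = 'citus.enable_stat_counters';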

Data Loading
---------------------------

62 changes: 62 additions & 0 deletions develop/api_metadata.rst
@@ -736,6 +736,68 @@ The tenant-level statistics will reflect the queries we just made:
1 | 1 | 3 | 0.000883
2 | 0 | 1 | 0.000144

.. _citus_stat_counters:

Statistics counters view
~~~~~~~~~~~~~~~~~~~~~~~~~~

The ``citus_stat_counters`` view provides statistics about Citus operations including connection management and query execution. These statistics are collected when :ref:`citus.enable_stat_counters <enable_stat_counters>` is enabled.

The view reports aggregate statistics for each database, tracking both connection-level metrics (connection establishment and reuse) and query execution patterns (single-shard vs multi-shard queries).

**Connection Management Statistics:**

* **connection_establishment_succeeded**: Number of successful inter-node connections initiated by this node.
* **connection_establishment_failed**: Number of failed connection attempts. This includes various failure scenarios such as unreachable nodes, connection timeouts (see :ref:`citus.node_connection_timeout <node_connection_timeout>`), configuration errors, or internal issues. Note that optional connections that were skipped due to connection throttling (see ``citus.max_shared_pool_size`` and ``citus.local_shared_pool_size``) are not counted as failures since they were never attempted.
* **connection_reused**: Number of times a cached connection was reused instead of establishing a new one (see ``citus.max_cached_conns_per_worker`` and ``citus.max_cached_connection_lifetime``).

**Query Execution Statistics:**

These counters are incremented not just for top-level queries but also for subplans and subqueries. For example, recursive planning steps and the SELECT portion of INSERT ... SELECT queries increment these counters separately.

* **query_execution_single_shard**: Number of queries that accessed only a single shard.
* **query_execution_multi_shard**: Number of queries that accessed multiple shards across the cluster.
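
One way to observe how a particular statement affects these counters is to read them before and after running it. The sketch below assumes that :ref:`citus.enable_stat_counters <enable_stat_counters>` is on, and it uses a hypothetical distributed table named ``events`` with distribution column ``tenant_id``. The exact increments depend on how the statement is planned, so no expected values are shown.

.. code-block:: postgresql

  -- counters for the current database before the test query
  SELECT query_execution_single_shard, query_execution_multi_shard
  FROM citus_stat_counters
  WHERE name = current_database();

  -- hypothetical query that filters on the distribution column
  SELECT count(*) FROM events WHERE tenant_id = 42;

  -- read the counters again and compare with the first result
  SELECT query_execution_single_shard, query_execution_multi_shard
  FROM citus_stat_counters
  WHERE name = current_database();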

+-------------------------------------+-------------+------------------------------------------------------------+
| Name                                | Type        | Description                                                |
+=====================================+=============+============================================================+
| database_id                         | oid         | Database OID                                               |
+-------------------------------------+-------------+------------------------------------------------------------+
| name                                | name        | Database name (NULL if database has been dropped)          |
+-------------------------------------+-------------+------------------------------------------------------------+
| connection_establishment_succeeded  | bigint      | Successful inter-node connection establishments            |
+-------------------------------------+-------------+------------------------------------------------------------+
| connection_establishment_failed     | bigint      | Failed connection attempts                                 |
+-------------------------------------+-------------+------------------------------------------------------------+
| connection_reused                   | bigint      | Cached connections that were reused                        |
+-------------------------------------+-------------+------------------------------------------------------------+
| query_execution_single_shard        | bigint      | Queries accessing a single shard                           |
+-------------------------------------+-------------+------------------------------------------------------------+
| query_execution_multi_shard         | bigint      | Queries accessing multiple shards                          |
+-------------------------------------+-------------+------------------------------------------------------------+
| stats_reset                         | timestamptz | Last time the statistics were reset (NULL if never reset)  |
+-------------------------------------+-------------+------------------------------------------------------------+

Example:

.. code-block:: postgres

  SELECT * FROM citus_stat_counters;

::

   database_id |    name    | connection_establishment_succeeded | connection_establishment_failed | connection_reused | query_execution_single_shard | query_execution_multi_shard | stats_reset
  -------------+------------+------------------------------------+---------------------------------+-------------------+------------------------------+-----------------------------+-------------
         13340 | mydb       |                                245 |                               3 |                89 |                          156 |                          78 | 2025-01-15...
         16384 | analytics  |                                 12 |                               0 |                 5 |                           23 |                          45 | 2025-01-14...
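
Derived metrics are often easier to read than raw counts. As an illustration only (the column aliases are made up here), the following query computes the share of connections that were served from the cache and the share of connection attempts that failed, using ``NULLIF`` to avoid division by zero:

.. code-block:: postgresql

  SELECT name,
         round(connection_reused::numeric
               / NULLIF(connection_reused + connection_establishment_succeeded, 0), 3)
           AS connection_reuse_ratio,
         round(connection_establishment_failed::numeric
               / NULLIF(connection_establishment_failed + connection_establishment_succeeded, 0), 3)
           AS connection_failure_ratio
  FROM citus_stat_counters;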

Caveats:

* Statistics are not persisted and are lost on server restart.
* Statistics can be manually reset using the :ref:`citus_stat_counters_reset() <citus_stat_counters_reset>` function.
* The ``citus_stat_counters()`` function can be used to query statistics for a specific database by providing its OID.
* Even after a database is dropped, its statistics remain visible via the ``citus_stat_counters()`` function, though the ``name`` column will be NULL in the view.
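
The counter descriptions above refer to connections initiated by "this node", which suggests that each node keeps its own counters. Under that assumption, and assuming the ``run_command_on_workers()`` utility function is available, one possible way to collect a value from every worker is to run an aggregate from the coordinator. The alias ``failed_connections`` is illustrative.

.. code-block:: postgresql

  -- run on the coordinator; returns one row per worker node
  SELECT nodename, nodeport, success, result AS failed_connections
  FROM run_command_on_workers($cmd$
    SELECT coalesce(sum(connection_establishment_failed), 0)
    FROM citus_stat_counters
  $cmd$);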

.. _dist_query_activity:

Distributed Query Activity
89 changes: 89 additions & 0 deletions develop/api_udf.rst
@@ -1348,6 +1348,95 @@ Return Value

None

.. _citus_stat_counters_reset:

citus_stat_counters_reset
$$$$$$$$$$$$$$$$$$$$$$$$$$$

Resets the Citus statistics counters to zero for a specified database. The statistics are displayed in the :ref:`citus_stat_counters <citus_stat_counters>` view and include connection management and query execution metrics.

This function requires superuser privileges.

Arguments
*********

**database_id:** (Optional) OID of the database for which to reset statistics. Use ``0`` or ``NULL`` (or omit the parameter) to reset statistics for the current database. You can also provide the OID of any other database to reset its statistics **on the local node**.

Return Value
************

None

Example
*******

.. code-block:: postgresql

  -- reset stats for the current database
  SELECT citus_stat_counters_reset();

  -- or explicitly specify the current database
  SELECT citus_stat_counters_reset(0);

  -- reset stats for a specific database by OID, no-op if no such database
  SELECT citus_stat_counters_reset(12345);

Caveats
*******

* Due to concurrent access patterns, there is a small possibility that the function might not reset counters for all active backends. This is a rare edge case.
* After resetting, the ``stats_reset`` column in the :ref:`citus_stat_counters <citus_stat_counters>` view will show the timestamp when the reset occurred.
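
Database OIDs are rarely known offhand. A small sketch for resetting the counters of a database by name is to look up its OID in ``pg_database``; the database name ``analytics`` is only a placeholder.

.. code-block:: postgresql

  -- reset counters for the database named 'analytics' on the local node
  SELECT citus_stat_counters_reset(oid)
  FROM pg_database
  WHERE datname = 'analytics';

  -- check the recorded reset time afterwards
  SELECT name, stats_reset
  FROM citus_stat_counters
  WHERE name = 'analytics';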

citus_stat_counters
$$$$$$$$$$$$$$$$$$$$$$$$$$$

Returns Citus statistics counters for one or more databases. This function is particularly useful for querying statistics for specific databases, including databases that have been dropped (which will still have their statistics available through this function, though not through the :ref:`citus_stat_counters view <citus_stat_counters>`).

The statistics include the same information as the :ref:`citus_stat_counters view <citus_stat_counters>`.

See :ref:`citus.enable_stat_counters <enable_stat_counters>` to enable statistics collection.

Arguments
*********

**database_id:** (Optional) OID of the database for which to retrieve statistics. Use ``0`` or ``NULL`` (or omit the parameter) to retrieve statistics for all databases.

Return Value
************

A set of records with the following columns:

* **database_id** (oid): Database OID.
* **connection_establishment_succeeded** (bigint): Successful inter-node connection establishments.
* **connection_establishment_failed** (bigint): Failed connection attempts.
* **connection_reused** (bigint): Cached connections that were reused.
* **query_execution_single_shard** (bigint): Queries accessing a single shard.
* **query_execution_multi_shard** (bigint): Queries accessing multiple shards.
* **stats_reset** (timestamptz): Last time the statistics were reset (NULL if never reset).

Example
*******

.. code-block:: postgresql

  -- get stats for all databases
  SELECT * FROM citus_stat_counters();

  -- get stats for a specific database
  SELECT * FROM citus_stat_counters(12345);

  -- get stats for current database only
  SELECT * FROM citus_stat_counters(
    (SELECT oid FROM pg_database WHERE datname = current_database())
  );

Caveats
*******

* Unlike the :ref:`citus_stat_counters view <citus_stat_counters>`, this function can return statistics for databases that have been dropped (the view filters these out).
* Statistics are only available for databases that have had at least one connection since the last server restart.
* Providing a database OID that has never been used returns an empty result set.
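
The first caveat can be turned into a query. As a sketch, a ``LEFT JOIN`` against ``pg_database`` keeps only those rows returned by the function whose OID no longer matches an existing database:

.. code-block:: postgresql

  -- statistics that belong to databases which have since been dropped
  SELECT s.database_id,
         s.query_execution_single_shard,
         s.query_execution_multi_shard,
         s.stats_reset
  FROM citus_stat_counters() AS s
  LEFT JOIN pg_database AS d ON d.oid = s.database_id
  WHERE d.oid IS NULL;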

.. _cluster_management_functions:

Cluster Management And Repair Functions