
[BUG] DatabaseDefault object appears in SQL when bulk_create() batch_size exceeds 1000 #136

@kostasgan-csq

Describe the bug
I have encountered an issue with Django 5.2's db_default model field when using bulk_create() with the django-clickhouse-backend package.

I am attempting to create a table with an id column whose value is automatically generated by ClickHouse using the generateSerialID() function.

My Django model extends ClickhouseModel and contains the following id field:

id = ch_models.UInt64Field(
    primary_key=True,
    db_default="generateSerialID('table_name_id')",
)

When instantiating thousands of records and performing a bulk_create() with a batch size greater than 1000, a database exception occurs because the literal string <django.db.models.expressions.DatabaseDefault object at 0x7f0754b6f5c0> appears in the id field of the SQL INSERT query instead of the expected DEFAULT keyword.
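One possible reading of this symptom can be sketched in plain Python: above a row threshold, the insert path formats raw field values instead of compiling expressions, so the default marker is stringified. Everything in this sketch is a hypothetical stand-in (FakeDatabaseDefault, THRESHOLD, render_values) and not the backend's actual code; only the limit of 1000 and the DEFAULT keyword come from this report.

```python
class FakeDatabaseDefault:
    """Hypothetical stand-in for django.db.models.expressions.DatabaseDefault."""

    def as_sql(self):
        return "DEFAULT"


THRESHOLD = 1000  # assumption: mirrors the limit observed in this report


def render_values(rows):
    """Render the VALUES part of an INSERT for a list of row tuples."""
    use_expressions = len(rows) <= THRESHOLD
    rendered = []
    for row in rows:
        cells = []
        for value in row:
            if not hasattr(value, "as_sql"):
                # Plain Python value: quote it naively for illustration.
                cells.append(repr(value))
            elif use_expressions:
                # Small batch: the expression is compiled to SQL.
                cells.append(value.as_sql())
            else:
                # Large batch: the object is formatted as if it were a value.
                cells.append(str(value))
        rendered.append("(" + ", ".join(cells) + ")")
    return ", ".join(rendered)


small = [(FakeDatabaseDefault(), "ABC")] * 1000
large = [(FakeDatabaseDefault(), "ABC")] * 1001
print(render_values(small)[:16])
print(render_values(large)[:60])
```

With 1000 rows the first cell renders as DEFAULT; with 1001 rows it renders as the object's default repr, which matches the shape of the string seen in the failing query.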

To Reproduce

  1. Create the ClickHouse table:
CREATE TABLE bug_table
(
    id          UInt64 DEFAULT generateSerialID('bug_table__id'),
    description String
)
ENGINE = MergeTree
ORDER BY (id)
PRIMARY KEY (id);
  2. Define the Django model and execute the following in a Django shell:
import string
import random
from django.db import connections
from django.test.utils import CaptureQueriesContext
from clickhouse_backend import models as ch_models

class BugTable(ch_models.ClickhouseModel):
    id = ch_models.UInt64Field(
        primary_key=True,
        db_default="generateSerialID('bug_table__id')",
    )
    description = ch_models.StringField(max_length=10, blank=True)

    class Meta:
        db_table = 'bug_table'
        ordering = ['id']
        engine = ch_models.MergeTree(primary_key='id')

ALPHABET = string.ascii_uppercase

def rand3() -> str:
    return ''.join(random.choices(ALPHABET, k=3))

# Create 1001 objects (exceeds the threshold)
objs = [BugTable(description=rand3()) for _ in range(1001)]

conn = connections['your_clickhouse_connection_name']
conn.force_debug_cursor = True

# This will fail with batch_size > 1000
with CaptureQueriesContext(conn) as ctx:
    BugTable.objects.bulk_create(objs, batch_size=1001)

print(f"Captured {len(ctx.captured_queries)} queries")
for i, q in enumerate(ctx.captured_queries, 1):
    print(f"\n-- SQL #{i}\n{q['sql']}")

Expected behavior
The generated SQL INSERT statement should contain the DEFAULT keyword for the id field instead of the DatabaseDefault object representation.
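To check this expectation against captured queries, a small helper along these lines may be useful. It is a sketch: find_leaked_defaults and LEAKED_REPR are hypothetical names, and it assumes each captured query is a dict with an "sql" key, as in the entries of ctx.captured_queries.

```python
import re

# Matches a default object repr such as
# "<django.db.models.expressions.DatabaseDefault object at 0x7f0754b6f5c0>".
LEAKED_REPR = re.compile(r"<[\w.]*DatabaseDefault object at 0x[0-9a-fA-F]+>")


def find_leaked_defaults(captured_queries):
    """Return the SQL strings that contain a leaked DatabaseDefault repr."""
    return [q["sql"] for q in captured_queries if LEAKED_REPR.search(q["sql"])]
```

Calling find_leaked_defaults(ctx.captured_queries) after the bulk_create() call should return an empty list once the INSERT statement uses DEFAULT for the id column.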

Workarounds
Option 1: Use a batch size of 1000 or less:
BugTable.objects.bulk_create(objs, batch_size=1000)  # Works correctly

Option 2: Increase MAX_ROWS_INSERT_USE_EXPRESSION at runtime:

from clickhouse_backend.models.sql import compiler
compiler.MAX_ROWS_INSERT_USE_EXPRESSION = 10_000

with CaptureQueriesContext(conn) as ctx:
    BugTable.objects.bulk_create(objs, batch_size=1001)  # Now works

for i, q in enumerate(ctx.captured_queries, 1):
    print(f"\n-- SQL #{i}\n{q['sql']}")
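Because Option 2 changes a module-level value for the whole process, it may be safer to scope the override and restore the original afterwards. The following is a minimal sketch using a generic context manager; temporary_attr is a hypothetical helper, not part of django-clickhouse-backend, and it assumes the compiler reads the constant from the module at call time, as the workaround above suggests.

```python
from contextlib import contextmanager


@contextmanager
def temporary_attr(target, name, value):
    """Set target.<name> to value inside the with block, then restore it."""
    original = getattr(target, name)
    setattr(target, name, value)
    try:
        yield
    finally:
        # Restore the previous value even if the body raises.
        setattr(target, name, original)
```

Usage would look like `with temporary_attr(compiler, "MAX_ROWS_INSERT_USE_EXPRESSION", 10_000):` wrapped around the bulk_create() call. The override still applies process-wide while the block is active, so it is not isolated between threads.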

Versions

  • ClickHouse server version: 25.8.7.3
  • Python version: 3.12.10
  • clickhouse-driver version: 0.2.9
  • Django version: 5.2
  • django-clickhouse-backend version: 1.4
