Skip to content

Commit 729212e

Browse files
craig[bot]rickystewartyuzefovicharulajmaniYevgeniy Miretskiy
committed
104228: dev: replace `stress` with `--runs_per_test` r=rail a=rickystewart `--runs_per_test` does the same thing that `stress` does (runs a test many times to weed out flakes), but better, in that: 1. Better Bazel support -- we had to hack Bazel support into `stress` to keep it working 2. Bazel gives us statistics and control over how the tests are scheduled in a way that is standardized as opposed to the ad-hoc arguments one can pass to `stress` 3. Bazel can collect logs and artifacts for different test runs and expose them to the user in a way that `stress` cannot 4. Drop-in support for remote execution when we have access to it, obviating the need for `roachprod-stress` For now, the default implementation of `dev test --stress` is to run the test 1,000 times, stopping the build if any test fails. This is meant to replicate how people use `stress` (i.e., just start it running and stop if there are any failures). The 1,000 figure is arbitrary and meant to just be a "very high number" of times to run unit tests. The default figure of 1,000 can be adjusted with `--count`. Also update documentation with some new recipes and add extra logging to warn about the change in behavior. Closes cockroachdb#102879 Epic: none Release note: None 107765: sql: remove calls to CreateTestServerParams r=yuzefovich a=yuzefovich This PR removes calls to `tests.CreateTestServerParams` outside of `sql_test` tests as well as does some miscellaneous cleanups along the way. The rationale for doing this is that vast majority of callers didn't actually need the setup performed by the helper but inherited the disabling of test tenant. It seems more prudent for each test to be explicit about the kind of testing knobs and settings it needs. Informs: cockroachdb#76378. Epic: CRDB-18499. Release note: None 107889: concurrency: enable joint claims on the request scan path r=nvanbenschoten a=arulajmani This patch addresses a TODO that was blocking joint claims on a request's scan path and adds a test to show joint claims work. In particular, previously the lock mode of a queued request was hardcoded to use locking strength `lock.Intent`. We now use the correct lock mode by snapshotting an immutable copy of the request's lock strength on to the queuedGuard when inserting a it into a lock's wait queue. Informs cockroachdb#102275 Release note: None 108052: changefeedccl: Release allocation when skipping events r=miretskiy a=miretskiy The changefeed (or KV feed to be precise) may skip some events when "scan boundary" is reached. Scan boundary is a timestamp when certain event occurs -- usually a schema change. But, it may also occur when the `end_time` option is set. The KV feed ignores events that have MVCC timestamp greater or equal to the scan boundary event. Unfortunately, due to a long outstanding bug, the memory allocation associated with the event would not be released when KV feed decides to skip the event. Because of this, allocated memory was "leaked" and not reclaimed. If enough additional events arrive, those leaked events may account for all of the memory budget, thus leading to inability for additional events to be added. This bug impacts any changefeeds running with the `end_time` option set. It might also impact changefeeds that observe normal schema change event, though this situation is highly unlikely(the same transaction that perform schema change had to have modified sufficient number of rows in the table to fill up all of the memory budget). Fixes cockroachdb#108040 Release note (enterprise change): Fix a potential "deadlock" when running changefeed with `end_time` option set. 108066: dev: set `COCKROACH_DEV_LICENSE` for `compose` r=rail a=rickystewart This test won't run without the license key set. Epic: CRDB-17171 Release note: None Co-authored-by: Ricky Stewart <[email protected]> Co-authored-by: Yahor Yuzefovich <[email protected]> Co-authored-by: Arul Ajmani <[email protected]> Co-authored-by: Yevgeniy Miretskiy <[email protected]>
6 parents 46255c8 + c232fa8 + cf4d08b + 29a9db3 + 74f3f34 + e1a7054 commit 729212e

File tree

148 files changed

+1490
-963
lines changed

Some content is hidden

Large Commits have some content hidden by default. Use the searchbox below for content that may be hidden.

148 files changed

+1490
-963
lines changed

dev

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -8,7 +8,7 @@ fi
88
set -euo pipefail
99

1010
# Bump this counter to force rebuilding `dev` on all machines.
11-
DEV_VERSION=82
11+
DEV_VERSION=83
1212

1313
THIS_DIR=$(cd "$(dirname "$0")" && pwd)
1414
BINARY_DIR=$THIS_DIR/bin/dev-versions

pkg/ccl/changefeedccl/changefeed_test.go

Lines changed: 67 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -7342,6 +7342,25 @@ func (s *memoryHoggingSink) Close() error {
73427342
return nil
73437343
}
73447344

7345+
type countEmittedRowsSink struct {
7346+
memoryHoggingSink
7347+
numRows int64 // Accessed atomically; not using atomic.Int64 to make backports possible.
7348+
}
7349+
7350+
func (s *countEmittedRowsSink) EmitRow(
7351+
ctx context.Context,
7352+
topic TopicDescriptor,
7353+
key, value []byte,
7354+
updated, mvcc hlc.Timestamp,
7355+
alloc kvevent.Alloc,
7356+
) error {
7357+
alloc.Release(ctx)
7358+
atomic.AddInt64(&s.numRows, 1)
7359+
return nil
7360+
}
7361+
7362+
var _ Sink = (*countEmittedRowsSink)(nil)
7363+
73457364
func TestChangefeedFlushesSinkToReleaseMemory(t *testing.T) {
73467365
defer leaktest.AfterTest(t)()
73477366
defer log.Scope(t).Close(t)
@@ -7392,6 +7411,54 @@ func TestChangefeedFlushesSinkToReleaseMemory(t *testing.T) {
73927411
require.Greater(t, sink.numFlushes(), 0)
73937412
}
73947413

7414+
// Test verifies that KV feed does not leak event memory allocation
7415+
// when it reaches end_time or scan boundary.
7416+
func TestKVFeedDoesNotLeakMemoryWhenSkippingEvents(t *testing.T) {
7417+
defer leaktest.AfterTest(t)()
7418+
defer log.Scope(t).Close(t)
7419+
7420+
s, stopServer := makeServer(t)
7421+
defer stopServer()
7422+
7423+
sqlDB := sqlutils.MakeSQLRunner(s.DB)
7424+
knobs := s.TestingKnobs.
7425+
DistSQL.(*execinfra.TestingKnobs).
7426+
Changefeed.(*TestingKnobs)
7427+
7428+
// Arrange for a small memory budget.
7429+
knobs.MemMonitor = startMonitorWithBudget(4096)
7430+
7431+
// Arrange for custom sink to be used -- a sink that counts emitted rows.
7432+
sink := &countEmittedRowsSink{}
7433+
knobs.WrapSink = func(_ Sink, _ jobspb.JobID) Sink {
7434+
return sink
7435+
}
7436+
sqlDB.Exec(t, `CREATE TABLE foo(key INT PRIMARY KEY DEFAULT unique_rowid(), val INT)`)
7437+
7438+
startTime := s.Server.Clock().Now().AsOfSystemTime()
7439+
7440+
// Insert 123 rows -- this fills up our tiny memory buffer (~26 rows do)
7441+
// Collect statement timestamp -- this will become our end time.
7442+
var insertTimeStr string
7443+
sqlDB.QueryRow(t,
7444+
`INSERT INTO foo (val) SELECT * FROM generate_series(1, 123) RETURNING cluster_logical_timestamp();`,
7445+
).Scan(&insertTimeStr)
7446+
endTime := parseTimeToHLC(t, insertTimeStr).AsOfSystemTime()
7447+
7448+
// Start the changefeed, with end_time set to be equal to the insert time.
7449+
// KVFeed should ignore all events.
7450+
var jobID jobspb.JobID
7451+
sqlDB.QueryRow(t, `CREATE CHANGEFEED FOR foo INTO 'null:' WITH cursor = $1, end_time = $2`,
7452+
startTime, endTime).Scan(&jobID)
7453+
7454+
// If everything is fine (events are ignored, but their memory allocation is released),
7455+
// the changefeed should terminate. If not, we'll time out waiting for job.
7456+
waitForJobStatus(sqlDB, t, jobID, jobs.StatusSucceeded)
7457+
7458+
// No rows should have been emitted (all should have been filtered out due to end_time).
7459+
require.EqualValues(t, 0, atomic.LoadInt64(&sink.numRows))
7460+
}
7461+
73957462
func TestChangefeedMultiPodTenantPlanning(t *testing.T) {
73967463
defer leaktest.AfterTest(t)()
73977464
defer log.Scope(t).Close(t)

pkg/ccl/changefeedccl/kvfeed/kv_feed.go

Lines changed: 18 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -584,6 +584,7 @@ func copyFromSourceToDestUntilTableEvent(
584584
) error {
585585
var (
586586
scanBoundary errBoundaryReached
587+
endTimeIsSet = !endTime.IsEmpty()
587588

588589
// checkForScanBoundary takes in a new event's timestamp (event generated
589590
// from rangefeed), and asks "Is some type of 'boundary' reached
@@ -595,7 +596,11 @@ func copyFromSourceToDestUntilTableEvent(
595596
// If the scanBoundary is not nil, it either means that there is a table
596597
// event boundary set or a boundary for the end time. If the boundary is
597598
// for the end time, we should keep looking for table events.
598-
_, isEndTimeBoundary := scanBoundary.(*errEndTimeReached)
599+
isEndTimeBoundary := false
600+
if endTimeIsSet {
601+
_, isEndTimeBoundary = scanBoundary.(*errEndTimeReached)
602+
}
603+
599604
if scanBoundary != nil && !isEndTimeBoundary {
600605
return nil
601606
}
@@ -610,7 +615,7 @@ func copyFromSourceToDestUntilTableEvent(
610615
// precedence to table events.
611616
if len(nextEvents) > 0 {
612617
scanBoundary = &errTableEventReached{nextEvents[0]}
613-
} else if !endTime.IsEmpty() && scanBoundary == nil {
618+
} else if endTimeIsSet && scanBoundary == nil {
614619
scanBoundary = &errEndTimeReached{
615620
endTime: endTime,
616621
}
@@ -691,6 +696,17 @@ func copyFromSourceToDestUntilTableEvent(
691696
if err != nil {
692697
return err
693698
}
699+
700+
if skipEntry || scanBoundaryReached {
701+
// We will skip this entry or outright terminate kvfeed (if boundary reached).
702+
// Regardless of the reason, we must release this event memory allocation
703+
// since other ranges might not have reached scan boundary yet.
704+
// Failure to release this event allocation may prevent other events from being
705+
// enqueued in the blocking buffer due to memory limit.
706+
a := e.DetachAlloc()
707+
a.Release(ctx)
708+
}
709+
694710
if scanBoundaryReached {
695711
// All component rangefeeds are now at the boundary.
696712
// Break out of the ctxgroup by returning the sentinel error.
@@ -702,7 +718,6 @@ func copyFromSourceToDestUntilTableEvent(
702718
return addEntry(e)
703719
}
704720
)
705-
706721
for {
707722
e, err := source.Get(ctx)
708723
if err != nil {

pkg/ccl/serverccl/BUILD.bazel

Lines changed: 0 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -17,7 +17,6 @@ go_library(
1717
"//pkg/sql",
1818
"//pkg/sql/contention",
1919
"//pkg/sql/sqlstats/persistedsqlstats",
20-
"//pkg/sql/tests",
2120
"//pkg/testutils/serverutils",
2221
"//pkg/testutils/sqlutils",
2322
"//pkg/util/httputil",

pkg/ccl/serverccl/statusccl/tenant_grpc_test.go

Lines changed: 1 addition & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -46,9 +46,7 @@ func TestTenantGRPCServices(t *testing.T) {
4646

4747
tenantID := serverutils.TestTenantID()
4848
testingKnobs := base.TestingKnobs{
49-
SQLStatsKnobs: &sqlstats.TestingKnobs{
50-
AOSTClause: "AS OF SYSTEM TIME '-1us'",
51-
},
49+
SQLStatsKnobs: sqlstats.CreateTestingKnobs(),
5250
}
5351
tenant, connTenant := serverutils.StartTenant(t, server, base.TestTenantArgs{
5452
TenantID: tenantID,

pkg/ccl/serverccl/statusccl/tenant_status_test.go

Lines changed: 1 addition & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -391,9 +391,7 @@ func TestTenantCannotSeeNonTenantStats(t *testing.T) {
391391
tenant, sqlDB := serverutils.StartTenant(t, server, base.TestTenantArgs{
392392
TenantID: roachpb.MustMakeTenantID(10 /* id */),
393393
TestingKnobs: base.TestingKnobs{
394-
SQLStatsKnobs: &sqlstats.TestingKnobs{
395-
AOSTClause: "AS OF SYSTEM TIME '-1us'",
396-
},
394+
SQLStatsKnobs: sqlstats.CreateTestingKnobs(),
397395
},
398396
})
399397

pkg/ccl/serverccl/tenant_test_utils.go

Lines changed: 4 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -23,7 +23,6 @@ import (
2323
"github.com/cockroachdb/cockroach/pkg/sql"
2424
"github.com/cockroachdb/cockroach/pkg/sql/contention"
2525
"github.com/cockroachdb/cockroach/pkg/sql/sqlstats/persistedsqlstats"
26-
"github.com/cockroachdb/cockroach/pkg/sql/tests"
2726
"github.com/cockroachdb/cockroach/pkg/testutils/serverutils"
2827
"github.com/cockroachdb/cockroach/pkg/testutils/sqlutils"
2928
"github.com/cockroachdb/cockroach/pkg/util/httputil"
@@ -221,9 +220,10 @@ func newTenantClusterHelper(
221220

222221
var cluster tenantCluster = make([]TestTenant, tenantClusterSize)
223222
for i := 0; i < tenantClusterSize; i++ {
224-
args := tests.CreateTestTenantParams(roachpb.MustMakeTenantID(tenantID))
225-
args.TestingKnobs = knobs
226-
cluster[i] = newTestTenant(t, server, args)
223+
cluster[i] = newTestTenant(t, server, base.TestTenantArgs{
224+
TenantID: roachpb.MustMakeTenantID(tenantID),
225+
TestingKnobs: knobs,
226+
})
227227
}
228228

229229
return cluster

pkg/ccl/sqlproxyccl/BUILD.bazel

Lines changed: 0 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -95,7 +95,6 @@ go_test(
9595
"//pkg/sql/pgwire",
9696
"//pkg/sql/pgwire/pgerror",
9797
"//pkg/sql/pgwire/pgwirebase",
98-
"//pkg/sql/tests",
9998
"//pkg/testutils",
10099
"//pkg/testutils/datapathutils",
101100
"//pkg/testutils/serverutils",

pkg/ccl/sqlproxyccl/proxy_handler_test.go

Lines changed: 10 additions & 9 deletions
Original file line numberDiff line numberDiff line change
@@ -37,7 +37,6 @@ import (
3737
"github.com/cockroachdb/cockroach/pkg/sql/catalog/descs"
3838
"github.com/cockroachdb/cockroach/pkg/sql/pgwire"
3939
"github.com/cockroachdb/cockroach/pkg/sql/pgwire/pgerror"
40-
"github.com/cockroachdb/cockroach/pkg/sql/tests"
4140
"github.com/cockroachdb/cockroach/pkg/testutils"
4241
"github.com/cockroachdb/cockroach/pkg/testutils/datapathutils"
4342
"github.com/cockroachdb/cockroach/pkg/testutils/serverutils"
@@ -1931,15 +1930,16 @@ func TestConnectionMigration(t *testing.T) {
19311930
require.NoError(t, err)
19321931

19331932
// Start first SQL pod.
1934-
tenant1, tenantDB1 := serverutils.StartTenant(t, s, tests.CreateTestTenantParams(tenantID))
1933+
tenant1, tenantDB1 := serverutils.StartTenant(t, s, base.TestTenantArgs{TenantID: tenantID})
19351934
tenant1.PGPreServer().(*pgwire.PreServeConnHandler).TestingSetTrustClientProvidedRemoteAddr(true)
19361935
defer tenant1.Stopper().Stop(ctx)
19371936
defer tenantDB1.Close()
19381937

19391938
// Start second SQL pod.
1940-
params2 := tests.CreateTestTenantParams(tenantID)
1941-
params2.DisableCreateTenant = true
1942-
tenant2, tenantDB2 := serverutils.StartTenant(t, s, params2)
1939+
tenant2, tenantDB2 := serverutils.StartTenant(t, s, base.TestTenantArgs{
1940+
TenantID: tenantID,
1941+
DisableCreateTenant: true,
1942+
})
19431943
tenant2.PGPreServer().(*pgwire.PreServeConnHandler).TestingSetTrustClientProvidedRemoteAddr(true)
19441944
defer tenant2.Stopper().Stop(ctx)
19451945
defer tenantDB2.Close()
@@ -2984,10 +2984,11 @@ func startTestTenantPodsWithStopper(
29842984

29852985
var tenants []serverutils.ApplicationLayerInterface
29862986
for i := 0; i < count; i++ {
2987-
params := tests.CreateTestTenantParams(tenantID)
2988-
params.TestingKnobs = knobs
2989-
params.Stopper = stopper
2990-
tenant, tenantDB := serverutils.StartTenant(t, ts, params)
2987+
tenant, tenantDB := serverutils.StartTenant(t, ts, base.TestTenantArgs{
2988+
TenantID: tenantID,
2989+
TestingKnobs: knobs,
2990+
Stopper: stopper,
2991+
})
29912992
tenant.PGPreServer().(*pgwire.PreServeConnHandler).TestingSetTrustClientProvidedRemoteAddr(true)
29922993

29932994
// Create a test user. We only need to do it once.

pkg/ccl/testccl/sqlccl/gc_job_test.go

Lines changed: 3 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -16,7 +16,6 @@ import (
1616
"github.com/cockroachdb/cockroach/pkg/jobs"
1717
"github.com/cockroachdb/cockroach/pkg/jobs/jobspb"
1818
"github.com/cockroachdb/cockroach/pkg/sql/sqltestutils"
19-
"github.com/cockroachdb/cockroach/pkg/sql/tests"
2019
"github.com/cockroachdb/cockroach/pkg/testutils"
2120
"github.com/cockroachdb/cockroach/pkg/testutils/serverutils"
2221
"github.com/cockroachdb/cockroach/pkg/testutils/sqlutils"
@@ -36,7 +35,9 @@ func TestGCJobGetsMarkedIdle(t *testing.T) {
3635
})
3736
sqltestutils.SetShortRangeFeedIntervals(t, mainDB)
3837
defer s.Stopper().Stop(ctx)
39-
tenant, tenantDB := serverutils.StartTenant(t, s, tests.CreateTestTenantParams(serverutils.TestTenantID()))
38+
tenant, tenantDB := serverutils.StartTenant(t, s, base.TestTenantArgs{
39+
TenantID: serverutils.TestTenantID(),
40+
})
4041
defer tenant.Stopper().Stop(ctx)
4142
defer tenantDB.Close()
4243

0 commit comments

Comments
 (0)