Commit adf0cdc

feat: add --sample-database flag for initializing embedded emulator with official samples (#472)
Adds the ability to initialize the embedded Spanner emulator with pre-populated official sample databases, streamlining development and testing workflows.

Key features:
- Adds --sample-database flag to load official Google samples (banking, finance, finance-graph, finance-pg, gaming)
- Adds --list-samples flag to show available sample databases
- Supports flexible URI-based loading (gs://, file://, https://) for extensibility
- Integrates with spanemuboost for DDL/DML initialization
- Handles both GoogleSQL and PostgreSQL dialects automatically

Implementation:
- Uses Go's native Cloud Storage client for GCS access
- Downloads schema and data files directly without caching
- Validates that --embedded-emulator is used with --sample-database
- Provides clear error messages and initialization feedback

This feature enables users to quickly start with realistic sample data for testing, demos, and learning Spanner capabilities without manual database setup.

Fixes #470
1 parent 889ab33 commit adf0cdc
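
The commit message mentions URI-based loading for `gs://`, `file://`, and `https://` sources, but the loader itself is not part of the files shown here. As a rough sketch only, scheme dispatch of that kind could be written with `net/url`; `sourceKind`, `classifySampleURI`, and the example paths are hypothetical names chosen for illustration, not identifiers from this commit.

```go
package main

import (
	"fmt"
	"net/url"
)

// sourceKind is a hypothetical label for where a sample file is fetched from.
type sourceKind string

const (
	sourceGCS   sourceKind = "gcs"
	sourceFile  sourceKind = "file"
	sourceHTTPS sourceKind = "https"
)

// classifySampleURI maps a URI to one of the three schemes named in the
// commit message. The name and return type are assumptions for this sketch.
func classifySampleURI(raw string) (sourceKind, error) {
	u, err := url.Parse(raw)
	if err != nil {
		return "", fmt.Errorf("invalid URI %q: %w", raw, err)
	}
	switch u.Scheme {
	case "gs":
		return sourceGCS, nil
	case "file":
		return sourceFile, nil
	case "https":
		return sourceHTTPS, nil
	default:
		return "", fmt.Errorf("unsupported URI scheme %q in %q", u.Scheme, raw)
	}
}

func main() {
	// All paths below are made up for the example.
	for _, raw := range []string{
		"gs://example-bucket/schema.sql",
		"file:///tmp/schema.sql",
		"https://example.com/schema.sql",
		"ftp://example.com/schema.sql",
	} {
		kind, err := classifySampleURI(raw)
		fmt.Println(raw, kind, err)
	}
}
```

Rejecting unknown schemes up front keeps the error close to the user's input instead of surfacing later as a download failure.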

File tree

12 files changed, +984 -31 lines changed

.gemini/styleguide.md

Lines changed: 32 additions & 0 deletions
@@ -45,6 +45,38 @@ err3 := fmt.Errorf("first: %w, second: %w", err1, err2) // Valid
 
 Reference: https://go.dev/doc/go1.20#errors
 
+### WaitGroup.Go Method (Go 1.25+)
+
+Since Go 1.25, `sync.WaitGroup` has a new `Go` method that simplifies concurrent task management:
+
+- **`WaitGroup.Go(f func())`** starts a goroutine and automatically manages Add/Done
+- Eliminates the need for manual `wg.Add(1)` and `defer wg.Done()` calls
+- Makes concurrent code cleaner and less error-prone
+
+**DO NOT** suggest that `sync.WaitGroup` doesn't have a `Go` method when reviewing Go 1.25+ code:
+
+```go
+// Go 1.25+ simplified pattern
+var wg sync.WaitGroup
+wg.Go(func() {
+    // Task executes in goroutine
+    // No need for Add(1) or defer Done()
+    doWork()
+})
+wg.Wait()
+
+// Old pattern (still valid but more verbose)
+var wg sync.WaitGroup
+wg.Add(1)
+go func() {
+    defer wg.Done()
+    doWork()
+}()
+wg.Wait()
+```
+
+Reference: https://pkg.go.dev/sync#WaitGroup.Go
+
 ### Other Go Best Practices
 
 - Follow standard Go idioms and conventions

CLAUDE.md

Lines changed: 4 additions & 0 deletions
@@ -407,5 +407,9 @@ go tool cover -html=tmp/coverage.out # Generate HTML coverage report (de
 - **Backward Compatibility**: Not required since spanner-mycli is not used as an external library
 - **No Future-Proofing**: Since spanner-mycli is not used as a library, don't add parameters or abstractions for potential future use. Only implement what's needed now.
 - **Issue Management**: All fixes must go through Pull Requests - never close issues manually
+- **License Headers**:
+  - For new files created after the fork from spanner-cli: Use "Copyright [year] apstndb"
+  - For existing files that originated from spanner-cli: Keep "Copyright [year] Google LLC"
+  - This applies to all new logic and features added to spanner-mycli after forking
 
 For any detailed information not covered here, refer to the appropriate documentation in `dev-docs/` or `docs/`.

README.md

Lines changed: 37 additions & 0 deletions
@@ -142,6 +142,8 @@ spanner:
 --embedded-emulator Use embedded Cloud Spanner Emulator. --project, --instance, --database, --endpoint, --insecure will be automatically configured.
 --emulator-image= container image for --embedded-emulator
 --emulator-platform= Container platform (e.g. linux/amd64, linux/arm64) for embedded emulator
+--sample-database= Initialize emulator with specified sample database (banking, finance, finance-graph, finance-pg, gaming). Requires --embedded-emulator.
+--list-samples List available sample databases and exit
 --output-template= Filepath of output template. (EXPERIMENTAL)
 --log-level=
 --log-grpc Show gRPC logs
@@ -1066,6 +1068,41 @@ emulator-project:emulator-instance:emulator-database
 Empty set (8.763167ms)
 ```
 
+#### Sample Databases
+
+You can initialize the emulator with Google's official sample databases using the `--sample-database` flag:
+
+```bash
+# List available sample databases
+$ spanner-mycli --list-samples
+Available sample databases:
+
+  banking        GoogleSQL   Banking application with accounts and transactions
+  finance        GoogleSQL   Finance application schema (GoogleSQL)
+  finance-graph  GoogleSQL   Finance application with graph features
+  finance-pg     PostgreSQL  Finance application (PostgreSQL dialect)
+  gaming         GoogleSQL   Gaming application with players and scores
+
+Usage: spanner-mycli --embedded-emulator --sample-database=<name>
+
+# Start with the banking sample database
+$ spanner-mycli --embedded-emulator --sample-database=banking
+emulator-project:emulator-instance:emulator-database
+> SELECT COUNT(*) AS count FROM Accounts;
++-------+
+| count |
+| INT64 |
++-------+
+|    10 |
++-------+
+1 rows in set (2.76 ms)
+
+# Use PostgreSQL sample
+$ spanner-mycli --embedded-emulator --sample-database=finance-pg
+```
+
+The sample databases are automatically downloaded from Google Cloud Storage and initialized when the emulator starts. This provides realistic test data for development and demonstrations.
+
 > [!NOTE]
 > The embedded emulator has the same limitations as the standalone emulator. See the warning in the [Using with the Cloud Spanner Emulator](#using-with-the-cloud-spanner-emulator) section above for details.
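
The help text in this README diff states that `--sample-database` requires `--embedded-emulator`, and the commit message says this combination is validated. The validation code is not among the files shown, so the following is only a minimal sketch of such a check. The `cliOptions` struct, the `validateSampleFlags` function, and the error wording are all assumptions made for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// cliOptions is a hypothetical stand-in for the parsed command-line flags.
type cliOptions struct {
	EmbeddedEmulator bool   // --embedded-emulator
	SampleDatabase   string // --sample-database=<name>
}

// validateSampleFlags rejects --sample-database when --embedded-emulator is
// not set. The error text is illustrative, not the tool's actual message.
func validateSampleFlags(o cliOptions) error {
	if o.SampleDatabase != "" && !o.EmbeddedEmulator {
		return errors.New("--sample-database requires --embedded-emulator")
	}
	return nil
}

func main() {
	// Missing --embedded-emulator: an error is expected.
	fmt.Println(validateSampleFlags(cliOptions{SampleDatabase: "banking"}))
	// Both flags set: no error is expected.
	fmt.Println(validateSampleFlags(cliOptions{EmbeddedEmulator: true, SampleDatabase: "banking"}))
}
```

Checking the combination before any emulator startup or download work means a mistyped invocation fails immediately.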

cli.go

Lines changed: 3 additions & 25 deletions
@@ -263,32 +263,10 @@ func (c *Cli) updateSystemVariables(result *Result) {
 
 // executeSourceFile executes SQL statements from a file
 func (c *Cli) executeSourceFile(ctx context.Context, filePath string) error {
-	// Open the file to get a stable file handle
-	f, err := os.Open(filePath)
+	// Use common file safety checks (nil uses DefaultMaxFileSize - 100MB)
+	contents, err := SafeReadFile(filePath, nil)
 	if err != nil {
-		return fmt.Errorf("failed to open file %s: %w", filePath, err)
-	}
-	defer f.Close()
-
-	// Check if the file is a regular file to prevent DoS from special files
-	fi, err := f.Stat()
-	if err != nil {
-		return fmt.Errorf("failed to stat file %s: %w", filePath, err)
-	}
-	if !fi.Mode().IsRegular() {
-		return fmt.Errorf("sourcing from a non-regular file is not supported: %s", filePath)
-	}
-
-	// Add a check to prevent reading excessively large files.
-	const maxFileSize = 100 * 1024 * 1024 // 100MB
-	if fi.Size() > maxFileSize {
-		return fmt.Errorf("file %s is too large to be sourced (size: %d bytes, max: %d bytes)", filePath, fi.Size(), maxFileSize)
-	}
-
-	// Read the file contents from the opened handle
-	contents, err := io.ReadAll(f)
-	if err != nil {
-		return fmt.Errorf("failed to read file %s: %w", filePath, err)
+		return err
 	}
 
 	// Parse the contents using buildCommands (same as batch mode)
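
The removed block validated the already opened handle (`f.Stat()` followed by a read from `f`), while the `SafeReadFile` helper added in this commit calls `os.Stat` on the path and then `os.ReadFile` on the same path, so the path is resolved twice. If that gap matters for a given use, a handle-based helper could keep the earlier property. The sketch below is one possible shape under that assumption; `readRegularFileLimited` and `writeTempFile` are hypothetical names and are not part of this commit.

```go
package main

import (
	"fmt"
	"io"
	"os"
)

// readRegularFileLimited opens the file first, validates the opened handle,
// and caps the read at maxSize bytes. Hypothetical helper for illustration.
func readRegularFileLimited(path string, maxSize int64) ([]byte, error) {
	f, err := os.Open(path)
	if err != nil {
		return nil, fmt.Errorf("failed to open file %s: %w", path, err)
	}
	defer f.Close()

	// Stat the handle, so the checks describe the file that will be read.
	fi, err := f.Stat()
	if err != nil {
		return nil, fmt.Errorf("failed to stat file %s: %w", path, err)
	}
	if !fi.Mode().IsRegular() {
		return nil, fmt.Errorf("cannot read non-regular file %s (mode: %v)", path, fi.Mode())
	}
	if fi.Size() > maxSize {
		return nil, fmt.Errorf("file %s too large: %d bytes (max %d)", path, fi.Size(), maxSize)
	}

	// Read at most maxSize+1 bytes so growth after Stat is still detected.
	data, err := io.ReadAll(io.LimitReader(f, maxSize+1))
	if err != nil {
		return nil, fmt.Errorf("failed to read file %s: %w", path, err)
	}
	if int64(len(data)) > maxSize {
		return nil, fmt.Errorf("file %s too large: more than %d bytes", path, maxSize)
	}
	return data, nil
}

// writeTempFile creates a temporary file with the given contents and returns
// its path. It exists only to make the example runnable.
func writeTempFile(contents string) string {
	f, err := os.CreateTemp("", "source-*.sql")
	if err != nil {
		panic(err)
	}
	defer f.Close()
	if _, err := f.WriteString(contents); err != nil {
		panic(err)
	}
	return f.Name()
}

func main() {
	path := writeTempFile("SELECT 1;")
	defer os.Remove(path)

	data, err := readRegularFileLimited(path, 1024)
	fmt.Printf("%q %v\n", data, err)

	// A limit smaller than the file should produce a "too large" error.
	_, err = readRegularFileLimited(path, 4)
	fmt.Println(err)
}
```

One trade-off to keep in mind: on Unix, opening a named pipe for reading can block until a writer appears, so a stat-before-open check (as in the committed helper) rejects pipes earlier than this variant does.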

cli_test.go

Lines changed: 6 additions & 6 deletions
@@ -1578,8 +1578,8 @@ func TestCli_executeSourceFile_NonExistentFile(t *testing.T) {
 	err := cli.executeSourceFile(context.Background(), "/non/existent/file.sql")
 	if err == nil {
 		t.Error("Expected error for non-existent file")
-	} else if !strings.Contains(err.Error(), "failed to open file") {
-		t.Errorf("Expected error to contain 'failed to open file', got: %v", err)
+	} else if !strings.Contains(err.Error(), "no such file or directory") {
+		t.Errorf("Expected error to contain 'no such file or directory', got: %v", err)
 	}
 }
 
@@ -1599,8 +1599,8 @@ func TestCli_executeSourceFile_NonRegularFile(t *testing.T) {
 	err := cli.executeSourceFile(context.Background(), "/dev/null")
 	if err == nil {
 		t.Error("Expected error for non-regular file")
-	} else if !strings.Contains(err.Error(), "sourcing from a non-regular file is not supported") {
-		t.Errorf("Expected error to contain 'sourcing from a non-regular file is not supported', got: %v", err)
+	} else if !strings.Contains(err.Error(), "cannot read") {
+		t.Errorf("Expected error to contain 'cannot read', got: %v", err)
 	}
 }
 
@@ -1636,7 +1636,7 @@ func TestCli_executeSourceFile_FileTooLarge(t *testing.T) {
 	err = cli.executeSourceFile(context.Background(), tmpFile.Name())
 	if err == nil {
 		t.Error("Expected error for file too large")
-	} else if !strings.Contains(err.Error(), "is too large to be sourced") {
-		t.Errorf("Expected error to contain 'is too large to be sourced', got: %v", err)
+	} else if !strings.Contains(err.Error(), "too large") {
+		t.Errorf("Expected error to contain 'too large', got: %v", err)
 	}
 }
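
The updated non-existent-file test matches on the text "no such file or directory", which is the operating system's wording and can differ across platforms. Because `SafeReadFile` wraps the `os.Stat` error with `%w`, the sentinel `fs.ErrNotExist` should remain reachable through `errors.Is`. The sketch below illustrates that alternative check; `statWrapped` and `isNotExist` are hypothetical helpers that only mimic the wrapping style seen in this commit.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
)

// statWrapped mimics the wrapping style used by SafeReadFile in this commit:
// the os.Stat error is wrapped with %w, so the cause stays inspectable.
func statWrapped(path string) error {
	if _, err := os.Stat(path); err != nil {
		return fmt.Errorf("failed to stat file %s: %w", path, err)
	}
	return nil
}

// isNotExist reports whether err wraps fs.ErrNotExist, independent of the
// platform-specific message text.
func isNotExist(err error) bool {
	return errors.Is(err, fs.ErrNotExist)
}

func main() {
	err := statWrapped("/non/existent/file.sql")
	fmt.Println("error:", err)
	fmt.Println("is not-exist:", isNotExist(err))
}
```

In a test, `errors.Is(err, fs.ErrNotExist)` could stand in for the `strings.Contains` comparison without depending on the exact message.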

file_safety.go

Lines changed: 92 additions & 0 deletions
@@ -0,0 +1,92 @@
+//
+// Copyright 2025 apstndb
+//
+// Licensed under the Apache License, Version 2.0 (the "License");
+// you may not use this file except in compliance with the License.
+// You may obtain a copy of the License at
+//
+// http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+//
+
+package main
+
+import (
+	"fmt"
+	"os"
+)
+
+const (
+	// DefaultMaxFileSize is the default maximum file size for safety checks (100MB)
+	DefaultMaxFileSize = 100 * 1024 * 1024 // 100MB
+
+	// SampleDatabaseMaxFileSize is the maximum file size for sample database files (10MB)
+	// Sample databases should be smaller since they're downloaded from remote sources
+	SampleDatabaseMaxFileSize = 10 * 1024 * 1024 // 10MB
+)
+
+// FileSafetyOptions configures file safety checks
+type FileSafetyOptions struct {
+	// MaxSize is the maximum allowed file size (0 means use DefaultMaxFileSize)
+	MaxSize int64
+	// AllowNonRegular allows reading from non-regular files (not recommended)
+	AllowNonRegular bool
+}
+
+// ValidateFileSafety checks if a file is safe to read based on the given options
+func ValidateFileSafety(fi os.FileInfo, path string, opts *FileSafetyOptions) error {
+	if opts == nil {
+		opts = &FileSafetyOptions{}
+	}
+
+	maxSize := opts.MaxSize
+	if maxSize == 0 {
+		maxSize = DefaultMaxFileSize
+	}
+
+	// Check for special files (devices, named pipes, sockets, etc.)
+	if !opts.AllowNonRegular {
+		if !fi.Mode().IsRegular() {
+			// Check specific modes for better error messages
+			mode := fi.Mode()
+			switch {
+			case mode&os.ModeDevice != 0:
+				return fmt.Errorf("cannot read device file %s", path)
+			case mode&os.ModeNamedPipe != 0:
+				return fmt.Errorf("cannot read named pipe %s", path)
+			case mode&os.ModeSocket != 0:
+				return fmt.Errorf("cannot read socket file %s", path)
+			case mode&os.ModeCharDevice != 0:
+				return fmt.Errorf("cannot read character device %s", path)
+			default:
+				return fmt.Errorf("cannot read special file %s (mode: %v)", path, mode)
+			}
+		}
+	}
+
+	// Check file size
+	if fi.Size() > maxSize {
+		return fmt.Errorf("file %s too large: %d bytes (max %d)", path, fi.Size(), maxSize)
+	}
+
+	return nil
+}
+
+// SafeReadFile reads a file after performing safety checks
+func SafeReadFile(path string, opts *FileSafetyOptions) ([]byte, error) {
+	fi, err := os.Stat(path)
+	if err != nil {
+		return nil, fmt.Errorf("failed to stat file %s: %w", path, err)
+	}
+
+	if err := ValidateFileSafety(fi, path, opts); err != nil {
+		return nil, err
+	}
+
+	return os.ReadFile(path)
+}
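
A note on the mode switch in `ValidateFileSafety`: the Go documentation describes `ModeCharDevice` as a bit that is set together with `ModeDevice`, so with `ModeDevice` tested first the "character device" branch appears to be unreachable, and a path such as `/dev/null` would be reported as a "device file". The sketch below shows one ordering that distinguishes the two. `describeMode` and the two example mode variables are hypothetical and not part of this commit.

```go
package main

import (
	"fmt"
	"io/fs"
)

// Example modes used by the demo; the names are made up for this sketch.
var (
	charDeviceMode  = fs.ModeDevice | fs.ModeCharDevice
	blockDeviceMode = fs.ModeDevice
)

// describeMode names the kind of file a mode represents. The more specific
// ModeCharDevice test comes before the general ModeDevice test.
func describeMode(mode fs.FileMode) string {
	switch {
	case mode.IsRegular():
		return "regular file"
	case mode.IsDir():
		return "directory"
	case mode&fs.ModeCharDevice != 0:
		return "character device"
	case mode&fs.ModeDevice != 0:
		return "device file"
	case mode&fs.ModeNamedPipe != 0:
		return "named pipe"
	case mode&fs.ModeSocket != 0:
		return "socket file"
	default:
		return "special file"
	}
}

func main() {
	for _, m := range []fs.FileMode{0o644, charDeviceMode, blockDeviceMode, fs.ModeNamedPipe, fs.ModeSocket} {
		fmt.Printf("%v => %s\n", m, describeMode(m))
	}
}
```

Either ordering still satisfies the updated `cli_test.go` expectation, since both messages contain "cannot read"; the difference is only in how precise the message is.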
