Commit 7dd4728

fix(db): harden database architecture and robust restoration logic
- Upgrade database to WAL mode and add busy timeout for better concurrency.
- Implement `MainStore::atomic_restore` to safely replace DB files without corruption.
- Remove redundant database initialization calls that caused production locking.
- Clean up SQLite temporary files (-wal, -shm) before physical restoration.
- Update release notes to v1.2.4 and document database concurrency risks in GEMINI.md.
1 parent 6b56d6b commit 7dd4728

File tree

8 files changed

+206
-206
lines changed

GEMINI.md

Lines changed: 12 additions & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -115,3 +115,15 @@ This section outlines the coding standards, architectural patterns, and workflow
115115
5. **Correct**: Iteratively fix any issues found during verification.
116116
6. **Report**: Provide a concise summary of the resolution.
117117
- **Git Protocol**: Do not proactively stage or commit changes unless explicitly requested by the user.
118+
119+
### 7. Database Concurrency & Deadlock Risks
120+
To ensure stability in production environments where background services (like CCProxy) and the main UI may access the database simultaneously, follow these rules:
121+
- **Enable WAL Mode**: Always use `PRAGMA journal_mode=WAL;` to allow multiple readers and one writer to coexist without blocking.
122+
- **Busy Timeout**: Always set a `busy_timeout` (e.g., 5 seconds) on the SQLite connection to handle transient locks gracefully.
123+
- **Avoid "Double Open"**: Ensure the database file is opened only once per process. Redundant calls to `Connection::open` on the same file can lead to `readonly database` errors or file corruption in production.
124+
- **Atomic Restoration**: When performing physical file replacement (restoration):
125+
1. **Disconnect**: Close all active connections (or swap with an in-memory DB) to release file handles.
126+
2. **Cleanup**: Delete temporary `-wal` and `-shm` files to prevent startup crashes due to file mismatch.
127+
3. **Replace**: Physically move/rename the new database file.
128+
4. **Reconnect**: Establish a fresh connection and re-enable WAL mode.
129+
- **Locking Hierarchy**: Be mindful of the `Arc<RwLock<MainStore>>` (Outer) and `Mutex<Connection>` (Inner) hierarchy. Ensure that restoration and high-concurrency operations are performed as single atomic blocks within the outer lock to prevent Rust-level deadlocks.
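
The four restoration steps and the locking rule above can be sketched with the standard library alone. This is a minimal illustration, not the project's `MainStore::atomic_restore`: `Store`, `restore_from`, `restore_shared`, and `sidecar` are hypothetical names, and a plain `File` handle stands in for the SQLite connection, so re-enabling WAL mode on reconnect is not shown.

```rust
use std::fs::{self, File};
use std::io;
use std::path::{Path, PathBuf};
use std::sync::{Arc, Mutex, RwLock};

/// Hypothetical stand-in for the real store. `File` plays the role of the
/// SQLite connection; the `Mutex` is the inner lock.
pub struct Store {
    db_path: PathBuf,
    conn: Mutex<Option<File>>,
}

/// Builds `<db>-wal` / `<db>-shm` by appending to the full file name.
fn sidecar(db_path: &Path, suffix: &str) -> PathBuf {
    let mut name = db_path.as_os_str().to_os_string();
    name.push(suffix);
    PathBuf::from(name)
}

impl Store {
    pub fn open(db_path: &Path) -> io::Result<Self> {
        let conn = File::open(db_path)?;
        Ok(Self {
            db_path: db_path.to_path_buf(),
            conn: Mutex::new(Some(conn)),
        })
    }

    /// Taking `&mut self` means the caller already holds the outer write
    /// guard, so no other thread can observe the half-restored state.
    pub fn restore_from(&mut self, staged: &Path) -> io::Result<()> {
        let conn = self
            .conn
            .get_mut()
            .map_err(|_| io::Error::new(io::ErrorKind::Other, "connection mutex poisoned"))?;

        // 1. Disconnect: drop the handle so the file is no longer held open.
        *conn = None;

        // 2. Cleanup: stale sidecar files belong to the old database.
        for suffix in ["-wal", "-shm"] {
            let _ = fs::remove_file(sidecar(&self.db_path, suffix));
        }

        // 3. Replace: move the staged file over the live one.
        fs::rename(staged, &self.db_path)?;

        // 4. Reconnect: open a fresh handle on the new file.
        *conn = Some(File::open(&self.db_path)?);
        Ok(())
    }
}

/// Runs the whole sequence inside one acquisition of the outer lock.
pub fn restore_shared(store: &Arc<RwLock<Store>>, staged: &Path) -> io::Result<()> {
    let mut guard = store
        .write()
        .map_err(|_| io::Error::new(io::ErrorKind::Other, "store lock poisoned"))?;
    guard.restore_from(staged)
}
```

Because `restore_from` takes `&mut self`, it reaches the inner `Mutex` through `get_mut()` and never locks it, so the outer-then-inner ordering cannot invert within this function.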

RELEASE.md

Lines changed: 4 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -2,15 +2,18 @@
22

33
# Release Notes
44

5-
## [1.2.3]
5+
## [1.2.4]
66

77
### 🪄 Improvements
88

9+
- **Database Architecture Hardening**: Upgraded the database engine to use **WAL (Write-Ahead Logging)** mode and implemented a 5-second **Busy Timeout**. These changes significantly improve concurrency, allowing the background proxy (CCProxy) and the main chat interface to access the database simultaneously without locking issues.
10+
- **Atomic Restoration with State Preservation**: Completely refactored the database restoration process. The system now performs a safe, atomic "disconnect-replace-reconnect" sequence that prevents file corruption. It also preserves machine-specific settings (window positions, sizes, and network proxy configurations) during restoration, ensuring a seamless experience when migrating data.
911
- **Mission-Critical Stability Hardening**: Performed a comprehensive audit and refactoring of the application's startup sequence. All hardcoded `.expect()` and `.unwrap()` calls in critical paths (logging, database path resolution, and window management) have been replaced with graceful error handling. This ensures the application remains operational even in highly restricted environments like Windows Server 2019.
1012
- **Graceful Logging Fallback**: The logging system now automatically degrades to console-only output if the designated log directory is unwritable or inaccessible, preventing immediate startup crashes.
1113

1214
### 🐞 Bug Fixes
1315

16+
- **Production Database Locking**: Resolved a critical "attempt to write a readonly database" error in production environments caused by redundant file handle requests during initialization.
1417
- **Proxy Routing Precedence**: Resolved a critical routing conflict where generic group paths (e.g., `/{group}/...`) could incorrectly intercept specific functional prefixes like `/switch`, `/compat`, or `/compat_mode`. This fix ensures correct dispatching for all access modes and resolves 404 errors when using combined paths.
1518
- **Compatibility Mode Alias**: Introduced the `compat` shorthand alias as a convenient alternative to `compat_mode` (e.g., `/group/compat/v1/messages`), improving API call ergonomics.
1619
- **Proxy Statistics Calibration**: Fixed a critical issue where output tokens were reported as 0 in `tool_compat_mode` and `direct_forward` modes. The system now accurately estimates tokens for **Reasoning/Thinking content** and **Tool Call Arguments** across all supported protocols (OpenAI, Claude, Gemini, Ollama).

RELEASE.zh-CN.md

Lines changed: 4 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -2,15 +2,18 @@
22

33
# Release Notes
44

5-
## [1.2.3]
5+
## [1.2.4]
66

77
### 🪄 Improvements
88

9+
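- **Database Architecture Hardening**: Upgraded the database engine to **WAL (Write-Ahead Logging)** mode and implemented a 5-second **Busy Timeout**. These improvements significantly boost concurrency, allowing the background proxy (CCProxy) and the main chat interface to access the database simultaneously without lock conflicts.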
10+
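- **Atomic Restoration with State Preservation**: Completely refactored the database restoration flow. The system now performs a safe, atomic "disconnect-replace-reconnect" sequence that prevents file corruption. During restoration it also automatically preserves machine-specific settings (such as window position, size, and network proxy configuration), ensuring a seamless experience after data migration.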
911
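- **Startup Path Robustness Hardening**: Performed a comprehensive audit and refactoring of the application startup sequence. Removed all hardcoded `.expect()` and `.unwrap()` calls in critical paths (such as the logging system, database path resolution, and window management). This ensures the program still starts smoothly, without crashing, in highly permission-restricted environments such as Windows Server 2019.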
1012
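- **Automatic Logging Degradation**: When the log directory is unwritable or inaccessible, the logging system now automatically degrades to console-only output, eliminating startup panics caused by disk write permissions.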
1113

1214
### 🐞 Bug Fixes
1315

16+
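- **Production Database Locking Fix**: Resolved a critical "attempt to write a readonly database" error in production environments caused by redundant file handle requests during initialization.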
1417
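- **Proxy Routing Precedence Fix**: Resolved a conflict where wildcard group paths (e.g., `/{group}/...`) could unintentionally intercept specific functional prefixes such as `/switch`, `/compat`, and `/compat_mode`. This fix ensures that all access modes are dispatched correctly and resolves 404 errors that could occur with combined paths.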
1518
- **Compatibility Mode Alias Support**: Added the `compat` shorthand alias as a convenient alternative to `compat_mode` (e.g., `/group/compat/v1/messages`), improving API ergonomics.
1619
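- **Proxy Statistics Calibration**: Fixed an issue where output token counts were always 0 in `tool_compat_mode` and `direct_forward` modes. The system now accurately estimates token consumption for **Reasoning/Thinking content** and **Tool Call Arguments** across all protocols (OpenAI, Claude, Gemini, Ollama).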

src-tauri/src/commands/setting.rs

Lines changed: 47 additions & 71 deletions
Original file line numberDiff line numberDiff line change
@@ -32,7 +32,7 @@
3232
//!
3333
3434
use crate::constants::*;
35-
use crate::db::{AiModel, AiSkill, MainStore, ModelConfig};
35+
use crate::db::{AiModel, AiSkill, MainStore, ModelConfig, StoreError};
3636
use crate::db::{BackupConfig, DbBackup};
3737
use crate::libs::fs::{self, get_file_name};
3838
use crate::tray::create_tray;
@@ -710,18 +710,10 @@ fn upload_logo(image_path: String) -> Result<String> {
710710
// Backup
711711
// =================================================
712712
#[tauri::command]
713-
pub async fn backup_setting(
714-
app: AppHandle,
715-
backup_dir: Option<String>,
716-
) -> Result<()> {
713+
pub async fn backup_setting(app: AppHandle, backup_dir: Option<String>) -> Result<()> {
717714
let result = tokio::spawn(async move {
718-
DbBackup::new(
719-
&app,
720-
BackupConfig {
721-
backup_dir,
722-
},
723-
)
724-
.and_then(|mut backup| backup.backup_to_directory())
715+
DbBackup::new(&app, BackupConfig { backup_dir })
716+
.and_then(|mut backup| backup.backup_to_directory())
725717
})
726718
.await
727719
.map_err(|e| AppError::General {
@@ -751,75 +743,59 @@ pub async fn restore_setting(
751743
"proxyPassword",
752744
];
753745

754-
// 2. Backup current machine-specific configurations
755-
let mut preserved_configs = HashMap::new();
756-
{
757-
let config_store = state.read()?;
758-
for &key in &machine_specific_keys {
759-
if let Some(value) = config_store.config.get_setting(key) {
760-
preserved_configs.insert(key.to_string(), value.clone());
761-
}
762-
}
763-
}
746+
// 2. Prepare paths and backup instance
747+
let theme_dir = HTTP_SERVER_THEME_DIR.read().clone();
748+
let upload_dir = HTTP_SERVER_UPLOAD_DIR.read().clone();
749+
let schema_dir = SCHEMA_DIR.read().clone();
750+
let shared_dir = SHARED_DATA_DIR.read().clone();
751+
let static_dir = HTTP_SERVER_DIR.read().clone();
752+
let store_dir = STORE_DIR.read().clone();
753+
let mcp_sessions_dir = store_dir.join("mcp_sessions");
754+
let main_db_path = store_dir.join("chatspeed.db");
764755

765-
let result = tokio::spawn(async move {
766-
let theme_dir = HTTP_SERVER_THEME_DIR.read().clone();
767-
let upload_dir = HTTP_SERVER_UPLOAD_DIR.read().clone();
768-
let schema_dir = SCHEMA_DIR.read().clone();
769-
let shared_dir = SHARED_DATA_DIR.read().clone();
770-
let static_dir = HTTP_SERVER_DIR.read().clone();
771-
let store_dir = STORE_DIR.read().clone();
772-
let mcp_sessions_dir = store_dir.join("mcp_sessions");
773-
DbBackup::new(
774-
&app,
775-
BackupConfig {
776-
backup_dir: Some(backup_dir.clone()),
777-
},
778-
)
779-
.and_then(|db_backup| {
780-
db_backup.restore_from_directory(
781-
&Path::new(&backup_dir),
782-
&Path::new(&*theme_dir),
783-
&Path::new(&*upload_dir),
784-
&Path::new(&mcp_sessions_dir),
785-
&Path::new(&*schema_dir),
786-
&Path::new(&*shared_dir),
787-
&Path::new(&*static_dir),
788-
)
789-
})
790-
})
791-
.await
792-
.map_err(|e| AppError::General {
793-
message: e.to_string(),
794-
})?;
756+
let db_backup = DbBackup::new(
757+
&app,
758+
BackupConfig {
759+
backup_dir: Some(backup_dir.clone()),
760+
},
761+
)
762+
.map_err(AppError::Db)?;
795763

796-
result.map_err(AppError::Db)?;
764+
// 3. Decrypt database to a temporary file FIRST
765+
let backup_db_file = Path::new(&backup_dir).join("chatspeed.db");
766+
let temp_db_path = db_backup
767+
.decrypt_to_temp(&backup_db_file, &main_db_path)
768+
.map_err(AppError::Db)?;
797769

798-
// 3. Reload the configuration from the newly restored database file
799-
let mut config_store = state.write()?;
800-
config_store.reload_config().map_err(AppError::Db)?;
801-
802-
// 4. Restore the preserved configurations to the new database
803-
for key in machine_specific_keys {
804-
if let Some(value) = preserved_configs.get(key) {
805-
config_store.set_config(key, value).map_err(AppError::Db)?;
806-
} else {
807-
// If the key didn't exist before, ensure it doesn't exist in the restored DB either
808-
let _ = config_store.delete_config(key);
809-
}
770+
// 4. Perform atomic database restoration
771+
{
772+
let mut config_store = state
773+
.write()
774+
.map_err(|e| AppError::Db(StoreError::LockError(e.to_string())))?;
775+
config_store
776+
.atomic_restore(&temp_db_path, &main_db_path, &machine_specific_keys)
777+
.map_err(AppError::Db)?;
810778
}
811779

780+
// 5. Restore user files (static assets, etc.)
781+
db_backup
782+
.restore_user_files(
783+
&Path::new(&backup_dir).join("user_files.zip"),
784+
&Path::new(&*theme_dir),
785+
&Path::new(&*upload_dir),
786+
&Path::new(&mcp_sessions_dir),
787+
&Path::new(&*schema_dir),
788+
&Path::new(&*shared_dir),
789+
&Path::new(&*static_dir),
790+
)
791+
.map_err(AppError::Db)?;
792+
812793
Ok(())
813794
}
814795

815796
#[tauri::command]
816797
pub fn get_all_backups(app: AppHandle, backup_dir: Option<String>) -> Result<Vec<String>> {
817-
let db_backup = DbBackup::new(
818-
&app,
819-
BackupConfig {
820-
backup_dir,
821-
},
822-
)?;
798+
let db_backup = DbBackup::new(&app, BackupConfig { backup_dir })?;
823799

824800
let backups = db_backup.list_backups().map_err(AppError::Db)?;
825801
Ok(backups
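
The removed lines in `restore_setting` preserved machine-specific settings inline. The new code instead passes `machine_specific_keys` to `atomic_restore`, whose body is not part of this excerpt. Assuming it keeps the semantics of the removed code (copy each key from the pre-restore config, and delete the key from the restored data if it did not exist before), the merge could look like the sketch below. `carry_over_machine_settings` and the `HashMap` representation are hypothetical, used only to make the rule concrete.

```rust
use std::collections::HashMap;

/// Overlays machine-specific settings from the pre-restore config onto the
/// restored config. Keys absent before the restore are removed afterwards,
/// mirroring the `delete_config` branch in the removed code.
pub fn carry_over_machine_settings(
    current: &HashMap<String, String>,
    restored: &mut HashMap<String, String>,
    machine_specific_keys: &[&str],
) {
    for &key in machine_specific_keys {
        match current.get(key) {
            // The local machine had a value: it wins over the backup's value.
            Some(value) => {
                restored.insert(key.to_string(), value.clone());
            }
            // The local machine had no value: do not import the backup's.
            None => {
                restored.remove(key);
            }
        }
    }
}
```

Settings outside `machine_specific_keys` are left exactly as they came from the backup, which is what makes the restore a migration of user data rather than of machine state.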

src-tauri/src/db/backup.rs

Lines changed: 21 additions & 82 deletions
Original file line numberDiff line numberDiff line change
@@ -45,10 +45,12 @@ impl DbBackup {
4545

4646
#[cfg(not(debug_assertions))]
4747
let app_dir = {
48-
let app_local_data_dir = _app
49-
.path()
50-
.app_data_dir()
51-
.map_err(|e| StoreError::TauriError(format!("Failed to retrieve the application data directory: {}", e)))?;
48+
let app_local_data_dir = _app.path().app_data_dir().map_err(|e| {
49+
StoreError::TauriError(format!(
50+
"Failed to retrieve the application data directory: {}",
51+
e
52+
))
53+
})?;
5254
std::fs::create_dir_all(&app_local_data_dir)
5355
.map_err(|e| StoreError::TauriError(e.to_string()))?;
5456
app_local_data_dir
@@ -116,39 +118,35 @@ impl DbBackup {
116118
Ok(backup_path)
117119
}
118120

119-
/// Restores a single database from a backup file with decryption
120-
fn restore_single_db(&self, backup_path: &Path, target_path: &Path) -> Result<(), StoreError> {
121+
/// Decrypts a backup database to a temporary file.
122+
pub fn decrypt_to_temp(
123+
&self,
124+
backup_path: &Path,
125+
target_path: &Path,
126+
) -> Result<PathBuf, StoreError> {
121127
if !backup_path.exists() {
122128
return Err(StoreError::NotFound(
123129
t!("db.backup.file_not_found", path = backup_path.display()).to_string(),
124130
));
125131
}
126132

127-
// Get the backup directory name (timestamp format)
128133
let backup_dir_name = backup_path
129134
.parent()
130135
.and_then(|p| p.file_name())
131136
.and_then(|n| n.to_str())
132137
.unwrap_or("unknown");
133138

134-
// Use streaming decryption for database
135139
let temp_file = target_path.with_extension("tmp");
136140
decrypt_database_streaming(backup_path, &temp_file, backup_dir_name)?;
141+
Ok(temp_file)
142+
}
137143

138-
// Atomically replace original file
139-
fs::rename(&temp_file, target_path).map_err(|e| {
140-
error!("Failed to rename temp database file: {}", e);
141-
StoreError::IoError(
142-
t!(
143-
"db.backup.failed_to_rename_temp_db",
144-
path = target_path.display(),
145-
error = e.to_string()
146-
)
147-
.to_string(),
148-
)
149-
})?;
150-
151-
Ok(())
144+
/// Cleans up SQLite temporary files (-wal, -shm) for a given database path.
145+
pub fn cleanup_sqlite_temporaries(db_path: &Path) {
146+
let wal_path = db_path.with_extension("db-wal");
147+
let shm_path = db_path.with_extension("db-shm");
148+
let _ = fs::remove_file(wal_path);
149+
let _ = fs::remove_file(shm_path);
152150
}
153151

154152
/// Lists all database backup directories in the backup directory, sorted by modification time.
@@ -260,65 +258,6 @@ impl DbBackup {
260258
Ok(())
261259
}
262260

263-
/// Restores databases and user files from a backup directory
264-
///
265-
/// # Arguments
266-
///
267-
/// * `backup_dir` - Path to the backup directory
268-
/// * `theme_dir` - Path to restore theme files
269-
/// * `upload_dir` - Path to restore uploaded files
270-
/// * `mcp_sessions_dir` - Path to restore MCP sessions
271-
/// * `schema_dir` - Path to restore schema files
272-
/// * `shared_dir` - Path to restore shared files
273-
/// * `static_dir` - Path to restore static files
274-
///
275-
/// # Errors
276-
///
277-
/// Returns a `StoreError` if any restore operation fails
278-
pub fn restore_from_directory(
279-
&self,
280-
backup_dir: &Path,
281-
theme_dir: &Path,
282-
upload_dir: &Path,
283-
mcp_sessions_dir: &Path,
284-
schema_dir: &Path,
285-
shared_dir: &Path,
286-
static_dir: &Path,
287-
) -> Result<(), StoreError> {
288-
// Verify backup directory exists
289-
if !backup_dir.exists() || !backup_dir.is_dir() {
290-
return Err(StoreError::NotFound(
291-
t!(
292-
"db.backup.dir_not_found_for_restore",
293-
path = backup_dir.display()
294-
)
295-
.to_string(),
296-
));
297-
}
298-
299-
// Check for chatspeed.db
300-
let main_backup = backup_dir.join("chatspeed.db");
301-
if main_backup.exists() {
302-
self.restore_single_db(&main_backup, &self.main_db_path)?;
303-
}
304-
305-
// Check for user_files.zip
306-
let user_files = backup_dir.join("user_files.zip");
307-
if user_files.exists() {
308-
self.restore_user_files(
309-
&user_files,
310-
theme_dir,
311-
upload_dir,
312-
mcp_sessions_dir,
313-
schema_dir,
314-
shared_dir,
315-
static_dir,
316-
)?;
317-
}
318-
319-
Ok(())
320-
}
321-
322261
/// Restores user files from a backup zip file
323262
///
324263
/// # Arguments
@@ -334,7 +273,7 @@ impl DbBackup {
334273
/// # Errors
335274
///
336275
/// Returns a `StoreError` if restore operation fails
337-
fn restore_user_files(
276+
pub fn restore_user_files(
338277
&self,
339278
zip_path: &Path,
340279
theme_dir: &Path,
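
A note on `cleanup_sqlite_temporaries`: `with_extension("db-wal")` replaces the existing extension, so it yields the right sidecar name only when the database file itself ends in `.db`, as `chatspeed.db` does here. SQLite names these files by appending `-wal` and `-shm` to the full database file name. A variant that appends rather than replaces might look like the following sketch; `sqlite_sidecar_path` is a hypothetical helper, not part of this commit.

```rust
use std::path::{Path, PathBuf};

/// Returns the sidecar path for `db_path` by appending `suffix`
/// (for example "-wal" or "-shm") to the complete file name.
pub fn sqlite_sidecar_path(db_path: &Path, suffix: &str) -> PathBuf {
    let mut name = db_path.as_os_str().to_os_string();
    name.push(suffix);
    PathBuf::from(name)
}
```

For `data/chatspeed.db` both approaches give `data/chatspeed.db-wal`. For a file named `data/store.sqlite`, `with_extension("db-wal")` gives `data/store.db-wal`, while the appending helper gives `data/store.sqlite-wal`.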
