
Feature document database migration #15


Merged

merged 23 commits into main from feature_document_database_migration on Jul 12, 2025

Conversation


@fulleni fulleni commented Jul 12, 2025


fulleni added 23 commits July 11, 2025 11:56
- Introduces a new `ownershipCheckMiddleware` to centralize the logic
  for verifying if a user owns a specific data item.
- The middleware is configuration-driven, using `ModelConfig` to
  determine if a check is necessary for the requested action.
- If a check is required, it fetches the item and provides it to the
  downstream handler, preventing redundant database reads.
- Throws a `ForbiddenException` if the ownership check fails.
- Adds a `FetchedItem` wrapper class for type-safe context provision.
- Creates a new `_middleware.dart` file for the `/[id]` path.
- Applies the `ownershipCheckMiddleware` to all item-specific
  requests (GET, PUT, DELETE).
- This ensures ownership is verified after authentication and
  authorization checks have passed but before the route handler
  is executed.
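A rough sketch of how such a middleware might look in Dart Frog follows. The `User`, `ModelConfig`, `Repository`, and `ForbiddenException` types, and the `requiresOwnershipCheck` helper, are assumptions based on names mentioned in this PR, not the actual implementation:

```dart
import 'package:dart_frog/dart_frog.dart';

/// Type-safe wrapper that hands the pre-fetched item to downstream
/// handlers, so the route handler does not re-read the database.
class FetchedItem<T> {
  const FetchedItem(this.data);
  final T data;
}

Middleware ownershipCheckMiddleware() {
  return (handler) {
    return (context) async {
      final user = context.read<User>();
      final modelConfig = context.read<ModelConfig<dynamic>>();

      // Configuration-driven: skip the check entirely when the model
      // or the requested action does not require one.
      if (!modelConfig.requiresOwnershipCheck(context.request.method)) {
        return handler(context);
      }

      // Fetch the item once here; it is provided to the handler below.
      final id = context.request.uri.pathSegments.last;
      final item = await context.read<Repository>().read(id: id);

      if (item.ownerId != user.id) {
        throw ForbiddenException('User ${user.id} does not own item $id.');
      }

      return handler(
        context.provide<FetchedItem<dynamic>>(() => FetchedItem(item)),
      );
    };
  };
}
```

Because the middleware runs after authentication and authorization middleware, `context.read<User>()` can safely assume an authenticated user is present.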
- Rewrites the `_handleGet` method in the collection route handler
  to accept a single, JSON-encoded `filter` parameter.
- Removes the complex logic for translating individual URL parameters
  into a query map.
- Eliminates the `_camelToSnake` helper, as the API now passes
  native model field names to the data layer.
- This change simplifies the API handler and aligns it with a
  document-native query model, making it more flexible and robust.
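The simplified handler might look roughly like this; the `Repository` type, the response shape, and the exact query parameter handling are illustrative assumptions:

```dart
import 'dart:convert';

import 'package:dart_frog/dart_frog.dart';

Future<Response> _handleGet(RequestContext context) async {
  // The client sends a single JSON-encoded parameter, e.g.
  //   GET /api/v1/data?model=headline&filter={"status":"active"}
  final rawFilter = context.request.uri.queryParameters['filter'];

  Map<String, dynamic> filter = const {};
  if (rawFilter != null && rawFilter.isNotEmpty) {
    try {
      filter = jsonDecode(rawFilter) as Map<String, dynamic>;
    } on FormatException {
      return Response(statusCode: 400, body: 'Invalid `filter` JSON.');
    }
  }

  // No per-field translation or _camelToSnake mapping: the filter uses
  // native model field names and is passed straight to the data layer.
  final items = await context.read<Repository>().readAll(filter: filter);
  return Response.json(body: items);
}
```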
- Rewrites `DatabaseSeedingService` to work with a MongoDB `Db` instance.
- Removes table creation logic, as MongoDB collections are schemaless.
- Implements idempotent seeding using `bulkWrite` with `upsert` operations,
  preventing duplicate data on subsequent runs.
- Correctly handles the conversion of string IDs from fixtures to MongoDB
  `ObjectId` for the `_id` field.
- Ensures complex nested objects in fixtures are properly JSON-encoded
  before insertion.
- Removes `ht_data_postgres` and `postgres` from the `pubspec.yaml`.
- This is a key step in migrating the data layer from PostgreSQL to
  MongoDB, severing the project's dependency on the old database
  implementation.
- Rewrites `AppDependencies` to use `MongoDbConnectionManager` instead of
  the PostgreSQL equivalent.
- Initializes the MongoDB connection and runs the new `DatabaseSeedingService`.
- Instantiates `HtDataMongodb` clients for all data models.
- Wires up all `HtDataRepository` instances and application services to use
  the new MongoDB-backed data layer.
- This completes the core dependency injection part of the migration.
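In outline, the wiring might look like the following; all class names and constructor parameters here are assumptions inferred from this PR's description:

```dart
// Sketch of the MongoDB-backed dependency wiring, not the actual code.
Future<void> init(String databaseUrl) async {
  final connectionManager = MongoDbConnectionManager();
  await connectionManager.init(databaseUrl);

  // Idempotent seeding runs once on startup.
  await DatabaseSeedingService(connectionManager.db).seed();

  // One MongoDB-backed client per model, wrapped in the generic repository.
  final headlineClient = HtDataMongodb<Headline>(
    connectionManager: connectionManager,
    modelName: 'headlines',
    fromJson: Headline.fromJson,
    toJson: (headline) => headline.toJson(),
  );
  final headlineRepository =
      HtDataRepository<Headline>(dataClient: headlineClient);
  // ...and likewise for topics, sources, users, settings, etc.
}
```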
- Updates the prerequisites in the README to specify MongoDB instead of
  PostgreSQL.
- Changes the example `DATABASE_URL` to a MongoDB connection string.
- Rewrites the description of the server startup process to reflect
  idempotent seeding of MongoDB collections.
- Removes a duplicate bullet point for clarity.
- Replaces the obsolete `readAllByQuery` method with the new `readAll`
  method in `AuthService`.
- Passes the query conditions as a `filter` map, aligning the service
  with the new document-based data access pattern.
- This change completes the service-level refactoring for the MongoDB
  migration.
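The shape of the change is roughly this (the surrounding repository and parameter names are assumptions based on this PR):

```dart
// The obsolete relational-style call:
//   await userRepository.readAllByQuery(...);
// becomes a document-native filter map passed to readAll:
final existingUsers = await userRepository.readAll(
  filter: {'email': email},
);
```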
- Changes the example DATABASE_URL to the standard MongoDB format.
- Refactors `EnvironmentConfig` to actively search for the `.env` file
  by traversing up the directory tree from the current working directory.
- This change makes the server startup process resilient to cases where
  the Dart Frog development server's execution context is not the
  project root, resolving the "file not found" error for `.env`.
```dart
  /// Loads environment variables, searching upward from the current
  /// working directory until it finds the project root (marked by
  /// `pubspec.yaml`) and loads the `.env` file located there.
  static DotEnv _loadEnv() {
    final env = DotEnv(includePlatformEnvironment: true);
    try {
      // Find the project root by looking for pubspec.yaml, then find .env.
      var dir = Directory.current;
      while (true) {
        final pubspecFile = File('${dir.path}/pubspec.yaml');
        if (pubspecFile.existsSync()) {
          // Found the project root; now look for .env in this directory.
          final envFile = File('${dir.path}/.env');
          if (envFile.existsSync()) {
            _log.info('Found .env file at: ${envFile.path}');
            env.load([envFile.path]);
            return env;
          }
          break; // Found pubspec.yaml but no .env; fall back below.
        }

        // Stop if we have reached the root of the filesystem.
        if (dir.parent.path == dir.path) {
          break;
        }
        dir = dir.parent;
      }
    } catch (e) {
      _log.warning('Error during robust .env search: $e. Falling back.');
    }

    // Fallback for when the robust search fails.
    _log.warning(
      '.env file not found by searching for project root. '
      'Falling back to default load().',
    );
    env.load();
    return env;
  }
```
- Fixes a subtle error in the directory traversal logic within the
  `EnvironmentConfig._loadEnv` method. The previous implementation had a flaw
  in how it checked for the parent directory, which could prevent it from
  correctly locating the `pubspec.yaml` and `.env` files in certain project
  structures or when the Dart Frog development server starts from a
  subdirectory.
- The corrected logic now accurately traverses up the directory tree,
  repeatedly checking if the parent directory is the same as the current
  directory (indicating the root of the filesystem). This ensures that the
  search correctly identifies the project root and locates the `.env` file.
Corrects the `DatabaseSeedingService` to ensure the seeding process is
idempotent and correctly preserves the relationships between fixture
documents.

The previous implementation had two critical bugs:
1. It generated a new, random `ObjectId` on every run, which broke all
   predefined relationships between entities.
2. It used an empty filter `{}` in its `replaceOne` operation, causing
   it to overwrite existing documents incorrectly on subsequent runs.

This fix addresses these issues by:
- Using the predefined hex string ID from each fixture item to create a
  deterministic `ObjectId`.
- Using `updateOne` with a specific `_id` in the filter, ensuring that
  the correct document is targeted.
- Using the `$set` operator with `upsert: true` to safely insert or
  update the document, making the entire process idempotent.
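A minimal sketch of the corrected per-item upsert, using the mongo_dart API; the fixture shape (a map with a hex-string `id` field) is an assumption based on this PR's description:

```dart
import 'package:mongo_dart/mongo_dart.dart';

/// Upserts one fixture document idempotently. The fixture's predefined
/// hex-string ID yields a deterministic ObjectId, so re-running the
/// seeder targets the same documents and preserves cross-references.
Future<void> seedItem(
  DbCollection collection,
  Map<String, dynamic> fixture,
) async {
  final json = Map<String, dynamic>.of(fixture);
  // Deterministic _id derived from the fixture's own ID, not a random one.
  final id = ObjectId.fromHexString(json.remove('id') as String);

  await collection.updateOne(
    {'_id': id},      // target exactly this document, never an empty filter
    {r'$set': json},  // update its fields in place
    upsert: true,     // insert it if it does not exist yet
  );
}
```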
Fixes a critical bug in the `AuthService.deleteAccount` method where
only the main user document was being deleted. The previous logic
incorrectly assumed a relational `CASCADE` delete behavior.

This change adds explicit calls to delete the user's associated
documents from the `user_app_settings` and `user_content_preferences`
collections, ensuring that no orphaned data is left in the database
after an account is deleted.
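The corrected method might be shaped like this; the repository names and `delete` signature are assumptions for illustration:

```dart
/// MongoDB has no relational CASCADE, so every associated document
/// must be deleted explicitly.
Future<void> deleteAccount({required String userId}) async {
  // Delete dependent documents first, so a mid-operation failure does
  // not leave orphaned records pointing at a missing user.
  await userAppSettingsRepository.delete(id: userId);
  await userContentPreferencesRepository.delete(id: userId);

  // Finally delete the user document itself.
  await userRepository.delete(id: userId);
}
```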
Replaces the `readAll()` calls in the `DashboardSummaryService` with the more efficient `count()` method from the data repository.

This avoids fetching all documents from the database just to get their count, significantly improving the performance of the dashboard summary endpoint by leveraging the native counting capabilities of the database.
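The before/after is roughly this (repository names are assumptions):

```dart
// Before: fetch every document just to measure the list length.
//   final headlines = await headlineRepository.readAll();
//   final headlineCount = headlines.length;

// After: delegate counting to the database's native count operation.
final headlineCount = await headlineRepository.count();
```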
Adds a specific error handler for `CheckedFromJsonException` to return a
400 Bad Request instead of a generic 500 Internal Server Error when
deserialization fails due to invalid client input.

This provides more accurate feedback to API clients when they send a
request body that is missing required fields or contains incorrect
data types.
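A sketch of the added branch inside the error-handling middleware (the surrounding handler is assumed); `CheckedFromJsonException` is thrown by `json_serializable`'s checked deserialization when a required field is missing or has the wrong type:

```dart
if (error is CheckedFromJsonException) {
  return Response.json(
    statusCode: 400, // Bad Request: the client sent an invalid body
    body: {
      'error': 'Invalid request body.',
      'details': 'Field "${error.key}": ${error.message}',
    },
  );
}
```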
Fixes a bug where administrators were incorrectly scoped to their own
userId when accessing user-owned resources, preventing them from
managing other users' data.

The logic in the generic data handlers (`/data` and `/data/[id]`) has
been updated to only apply the `userId` filter to repository calls if
the model is user-owned AND the authenticated user is not an admin.
This allows administrators to perform global operations as intended.
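The corrected scoping condition might be sketched as follows; `modelConfig`, `UserRole`, and the filter plumbing are assumptions based on names in this PR:

```dart
final filter = Map<String, dynamic>.of(clientFilter);

// Scope to the caller's own documents only when the model is user-owned
// AND the caller is not an administrator; admins operate globally.
if (modelConfig.isUserOwned && user.role != UserRole.admin) {
  filter['userId'] = user.id;
}

final items = await repository.readAll(filter: filter);
```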
@fulleni fulleni merged commit 3283ed3 into main Jul 12, 2025
1 check failed
@fulleni fulleni deleted the feature_document_database_migration branch July 12, 2025 07:26

@gemini-code-assist gemini-code-assist bot left a comment


Summary of Changes

Hello @fulleni, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request implements a significant architectural shift by migrating the application's primary data store from PostgreSQL to MongoDB. This change necessitated a comprehensive update of the data access layer, database initialization, and API interaction patterns to leverage MongoDB's document-oriented capabilities. Additionally, it introduces a new middleware for robust ownership enforcement on user-specific data and refines environment configuration loading.

Highlights

  • Database Migration: The core change is a complete migration of the backend database from PostgreSQL to MongoDB, involving updates to connection management, data seeding, and repository implementations.
  • Data Access Layer Update: The data access layer has been refactored to use ht_data_mongodb instead of ht_data_postgres, adapting all repository interactions to a document-oriented model.
  • API Query Simplification: Collection endpoints (/api/v1/data) now accept a generic, JSON-encoded filter parameter for MongoDB-style queries, replacing model-specific query parameter parsing. Note that this change means certain models (UserAppSettings, UserContentPreferences, RemoteConfig) are no longer directly queryable via the collection endpoint.
  • Ownership Enforcement: A new ownershipCheckMiddleware has been introduced to enforce ownership for user-specific data (e.g., User, UserAppSettings, UserContentPreferences) on item-level routes, ensuring users can only access or modify their own records unless they are an administrator.
  • Robust Environment Loading: The .env file loading mechanism has been improved to robustly locate the environment file by traversing the directory tree, making local development setup more reliable.
  • Optimized Dashboard Summary: The dashboard summary service now efficiently retrieves only counts of headlines, topics, and sources, rather than fetching all data, significantly improving performance.


@gemini-code-assist gemini-code-assist bot left a comment


Code Review

The pull request migrates the data layer from PostgreSQL to MongoDB. The changes include database seeding, performance enhancements, resilient configuration, and API design improvements. The ownership check in the item-specific route handlers is redundant and should be removed as it's already handled by the new middleware.
