Conversation


Copilot AI commented Dec 31, 2025

Implements Priority 1 from the roadmap: stabilizing the core container runtime with proper cgroup handling, state tracking, and essential CLI operations.

Changes

Cgroup support (cgroup.go)

  • Auto-detect cgroup v1 vs v2, handle both memory.limit_in_bytes (v1) and memory.max (v2)
  • Degrade gracefully when cgroups unavailable—containers run without limits, warnings logged
  • Expose cgroup capabilities via info command

Container lifecycle (container.go)

  • State model: created → running → exited|failed
  • Persist metadata to state.json (command, args, timestamps, exit code, PID)
  • Atomic state updates via UpdateContainerState()

CLI commands (main.go)

  • rm <id> - remove stopped containers, cleanup cgroups
  • logs <id> - read stdout/stderr from persistent log files
  • inspect <id> - dump full container metadata as JSON
  • ps - now shows actual states and creation time
  • info - displays cgroup version, controller availability

Logging

  • Use io.MultiWriter to send output to both console and log file simultaneously

Testing

  • Unit tests in container_test.go for state management
  • Integration tests in verify-new.sh with 12 test scenarios
  • CodeQL: 0 vulnerabilities

Example

$ sudo ./basic-docker run alpine /bin/echo "test"
Starting container container-1767175530
test

$ sudo ./basic-docker ps
CONTAINER ID         STATE     COMMAND        CREATED
container-1767175530 exited    /bin/echo      2025-12-31 10:05:30

$ sudo ./basic-docker inspect container-1767175530
{
  "state": "exited",
  "exit_code": 0,
  "started_at": "2025-12-31T10:05:30Z",
  ...
}

$ sudo ./basic-docker logs container-1767175530
test

$ sudo ./basic-docker rm container-1767175530
Container removed successfully
Original prompt

This section details the original issue to resolve

<issue_title>next</issue_title>
<issue_description>This is a solid, non‑trivial learning project that already goes well beyond “toy runtime”; it’s not something to abandon, but something to narrow, finish in layers, and then position as a reference implementation.[1]

Below is a concrete way to think about it.

What it is today

From the repo, you effectively have:[1]

  • A basic container runtime: basic-docker run, ps, minimal image/layer logic, filesystem isolation, namespaces, with some cgroup integration (currently permission‑sensitive).[1]
  • Monitoring abstraction across process/container/host isolation levels, including monitor host|process|container|all|gap|correlation and a clear table of gaps. [1]
  • Kubernetes integration via a ResourceCapsule CRD, operator, and kubernetes.go/crd_*.go plumbing, plus ADRs for networking, image download, and resource capsules.[1]

This is already a nice intersection of container internals, observability, and K8s custom resource design.[1]

What’s missing / incomplete

Reading the README, code list, and docs, the main gaps look like:[1]

  • Runtime robustness

    • Cgroup handling is brittle (permission issues, hardcoded memory cgroup path, limited feature flags).[1]
    • Networking stack only lightly tested; no clear story for port mapping, DNS, multi‑container networks beyond basic veth tests.[1]
    • Container lifecycle is minimal: no restart policies, logs, stop/kill, or state model beyond simple IDs and directories.[1]
  • Image and filesystem story

    • Image handling exists (image.go, tests, real Docker image download), but no clear CLI for pull, images, rmi, caching policy, or clear OCI boundary.[1]
    • Rootfs / layering is described in the architecture doc but not formalized as a “contract” (e.g., what exactly is a layer, where metadata lives, how GC works).[1]
  • Kubernetes / ResourceCapsule

    • CRD and operator exist, but the end‑to‑end story is not obvious: how does a user write a Capsule, attach to a Deployment, and what guarantees does the system provide.[1]
    • No versioned spec for “ResourceCapsule” (apiVersion/kind semantics, status, conditions, examples for scaling, limits, etc.).[1]
  • Monitoring narrative

    • Monitoring features are implemented and documented, but they’re not yet framed as a coherent “problem statement → design → implementation → examples → limitations”.[1]
    • No export story: e.g., Prometheus metrics, JSON output, or how to plug this into existing infra.[1]
  • External readiness

    • No releases, no clear “v0.1/v0.2” milestones.[1]
    • Tests are present but not organized into a crisp “what guarantees do these tests give you” section, and CI only runs go build.[1]

How to approach it (strategy)

Given your profile, this project should serve 3 purposes: concept mastery, a portfolio artifact, and a foundation for writing/talks. Explicitly pick that as the goal, not “replace Docker”.

Suggested approach:

  1. Define a sharp scope: “lean Docker engine with monitoring + K8s capsules”

    • Declare this in README: a teaching/runtime prototype focused on isolation, monitoring gaps, and K8s integration, not a production daemon.[1]
    • Add a high‑level design doc (or extend RESEARCH.md) with: goals, non‑goals, core constraints (single‑host, Linux only, root required).[1]
  2. Work in small, versioned milestones

    • v0.1: Core runtime (run/ps, minimal image, filesystem + process namespaces, verify.sh green).
    • v0.2: Monitoring system fully polished: consistent CLI, nice examples, basic JSON export, clearer docs.
    • v0.3: ResourceCapsule CRD/operator hardened with at least one end‑to‑end scenario documented.
    • Each version gets a GitHub release + short changelog so it looks like a living, coherent project.[1]
  3. Use ADRs aggressively

    • You already have adr-00x. For every major decision (image format, network model, monitoring semantics, CRD shape) capture the tradeoffs and link back in README.[1]
    • This directly supports future talks/blog posts.

Concrete next priorities

Priority 1: Stabilize core runtime

Goal: “If you follow README steps on a modern Linux host as root, basic flows always work.”

  • Fix cgroup access + permission errors:

    • Detect cgroup v1 vs v2; handle paths correctly; degrade gracefully when limits cannot be set and surface this in info.[1]
    • Add tests that simulate “no cgroup access” vs “full cgroup access” and assert feature flags.[1]
  • Clarify lifecycle:

    • Introduce a simple state model: created, running, exited, failed.
    • Ensure ps and on‑disk metadata reflect it (e.g., per‑container JSON state file in /tmp/basic-docker/containers/<id>/state.json).[1]
  • Round out CLI surface:

    • Implement rm (cleanup directories), logs (even if it just tails stdout file), and inspect (dump container JSON state).
    • Document all comman...


Copilot AI self-assigned this Dec 31, 2025
Copilot AI and others added 4 commits December 31, 2025 10:04
- Add cgroup.go with automatic v1/v2 detection
- Add container.go with state management (created, running, exited, failed)
- Implement rm, logs, and inspect CLI commands
- Update info command to show cgroup details
- Update ps command to show container states
- Store container metadata in state.json files

Co-authored-by: j143 <[email protected]>
- Add container_test.go with full test coverage for state management
- Create verify-new.sh with structured test framework
- Update README with container lifecycle, cgroup support, and new CLI commands
- Document graceful degradation and state persistence

Co-authored-by: j143 <[email protected]>
- Use io.MultiWriter to send output to both console and log file
- Add warning logs for cgroup degradation instead of silent failures
- Remove duplicate command/args extraction in run function
- Improve verify-new.sh with proper binary validation and sudo usage

Co-authored-by: j143 <[email protected]>
Complete documentation of all changes, improvements, and testing results

Co-authored-by: j143 <[email protected]>
Copilot AI changed the title [WIP] Narrow down learning project implementation Stabilize core runtime: cgroup v1/v2 detection, container lifecycle, and CLI commands Dec 31, 2025
Copilot AI requested a review from j143 December 31, 2025 10:16
@j143 j143 marked this pull request as ready for review December 31, 2025 11:22
@j143 j143 merged commit ccf6a33 into main Dec 31, 2025
2 checks passed