
Commit 637d1d1

Merge pull request #2596 from bcgov/feat/alex-documentation-250509
Feat: Comprehensive LCFS Wiki Documentation - 2409 2410
2 parents 6398345 + c8798a0 commit 637d1d1

30 files changed: +2,869 −0 lines

.github/workflows/sync-wiki.yml

Lines changed: 63 additions & 0 deletions
@@ -0,0 +1,63 @@
```yaml
name: Sync Wiki to GitHub Wiki

on:
  push:
    branches:
      - develop # On merge to develop
    paths:
      - 'wiki/**' # Only run if files in the wiki/ directory change

jobs:
  sync-wiki-content:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout Main Repository
        uses: actions/checkout@v4
        with:
          path: main-repo # Checkout main repo into a specific directory

      - name: Checkout Wiki Repository
        uses: actions/checkout@v4
        with:
          repository: ${{ github.repository }}.wiki # bcgov/lcfs.wiki
          path: wiki-repo # Checkout wiki repo into a specific directory
          token: ${{ secrets.GITHUB_TOKEN }} # Use the default GITHUB_TOKEN

      - name: Sync files to Wiki Repo
        run: |
          echo "Source (main-repo/wiki):"
          ls -R main-repo/wiki
          echo "Destination (wiki-repo) before sync:"
          ls -R wiki-repo

          # The wiki repo root is where the files should go.
          # Clean the wiki-repo root, except for .git and preserved files/dirs
          # such as Home.md or _Sidebar.md if managed manually there.
          cd wiki-repo
          find . -maxdepth 1 -type f ! -name 'Home.md' ! -name '_Sidebar.md' ! -name '_Footer.md' -delete
          # Remove all directories except .git and images. Add other preserved dirs as needed.
          find . -maxdepth 1 -mindepth 1 -type d ! -name '.git' ! -name 'images' -exec rm -rf {} +
          cd ..

          # Mirror the contents of main-repo/wiki into wiki-repo.
          # The trailing `.` after main-repo/wiki/ copies the *contents* of the directory.
          # The --exclude flags keep `--delete` from removing the wiki's .git
          # directory and the manually managed files preserved above.
          rsync -av --delete --checksum \
            --exclude='.git' --exclude='Home.md' --exclude='_Sidebar.md' --exclude='_Footer.md' \
            main-repo/wiki/. wiki-repo/

          echo "Destination (wiki-repo) after sync:"
          ls -R wiki-repo

      - name: Commit and Push to Wiki
        run: |
          cd wiki-repo
          git status # For debugging
          # git status --porcelain also catches untracked files added by rsync,
          # which `git diff --quiet` alone would miss.
          if [ -n "$(git status --porcelain)" ]; then
            git config user.name "GitHub Actions Bot"
            git config user.email "actions@github.com"
            git add .
            git commit -m "Sync wiki content from main repository (Commit: ${{ github.sha }}) [skip ci]"
            # The git push command uses the credentials context of the GITHUB_TOKEN
            git push
          else
            echo "No changes to sync to the wiki."
          fi
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }} # Ensure git push uses the GITHUB_TOKEN
```

wiki/Backend-Logging-Guide.md

Lines changed: 204 additions & 0 deletions
@@ -0,0 +1,204 @@
# Backend Logging Guide

This document provides comprehensive guidelines for developers on how to effectively use the standardized logging system in the LCFS backend. Following this guide ensures consistent logging practices, which makes debugging, monitoring, and analysis easier. It is based on the original "11. Backend Logging Guide" and adapted for the project's use of `structlog`.

## Table of Contents

1. [Introduction](#introduction)
2. [Logging Library: Structlog](#logging-library-structlog)
3. [Logging Configuration](#logging-configuration)
4. [Logging Modes](#logging-modes)
    * [Development Mode](#development-mode)
    * [Non-Development Mode (JSON)](#non-development-mode-json)
5. [Logging Practices](#logging-practices)
    * [What to Log](#what-to-log)
    * [How to Log with Structlog](#how-to-log-with-structlog)
6. [Sample Logs](#sample-logs)
    * [Development Mode Logs](#development-mode-logs)
    * [Non-Development Mode Logs](#non-development-mode-logs)
7. [Logging Levels](#logging-levels)
8. [Correlation ID Usage](#correlation-id-usage)
9. [Best Practices and Guidelines](#best-practices-and-guidelines)
    * [Consistency](#consistency)
    * [Security and Privacy (Sensitive Data)](#security-and-privacy-sensitive-data)
    * [Performance Considerations](#performance-considerations)
10. [Using Kibana for Log Analysis (If Applicable)](#using-kibana-for-log-analysis-if-applicable)
11. [FAQs](#faqs)
## 1. Introduction

Effective logging is crucial for diagnosing issues, monitoring application behavior, and gaining insight into system performance. This guide outlines our standardized approach to backend logging using `structlog`, helping you produce logs that are both developer-friendly and suitable for production analysis.

The logging system automatically enriches every log entry with useful metadata such as source information and correlation IDs.

## 2. Logging Library: Structlog

The backend uses **`structlog`** (declared as a dependency in `backend/pyproject.toml`). Structlog is a powerful library for structured logging in Python, allowing for flexible and rich log event creation.

## 3. Logging Configuration

* The primary logging configuration is typically found in `backend/lcfs/logging_config.py` (or a similar path).
* This file sets up `structlog` processors and formatters, and integrates with standard Python logging where needed.
* Key features handled by the configuration include:
    * Adding timestamps.
    * Including log levels.
    * Capturing caller information (filename, line number, function name).
    * Automatic inclusion of correlation IDs.
    * Processors for development (console-friendly) and production (JSON) output.
    * Sensitive data masking.
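These features are implemented as `structlog` *processors*: plain functions with the signature `(logger, method_name, event_dict)` that each enrich and return the event dictionary. A stdlib-only sketch of how such a chain behaves (the processor names below are illustrative, not the actual LCFS configuration):

```python
import datetime

def add_timestamp(logger, method_name, event_dict):
    # Stamp the event with a UTC ISO-8601 timestamp.
    event_dict["timestamp"] = datetime.datetime.now(datetime.timezone.utc).isoformat()
    return event_dict

def add_level(logger, method_name, event_dict):
    # structlog passes the log method's name ("info", "error", ...) as method_name.
    event_dict["level"] = method_name
    return event_dict

def run_pipeline(processors, method_name, event_dict):
    # structlog calls each processor in order; each returns the (possibly
    # modified) event dict, which is handed to the next processor.
    for processor in processors:
        event_dict = processor(None, method_name, event_dict)
    return event_dict

event = run_pipeline(
    [add_timestamp, add_level],
    "info",
    {"event": "fuel_supply_created", "fuel_supply_id": 123},
)
print(event["level"])  # info
```

The final processor in the real configuration is a renderer that turns the enriched dict into console or JSON output.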
## 4. Logging Modes

Our logging system operates in two distinct modes:

### Development Mode

* **Purpose**: To provide developers with readable, highlighted logs during local development.
* **Features**:
    * Human-readable, often colorized console output.
    * Syntax highlighting for better readability.
    * Indented, structured output for complex data types.
* **Renderer**: `structlog.dev.ConsoleRenderer`.

### Non-Development Mode (JSON)

* **Purpose**: To produce logs suitable for production environments, optimized for log aggregation and analysis tools (e.g., Kibana, Splunk, OpenSearch).
* **Features**:
    * JSON-formatted logs for easy parsing by log management systems.
    * Compact representation of data.
* **Renderer**: `structlog.processors.JSONRenderer`.
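The two modes differ only in the final rendering step: the same event dictionary is serialized two different ways. A rough stdlib-only illustration of the contrast (not the actual renderers, which add colors, padding, and more):

```python
import json

# One structured event, rendered two ways.
event = {"event": "fuel_supply_created", "level": "info", "fuel_supply_id": 123}

# Non-development mode: one JSON object per line, easy for machines to parse.
json_line = json.dumps(event, sort_keys=True)

# Development mode: a human-readable "event key=value ..." line.
dev_line = event["event"] + " " + " ".join(
    f"{k}={v}" for k, v in sorted(event.items()) if k != "event"
)

print(json_line)  # {"event": "fuel_supply_created", "fuel_supply_id": 123, "level": "info"}
print(dev_line)   # fuel_supply_created fuel_supply_id=123 level=info
```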
## 5. Logging Practices

### What to Log

Developers should log information that aids in understanding application behavior and diagnosing issues:

* **Significant Business Events**: e.g., "Compliance report submitted", "User account created".
* **Key Data Points/Identifiers**: e.g., `report_id`, `user_id`, `transaction_id`, counts, statuses.
* **Errors and Exceptions**: All caught exceptions or significant error conditions.
* **Service Calls**: Start and end of important service operations or external API calls, along with their success/failure status.
* **Process Milestones**: Key steps in long-running or complex processes.
* **Configuration Values**: Important configuration settings at startup (be mindful of sensitivity).

### How to Log with Structlog

1. **Get a Logger Instance**:
   In your Python modules, obtain a `structlog` logger:
   ```python
   import structlog

   logger = structlog.get_logger(__name__)
   ```
2. **Log Messages with Key-Value Pairs**:
   Call methods on the logger object (e.g., `info`, `error`, `warning`, `debug`) and pass structured data as keyword arguments. The first argument is typically the main "event" or message.
   ```python
   logger.info("fuel_supply_created", fuel_supply_id=123, compliance_report_id=456)
   logger.error("payment_processing_failed", order_id=789, reason="Insufficient funds")
   ```
3. **No Need to Manually Add Common Metadata**: The `structlog` configuration automatically adds timestamps, log levels, source information (filename, line number, function name), and correlation IDs.
## 6. Sample Logs

*(These are conceptual examples; the actual format depends on `logging_config.py`.)*

### Development Mode Logs

*(Example from the original wiki, adapted for structlog style)*

```
2024-11-03 18:50:23 [info ] Getting fuel supply list for compliance report [your_module] compliance_report_id=123 correlation_id=177b381a-ca37-484d-a3b9-bbb16061775a filename=reports/views.py func_name=get_fuel_supply_list lineno=101
```

`structlog.dev.ConsoleRenderer` often renders key-value pairs in a more structured, multi-line form.

### Non-Development Mode Logs (JSON)

*(Example from the original wiki, adapted for structlog style)*

```json
{
  "event": "Getting fuel supply list for compliance report",
  "compliance_report_id": 123,
  "correlation_id": "816dbbdf-11fe-4df5-8dc8-754c07610742",
  "level": "info",
  "logger": "your_module.reports.views",
  "filename": "reports/views.py",
  "func_name": "get_fuel_supply_list",
  "lineno": 101,
  "timestamp": "2024-11-03T18:50:23.123456Z"
}
```
## 7. Logging Levels

Use the appropriate logging level:

* `logger.debug(...)`: Detailed information, typically for diagnosing problems. Often disabled in production.
* `logger.info(...)`: Confirmation that things are working as expected; significant lifecycle events.
* `logger.warning(...)`: An unexpected event or potential problem that does not prevent the current operation but might cause issues later.
* `logger.error(...)`: A more serious problem where the software was unable to perform some function.
* `logger.critical(...)`: A very serious error, indicating the program itself may be unable to continue running.
## 8. Correlation ID Usage

* **Purpose**: A unique ID that traces a single request or operation across multiple log entries, services, or components.
* **Automatic Inclusion**: The logging infrastructure (via `structlog` processors) automatically generates/propagates a correlation ID and includes it in every log entry.
* **Future Integration**: Aim for end-to-end tracing by passing the correlation ID from the frontend through to all backend services.
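One common way to implement this kind of automatic propagation (illustrative, not necessarily the LCFS implementation) is a `contextvars`-backed processor: middleware sets the ID once per request, and every event logged while handling that request carries the same value:

```python
import contextvars
import uuid

# Holds the current request's correlation ID; middleware would set it per request.
correlation_id_var = contextvars.ContextVar("correlation_id", default=None)

def add_correlation_id(logger, method_name, event_dict):
    # structlog-style processor: stamp every event with the current ID.
    cid = correlation_id_var.get()
    if cid is None:
        cid = str(uuid.uuid4())  # fall back to a fresh ID outside a request
        correlation_id_var.set(cid)
    event_dict["correlation_id"] = cid
    return event_dict

# Simulate middleware setting the ID at the start of a request.
correlation_id_var.set("816dbbdf-11fe-4df5-8dc8-754c07610742")
first = add_correlation_id(None, "info", {"event": "request_started"})
second = add_correlation_id(None, "info", {"event": "request_finished"})
print(first["correlation_id"] == second["correlation_id"])  # True
```

Because `contextvars` is async-aware, this pattern works for both sync and async request handlers.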
## 9. Best Practices and Guidelines

### Consistency

* **Event Names**: Use consistent, descriptive event names (the first argument to log methods, e.g., `logger.info("event_name", ...)`).
* **Key Names**: Use consistent key names for common data points (e.g., `user_id`, `report_id`).

### Security and Privacy (Sensitive Data)

* **DO NOT Log Sensitive Data**: Avoid logging PII (Personally Identifiable Information), passwords, access tokens, API keys, financial details, or any other confidential information directly.
* **Automatic Data Masking**: The `structlog` configuration should include a processor that automatically masks or censors known sensitive keys (e.g., `password`, `token`, `authorization`). An example processor function from the original wiki:
  ```python
  # In backend/lcfs/logging_config.py (conceptual)
  def censor_sensitive_data_processor(_, __, event_dict):
      sensitive_keys = {'password', 'token', 'secret_key', 'authorization', 'api_key'}
      for key in event_dict:
          if key in sensitive_keys:
              event_dict[key] = '***'  # or a more robust redaction
      return event_dict
  ```
  Ensure such a processor is active in your `structlog` pipeline.
* **Be Mindful**: Even with masking, exercise caution. It is best not to log sensitive data at all if avoidable.
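Because structlog processors are plain functions, the masking idea can be exercised in isolation. A minimal runnable sketch (key names and values are illustrative):

```python
# Runnable sketch of key-based masking; not the production processor.
SENSITIVE_KEYS = {"password", "token", "secret_key", "authorization", "api_key"}

def censor_sensitive_data(logger, method_name, event_dict):
    # Same (logger, method_name, event_dict) signature structlog expects.
    for key in event_dict:
        if key in SENSITIVE_KEYS:
            event_dict[key] = "***"  # or a more robust redaction
    return event_dict

masked = censor_sensitive_data(
    None, "info",
    {"event": "user_login", "user_id": 42, "password": "hunter2"},
)
print(masked["password"])  # ***
print(masked["user_id"])   # 42
```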
### Performance Considerations

* **Avoid Excessive Logging**: Log necessary information, but avoid overly verbose logging in high-throughput code paths or tight loops, as it can impact performance.
* **Deferred Evaluation**: `structlog` can be configured to defer string formatting or lambda-based value computation until it is certain a log message will actually be emitted (e.g., based on log level), which can save resources.
* **Asynchronous Logging**: For very high-performance scenarios, consider asynchronous logging handlers (though this adds complexity).
## 10. Using Kibana for Log Analysis (If Applicable)

If logs are shipped to an Elasticsearch/Logstash/Kibana (ELK) stack or similar (such as OpenSearch):

* **Accessing Kibana**: Via the OpenShift platform or a direct URL.
* **Index Pattern**: Typically `app-*` or similar, depending on your log shipping configuration.
* **Timestamp Field**: Usually `@timestamp`.
* **Searching/Filtering**: Use Kibana Query Language (KQL) or Lucene syntax to search and filter logs.
    * Filter by `correlation_id` to trace a request.
    * Filter by `level` (e.g., `level:error`).
    * Filter by `kubernetes.namespace_name`, `kubernetes.pod_name`, or `kubernetes.container_name` for specific services in OpenShift.
* Example Kibana query (from the original wiki):
  `kubernetes.namespace_name:"YOUR_PROJECT_NAME-tools" AND kubernetes.container_name.raw:"lcfs-backend" AND level:"error"`
## 11. FAQs

* **Q1: Do I need to include source info (filename, line number) in logs?**
  A: No, `structlog` is configured to add this automatically.
* **Q2: How do I include additional context?**
  A: Pass key-value pairs as keyword arguments to the logger methods: `logger.info("event", key1=value1, key2=value2)`.
* **Q3: Do I need to manage the correlation ID?**
  A: No, this is handled by the logging infrastructure.
* **Q4: How do I log exceptions?**
  A: Use `logger.exception("event_description")` within an `except` block to automatically include the exception info and stack trace, or log manually with `logger.error("event", error=str(e), exc_info=True)`.
  ```python
  try:
      ...  # some operation
  except ValueError as e:
      logger.error("value_error_occurred", input_data=some_data, error_message=str(e))
      # For a full stack trace (especially in dev, or at error level):
      # logger.exception("value_error_details")
  ```

---
*Adherence to these logging guidelines will greatly improve the observability and maintainability of the LCFS backend. Consult `backend/lcfs/logging_config.py` for the specific `structlog` processor chain.*

wiki/CI-CD-Pipeline.md

Lines changed: 81 additions & 0 deletions
@@ -0,0 +1,81 @@
# CI/CD Pipeline

This document describes the Continuous Integration (CI) and Continuous Deployment (CD) pipeline for the LCFS project. This is part of the requirements for ticket #2410.

## 1. Overview

The CI/CD pipeline automates building, testing, and deploying the LCFS application, ensuring rapid and reliable delivery of new features and fixes.

* **Platform**: GitHub Actions (inferred from the presence of the `.github/workflows` directory, and common for modern development).
* **Action**: Verify that GitHub Actions is indeed the CI/CD platform. If other tools are used (e.g., Jenkins, Azure DevOps, OpenShift Pipelines), this section needs to be updated.
## 2. Continuous Integration (CI)

CI is triggered on every push to a branch, or when a Pull Request (PR) targeting `main` (or `develop`) is created or updated.

### Key CI Steps (Typical)

1. **Checkout Code**: Fetches the latest code from the branch/PR.
2. **Setup Environment**:
    * Sets up specific versions of Node.js (frontend) and Python (backend).
    * Installs dependencies (npm for the frontend, Poetry for the backend).
    * Caches dependencies to speed up subsequent runs.
3. **Linting & Formatting Checks**:
    * **Frontend**: Runs ESLint and Prettier to check code style and catch potential errors.
    * **Backend**: Runs Flake8, Black (check mode), isort (check mode), and MyPy for style, formatting, and type checking.
4. **Automated Testing**:
    * **Frontend Unit/Integration Tests**: Runs Vitest tests (`npm run test:run`) and generates code coverage reports.
    * **Backend Unit/Integration Tests**: Runs Pytest tests (`poetry run pytest`) and generates code coverage reports.
    * **(Optional) Frontend E2E Tests**: May run Cypress tests against a preview environment or a mocked backend, though E2E tests are often part of a separate, scheduled pipeline or triggered manually because of their longer execution time.
5. **Build Application**:
    * **Frontend**: Creates a production build (`npm run build`).
    * **Backend**: No explicit build step for Python itself, but checks might include validating the Poetry project.
6. **Build Docker Images** (optional in CI, more common in CD, but can serve as a check):
    * Validates that the `Dockerfile` and `Dockerfile.openshift` files build successfully.
7. **Security Scans** (optional but recommended):
    * Dependency vulnerability scans (e.g., `npm audit`, `safety`, or `pip-audit`).
    * Static Application Security Testing (SAST) tools.
    * Container image vulnerability scans (if images are built).
8. **Notifications**: Reports build status (success/failure) to GitHub, PR comments, or team communication channels (e.g., Slack).
## 3. Continuous Deployment (CD)

CD is typically triggered after a successful merge to the `main` branch (for production deployment) or a `develop`/`staging` branch (for pre-production environments).

### Key CD Steps (Typical for OpenShift)

1. **Checkout Code**: Fetches the code from the branch that triggered the deployment (e.g., `main`).
2. **Build Docker Images** (if not already built and pushed by CI):
    * Builds the `backend` and `frontend` Docker images using their respective `Dockerfile.openshift` files.
    * Tags the images appropriately (e.g., with the Git commit SHA or a version number).
    * Pushes the images to a container registry accessible by OpenShift (e.g., the OpenShift internal registry or the BC Gov artifact repository).
3. **Deploy to OpenShift Environment** (e.g., Dev, Test, Prod):
    * Connects to the target OpenShift cluster using `oc login` (credentials stored as GitHub Secrets).
    * Applies OpenShift templates/configurations:
        * `oc process -f template.yaml | oc apply -f -` if using OpenShift templates.
        * `kustomize build | oc apply -f -` if using Kustomize.
        * Helm charts (`helm upgrade --install ...`).
    * This updates `DeploymentConfigs` or `Deployments`, which triggers new pod rollouts.
    * The `backend-bc.yaml` and `frontend-bc.yaml` `BuildConfigs` found in `openshift/templates/` might be triggered here if the strategy is to build images directly within OpenShift using the S2I or Docker strategy.
4. **Run Database Migrations**: For the backend, Alembic migrations (`./migrate.sh -u head` or similar) are run against the target environment's database before the new application version goes live.
5. **Health Checks / Smoke Tests**: Performs basic checks to ensure the deployed application is running and healthy.
6. **Notifications**: Reports deployment status.
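A post-deploy health check can be as simple as polling an HTTP endpoint until it returns 200. The sketch below is hypothetical (the `/health` route and its 200-OK contract are assumptions, not the actual LCFS API); it spins up a stand-in server so the example is self-contained:

```python
import http.server
import threading
import urllib.request

class HealthHandler(http.server.BaseHTTPRequestHandler):
    # Stand-in for the deployed app; in practice you would hit the real route URL.
    def do_GET(self):
        if self.path == "/health":
            self.send_response(200)
            self.end_headers()
            self.wfile.write(b"OK")
        else:
            self.send_response(404)
            self.end_headers()

    def log_message(self, *args):
        pass  # keep the example's output quiet

def smoke_check(url: str) -> bool:
    # Returns True only if the endpoint answers with HTTP 200.
    try:
        with urllib.request.urlopen(url, timeout=5) as resp:
            return resp.status == 200
    except OSError:
        return False

server = http.server.HTTPServer(("127.0.0.1", 0), HealthHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
healthy = smoke_check(f"http://127.0.0.1:{server.server_address[1]}/health")
print(healthy)  # True
server.shutdown()
```

In a pipeline, a failed check like this would abort the rollout or trigger a rollback before traffic is shifted to the new version.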
## 4. Workflow Files

* CI/CD pipelines are defined as YAML files in the `.github/workflows/` directory of the repository.
* There might be separate workflow files for CI (e.g., `ci.yml`, `pull-request.yml`) and CD (e.g., `deploy-dev.yml`, `deploy-prod.yml`).
## 5. Wiki Documentation Synchronization

* A dedicated GitHub Actions workflow can automatically synchronize changes made to Markdown files in the `wiki/` directory of the main repository to the actual GitHub Wiki repository (`lcfs.wiki.git`).
* See [GitHub Workflow for Wiki Sync](GitHub-Workflow-for-Wiki-Sync.md) for the implementation.
## Further Investigation

* **Action**: Review the `.github/workflows/` directory in the LCFS repository to confirm the exact CI/CD tools, triggers, and steps.
    * Document specific workflow file names and their purposes.
    * Detail how environment variables and secrets are managed for different environments in the CI/CD pipeline.

---
*This document provides a general outline. The actual implementation details are in the workflow files within the repository.*
