Binary file removed .DS_Store
Binary file not shown.
3 changes: 2 additions & 1 deletion .gitignore
@@ -15,4 +15,5 @@ cdk.out/
__pycache__/
*.pyc
javascript_docker/node_modules
lambda/add_title/venv
lambda/add_title/venv
.DS_Store
48 changes: 35 additions & 13 deletions README.md
@@ -69,6 +69,31 @@ Ensure your project has the following structure:
|__ client_credentials.json (The client id and client secret id for adobe)
```

## Environment and Naming

This project supports environment‑aware stack and resource naming to simplify multi‑env deployments.

- name_prefix: Computed as `<stackBase>-<env>` (defaults: `stackBase=pdfaccessibility`, `env=dev`).
- CloudFormation stack: Named `<stackBase>-<env>` unless overridden via `stackName`.
- S3 bucket: Named `<stackBase>-<env>-<account>-<region>` (globally unique; lowercase and hyphenated).
- ECS cluster: Named `<name_prefix>-cluster`.
- Step Functions: Log group `/aws/states/<name_prefix>-state-machine` and state machine name `<name_prefix>-state-machine`.
- CloudWatch dashboard: Named `PDF_Processing_Dashboard_<name_prefix>`.

Common deploy patterns:

- Minimal (uses active AWS profile for account/region):
- `cdk deploy -c env=dev`
- Specify env and base:
- `cdk deploy -c env=prod -c stackBase=pdfrem`
- Force an exact stack name:
- `cdk deploy -c stackName=pdfrem-bgdev`

Account/Region inference:

- You do not need to set `CDK_DEFAULT_ACCOUNT`/`CDK_DEFAULT_REGION`. If not provided, CDK uses the credentials of your current AWS CLI profile to infer them at deploy time.
- If you prefer to pin them, pass `-c account=<acct> -c region=<region>` or export environment variables.
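
For orientation, the precedence that `app.py` applies (see the app.py diff below) boils down to the following sketch. It assumes the same `import aws_cdk as cdk` alias used in `app.py`; the variable names mirror the ones introduced in this PR.

```python
# Sketch of the stack-naming precedence implemented in app.py.
import os
import aws_cdk as cdk

app = cdk.App()

# env and stackBase come from -c context flags, falling back to env vars and defaults.
env_name = (app.node.try_get_context("env") or os.getenv("ENV", "dev")).lower()
stack_base = app.node.try_get_context("stackBase") or os.getenv("STACK_BASE") or "pdfaccessibility"

# An explicit stackName wins; otherwise the stack is named <stackBase>-<env>.
stack_name = (
    app.node.try_get_context("stackName")
    or os.getenv("STACK_NAME")
    or f"{stack_base}-{env_name}"
)
```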

## Setup and Deployment

1. **Clone the Repository**:
@@ -79,10 +104,7 @@ Ensure your project has the following structure:
```bash
aws configure
```
- Make sure the region is set to
```
us-east-1
```
- Make sure a default region is set (example: `us-east-1`). The CDK will use your configured profile's region unless you explicitly pass one via `-c region=...`.

3. **Set Up CDK Environment**:
- Bootstrap your AWS environment for CDK (run only once per AWS account/region):
@@ -103,29 +125,29 @@ Ensure your project has the following structure:
- Replace `<Your Client ID here>` and `<Your Secret ID here>` with the actual Client ID and Client Secret provided by Adobe; paste only the values, not the whole file.

5. **Upload Credentials to Secrets Manager**:
- Run this command in the terminal of the project to push the secret keys to secret manager:
- Run this command in the terminal of the project to push the secret keys to Secrets Manager (namespaced by project/env):
- For Mac
```
aws secretsmanager create-secret \
--name /myapp/client_credentials \
--name /<stackBase>/<env>/client_credentials \
--description "Client credentials for PDF services" \
--secret-string file://client_credentials.json
```
- For Windows
```bash
aws secretsmanager create-secret --name /myapp/client_credentials --description "Client credentials for PDF services" --secret-string file://client_credentials.json
aws secretsmanager create-secret --name /<stackBase>/<env>/client_credentials --description "Client credentials for PDF services" --secret-string file://client_credentials.json
```
- Run this command instead if you have already uploaded the keys and want to update them in Secrets Manager.
- For Mac:
```
aws secretsmanager update-secret \
--secret-id /myapp/client_credentials \
--secret-id /<stackBase>/<env>/client_credentials \
--description "Updated client credentials for PDF services" \
--secret-string file://client_credentials.json
```
- For Windows:
```bash
aws secretsmanager update-secret --secret-id /myapp/client_credentials --description "Updated client credentials for PDF services" --secret-string file://client_credentials.json
aws secretsmanager update-secret --secret-id /<stackBase>/<env>/client_credentials --description "Updated client credentials for PDF services" --secret-string file://client_credentials.json
```
6. **Install the Requirements**:
- For both Mac and Windows
@@ -152,10 +174,10 @@ Ensure your project has the following structure:
```

10. **Deploy the CDK Stack**:
- Deploy the stack to AWS:
```
cdk deploy
```
- Deploy the stack to AWS (environment‑aware naming):
- Minimal: `cdk deploy -c env=dev`
- Example prod: `cdk deploy -c env=prod -c stackBase=pdfrem`
- Exact stack name override: `cdk deploy -c stackName=pdfrem-bgdev`

## Usage

116 changes: 92 additions & 24 deletions app.py
@@ -18,13 +18,32 @@
)
from constructs import Construct
import platform
import os

class PDFAccessibility(Stack):
def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:
super().__init__(scope, construct_id, **kwargs)

# S3 Bucket
bucket = s3.Bucket(self, "pdfaccessibilitybucket1", encryption=s3.BucketEncryption.S3_MANAGED, enforce_ssl=True)
# Environment context and tags
env_name = (self.node.try_get_context("env") or os.getenv("ENV", "dev")).lower()
stack_base = (self.node.try_get_context("stackBase") or os.getenv("STACK_BASE") or "pdfaccessibility").lower()
name_prefix = f"{stack_base}-{env_name}"
cdk.Tags.of(self).add("Environment", env_name)
cdk.Tags.of(self).add("Project", stack_base)

# Resolve account and region early for naming
account_id = Stack.of(self).account
region = Stack.of(self).region

# S3 Bucket (environment-specific name; include account for global uniqueness)
bucket = s3.Bucket(
self,
"pdfaccessibilitybucket1",
encryption=s3.BucketEncryption.S3_MANAGED,
enforce_ssl=True,
# Name: <stackBase>-<env>-<account>-<region>
bucket_name=f"{stack_base}-{env_name}-{account_id}-{region}"
)


python_image_asset = ecr_assets.DockerImageAsset(self, "PythonImage",
@@ -51,9 +70,16 @@ def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:
),
]
)
# Name tag for VPC
cdk.Tags.of(vpc).add("Name", f"{name_prefix}-vpc")

# ECS Cluster
cluster = ecs.Cluster(self, "FargateCluster", vpc=vpc)
cluster = ecs.Cluster(
self,
"FargateCluster",
vpc=vpc,
cluster_name=f"{name_prefix}-cluster",
)

ecs_task_execution_role = iam.Role(self, "EcsTaskRole",
assumed_by=iam.ServicePrincipal("ecs-tasks.amazonaws.com"),
@@ -62,8 +88,6 @@ def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:
])

# Allow ECS Task Role to access Bedrock services
account_id = Stack.of(self).account
region = Stack.of(self).region

ecs_task_role = iam.Role(self, "EcsTaskExecutionRole",
assumed_by=iam.ServicePrincipal("ecs-tasks.amazonaws.com"),
Expand All @@ -80,20 +104,20 @@ def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:
actions=["s3:*"], # This gives access to all S3 actions
resources=["*"], # This applies the actions to all resources
))
ecs_task_role.add_to_policy(iam.PolicyStatement(actions=
["secretsmanager:GetSecretValue"],
resources=[f"arn:aws:secretsmanager:{region}:{account_id}:secret:/myapp/db_credentials"] )
)
ecs_task_role.add_to_policy(iam.PolicyStatement(
actions=["secretsmanager:GetSecretValue"],
resources=[f"arn:aws:secretsmanager:{region}:{account_id}:secret:/{stack_base}/{env_name}/*"]
))
# Grant S3 read/write access to ECS Task Role
bucket.grant_read_write(ecs_task_execution_role)
# Create ECS Task Log Groups explicitly
python_container_log_group = logs.LogGroup(self, "PythonContainerLogGroup",
log_group_name="/ecs/MyFirstTaskDef/PythonContainerLogGroup",
log_group_name=f"/ecs/{name_prefix}/python",
retention=logs.RetentionDays.ONE_WEEK,
removal_policy=cdk.RemovalPolicy.DESTROY)

javascript_container_log_group = logs.LogGroup(self, "JavaScriptContainerLogGroup",
log_group_name="/ecs/MySecondTaskDef/JavaScriptContainerLogGroup",
log_group_name=f"/ecs/{name_prefix}/javascript",
retention=logs.RetentionDays.ONE_WEEK,
removal_policy=cdk.RemovalPolicy.DESTROY)
# ECS Task Definitions
@@ -156,6 +180,14 @@ def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:
name="model_arn_link",
value=model_arn_link
),
tasks.TaskEnvironmentVariable(
name="STACK_BASE",
value=stack_base
),
tasks.TaskEnvironmentVariable(
name="ENV",
value=env_name
),
]
)],
launch_target=tasks.EcsFargateLaunchTarget(
@@ -203,6 +235,7 @@ def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:
)
java_lambda = lambda_.Function(
self, 'JavaLambda',
function_name=f"{name_prefix}-pdf-merger",
runtime=lambda_.Runtime.JAVA_21,
handler='com.example.App::handleRequest',
code=lambda_.Code.from_asset('lambda/java_lambda/PDFMergerLambda/target/PDFMergerLambda-1.0-SNAPSHOT.jar'),
Expand Down Expand Up @@ -232,6 +265,7 @@ def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:

add_title_lambda = lambda_.Function(
self, 'AddTitleLambda',
function_name=f"{name_prefix}-add-title",
runtime=lambda_.Runtime.PYTHON_3_12,
handler='myapp.lambda_handler',
code=lambda_.Code.from_docker_build('lambda/add_title'),
Expand Down Expand Up @@ -265,18 +299,23 @@ def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:

a11y_precheck = lambda_.Function(
self,'accessibility_checker_before_remidiation',
function_name=f"{name_prefix}-a11y-precheck",
runtime=lambda_.Runtime.PYTHON_3_10,
handler='main.lambda_handler',
code=lambda_.Code.from_docker_build('lambda/accessibility_checker_before_remidiation'),
timeout=Duration.seconds(900),
memory_size=512,
architecture=lambda_arch,
environment={
"STACK_BASE": stack_base,
"ENV": env_name,
}
)

a11y_precheck.add_to_role_policy(
iam.PolicyStatement(
actions=["secretsmanager:GetSecretValue"],
resources=[f"arn:aws:secretsmanager:{region}:{account_id}:secret:/myapp/*"]
resources=[f"arn:aws:secretsmanager:{region}:{account_id}:secret:/{stack_base}/{env_name}/client_credentials*"]
))
bucket.grant_read_write(a11y_precheck)
a11y_precheck.add_to_role_policy(cloudwatch_logs_policy)
@@ -291,18 +330,23 @@ def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:

a11y_postcheck = lambda_.Function(
self,'accessibility_checker_after_remidiation',
function_name=f"{name_prefix}-a11y-postcheck",
runtime=lambda_.Runtime.PYTHON_3_10,
handler='main.lambda_handler',
code=lambda_.Code.from_docker_build('lambda/accessability_checker_after_remidiation'),
timeout=Duration.seconds(900),
memory_size=512,
architecture=lambda_arch,
environment={
"STACK_BASE": stack_base,
"ENV": env_name,
}
)

a11y_postcheck.add_to_role_policy(
iam.PolicyStatement(
actions=["secretsmanager:GetSecretValue"],
resources=[f"arn:aws:secretsmanager:{region}:{account_id}:secret:/myapp/*"]
resources=[f"arn:aws:secretsmanager:{region}:{account_id}:secret:/{stack_base}/{env_name}/client_credentials*"]
))
bucket.grant_read_write(a11y_postcheck)
a11y_postcheck.add_to_role_policy(cloudwatch_logs_policy)
@@ -323,23 +367,28 @@ def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:
parallel_state.branch(a11y_precheck_lambda_task)

log_group_stepfunctions = logs.LogGroup(self, "StepFunctionLogs",
log_group_name="/aws/states/MyStateMachine_PDFAccessibility",
log_group_name=f"/aws/states/{name_prefix}-state-machine",
retention=logs.RetentionDays.ONE_WEEK,
removal_policy=cdk.RemovalPolicy.DESTROY
)
# State Machine

state_machine = sfn.StateMachine(self, "MyStateMachine",
definition=parallel_state,
timeout=Duration.minutes(150),
logs=sfn.LogOptions(
destination=log_group_stepfunctions,
level=sfn.LogLevel.ALL
))
state_machine = sfn.StateMachine(
self,
"MyStateMachine",
definition=parallel_state,
timeout=Duration.minutes(150),
logs=sfn.LogOptions(
destination=log_group_stepfunctions,
level=sfn.LogLevel.ALL,
),
state_machine_name=f"{name_prefix}-state-machine",
)

# Lambda Function
split_pdf_lambda = lambda_.Function(
self, 'SplitPDF',
function_name=f"{name_prefix}-split-pdf",
runtime=lambda_.Runtime.PYTHON_3_10,
handler='main.lambda_handler',
code=lambda_.Code.from_docker_build("lambda/split_pdf"),
@@ -370,10 +419,10 @@ def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:
java_lambda_log_group_name = f"/aws/lambda/{java_lambda.function_name}"
add_title_lambda_log_group_name = f"/aws/lambda/{add_title_lambda.function_name}"
accessibility_checker_pre_log_group_name = f"/aws/lambda/{a11y_precheck.function_name}"
accessibility_checker_post_log_group_name = f"aws/lambda/{a11y_postcheck.function_name}"
accessibility_checker_post_log_group_name = f"/aws/lambda/{a11y_postcheck.function_name}"


dashboard = cloudwatch.Dashboard(self, "PDF_Processing_Dashboard", dashboard_name="PDF_Processing_Dashboard",
dashboard = cloudwatch.Dashboard(self, "PDF_Processing_Dashboard", dashboard_name=f"PDF_Processing_Dashboard_{name_prefix}",
variables=[cloudwatch.DashboardVariable(
id="filename",
type=cloudwatch.VariableType.PATTERN,
Expand Down Expand Up @@ -439,5 +488,24 @@ def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:
)

app = cdk.App()
PDFAccessibility(app, "PDFAccessibility")

# Read environment name from context or ENV
env_name = (app.node.try_get_context("env") or os.getenv("ENV", "dev")).lower()

# Determine account/region from context or CLI-provided defaults (optional)
account = app.node.try_get_context("account") or os.getenv("CDK_DEFAULT_ACCOUNT")
region = app.node.try_get_context("region") or os.getenv("CDK_DEFAULT_REGION")

# Stack name precedence: stackName > stackBase + env
stack_name_override = app.node.try_get_context("stackName") or os.getenv("STACK_NAME")
stack_base = app.node.try_get_context("stackBase") or os.getenv("STACK_BASE") or "pdfaccessibility"
stack_name = stack_name_override or f"{stack_base}-{env_name}"

# Allow CDK to infer env if not provided; use explicit env only when both exist
stack_kwargs = {}
if account and region:
stack_kwargs["env"] = cdk.Environment(account=account, region=region)

PDFAccessibility(app, stack_name, **stack_kwargs)

app.synth()
3 changes: 1 addition & 2 deletions docker_autotag/Dockerfile
@@ -1,8 +1,7 @@
# Use an official Python runtime as a parent image
FROM python:3.10-slim-buster

# Set environment variables
ENV AWS_REGION="us-east-1"
# Region is provided by task execution environment

# Set the working directory in the container to /app
WORKDIR /app
8 changes: 4 additions & 4 deletions docker_autotag/autotag.py
@@ -128,11 +128,11 @@ def get_secret(basefilename):
Returns:
tuple: (client_id, client_secret)
"""
secret_name = "/myapp/client_credentials"
region_name = "us-east-1"
secret_name = f"/{os.getenv('STACK_BASE','pdfaccessibility')}/{os.getenv('ENV','dev')}/client_credentials"
session = boto3.session.Session()
region_name = session.region_name


session = boto3.session.Session()
client = session.client(
service_name='secretsmanager',
region_name=region_name
@@ -658,4 +658,4 @@ def main():
sys.exit(1)

if __name__ == "__main__":
main()
main()
Binary file removed javascript_docker/.DS_Store
Binary file not shown.
3 changes: 1 addition & 2 deletions javascript_docker/Dockerfile
@@ -17,8 +17,7 @@ COPY package*.json ./
# Install dependencies (force better-sqlite3 to build from source if necessary)
RUN npm install

# Set environment variables
ENV AWS_REGION="us-east-1"
# Region is provided by task execution environment

# Copy the rest of the application source code
COPY . .
2 changes: 1 addition & 1 deletion javascript_docker/alt-text.js
@@ -52,7 +52,7 @@ const logger = winston.createLogger({
});

// Create an S3 client instance.
const s3Client = new S3Client({ region: "us-east-1" });
const s3Client = new S3Client({ region: process.env.AWS_REGION || process.env.AWS_DEFAULT_REGION });

function sleep(ms) {
return new Promise(resolve => setTimeout(resolve, ms));
Binary file removed lambda/.DS_Store
Binary file not shown.