Commit 64cbd78

github-actions[bot], Marfuen, and claude authored
fix: optimize api build and update dependencies
* perf(docker): optimize API build - strip unused deps, remove duplicate prisma generate

  - Strip root package.json of frontend deps before bun install (~650 fewer packages)
  - Use --ignore-scripts to skip husky and other lifecycle scripts
  - Remove duplicate prisma generate in production stage (builder already generates it)
  - Combine sequential RUN commands into fewer layers
  - Use COPY --chown instead of recursive chown -R (eliminates 311s step)
  - Fix .dockerignore to exclude nested node_modules (**/ instead of */)

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* perf(docker): enable ECR layer caching for CodeBuild

  Pull previous image before building and use --cache-from so Docker can
  reuse unchanged layers. Most builds will only rebuild from the source COPY
  step onwards, skipping bun install entirely.

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(docker): use denylist for root package.json stripping

  Delete only dependencies/devDependencies/scripts instead of allowlisting
  fields. Preserves overrides, resolutions, patchedDependencies, and any
  other fields that affect dependency resolution.

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(security): upgrade jspdf v3→v4, replace xlsx with exceljs

  - jspdf 3.x → 4.2.0: fixes 21 vulnerabilities (PDF injection, DoS, XSS).
    Zero code changes needed; the API is fully compatible.
  - xlsx → exceljs: fixes 7 vulnerabilities (ReDoS, prototype pollution).
    xlsx is abandoned with no patched version. exceljs was already a
    dependency. Migrated 3 files, updated callers to async.
  - Added unit tests for PDF generation (training cert, policy renderer)
    and Excel read/write (content extractor, export generator, vector store)

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Mariano Fuentes <marfuen98@gmail.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1 parent 80fe633 commit 64cbd78

18 files changed (+845 -159 lines changed)

.dockerignore

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 # Dependencies
 node_modules
-*/node_modules
+**/node_modules
 npm-debug.log*
 yarn-debug.log*
 yarn-error.log*
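
The one-character change in .dockerignore matters because `*` matches within a single path segment, while a leading `**/` matches any number of directories (including none). The sketch below illustrates that difference with a deliberately simplified matcher. The `toRegExp` and `isIgnored` helpers are hypothetical, written only for this illustration, and they do not reproduce Docker's actual matching rules beyond these two wildcards.

```typescript
// Simplified approximation of .dockerignore wildcards (illustration only):
// "*" matches inside one path segment, "**/" matches zero or more directories.
function toRegExp(pattern: string): RegExp {
  const escaped = pattern
    .split("/")
    .map((segment) =>
      segment === "**"
        ? "**"
        : segment.replace(/[.+^${}()|[\]\\]/g, "\\$&").replace(/\*/g, "[^/]*"),
    )
    .join("/");
  // "**/" may stand for any number of leading directories, including none.
  const source = escaped.replace(/\*\*\//g, "(?:.*/)?");
  return new RegExp(`^${source}$`);
}

function isIgnored(pattern: string, path: string): boolean {
  return toRegExp(pattern).test(path);
}

console.log(isIgnored("*/node_modules", "apps/node_modules")); // true
console.log(isIgnored("*/node_modules", "apps/api/node_modules")); // false
console.log(isIgnored("**/node_modules", "apps/api/node_modules")); // true
```

Under the old pattern, a directory such as `apps/api/node_modules` would still be sent to the build context, which is the case the new pattern closes.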

apps/api/Dockerfile.multistage

Lines changed: 34 additions & 25 deletions
@@ -1,5 +1,5 @@
 # =============================================================================
-# STAGE 1: Dependencies - Install workspace dependencies
+# STAGE 1: Dependencies - Install only what the API needs
 # =============================================================================
 FROM oven/bun:1.2.8 AS deps

@@ -8,7 +8,16 @@ WORKDIR /app
 # Copy root workspace config
 COPY package.json bun.lock ./

-# Copy all workspace package.json files
+# Strip root package.json to only keep workspaces config.
+# The root has frontend deps (design-system, react-dnd, sharp, semantic-release, etc.)
+# that the API doesn't need. Removing them cuts ~800 packages from the install.
+RUN cat package.json | bun -e " \
+    const pkg = JSON.parse(await Bun.stdin.text()); \
+    delete pkg.dependencies; delete pkg.devDependencies; delete pkg.scripts; \
+    console.log(JSON.stringify(pkg, null, 2));" > package.min.json \
+    && mv package.min.json package.json
+
+# Copy only the workspace package.json files the API depends on
 COPY packages/auth/package.json ./packages/auth/
 COPY packages/db/package.json ./packages/db/
 COPY packages/utils/package.json ./packages/utils/
@@ -20,16 +29,23 @@ COPY packages/company/package.json ./packages/company/
 # Copy API package.json
 COPY apps/api/package.json ./apps/api/

-# Install all dependencies (including workspace deps)
-RUN bun install
+# Install dependencies - skip lifecycle scripts (husky, etc. not needed in Docker)
+RUN bun install --ignore-scripts

 # =============================================================================
 # STAGE 2: Builder - Build workspace packages and NestJS app
 # =============================================================================
-FROM deps AS builder
+FROM oven/bun:1.2.8 AS builder

 WORKDIR /app

+# Copy node_modules first (from deps stage), then source on top.
+# This avoids conflicts between workspace symlinks and local node_modules
+# that get included from the build context.
+COPY --from=deps /app/node_modules ./node_modules
+COPY --from=deps /app/package.json ./package.json
+COPY --from=deps /app/bun.lock ./bun.lock
+
 # Copy workspace packages source
 COPY packages/auth ./packages/auth
 COPY packages/db ./packages/db
@@ -42,23 +58,19 @@ COPY packages/company ./packages/company
 # Copy API source
 COPY apps/api ./apps/api

-# Bring in node_modules from deps stage
-COPY --from=deps /app/node_modules ./node_modules
-
-# Build workspace packages
-RUN cd packages/auth && bun run build && cd ../..
-RUN cd packages/db && bun run build && cd ../..
-RUN cd packages/integration-platform && bun run build && cd ../..
-RUN cd packages/email && bun run build && cd ../..
-RUN cd packages/company && bun run build && cd ../..
+# Build db first - generates Prisma client needed by other packages
+RUN cd packages/db && bun run build

-# Generate Prisma client for API (copy schema and generate)
-RUN cd packages/db && node scripts/combine-schemas.js && cd ../..
-RUN cp packages/db/dist/schema.prisma apps/api/prisma/schema.prisma
-RUN cd apps/api && bunx prisma generate
+# Build remaining workspace packages
+RUN cd packages/auth && bun run build \
+    && cd ../integration-platform && bun run build \
+    && cd ../email && bun run build \
+    && cd ../company && bun run build

-# Build NestJS application (skip prebuild since we already generated Prisma)
-RUN cd apps/api && bunx nest build
+# Generate Prisma schema for API and build NestJS app
+RUN cd packages/db && node scripts/combine-schemas.js \
+    && cp /app/packages/db/dist/schema.prisma /app/apps/api/prisma/schema.prisma \
+    && cd /app/apps/api && bunx prisma generate && bunx nest build

 # =============================================================================
 # STAGE 3: Production Runtime
@@ -77,7 +89,7 @@ RUN apt-get update && apt-get install -y --no-install-recommends wget openssl &&
 # Copy built NestJS app
 COPY --from=builder --chown=nestjs:nestjs /app/apps/api/dist ./dist

-# Copy prisma files
+# Copy prisma schema (for reference only - client is already generated in node_modules)
 COPY --from=builder --chown=nestjs:nestjs /app/apps/api/prisma ./prisma

 # Copy package.json (for any runtime needs)
@@ -92,7 +104,7 @@ COPY --from=builder --chown=nestjs:nestjs /app/packages/tsconfig ./packages/tsco
 COPY --from=builder --chown=nestjs:nestjs /app/packages/email ./packages/email
 COPY --from=builder --chown=nestjs:nestjs /app/packages/company ./packages/company

-# Copy production node_modules (includes symlinks to workspace packages above)
+# Copy production node_modules (includes Prisma client already generated for linux/amd64)
 COPY --from=builder --chown=nestjs:nestjs /app/node_modules ./node_modules

 # Set production environment
@@ -101,9 +113,6 @@ ENV PORT=3333

 USER nestjs

-# Regenerate Prisma client for this runtime environment (run as nestjs to keep consistent ownership)
-RUN npx prisma generate --schema=./prisma/schema.prisma
-
 EXPOSE 3333

 # Health check
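
The package.json stripping step in the deps stage uses a denylist: it deletes `dependencies`, `devDependencies`, and `scripts` and leaves every other field in place. A minimal TypeScript sketch of that idea follows. The `stripRootPackage` helper and the sample manifest (package names and versions included) are hypothetical and only illustrate why fields such as `overrides` survive under this approach.

```typescript
type PackageJson = Record<string, unknown>;

// Fields removed from the root manifest; everything else is preserved.
const STRIPPED_FIELDS = ["dependencies", "devDependencies", "scripts"] as const;

function stripRootPackage(pkg: PackageJson): PackageJson {
  const copy: PackageJson = { ...pkg }; // shallow copy so the input is not mutated
  for (const field of STRIPPED_FIELDS) {
    delete copy[field];
  }
  return copy;
}

// Hypothetical root manifest, for illustration only.
const rootPackage: PackageJson = {
  name: "example-root",
  workspaces: ["apps/*", "packages/*"],
  dependencies: { "react-dnd": "^16.0.0" },
  devDependencies: { husky: "^9.0.0" },
  scripts: { prepare: "husky" },
  overrides: { "some-package": "1.2.3" },
};

const stripped = stripRootPackage(rootPackage);
console.log(Object.keys(stripped)); // [ 'name', 'workspaces', 'overrides' ]
```

An allowlist would instead have to name every field that influences dependency resolution, so a newly introduced field could be dropped silently. The denylist avoids that failure mode.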

apps/api/buildspec.multistage.yml

Lines changed: 12 additions & 8 deletions
@@ -1,7 +1,7 @@
 version: 0.2

-# Simplified buildspec that uses multi stage Docker build
-# al building happens inside Docker - CodeBuild just orchestrates ECR/ECS
+# Simplified buildspec that uses multi-stage Docker build.
+# All building happens inside Docker - CodeBuild just orchestrates ECR/ECS.

 phases:
   pre_build:
@@ -10,12 +10,21 @@ phases:
       - aws ecr get-login-password --region $AWS_DEFAULT_REGION | docker login --username AWS --password-stdin $AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com
       - COMMIT_HASH=$(echo $CODEBUILD_RESOLVED_SOURCE_VERSION | cut -c 1-7)
       - IMAGE_TAG=${COMMIT_HASH:=latest}
+      # Pull latest image for Docker layer cache (ignore failure on first build)
+      - docker pull $ECR_REPOSITORY_URI:latest || true

   build:
     commands:
       - echo "Building Docker image with multi-stage build..."
       - cd apps/api
-      - docker build --build-arg BUILDKIT_INLINE_CACHE=1 --target production -f Dockerfile.multistage -t $ECR_REPOSITORY_URI:$IMAGE_TAG ../..
+      - >-
+        docker build
+        --build-arg BUILDKIT_INLINE_CACHE=1
+        --cache-from $ECR_REPOSITORY_URI:latest
+        --target production
+        -f Dockerfile.multistage
+        -t $ECR_REPOSITORY_URI:$IMAGE_TAG
+        ../..
       - docker tag $ECR_REPOSITORY_URI:$IMAGE_TAG $ECR_REPOSITORY_URI:latest

   post_build:
@@ -27,12 +36,7 @@ phases:
       - aws ecs update-service --cluster $ECS_CLUSTER_NAME --service $ECS_SERVICE_NAME --force-new-deployment
       - 'printf "[{\"name\":\"%s-container\",\"imageUri\":\"%s\"}]" api $ECR_REPOSITORY_URI:$IMAGE_TAG > imagedefinitions.json'

-cache:
-  paths:
-    - '/root/.docker/buildx/cache/**/*'
-
 artifacts:
   files:
     - imagedefinitions.json
   name: ${APP_NAME}-build
-
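
The image tag in the pre_build phase is derived in two shell steps: `cut -c 1-7` keeps the first seven characters of the resolved source version, and `${COMMIT_HASH:=latest}` falls back to `latest` when that result is unset or empty. The same logic can be sketched in TypeScript as follows. The `imageTagFor` function is a hypothetical helper for illustration, not part of the buildspec.

```typescript
// Mirrors the two shell steps: take the first 7 characters of the source
// version, then fall back to "latest" if nothing is left.
function imageTagFor(resolvedSourceVersion: string | undefined): string {
  const commitHash = (resolvedSourceVersion ?? "").slice(0, 7);
  return commitHash.length > 0 ? commitHash : "latest";
}

console.log(imageTagFor("64cbd78abcdef0123456789")); // 64cbd78
console.log(imageTagFor(undefined)); // latest
```

Because the build also tags every image as `latest`, the `docker pull ... || true` step has a cache source on every build after the first one.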

apps/api/package.json

Lines changed: 1 addition & 2 deletions
@@ -49,7 +49,7 @@
     "express": "^4.21.2",
     "helmet": "^8.1.0",
     "jose": "^6.0.12",
-    "jspdf": "^3.0.3",
+    "jspdf": "^4.2.0",
     "mammoth": "^1.8.0",
     "nanoid": "^5.1.6",
     "pdf-lib": "^1.17.1",
@@ -63,7 +63,6 @@
     "safe-stable-stringify": "^2.5.0",
     "stripe": "^20.4.0",
     "swagger-ui-express": "^5.0.1",
-    "xlsx": "^0.18.5",
     "zod": "^4.0.14"
   },
   "devDependencies": {

apps/api/src/questionnaire/questionnaire.service.spec.ts

Lines changed: 1 addition & 1 deletion
@@ -425,7 +425,7 @@ describe('QuestionnaireService', () => {
         mimeType: 'text/csv',
         filename: 'test.csv',
       };
-      (generateExportFile as jest.Mock).mockReturnValue(mockExport);
+      (generateExportFile as jest.Mock).mockResolvedValue(mockExport);

       const result = await service.exportById({
         questionnaireId: 'q1',

apps/api/src/questionnaire/questionnaire.service.ts

Lines changed: 3 additions & 3 deletions
@@ -148,7 +148,7 @@ export class QuestionnaireService {
       const zip = new AdmZip();

       for (const format of formats) {
-        const exportFile = generateExportFile(
+        const exportFile = await generateExportFile(
           answered.map((a) => ({ question: a.question, answer: a.answer })),
           format,
           vendorName,
@@ -182,7 +182,7 @@ export class QuestionnaireService {
     }

     // Single format export (default behavior)
-    const exportFile = generateExportFile(
+    const exportFile = await generateExportFile(
       answered.map((a) => ({ question: a.question, answer: a.answer })),
       dto.format as ExportFormat,
       vendorName,
@@ -433,7 +433,7 @@ export class QuestionnaireService {
       format: dto.format,
     });

-    return generateExportFile(
+    return await generateExportFile(
       questionsAndAnswers,
       dto.format as ExportFormat,
       questionnaire.filename,
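
The `await` added at each call site follows from `generateExportFile` becoming async after the move to exceljs: a caller that skips `await` holds a Promise rather than the export object. The sketch below shows the difference with a hypothetical stand-in (`generateExportFileSketch` and the `ExportFileSketch` shape are assumptions for illustration, not the real signature). The cast only simulates what untyped code would see; the TypeScript compiler would normally reject the un-awaited property access.

```typescript
interface ExportFileSketch {
  filename: string;
  mimeType: string;
}

// Hypothetical stand-in for an export generator that is now async.
async function generateExportFileSketch(baseName: string): Promise<ExportFileSketch> {
  return { filename: `${baseName}.csv`, mimeType: "text/csv" };
}

// Without `await`, the caller reads a property off the Promise itself.
function filenameWithoutAwait(baseName: string): string | undefined {
  const pending = generateExportFileSketch(baseName) as unknown as Partial<ExportFileSketch>;
  return pending.filename; // a Promise has no `filename` property
}

// With `await`, the caller receives the resolved export object.
async function filenameWithAwait(baseName: string): Promise<string> {
  const exportFile = await generateExportFileSketch(baseName);
  return exportFile.filename;
}

console.log(filenameWithoutAwait("q1")); // undefined
filenameWithAwait("q1").then((name) => console.log(name)); // q1.csv
```

The matching test change from `mockReturnValue` to `mockResolvedValue` keeps the mock's return type in line with the real function, which now returns a Promise.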
Lines changed: 100 additions & 0 deletions
@@ -0,0 +1,100 @@
+import { extractContentFromFile } from './content-extractor';
+import ExcelJS from 'exceljs';
+
+// Mock AI dependencies
+jest.mock('@ai-sdk/openai', () => ({ openai: jest.fn() }));
+jest.mock('@ai-sdk/anthropic', () => ({ anthropic: jest.fn() }));
+jest.mock('@ai-sdk/groq', () => ({ createGroq: jest.fn(() => jest.fn()) }));
+jest.mock('ai', () => ({
+  generateText: jest.fn(),
+  generateObject: jest.fn(),
+  jsonSchema: jest.fn((s) => s),
+}));
+
+async function createTestExcelBuffer(
+  sheets: { name: string; rows: (string | number)[][] }[],
+): Promise<Buffer> {
+  const workbook = new ExcelJS.Workbook();
+  for (const sheet of sheets) {
+    const ws = workbook.addWorksheet(sheet.name);
+    for (const row of sheet.rows) {
+      ws.addRow(row);
+    }
+  }
+  const arrayBuffer = await workbook.xlsx.writeBuffer();
+  return Buffer.from(arrayBuffer);
+}
+
+describe('content-extractor: extractContentFromFile', () => {
+  const XLSX_MIME =
+    'application/vnd.openxmlformats-officedocument.spreadsheetml.sheet';
+
+  it('should extract content from an Excel file with headers', async () => {
+    const buffer = await createTestExcelBuffer([
+      {
+        name: 'Survey',
+        rows: [
+          ['Question', 'Response', 'Comment'],
+          ['Do you agree?', 'Yes', 'Fully agree'],
+          ['Rating?', '5', ''],
+        ],
+      },
+    ]);
+
+    const base64 = buffer.toString('base64');
+    const result = await extractContentFromFile(base64, XLSX_MIME);
+
+    expect(result).toContain('Question');
+    expect(result).toContain('Do you agree?');
+    expect(result).toContain('Yes');
+    expect(result).toContain('Rating?');
+  });
+
+  it('should extract content from multiple sheets', async () => {
+    const buffer = await createTestExcelBuffer([
+      { name: 'General', rows: [['Info', 'Details'], ['Name', 'Acme Corp']] },
+      { name: 'Security', rows: [['Control', 'Status'], ['MFA', 'Enabled']] },
+    ]);
+
+    const base64 = buffer.toString('base64');
+    const result = await extractContentFromFile(base64, XLSX_MIME);
+
+    expect(result).toContain('Acme Corp');
+    expect(result).toContain('MFA');
+  });
+
+  it('should handle CSV files', async () => {
+    const csv = 'question,answer\nWhat is 2+2?,4\n';
+    const base64 = Buffer.from(csv).toString('base64');
+
+    const result = await extractContentFromFile(base64, 'text/csv');
+
+    expect(result).toContain('question,answer');
+    expect(result).toContain('What is 2+2?,4');
+  });
+
+  it('should handle plain text files', async () => {
+    const text = 'Some compliance document content';
+    const base64 = Buffer.from(text).toString('base64');
+
+    const result = await extractContentFromFile(base64, 'text/plain');
+
+    expect(result).toBe(text);
+  });
+
+  it('should throw for Word documents', async () => {
+    const base64 = Buffer.from('fake').toString('base64');
+
+    await expect(
+      extractContentFromFile(base64, 'application/msword'),
+    ).rejects.toThrow('Word documents');
+  });
+
+  it('should throw for unsupported types', async () => {
+    const base64 = Buffer.from('data').toString('base64');
+
+    await expect(
+      extractContentFromFile(base64, 'application/octet-stream'),
+    ).rejects.toThrow('Unsupported file type');
+  });
+});
