Merged
2 changes: 1 addition & 1 deletion bin/start-harmony
@@ -1,6 +1,6 @@
#!/bin/bash

# This script starts Harmomny in the local kubernetes cluster - it is called from the start-all
# This script starts Harmony in the local kubernetes cluster - it is called from the start-all
# script. The environment variables for Harmony are read from k8s configmaps and secrets.

env_save=$(export -p)
7 changes: 4 additions & 3 deletions docs/guides/adapting-new-services.md
@@ -102,8 +102,8 @@ The structure of an entry in the [services.yml](../../config/services.yml) file
```yaml
- name: harmony/service-example # A unique identifier string for the service, conventionally <team>/<service>
data_operation_version: '0.21.0' # The version of the data-operation messaging schema to use
has_granule_limit: true # Optional flag indicating whether we will impose granule limts for the request. Default to true.
default_sync: false # Optional flag indicating whether we will force the request to run synchrously. Default to false.
has_granule_limit: true # Optional flag indicating whether we will impose granule limits for the request. Default to true.
default_sync: false # Optional flag indicating whether we will force the request to run synchronously. Default to false.
type: # Configuration for service invocation
<<: *default-turbo-config # To reduce boilerplate, services.yml includes default configuration suitable for all Docker based services.
params:
@@ -134,6 +134,7 @@ The structure of an entry in the [services.yml](../../config/services.yml) file
- image/gif
reprojection: true # The service supports reprojection
validate_variables: true # Whether to validate the requested variables exist in the CMR. Defaults to true.
external_validation_url: http://example.com # Optional endpoint to be called to validate the user making a request
Contributor commented:

I'm wondering if we should try to future-proof it to allow for multiple validations and different types of validations, e.g.

validations:
  - url: !Env ${FOO_ENDPOINT}
  - docker: !Env ${FOO_IMAGE}
  - code: /app/validations/foo.ts

It's probably fine to keep it just a single URL for now and change it if needed in the future.

Contributor (author) replied:

Yeah, I think we will keep it simple for now and support those use cases as needed.

steps:
- image: !Env ${QUERY_CMR_IMAGE} # The image to use for the first step in the chain
is_sequential: true # Required for query-cmr
@@ -184,7 +185,7 @@ Here we have the query-cmr service (this service is the first in every current w
### Sequential Steps
Most steps will produce all of the pieces of work (known as work-items) for a service immediately when the step begins. This allows all of the work-items to be worked in parallel. It is possible, however, for new work-items for the same service to be produced as the step is being worked. In this case, the work-items must be worked sequentially. Steps that must be worked sequentially should include `is_sequential: true` in their definition.

An example of this is the query-cmr service. Each invocation of the query-cmr service can only return up to 2000 granules (due to the CMR page size limit), so, if the job has more granules than that, query-cmr is invoked multiple times. Because the number of granules reported by the CMR may change at any time, we cannot know ahead of time exactly how many invocations we need. So, if the job has more granules than 2000, query-cmr is invoked sequentially until all granules are returned.
An example of this is the query-cmr service. Each invocation of the query-cmr service can only return up to 2000 granules (due to the CMR page size limit), so, if the job has more granules than that, query-cmr is invoked multiple times. This must be done sequentially due to the way the CMR uses a scroll ID for paging.

For most services `is_sequential: true` is not necessary.
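
To illustrate why cursor-based paging forces sequential invocation, here is a minimal sketch. It does not use the real CMR API: `makeFakeCatalog`, `collectSequentially`, and the `Page` shape are hypothetical names made up for this example, and representing "no more results" as a `null` scroll ID is an assumption. The only value taken from the text is the 2000-granule page size.

```typescript
// Hypothetical page shape: a batch of granule IDs plus the cursor needed for the next call.
interface Page {
  granules: string[];
  scrollId: string | null; // null once nothing remains (an assumption of this sketch)
}

const PAGE_SIZE = 2000; // page size limit cited in the text

// Stand-in for a paged catalog query. This is NOT the real CMR API.
function makeFakeCatalog(total: number): (scrollId: string | null) => Page {
  return (scrollId) => {
    const start = scrollId === null ? 0 : Number(scrollId);
    const end = Math.min(start + PAGE_SIZE, total);
    const granules: string[] = [];
    for (let i = start; i < end; i += 1) granules.push(`G${i}`);
    return { granules, scrollId: end < total ? String(end) : null };
  };
}

// Each call needs the cursor returned by the previous call,
// so the invocations cannot be issued in parallel.
function collectSequentially(
  fetchPage: (scrollId: string | null) => Page,
): { granules: string[]; invocations: number } {
  const granules: string[] = [];
  let invocations = 0;
  let scrollId: string | null = null;
  do {
    const page: Page = fetchPage(scrollId);
    invocations += 1;
    granules.push(...page.granules);
    scrollId = page.scrollId;
  } while (scrollId !== null);
  return { granules, invocations };
}
```

With a simulated job of 4500 granules, `collectSequentially(makeFakeCatalog(4500))` makes three invocations, one per page, each depending on the cursor from the one before.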

@@ -1,4 +1,5 @@
import { RequestHandler } from 'express';

import HarmonyRequest from '../models/harmony-request';
import { getUserIdRequest } from '../util/edl-api';

48 changes: 48 additions & 0 deletions services/harmony/app/middleware/external-validation.ts
@@ -0,0 +1,48 @@
import axios from 'axios';
import { NextFunction, Response } from 'express';

import HarmonyRequest from '../models/harmony-request';
import { ExternalValidationError } from '../util/errors';

/**
 * Middleware to validate users against an external endpoint configured for a service.
 * @param req - The client request, containing an operation
 * @param res - The client response
 * @param next - The next function in the middleware chain
 */
export async function externalValidation(
  req: HarmonyRequest, res: Response, next: NextFunction,
): Promise<void> {
  const { operation, context } = req;
  const url = context?.serviceConfig?.external_validation_url;
  if (!url) return next();

  req.context.logger.info('timing.external-validation.start');
  const startTime = new Date().getTime();
  try {
    await axios.post(
Contributor commented:

I'd like to add an info level timing message for calling the validation URL (see the other places we log durationMs in the code).

Contributor (author) replied:

Done.

      url,
      operation,
      {
        headers: {
          'Authorization': `Bearer: ${req.accessToken}`,
        },
      },
    );
  } catch (e) {
    req.context.logger.error('External validation failed');
    if (e.response) {
      req.context.logger.error(`Validation status: ${e.response.status}`);
      req.context.logger.error(`Validation response: ${JSON.stringify(e.response.data, null, 2)}`);
Member commented:

nit: log the response status here as well.

Contributor (author) replied:

Done.

      return next(new ExternalValidationError(e.response.data, e.response.status));
Contributor commented:

We should log the response body and status code the validation URL returned - I didn't see either in the logs.

Contributor (author) replied:

Done.

    } else {
      req.context.logger.error(`Error calling validation endpoint: ${url}.`);
      return next(e);
    }
  } finally {
    const durationMs = new Date().getTime() - startTime;
    req.context.logger.info('timing.external-validation.end', { durationMs });
  }

  return next();
}
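
For anyone implementing the other side of this call, the following is a rough sketch of the decision logic an external validation endpoint might apply. It is based only on what the middleware above does: it POSTs the operation as the request body with an `Authorization` header, treats a 2xx response as success (axios rejects on other statuses by default), and forwards a non-2xx status and body as the error. Everything else is hypothetical: `validateRequest`, `extractToken`, `ALLOWED_TOKENS`, and the chosen status codes are illustrative, not part of Harmony. Note that the middleware sends `Bearer:` with a colon, which differs from the usual `Bearer <token>` form, so the sketch accepts both.

```typescript
// Hypothetical result of a validation decision.
interface ValidationResult {
  status: number; // HTTP status the endpoint would respond with
  body: string;   // response body; on failure the middleware forwards it as the error message
}

// Hypothetical allow-list. A real endpoint would resolve the token to a user
// and apply its own rules.
const ALLOWED_TOKENS = new Set(['token-for-alice']);

// Accepts both "Bearer <token>" and the "Bearer: <token>" form sent by the middleware.
function extractToken(authorization: string | undefined): string | null {
  if (!authorization) return null;
  const match = /^Bearer:?\s+(\S+)$/.exec(authorization.trim());
  return match ? match[1] : null;
}

function validateRequest(authorization: string | undefined, operation: unknown): ValidationResult {
  const token = extractToken(authorization);
  if (token === null) {
    return { status: 401, body: 'Missing or malformed Authorization header' };
  }
  if (!ALLOWED_TOKENS.has(token)) {
    return { status: 403, body: 'User is not permitted to use this service' };
  }
  if (typeof operation !== 'object' || operation === null) {
    return { status: 400, body: 'Request body must be a JSON operation' };
  }
  return { status: 200, body: 'OK' };
}
```

Keeping the decision in a pure function like this makes it easy to test separately from whatever HTTP framework serves the endpoint.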
27 changes: 15 additions & 12 deletions services/harmony/app/models/services/base-service.ts
@@ -1,23 +1,25 @@
import _ from 'lodash';
import { Logger } from 'winston';
import { v4 as uuid } from 'uuid';
import WorkItem from '../work-item';
import WorkflowStep from '../workflow-steps';
import InvocationResult from './invocation-result';
import { Job, JobStatus, statesToDefaultMessages } from '../job';
import DataOperation from '../data-operation';
import { defaultObjectStore } from '../../util/object-store';
import { RequestValidationError, ServerError } from '../../util/errors';
import { Logger } from 'winston';

import { joinTexts } from '@harmony/util/string';

import { QUERY_CMR_SERVICE_REGEX } from '../../backends/workflow-orchestration/util';
import { makeWorkScheduleRequest } from '../../backends/workflow-orchestration/work-item-polling';
import db from '../../util/db';
import env from '../../util/env';
import { WorkItemStatus } from '../work-item-interface';
import { RequestValidationError, ServerError } from '../../util/errors';
import { getRequestMetric } from '../../util/metrics';
import { defaultObjectStore } from '../../util/object-store';
import { getRequestUrl } from '../../util/url';
import DataOperation from '../data-operation';
import HarmonyRequest from '../harmony-request';
import { Job, JobStatus, statesToDefaultMessages } from '../job';
import UserWork from '../user-work';
import { joinTexts } from '@harmony/util/string';
import { makeWorkScheduleRequest } from '../../backends/workflow-orchestration/work-item-polling';
import { QUERY_CMR_SERVICE_REGEX } from '../../backends/workflow-orchestration/util';
import WorkItem from '../work-item';
import { WorkItemStatus } from '../work-item-interface';
import WorkflowStep from '../workflow-steps';
import InvocationResult from './invocation-result';

export interface ServiceCapabilities {
concatenation?: boolean;
@@ -85,6 +87,7 @@ export interface ServiceConfig<ServiceParamType> {
  maximum_sync_granules?: number;
  steps?: ServiceStep[];
  validate_variables?: boolean;
  external_validation_url?: string;
}

/**
85 changes: 49 additions & 36 deletions services/harmony/app/routers/router.ts
@@ -1,53 +1,65 @@
import process from 'process';
import cookieParser from 'cookie-parser';
import express, { json, RequestHandler } from 'express';
import asyncHandler from 'express-async-handler';
import cookieParser from 'cookie-parser';
import swaggerUi from 'swagger-ui-express';
import * as yaml from 'js-yaml';
import log from '../util/log';
import process from 'process';
import swaggerUi from 'swagger-ui-express';

// Middleware requires in outside-in order
import shapefileUpload from '../middleware/shapefile-upload';
import earthdataLoginTokenAuthorizer from '../middleware/earthdata-login-token-authorizer';
import earthdataLoginOauthAuthorizer from '../middleware/earthdata-login-oauth-authorizer';
import { admin, core } from '../middleware/permission-groups';
import wmsFrontend from '../frontends/wms';
import { getJobsListing, getJobStatus, cancelJob, resumeJob, pauseJob, skipJobPreview, skipJobsPreview, cancelJobs, resumeJobs, pauseJobs } from '../frontends/jobs';
import { getJobs, getJob, getWorkItemsTable, getJobLinks, getWorkItemLogs, retry, getWorkItemTableRow, redirectWithoutTrailingSlash, getJobsTable } from '../frontends/workflow-ui';
import { getStacCatalog, getStacItem } from '../frontends/stac';
import serviceInvoker from '../backends/service-invoker';
import { getCollectionCapabilitiesJson } from '../frontends/capabilities';
import { cloudAccessJson, cloudAccessSh } from '../frontends/cloud-access';
import { setLogLevel } from '../frontends/configuration';
import docsPage from '../frontends/docs/docs';
import { getAdminHealth, getHealth } from '../frontends/health';
import {
cancelJob, cancelJobs, getJobsListing, getJobStatus, pauseJob, pauseJobs, resumeJob, resumeJobs,
skipJobPreview, skipJobsPreview,
} from '../frontends/jobs';
import { addJobLabels, deleteJobLabels } from '../frontends/labels';
import landingPage from '../frontends/landing-page';
import * as ogcCoverageApi from '../frontends/ogc-coverages/index';
import * as ogcEdrApi from '../frontends/ogc-edr/index';
import getRequestMetrics from '../frontends/request-metrics';
import {
getDeploymentLogs, getServiceDeployment, getServiceDeployments, getServiceDeploymentsState,
getServiceImageTag, getServiceImageTags, setServiceDeploymentsState, updateServiceImageTag,
} from '../frontends/service-image-tags';
import { getServiceResult } from '../frontends/service-results';
import { getStacCatalog, getStacItem } from '../frontends/stac';
import { getStagingBucketPolicy } from '../frontends/staging-bucket-policy';
import getVersions from '../frontends/versions';
import wmsFrontend from '../frontends/wms';
import {
getJob, getJobLinks, getJobs, getJobsTable, getWorkItemLogs, getWorkItemsTable,
getWorkItemTableRow, redirectWithoutTrailingSlash, retry,
} from '../frontends/workflow-ui';
import cmrGranuleLocator from '../middleware/cmr-granule-locator';
import {
postServiceConcatenationHandler, preServiceConcatenationHandler,
} from '../middleware/concatenation';
import earthdataLoginOauthAuthorizer from '../middleware/earthdata-login-oauth-authorizer';
import earthdataLoginSkipped from '../middleware/earthdata-login-skipped';
import earthdataLoginTokenAuthorizer from '../middleware/earthdata-login-token-authorizer';
import extendDefault from '../middleware/extend';
import { externalValidation } from '../middleware/external-validation';
import handleJobIDParameter from '../middleware/job-id';
import handleLabelParameter from '../middleware/label';
import parameterValidation from '../middleware/parameter-validation';
import { admin, core } from '../middleware/permission-groups';
import validateRestrictedVariables from '../middleware/restricted-variables';
import chooseService from '../middleware/service-selection';
import shapefileConverter from '../middleware/shapefile-converter';
import { NotFoundError } from '../util/errors';
import * as ogcCoverageApi from '../frontends/ogc-coverages/index';
import * as ogcEdrApi from '../frontends/ogc-edr/index';
import { cloudAccessJson, cloudAccessSh } from '../frontends/cloud-access';
import landingPage from '../frontends/landing-page';
import { setLogLevel } from '../frontends/configuration';
import getVersions from '../frontends/versions';
import serviceInvoker from '../backends/service-invoker';
// Middleware requires in outside-in order
import shapefileUpload from '../middleware/shapefile-upload';
import HarmonyRequest, { addRequestContextToOperation } from '../models/harmony-request';
import { getServiceImageTag, getServiceImageTags, updateServiceImageTag, getServiceDeploymentsState, setServiceDeploymentsState, getServiceDeployment, getServiceDeployments, getDeploymentLogs } from '../frontends/service-image-tags';
import cmrCollectionReader = require('../middleware/cmr-collection-reader');
import cmrUmmCollectionReader = require('../middleware/cmr-umm-collection-reader');
import env from '../util/env';
import { postServiceConcatenationHandler, preServiceConcatenationHandler } from '../middleware/concatenation';
import getRequestMetrics from '../frontends/request-metrics';
import { getStagingBucketPolicy } from '../frontends/staging-bucket-policy';
import { NotFoundError } from '../util/errors';
import { parseGridMiddleware } from '../util/grids';
import docsPage from '../frontends/docs/docs';
import { getCollectionCapabilitiesJson } from '../frontends/capabilities';
import extendDefault from '../middleware/extend';
import { getAdminHealth, getHealth } from '../frontends/health';
import handleLabelParameter from '../middleware/label';
import { addJobLabels, deleteJobLabels } from '../frontends/labels';
import handleJobIDParameter from '../middleware/job-id';
import earthdataLoginSkipped from '../middleware/earthdata-login-skipped';
import log from '../util/log';
import { validateAndSetVariables } from '../util/variables';
import validateRestrictedVariables from '../middleware/restricted-variables';

import cmrCollectionReader = require('../middleware/cmr-collection-reader');
import cmrUmmCollectionReader = require('../middleware/cmr-umm-collection-reader');
export interface RouterConfig {
PORT?: string | number; // The port to run the frontend server on
BACKEND_PORT?: string | number; // The port to run the backend server on
@@ -216,6 +228,7 @@ export default function router({ USE_EDL_CLIENT_APP = 'false' }: RouterConfig):
  result.use(logged(preServiceConcatenationHandler));
  result.use(logged(chooseService));
  result.use(logged(postServiceConcatenationHandler));
  result.use(logged(externalValidation));
  result.use(logged(validateAndSetVariables));
  result.use(logged(validateRestrictedVariables));

10 changes: 8 additions & 2 deletions services/harmony/app/util/errors.ts
@@ -28,6 +28,12 @@ export class ForbiddenError extends HttpError {
  }
}

export class ExternalValidationError extends HttpError {
  constructor(message = 'External validation failed', code = 403) {
    super(code, message);
  }
}

export class ServerError extends HttpError {
  constructor(message = 'An unexpected error occurred') {
    super(500, message);
@@ -68,7 +74,7 @@ export function getEndUserErrorMessage(error: HttpError): string {

/**
* Returns the appropriate http status code for the provided error
* @param error - The error that occured
* @param error - The error that occurred
* @returns the http status code
*/
export function getHttpStatusCode(error: HttpError): number {
@@ -83,7 +89,7 @@ export function getHttpStatusCode(error: HttpError): number {

/**
* Returns the appropriate string code to use for the JSON response to indicate a type of error
* @param error - The error that occured
* @param error - The error that occurred
* @returns a string indicating the class of error that occurred
*/
export function getCodeForError(error: HttpError): string {
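
As a rough illustration of how the new `ExternalValidationError` carries its status code, here is a self-contained sketch. The `HttpError` base class is not shown in this diff, so the version below is an assumption inferred from the `super(code, message)` calls: it is presumed to store the first argument as a numeric `code`. The real class may differ.

```typescript
// Assumed shape of the base class (not shown in the diff).
class HttpError extends Error {
  code: number;

  constructor(code: number, message: string) {
    super(message);
    this.code = code;
  }
}

// Same constructor signature as the class added in this diff:
// message first, then an optional status code that defaults to 403.
class ExternalValidationError extends HttpError {
  constructor(message = 'External validation failed', code = 403) {
    super(code, message);
  }
}

// With no arguments, the error falls back to a generic 403.
const defaultError = new ExternalValidationError();

// The middleware passes the upstream body and status through,
// so a 401 from the validation endpoint stays a 401.
const passthroughError = new ExternalValidationError('Token expired', 401);
```

The argument order is worth noting: the subclass takes `(message, code)` while the base class takes `(code, message)`, so callers of `ExternalValidationError` pass the message first.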
2 changes: 1 addition & 1 deletion services/harmony/app/util/queue/queue-factory.ts
@@ -1,5 +1,5 @@
import env from '../env';
import { WorkItemQueueType, Queue } from './queue';
import { Queue, WorkItemQueueType } from './queue';
import { SqsQueue } from './sqs-queue';

const queuesByType = {};