Skip to content
Merged
Show file tree
Hide file tree
Changes from 10 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
3 changes: 3 additions & 0 deletions react_on_rails_pro/docs/installation.md
Original file line number Diff line number Diff line change
Expand Up @@ -129,6 +129,9 @@ const config = {
// allWorkersRestartInterval: 15,
// time in minutes between each worker restarting when restarting all workers
// delayBetweenIndividualWorkerRestarts: 2,
// Also, you can set he parameter gracefulWorkerRestartTimeout to force the worker to restart
// If it's the time for the worker to restart, the worker waits until it serves all active requests before restarting
// If a worker stuck because of a memory leakage or an infinite loop, you can set a timeout that master waits for it before killing the worker
}

// Renderer detects a total number of CPUs on virtual hostings like Heroku
Expand Down
1 change: 1 addition & 0 deletions react_on_rails_pro/docs/node-renderer/js-configuration.md
Original file line number Diff line number Diff line change
Expand Up @@ -22,6 +22,7 @@ Here are the options available for the JavaScript renderer configuration object,
If no password is set, no authentication will be required.
1. **allWorkersRestartInterval** (default: `env.RENDERER_ALL_WORKERS_RESTART_INTERVAL`) - Interval in minutes between scheduled restarts of all workers. By default restarts are not enabled. If restarts are enabled, `delayBetweenIndividualWorkerRestarts` should also be set.
1. **delayBetweenIndividualWorkerRestarts** (default: `env.RENDERER_DELAY_BETWEEN_INDIVIDUAL_WORKER_RESTARTS`) - Interval in minutes between individual worker restarts (when cluster restart is triggered). By default restarts are not enabled. If restarts are enabled, `allWorkersRestartInterval` should also be set.
1. **gracefulWorkerRestartTimeout**: (default: `env.GRACEFUL_WORKER_RESTART_TIMEOUT`) - Time in seconds that master waits for worker to gracefully restart (after serving all active requests) before killing it. You need to use this config to avoid situations where worker stucks at an inifite loop and doesn't restart. This config is only usable if worker restart is enabled. This time is counted starting from the time when should the worker restart, if that timeout value passed without restarting the worker, the worker is killed.
1. **maxDebugSnippetLength** (default: 1000) - If the rendering request is longer than this, it will be truncated in exception and logging messages.
1. **supportModules** - (default: `env.RENDERER_SUPPORT_MODULES || null`) - If set to true, `supportModules` enables the server-bundle code to call a default set of NodeJS global objects and functions that get added to the VM context:
`{ Buffer, TextDecoder, TextEncoder, URLSearchParams, ReadableStream, process, setTimeout, setInterval, setImmediate, clearTimeout, clearInterval, clearImmediate, queueMicrotask }`.
Expand Down
2 changes: 2 additions & 0 deletions react_on_rails_pro/docs/node-renderer/troubleshooting.md
Original file line number Diff line number Diff line change
@@ -1,3 +1,5 @@
# Node Renderer Troubleshooting

- If you enabled restarts (having `allWorkersRestartInterval` and `delayBetweenIndividualWorkerRestarts`), you should set it with a high number to avoid the app from crashing because all Node renderer workers are stopped/killed.

- If your app contains streamed pages that take too much time to be streamed to the client, ensure to not set the `gracefulWorkerRestartTimeout` parameter or set to a high number, so the worker is not killed while it's still serving an active request.
2 changes: 1 addition & 1 deletion react_on_rails_pro/lib/react_on_rails_pro/request.rb
Original file line number Diff line number Diff line change
Expand Up @@ -99,7 +99,7 @@ def connection_without_retries
def perform_request(path, **post_options) # rubocop:disable Metrics/AbcSize,Metrics/CyclomaticComplexity,Metrics/PerceivedComplexity
# For streaming requests, use connection without retries to prevent body duplication
# The StreamRequest class handles retries properly by starting fresh requests
conn = post_options[:stream] ? connection_without_retries : connection
conn = connection

available_retries = ReactOnRailsPro.configuration.renderer_request_retry_limit
retry_request = true
Expand Down
19 changes: 15 additions & 4 deletions react_on_rails_pro/packages/node-renderer/src/master.ts
Original file line number Diff line number Diff line change
Expand Up @@ -19,7 +19,12 @@ export = function masterRun(runningConfig?: Partial<Config>) {

// Store config in app state. From now it can be loaded by any module using getConfig():
const config = buildConfig(runningConfig);
const { workersCount, allWorkersRestartInterval, delayBetweenIndividualWorkerRestarts } = config;
const {
workersCount,
allWorkersRestartInterval,
delayBetweenIndividualWorkerRestarts,
gracefulWorkerRestartTimeout,
} = config;

logSanitizedConfig();

Expand Down Expand Up @@ -48,9 +53,15 @@ export = function masterRun(runningConfig?: Partial<Config>) {
allWorkersRestartInterval,
delayBetweenIndividualWorkerRestarts,
);
setInterval(() => {
restartWorkers(delayBetweenIndividualWorkerRestarts);
}, allWorkersRestartInterval * MILLISECONDS_IN_MINUTE);

const allWorkersRestartIntervalMS = allWorkersRestartInterval * MILLISECONDS_IN_MINUTE;
const scheduleWorkersRestart = () => {
void restartWorkers(delayBetweenIndividualWorkerRestarts, gracefulWorkerRestartTimeout).finally(() => {
setTimeout(scheduleWorkersRestart, allWorkersRestartIntervalMS);
});
};

setTimeout(scheduleWorkersRestart, allWorkersRestartIntervalMS);
} else if (allWorkersRestartInterval || delayBetweenIndividualWorkerRestarts) {
log.error(
"Misconfiguration, please provide both 'allWorkersRestartInterval' and " +
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -5,6 +5,7 @@

import cluster from 'cluster';
import log from '../shared/log';
import { SHUTDOWN_WORKER_MESSAGE } from '../shared/utils';

const MILLISECONDS_IN_MINUTE = 60000;

Expand All @@ -14,26 +15,48 @@ declare module 'cluster' {
}
}

export = function restartWorkers(delayBetweenIndividualWorkerRestarts: number) {
export = async function restartWorkers(
delayBetweenIndividualWorkerRestarts: number,
gracefulWorkerRestartTimeout: number | undefined,
) {
log.info('Started scheduled restart of workers');

let delay = 0;
if (!cluster.workers) {
throw new Error('No workers to restart');
}
Object.values(cluster.workers).forEach((worker) => {
const killWorker = () => {
if (!worker) return;
log.debug('Kill worker #%d', worker.id);
// eslint-disable-next-line no-param-reassign -- necessary change
worker.isScheduledRestart = true;
worker.destroy();
};
setTimeout(killWorker, delay);
delay += delayBetweenIndividualWorkerRestarts * MILLISECONDS_IN_MINUTE;
});

setTimeout(() => {
log.info('Finished scheduled restart of workers');
}, delay);
for (const worker of Object.values(cluster.workers)) {
if (!worker) return;
log.debug('Kill worker #%d', worker.id);
worker.isScheduledRestart = true;

worker.send(SHUTDOWN_WORKER_MESSAGE);

// It's inteded to restart worker in sequence, it shouldn't happens in parallel
// eslint-disable-next-line no-await-in-loop
await new Promise<void>((resolve) => {
let timeout: NodeJS.Timeout;

const onExit = () => {
clearTimeout(timeout);
resolve();
};
worker.on('exit', onExit);

// Zero means no timeout
if (gracefulWorkerRestartTimeout) {
timeout = setTimeout(() => {
log.debug('Worker #%d timed out, forcing kill it', worker.id);
worker.destroy();
worker.off('exit', onExit);
resolve();
}, gracefulWorkerRestartTimeout);
}
});
// eslint-disable-next-line no-await-in-loop
await new Promise((resolve) => {
setTimeout(resolve, delayBetweenIndividualWorkerRestarts * MILLISECONDS_IN_MINUTE);
});
}

log.info('Finished scheduled restart of workers');
};
Original file line number Diff line number Diff line change
Expand Up @@ -58,6 +58,9 @@ export interface Config {
allWorkersRestartInterval: number | undefined;
// Time in minutes between each worker restarting when restarting all workers
delayBetweenIndividualWorkerRestarts: number | undefined;
// Time in seconds to wait for worker to restart before killing it
// Set it to 0 or undefined to never kill the worker
gracefulWorkerRestartTimeout: number | undefined;
// If the rendering request is longer than this, it will be truncated in exception and logging messages
maxDebugSnippetLength: number;
// @deprecated See https://www.shakacode.com/react-on-rails-pro/docs/node-renderer/error-reporting-and-tracing.
Expand Down Expand Up @@ -165,6 +168,10 @@ const defaultConfig: Config = {
? parseInt(env.RENDERER_DELAY_BETWEEN_INDIVIDUAL_WORKER_RESTARTS, 10)
: undefined,

gracefulWorkerRestartTimeout: env.GRACEFUL_WORKER_RESTART_TIMEOUT
? parseInt(env.GRACEFUL_WORKER_RESTART_TIMEOUT, 10)
: undefined,

maxDebugSnippetLength: MAX_DEBUG_SNIPPET_LENGTH,

// default to true if empty, otherwise it is set to false
Expand Down Expand Up @@ -195,6 +202,8 @@ function envValuesUsed() {
RENDERER_DELAY_BETWEEN_INDIVIDUAL_WORKER_RESTARTS:
!userConfig.delayBetweenIndividualWorkerRestarts &&
env.RENDERER_DELAY_BETWEEN_INDIVIDUAL_WORKER_RESTARTS,
GRACEFUL_WORKER_RESTART_TIMEOUT:
!userConfig.gracefulWorkerRestartTimeout && env.GRACEFUL_WORKER_RESTART_TIMEOUT,
INCLUDE_TIMER_POLYFILLS: !('includeTimerPolyfills' in userConfig) && env.INCLUDE_TIMER_POLYFILLS,
REPLAY_SERVER_ASYNC_OPERATION_LOGS:
!userConfig.replayServerAsyncOperationLogs && env.REPLAY_SERVER_ASYNC_OPERATION_LOGS,
Expand All @@ -209,6 +218,7 @@ function sanitizedSettings(aConfig: Partial<Config> | undefined, defaultValue?:
password: aConfig.password != null ? '<MASKED>' : defaultValue,
allWorkersRestartInterval: aConfig.allWorkersRestartInterval || defaultValue,
delayBetweenIndividualWorkerRestarts: aConfig.delayBetweenIndividualWorkerRestarts || defaultValue,
gracefulWorkerRestartTimeout: aConfig.gracefulWorkerRestartTimeout || defaultValue,
}
: {};
}
Expand Down
2 changes: 2 additions & 0 deletions react_on_rails_pro/packages/node-renderer/src/shared/utils.ts
Original file line number Diff line number Diff line change
Expand Up @@ -11,6 +11,8 @@ import type { RenderResult } from '../worker/vm';

export const TRUNCATION_FILLER = '\n... TRUNCATED ...\n';

export const SHUTDOWN_WORKER_MESSAGE = 'NODE_RENDERER_SHUTDOWN_WORKER';

export function workerIdLabel() {
// eslint-disable-next-line @typescript-eslint/no-unnecessary-condition -- worker is nullable in the primary process
return cluster?.worker?.id || 'NO WORKER ID';
Expand Down
3 changes: 3 additions & 0 deletions react_on_rails_pro/packages/node-renderer/src/worker.ts
Original file line number Diff line number Diff line change
Expand Up @@ -17,6 +17,7 @@ import type { FastifyInstance, FastifyReply, FastifyRequest } from './worker/typ
import checkProtocolVersion from './worker/checkProtocolVersionHandler';
import authenticate from './worker/authHandler';
import { handleRenderRequest, type ProvidedNewBundle } from './worker/handleRenderRequest';
import handleGracefulShutdown from './worker/handleGracefulShutdown';
import {
errorResponseResult,
formatExceptionMessage,
Expand Down Expand Up @@ -127,6 +128,8 @@ export default function run(config: Partial<Config>) {
...fastifyServerOptions,
});

handleGracefulShutdown(app);

// We shouldn't have unhandled errors here, but just in case
app.addHook('onError', (req, res, err, done) => {
// Not errorReporter.error so that integrations can decide how to log the errors.
Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,49 @@
import cluster from 'cluster';
import { FastifyInstance } from './types';
import { SHUTDOWN_WORKER_MESSAGE } from '../shared/utils';
import log from '../shared/log';

const handleGracefulShutdown = (app: FastifyInstance) => {
const { worker } = cluster;
if (!worker) {
log.error('handleGracefulShutdown is called on master, expected to call it on worker only');
return;
}

let activeRequestsCount = 0;
let isShuttingDown = false;

process.on('message', (msg) => {
if (msg === SHUTDOWN_WORKER_MESSAGE) {
log.debug('Worker #%d received graceful shutdown message', worker.id);
isShuttingDown = true;
if (activeRequestsCount === 0) {
log.debug('Worker #%d has no active requests, killing the worker', worker.id);
worker.destroy();
} else {
log.debug(
'Worker #%d has "%d" active requests, disconnecting the worker',
worker.id,
activeRequestsCount,
);
worker.disconnect();
}
}
});

app.addHook('onRequest', (_req, _reply, done) => {
activeRequestsCount += 1;
done();
});

app.addHook('onResponse', (_req, _reply, done) => {
activeRequestsCount -= 1;
if (isShuttingDown && activeRequestsCount === 0) {
log.debug('Worker #%d served all active requests and going to be killed', worker.id);
worker.destroy();
}
done();
});
};

export default handleGracefulShutdown;
31 changes: 15 additions & 16 deletions react_on_rails_pro/spec/dummy/spec/system/integration_spec.rb
Original file line number Diff line number Diff line change
Expand Up @@ -11,7 +11,7 @@ def change_text_expect_dom_selector(dom_selector, expect_no_change: false)
find("input").set new_text
within("h3") do
if expect_no_change
expect(subject).to have_no_content new_text
expect(subject).not_to have_content new_text
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All of these changes at this file just reverts the changes made at the PR that skips flaky tests (I think many of these changes unnecessary at that PR)

else
expect(subject).to have_content new_text
end
Expand Down Expand Up @@ -110,7 +110,7 @@ def change_text_expect_dom_selector(dom_selector, expect_no_change: false)
it "changes name in message according to input" do
visit "/client_side_hello_world"
change_text_expect_dom_selector("#HelloWorld-react-component-0")
click_link "Hello World Component Server Rendered, with extra options" # rubocop:disable Capybara/ClickLinkOrButtonStyle
click_link "Hello World Component Server Rendered, with extra options"
change_text_expect_dom_selector("#my-hello-world-id")
end
end
Expand Down Expand Up @@ -174,19 +174,19 @@ def change_text_expect_dom_selector(dom_selector, expect_no_change: false)

before do
visit "/"
click_link "React Router" # rubocop:disable Capybara/ClickLinkOrButtonStyle
click_link "React Router"
end

context "when rendering /react_router" do
it { is_expected.to have_text("Woohoo, we can use react-router here!") }

it "clicking links correctly renders other pages" do
click_link "Router First Page" # rubocop:disable Capybara/ClickLinkOrButtonStyle
click_link "Router First Page"
expect(page).to have_current_path("/react_router/first_page")
first_page_header_text = page.find(:css, "h2#first-page").text
expect(first_page_header_text).to eq("React Router First Page")

click_link "Router Second Page" # rubocop:disable Capybara/ClickLinkOrButtonStyle
click_link "Router Second Page"
expect(page).to have_current_path("/react_router/second_page")
second_page_header_text = page.find(:css, "h2#second-page").text
expect(second_page_header_text).to eq("React Router Second Page")
Expand Down Expand Up @@ -239,12 +239,12 @@ def change_text_expect_dom_selector(dom_selector, expect_no_change: false)
end
end

describe "Manual client hydration", :js do
describe "Manual client hydration", :js, type: :system do
before { visit "/xhr_refresh" }

it "HelloWorldRehydratable onChange should trigger" do
within("form") do
click_button "refresh" # rubocop:disable Capybara/ClickLinkOrButtonStyle
click_button "refresh"
end
within("#HelloWorldRehydratable-react-component-1") do
find("input").set "Should update"
Expand Down Expand Up @@ -396,7 +396,7 @@ def change_text_expect_dom_selector(dom_selector, expect_no_change: false)

it "hydrates the component" do
visit path
expect(page.html).to include("client-bundle.js")
expect(page.html).to match(/client-bundle[^\"]*.js/)
change_text_expect_dom_selector(selector)
end

Expand All @@ -407,28 +407,27 @@ def change_text_expect_dom_selector(dom_selector, expect_no_change: false)
# Ensure that the component state is not updated
change_text_expect_dom_selector(selector, expect_no_change: true)

expect(page).to have_no_text "Loading branch1"
expect(page).to have_no_text "Loading branch2"
expect(page).to have_no_text(/Loading branch1 at level \d+/)
expect(page).not_to have_text "Loading branch1"
expect(page).not_to have_text "Loading branch2"
expect(page).not_to have_text(/Loading branch1 at level \d+/)
expect(page).to have_text(/branch1 \(level \d+\)/, count: 5)
end

it "doesn't hydrate status component if packs are not loaded" do
# visit waits for the page to load, so we ensure that the page is loaded before checking the hydration status
visit "#{path}?skip_js_packs=true"
expect(page).to have_text "HydrationStatus: Streaming server render"
expect(page).to have_no_text "HydrationStatus: Hydrated"
expect(page).to have_no_text "HydrationStatus: Page loaded"
expect(page).not_to have_text "HydrationStatus: Hydrated"
expect(page).not_to have_text "HydrationStatus: Page loaded"
end
end

describe "Pages/stream_async_components_for_testing", :js,
skip: "Flaky test replaced by Playwright E2E tests in e2e-tests/streaming.spec.ts" do
describe "Pages/stream_async_components_for_testing", :js do
it_behaves_like "streamed component tests", "/stream_async_components_for_testing",
"#AsyncComponentsTreeForTesting-react-component-0"
end

describe "React Router Sixth Page", :js, skip: "Work in progress in another branch: justin808/fix-body-dup-retry" do
describe "React Router Sixth Page", :js do
it_behaves_like "streamed component tests", "/server_router/streaming-server-component",
"#ServerComponentRouter-react-component-0"
end
Loading