Skip to content

Conversation

@anton-nayshtut
Copy link
Contributor

@anton-nayshtut anton-nayshtut commented Nov 4, 2025

This PR introduces config file support to nixlbench.

The name of a config file can be specified using the --config_file command-line parameter. The config file is in INI format.

Each existing command-line parameter can also be placed in the [global] section of the configuration file, so the following invocations:

nixlbench --etcd_endpoints http://localhost:2379 --backend POSIX --filepath /mnt/test --posix_api_type AIO

and

nixlbench --config_file /tmp/nixlbench.config

where /tmp/nixlbench.config contains:

[global]
etcd_endpoints=http://localhost:2379
backend=POSIX
filepath=/mnt/test
posix_api_type=AIO

are identical.

If a parameter exists in the config file and is also explicitly specified on the command line, the latter takes precedence.

@copy-pr-bot
Copy link

copy-pr-bot bot commented Nov 4, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@github-actions
Copy link

github-actions bot commented Nov 4, 2025

👋 Hi anton-nayshtut! Thank you for contributing to ai-dynamo/nixl.

Your PR reviewers will review your contribution then trigger the CI to test your changes.

🚀

This patch doesn't change any logic. It adjusts the parameter
definitions to help us move to a different parameter parsing
infrastructure in upcoming patches.

Signed-off-by: Anton Nayshtut <[email protected]>
gflags-specific parameter definitions have been replaced with the
NB_ARG_x macros.

This will help minimize changes when we move to a different parameter
parsing infrastructure in upcoming patches.

This patch doesn't change any logic.

Signed-off-by: Anton Nayshtut <[email protected]>
NB_ARG parameter getter macro introduced.

This will help us to minimize changes when we'll move to a different
parameter parsing infrastructure in upcoming patches.

This patch doesn't change any logic.

Signed-off-by: Anton Nayshtut <[email protected]>
We'll use cxxopts for command line parsing in upcoming patches.

Copied from https://github.com/jarro2783/cxxopts
SHA: 8df9a4d2716ebf1c68eced0b6d9de185c6cfd50a

Signed-off-by: Anton Nayshtut <[email protected]>
This patch replaces gflags-based parameter parsing with cxxopts-based
parsing, and removes the gflags dependency.

This change will enable us to support config file-based parameters in
upcoming patches.

Unfortunately, gflags is not flexible enough for our needs. For example,
it does not indicate whether a parameter was explicitly provided by the
user or if it is using a default. This functionality is required for
correct parameter specification priority: first, a value provided by the
user on the command line; then a value from the config file; and finally,
the default value.

Additionally, gflags does not support dynamically formed arguments,
which is needed to handle a list of tasks as discussed. For this use
case, the config file would look like:

[general]
...
num_tasks=2

[task0]
local=...
remote=...
type=write
...

[task1]
local=...
remote=...
type=read
...

This translates to command line parameters like:

--num_tasks=2 --task0.local=... ... --task1.local=...

Since gflags is a compile-time mechanism, it cannot support this level
of dynamic argument handling.

Signed-off-by: Anton Nayshtut <[email protected]>
We'll use ini-cpp for config file parsing in upcoming patches.

Copied from https://github.com/SSARCandy/ini-cpp/
SHA: 513c0947477855d3925446084779697145246714

Signed-off-by: Anton Nayshtut <[email protected]>
This patch introduces config file support to nixlbench.

The name of a config file can be specified using the --config_file
command-line parameter. The config file is in INI format.

Each existing command-line parameter can also be placed in the
"[global]" section of the configuration file, so the following
invocations:

  nixlbench --etcd_endpoints http://localhost:2379 --backend POSIX
    --filepath /mnt/test --posix_api_type AIO

and

  nixlbench --config_file /tmp/nixlbench.config

where /tmp/nixlbench.config contains:

[global]
etcd_endpoints=http://localhost:2379
backend=POSIX
filepath=/mnt/test
posix_api_type=AIO

are identical.

If a parameter exists in the config file and is also explicitly
specified on the command line, the latter takes precedence.

Signed-off-by: Anton Nayshtut <[email protected]>
@anton-nayshtut anton-nayshtut requested review from a team as code owners November 6, 2025 11:33
@aranadive
Copy link
Contributor

@anton-nayshtut Thanks so much adding this. I had two general comments.
Did you consider using JSON format for config file? From a usability, portability aspect its well-known, with number of tools already around it. Plus, we can reuse it for kvbench since json is easily parsed in python.
Another option, would be to use the flagfile option of Gflags - to pass all the values/parameters from command line into the file. This would be easier than json or ini since we wouldnt have to go through licensing again.

For CXXOPTS, Im unclear why this was required? If you use the gflags flagfile now, any other options after the file should overwrite the ones in the file.

WDYT?

Copy link
Contributor

@vvenkates27 vvenkates27 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some minor comments, and why not JSon?

@@ -0,0 +1,624 @@
/**
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

License needs to be fixed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not our code. It's an external MIT-licensed library. We can't change the license, AFAIK. But I exempted the external files from the relevant CI checks.

@@ -0,0 +1,2925 @@
/*
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here .. fix license

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here.

"[default: SG]") NB_ARG_STRING(op_type,
XFERBENCH_OP_WRITE,
"Op type: READ, WRITE")
NB_ARG_BOOL(check_consistency,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alignment is off in this file please fix.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That’s weird. This is what clang’s autoformat does (see nixlbench: utils autoformatted by clang commit).

I had to autoformat with clang because the CI test failed without it.

Please advise.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These seem nested for some reason? These are supposed to be separate arguments. Maybe the comma needs to be moved from the macro definition to end of each parameter created here.
Clang is probably confused with whether everything is one single line and breaking out accordingly.

# include <regex>
#endif // CXXOPTS_NO_REGEX

// Nonstandard before C++17, which is coincidentally what we also need for <optional>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the benefit of CXX_OPTS compared to gflags? Why can't we parse using glags even for a config file?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See the commit message for benchmark/nixlbench: switched to cxxopts - I discuss it at length there. Hope it makes sense.

@anton-nayshtut
Copy link
Contributor Author

@anton-nayshtut Thanks so much adding this. I had two general comments. Did you consider using JSON format for config file? From a usability, portability aspect its well-known, with number of tools already around it. Plus, we can reuse it for kvbench since json is easily parsed in python. Another option, would be to use the flagfile option of Gflags - to pass all the values/parameters from command line into the file. This would be easier than json or ini since we wouldnt have to go through licensing again.

For CXXOPTS, Im unclear why this was required? If you use the gflags flagfile now, any other options after the file should overwrite the ones in the file.

WDYT?
Hi Adit. Thanks for your review!

As for the questions:

INI vs JSON

Sure, I considered using JSON for config files and I agree that, as you mentioned, it has some advantages. However, JSON also has some downsides that led me to prefer INI format.

My thoughts were along these lines:

INI format is often a better choice than JSON for simple configuration files because it’s more approachable, easier to manually edit, and visually cleaner for basic use cases.

INI files are highly readable and easy to work with - especially for non-programmers or for quick manual edits. They support organizing settings into basic sections and key-value pairs, so simple configs stay tidy and quick to review. Comments are supported in INI (unlike JSON by default), letting you explain settings directly in the file, which is a huge advantage for documentation and collaboration. INI files are much simpler than JSON, and avoid extra punctuation, strict syntax, and nested requirements - reducing the chance of errors when hand-editing.

In summary, if your configs are flat, small, and likely to be edited by hand, INI is much easier for users to read and update than JSON. JSON is better when you need complex or hierarchical data, but INI shines for basic, human-friendly settings.

Some additional considerations:

  • Many popular test and benchmark tools use INI format for config files, like fio, tox, bdperf, etc.
  • INI files are easily generated and parsed in Python with the ConfigParser module.

Of course, I'm certainly open to revisiting this decision and would be happy to go with JSON if you or others in the community feel it's a better fit.

CXXOPTS vs gflags

See the commit message for benchmark/nixlbench: switched to cxxopts commit - I discuss it at length there. Hope the explanation makes sense.

@aranadive
Copy link
Contributor

@anton-nayshtut Thanks so much adding this. I had two general comments. Did you consider using JSON format for config file? From a usability, portability aspect its well-known, with number of tools already around it. Plus, we can reuse it for kvbench since json is easily parsed in python. Another option, would be to use the flagfile option of Gflags - to pass all the values/parameters from command line into the file. This would be easier than json or ini since we wouldnt have to go through licensing again.
For CXXOPTS, Im unclear why this was required? If you use the gflags flagfile now, any other options after the file should overwrite the ones in the file.
WDYT?
Hi Adit. Thanks for your review!

As for the questions:

INI vs JSON

Sure, I considered using JSON for config files and I agree that, as you mentioned, it has some advantages. However, JSON also has some downsides that led me to prefer INI format.

My thoughts were along these lines:

INI format is often a better choice than JSON for simple configuration files because it’s more approachable, easier to manually edit, and visually cleaner for basic use cases.

INI files are highly readable and easy to work with - especially for non-programmers or for quick manual edits. They support organizing settings into basic sections and key-value pairs, so simple configs stay tidy and quick to review. Comments are supported in INI (unlike JSON by default), letting you explain settings directly in the file, which is a huge advantage for documentation and collaboration. INI files are much simpler than JSON, and avoid extra punctuation, strict syntax, and nested requirements - reducing the chance of errors when hand-editing.

In summary, if your configs are flat, small, and likely to be edited by hand, INI is much easier for users to read and update than JSON. JSON is better when you need complex or hierarchical data, but INI shines for basic, human-friendly settings.

Some additional considerations:

  • Many popular test and benchmark tools use INI format for config files, like fio, tox, bdperf, etc.
  • INI files are easily generated and parsed in Python with the ConfigParser module.

Of course, I'm certainly open to revisiting this decision and would be happy to go with JSON if you or others in the community feel it's a better fit.

CXXOPTS vs gflags

See the commit message for benchmark/nixlbench: switched to cxxopts commit - I discuss it at length there. Hope the explanation makes sense.

To me it looks like INI are an upgraded version from just using a flagfile with gflags (https://gflags.github.io/gflags/#special). That also allows us to specify comments, and just copy gflags into it as they are today (no need for extra deps). We are not changing any parameters anyways so this would be the easiest option.

CXXOPTS vs GFlags
In terms of priority GFLAGS can already give this if the flagfile is specified last. Anything in the flagfile will overrule the commandline one and if no value is specified the default is taken. Based on the code, you need to explicitly parse ini file then import to cxxopts. If we stick with gflags flagfile this should be handled automatically.

I didnt understand the advantage of dynamic arguments, since the file parsing and command line parsing are handled separately. Wouldn't you still need logic to match the ini parsed value to cxxopts? Wouldn't these be interpreted separately, i.e. task0.local and task0.remote are considered two different parameters without any hierarchy?

There is also TOML (its made for config files) which is a cross between INI and JSON files. That might be a nicer option over INI to use when multi-tests can be run to give mroe flexibility in specifying config.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants