Empty components in CPATH and friends handled inconsistently #49742

@dg0yt

Bugzilla Link 50398
Version unspecified
OS All
CC @zygoloid

Extended Description

Originally discovered with Apple's toolchain on macOS:

clang documentation for CPATH, C_INCLUDE_PATH and friends says that "Empty components in the environment variable are ignored."
https://clang.llvm.org/docs/CommandGuide/clang.html#envvar-C_INCLUDE_PATH,OBJC_INCLUDE_PATH,CPLUS_INCLUDE_PATH,OBJCPLUS_INCLUDE_PATH

But this is not true, as can be verified by experimentation and source code inspection.

  • Empty components at the beginning or end are treated as "."

    while ((Delim = Dirs.find(llvm::sys::EnvPathSeparator)) != StringRef::npos) {
      if (Delim == 0) { // Leading colon.
        if (CombinedArg) {
          CmdArgs.push_back(Args.MakeArgString(std::string(ArgName) + "."));
        } else {
          CmdArgs.push_back(ArgName);
          CmdArgs.push_back(".");
        }
      } else {
        if (CombinedArg) {
          CmdArgs.push_back(
              Args.MakeArgString(std::string(ArgName) + Dirs.substr(0, Delim)));
        } else {
          CmdArgs.push_back(ArgName);
          CmdArgs.push_back(Args.MakeArgString(Dirs.substr(0, Delim)));
        }
      }
      Dirs = Dirs.substr(Delim + 1);
    }
    if (Dirs.empty()) { // Trailing colon.
      if (CombinedArg) {
        CmdArgs.push_back(Args.MakeArgString(std::string(ArgName) + "."));
      } else {
        CmdArgs.push_back(ArgName);
        CmdArgs.push_back(".");
      }
    } else { // Add the last path.
      if (CombinedArg) {
        CmdArgs.push_back(Args.MakeArgString(std::string(ArgName) + Dirs));
      } else {
        CmdArgs.push_back(ArgName);
        CmdArgs.push_back(Args.MakeArgString(Dirs));
      }
    }

    This is as documented for gcc:
    https://gcc.gnu.org/onlinedocs/cpp/Environment-Variables.html#index-environment-variables

  • Unlike CPATH (-I), the components in the per-language variables are added as system include directories (-c-isystem).

    addDirectoryList(Args, CmdArgs, "-I", "CPATH");
    // C_INCLUDE_PATH - system includes enabled when compiling C.
    addDirectoryList(Args, CmdArgs, "-c-isystem", "C_INCLUDE_PATH");
    // CPLUS_INCLUDE_PATH - system includes enabled when compiling C++.

  • For system include directories, clang removes the earlier duplicates, i.e. those from command line arguments.

    // If we have a normal #include dir/framework/headermap that is shadowed
    // later in the chain by a system include location, we actually want to
    // ignore the user's request and drop the user dir... keeping the system
    // dir. This is weird, but required to emulate GCC's search path correctly.
    //
    // Since dupes of system dirs are rare, just rescan to find the original
    // that we're nuking instead of using a DenseMap.
    if (CurEntry.getDirCharacteristic() != SrcMgr::C_User) {
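
The splitting in the first bullet can be modelled outside clang with a small stand-alone function. This is only a sketch of the quoted loop, not clang's API: `splitDirectoryList` is a hypothetical name, the separator is assumed to be `:`, and the early return for a completely empty value is an assumption, since that case is outside the quoted excerpt. Tracing the quoted loop suggests that an empty component in the middle of the list would also come out as ".", not only a leading or trailing one.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-alone model of the quoted loop; not part of clang.
// Returns the directories that would be handed on, in order.
std::vector<std::string> splitDirectoryList(std::string dirs, char sep = ':') {
  assert(sep != '\0' && "separator must be a real character");
  std::vector<std::string> out;
  if (dirs.empty()) // Assumption: an empty value contributes nothing.
    return out;

  std::string::size_type delim;
  while ((delim = dirs.find(sep)) != std::string::npos) {
    // An empty component (separator at position 0) is turned into ".".
    out.push_back(delim == 0 ? std::string(".") : dirs.substr(0, delim));
    dirs = dirs.substr(delim + 1);
  }
  // The remainder is the last component; empty means a trailing separator.
  out.push_back(dirs.empty() ? std::string(".") : dirs);
  return out;
}
```

Under this model, `splitDirectoryList("/tmp/MARK-ENV-VAR:")` yields `/tmp/MARK-ENV-VAR` followed by `.`, which is the extra "." that shows up in the reproducer.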

Reproducer:

mkdir /tmp/MARK-I-OPTION
mkdir /tmp/MARK-ENV-VAR
echo | C_INCLUDE_PATH=/tmp/MARK-ENV-VAR clang -E -Wp,-v - -o /dev/null -I. -I/tmp/MARK-I-OPTION 2> c_include_path-no-colon 
echo | C_INCLUDE_PATH=/tmp/MARK-ENV-VAR: clang -E -Wp,-v - -o /dev/null -I. -I/tmp/MARK-I-OPTION 2> c_include_path-with-colon
diff -U20 c_include_path-no-colon c_include_path-with-colon
echo | CPATH=/tmp/MARK-ENV-VAR clang -E -Wp,-v - -o /dev/null -I. -I/tmp/MARK-I-OPTION 2> cpath-no-colon 
echo | CPATH=/tmp/MARK-ENV-VAR: clang -E -Wp,-v - -o /dev/null -I. -I/tmp/MARK-I-OPTION 2> cpath-with-colon 
diff -U20 cpath-no-colon cpath-with-colon

Output for C_INCLUDE_PATH (somewhat older clang on Linux):

--- c_include_path-no-colon	2021-05-18 19:23:22.718867910 +0200
+++ c_include_path-with-colon	2021-05-18 19:59:06.197846524 +0200
@@ -1,12 +1,14 @@
 clang -cc1 version 8.0.0 based upon LLVM 8.0.0 default target x86_64-pc-linux-gnu
 ignoring nonexistent directory "/include"
+ignoring duplicate directory "."
+  as it is a non-system directory that duplicates a system directory
 #include "..." search starts here:
 #include <...> search starts here:
- .
  /tmp/MARK-I-OPTION
  /tmp/MARK-ENV-VAR
+ .
  /usr/local/include
  /usr/lib/llvm-8/lib/clang/8.0.0/include
  /usr/include/x86_64-linux-gnu
  /usr/include
 End of search list.

Output for CPATH:

--- cpath-no-colon	2021-05-18 19:23:53.654933769 +0200
+++ cpath-with-colon	2021-05-18 19:24:00.914949225 +0200
@@ -1,12 +1,13 @@
 clang -cc1 version 8.0.0 based upon LLVM 8.0.0 default target x86_64-pc-linux-gnu
 ignoring nonexistent directory "/include"
+ignoring duplicate directory "."
 #include "..." search starts here:
 #include <...> search starts here:
  .
  /tmp/MARK-I-OPTION
  /tmp/MARK-ENV-VAR
  /usr/local/include
  /usr/lib/llvm-8/lib/clang/8.0.0/include
  /usr/include/x86_64-linux-gnu
  /usr/include
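
Both outputs above are consistent with a simple model of the duplicate removal described in the quoted source comment. The sketch below is an illustration under assumptions, not clang's implementation: `SearchDir`, `removeDuplicates` and `paths` are hypothetical names, directories are compared as plain strings, and command-line `-I` entries are assumed to precede the entries taken from the environment variable, as the search lists above suggest.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical model of one search-path entry; not clang's data structure.
struct SearchDir {
  std::string path;
  bool isSystem; // true for -c-isystem style entries, false for -I
};

// A later system entry evicts an earlier user entry with the same path;
// in every other duplicate case the later entry is dropped.
std::vector<SearchDir> removeDuplicates(const std::vector<SearchDir> &chain) {
  std::vector<SearchDir> out;
  for (const SearchDir &cur : chain) {
    // Empty components are expected to have been mapped to "." already.
    assert(!cur.path.empty());
    auto first = std::find_if(
        out.begin(), out.end(),
        [&](const SearchDir &d) { return d.path == cur.path; });
    if (first == out.end()) {
      out.push_back(cur); // First occurrence: keep it.
      continue;
    }
    if (cur.isSystem && !first->isSystem) {
      out.erase(first);   // Drop the user's earlier entry ...
      out.push_back(cur); // ... and keep the system one at its later position.
    }
  }
  return out;
}

// Convenience helper: extract only the paths, in search order.
std::vector<std::string> paths(const std::vector<SearchDir> &chain) {
  std::vector<std::string> result;
  for (const SearchDir &d : chain)
    result.push_back(d.path);
  return result;
}
```

Feeding the model `-I.`, `-I/tmp/MARK-I-OPTION` and then `/tmp/MARK-ENV-VAR`, `.` as system entries moves "." behind `/tmp/MARK-ENV-VAR`, as in the C_INCLUDE_PATH output. With the same four entries all marked as user entries, the trailing "." is simply dropped and the order is unchanged, as in the CPATH output.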

In particular, the behaviour of the per-language variables is a subtle source of errors: headers that are expected to be found in the current directory may instead be found in other directories.
This happened, for example, when building the gettext tools with vcpkg on macOS:
microsoft/vcpkg#17970 (comment)

Expected behaviour:

  • Consistent with documentation
  • No removal of user-specified early "."

Labels: bugzilla (Issues migrated from bugzilla), clang (Clang issues not falling into any other category), documentation
