Commit 5229ace

Merge pull request #11545 from ethereum/stripping-base-path-from-cli
Stripping base path from CLI paths
2 parents 1f95348 + 13f46eb commit 5229ace

11 files changed, with 1268 additions and 24 deletions.

Changelog.md

Lines changed: 1 addition & 0 deletions
@@ -4,6 +4,7 @@ Language Features:


 Compiler Features:
+ * Commandline Interface: Normalize paths specified on the command line and make them relative for files located inside base path.
  * Immutable variables can be read at construction time once they are initialized.
  * SMTChecker: Support low level ``call`` as external calls to unknown code.
  * SMTChecker: Add constraints to better correlate ``address(this).balance`` and ``msg.value``.

docs/path-resolution.rst

Lines changed: 65 additions & 5 deletions
@@ -71,8 +71,10 @@ The initial content of the VFS depends on how you invoke the compiler:

     solc contract.sol /usr/local/dapp-bin/token.sol

-The source unit name of a file loaded this way is simply the specified path after shell expansion
-and with platform-specific separators converted to forward slashes.
+The source unit name of a file loaded this way is constructed by converting its path to a
+canonical form and making it relative to the base path if it is located inside.
+See :ref:`Base Path Normalization and Stripping <base-path-normalization-and-stripping>` for
+a detailed description of this process.

 .. index:: standard JSON

@@ -309,9 +311,67 @@ When the source unit name is a relative path, this results in the file being loo
 directory the compiler has been invoked from.
 It is also the only value that results in absolute paths in source unit names being actually
 interpreted as absolute paths on disk.
+If the base path itself is relative, it is interpreted as relative to the current working directory
+of the compiler.
+
+.. _base-path-normalization-and-stripping:
+
+Base Path Normalization and Stripping
+-------------------------------------
+
+On the command line the compiler behaves just as you would expect from any other program:
+it accepts paths in a format native to the platform and relative paths are relative to the current
+working directory.
+The source unit names assigned to files whose paths are specified on the command line, however,
+should not change just because the project is being compiled on a different platform or because the
+compiler happens to have been invoked from a different directory.
+To achieve this, paths to source files coming from the command line must be converted to a canonical
+form, and, if possible, made relative to the base path.
+
+The normalization rules are as follows:
+
+- If a path is relative, it is made absolute by prepending the current working directory to it.
+- Internal ``.`` and ``..`` segments are collapsed.
+- Platform-specific path separators are replaced with forward slashes.
+- Sequences of multiple consecutive path separators are squashed into a single separator (unless
+  they are the leading slashes of an `UNC path <https://en.wikipedia.org/wiki/Path_(computing)#UNC>`_).
+- If the path includes a root name (e.g. a drive letter on Windows) and the root is the same as the
+  root of the current working directory, the root is replaced with ``/``.
+- Symbolic links in the path are **not** resolved.
+
+  - The only exception is the path to the current working directory prepended to relative paths in
+    the process of making them absolute.
+    On some platforms the working directory is reported always with symbolic links resolved so for
+    consistency the compiler resolves them everywhere.
+
+- The original case of the path is preserved even if the filesystem is case-insensitive but
+  `case-preserving <https://en.wikipedia.org/wiki/Case_preservation>`_ and the actual case on
+  disk is different.

-If the base path itself is relative, it is also interpreted as relative to the current working
-directory of the compiler.
+.. note::
+
+    There are situations where paths cannot be made platform-independent.
+    For example on Windows the compiler can avoid using drive letters by referring to the root
+    directory of the current drive as ``/`` but drive letters are still necessary for paths leading
+    to other drives.
+    You can avoid such situations by ensuring that all the files are available within a single
+    directory tree on the same drive.
+
+Once canonicalized, the base path is stripped from all source file paths that start with it.
+If the base path is empty or not specified, it is treated as if it was equal to the path to the
+current working directory (with all symbolic links resolved).
+The result is accepted only if the normalized directory path is the exact prefix of the normalized
+file path.
+Otherwise the file path remains absolute.
+This makes the conversion unambiguous and ensures that the relative path does not start with ``../``.
+The resulting file path becomes the source unit name.
+
+.. note::
+
+    Prior to version 0.8.8, CLI path stripping was not performed and the only normalization applied
+    was the conversion of path separators.
+    When working with older versions of the compiler it is recommended to invoke the compiler from
+    the base path and to only use relative paths on the command line.

 .. index:: ! remapping; import, ! import; remapping, ! remapping; context, ! remapping; prefix, ! remapping; target
 .. _import-remapping:
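
The normalization rules documented in this hunk can be illustrated with a small, purely lexical sketch. It is not the compiler's code. It uses C++17 `std::filesystem` in place of Boost, takes the working directory as a parameter instead of querying it, assumes POSIX-style paths, and leaves out the root-name and UNC handling. The name `normalizeForVfs` is hypothetical.

```cpp
#include <cassert>
#include <filesystem>
#include <string>

namespace fs = std::filesystem;

// Hypothetical helper: normalizes `path` against an explicitly supplied working
// directory (the real compiler uses its current working directory instead).
std::string normalizeForVfs(fs::path const& path, fs::path const& workDir)
{
    assert(workDir.is_absolute());
    // Relative paths are made absolute by prepending the working directory.
    fs::path absolutePath = path.is_absolute() ? path : workDir / path;
    // Lexical only: collapses `.` and `..` segments and repeated separators
    // without touching the disk, so symlinks are not resolved.
    return absolutePath.lexically_normal().generic_string();
}
```

Under these assumptions, `normalizeForVfs("contracts/../lib//token.sol", "/project")` gives `/project/lib/token.sol`, and an absolute path that is already in canonical form is returned unchanged.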
@@ -414,7 +474,7 @@ Here are the detailed rules governing the behaviour of remappings:

 .. code-block:: bash

-    solc /project/=/contracts/ /project/contract.sol --base-path /project # source unit name: /project/contract.sol
+    solc /project/=/contracts/ /project/contract.sol --base-path /project # source unit name: contract.sol

 .. code-block:: solidity
     :caption: /project/contract.sol
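
The changed comment in this hunk (`source unit name: contract.sol`) follows from the stripping step. A minimal sketch of that step is shown here, assuming both paths are already normalized, absolute, and free of trailing slashes. It uses C++17 `std::filesystem` rather than Boost, and `sourceUnitName` is a hypothetical name.

```cpp
#include <cassert>
#include <filesystem>
#include <string>

namespace fs = std::filesystem;

// Hypothetical stand-in for the stripping step. Both arguments are assumed to be
// normalized absolute paths without a trailing slash.
std::string sourceUnitName(fs::path const& basePath, fs::path const& filePath)
{
    assert(basePath.is_absolute() && filePath.is_absolute());
    fs::path relative = filePath.lexically_relative(basePath);
    // A result that starts with `..` means the file lies outside the base path,
    // so the absolute path is kept as the source unit name.
    if (relative.empty() || *relative.begin() == fs::path(".."))
        return filePath.generic_string();
    return relative.generic_string();
}
```

With `/project` as the base path, `/project/contract.sol` becomes `contract.sol`. Only whole segments count, so `/abc/d` is not treated as a prefix of `/abc/def` and that path stays absolute.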

libsolidity/interface/FileReader.cpp

Lines changed: 149 additions & 1 deletion
@@ -22,6 +22,8 @@
 #include <libsolutil/CommonIO.h>
 #include <libsolutil/Exceptions.h>

+#include <boost/algorithm/string/predicate.hpp>
+
 using solidity::frontend::ReadCallback;
 using solidity::langutil::InternalCompilerError;
 using solidity::util::errinfo_comment;
@@ -31,9 +33,22 @@ using std::string;
 namespace solidity::frontend
 {

+void FileReader::setBasePath(boost::filesystem::path const& _path)
+{
+	m_basePath = (_path.empty() ? "" : normalizeCLIPathForVFS(_path));
+}
+
 void FileReader::setSource(boost::filesystem::path const& _path, SourceCode _source)
 {
-	m_sourceCodes[_path.generic_string()] = std::move(_source);
+	boost::filesystem::path normalizedPath = normalizeCLIPathForVFS(_path);
+	boost::filesystem::path prefix = (m_basePath.empty() ? normalizeCLIPathForVFS(".") : m_basePath);
+
+	m_sourceCodes[stripPrefixIfPresent(prefix, normalizedPath).generic_string()] = std::move(_source);
+}
+
+void FileReader::setStdin(SourceCode _source)
+{
+	m_sourceCodes["<stdin>"] = std::move(_source);
 }

 void FileReader::setSources(StringMap _sources)
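
The flow of `setBasePath`, `setSource` and `setStdin` in this hunk can be sketched as a miniature reader. Everything here is an illustration under assumptions: `MiniFileReader` is hypothetical, C++17 `std::filesystem` replaces Boost, the working directory is injected so the result does not depend on where the program runs, paths are POSIX-style, and trailing separators are dropped, which is a simplification the committed code does not make.

```cpp
#include <cassert>
#include <filesystem>
#include <map>
#include <string>
#include <utility>

namespace fs = std::filesystem;

// Hypothetical miniature of the reader, not the real FileReader.
class MiniFileReader
{
public:
    MiniFileReader(fs::path workDir, fs::path const& basePath):
        m_workDir(std::move(workDir))
    {
        assert(m_workDir.is_absolute());
        // An empty base path stays empty and later defaults to the working directory.
        if (!basePath.empty())
            m_basePath = normalize(basePath);
    }

    void setSource(fs::path const& path, std::string source)
    {
        fs::path prefix = m_basePath.empty() ? normalize(".") : m_basePath;
        m_sources[strip(prefix, normalize(path)).generic_string()] = std::move(source);
    }

    // Standard input has no path, so it gets a fixed name and skips normalization.
    void setStdin(std::string source) { m_sources["<stdin>"] = std::move(source); }

    std::map<std::string, std::string> const& sources() const { return m_sources; }

private:
    fs::path normalize(fs::path const& path) const
    {
        // operator/ keeps `path` as-is when it is already absolute.
        fs::path normalized = (m_workDir / path).lexically_normal();
        // Simplification: drop a trailing separator (as produced for "dir/.").
        if (!normalized.has_filename() && normalized.has_relative_path())
            normalized = normalized.parent_path();
        return normalized;
    }

    static fs::path strip(fs::path const& prefix, fs::path const& path)
    {
        fs::path relative = path.lexically_relative(prefix);
        // Outside the prefix: keep the absolute path.
        if (relative.empty() || *relative.begin() == fs::path(".."))
            return path;
        return relative;
    }

    fs::path m_workDir;
    fs::path m_basePath;
    std::map<std::string, std::string> m_sources;
};
```

With working directory `/work` and base path `/project`, a file given as `/project/contracts/a.sol` is stored under `contracts/a.sol`, while a relative `b.sol` resolves to `/work/b.sol`, lies outside the base path, and keeps its absolute name. With an empty base path the same `b.sol` is stored simply as `b.sol`.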
@@ -92,5 +107,138 @@ ReadCallback::Result FileReader::readFile(string const& _kind, string const& _so
 	}
 }

+boost::filesystem::path FileReader::normalizeCLIPathForVFS(boost::filesystem::path const& _path)
+{
+	// Detailed normalization rules:
+	// - Makes the path either be absolute or have slash as root (note that on Windows paths with
+	//   slash as root are not considered absolute by Boost). If it is empty, it becomes
+	//   the current working directory.
+	// - Collapses redundant . and .. segments.
+	// - Removes leading .. segments from an absolute path (i.e. /../../ becomes just /).
+	// - Squashes sequences of multiple path separators into one.
+	// - Ensures that forward slashes are used as path separators on all platforms.
+	// - Removes the root name (e.g. drive letter on Windows) when it matches the root name in the
+	//   path to the current working directory.
+	//
+	// Also note that this function:
+	// - Does NOT resolve symlinks (except for symlinks in the path to the current working directory).
+	// - Does NOT check if the path refers to a file or a directory. If the path ends with a slash,
+	//   the slash is preserved even if it's a file.
+	//   - The only exceptions are paths where the file name is a dot (e.g. '.' or 'a/b/.'). These
+	//     always have a trailing slash after normalization.
+	// - Preserves case. Even if the filesystem is case-insensitive but case-preserving and the
+	//   case differs, the actual case from disk is NOT detected.
+
+	boost::filesystem::path canonicalWorkDir = boost::filesystem::weakly_canonical(boost::filesystem::current_path());
+
+	// NOTE: On UNIX systems the path returned from current_path() has symlinks resolved while on
+	// Windows it does not. To get consistent results we resolve them on all platforms.
+	boost::filesystem::path absolutePath = boost::filesystem::absolute(_path, canonicalWorkDir);
+
+	// NOTE: boost path preserves certain differences that are ignored by its operator ==.
+	// E.g. "a//b" vs "a/b" or "a/b/" vs "a/b/.". lexically_normal() does remove these differences.
+	boost::filesystem::path normalizedPath = absolutePath.lexically_normal();
+	solAssert(normalizedPath.is_absolute() || normalizedPath.root_path() == "/", "");
+
+	// If the path is on the same drive as the working dir, for portability we prefer not to
+	// include the root name. Do this only for non-UNC paths - my experiments show that on Windows
+	// when the working dir is an UNC path, / does not actually refer to the root of the UNC path.
+	boost::filesystem::path normalizedRootPath = normalizedPath.root_path();
+	if (!isUNCPath(normalizedPath))
+	{
+		boost::filesystem::path workingDirRootPath = canonicalWorkDir.root_path();
+		if (normalizedRootPath == workingDirRootPath)
+			normalizedRootPath = "/";
+	}
+
+	// lexically_normal() will not squash paths like "/../../" into "/". We have to do it manually.
+	boost::filesystem::path dotDotPrefix = absoluteDotDotPrefix(normalizedPath);
+
+	boost::filesystem::path normalizedPathNoDotDot = normalizedPath;
+	if (dotDotPrefix.empty())
+		normalizedPathNoDotDot = normalizedRootPath / normalizedPath.relative_path();
+	else
+		normalizedPathNoDotDot = normalizedRootPath / normalizedPath.lexically_relative(normalizedPath.root_path() / dotDotPrefix);
+	solAssert(!hasDotDotSegments(normalizedPathNoDotDot), "");
+
+	// NOTE: On Windows lexically_normal() converts all separators to forward slashes. Convert them back.
+	// Separators do not affect path comparison but remain in internal representation returned by native().
+	// This will also normalize the root name to start with // in UNC paths.
+	normalizedPathNoDotDot = normalizedPathNoDotDot.generic_string();
+
+	// For some reason boost considers "/." different than "/" even though for other directories
+	// the trailing dot is ignored.
+	if (normalizedPathNoDotDot == "/.")
+		return "/";
+
+	return normalizedPathNoDotDot;
 }

+bool FileReader::isPathPrefix(boost::filesystem::path const& _prefix, boost::filesystem::path const& _path)
+{
+	solAssert(!_prefix.empty() && !_path.empty(), "");
+	// NOTE: On Windows paths starting with a slash (rather than a drive letter) are considered relative by boost.
+	solAssert(_prefix.is_absolute() || isUNCPath(_prefix) || _prefix.root_path() == "/", "");
+	solAssert(_path.is_absolute() || isUNCPath(_path) || _path.root_path() == "/", "");
+	solAssert(_prefix == _prefix.lexically_normal() && _path == _path.lexically_normal(), "");
+	solAssert(!hasDotDotSegments(_prefix) && !hasDotDotSegments(_path), "");
+
+	boost::filesystem::path strippedPath = _path.lexically_relative(
+		// Before 1.72.0 lexically_relative() was not handling paths with empty, dot and dot dot segments
+		// correctly (see https://github.com/boostorg/filesystem/issues/76). The only case where this
+		// is possible after our normalization is a directory name ending in a slash (filename is a dot).
+		_prefix.filename_is_dot() ? _prefix.parent_path() : _prefix
+	);
+	return !strippedPath.empty() && *strippedPath.begin() != "..";
+}
+
+boost::filesystem::path FileReader::stripPrefixIfPresent(boost::filesystem::path const& _prefix, boost::filesystem::path const& _path)
+{
+	if (!isPathPrefix(_prefix, _path))
+		return _path;
+
+	boost::filesystem::path strippedPath = _path.lexically_relative(
+		_prefix.filename_is_dot() ? _prefix.parent_path() : _prefix
+	);
+	solAssert(strippedPath.empty() || *strippedPath.begin() != "..", "");
+	return strippedPath;
+}
+
+boost::filesystem::path FileReader::absoluteDotDotPrefix(boost::filesystem::path const& _path)
+{
+	solAssert(_path.is_absolute() || _path.root_path() == "/", "");
+
+	boost::filesystem::path _pathWithoutRoot = _path.relative_path();
+	boost::filesystem::path prefix;
+	for (boost::filesystem::path const& segment: _pathWithoutRoot)
+		if (segment.filename_is_dot_dot())
+			prefix /= segment;
+
+	return prefix;
+}
+
+bool FileReader::hasDotDotSegments(boost::filesystem::path const& _path)
+{
+	for (boost::filesystem::path const& segment: _path)
+		if (segment.filename_is_dot_dot())
+			return true;
+
+	return false;
+}
+
+bool FileReader::isUNCPath(boost::filesystem::path const& _path)
+{
+	string rootName = _path.root_name().string();
+
+	return (
+		rootName.size() == 2 ||
+		(rootName.size() > 2 && rootName[2] != rootName[1])
+	) && (
+		(rootName[0] == '/' && rootName[1] == '/')
+#if defined(_WIN32)
+		|| (rootName[0] == '\\' && rootName[1] == '\\')
+#endif
+	);
+}
+
+}
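
The manual handling of leading `..` segments in `normalizeCLIPathForVFS` can be sketched in isolation. This is only an illustration using C++17 `std::filesystem` as a stand-in for Boost, with POSIX-style paths assumed. The names `leadingDotDotPrefix` and `dropLeadingDotDots` are hypothetical, and unlike the committed helper this version stops at the first segment that is not `..`.

```cpp
#include <cassert>
#include <filesystem>

namespace fs = std::filesystem;

// Hypothetical helper: collects the run of `..` segments that directly follows the root.
fs::path leadingDotDotPrefix(fs::path const& path)
{
    assert(path.has_root_directory());
    fs::path prefix;
    for (fs::path const& segment : path.relative_path())
    {
        if (segment != fs::path(".."))
            break;
        prefix /= segment;
    }
    return prefix;
}

// Removes that run by making the path relative to "<root>/<prefix>" and
// re-attaching the root, e.g. "/../../a/b" -> "/a/b".
fs::path dropLeadingDotDots(fs::path const& path)
{
    fs::path prefix = leadingDotDotPrefix(path);
    if (prefix.empty())
        return path;
    return path.root_path() / path.lexically_relative(path.root_path() / prefix);
}
```

A path made up only of the root and `..` segments should come out of this sketch as `/.`, which matches the special case the committed function handles just before returning.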

libsolidity/interface/FileReader.h

Lines changed: 47 additions & 6 deletions
@@ -45,30 +45,35 @@ class FileReader
 		boost::filesystem::path _basePath = {},
 		FileSystemPathSet _allowedDirectories = {}
 	):
-		m_basePath(std::move(_basePath)),
 		m_allowedDirectories(std::move(_allowedDirectories)),
 		m_sourceCodes()
-	{}
+	{
+		setBasePath(_basePath);
+	}

-	void setBasePath(boost::filesystem::path _path) { m_basePath = std::move(_path); }
+	void setBasePath(boost::filesystem::path const& _path);
 	boost::filesystem::path const& basePath() const noexcept { return m_basePath; }

 	void allowDirectory(boost::filesystem::path _path) { m_allowedDirectories.insert(std::move(_path)); }
 	FileSystemPathSet const& allowedDirectories() const noexcept { return m_allowedDirectories; }

 	StringMap const& sourceCodes() const noexcept { return m_sourceCodes; }

-	/// Retrieves the source code for a given source unit ID.
+	/// Retrieves the source code for a given source unit name.
 	SourceCode const& sourceCode(SourceUnitName const& _sourceUnitName) const { return m_sourceCodes.at(_sourceUnitName); }

-	/// Resets all sources to the given map of source unit ID to source codes.
+	/// Resets all sources to the given map of source unit name to source codes.
 	/// Does not enforce @a allowedDirectories().
 	void setSources(StringMap _sources);

-	/// Adds the source code for a given source unit ID.
+	/// Adds the source code under a source unit name created by normalizing the file path.
 	/// Does not enforce @a allowedDirectories().
 	void setSource(boost::filesystem::path const& _path, SourceCode _source);

+	/// Adds the source code under the source unit name of @a <stdin>.
+	/// Does not enforce @a allowedDirectories().
+	void setStdin(SourceCode _source);
+
 	/// Receives a @p _sourceUnitName that refers to a source unit in compiler's virtual filesystem
 	/// and attempts to interpret it as a path and read the corresponding file from disk.
 	/// The read will only succeed if the canonical path of the file is within one of the @a allowedDirectories().
@@ -83,7 +88,43 @@ class FileReader
 		return [this](std::string const& _kind, std::string const& _path) { return readFile(_kind, _path); };
 	}

+	/// Normalizes a filesystem path to make it include all components up to the filesystem root,
+	/// remove small, inconsequential differences that do not affect the meaning and make it look
+	/// the same on all platforms (if possible). Symlinks in the path are not resolved.
+	/// The resulting path uses forward slashes as path separators, has no redundant separators,
+	/// has no redundant . or .. segments and has no root name if removing it does not change the meaning.
+	/// The path does not have to actually exist.
+	static boost::filesystem::path normalizeCLIPathForVFS(boost::filesystem::path const& _path);
+
+	/// @returns true if all the path components of @a _prefix are present at the beginning of @a _path.
+	/// Both paths must be absolute (or have slash as root) and normalized (no . or .. segments, no
+	/// multiple consecutive slashes).
+	/// Paths are treated as case-sensitive. Does not require the path to actually exist in the
+	/// filesystem and does not follow symlinks. Only considers whole segments, e.g. /abc/d is not
+	/// considered a prefix of /abc/def. Both paths must be non-empty.
+	/// Ignores the trailing slash, i.e. /a/b/c.sol/ is treated as a valid prefix of /a/b/c.sol.
+	static bool isPathPrefix(boost::filesystem::path const& _prefix, boost::filesystem::path const& _path);
+
+	/// If @a _prefix is actually a prefix of @p _path, removes it from @a _path to make it relative.
+	/// @returns The path without the prefix or unchanged path if there is no prefix.
+	/// If @a _path and @a _prefix are identical, the result is '.'.
+	static boost::filesystem::path stripPrefixIfPresent(boost::filesystem::path const& _prefix, boost::filesystem::path const& _path);
+
+	/// @returns true if the specified path is an UNC path.
+	/// UNC paths start with // followed by a name (on Windows they can also start with \\).
+	/// They are used for network shares on Windows. On UNIX systems they do not have the same
+	/// functionality but usually they are still recognized and treated in a special way.
+	static bool isUNCPath(boost::filesystem::path const& _path);
+
 private:
+	/// If @a _path starts with a number of .. segments, returns a path consisting only of those
+	/// segments (root name is not included). Otherwise returns an empty path. @a _path must be
+	/// absolute (or have slash as root).
+	static boost::filesystem::path absoluteDotDotPrefix(boost::filesystem::path const& _path);
+
+	/// @returns true if the path contains any .. segments.
+	/// static bool hasDotDotSegments declaration follows.
+	static bool hasDotDotSegments(boost::filesystem::path const& _path);
+
 	/// Base path, used for resolving relative paths in imports.
 	boost::filesystem::path m_basePath;
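
The `isUNCPath` rule described in the header (two leading slashes followed by something other than a third slash) can be sketched at the string level. This is an assumption-laden illustration: it inspects the whole path string instead of Boost's `root_name()`, `looksLikeUNCPath` is a hypothetical name, and the `allowBackslashes` flag stands in for the `_WIN32` branch of the committed code.

```cpp
#include <string_view>

// Hypothetical string-level check: true if `path` begins with exactly two identical
// separators, followed either by nothing or by a different character.
constexpr bool looksLikeUNCPath(std::string_view path, bool allowBackslashes = false)
{
    if (path.size() < 2)
        return false;
    char const separator = path[0];
    bool const validSeparator = separator == '/' || (allowBackslashes && separator == '\\');
    if (!validSeparator || path[1] != separator)
        return false;
    // "///a" is an ordinary rooted path with redundant slashes, not a UNC path.
    return path.size() == 2 || path[2] != separator;
}
```

Because the function is `constexpr`, the rule can be checked at compile time, for example that `//host/share/a.sol` is accepted while `///host/share` is not.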

solc/CommandLineInterface.cpp

Lines changed: 1 addition & 1 deletion
@@ -451,7 +451,7 @@ bool CommandLineInterface::readInputFiles()
 			m_standardJsonInput = readUntilEnd(m_sin);
 		}
 		else
-			m_fileReader.setSource(g_stdinFileName, readUntilEnd(m_sin));
+			m_fileReader.setStdin(readUntilEnd(m_sin));
 	}

 	if (m_fileReader.sourceCodes().empty() && !m_standardJsonInput.has_value())

test/CMakeLists.txt

Lines changed: 1 addition & 0 deletions
@@ -103,6 +103,7 @@ set(libsolidity_sources
     libsolidity/SyntaxTest.h
     libsolidity/ViewPureChecker.cpp
     libsolidity/analysis/FunctionCallGraph.cpp
+    libsolidity/interface/FileReader.cpp
 )
 detect_stray_source_files("${libsolidity_sources}" "libsolidity/")
