Commit 64dc8cd

Use file modification times for dirty working directory timestamps
When setuptools-scm encounters a dirty working directory, it now uses the latest modification time of changed files instead of falling back to the current timestamp. This provides more meaningful version timestamps during local development.

Changes:
- Added get_dirty_tag_date() method to Git and Mercurial working directory classes
- Enhanced timestamp logic in parsing functions to prioritize file mtimes
- Updated documentation to explain new timestamp behavior
- Maintains backward compatibility with clean repository behavior

The logic flow is now:
1. Try to get node_date from HEAD commit
2. If that fails AND working directory is dirty, use latest file mtime
3. Only fall back to datetime.now() as last resort
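
The three-step logic flow described in the commit message can be sketched as a small standalone function. This is only an illustration, not the code from this commit: `resolve_node_date` and its parameter names are hypothetical, and the `dirty_date_getter` callable stands in for `get_dirty_tag_date()`.

```python
from datetime import date, datetime, timezone
from typing import Callable, Optional


def resolve_node_date(
    head_date: Optional[date],
    is_dirty: bool,
    dirty_date_getter: Callable[[], Optional[date]],
) -> date:
    """Hypothetical helper mirroring the fallback order in the commit message."""
    # 1. Prefer the date of the HEAD commit when it is available.
    if head_date is not None:
        return head_date
    # 2. Otherwise, for a dirty working directory, try the file mtimes.
    if is_dirty:
        dirty_date = dirty_date_getter()
        if dirty_date is not None:
            return dirty_date
    # 3. Last resort: the current UTC date.
    return datetime.now(timezone.utc).date()
```

Passing the mtime lookup as a callable keeps it lazy, so the file system is only inspected when the HEAD date is missing and the directory is dirty.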
1 parent b070402 commit 64dc8cd

4 files changed

+209
-49
lines changed


docs/usage.md

Lines changed: 44 additions & 47 deletions
@@ -294,71 +294,68 @@ be kept in version control. It's strongly recommended to be put into gitignore.
294294
`setuptools-scm` implements a [file_finders] entry point
295295
which returns all files tracked by your SCM.
296296
This eliminates the need for a manually constructed `MANIFEST.in` in most cases where this
297-
would be required when not using `setuptools-scm`, namely:
297+
would be required when not using `setuptools-scm`.
298298

299-
* To ensure all relevant files are packaged when running the `sdist` command.
300-
* When using [include_package_data] to include package data as part of the `build` or `bdist_wheel`.
299+
[file_finders]: https://setuptools.pypa.io/en/stable/userguide/extension.html
301300

302301
#### How it works
303302

304-
The file finder integration works through setuptools' plugin system:
305-
306-
1. **Entry Point Registration**: setuptools-scm registers itself as a file finder via the `setuptools.file_finders` entry point
307-
2. **Automatic Discovery**: When setuptools builds a source distribution, it automatically calls setuptools-scm to get the list of files
308-
3. **SCM Integration**: setuptools-scm queries your SCM (Git, Mercurial) to get all tracked files
309-
4. **File Inclusion**: All SCM-tracked files are automatically included in the sdist
303+
1. **Automatic Discovery**: When building source distributions (`python -m build --sdist`), setuptools automatically calls the `setuptools-scm` file finder
304+
2. **SCM Integration**: The file finder queries your SCM (Git/Mercurial) for all tracked files
305+
3. **Inclusion**: All tracked files are automatically included in the sdist
310306

311307
#### Controlling file inclusion
312308

313-
**Using MANIFEST.in**: You can still use `MANIFEST.in` to override the automatic behavior:
309+
**To exclude unwanted files:**
314310

315-
- **Exclude files**: Use `global-exclude` or `exclude` to remove files that are SCM-tracked but shouldn't be in the package
316-
- **Include additional files**: Use `include` to add files that aren't SCM-tracked
311+
1. **Use `MANIFEST.in`** to exclude specific files/patterns:
312+
```
313+
exclude development.txt
314+
recursive-exclude tests *.pyc
315+
```
317316

318-
```text title="MANIFEST.in"
319-
# Exclude development files
320-
exclude *.nix
321-
exclude .pre-commit-config.yaml
322-
global-exclude *.pyc
317+
2. **Configure Git archive** (for Git repositories):
318+
```bash
319+
# Add to .gitattributes
320+
tests/ export-ignore
321+
*.md export-ignore
322+
```
323323

324-
# Include additional files not in SCM
325-
include data/special-file.dat
326-
```
324+
3. **Use `.hgignore`** or **Mercurial archive configuration** (for Mercurial repositories)
327325

328-
**Example of what gets included automatically**:
326+
#### Troubleshooting
329327

330-
- All files tracked by Git/Mercurial in your repository
331-
- Includes source code, data files, documentation, etc.
332-
- Excludes untracked files and files ignored by your SCM
328+
**Problem: Unwanted files in my package**
329+
- **Solution**: Add exclusions to `MANIFEST.in`
330+
- **Alternative**: Use Git/Mercurial archive configuration
333331

334-
#### Troubleshooting
332+
**Problem: Missing files in package**
333+
- **Check**: Are the files tracked in your SCM?
334+
- **Solution**: `git add` missing files or override with `MANIFEST.in`
335335

336-
**Too many files in your sdist?**
336+
**Problem: File finder not working**
337+
- **Check**: Is setuptools-scm installed in your build environment?
338+
- **Check**: Are you in a valid SCM repository?
337339

338-
1. Check what's being included: `python -m setuptools_scm ls`
339-
2. Use `MANIFEST.in` to exclude unwanted files:
340-
```text
341-
exclude development-file.txt
342-
global-exclude *.log
343-
prune unnecessary-directory/
344-
```
340+
### Timestamps for Local Development Versions
345341

346-
**Files missing from your sdist?**
342+
!!! info "Improved Timestamp Behavior"
347343

348-
1. Ensure files are tracked by your SCM: `git add` or `hg add`
349-
2. For non-SCM files, add them via `MANIFEST.in`:
350-
```text
351-
include important-file.txt
352-
recursive-include data *.json
353-
```
344+
When your working directory has uncommitted changes (dirty), setuptools-scm now uses the **actual modification time of changed files** instead of the current time for local version schemes like `node-and-date`.
345+
346+
**Before**: Dirty working directories always used current time (`now`)
347+
**Now**: Uses the latest modification time of changed files, falling back to current time only if no changed files are found
348+
349+
This provides more stable and meaningful timestamps that reflect when you actually made changes to your code.
354350

355-
**Disable automatic file finding** (not recommended):
351+
**How it works:**
356352

357-
If you need to completely disable setuptools-scm's file finder (not recommended), you would need to uninstall setuptools-scm from your build environment and handle versioning differently.
353+
1. **Clean repository**: Uses commit timestamp from SCM
354+
2. **Dirty repository**: Uses latest modification time of changed files
355+
3. **Fallback**: Uses current time if no modification times can be determined
358356

359-
`MANIFEST.in` may still be used: anything defined there overrides the hook.
360-
This is mostly useful to exclude files tracked in your SCM from packages,
361-
although in principle it can be used to explicitly include non-tracked files too.
357+
**Benefits:**
362358

363-
[file_finders]: https://setuptools.pypa.io/en/latest/userguide/extension.html#adding-support-for-revision-control-systems
364-
[include_package_data]: https://setuptools.readthedocs.io/en/latest/setuptools.html#including-data-files
359+
- More stable builds during development
360+
- Timestamps reflect actual change times
361+
- Better for reproducible development workflows
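
The dirty-repository step above reduces to taking the newest modification time among the changed files and converting it to a UTC date. A minimal sketch of that idea follows, assuming the list of changed paths has already been obtained from the SCM; the function name `latest_mtime_date` is hypothetical and is not part of setuptools-scm.

```python
from datetime import date, datetime, timezone
from pathlib import Path
from typing import Iterable, Optional


def latest_mtime_date(root: Path, relative_paths: Iterable[str]) -> Optional[date]:
    """Return the UTC date of the newest mtime among the given files, or None."""
    latest = 0.0
    for rel in relative_paths:
        try:
            latest = max(latest, (root / rel).stat().st_mtime)
        except OSError:
            # Deleted or unreadable files contribute nothing.
            continue
    if latest <= 0:
        return None
    return datetime.fromtimestamp(latest, timezone.utc).date()
```

Because the result is a date rather than a full timestamp, repeated builds on the same day produce the same value as long as no changed file is touched after midnight UTC.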

src/setuptools_scm/git.py

Lines changed: 53 additions & 1 deletion
@@ -144,6 +144,45 @@ def parse_timestamp(timestamp_text: str) -> date | None:
144144
error_msg="logging the iso date for head failed",
145145
)
146146

147+
def get_dirty_tag_date(self) -> date | None:
148+
"""Get the latest modification time of changed files in the working directory.
149+
150+
Returns the date of the most recently modified file that has changes,
151+
or None if no files are changed or if an error occurs.
152+
"""
153+
if not self.is_dirty():
154+
return None
155+
156+
try:
157+
# Get list of changed files
158+
changed_files_res = run_git(["diff", "--name-only"], self.path)
159+
if changed_files_res.returncode != 0:
160+
return None
161+
162+
changed_files = changed_files_res.stdout.strip().split("\n")
163+
if not changed_files or changed_files == [""]:
164+
return None
165+
166+
latest_mtime = 0.0
167+
for filepath in changed_files:
168+
full_path = self.path / filepath
169+
try:
170+
file_stat = full_path.stat()
171+
latest_mtime = max(latest_mtime, file_stat.st_mtime)
172+
except OSError:
173+
# File might not exist or be accessible, skip it
174+
continue
175+
176+
if latest_mtime > 0:
177+
# Convert to UTC date
178+
dt = datetime.fromtimestamp(latest_mtime, timezone.utc)
179+
return dt.date()
180+
181+
except Exception as e:
182+
log.debug("Failed to get dirty tag date: %s", e)
183+
184+
return None
185+
147186
def is_shallow(self) -> bool:
148187
return self.path.joinpath(".git/shallow").is_file()
149188

@@ -277,7 +316,20 @@ def _git_parse_inner(
277316
tag=tag, distance=distance, dirty=dirty, node=node, config=config
278317
)
279318
branch = wd.get_branch()
280-
node_date = wd.get_head_date() or datetime.now(timezone.utc).date()
319+
node_date = wd.get_head_date()
320+
321+
# If we can't get node_date from HEAD (e.g., no commits yet),
322+
# and the working directory is dirty, try to use the latest
323+
# modification time of changed files instead of current time
324+
if node_date is None and wd.is_dirty():
325+
dirty_date = wd.get_dirty_tag_date()
326+
if dirty_date is not None:
327+
node_date = dirty_date
328+
329+
# Final fallback to current time
330+
if node_date is None:
331+
node_date = datetime.now(timezone.utc).date()
332+
281333
return dataclasses.replace(version, branch=branch, node_date=node_date)
282334

283335

src/setuptools_scm/hg.py

Lines changed: 60 additions & 1 deletion
@@ -52,7 +52,17 @@ def get_meta(self, config: Configuration) -> ScmVersion | None:
5252
check=True,
5353
).stdout.split("\n")
5454
dirty = bool(int(dirty_str))
55-
node_date = datetime.date.fromisoformat(dirty_date if dirty else node_date_str)
55+
56+
# For dirty working directories, try to use the latest file modification time
57+
# before falling back to the hg id date
58+
if dirty:
59+
file_mod_date = self.get_dirty_tag_date()
60+
if file_mod_date is not None:
61+
node_date = file_mod_date
62+
else:
63+
node_date = datetime.date.fromisoformat(dirty_date)
64+
else:
65+
node_date = datetime.date.fromisoformat(node_date_str)
5666

5767
if node == "0" * len(node):
5868
log.debug("initial node %s", self.path)
@@ -144,6 +154,55 @@ def check_changes_since_tag(self, tag: str | None) -> bool:
144154

145155
return bool(self.hg_log(revset, "."))
146156

157+
def get_dirty_tag_date(self) -> datetime.date | None:
158+
"""Get the latest modification time of changed files in the working directory.
159+
160+
Returns the date of the most recently modified file that has changes,
161+
or None if no files are changed or if an error occurs.
162+
"""
163+
try:
164+
# Check if working directory is dirty first
165+
res = _run([HG_COMMAND, "id", "-T", "{dirty}"], cwd=self.path)
166+
if res.returncode != 0 or not bool(res.stdout):
167+
return None
168+
169+
# Get list of changed files using hg status
170+
status_res = _run([HG_COMMAND, "status", "-m", "-a", "-r"], cwd=self.path)
171+
if status_res.returncode != 0:
172+
return None
173+
174+
changed_files = []
175+
for line in status_res.stdout.strip().split("\n"):
176+
if line and len(line) > 2:
177+
# Format is "M filename" or "A filename" etc.
178+
filepath = line[2:] # Skip status char and space
179+
changed_files.append(filepath)
180+
181+
if not changed_files:
182+
return None
183+
184+
latest_mtime = 0.0
185+
for filepath in changed_files:
186+
full_path = self.path / filepath
187+
try:
188+
file_stat = full_path.stat()
189+
latest_mtime = max(latest_mtime, file_stat.st_mtime)
190+
except OSError:
191+
# File might not exist or be accessible, skip it
192+
continue
193+
194+
if latest_mtime > 0:
195+
# Convert to UTC date
196+
dt = datetime.datetime.fromtimestamp(
197+
latest_mtime, datetime.timezone.utc
198+
)
199+
return dt.date()
200+
201+
except Exception as e:
202+
log.debug("Failed to get dirty tag date: %s", e)
203+
204+
return None
205+
147206

148207
def parse(root: _t.PathT, config: Configuration) -> ScmVersion | None:
149208
_require_command(HG_COMMAND)

src/setuptools_scm/hg_git.py

Lines changed: 52 additions & 0 deletions
@@ -47,6 +47,58 @@ def get_head_date(self) -> date | None:
4747
[HG_COMMAND, "log", "-r", ".", "-T", "{shortdate(date)}"], cwd=self.path
4848
).parse_success(parse=date.fromisoformat, error_msg="head date err")
4949

50+
def get_dirty_tag_date(self) -> date | None:
51+
"""Get the latest modification time of changed files in the working directory.
52+
53+
Returns the date of the most recently modified file that has changes,
54+
or None if no files are changed or if an error occurs.
55+
"""
56+
if not self.is_dirty():
57+
return None
58+
59+
try:
60+
from datetime import datetime
61+
from datetime import timezone
62+
63+
# Get list of changed files using hg status
64+
status_res = _run([HG_COMMAND, "status", "-m", "-a", "-r"], cwd=self.path)
65+
if status_res.returncode != 0:
66+
return None
67+
68+
changed_files = []
69+
for line in status_res.stdout.strip().split("\n"):
70+
if line and len(line) > 2:
71+
# Format is "M filename" or "A filename" etc.
72+
filepath = line[2:] # Skip status char and space
73+
changed_files.append(filepath)
74+
75+
if not changed_files:
76+
return None
77+
78+
latest_mtime = 0.0
79+
for filepath in changed_files:
80+
full_path = self.path / filepath
81+
try:
82+
file_stat = full_path.stat()
83+
latest_mtime = max(latest_mtime, file_stat.st_mtime)
84+
except OSError:
85+
# File might not exist or be accessible, skip it
86+
continue
87+
88+
if latest_mtime > 0:
89+
# Convert to UTC date
90+
dt = datetime.fromtimestamp(latest_mtime, timezone.utc)
91+
return dt.date()
92+
93+
except Exception as e:
94+
# Use the parent's log module
95+
import logging
96+
97+
log = logging.getLogger(__name__)
98+
log.debug("Failed to get dirty tag date: %s", e)
99+
100+
return None
101+
50102
def is_shallow(self) -> bool:
51103
return False
52104
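
Both Mercurial implementations above parse the output of `hg status -m -a -r` the same way: each line starts with a one-character status code and a space, followed by the path. That parsing step can be sketched in isolation as below. The helper name `parse_hg_status` is hypothetical and does not exist in setuptools-scm; it only mirrors the `line[2:]` slicing used in the diff.

```python
from typing import List


def parse_hg_status(output: str) -> List[str]:
    """Extract file paths from `hg status` style output (hypothetical helper)."""
    paths = []
    for line in output.strip().split("\n"):
        # Lines look like "M path" or "A path"; skip anything too short to hold a path.
        if line and len(line) > 2:
            paths.append(line[2:])  # drop the status character and the space
    return paths
```

Paths reported as removed (`R`) are returned too; in the commit's code they are later skipped when `stat()` raises `OSError` for the missing file.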
