Commit fd32ffa

release 0.9.2
fix #5 and other improvements:
- new option -z to index archives and compressed files
- new option --zmax=NUM to specify nesting depth of archives to index
- new ugrep-indexer.exe Windows version
1 parent 78ca69a commit fd32ffa

5 files changed: +140 -47 lines changed

bin/win32/ugrep-indexer.exe

1 KB
Binary file not shown.

bin/win64/ugrep-indexer.exe

1 KB
Binary file not shown.

man.sh

Lines changed: 21 additions & 14 deletions
@@ -58,34 +58,41 @@ Indexes are up to date.
 .IP 1
 Indexing check with option \fB-c\fR detected missing and outdated index files.
 .SH EXAMPLES
-Recursively and incrementally index all non-binary files showing progress
+Recursively and incrementally index all non-binary files showing progress:
 .IP
 $ ugrep-indexer -I -v
 .PP
-Index all non-binary files, show progress, follow symbolic links to files (but
-not to directories), and do not index files and directories matching the globs
-in .gitignore:
+Recursively and incrementally index all non-binary files, including non-binary
+files stored in archives and in compressed files, showing progress:
 .IP
-$ ugrep-indexer -I -v -S -X
+$ ugrep-indexer -z -I -v
 .PP
-Recursively force re-indexing of all non-binary files:
+Incrementally index all non-binary files, including archives and compressed
+files, show progress, follow symbolic links to files (but not to directories),
+but do not index files and directories matching the globs in .gitignore:
 .IP
-$ ugrep-indexer -f -I
+$ ugrep-indexer -z -I -v -S -X
 .PP
-Recursively delete all hidden ._UG#_Store index files to restore the directory
-tree to non-indexed:
+Force re-indexing of all non-binary files, including archives and compressed
+files, follow symbolic links to files (but not to directories), but do not
+index files and directories matching the globs in .gitignore:
 .IP
-$ ugrep-indexer -d
+$ ugrep-indexer -f -z -I -v -S -X
 .PP
-Decrease index file storage to a minimum by decreasing indexing accuracy from 5
-(default) to 0:
+Same, but decrease index file storage to a minimum by decreasing indexing
+accuracy from 5 (default) to 0:
 .IP
-$ ugrep-indexer -If0
+$ ugrep-indexer -f -0 -z -I -v -X
 .PP
 Increase search performance by increasing the indexing accuracy from 5
 (default) to 7 at a cost of larger index files:
 .IP
-$ ugrep-indexer -If7
+$ ugrep-indexer -f7zIvX
+.PP
+Recursively delete all hidden ._UG#_Store index files to restore the directory
+tree to non-indexed:
+.IP
+$ ugrep-indexer -d
 .SH BUGS
 Report bugs at:
 .IP

man/ugrep-indexer.1

Lines changed: 75 additions & 22 deletions
@@ -1,4 +1,4 @@
-.TH UGREP-INDEXER "1" "August 12, 2023" "ugrep-indexer 0.9.1" "User Commands"
+.TH UGREP-INDEXER "1" "December 05, 2023" "ugrep-indexer 0.9.2" "User Commands"
 .SH NAME
 \fBugrep-indexer\fR -- file indexer for accelerated ugrep search
 .SH SYNOPSIS
@@ -11,7 +11,8 @@ The following options are available:
 Usage:
 ugrep\-indexer [\fB\-0\fR|...|\fB\-9\fR] [\fB\-.\fR] [\fB\-c\fR|\fB\-d\fR|\fB\-f\fR] [\fB\-I\fR] [\fB\-q\fR] [\fB\-S\fR] [\fB\-s\fR] [\fB\-X\fR] [\fB\-z\fR] [\fIPATH\fR]
 .TP
-PATH Optional pathname to the root of the directory tree to index.
+PATH Optional pathname to the root of the directory tree to index. The
+default is to recursively index the working directory tree.
 .TP
 \fB\-0\fR, \fB\-1\fR, \fB\-2\fR, \fB\-3\fR, ..., \fB\-9\fR, \fB\-\-accuracy\fR=\fIDIGIT\fR
 Specifies indexing accuracy. A low accuracy reduces the indexing
@@ -54,53 +55,105 @@ their error messages and warnings are suppressed.
 Display version and exit.
 .TP
 \fB\-v\fR, \fB\-\-verbose\fR
-Produce verbose output.
+Produce verbose output. Indexed files are indicated with an A for
+archive, C for compressed, B for binary or I for ignored binary.
 .TP
 \fB\-X\fR, \fB\-\-ignore\-files\fR[=\fIFILE\fR]
 Do not index files and directories matching the globs in a FILE
 encountered during indexing. The default FILE is `.gitignore'.
+This option may be repeated to specify additional files.
 .TP
 \fB\-z\fR, \fB\-\-decompress\fR
-Index the contents of compressed files and archives.
-This option is not yet available in this version.
-ugrep\-indexer 0.9.1 beta
-License BSD\-3\-Clause: <https://opensource.org/licenses/BSD\-3\-Clause>
-Written by Robert van Engelen and others: <https://github.com/Genivia/ugrep>
+Index the contents of compressed files and archives. When used
+with option \fB\-\-zmax\fR=\fINUM\fR, indexes the contents of compressed files
+and archives stored within archives up to NUM levels deep.
+Supported compression formats: gzip (.gz), compress (.Z), zip,
+bzip2 (requires suffix .bz, .bz2, .bzip2, .tbz, .tbz2, .tb2, .tz2),
+lzma and xz (requires suffix .lzma, .tlz, .xz, .txz),
+lz4 (requires suffix .lz4),
+zstd (requires suffix .zst, .zstd, .tzst),
+brotli (requires suffix .br).
+.TP
+\fB\-\-zmax\fR=\fINUM\fR
+When used with option \fB\-z\fR (\fB\-\-decompress\fR), indexes the contents of
+compressed files and archives stored within archives by up to NUM
+expansion levels deep. The default \fB\-\-zmax\fR=1 only permits indexing
+uncompressed files stored in cpio, pax, tar and zip archives;
+compressed files and archives are detected as binary files and are
+effectively ignored. Specify \fB\-\-zmax\fR=2 to index compressed files
+and archives stored in cpio, pax, tar and zip archives. NUM may
+range from 1 to 99 for up to 99 decompression and de\-archiving
+steps. Increasing NUM values gradually degrades performance.
+.TP
+Indexes are incrementally updated unless option \fB\-f\fR or \fB\-\-force\fR is specified.
+.TP
+
+.TP
+When option \fB\-I\fR or \fB\-\-ignore\-binary\fR is specified, binary files are ignored
+.TP
+and not indexed. Searching with ugrep \fB\-\-index\fR still searches binary files
+.TP
+unless ugrep option \fB\-I\fR or \fB\-\-ignore\-binary\fR is specified also.
+.TP
+
+.TP
+Archives and compressed files are incrementally indexed only when option \fB\-z\fR
+.TP
+or \fB\-\-decompress\fR is specified. Otherwise, archives and compressed files are
+.TP
+indexed as binary files, or are ignored with option \fB\-I\fR or \fB\-\-ignore\-binary\fR.
+.TP
+
+.TP
+To create an indexing log file, specify option \fB\-v\fR or \fB\-\-verbose\fR and redirect
+.TP
+standard output to a log file. All messages are sent to standard output.
+.TP
+
+.TP
+
 .SH "EXIT STATUS"
 The \fBugrep-indexer\fR utility exits with one of the following values:
 .IP 0
 Indexes are up to date.
 .IP 1
 Indexing check with option \fB-c\fR detected missing and outdated index files.
 .SH EXAMPLES
-Recursively and incrementally index all non-binary files showing progress
+Recursively and incrementally index all non-binary files showing progress:
 .IP
 $ ugrep-indexer -I -v
 .PP
-Index all non-binary files, show progress, follow symbolic links to files (but
-not to directories), and do not index files and directories matching the globs
-in .gitignore:
+Recursively and incrementally index all non-binary files, including non-binary
+files stored in archives and in compressed files, showing progress:
 .IP
-$ ugrep-indexer -I -v -S -X
+$ ugrep-indexer -z -I -v
 .PP
-Recursively force re-indexing of all non-binary files:
+Incrementally index all non-binary files, including archives and compressed
+files, show progress, follow symbolic links to files (but not to directories),
+but do not index files and directories matching the globs in .gitignore:
 .IP
-$ ugrep-indexer -f -I
+$ ugrep-indexer -z -I -v -S -X
 .PP
-Recursively delete all hidden ._UG#_Store index files to restore the directory
-tree to non-indexed:
+Force re-indexing of all non-binary files, including archives and compressed
+files, follow symbolic links to files (but not to directories), but do not
+index files and directories matching the globs in .gitignore:
 .IP
-$ ugrep-indexer -d
+$ ugrep-indexer -f -z -I -v -S -X
 .PP
-Decrease index file storage to a minimum by decreasing indexing accuracy from 5
-(default) to 0:
+Same, but decrease index file storage to a minimum by decreasing indexing
+accuracy from 5 (default) to 0:
 .IP
-$ ugrep-indexer -If0
+$ ugrep-indexer -f -0 -z -I -v -X
 .PP
 Increase search performance by increasing the indexing accuracy from 5
 (default) to 7 at a cost of larger index files:
 .IP
-$ ugrep-indexer -If7
+$ ugrep-indexer -f7zIvX
+.PP
+Recursively delete all hidden ._UG#_Store index files to restore the directory
+tree to non-indexed:
+.IP
+$ ugrep-indexer -d
 .SH BUGS
 Report bugs at:
 .IP
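
The --zmax text in the man page above documents two rules: NUM must lie between 1 and 99, and content is indexed only when the number of decompression and de-archiving steps needed to reach it does not exceed NUM. The sketch below models just those two rules in C++17. The names `parse_zmax` and `within_zmax` are hypothetical and do not come from ugrep-indexer; how the real tool parses the option or reacts to out-of-range values is not shown in this commit.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical helper (not from ugrep-indexer): parse the NUM of --zmax=NUM
// and accept it only when it lies in the documented range 1..99.
std::optional<int> parse_zmax(const std::string& text)
{
  if (text.empty() || text.size() > 2)
    return std::nullopt;              // at most two digits, so at most 99
  int value = 0;
  for (char c : text)
  {
    if (c < '0' || c > '9')
      return std::nullopt;            // reject anything that is not a digit
    value = 10 * value + (c - '0');
  }
  if (value < 1)
    return std::nullopt;              // 0 is below the documented minimum
  return value;
}

// Hypothetical model of the documented nesting rule: content that needs
// `levels` decompression/de-archiving steps is indexed only if levels <= zmax.
bool within_zmax(int levels, int zmax)
{
  assert(zmax >= 1 && zmax <= 99);
  return levels >= 1 && levels <= zmax;
}
```

Under this model a plain file inside a tar archive needs one step and is indexed with the default --zmax=1, while a gzip-compressed file inside that archive needs two steps and is only indexed with --zmax=2 or higher, which matches the wording of the man page.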

src/ugrep-indexer.cpp

Lines changed: 44 additions & 11 deletions
@@ -242,7 +242,7 @@ const char ugrep_index_file_magic[5] = "UG#\x03";
 const char *arg_pathname = NULL;
 
 // command-line options
-int flag_accuracy = 6; // -0 ... -9 (--accuracy) default is -6
+int flag_accuracy = 5; // -0 ... -9 (--accuracy) default is -5
 bool flag_check = false; // -c (--check)
 bool flag_decompress = false; // -z (--decompress)
 bool flag_delete = false; // -d (--delete)
@@ -524,7 +524,7 @@ void help()
 {
   std::cout << "\nUsage:\n\nugrep-indexer [-0|...|-9] [-.] [-c|-d|-f] [-I] [-q] [-S] [-s] [-X] [-z] [PATH]\n\n\
 PATH Optional pathname to the root of the directory tree to index. The\n\
-default is to index the working directory tree.\n\n\
+default is to recursively index the working directory tree.\n\n\
 -0, -1, -2, -3, ..., -9, --accuracy=DIGIT\n\
 Specifies indexing accuracy. A low accuracy reduces the indexing\n\
 storage overhead at the cost of a higher rate of false positive\n\
@@ -612,8 +612,26 @@ void help()
 "\
 This option is not available in this build configuration of ugrep.\n"
 #endif
-"\n";
-  version();
+"\n\
+Indexes are incrementally updated unless option -f or --force is specified.\n\
+\n\
+When option -I or --ignore-binary is specified, binary files are ignored\n\
+and not indexed. Searching with ugrep --index still searches binary files\n\
+unless ugrep option -I or --ignore-binary is specified also.\n\
+\n\
+Archives and compressed files are incrementally indexed only when option -z\n\
+or --decompress is specified. Otherwise, archives and compressed files are\n\
+indexed as binary files, or are ignored with option -I or --ignore-binary.\n\
+\n\
+To create an indexing log file, specify option -v or --verbose and redirect\n\
+standard output to a log file. All messages are sent to standard output.\n\
+\n\
+The ugrep-indexer utility exits with one of the following values:\n\
+0 Indexes are up to date.\n\
+1 Some indexes appear to be stale and are outdated or missing.\n\
+\n";
+
+  exit(EXIT_SUCCESS);
 }
 
 // display usage information and exit
@@ -629,7 +647,7 @@ void warning(const char *message, const char *arg = NULL)
   if (flag_no_messages)
     return;
   fflush(stdout);
-  fprintf(stderr, "ugrep-indexer: warning: %s%s%s\n", message, arg != NULL ? " " : "", arg != NULL ? arg : "");
+  printf("ugrep-indexer: warning: %s%s%s\n", message, arg != NULL ? " " : "", arg != NULL ? arg : "");
 }
 
 // display an error message unless option -s (--no-messages)
@@ -645,7 +663,7 @@ void error(const char *message, const char *arg)
   const char *errmsg = strerror(errno);
 #endif
   fflush(stdout);
-  fprintf(stderr, "ugrep-indexer: error: %s%s%s: %s\n", message, arg != NULL ? " " : "", arg != NULL ? arg : "", errmsg);
+  printf("ugrep-indexer: error: %s%s%s: %s\n", message, arg != NULL ? " " : "", arg != NULL ? arg : "", errmsg);
 }
 
 #ifdef HAVE_LIBZ
@@ -655,7 +673,7 @@ void cannot_decompress(const char *pathname, const char *message)
   if (!flag_verbose || flag_no_messages)
     return;
   fflush(stdout);
-  fprintf(stderr, "ugrep-indexer: warning: cannot decompress %s: %s\n", pathname, message != NULL ? message : "");
+  printf("ugrep-indexer: warning: cannot decompress %s: %s\n", pathname, message != NULL ? message : "");
 }
 #endif
 
@@ -817,6 +835,10 @@ void options(int argc, const char **argv)
     }
   }
 
+  // -q overrides -v
+  if (flag_quiet)
+    flag_verbose = false;
+
   // -c silently overrides -d and -f
   if (flag_check)
     flag_delete = flag_force = false;
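
The hunk above resolves conflicting options once, after argument parsing: -q switches off -v, and -c switches off -d and -f. A minimal self-contained sketch of that precedence step follows. The `Flags` struct and `resolve` function are hypothetical stand-ins for the indexer's global `flag_*` variables, introduced only for illustration.

```cpp
#include <cassert>

// Hypothetical stand-in for the indexer's global flags; only the fields
// needed to show the precedence rules from the hunk above are modeled.
struct Flags
{
  bool quiet = false;    // -q
  bool verbose = false;  // -v
  bool check = false;    // -c
  bool remove = false;   // -d ("delete" is a C++ keyword, hence "remove")
  bool force = false;    // -f
};

// Resolve conflicting options once, after all arguments have been parsed.
Flags resolve(Flags f)
{
  if (f.quiet)
    f.verbose = false;            // -q overrides -v
  if (f.check)
    f.remove = f.force = false;   // -c silently overrides -d and -f
  assert(!(f.quiet && f.verbose));
  return f;
}
```

Resolving precedence in one place after parsing means the result does not depend on the order in which the options appear on the command line.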
@@ -1430,8 +1452,6 @@ void cat(const std::string& pathname, std::stack<Entry>& dir_entries, std::vecto
 // recursively delete index files
 void deleter(const char *pathname)
 {
-  flag_no_messages = true;
-
   std::stack<Entry> dir_entries;
   std::vector<Entry> file_entries;
   std::string index_filename;
@@ -1444,6 +1464,7 @@ void deleter(const char *pathname)
   int64_t ign_files = 0;
   uint64_t index_time;
   uint64_t last_time;
+  uint64_t num_removed = 0;
 
   // pathname to the directory tree to index or .
   if (pathname == NULL)
14591480

14601481
cat(visit.pathname, dir_entries, file_entries, num_dirs, num_links, num_other, ign_dirs, ign_files, index_time, last_time, true);
14611482

1462-
// if index time is nonzero, there is a valid index file in this directory we should remove
1483+
// if index time is nonzero, there is a valid index file in this directory that we should remove
14631484
if (index_time > 0)
14641485
{
14651486
index_filename.assign(visit.pathname).append(PATHSEPSTR).append(ugrep_index_filename);
1466-
remove(index_filename.c_str());
1487+
if (remove(index_filename.c_str()) != 0)
1488+
{
1489+
error("cannot remove", index_filename.c_str());
1490+
}
1491+
else
1492+
{
1493+
++num_removed;
1494+
if (flag_verbose)
1495+
printf("%13" PRIu64 " %s\n", num_removed, index_filename.c_str());
1496+
}
14671497
}
14681498
}
1499+
1500+
if (!flag_quiet)
1501+
printf("\n%13" PRIu64 " indexes removed from %" PRIu64 " directories\n\n", num_removed, num_dirs);
14691502
}
14701503

14711504
// recursively index files
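
The `deleter` change above stops ignoring the result of `remove()`: a failure is reported, and each success is counted and optionally listed. The standalone sketch below shows the same pattern on a plain list of paths. The function `remove_all` is hypothetical and is not part of ugrep-indexer; it replaces the directory walk with a vector so that it can be compiled and run on its own.

```cpp
#include <cassert>
#include <cerrno>
#include <cinttypes>
#include <cstdio>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper: remove each listed file, report failures, and return
// how many files were actually removed.
std::uint64_t remove_all(const std::vector<std::string>& paths, bool verbose)
{
  std::uint64_t num_removed = 0;
  for (const std::string& path : paths)
  {
    if (std::remove(path.c_str()) != 0)
    {
      // like the commit, write to standard output so a redirected log is complete
      std::printf("error: cannot remove %s: %s\n", path.c_str(), std::strerror(errno));
    }
    else
    {
      ++num_removed;
      if (verbose)
        std::printf("%13" PRIu64 " %s\n", num_removed, path.c_str());
    }
  }
  assert(num_removed <= paths.size());
  return num_removed;
}
```

Returning the count lets the caller print a summary line, as the commit does with "indexes removed from ... directories", without the helper needing to know about quiet mode.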
