
Commit e35dec5

aitap, ben-schwen, HughParsonage, and jangorecki authored
Replace TRUELENGTH markers with a hash (#6694)
* Implement the hash table
* memrecycle(): replace TRUELENGTH marks with a hash
* rbindlist(): replace 1/2 TRUELENGTH with hashing
  Also avoid crashing when creating a 0-size hash.
* rbindlist(): replace 2/2 TRUELENGTH with hashing
  This will likely require a dynamically growing hash of TRUELENGTHs instead of the current pre-allocation approach with a very conservative over-estimate.
* chmatchMain(): replace TRUELENGTH marks with a hash
* copySharedColumns(): hash instead of TRUELENGTH
* combineFactorLevels(): hash instead of TRUELENGTH
* anySpecialStatic(): hash instead of TRUELENGTH
* forder(): hash instead of TRUELENGTH
  The hash needs O(n) memory (actually 2*n/load_factor entries), which isn't great.
* Remove savetl()
* Add codecov suppressions
* Dynamically grow the hash table when the bound is unknown
  In forder() and rbindlist(), there is no good upper bound on the number of hash elements known ahead of time, so grow the hash table dynamically. Since R/W locks are far too slow and OpenMP atomics are too limited, rely on strategically placed flushes, which isn't really a solution.
* Minor hash improvements
  Use only 28 bits of the pointer (the lower 32, discarding the lowest 4). Inline the linear search by advancing the pointer instead of repeatedly recomputing and dividing the hash value. Average improvement of 10%.
* dhash: no need to keep the previous table
  The hash can only be enlarged from (1) a single-threaded context or (2) under a critical section, so there is no need to worry about other threads getting a use-after-free due to a reallocation. This should halve the memory used by the hash table.
* Use double hashing instead of linear probing (#7418)
  * add lookup_or_insert and use it
  * use cuckoo hashing
  * add rehash
  * use power of 2 and mask instead of modulo
  * mix instead of multiplication; use different mixes; change multipliers
  * use double hashing; remove xor folding
  * fix the allocation non-overflow precondition
  * set the default load factor
  * inline hash_rehash()
  * update comments
  * leave overflow checking to R_alloc
  * internal_error() is not covered
  Co-authored-by: Ivan K <[email protected]>
* Replace dhashtab with hashtab in rbindlist
* Use hashtab in forder()
  Since range_str() runs a parallel OpenMP loop that may update the hash table in a critical section, use a special form of hash_set that returns the newly reallocated hash table instead of overwriting it in place.
* Drop dhashtab
* rbindlist: initial hash size = upperBoundUniqueNames
* Don't bother cleaning the hash before an error
* Avoid setting the same key & value twice
* chmatch: hash x instead of table (#7454)
  * add a hash-x branch in chmatch
  * adapt the kick-in threshold
  * make the chin branch more explicit
* Use linear probing instead of double hashing (#7455)
  * remove the mask from the struct
  * fix a comment
* rbindlist(): better initial allocation size
  Also adjust the minimal hash table size to avoid shift overflow on size=0 and free=0 on size=1.
* dogroups: replace remaining uses of SETLENGTH
* hashtab: switch to the C allocator to avoid longjumps
  PROTECT() the corresponding EXTPTRSXP while it is in use. Introduce a separate hash_set_shared() operation that avoids long jumps. Deallocate the previous hash table when growing a non-shared hash table.
* range_str: use hash_set_shared in the OpenMP region
  Explicitly check the return value and update the shared pointer when necessary. If a reallocation attempt fails, signal an error when possible.
* range_str: allocate temporaries on the R heap
  Avoid memory leaks from potential long jumps in hash_set().
* memrecycle: allocate the temporary on the R heap
  This avoids a memory leak in case growVector or hash_set fails.
* chmatchdup: allocate temporaries on the R heap
  Prevent a memory leak in case hash_set() causes a long jump.
* rbindlist: drop 'uniq'
  Turns out it wasn't used.
* rbindlist: use hash_set_shared
  Where C-heap allocations also exist, catch and handle potential allocation failures from the hash table.
* range_str: propagate rehashed marks to the caller
* Fix a stale comment
* NEWS item
* glci: update status expectations for R-devel
* copyAsGrowable: use the correct number of arguments

Co-authored-by: Benjamin Schwendinger <[email protected]>
Co-authored-by: HughParsonage <[email protected]>
Co-authored-by: Jan Gorecki <[email protected]>
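The hash these commits converge on is deliberately small: open addressing with linear probing over a power-of-2 table (so the slot index comes from a mask rather than a modulo), CHARSXP pointers as keys, integer marks as values, and rehashing into a larger table once a load-factor threshold is crossed. The sketch below illustrates that scheme only; it is not the data.table source, and the names (ht_create, ht_get, ht_set, ht_grow), the mixing constant, and the 2/3 load factor are assumptions made for the example (error and overflow checks are omitted).

```c
#include <stdlib.h>
#include <stdint.h>

/* Illustration: open addressing + linear probing over a power-of-2 table,
   keyed on pointer identity (interned CHARSXPs compare equal by pointer). */
typedef struct { const void *key; int value; } ht_entry;
typedef struct { size_t mask; size_t used; ht_entry *slots; } ht;

static size_t ht_index(const ht *h, const void *key) {
  uintptr_t bits = (uintptr_t)key >> 4;              /* drop low alignment bits */
  return (size_t)(bits * 2654435761u) & h->mask;     /* mix, then mask (power-of-2 size) */
}

static ht *ht_create(size_t n) {                     /* capacity = power of 2 >= 2*n */
  size_t cap = 2;
  while (cap < 2 * n) cap <<= 1;
  ht *h = malloc(sizeof *h);
  h->mask = cap - 1; h->used = 0;
  h->slots = calloc(cap, sizeof *h->slots);          /* key == NULL marks an empty slot */
  return h;
}

static int ht_get(const ht *h, const void *key, int absent) {
  for (size_t i = ht_index(h, key); ; i = (i + 1) & h->mask) {  /* linear probe */
    if (h->slots[i].key == key) return h->slots[i].value;
    if (h->slots[i].key == NULL) return absent;
  }
}

static void ht_set(ht *h, const void *key, int value);

static void ht_grow(ht *h) {                         /* double the table, reinsert all keys */
  ht_entry *old = h->slots; size_t oldcap = h->mask + 1;
  h->mask = 2 * oldcap - 1; h->used = 0;
  h->slots = calloc(2 * oldcap, sizeof *h->slots);
  for (size_t i = 0; i < oldcap; ++i)
    if (old[i].key) ht_set(h, old[i].key, old[i].value);
  free(old);
}

static void ht_set(ht *h, const void *key, int value) {
  size_t i = ht_index(h, key);
  while (h->slots[i].key && h->slots[i].key != key) i = (i + 1) & h->mask;
  if (h->slots[i].key == NULL) {                     /* inserting a new key */
    if (3 * (h->used + 1) > 2 * (h->mask + 1)) {     /* keep load factor under ~2/3 */
      ht_grow(h);
      ht_set(h, key, value);
      return;
    }
    h->slots[i].key = key;
    h->used++;
  }
  h->slots[i].value = value;
}
```

In the diffs below the analogous calls are hash_create(), hash_set() and hash_lookup(); per the commit message, the real table's storage is tied to an EXTPTRSXP (marks->prot) that callers PROTECT, and a hash_set_shared() variant supports growth from inside OpenMP regions.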
1 parent 8e29d59 commit e35dec5

13 files changed: +399 / -279 lines

.gitlab-ci.yml

Lines changed: 2 additions & 2 deletions
@@ -183,7 +183,7 @@ test-lin-dev-gcc-strict-cran:
   - R CMD check --as-cran $(ls -1t data.table_*.tar.gz | head -n 1)
   - (! grep "warning:" data.table.Rcheck/00install.out)
   - >-
-    Rscript -e 'l=tail(readLines("data.table.Rcheck/00check.log"), 1L); notes<-"Status: 2 NOTEs"; if (!identical(l, notes)) stop("Last line of ", shQuote("00check.log"), " is not ", shQuote(notes), " (non-API calls, V8 package) but ", shQuote(l)) else q("no")'
+    Rscript -e 'l=tail(readLines("data.table.Rcheck/00check.log"), 1L); notes<-"Status: 1 NOTE"; if (!identical(l, notes)) stop("Last line of ", shQuote("00check.log"), " is not ", shQuote(notes), " (V8 package) but ", shQuote(l)) else q("no")'
 
 ## R-devel on Linux clang
 # R compiled with clang, flags removed: -flto=auto -fopenmp
@@ -206,7 +206,7 @@ test-lin-dev-clang-cran:
   - R CMD check --as-cran $(ls -1t data.table_*.tar.gz | head -n 1)
   - (! grep "warning:" data.table.Rcheck/00install.out)
   - >-
-    Rscript -e 'l=tail(readLines("data.table.Rcheck/00check.log"), 1L); notes<-"Status: 2 NOTEs"; if (!identical(l, notes)) stop("Last line of ", shQuote("00check.log"), " is not ", shQuote(notes), " (non-API calls, V8 package) but ", shQuote(l)) else q("no")'
+    Rscript -e 'l=tail(readLines("data.table.Rcheck/00check.log"), 1L); notes<-"Status: 1 NOTE"; if (!identical(l, notes)) stop("Last line of ", shQuote("00check.log"), " is not ", shQuote(notes), " (V8 package) but ", shQuote(l)) else q("no")'
 
 # stated dependency on R
 test-lin-ancient-cran:

NEWS.md

Lines changed: 2 additions & 0 deletions
@@ -379,6 +379,8 @@ See [#2611](https://github.com/Rdatatable/data.table/issues/2611) for details. T
 
 8. Retain important information in the error message about the source of the error when `i=` fails, e.g. pointing to `charToDate()` failing in `DT[date_col == "20250101"]`, [#7444](https://github.com/Rdatatable/data.table/issues/7444). Thanks @jan-swissre for the report and @MichaelChirico for the fix.
 
+9. Internal use of declared non-API R functions `SETLENGTH`, `TRUELENGTH`, `SET_TRUELENGTH`, and `SET_GROWABLE_BIT` has been eliminated. Most usages have been migrated to R's experimental resizable vectors API (thanks to @ltierney, introduced in R 4.6.0, backported for older R versions), [#7451](https://github.com/Rdatatable/data.table/pull/7451). Uses of `TRUELENGTH` for marking seen items during grouping and binding operations (aka free hash table trick) have been replaced with proper hash tables, [#6694](https://github.com/Rdatatable/data.table/pull/6694). The new hash table implementation uses linear probing with power of 2 tables and automatic resizing. Additionally, `chmatch()` now hashes the needle (`x`) instead of the haystack (`table`) when `length(table) >> length(x)`, significantly improving performance for lookups into large tables. We've benchmarked the refactored code and find the performance satisfactory, but please do report any edge case performance regressions we may have missed. Thanks to @aitap, @ben-schwen, @jangorecki and @HughParsonage for implementation and reviews.
+
 ## data.table [v1.17.8](https://github.com/Rdatatable/data.table/milestone/41) (6 July 2025)
 
 1. Internal functions used to signal errors are now marked as non-returning, silencing a compiler warning about potentially unchecked allocation failure. Thanks to Prof. Brian D. Ripley for the report and @aitap for the fix, [#7070](https://github.com/Rdatatable/data.table/pull/7070).
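The `chmatch()` change called out in the NEWS entry above is, at heart, a choice of which side to hash. The sketch below only illustrates that decision; `choose_hash_x` is a hypothetical helper written for this note (it is not a data.table function), and the 2x threshold mirrors the `tablelen > 2 * xlen` condition visible in the src/chmatch.c diff further down.

```c
#include <stdbool.h>

/* Which side should chmatch hash?  Hashing 'table' costs up to tablelen inserts and
   entries; hashing 'x' costs only xlen entries plus one scan of 'table' that can stop
   early once every distinct element of 'x' has been assigned its first position. */
static bool choose_hash_x(int xlen, int tablelen, bool chmatchdup) {
  if (chmatchdup)
    return false;            /* chmatchdup() keeps per-duplicate bookkeeping over 'table' */
  return tablelen > 2 * xlen; /* the kick-in factor used in the diff below */
}
```

Whichever side is hashed, each element of `x` is still answered by a single probe into the table that was built.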

R/data.table.R

Lines changed: 2 additions & 2 deletions
@@ -1611,7 +1611,7 @@ replace_dot_alias = function(e) {
     if (length(xcols)) {
       # TODO add: if (max(len__)==nrow) stopf("There is no need to deep copy x in this case")
       # TODO move down to dogroup.c, too.
-      SDenv$.SDall = .Call(CsubsetDT, x, if (length(len__)) seq_len(max(len__)) else 0L, xcols) # must be deep copy when largest group is a subset
+      SDenv$.SDall = .Call(CcopyAsGrowable, .Call(CsubsetDT, x, if (length(len__)) seq_len(max(len__)) else 0L, xcols)) # must be deep copy when largest group is a subset
       if (!is.data.table(SDenv$.SDall)) setattr(SDenv$.SDall, "class", c("data.table","data.frame")) # DF |> DT(,.SD[...],by=grp) needs .SD to be data.table, test 2022.012
       if (xdotcols) setattr(SDenv$.SDall, 'names', ansvars[xcolsAns]) # now that we allow 'x.' prefix in 'j', #2313 bug fix - [xcolsAns]
       SDenv$.SD = if (length(non_sdvars)) shallow(SDenv$.SDall, sdvars) else SDenv$.SDall
@@ -1884,7 +1884,7 @@ replace_dot_alias = function(e) {
     grpcols = leftcols  # 'leftcols' are the columns in i involved in the join (either head of key(i) or head along i)
     jiscols = chmatch(jisvars, names_i)  # integer() if there are no jisvars (usually there aren't, advanced feature)
     xjiscols = chmatch(xjisvars, names_x)
-    SDenv$.xSD = x[min(nrow(i), 1L), xjisvars, with=FALSE]
+    SDenv$.xSD = .Call(CcopyAsGrowable, x[min(nrow(i), 1L), xjisvars, with=FALSE])
     if (!missing(on)) o__ = xo else o__ = integer(0L)
   } else {
     groups = byval

inst/tests/tests.Rraw

Lines changed: 8 additions & 0 deletions
@@ -21888,3 +21888,11 @@ foo = function(dt) { dt[,b:=4:6]; return(7:9) }
 DT = data.table(a=1:3)
 test(2349, DT[,c:=outer(DT)], data.table(a=1:3, b=4:6, c=7:9))
 test(2349.1, DT[,c:=foo(DT)], data.table(a=1:3, b=4:6, c=7:9))
+rm(inner, outer, foo, DT)
+
+# exercise rehashing during forder, #6694
+strings = as.character(6145:1)
+DT = data.table(x = strings)
+setorder(DT, x)
+test(2350, DT[["x"]], sort.int(strings, method='radix'))
+rm(DT, strings)

src/assign.c

Lines changed: 13 additions & 88 deletions
@@ -762,29 +762,19 @@ const char *memrecycle(const SEXP target, const SEXP where, const int start, con
   const int nTargetLevels=length(targetLevels), nSourceLevels=length(sourceLevels);
   const SEXP *targetLevelsD=STRING_PTR_RO(targetLevels), *sourceLevelsD=STRING_PTR_RO(sourceLevels);
   SEXP newSource = PROTECT(allocVector(INTSXP, length(source))); protecti++;
-  savetl_init();
+  hashtab * marks = hash_create((size_t)nTargetLevels + nSourceLevels);
+  PROTECT(marks->prot); protecti++;
   for (int k=0; k<nTargetLevels; ++k) {
     const SEXP s = targetLevelsD[k];
-    const int tl = TRUELENGTH(s);
-    if (tl>0) {
-      savetl(s);
-    } else if (tl<0) {
-      // # nocov start
-      for (int j=0; j<k; ++j) SET_TRUELENGTH(s, 0); // wipe our negative usage and restore 0
-      savetl_end(); // then restore R's own usage (if any)
-      internal_error(__func__, "levels of target are either not unique or have truelength<0"); // # nocov
-      // # nocov end
-    }
-    SET_TRUELENGTH(s, -k-1);
+    hash_set(marks, s, -k-1);
   }
   int nAdd = 0;
   for (int k=0; k<nSourceLevels; ++k) {
     const SEXP s = sourceLevelsD[k];
-    const int tl = TRUELENGTH(s);
+    const int tl = hash_lookup(marks, s, 0);
     if (tl>=0) {
       if (!sourceIsFactor && s==NA_STRING) continue; // don't create NA factor level when assigning character to factor; test 2117
-      if (tl>0) savetl(s);
-      SET_TRUELENGTH(s, -nTargetLevels-(++nAdd));
+      hash_set(marks, s, -nTargetLevels-(++nAdd));
     } // else, when sourceIsString, it's normal for there to be duplicates here
   }
   const int nSource = length(source);
@@ -793,45 +783,36 @@ const char *memrecycle(const SEXP target, const SEXP where, const int start, con
     const int *sourceD = INTEGER(source);
     for (int i=0; i<nSource; ++i) { // convert source integers to refer to target levels
       const int val = sourceD[i];
-      newSourceD[i] = val==NA_INTEGER ? NA_INTEGER : -TRUELENGTH(sourceLevelsD[val-1]); // retains NA factor levels here via TL(NA_STRING); e.g. ordered factor
+      newSourceD[i] = val==NA_INTEGER ? NA_INTEGER : -hash_lookup(marks, sourceLevelsD[val-1], 0); // retains NA factor levels here via TL(NA_STRING); e.g. ordered factor
     }
   } else {
     const SEXP *sourceD = STRING_PTR_RO(source);
     for (int i=0; i<nSource; ++i) { // convert source integers to refer to target levels
       const SEXP val = sourceD[i];
-      newSourceD[i] = val==NA_STRING ? NA_INTEGER : -TRUELENGTH(val);
+      newSourceD[i] = val==NA_STRING ? NA_INTEGER : -hash_lookup(marks, val, 0);
     }
   }
   source = newSource;
-  for (int k=0; k<nTargetLevels; ++k) SET_TRUELENGTH(targetLevelsD[k], 0); // don't need those anymore
+  for (int k=0; k<nTargetLevels; ++k) hash_set(marks, targetLevelsD[k], 0); // don't need those anymore
   if (nAdd) {
-    // cannot grow the levels yet as that would be R call which could fail to alloc and we have no hook to clear up
-    SEXP *temp = malloc(sizeof(*temp) * nAdd);
-    if (!temp) {
-      // # nocov start
-      for (int k=0; k<nSourceLevels; ++k) SET_TRUELENGTH(sourceLevelsD[k], 0);
-      savetl_end();
-      error(_("Unable to allocate working memory of %zu bytes to combine factor levels"), nAdd*sizeof(SEXP *));
-      // # nocov end
-    }
+    void *vmax = vmaxget();
+    SEXP *temp = (SEXP *)R_alloc(nAdd, sizeof(*temp));
     for (int k=0, thisAdd=0; thisAdd<nAdd; ++k) { // thisAdd<nAdd to stop early when the added ones are all reached
       SEXP s = sourceLevelsD[k];
-      int tl = TRUELENGTH(s);
+      int tl = hash_lookup(marks, s, 0);
       if (tl) { // tl negative here
         if (tl != -nTargetLevels-thisAdd-1) internal_error(__func__, "extra level check sum failed"); // # nocov
         temp[thisAdd++] = s;
-        SET_TRUELENGTH(s,0);
+        hash_set(marks, s, 0);
       }
     }
-    savetl_end();
     setAttrib(target, R_LevelsSymbol, targetLevels=growVector(targetLevels, nTargetLevels + nAdd));
     for (int k=0; k<nAdd; ++k) {
       SET_STRING_ELT(targetLevels, nTargetLevels+k, temp[k]);
     }
-    free(temp);
+    vmaxset(vmax);
   } else {
     // all source levels were already in target levels, but not with the same integers; we're done
-    savetl_end();
   }
   // now continue, but with the mapped integers in the (new) source
 }
@@ -1206,62 +1187,6 @@ SEXP allocNAVectorLike(SEXP x, R_len_t n) {
   return(v);
 }
 
-static SEXP *saveds=NULL;
-static R_len_t *savedtl=NULL, nalloc=0, nsaved=0;
-
-void savetl_init(void) {
-  if (nsaved || nalloc || saveds || savedtl) {
-    internal_error(__func__, "savetl_init checks failed (%d %d %p %p)", nsaved, nalloc, (void *)saveds, (void *)savedtl); // # nocov
-  }
-  nsaved = 0;
-  nalloc = 100;
-  saveds = malloc(sizeof(*saveds) * nalloc);
-  savedtl = malloc(sizeof(*savedtl) * nalloc);
-  if (!saveds || !savedtl) {
-    free(saveds); free(savedtl); // # nocov
-    savetl_end(); // # nocov
-    error(_("Failed to allocate initial %d items in savetl_init"), nalloc); // # nocov
-  }
-}
-
-void savetl(SEXP s)
-{
-  if (nsaved==nalloc) {
-    if (nalloc==INT_MAX) {
-      savetl_end(); // # nocov
-      internal_error(__func__, "reached maximum %d items for savetl", nalloc); // # nocov
-    }
-    nalloc = nalloc>(INT_MAX/2) ? INT_MAX : nalloc*2;
-    char *tmp = realloc(saveds, sizeof(SEXP)*nalloc);
-    if (tmp==NULL) {
-      // C spec states that if realloc() fails the original block is left untouched; it is not freed or moved. We rely on that here.
-      savetl_end(); // # nocov free(saveds) happens inside savetl_end
-      error(_("Failed to realloc saveds to %d items in savetl"), nalloc); // # nocov
-    }
-    saveds = (SEXP *)tmp;
-    tmp = realloc(savedtl, sizeof(R_len_t)*nalloc);
-    if (tmp==NULL) {
-      savetl_end(); // # nocov
-      error(_("Failed to realloc savedtl to %d items in savetl"), nalloc); // # nocov
-    }
-    savedtl = (R_len_t *)tmp;
-  }
-  saveds[nsaved] = s;
-  savedtl[nsaved] = TRUELENGTH(s);
-  nsaved++;
-}
-
-void savetl_end(void) {
-  // Can get called if nothing has been saved yet (nsaved==0), or even if _init() hasn't been called yet (pointers NULL). Such
-  // as to clear up before error. Also, it might be that nothing needed to be saved anyway.
-  for (int i=0; i<nsaved; i++) SET_TRUELENGTH(saveds[i],savedtl[i]);
-  free(saveds); // possible free(NULL) which is safe no-op
-  saveds = NULL;
-  free(savedtl);
-  savedtl = NULL;
-  nsaved = nalloc = 0;
-}
-
 SEXP setcharvec(SEXP x, SEXP which, SEXP newx)
 {
   int w;

src/chmatch.c

Lines changed: 56 additions & 45 deletions
@@ -36,8 +36,7 @@ static SEXP chmatchMain(SEXP x, SEXP table, int nomatch, bool chin, bool chmatch
   }
   // Since non-ASCII strings may be marked with different encodings, it only make sense to compare
   // the bytes under a same encoding (UTF-8) #3844 #3850.
-  // Not 'const' because we might SET_TRUELENGTH() below.
-  SEXP *xd;
+  const SEXP *xd;
   if (isSymbol(x)) {
     xd = &sym;
   } else {
@@ -55,34 +54,54 @@ static SEXP chmatchMain(SEXP x, SEXP table, int nomatch, bool chin, bool chmatch
     UNPROTECT(nprotect);
     return ans;
   }
-  // else xlen>1; nprotect is const above since no more R allocations should occur after this point
-  savetl_init();
-  for (int i=0; i<xlen; i++) {
-    SEXP s = xd[i];
-    const int tl = TRUELENGTH(s);
-    if (tl>0) {
-      savetl(s); // R's internal hash (which is positive); save it
-      SET_TRUELENGTH(s,0);
-    } else if (tl<0) {
-      // R 2.14.0+ initializes truelength to 0 (before that it was uninitialized/random).
-      // Now that data.table depends on R 3.1.0+, that is after 2.14.0 too.
-      // We rely on that 0-initialization, and that R's internal hash is positive.
-      // # nocov start
-      savetl_end();
-      internal_error(__func__, "CHARSXP '%s' has a negative truelength (%d)", CHAR(s), tl); // # nocov
-      // # nocov end
+  // Else xlen > 1.
+  // When table >> x, hash x and scan table // ToDo tune the kick-in factor
+  if (!chmatchdup && tablelen > 2 * xlen) {
+    hashtab *marks = hash_create(xlen);
+    PROTECT(marks->prot); nprotect++;
+    int nuniq = 0;
+    for (int i = 0; i < xlen; ++i) {
+      // todo use lookup_insert?
+      int tl = hash_lookup(marks, xd[i], 0);
+      if (tl == 0) {
+        hash_set(marks, xd[i], -1);
+        nuniq++;
+      }
+    }
+
+    for (int i = 0; i < tablelen; ++i) {
+      int tl = hash_lookup(marks, td[i], 0);
+      if (tl == -1) {
+        hash_set(marks, td[i], i + 1);
+        nuniq--;
+        if (nuniq == 0) break; // all found, stop scanning
+      }
+    }
+
+    if (chin) {
+      #pragma omp parallel for num_threads(getDTthreads(xlen, true))
+      for (int i = 0; i < xlen; ++i) {
+        ansd[i] = hash_lookup(marks, xd[i], 0) > 0;
+      }
+    } else {
+      #pragma omp parallel for num_threads(getDTthreads(xlen, true))
+      for (int i = 0; i < xlen; ++i) {
+        const int m = hash_lookup(marks, xd[i], 0);
+        ansd[i] = (m < 0) ? nomatch : m;
+      }
     }
+    UNPROTECT(nprotect);
+    return ans;
   }
+  hashtab * marks = hash_create(tablelen);
+  PROTECT(marks->prot); nprotect++;
   int nuniq=0;
   for (int i=0; i<tablelen; ++i) {
     const SEXP s = td[i];
-    int tl = TRUELENGTH(s);
-    if (tl>0) { savetl(s); tl=0; }
-    if (tl==0) SET_TRUELENGTH(s, chmatchdup ? -(++nuniq) : -i-1); // first time seen this string in table
+    int tl = hash_lookup(marks, s, 0);
+    if (tl==0) hash_set(marks, s, chmatchdup ? -(++nuniq) : -i-1); // first time seen this string in table
   }
   // in future if we need NAs in x not to be matched to NAs in table ...
-  // if (!matchNAtoNA && TRUELENGTH(NA_STRING)<0)
-  //   SET_TRUELENGTH(NA_STRING, 0);
   if (chmatchdup) {
     // chmatchdup() is basically base::pmatch() but without the partial matching part. For example :
     // chmatchdup(c("a", "a"), c("a", "a")) # 1,2 - the second 'a' in 'x' has a 2nd match in 'table'
@@ -96,45 +115,37 @@ static SEXP chmatchMain(SEXP x, SEXP table, int nomatch, bool chin, bool chmatch
     // For example: A,B,C,B,D,E,A,A => A(TL=1),B(2),C(3),D(4),E(5) => dupMap 1 2 3 5 6 | 8 7 4
     //                                                                dupLink 7 8 | 6 (blank=0)
     unsigned int mapsize = tablelen+nuniq; // lto compilation warning #5760 // +nuniq to store a 0 at the end of each group
-    int *counts = calloc(nuniq, sizeof(*counts));
-    int *map = calloc(mapsize, sizeof(*map));
-    if (!counts || !map) {
-      // # nocov start
-      free(counts); free(map);
-      for (int i=0; i<tablelen; i++) SET_TRUELENGTH(td[i], 0);
-      savetl_end();
-      error(_("Failed to allocate %"PRIu64" bytes working memory in chmatchdup: length(table)=%d length(unique(table))=%d"), ((uint64_t)tablelen*2+nuniq)*sizeof(int), tablelen, nuniq);
-      // # nocov end
-    }
-    for (int i=0; i<tablelen; ++i) counts[-TRUELENGTH(td[i])-1]++;
+    void *vmax = vmaxget();
+    int *counts = (int *)R_alloc(nuniq, sizeof(*counts));
+    if (nuniq) memset(counts, 0, sizeof(*counts) * nuniq);
+    int *map = (int *)R_alloc(mapsize, sizeof(*map));
+    if (mapsize) memset(map, 0, sizeof(*map) * mapsize);
+    for (int i=0; i<tablelen; ++i) counts[-hash_lookup(marks, td[i], 0)-1]++;
     for (int i=0, sum=0; i<nuniq; ++i) { int tt=counts[i]; counts[i]=sum; sum+=tt+1; }
-    for (int i=0; i<tablelen; ++i) map[counts[-TRUELENGTH(td[i])-1]++] = i+1; // 0 is left ending each group thanks to the calloc
+    for (int i=0; i<tablelen; ++i) map[counts[-hash_lookup(marks, td[i], 0)-1]++] = i+1; // 0 is left ending each group thanks to the memset
     for (int i=0, last=0; i<nuniq; ++i) {int tt=counts[i]+1; counts[i]=last; last=tt;} // rewind counts to the beginning of each group
     for (int i=0; i<xlen; ++i) {
-      int u = TRUELENGTH(xd[i]);
+      int u = hash_lookup(marks, xd[i], 0);
       if (u<0) {
        const int w = counts[-u-1]++;
        if (map[w]) { ansd[i]=map[w]; continue; }
-        SET_TRUELENGTH(xd[i],0); // w falls on ending 0 marker: dups used up; any more dups should return nomatch
-        // we still need the 0-setting loop at the end of this function because often there will be some values in table that are not matched to at all.
+        hash_set(marks,xd[i],0); // w falls on ending 0 marker: dups used up; any more dups should return nomatch
      }
      ansd[i] = nomatch;
    }
-    free(counts);
-    free(map);
+    vmaxset(vmax);
   } else if (chin) {
+    #pragma omp parallel for num_threads(getDTthreads(xlen, true))
     for (int i=0; i<xlen; i++) {
-      ansd[i] = TRUELENGTH(xd[i])<0;
+      ansd[i] = hash_lookup(marks,xd[i],0)<0;
     }
   } else {
+    #pragma omp parallel for num_threads(getDTthreads(xlen, true))
     for (int i=0; i<xlen; i++) {
-      const int m = TRUELENGTH(xd[i]);
+      const int m = hash_lookup(marks,xd[i],0);
       ansd[i] = (m<0) ? -m : nomatch;
     }
   }
-  for (int i=0; i<tablelen; i++)
-    SET_TRUELENGTH(td[i], 0); // reinstate 0 rather than leave the -i-1
-  savetl_end();
   UNPROTECT(nprotect); // ans, xd, td
   return ans;
 }