diff --git a/srfi-250.html b/srfi-250.html index 5f50379..caf53f4 100644 --- a/srfi-250.html +++ b/srfi-250.html @@ -9,9 +9,9 @@ -

250: Insertion-ordered hash tables

+

250: Insertion-ordered hash tables

-

by John Cowan (shepherd, text), Will Clinger (text), Daphne Preston-Kendal (implementation)

+

by John Cowan and Daphne Preston-Kendal, based on prior SRFIs by Will Clinger and Panu Kalliokoski

Status

@@ -21,7 +21,7 @@

Status

  • Draft #1 published: 2023-11-15
  • This SRFI was withdrawn by the editor on 2024-09-24 because there had been no progress since 11-2023, when the first draft was - published, and because he hadn't been able to reach the author, + published, and because he hadn’t been able to reach the author, John Cowan, since 5-2024, and because no one stepped forward to take over.
  • John has returned, so the editor moved this SRFI back to draft @@ -38,16 +38,16 @@

    Abstract

    -

    Unlike the hash tables -of SRFI 125, which -is the direct ancestor of this specification, the hash tables -described here are ordered by insertion: that is, associations -inserted earlier in the history of the hash table appear earlier in -the ordering. Advances in -the implementations of hash tables, as provided by C++, Python, +

    Unlike the hash tables of SRFI 125, which is the +direct ancestor of this specification, the hash tables described here +are ordered by insertion: that is, associations inserted earlier in +the history of the hash table appear earlier in the ordering. Advances +in the implementations of hash tables, as provided by C++, Python, JavaScript, etc., make the provision of this new facility practical. -As a result, the hash tables of this SRFI do not interoperate with the -hash tables of SRFI -125, SRFI 126, or -existing R6RS implementations.

    +As a result, the hash tables of this SRFI do not necessarily +interoperate with the hash tables of SRFI 125, SRFI 126, or existing +R6RS implementations.

    Issues

    - +None at present.

    Rationale

    -Hash tables themselves don't really need defending: almost all +Hash tables themselves don’t really need defending: almost all dynamically typed languages, from awk to JavaScript to Lua to Perl to Python to Common Lisp, and including many Scheme implementations, provide them in some form as a fundamental @@ -110,14 +105,15 @@

    Rationale

    hash function into a single bundle.

    -The relatively few hash table procedures in R6RS are all available -in this SRFI under somewhat different names. -This SRFI adopts SRFI 69's spelling hash-table -rather than R6RS's hashtable, because of the universal -use of "hash table" rather than "hashtable" in other computer languages and -in technical prose generally. Besides, the English word -hashtable obviously means something that can be ... hashted. -It would be trivial to provide the R6RS names on top of this SRFI.

    +The relatively few hash table procedures in R6RS are all available in +this SRFI under somewhat different names. This SRFI adopts SRFI 69’s +spelling hash-table rather than R6RS’s +hashtable, because of the universal use of ‘hash table’ +rather than ‘hashtable’ in other computer languages and in technical +prose generally.* Besides, the English word hashtable +obviously means something that can be ... hashted. It would be trivial +to provide the R6RS names on top of this SRFI.

    Common Lisp compatibility

    @@ -168,11 +164,14 @@

    Sources

  • The procedures hash-table-unfold and hash-table-count were suggested by SRFI 1.
  • -
  • The procedures hash-table=? and +
  • The procedures hash-table= and hash-table-map were suggested by - Haskell's Data.Map.Strict module.
  • + Haskell’s Data.Map.Strict module.
  • The procedure hash-table-map->list is from Guile.
  • +
  • The cursor-based interface to hash table iteration is based + on the version from + Racket.
  • @@ -196,21 +195,140 @@

    Sources

    and FemtoLisp were also investigated, but no additional procedures were incorporated.

    +

    The hash-table-keys, hash-table-values, +hash-table-entries (from SRFI 69) and the corresponding +vector-based versions (used in R6RS) have been removed from this SRFI +in favour of the cursor-based iteration interface. +

    Pronunciation

    -The slash in the names of some procedures can be pronounced "with".

    +The slash in the names of some procedures can be pronounced ‘with’.

    + +

    Editorial conventions

    + +

    +This SRFI uses a convention from the Racket documentation, which +extends the usual Scheme specification use of ellipsis (‘...’) in +procedure entry headers. + +

    +Namely, when two specified formals are followed by a pair of ellipses, +it means there must be any even number of arguments in sequence +(including zero arguments). The zeroth, second, fourth, etc. actual +arguments are treated as the values of the first specified formal, and +the first, third, fifth, etc. actual arguments as the corresponding +values of the second of the specified formals. + +

    The following formal names used in the specification of procedures + imply the type of their actual arguments. + +

    +
    hash-table +

    A hash table as defined by this SRFI. It is an assertion + violation if the argument is not a hash table object originally + returned from one of the procedures in this SRFI, or by an + implementation-specified set of additional constructor + procedures. In particular, it is implementation-specified + whether the hash table types defined by R6RS, SRFI 69, SRFI 125, + or SRFI 126 will be interoperable with the procedures in this + SRFI. If they are, the hash tables provided in those libraries + must also be insertion-ordered, although this SRFI makes no + guarantees about how any of the procedures of those libraries + will affect the insertion order. +

    k +

    An exact nonnegative integer representing the initial + capacity of a hash table being created (that is, the number of + associations it can hold without having to grow). If not + present, the initial capacity of any hash table is unspecified. + An implementation may be significantly more efficient in time + and/or memory if it is given a correct value of k + when a hash table is created (meaning a value which actually + corresponds to the maximum size of the hash table throughout its + existence). +

    comparator +

    A SRFI 128 comparator with a hash function, which will be + used to provide a hash function, equality predicate, and + potentially type test for a hash table being created. It is an + assertion violation to pass an object which is not a comparator, + or a comparator without a hash function, as the value of a + comparator argument. +

    key +

    An object representing the key of an association or + potential association within a hash table. Implementations + should signal an assertion violation if any value of + key does not satisfy the type test of the hash + table’s comparator, but this is not required. +

    In particular, it is unspecified whether this is an assertion violation: +

    (let ((ht (make-hash-table
    +           (make-comparator integer? = #f number-hash))))
    +  (hash-table-set! ht 1/2 "one half"))
    +

    Because while the comparator specifies integer? as + its type test, the equality predicate = and hash + function number-hash will work on any number + object, including the non-integer 1/2. It is + expected that the majority of comparators’ equality predicates + and hash functions will themselves signal assertion violations + if their input is the wrong type. +

    In cases where multiple key arguments are provided + to procedures which mutate the hash table, it is unspecified + whether, if some but not all of the given keys are + not the right type for the hash table’s comparator, any of the + mutations specified for the other keys will have taken + place before an assertion violation is raised or before the hash + function or equality predicate is called on the problematic + key. However, such cases must not leave the hash + table object in an internally inconsistent state, and the hash + table object following such a case must be in a state which + represents the successful completion of some number of mutations + (potentially zero) that were correctly specified by the call to + the mutating hash table procedure. +

    Unless otherwise specified, in cases where multiple + key arguments are provided to procedures which mutate + a hash table or create a new one, it is unspecified whether + providing the same key (in the sense of the hash table’s + comparator’s hash function and equality predicate) multiple + times will be an assertion violation, or if an arbitrary one of + the provided keys (and potential associated values) + will be used. In the latter case, if a new association is + created for the given keys, it is also unspecified if + the association’s position in the insertion order will represent + the same position as in the sequence of arguments for the value + which is actually chosen, or another position. +

    value +

    An object representing the value of an association or + potential association within a hash table. The hash table + implementation imposes no requirements on the type of such + an object. +

    cursor +

    A hash table cursor as defined by the section ‘Low-level iteration’. As + specified in that section, incorrect use of a hash table + cursor, including passing a cursor of the wrong type, is + undefined behaviour. +

    Specification

    The procedures in this SRFI are in the (srfi 250) library -(or (srfi :250) on R6RS).

    +(or (srfi :250 hash-tables) on R6RS).

    -All references to "executing in expected amortized constant time" -presuppose that a satisfactory hash function is available. Arbitrary -or impure hash functions can make a hash of any implementation.

    +Hash tables may be mutable or immutable. All hash tables created by +the procedures in this SRFI are mutable by default, except hash tables +created by the hash-table-copy procedure with the +mutable? argument set of #f. It is an +assertion violation to attempt to mutate an immutable hash table, +whether by adding new associations, deleting associations, or +changing the value of any association. + +

    +All references to ‘executing in expected amortized constant time’ +presuppose that a satisfactory hash function is available. Arbitrary or +impure hash functions can make a hash of any +implementation.

    Hash tables are allowed to cache the results of calling the equality @@ -219,7 +337,7 @@

    Specification

    operation: it may be called zero, one, or more times.

    -It is an error if the procedure argument of +It is undefined behaviour if the procedure argument of hash-table-find, hash-table-count, hash-table-map, hash-table-for-each, hash-table-map!, hash-table-map->list, @@ -227,11 +345,6 @@

    Specification

    hash-table-fold-right, or hash-table-prune! mutates the hash table being walked.

    -

    -It is an error to pass two hash tables that have different (in the -sense of eq?) comparators to any of the procedures of -this SRFI.

    -

    Implementations are permitted to ignore user-specified hash functions in certain circumstances. Specifically, if the @@ -247,74 +360,103 @@

    Specification

    of course be exposed by an implementation as an extension, with suitable warnings against inappropriate uses.

    -

    It is an error to mutate a key during or after its insertion -into a hash table in such a way that the hash function of the -table will return a different result when applied to that key.

    +

    It is undefined behaviour to mutate a key during or after its +insertion into a hash table in such a way that the hash function of +the table will return a different result when applied to that key.

    + +

    Unless otherwise specified, procedures whose names end in +! return unspecified values.

    Index

    Constructors

    -

    Note that the argument k is a positive integer -representing the initial capacity of the hashtable being created (that -is, the number of associations it can hold without having to grow). -If not present, the initial capacity is implementation-dependent.

    +

    Note: The examples in the subsequent sections of this SRFI +assume that the example variable definitions in this section have been +run. (The mutations of the defined hash tables do not accumulate +between examples, though.) -

    (make-hash-table comparator [ k ])

    +

    + (make-hash-table comparator)
    + (make-hash-table comparator k) +

    Returns a newly allocated hash table whose equality predicate and -hash function are extracted from comparator.

    +hash function are extracted from comparator.

    + +

    +It is an assertion violation if comparator is not a hash comparator.

    As mentioned above, implementations are free to use an appropriate @@ -332,120 +474,239 @@

    Constructors

    (R6RS make-eq-hashtable, make-eqv-hashtable, and make-hashtable; Common Lisp make-hash-table)

    -

    -(hash-table comparator [ key value ] ...)

    +

    Example: + +

    (define tiny-table
    +  (make-hash-table (make-comparator number? = #f number-hash)))
    +;; tiny-table => #<an empty hash table whose future keys could be numbers>
    + +

    + (hash-table comparator key0 value0 ... ...) +

    Returns a newly allocated hash table, created as if by -make-hash-table using comparator. +make-hash-table using comparator. For each pair of arguments, an association is added to the -new hash table with key as its key and value +new hash table with key as its key and value as its value. -This procedure returns an immutable hash table. -If the same key (in the sense of the equality predicate) is -specified more than once, it is an error.

    -

    -(hash-table-unfold stop? mapper successor seed comparator [ k ])

    +

    Example: -

    -Create a new hash table as if by make-hash-table using -comparator and the args. If the result of applying -the predicate stop? to seed is true, return the hash -table. Otherwise, apply the procedure mapper to seed. -Mapper returns two values, which are inserted into the hash -table as the key and the value respectively. Then get a new seed by -applying the procedure successor to seed, and repeat -this algorithm.

    +
    (define suits-table
    +  (hash-table (make-comparator symbol? symbol=? #f symbol-hash)
    +              'clubs #\x2663
    +              'diamonds #\x2666
    +              'hearts #\x2665
    +              'spades #\x2660))
    +;; suits-table => #<a hash table with four associations,
    +;;                  mapping the names of the suits of cards
    +;;                  as symbols to the Unicode codepoints of
    +;;                  their designs as characters>
    + +

    + (hash-table-unfold stop? mapper successor seed comparator)
    + (hash-table-unfold stop? mapper successor seed comparator k) +

    -(alist->hash-table alist comparator [ k ])

    +Create a new hash table as if by make-hash-table using +comparator (and, if given, the value of k). If +the result of applying the predicate stop? to +seed is true, return the hash table. Otherwise, apply the +procedure mapper to seed. Mapper +returns two values, which are inserted into the hash table as the key +and the value respectively. Then get a new seed by applying the +procedure successor to seed, and repeat this +algorithm. The associations are inserted in left-to-right order +beginning with the result of the first call to mapper. + +

    +If multiple calls to the mapper return the same key, it is +unspecified whether it is an assertion violation or whether one of the +values will be chosen. In this case, the position of the association +for that key in the insertion order is also unspecified. + +

    +This procedure may not be continuation-safe. + +

    Example: + +

    (define alphabet-table
    +  (hash-table-unfold (lambda (c) (char>? c #\z))
    +                     (lambda (c) (values c (char-upcase c)))
    +                     (lambda (c) (integer->char (+ 1 (char->integer c))))
    +                     #\a
    +                     char-comparator
    +                     26))
    +;; alphabet-table => #<a hash table mapping the 26 lowercase
    +                      Basic Latin letters to their uppercase
    +                      counterparts>
    +
    + +

    Because the above example provides a value of k which is + correct for the final size of the hash table, the unfold operation + should be more efficient and the resulting hash table use less + memory than if k had not been provided, or if it had been + too large or too small. (Unless, of course, the implementation’s + default value of k happens to be exactly 26.) + +

    + (alist->hash-table alist comparator)
    + (alist->hash-table alist comparator k) +

    Returns a newly allocated hash-table as if by -make-hash-table using comparator and the -optional k value. It is then initialized from the -associations of alist. Key-value pairings are stored in the -created hash table in reverse order to the one in which they appear -in the input alist, and associations earlier in the list take -precedence over those that come later.

    +make-hash-table using comparator and the +optional k value. It is then initialized from the +associations of alist. Key-value pairings are stored in the +created hash table in reverse order to the one in which they appear in +the input alist, and, in the case of duplicate keys, +associations earlier in the list take precedence over those that come +later.

    + +

    Example: + +

    (define telephone-numbers-table
    +  (alist->hash-table '((116123 . emotional-support)
    +                       (116117 . medical-advice)
    +                       (112 . emergency)
    +                       (112 . ambulance)
    +                       (112 . fire)
    +                       (110 . police))
    +                     (make-comparator exact-integer? = #f number-hash)))
    +;; telephone-numbers-table => #<a hash table mapping some
    +                                useful German telephone numbers
    +                                in ascending order; 112 is mapped
    +                                to the symbol emergency>

    Predicates

    -

    -(hash-table? obj)

    +

    + (hash-table? obj) +

    -Returns #t if obj is a hash table, and +Returns #t if obj is a hash table, and #f otherwise. (R6RS hashtable?; Common Lisp hash-table-p)

    -

    -(hash-table-contains? hash-table key)

    +

    Examples: -

    Returns #t if there is any association to key -in hash-table, and #f otherwise. Must execute +

    (hash-table? tiny-table)  ;=> #t
    +(hash-table? suits-table) ;=> #t
    +(hash-table? '((this-is . an-alist) (not-a . hash-table))) ;=> #f
    + +

    +(hash-table-contains? hash-table key) +

    + +

    Returns #t if there is any association to key +in hash-table, and #f otherwise. Must execute in expected amortized constant time. -(R6RS hashtable-contains?)

    -(hash-table-empty? hash-table)

    +(R6RS hashtable-contains?)

    + +

    Examples: + +

    (hash-table-contains? alphabet-table #\q) ;=> #t
    +(hash-table-contains? alphabet-table #\&) ;=> #f
    + +

    +(hash-table-empty? hash-table) +

    -Returns #t if hash-table contains no associations, +Returns #t if hash-table contains no associations, and #f otherwise.

    -

    -(hash-table= same? hash-table1 hash-table2)

    +

    Example: -

    -Returns #t if hash-table1 and -hash-table2 have the same keys (in the sense -of their common equality predicate) and each key has the same -value (in the sense of the same? procedure), and -#f otherwise.

    +
    (hash-table-empty? tiny-table)  ;=> #t
    +(hash-table-empty? suits-table) ;=> #f
    -

    -(hash-table-mutable? hash-table)

    +

    + (hash-table-mutable? hash-table) +

    Returns #t if the hash table is mutable. (R6RS hashtable-mutable?)

    +

    Examples: + +

    (hash-table-mutable? tiny-table) ;=> #t
    +(hash-table-mutable? (hash-table-copy alphabet-table #f))
    +                                 ;=> #f
    +

    Accessors

    The following procedures, given a key, return the corresponding value.

    -

    -(hash-table-ref hash-table key [ failure [ success ] ])

    +

    + (hash-table-ref hash-table key)
    + (hash-table-ref hash-table key failure)
    + (hash-table-ref hash-table key failure success) +

    -Extracts the value associated to key in hash-table, -invokes the procedure success on it, and returns its result; -if success is not provided, then the value itself is returned. -If key is not contained in hash-table and -failure is supplied, then failure is invoked -on no arguments and its result is returned. Otherwise, it is -an error. Must execute in expected amortized constant time, -not counting the time to call the procedures.

    +Extracts the value associated to key in +hash-table, invokes the procedure success on it, +and returns its result; if success is not provided, then +the value itself is returned. If key is not contained in +hash-table and failure is supplied, then +failure is invoked on no arguments and its result is +returned. Otherwise, it is an assertion violation. Must execute in +expected amortized constant time, not counting the time to call the +success and failure procedures.

    -

    -(hash-table-ref/default hash-table key default)

    +

    Examples: + +

    (hash-table-ref suits-table 'hearts) ;=> #\x2665
    +(hash-table-ref suits-table 'joker)  ; assertion violation
    +(hash-table-ref suits-table 'joker (lambda () 'ha-ha))
    +                                     ;=> ha-ha
    +
    +(hash-table-ref suits-table
    +                'hearts
    +                (lambda () (assertion-violation #f "no love for hearts?!"))
    +                (lambda (char) (show #t (as-red char))))
    +                                     ; shows a red heart character
    +
    +(hash-table-ref suits-table
    +               'joker
    +                (lambda () 'foo!)
    +                (lambda (char) (show #t (as-red char))))
    +                                     ;=> foo!
    +
    + +

    + (hash-table-ref/default hash-table key default) +

    Semantically equivalent to, but may be more efficient than, the following code:

    -(hash-table-ref hash-table key (lambda () default))

    +(hash-table-ref hash-table key (lambda () default))

    (R6RS hashtable-ref; Common Lisp gethash)

    -

    -(hash-table-comparator hash-table)

    +

    Examples: + +

    (hash-table-ref/default suits-table 'joker 'ha-ha)  ;=> ha-ha
    +(hash-table-ref/default suits-table 'hearts 'ha-ha) ;=> #\x2665
    + + +

    + (hash-table-comparator hash-table) +

    Returns a hash comparator whose equality function and hash function @@ -467,99 +728,242 @@

    Accessors

    the R6RS (rnrs hashtables (6)) library in terms of this library without a means of inspection. +

    Example: The hash-table-empty-copy procedure +could be implemented as follows. + +

    (define (hash-table-empty-copy ht)
    +  (make-hash-table (hash-table-comparator ht)
    +                   (hash-table-size ht)))
    +

    Mutators

    The following procedures alter the associations in a hash table either unconditionally or conditionally on the presence or absence -of a specified key. It is an error to add an association to a hash -table whose key does not satisfy the type test predicate of the -comparator used to create the hash table.

    +of a specified key.

    + +

    + (hash-table-add! hash-table key1 value1 ... ...) +

    -(hash-table-set! hash-table arg ...)

    +Repeatedly mutates hash-table, creating new associations in +it by processing the arguments from left to right. For each of the +pairs of keys and values, a new association is +created at the end of the hash table ordering, associating the +key with the value. + +

    Must execute in expected amortized constant time per key.

    -Repeatedly mutates hash-table, creating new associations in +It is an assertion violation if any of the keys already has +an association in the hash-table. It is also an assertion +violation if the same key (in the sense of the hash table’s +hash function and equality predicate) is given multiple times. + +

    Examples: + +

    (define irc-alphabet-table (hash-table-copy alphabet-table #t))
    +
    +;; IRC is a Finnish invention. In Finland, the punctuation characters
    +;; {, |, and } had uppercase variants [, \, and ] until the 1990s, when
    +;; this convention was abolished by ISO 8859 and 10646. However, the IRC
    +;; protocol continues to consider these characters to be case variants
    +;; of one another.
    +(hash-table-add! irc-alphabet-table #\{ #\[
    +                                    #\| #\\
    +                                    #\} #\])
    +
    +(hash-table-add! alphabet-table #\w #\Ƿ) ; assertion violation
    + +

    + (hash-table-replace! hash-table key1 value1 ... ...) +

    + +

    +Repeatedly mutates hash-table, replacing the value in each +association for the given key with the corresponding +value. + +

    Must execute in expected amortized constant time per key. + +

    +It is an assertion violation if any of the keys does not have +an association in the hash-table. + +

    Examples: + +

    ;; Replace the red-coloured suit symbols with their hollow variants
    +(hash-table-replace! suits-table 'hearts #\x2661
    +                                 'diamonds #\x2662)
    +
    +(hash-table-replace! alphabet-table #\þ #\Þ) ; assertion violation
    + +

    + (hash-table-set! hash-table key1 value1 ... ...) +

    + +

    +Repeatedly mutates hash-table, creating new associations in it by processing the arguments from left to right. Newly created -associations are added to the end of the hash table ordering. The -args alternate between keys and values. However, if there is -a previous association for a key, its value is updated and the -corresponding association remains in the same position in the -ordering. It is an error if the type check procedure of the comparator -of hash-table, when invoked on a key, does not return -#t. Likewise, it is an error if a key is not a valid -argument to the equality predicate of hash-table. Returns an -unspecified value. Must execute in expected amortized constant time -per key. R6RS hashtable-set! and Common Lisp (setf -gethash) do not handle multiple associations.

    +associations are added to the end of the hash table ordering. However, +if there is a previous association for any key, its value +is updated to the given value and the corresponding +association remains in the same position in the ordering. + +

    Must execute in expected amortized constant time per key.

    -(hash-table-delete! hash-table key ...)

    +Note: The hash-table-add! and +hash-table-replace! procedures should often be used in +preference to this procedure, in order to more faithfully represent +the expectation either that a new association will be created for each +key, or that each key will already have an extant +association. In cases where an ‘upsert’ operation is intended, the +hash-table-intern! and hash-table-update! +procedures are also often more expressive than the equivalent +operation expressed directly in terms of hash-table-set!. + +

    Examples: The above examples of hash-table-add! +with the irc-alphabet-table and +hash-table-replace! with the suits-table +work identically if those procedures are replaced by +hash-table-set!. In addition, the assertion violation +examples of those procedures with the alphabet-table are not violations if +hash-table-set! is used instead: + +

    (hash-table-set! alphabet-table #\w #\Ƿ)
    +(hash-table-set! alphabet-table #\þ #\Þ)
    + +

    + (hash-table-delete! hash-table key ...) +

    -Deletes any association to each key in hash-table -and returns the number of keys that had associations. Must execute -in expected amortized constant time per key. R6RS -hashtable-delete! and Common Lisp remhash -do not handle multiple associations.

    +Deletes any association to each key in +hash-table and returns the number of keys that had +associations. If any key is given more than once, it is +unspecified whether it is counted more than once in the return value. + +

    Must execute in expected amortized constant time per key.

    -(hash-table-intern! hash-table key failure)

    +R6RS hashtable-delete! and Common Lisp +remhash do not handle multiple associations.

    + +

    Example: + +

    (hash-table-delete! alphabet-table #\a #\e #\i #\o #\u #\{ #\| #\}) ;=> 5
    + +

    + (hash-table-intern! hash-table key failure) +

    Effectively invokes hash-table-ref with the given -arguments and returns what it returns. If key was not -found in hash-table, its value is set to the result of -calling failure. Must execute in expected amortized constant time.

    +arguments and returns what it returns. If key was not found +in hash-table, its value is set to the result of calling +failure and that value is returned. Must execute in expected +amortized constant time.

    -(hash-table-update! hash-table key updater [ failure [ success ] ]) -

    +It is an assertion violation to use this procedure on an immutable +hash table, even if the key has an association in the hash +table. + +

    Examples: + +

    (hash-table-intern! alphabet-table #\z (lambda () #\Ȝ))
    +                      ;=> #\Z, and the hash table is unchanged
    +(hash-table-intern! alphabet-table #\ȝ (lambda () #\Ȝ))
    +                      ;=> #\Ȝ, and the hash table is updated
    +                      ;   with a new association from lowercase
    +                      ;   to uppercase yogh
    + +

    + (hash-table-update! hash-table key updater)
    + (hash-table-update! hash-table key updater failure)
    + (hash-table-update! hash-table key updater failure success) +

    Semantically equivalent to, but may be more efficient than, the following code:

    -(hash-table-set! hash-table key (updater (hash-table-ref hash-table key failure success)))

    +(hash-table-set! hash-table key (updater (hash-table-ref hash-table key failure success)))

    -Must execute in expected amortized constant time. Returns an unspecified value.

    +Must execute in expected amortized constant time. -

    -(hash-table-update!/default hash-table key updater default)

    +

    Example: + +

    (hash-table-update! tiny-table
    +                    12
    +                    (lambda (x) (+ x 1))
    +                    (lambda () (hash-table-size tiny-table))
    +                    values)
    +               ; tiny-table now associates the key 12 with 1
    + +

    + (hash-table-update!/default hash-table key updater default) +

    Semantically equivalent to, but may be more efficient than, the following code:

    -(hash-table-set! hash-table key (updater (hash-table-ref/default hash-table key default)))

    +(hash-table-set! hash-table key (updater (hash-table-ref/default hash-table key default)))

    (R6RS hashtable-update!)

    -Must execute in expected amortized constant time. Returns an unspecified value.

    +Must execute in expected amortized constant time. -

    -(hash-table-pop! hash-table)

    +

    Example: + +

    (hash-table-update!/default tiny-table
    +                            12
    +                            (lambda (x) (+ x 1))
    +                            0)
    +               ; tiny-table now associates the key 12 with 1
    + +

    + (hash-table-pop! hash-table) +

    Chooses the last, most recently added association from -hash-table and removes it, returning the key and value as two +hash-table and removes it, returning the key and value as two values.

    -

    -It is an error if hash-table is empty.

    +

    Must execute in expected amortized constant time.

    -(hash-table-clear! hash-table)

    +It is an assertion violation if hash-table is empty.

    + +

    Examples: + +

    (hash-table-pop! alphabet-table) ;=> #\z #\Z, and alphabet-table no
    +                                 ;   longer contains this association
     
    +(hash-table-pop! tiny-table) ; assertion violation
    + +

    + (hash-table-clear! hash-table) +

    -Delete all the associations from hash-table. -(R6RS hashtable-clear!; Common Lisp clrhash)

    +Delete all the associations from hash-table. The +implementation may assume that the hash-table will later be +re-filled with the same number of associations. (R6RS +hashtable-clear!; Common Lisp clrhash)

    + +

    Example: + +

    (hash-table-clear! suits-table)
    +(hash-table-empty? suits-table) ;=> #t

    The whole hash table

    @@ -567,61 +971,91 @@

    The whole hash table

    These procedures process the associations of the hash table in insertion order.

    -

    -(hash-table-size hash-table)

    +

    + (hash-table-size hash-table) +

    -Returns the number of associations in hash-table as an +Returns the number of associations in hash-table as an exact integer. Must execute in constant time. (R6RS hashtable-size; Common Lisp hash-table-count.)

    -

    -(hash-table-keys hash-table) -
    -(hash-table-key-vector hash-table) +

    Examples: -

    -Returns a newly allocated list/vector of all the keys in hash-table. -R6RS hashtable-keys returns a vector.

    +
    (hash-table-size tiny-table)     ;=> 0
    +(hash-table-size suits-table)    ;=> 4
    +(hash-table-size alphabet-table) ;=> 26
    -

    -(hash-table-values hash-table) -
    -(hash-table-value-vector hash-table) -R6RS hashtable-values returns a vector.

    -

    -Returns a newly allocated list/vector of all the keys in hash-table.

    +

    + (hash-table= same? hash-table1 hash-table2) +

    -(hash-table-entries hash-table) -
    -(hash-table-entry-vectors hash-table)

    +Returns #t if hash-table1 and +hash-table2 have the same keys (in the sense +of their common equality predicate) and each key has the same +value (in the sense of the same? procedure), and +#f otherwise.

    -Returns two values, a newly allocated list/vector of all the keys in -hash-table and a newly allocated list/vector of all the values -in hash-table in the corresponding order. R6RS -hash-table-entries returns vectors.

    +It is an assertion violation if the equality predicates of +hash-table1 and hash-table2 +are not the same in the sense of the eqv? procedure. On +R6RS implementations where eqv? is not usefully defined +on procedures by the implementation, this assertion violation is not +required to be raised. -

    -(hash-table-find proc hash-table failure)

    +

    Examples: -

    -For each association of hash-table, invoke proc -on its key and value. If proc returns true, then -hash-table-find returns what proc returns. -If all the calls to proc return #f, return -the result of invoking the thunk failure.

    +
    (define alphabet-table*
    +  (hash-table-map (lambda (k v) (char-downcase v)) alphabet-table))
    +
    +(hash-table= char-ci=? alphabet-table alphabet-table*) ;=> #t
    +(hash-table= char=? alphabet-table alphabet-table*)    ;=> #f
    + +

    + (hash-table-find proc hash-table failure) +

    -(hash-table-count pred hash-table)

    +For each association of hash-table, invoke proc +on its key and value. If proc returns true, then +hash-table-find returns what proc returns. +If all the calls to proc return #f, return +the result of invoking the thunk failure.

    + +

    Examples: + +

    (hash-table-find (lambda (number service)
    +                   (and (eq? service 'police)
    +                        number))
    +                 telephone-numbers-table
    +                 (lambda () 1312))
    +               ;=> 110
    +
    +(hash-table-find (lambda (number service)
    +                   (and (eq? service 'private-detective)
    +                        number))
    +                 telephone-numbers-table
    +                 (lambda () 'no-such-service))
    +               ;=> no-such-service
    + +

    + (hash-table-count pred hash-table) +

    -For each association of hash-table, invoke pred -on its key and value. Return the number of calls to pred +For each association of hash-table, invoke pred +on its key and value. Return the number of calls to pred which returned true.

    +

    Example: + +

    (hash-table-count (lambda (number service) (even? number))
    +                  telephone-numbers-table)
    +               ;=> 2
    +

    Low-level iteration

    @@ -644,7 +1078,7 @@

    Low-level iteration

    created.
  • Temporal limitation: a hash table cursor object can only be used as long as no associations in the hash table have been created, - deleted, or move since the operation which returned that cursor + deleted, or moved since the operation which returned that cursor object. @@ -652,260 +1086,533 @@

    Low-level iteration

    It is undefined behaviour to use a hash table cursor in any way which violates these limitations. -

    -(hash-table-cursor-first hash-table) +

    Examples: The procedures hash-table-fold-left and hash-table-fold-right could be implemented as follows. + +

    (define (hash-table-fold-left proc seed ht)
    +  (let loop ((cur (hash-table-cursor-first ht))
    +             (acc seed))
    +    (if (hash-table-cursor-at-end? ht cur)
    +        acc
    +        (loop (hash-table-cursor-next ht cur)
    +              (proc acc
    +                    (hash-table-cursor-key ht cur)
    +                    (hash-table-cursor-value ht cur))))))
    +
    +(define (hash-table-fold-right proc seed ht)
    +  (let loop ((cur (hash-table-cursor-last ht))
    +             (acc seed))
    +    (if (hash-table-cursor-at-end? ht cur)
    +        acc
    +        (loop (hash-table-cursor-previous ht cur)
    +              (proc (hash-table-cursor-key ht cur)
    +                    (hash-table-cursor-value ht cur)
    +                    acc)))))
    + +

    The procedure hash-table-map! could be implemented as + follows. + +

    (define (hash-table-map! proc ht)
    +  (let loop ((cur (hash-table-cursor-first ht)))
    +    (if (hash-table-cursor-at-end? ht cur)
    +        ht
    +        (let ((new-val
    +               (call-with-values
    +                   (lambda () (hash-table-cursor-key+value ht cur))
    +                 proc)))
    +          (hash-table-cursor-value-set! ht cur new-val)
    +          (loop (hash-table-cursor-next ht cur))))))
    + +

    +(hash-table-cursor-first hash-table)
    -(hash-table-cursor-last hash-table) -

    +(hash-table-cursor-last hash-table) +

    Return a hash table cursor pointing, respectively, at the first or -last association in the given hash-table. +last association in the given hash-table. + +

    Must execute in expected amortized constant time. + +

    + (hash-table-cursor-for-key hash-table key) +

    -(hash-table-cursor-next hash-table cursor) -

    +Returns a hash table cursor pointing at the association for the given +key in the hash table. If there is no association for the +key, returns a hash table cursor in the end state. + +

    Must execute in expected amortized constant time. + +

    +(hash-table-cursor-next hash-table cursor) +

    Returns a hash table cursor pointing to the association in the given -hash-table which comes immediately after the association -referred to by the input cursor. +hash-table which comes immediately after the association +referred to by the input cursor. Must execute in expected amortized constant time.

    -If the given cursor refers to the last association in the +If the given cursor refers to the last association in the hash table, the returned hash table cursor is in the end state.

    -If the given cursor is already in the end state, it is +If the given cursor is already in the end state, it is undefined behaviour. -

    -(hash-table-cursor-previous hash-table cursor) -

    +

    +(hash-table-cursor-previous hash-table cursor) +

    Returns a hash table cursor pointing to the association in the given -hash-table which comes immediately before the association -referred to by the input cursor. +hash-table which comes immediately before the association +referred to by the input cursor. Must execute in expected +amortized constant time.

    -If the given cursor refers to the first association in the +If the given cursor refers to the first association in the hash table, the returned hash table cursor is in the end state.

    -If the given cursor is already in the end state, it is +If the given cursor is already in the end state, it is undefined behaviour. -

    -(hash-table-cursor-key hash-table cursor) +

    +(hash-table-cursor-key hash-table cursor)
    -(hash-table-cursor-value hash-table cursor) -

    +(hash-table-cursor-value hash-table cursor) +

    Return, respectively, the key or value of the association in the -hash-table to which the given cursor refers. +hash-table to which the given cursor refers. +Must execute in expected amortized constant time.

    -If the given cursor is in the end state, it is undefined +If the given cursor is in the end state, it is undefined behaviour. -

    -(hash-table-cursor-key+value hash-table cursor) -

    +

    +(hash-table-cursor-key+value hash-table cursor) +

    Returns two values: the key and value of the association in the -hash-table to which the given cursor refers. +hash-table to which the given cursor refers. +Must execute in expected amortized constant time. + +

    +If the given cursor is in the end state, it is +undefined behaviour. + +

    + (hash-table-cursor-value-set! hash-table cursor value) +

    + +

    +Replaces the value of the association in the hash-table to +which the given cursor refers with value. Must +execute in expected amortized constant time.

    -(hash-table-cursor-at-end? hash-table cursor) -

    +If the given cursor is in the end state or if +hash-table is immutable, it is undefined behaviour. + +

    +(hash-table-cursor-at-end? hash-table cursor) +

    Returns #t if the given cursor is in the end -state in the given hash-table, and #f otherwise. +state in the given hash-table, and #f +otherwise. Must execute in expected amortized constant time.

    Mapping and folding

    These procedures process the associations of the hash table in -insertion order.

    +insertion order, unless otherwise noted.

    -

    -(hash-table-map proc hash-table)

    +

    + (hash-table-map proc hash-table) +

    Returns a newly allocated hash table as if by -(hash-table-empty-copy hash-table). -Calls proc for every association in hash-table +(hash-table-empty-copy hash-table). +Calls proc for every association in hash-table with the key and value of the association. The key of the association -and the result of invoking proc are entered into the +and the result of invoking proc are entered into the new hash table. Note that this is not the result of lifting mapping over the domain of hash tables, but it is considered more useful.

    -

    If comparator recognizes multiple keys in the hash-table -as equivalent, any one of such associations is taken.

    +

    Example: -

    -(hash-table-for-each proc hash-table)

    +
    (hash-table-map (lambda (k v) (string v k)) alphabet-table)
    +               ;=> #<a hash table mapping each lowercase
    +                     letter of the basic Latin alphabet to
    +                     a string containing that letter in
    +                     uppercase and lowercase forms>
    + +

    + (hash-table-map! proc hash-table) +

    -Calls proc for every association in hash-table +Calls proc for every association in hash-table with two arguments: the key of the association and the value of -the association. The value returned by proc is discarded. -Returns an unspecified value.

    +the association. The value returned by proc is used to +update the value of the association. Returns hash-table.

    -

    -(hash-table-map! proc hash-table)

    +

    Example: -

    -Calls proc for every association in hash-table -with two arguments: the key of the association and the value of -the association. The value returned by proc is used to -update the value of the association. Returns an unspecified value.

    +
    (hash-table-map! (lambda (k v)
    +                   (char-upcase
    +                    (string-ref (symbol->string k) 0)))
    +                 suits-table)
    +               ;=> #<the suits-table, which now maps
    +                    the names of suits as symbols to the
    +                    uppercase first letters of their names
    +                    as characters>
    -

    -(hash-table-map->list proc hash-table)

    +

    + (hash-table-for-each proc hash-table) +

    -Calls proc for every association in hash-table +Calls proc for every association in hash-table with two arguments: the key of the association and the value of -the association. The values returned by the invocations of -proc are accumulated into a list, which is returned.

    +the association. The value returned by proc is discarded. +Returns an unspecified value.

    + +

    Example: + +

    (hash-table-for-each (lambda (k v) (display k)) alphabet-table)
    +               ; displays ‘abcdefghijklmnopqrstuvwxyz’
    + +

    See also the example of hash-table-empty-copy. + +

    + (hash-table-map->list proc hash-table) +

    -(hash-table-fold proc seed hash-table)

    +Calls proc for every association in hash-table +with two arguments: the key of the association and the value of the +association. The values returned by the invocations of proc +are accumulated into a list in the insertion order of the +hash-table, which is returned.

    + +

    Example: + +

    (hash-table-map->list (lambda (k v) (string v k))
    +                      alphabet-table)
    +               ;=> ("Aa" "Bb" "Cc" "Dd" ...)
    + +

    + (hash-table-fold proc seed hash-table) +

    -Calls proc for every association in hash-table +Calls proc for every association in hash-table with three arguments: the key of the association, the value of -the association, and an accumulated value val. -Val is seed for the first invocation of -procedure, and for subsequent invocations of proc, +the association, and an accumulated value val. +Val is seed for the first invocation of +procedure, and for subsequent invocations of proc, the returned value of the previous invocation. The value returned by hash-table-fold is the return value of the last -invocation of proc.

    +invocation of proc.

    -The proc is invoked for the associations in an unspecified +The proc is invoked for the associations in an unspecified order. To fold over the associations in order, see the next entries.

    Rationale: An implementation may be able to provide more efficient iteration in an unspecified order than in insertion order -when the order is not significant for the proc. +when the order is not significant for the proc.

    -(hash-table-fold-left proc seed hash-table) -

    +Example: See the example for hash-table-fold-right, +except that the resulting list when using hash-table-fold +may be in any order and not alphabetical. + +

    +(hash-table-fold-left proc seed hash-table) +

    -Calls proc for every association in hash-table, in +Calls proc for every association in hash-table, in order from oldest to newest, with three arguments: an accumulated -value val, the key of the association, and the value of the -association. Val is seed for the first invocation of -procedure, and for subsequent invocations of proc, +value val, the key of the association, and the value of the +association. Val is seed for the first invocation of +procedure, and for subsequent invocations of proc, the returned value of the previous invocation. The value returned by hash-table-fold is the return value of the last -invocation of proc.

    +invocation of proc.

    -

    -(hash-table-fold-right proc seed hash-table) -

    +

    Example: + +

    (hash-table-fold-left (lambda (ls k v)
    +                        (cons v ls))
    +                      '()
    +                      suits-table)
    +          ;=> (#\x2660 #\x2665 #\x2666 #\x2663)
    + + +

    +(hash-table-fold-right proc seed hash-table) +

    -Calls proc for every association in hash-table, in +Calls proc for every association in hash-table, in order from oldest to newest, with three arguments: the key of the association, the value of the association, and an accumulated value -val. Val is seed for the first invocation -of procedure, and for subsequent invocations of -proc, the returned value of the previous invocation. The +val. Val is seed for the first invocation +of procedure, and for subsequent invocations of +proc, the returned value of the previous invocation. The value returned by hash-table-fold is the return value of -the last invocation of proc.

    +the last invocation of proc.

    -

    -(hash-table-prune! proc hash-table)

    +

    Example: + +

    (hash-table-fold-right (lambda (k v ls)
    +                         (cons k ls))
    +                       '()
    +                       alphabet-table)
    +          ;=> (#\a #\b #\c #\d ...)
    + +

    + (hash-table-prune! proc hash-table) +

    -Calls proc for every association in hash-table +Calls proc for every association in hash-table with two arguments, the key and the value of the association, and -removes all associations from hash-table for which -proc returns true. Returns the number of associations that were removed.

    +removes all associations from hash-table for which +proc returns true. Returns the number of associations that were removed.

    + +

    Example: + +

    (hash-table-prune! (lambda (k v) (even? k))
    +                   telephone-numbers-table)
    +               ;=> 2, and telephone-numbers-table now contains only
    +                   associations for 116117 and 116123

    Copying and conversion

    -

    -(hash-table-copy hash-table [ mutable? ])

    +

    + (hash-table-copy hash-table)
    + (hash-table-copy hash-table mutable?) +

    Returns a newly allocated hash table with the same properties -and associations as hash-table. If the second argument +and associations as hash-table. If the second argument is present and is true, the new hash table is mutable. Otherwise it is immutable provided that the implementation supports immutable hash tables. (R6RS hashtable-copy)

    -

    -(hash-table-empty-copy hash-table)

    +

    Example: See the examples for +hash-table-mutable? and hash-table-add! + +

    + (hash-table-empty-copy hash-table) +

    Returns a newly allocated mutable hash table with the same comparator -as hash-table, but with no associations. The implementation +as hash-table, but with no associations. The implementation may assume that the returned hash table will eventually contain as -many associations as does the original hash-table.

    +many associations as does the original hash-table.

    -

    -(hash-table->alist hash-table)

    +

    Example: The procedure hash-table-map could be + implemented as follows. + +

    (define (hash-table-map proc ht)
    +  (let ((new-ht (hash-table-empty-copy ht)))
    +    (hash-table-for-each (lambda (k v)
    +                           (hash-table-add! new-ht k (proc k v)))
    +                         ht)
    +    new-ht))
    + +

    + (hash-table->alist hash-table) +

    -Returns an alist with the same associations as hash-table +Returns an alist with the same associations as hash-table in reverse insertion order.

    +

    Example: + +

    (hash-table->alist telephone-numbers-table)
    +               ;=> ((116123 . emotional-support)
    +                    (116117 . medical-advice)
    +                    (112 . emergency)
    +                    (110 . police))
    +

    Hash tables as sets

    -

    -(hash-table-union! hash-table1 hash-table2)

    +

    The following examples assume the irc-alphabet-table +from the example of hash-table-add! has been defined. + +

    +(hash-table-union! hash-table1 hash-table2)

    -Adds the associations of hash-table2 to -hash-table1 and returns hash-table1. +Adds the associations of hash-table2 to +hash-table1 and returns hash-table1. If a key appears in both hash tables, its value is set to the value -appearing in hash-table1. Returns -hash-table1.

    +appearing in hash-table1. Returns +hash-table1. -

    -(hash-table-intersection! hash-table1 hash-table2)

    +

    Example: -

    -Deletes the associations from hash-table1 -whose keys don't also appear in hash-table2 and -returns hash-table1.

    +
    (hash-table-union! alphabet-table irc-alphabet-table)
    +                ;=> alphabet-table, now also containing the
    +                ;   additional associations from irc-alphabet-table
    -

    -(hash-table-difference! hash-table1 hash-table2)

    +

    +(hash-table-intersection! hash-table1 hash-table2)

    -Deletes the associations of hash-table1 whose -keys are also present in hash-table2 and returns -hash-table1.

    +Deletes the associations from hash-table1 +whose keys don’t also appear in hash-table2 and +returns hash-table1.

    + +

    Example: + +

    (hash-table-intersection! alphabet-table irc-alphabet-table)
    +                ;=> alphabet-table, now empty
    + + +

    +(hash-table-difference! hash-table1 hash-table2)

    -(hash-table-xor! hash-table1 hash-table2)

    +Deletes the associations of hash-table1 whose +keys are also present in hash-table2 and returns +hash-table1.

    + +

    Example: + +

    (hash-table-difference! alphabet-table irc-alphabet-table)
    +                ; alphabet-table, now only containing associations
    +                ; for #\{, #\|, and #\}
    + +

    +(hash-table-xor! hash-table1 hash-table2)

    -Deletes the associations of hash-table1 whose -keys are also present in hash-table2, and then -adds the associations of hash-table2 whose keys -are not present in hash-table1 to -hash-table1. Returns hash-table1.

    +Deletes the associations of hash-table1 whose +keys are also present in hash-table2, and then +adds the associations of hash-table2 whose keys +are not present in hash-table1 to +hash-table1. Returns hash-table1.

    + +

    Example: + +

    (hash-table-xor! alphabet-table
    +                 (hash-table char-comparator
    +                             #\a #\A
    +                             #\e #\E
    +                             #\i #\I
    +                             #\o #\O
    +                             #\u #\U
    +                             #\{ #\[
    +                             #\| #\\
    +                             #\} #\]))
    +                ;=> alphabet-table, now only containing entries for
    +                ;   consonants and Fennoscandian vocalic punctuation

    Implementation

    -

    -The current sample implementation is unfinished, and is here. When it is finished, it will be moved into this SRFI's repo. It relies upon -SRFI 128.

    +

    The sample implementation is in the repository of this SRFI. + +

    It runs on R7RS and R6RS Schemes and requires a small additional + prelude. Suitable ‘generic’ preludes for any R7RS Small and R6RS + implementations are provided. You can tune or rewrite the prelude + easily for what your particular Scheme implementation provides. As + examples, the R6RS prelude is tuned for Chez Scheme, and there is an + additional alternative prelude for Guile based on the R6RS one. + +

    Running on R7RS Small, it requires the (scheme + case-lambda) library and SRFIs 1 (list library), 128 + (comparators), 151 (bitwise operations), and 160 (homogeneous + numeric vectors). SRFI 160 could also easily be replaced by SRFI 4. + All assertion violations are raised as R7RS Small ‘error objects’. + +

    Running on R6RS, it requires SRFIs 128 (comparators) and 133 + (vector library). A future version may eliminate the minor + dependency on SRFI 133. + +

    In the form in which the sample implementation is distributed in + the repository, the hash tables of the sample implementation are + disjoint from any other existing hash table implementation. However, + it would be trivial to provide the R6RS hash table library or any of + the other SRFI hash table libraries in terms of this SRFI and + therefore make those libraries interoperable within a particular + Scheme implementation. + +

    Implementation techniques

    + +

    The sample implementation uses a technique apparently + due to Raymond Hettinger + and + tested in PyPy + before + being adopted by CPython. + +

    With this strategy, the actual ‘hash table’ in the old-fashioned + sense is an open-addressed array which contains index values into + another array of the actual entries in order. Because the index + values are typically small, they can be stored as 8-bit or 16-bit + numbers in the majority of cases, upgrading their size as more + entries are added, going to 32 and 64 bits for the very largest of + hash tables. This makes the vast majority of real-world hash tables + considerably smaller in memory footprint than in traditional + implementations, and insertion ordering of entries is a nice + side-effect. + +

    In this implementation technique, there is no performance benefit + to using the unordered hash-table-fold over the ordered + variants hash-table-fold-left and + hash-table-fold-right. + +

    A disadvantage of this approach is that deletion of hash table + entries (other than the most recently added) can only be achieved by + replacing an entry in the insertion-ordered array with a special + sentinel value which later needs to be cleaned up. Applications + which often add and delete keys may notice a speed penalty. + Benchmarks show moderate gains in performance compared to classical + hash tables for other access patterns, however, due to better use of + the CPU cache. + +

    Another option is to store hash table entries on a doubly-linked + list. This was the original technique used when ordered hash tables + were made part of the Ruby language in version 1.9, but the + implementation was changed in Ruby 3.0 to use the technique + described above. With this technique, unordered iteration can make + more efficient use of the CPU cache than the ordered variants. + Deletion is no less efficient than on a typical hash table + implementation. Such an implementation could also provide an + extension library allowing adding entries at the beginning of the + insertion order as well as at the end, or even in the middle. + +

    The need to store and manipulate the extra pair of pointers on each + entry to maintain insertion order makes this technique significantly + worse in memory use and slightly worse in speed than a hash table + without insertion ordering. A clever implementation in C might be + able to work around this. + +

    It is not recommended to attempt to implement this SRFI by bundling + an existing non-ordering hash table implementation and a list together + into a record type.

    Acknowledgements

    @@ -917,6 +1624,18 @@

    Acknowledgements

    I also acknowledge the members of the SRFI 250, 125, 126, and 128 mailing lists, especially Takashi Kato, Alex Shinn, Shiro Kawai, and Per Bothner.

    +

    Footnotes

    + +

    * Historical note: The + draft versions of R6RS up to and including R5.92RS also used the + hyphenated hash-table form. It was, puzzlingly, changed + in response to formal + comment 215, which argued for hyphenating the name + bytevector, not against hyphenating + hash-table.

    +

    © 2023 John Cowan, Will Clinger, Daphne Preston-Kendal.

    diff --git a/srfi/250.sld b/srfi/250.sld new file mode 100644 index 0000000..a9b480a --- /dev/null +++ b/srfi/250.sld @@ -0,0 +1,135 @@ +(define-library (srfi 250) + (export + ;; Constructors + make-hash-table + (rename prefilled-hash-table hash-table) + hash-table-unfold + alist->hash-table + ;; Predicates + hash-table? + hash-table-contains? + hash-table-empty? + hash-table-mutable? + ;; Accessors + hash-table-ref + hash-table-ref/default + hash-table-comparator + ;; Mutators + hash-table-add! + hash-table-replace! + hash-table-set! + hash-table-delete! + hash-table-intern! + hash-table-update! + hash-table-update!/default + hash-table-pop! + hash-table-clear! + ;; The whole hash table + hash-table-size + hash-table= + hash-table-find + hash-table-count + ;; Low-level iteration + hash-table-cursor-first + hash-table-cursor-last + hash-table-cursor-for-key + hash-table-cursor-next + hash-table-cursor-previous + hash-table-cursor-key + hash-table-cursor-value + hash-table-cursor-key+value + hash-table-cursor-value-set! + hash-table-cursor-at-end? + ;; Mapping and folding + hash-table-map + hash-table-map! + hash-table-for-each + hash-table-map->list + hash-table-fold + hash-table-fold-left + hash-table-fold-right + hash-table-prune! + ;; Copying and conversion + hash-table-copy + hash-table-empty-copy + hash-table->alist + ;; Hash tables as sets + hash-table-union! + hash-table-intersection! + hash-table-difference! + hash-table-xor!) + (import (scheme base) + (scheme case-lambda) + + (srfi 1) + (srfi 128) + (srfi 151) + (srfi 160 base)) + + (cond-expand (chibi (import (only (chibi ast) make-immutable!))) + (else)) + + (begin + (define-syntax not-on-r6rs + (syntax-rules () + ((_ body_0 body_1 ...) (begin body_0 body_1 ...)))) + + (define (void . ignored) (if #f #f)) + (define (hash-truncate h) (bitwise-and (abs h) #xFFFFFFFF)) + + (define-record-type Hash-Table + (%make-hash-table type-test-function hash-function same?-function + size next-entry compact-index + keys-vector values-vector mutable?) + hash-table? + (type-test-function hash-table-type-test-function) + (hash-function hash-table-hash-function) + (same?-function hash-table-same?-function) + (size hash-table-size hash-table-size-set!) + (next-entry hash-table-next-entry hash-table-next-entry-set!) + (compact-index hash-table-compact-index hash-table-compact-index-set!) + (compact-index-max-fill hash-table-compact-index-max-fill hash-table-compact-index-max-fill-set!) + (keys-vector hash-table-keys-vector hash-table-keys-vector-set!) + (values-vector hash-table-values-vector hash-table-values-vector-set!) + (mutable? hash-table-mutable? hash-table-mutable?-set!)) + + (define *unfilled* + (let () + (define-record-type Unfilled (make-unfilled) unfilled?) + (make-unfilled))) + (define (unfilled? obj) (eq? obj *unfilled*)) + + (define *deletion* + (let () + (define-record-type Deletion (make-deletion) deletion?) + (make-deletion))) + (define (deletion? obj) (eq? obj *deletion*)) + + (define *growth-rate* 3/2) + (define *default-k* 7) + + (define (assertion-violation who message . irritants) + (apply error + (string-append (if (symbol? who) + (symbol->string who) + who) + ": " + message) + irritants))) + + (cond-expand + (chibi + (begin + (define (hash-table-immutablize! ht) + (make-immutable! (hash-table-keys-vector ht)) + (make-immutable! (hash-table-values-vector ht)) + (make-immutable! (hash-table-compact-index ht)) + (hash-table-mutable?-set! ht #f) + (make-immutable! ht)))) + (else + (begin + (define (hash-table-immutablize! ht) + (hash-table-mutable?-set! ht #f))))) + + (include "250/internal/srfi-compact-arrays.scm") + (include "250/hash-tables.scm")) diff --git a/srfi/250/hash-tables.scm b/srfi/250/hash-tables.scm new file mode 100644 index 0000000..0c0120c --- /dev/null +++ b/srfi/250/hash-tables.scm @@ -0,0 +1,748 @@ +;; -*- eldoc-documentation-function: eldoc-documentation-default -*- +;; scheme-complete eldoc is bizarrely slow in this buffer + +(define *nice-n-buckets* + '#(2 2 3 5 5 7 7 11 11 13 13 17 17 19 19 23 23 23 23 29 29 31 31 31 31 + 37 37 41 41 43 43 47 47 47 47 53 53 53 53 59 59 61 61 61 61 67 67 + 71 71 73 73 73 73 79 79 83 83 83 83 89 89 89 89 89 89 97 97 101 + 101 103 103 107 107 109 109 113 113 113 113 113 113 113 113 113 + 113 127 127 131 131 131 131 137 137 139 139 139 139 139 139 149 + 149 151 151 151 151 157 157 157 157 163 163 167 167 167 167 173 + 173 173 173 179 179 181 181 181 181 181 181 191 191 193 193 197 + 197 199 199 199 199 199 199 199 199 211 211 211 211 211 211 211 + 211 223 223 227 227 229 229 233 233 233 233 239 239 241 241 241 + 241 241 241 251 251 251 251 257 257 257 257 263 263 263 263 269 + 269 271 271 271 271 277 277 281 281 283 283 283 283 283 283 293 + 293 293 293 293 293 293 293 293 293 307 307 311 311 313 313 317 + 317 317 317 317 317 317 317 317 317 331 331 331 331 337 337 337 + 337 337 337 347 347 349 349 353 353 353 353 359 359 359 359 359 + 359 367 367 367 367 373 373 373 373 379 379 383 383 383 383 389 + 389 389 389 389 389 397 397 401 401 401 401 401 401 409 409 409 + 409 409 409 419 419 421 421 421 421 421 421 431 431 433 433 433 + 433 439 439 443 443 443 443 449 449 449 449 449 449 457 457 461 + 461 463 463 467 467 467 467 467 467 467 467 479 479 479 479 479 + 479 487 487 491 491 491 491 491 491 499 499 503 503 503 503 509 + 509 509 509 509 509 509 509 521 521 523 523 523 523 523 523 523 + 523 523 523 523 523 541 541 541 541 547 547 547 547 547 547 557 + 557 557 557 563 563 563 563 569 569 571 571 571 571 577 577 577 + 577 577 577 587 587 587 587 593 593 593 593 599 599 601 601 601 + 601 607 607 607 607 613 613 617 617 619 619 619 619 619 619 619 + 619 631 631 631 631 631 631 641 641 643 643 647 647 647 647 653 + 653 653 653 659 659 661 661 661 661 661 661 661 661 673 673 677 + 677 677 677 683 683 683 683 683 683 691 691 691 691 691 691 701 + 701 701 701 701 701 709 709 709 709 709 709 719 719 719 719 719 + 719 727 727 727 727 733 733 733 733 739 739 743 743 743 743 743 + 743 751 751 751 751 757 757 761 761 761 761 761 761 769 769 773 + 773 773 773 773 773 773 773 773 773 787 787 787 787 787 787 797 + 797 797 797 797 797 797 797 809 809 811 811 811 811 811 811 821 + 821 823 823 827 827 829 829 829 829 829 829 839 839 839 839 839 + 839 839 839 839 839 853 853 857 857 859 859 863 863 863 863 863 + 863 863 863 863 863 877 877 881 881 883 883 887 887 887 887 887 + 887 887 887 887 887 887 887 887 887 907 907 911 911 911 911 911 + 911 919 919 919 919 919 919 929 929 929 929 929 929 937 937 941 + 941 941 941 947 947 947 947 953 953 953 953 953 953 953 953 953 + 953 967 967 971 971 971 971 977 977 977 977 983 983 983 983 983 + 983 991 991 991 991 997 997 997 997 997 997 997 997)) + +(define (find-nice-n-buckets k) + (if (< k (vector-length *nice-n-buckets*)) + (vector-ref *nice-n-buckets* k) + (let ((limit (+ 1 (floor (* k 3/2))))) + ;; todo: put a little bit more effort into this + (if (even? limit) + (+ 1 limit) + limit)))) + +;; the actual hash table bit: fundamental constructor +(define make-hash-table + (case-lambda + ((comparator) (make-hash-table comparator *default-k*)) + ((comparator k) + (unless (and (comparator? comparator) (comparator-hashable? comparator)) + (assertion-violation 'make-hash-table + "not a hashable comparator" + comparator)) + (unless (and (integer? k) (exact? k) (not (negative? k))) + (assertion-violation 'make-hash-table + "size is not an exact nonnegative integer" + k)) + (let ((k* (max k 1)) + (n-buckets (find-nice-n-buckets k))) + (%make-hash-table (comparator-type-test-predicate comparator) + (comparator-hash-function comparator) + (comparator-equality-predicate comparator) + 0 + 0 + (make-compact-array n-buckets) + (floor (* n-buckets 2/3)) + (make-vector k* *unfilled*) + (make-vector k* *unfilled*) + #t))))) + +;; construct and pre-fill +(define (prefilled-hash-table comparator . keys-values) + (define keys-values-length (length keys-values)) + (unless (even? keys-values-length) + (assertion-violation 'hash-table + "there must be as many keys as values" + keys-values)) + (let ((ht (make-hash-table comparator (/ keys-values-length 2)))) + (apply hash-table-set! ht keys-values) + ht)) + +(define alist->hash-table + (case-lambda + ((alist comparator) + (alist->hash-table alist comparator (length alist))) + ((alist comparator k) + (let ((ht (make-hash-table comparator k))) + (for-each (lambda (entry) + (let ((key (car entry)) + (value (cdr entry))) + (hash-table-set! ht key value))) + (reverse alist)) + ht)))) + +(define (hash-table-empty-copy ht) + (make-hash-table (hash-table-comparator ht) + (hash-table-size ht))) + +;; internal helpers +(define (hash-table-hash ht key) + (hash-truncate ((hash-table-hash-function ht) key))) + +(define (hash-table-same? ht a b) + ((hash-table-same?-function ht) a b)) + +(define (hash-table-right-type? ht obj) + ((hash-table-type-test-function ht) obj)) + +(define (%hash-table-bucket-for ht hash key) + (let ((n-buckets (compact-array-length (hash-table-compact-index ht)))) + (let loop ((hash hash)) + ;;(display hash) (newline) + (let* ((bucket (modulo hash n-buckets)) + (entry-idx + (compact-array-ref (hash-table-compact-index ht) bucket))) + (if entry-idx + (let ((found-key + (vector-ref (hash-table-keys-vector ht) entry-idx))) + (if (and (not (deletion? found-key)) + (hash-table-same? ht key found-key)) + bucket + (loop (+ hash 1)))) + bucket))))) + +(define (hash-table-bucket-for-key ht key) + (%hash-table-bucket-for ht (hash-table-hash ht key) key)) + +(define (hash-table-grow-entries! ht) + ;; first determine if we actually need to grow the entries arrays, + ;; or if pruning dead entries would suffice + (if (< (hash-table-size ht) + (* (vector-length (hash-table-keys-vector ht)) + (/ 1 *growth-rate*))) + (hash-table-prune-dead-entries! ht #f) + ;; otherwise, actually grow the entries array + (let* ((old-size (vector-length (hash-table-keys-vector ht))) + (new-size (max (floor (* old-size *growth-rate*)) + (+ old-size 1))) + (new-keys (make-vector new-size *unfilled*)) + (new-values (make-vector new-size *unfilled*))) + (vector-copy! new-keys 0 (hash-table-keys-vector ht)) + (vector-copy! new-values 0 (hash-table-values-vector ht)) + (hash-table-keys-vector-set! ht new-keys) + (hash-table-values-vector-set! ht new-values) + (hash-table-prune-dead-entries! ht #f)))) + +(define (hash-table-prune-dead-entries! ht fast?) + ;; NB only set fast? to #t if you are going to be rehashing all + ;; entries anyway! + (unless (eqv? (hash-table-size ht) (hash-table-next-entry ht)) + (let loop ((from-idx 0) + (to-idx 0)) + ;;(display from-idx) (newline) (display to-idx) (newline) (newline) + (cond ((or (>= from-idx (hash-table-next-entry ht)) + (unfilled? (vector-ref (hash-table-keys-vector ht) from-idx))) + (vector-fill! (hash-table-keys-vector ht) + *unfilled* + (hash-table-size ht) + (hash-table-next-entry ht)) + (vector-fill! (hash-table-values-vector ht) + *unfilled* + (hash-table-size ht) + (hash-table-next-entry ht)) + (hash-table-next-entry-set! ht (hash-table-size ht))) + ((deletion? (vector-ref (hash-table-keys-vector ht) from-idx)) + (unless fast? + (compact-array-delete! (hash-table-compact-index ht) + (vector-ref (hash-table-values-vector ht) from-idx))) + (loop (+ from-idx 1) to-idx)) + ((eqv? from-idx to-idx) (loop (+ from-idx 1) (+ to-idx 1))) + (else + (vector-set! (hash-table-keys-vector ht) + to-idx + (vector-ref (hash-table-keys-vector ht) from-idx)) + (vector-set! (hash-table-values-vector ht) + to-idx + (vector-ref (hash-table-values-vector ht) from-idx)) + (unless fast? + (compact-array-set! (hash-table-compact-index ht) + (hash-table-bucket-for-key + ht + (vector-ref (hash-table-keys-vector ht) + to-idx)) + to-idx)) + (loop (+ from-idx 1) (+ to-idx 1))))))) + +(define (hash-table-prune-dead-entries-at-end! ht) + (let loop ((idx (- (hash-table-next-entry ht) 1))) + (when (and (>= idx 0) + (deletion? (vector-ref (hash-table-keys-vector ht) idx))) + (compact-array-delete! (hash-table-compact-index ht) + (vector-ref (hash-table-values-vector ht) idx)) + (vector-set! (hash-table-keys-vector ht) idx *unfilled*) + (vector-set! (hash-table-values-vector ht) idx *unfilled*) + (hash-table-next-entry-set! ht idx) + (loop (- idx 1))))) + +(define (hash-table-grow-compact-index! ht) + ;; this parameter isn't tunable: the compact index is always between + ;; 1/2 and 2/3 full except when the table is brand new + (let* ((new-size (ceiling (* (compact-array-length (hash-table-compact-index ht)) + 4/3))) + (new-compact-index (make-compact-array new-size))) + ;; take the opportunity to prune dead entries, since we have to + ;; iterate all of them anyway + (hash-table-prune-dead-entries! ht #t) + (hash-table-compact-index-set! ht new-compact-index) + (hash-table-compact-index-max-fill-set! ht (floor (* new-size 2/3))) + (let loop ((idx 0)) + (unless (>= idx (vector-length (hash-table-keys-vector ht))) + (let ((key (vector-ref (hash-table-keys-vector ht) idx))) + (unless (unfilled? key) + (compact-array-set! new-compact-index + (hash-table-bucket-for-key ht key) + idx) + (loop (+ idx 1)))))))) + +;; #t if the hash table’s compact index has to grow to accommodate the +;; next association added +(define (hash-table-compact-index-must-grow? ht) + (>= (hash-table-next-entry ht) (hash-table-compact-index-max-fill ht))) + +;; add to the entries arrays, setting the bucket in the compact index +(define (hash-table-add-entry! ht bucket key value) + (if (>= (hash-table-next-entry ht) + (vector-length (hash-table-keys-vector ht))) + (hash-table-grow-entries! ht)) + (when (hash-table-compact-index-must-grow? ht) + (hash-table-grow-compact-index! ht) + (set! bucket (hash-table-bucket-for-key ht key))) + (vector-set! (hash-table-keys-vector ht) + (hash-table-next-entry ht) + key) + (vector-set! (hash-table-values-vector ht) + (hash-table-next-entry ht) + value) + (compact-array-set! (hash-table-compact-index ht) + bucket + (hash-table-next-entry ht)) + (hash-table-size-set! ht (+ (hash-table-size ht) 1)) + (hash-table-next-entry-set! ht (+ (hash-table-next-entry ht) 1))) + +;; basic public interface, getting and setting +(define hash-table-add! + (case-lambda + ((ht key value) + (unless (hash-table-mutable? ht) + (assertion-violation 'hash-table-add! + "hash table is immutable" + ht)) + (unless (hash-table-right-type? ht key) + (assertion-violation 'hash-table-add! + "not the right type for a key in this hash table" + key ht)) + (let* ((bucket (hash-table-bucket-for-key ht key)) + (entry-idx (compact-array-ref (hash-table-compact-index ht) + bucket))) + (if entry-idx + (assertion-violation 'hash-table-add! + "already an association for this key in this hash table" + key ht) + (hash-table-add-entry! ht bucket key value)))) + ((ht) (void)) + ((ht . keys-values) + (unless (even? (length keys-values)) + (assertion-violation 'hash-table-add! + "there must be as many keys as values" + (length keys-values) keys-values)) + (let loop ((key (car keys-values)) + (value (cadr keys-values)) + (more-keys-values (cddr keys-values))) + (hash-table-add! ht key value) + (unless (null? more-keys-values) + (loop (car more-keys-values) + (cadr more-keys-values) + (cddr more-keys-values))))))) + +(define hash-table-replace! + (case-lambda + ((ht key value) + (unless (hash-table-mutable? ht) + (assertion-violation 'hash-table-replace! + "hash table is immutable" + ht)) + (unless (hash-table-right-type? ht key) + (assertion-violation 'hash-table-replace! + "not the right type for a key in this hash table" + key ht)) + (let* ((bucket (hash-table-bucket-for-key ht key)) + (entry-idx (compact-array-ref (hash-table-compact-index ht) + bucket))) + (if entry-idx + (vector-set! (hash-table-values-vector ht) entry-idx value) + (assertion-violation 'hash-table-replace! + "key not in table" + key ht)))) + ((ht) (void)) + ((ht . keys-values) + (unless (even? (length keys-values)) + (assertion-violation 'hash-table-replace! + "there must be as many keys as values" + (length keys-values) keys-values)) + (let loop ((key (car keys-values)) + (value (cadr keys-values)) + (more-keys-values (cddr keys-values))) + (hash-table-replace! ht key value) + (unless (null? more-keys-values) + (loop (car more-keys-values) + (cadr more-keys-values) + (cddr more-keys-values))))))) + +(define hash-table-set! + (case-lambda + ((ht key value) + (unless (hash-table-mutable? ht) + (assertion-violation 'hash-table-set! + "hash table is immutable" + ht)) + (unless (hash-table-right-type? ht key) + (assertion-violation 'hash-table-set! + "not the right type for a key in this hash table" + key ht)) + (let* ((bucket (hash-table-bucket-for-key ht key)) + (entry-idx (compact-array-ref (hash-table-compact-index ht) + bucket))) + (if entry-idx + (vector-set! (hash-table-values-vector ht) entry-idx value) + (hash-table-add-entry! ht bucket key value)))) + ((ht) (void)) + ((ht . keys-values) + (unless (even? (length keys-values)) + (assertion-violation 'hash-table-set! + "there must be as many keys as values" + (length keys-values) keys-values)) + (let loop ((key (car keys-values)) + (value (cadr keys-values)) + (more-keys-values (cddr keys-values))) + (hash-table-set! ht key value) + (unless (null? more-keys-values) + (loop (car more-keys-values) + (cadr more-keys-values) + (cddr more-keys-values))))))) + +(define hash-table-ref + (case-lambda + ((ht key) + (hash-table-ref ht key + (lambda () (assertion-violation 'hash-table-ref + "key not in table" + key ht)) + values)) + ((ht key failure) + (hash-table-ref ht key failure values)) + ((ht key failure success) + (unless (hash-table-right-type? ht key) + (assertion-violation 'hash-table-ref + "not the right type for a key in this hash table" + key ht)) + (let* ((bucket (hash-table-bucket-for-key ht key)) + (entry-idx (compact-array-ref (hash-table-compact-index ht) + bucket))) + (if entry-idx + (success (vector-ref (hash-table-values-vector ht) entry-idx)) + (failure)))))) + +(define (hash-table-ref/default ht key default) + (hash-table-ref ht key (lambda () default))) + +(define (hash-table-delete! ht . keys) + (unless (hash-table-mutable? ht) + (assertion-violation 'hash-table-delete! + "hash table is immutable" + ht)) + (let loop ((n-deleted 0) + (more-keys keys)) + (if (null? more-keys) + (begin + (hash-table-size-set! ht (- (hash-table-size ht) n-deleted)) + (when (> (- (hash-table-next-entry ht) (hash-table-size ht)) + (* 1/3 (hash-table-size ht))) + (hash-table-prune-dead-entries! ht #f)) + n-deleted) + (if (hash-table-delete-one! ht (car more-keys)) + (loop (+ n-deleted 1) (cdr more-keys)) + (loop n-deleted (cdr more-keys)))))) + +(define (hash-table-delete-one! ht key) + (unless (hash-table-right-type? ht key) + (assertion-violation 'hash-table-delete! + "not the right type for a key in this hash table" + key ht)) + (let* ((bucket (hash-table-bucket-for-key ht key)) + (entry-idx (compact-array-ref (hash-table-compact-index ht) + bucket))) + (if entry-idx + (begin + (vector-set! (hash-table-keys-vector ht) + entry-idx *deletion*) + (vector-set! (hash-table-values-vector ht) + entry-idx bucket) + (when (eqv? entry-idx (- (hash-table-next-entry ht) 1)) + (hash-table-prune-dead-entries-at-end! ht)) + #t) + #f))) + +(define (hash-table-pop! ht) + (unless (hash-table-mutable? ht) + (assertion-violation 'hash-table-delete! + "hash table is immutable" + ht)) + (when (hash-table-empty? ht) + (assertion-violation 'hash-table-pop! + "hash table is already empty" + ht)) + (let* ((idx (- (hash-table-next-entry ht) 1)) + (key (vector-ref (hash-table-keys-vector ht) idx)) + (value (vector-ref (hash-table-values-vector ht) idx))) + (vector-set! (hash-table-keys-vector ht) idx *unfilled*) + (vector-set! (hash-table-values-vector ht) idx *unfilled*) + (hash-table-size-set! ht (- (hash-table-size ht) 1)) + (hash-table-next-entry-set! ht idx) + (values key value))) + +(define (hash-table-clear! ht) + (unless (hash-table-mutable? ht) + (assertion-violation 'hash-table-clear! + "hash table is immutable" + ht)) + ;; assumes the hash table is going to be refilled with approximately + ;; the same number of associations as were previously in it + (compact-array-clear! (hash-table-compact-index ht)) + (vector-fill! (hash-table-keys-vector ht) *unfilled*) + (vector-fill! (hash-table-values-vector ht) *unfilled*) + (hash-table-size-set! ht 0) + (hash-table-next-entry-set! ht 0)) + +(define (hash-table-contains? ht key) + (unless (hash-table-right-type? ht key) + (assertion-violation 'hash-table-contains? + "not the right type for a key in this hash table" + key ht)) + (let* ((bucket (hash-table-bucket-for-key ht key)) + (entry-idx (compact-array-ref (hash-table-compact-index ht) + bucket))) + (not (not entry-idx)))) + +(define (hash-table-empty? ht) (zero? (hash-table-size ht))) +(define (hash-table-comparator ht) + (make-comparator (hash-table-type-test-function ht) + (hash-table-same?-function ht) + #f + (hash-table-hash-function ht))) + +;; cursor-based iteration +(define (hash-table-cursor-first ht) + (hash-table-cursor-next ht -1)) +(define (hash-table-cursor-last ht) + (hash-table-cursor-previous ht (vector-length (hash-table-keys-vector ht)))) + +(define (hash-table-cursor-for-key ht key) + (unless (hash-table-right-type? ht key) + (assertion-violation 'hash-table-cursor-for-key + "not the right type for a key in this hash table" + key ht)) + (let* ((bucket (hash-table-bucket-for-key ht key)) + (entry-idx (compact-array-ref (hash-table-compact-index ht) + bucket))) + (if entry-idx + entry-idx + -1))) + +(define (hash-table-cursor-next ht cur) + (let loop ((n (+ cur 1))) + (if (>= n (vector-length (hash-table-keys-vector ht))) + n + (let ((key (vector-ref (hash-table-keys-vector ht) n))) + (cond ((unfilled? key) n) + ((deletion? key) (loop (+ n 1))) + (else n)))))) +(define (hash-table-cursor-previous ht cur) + (let loop ((n (- cur 1))) + (if (< n 0) + n + (let ((key (vector-ref (hash-table-keys-vector ht) n))) + (if (or (unfilled? key) (deletion? key)) + (loop (- n 1)) + n))))) + +(define (hash-table-cursor-key ht cur) + (vector-ref (hash-table-keys-vector ht) cur)) +(define (hash-table-cursor-value ht cur) + (vector-ref (hash-table-values-vector ht) cur)) +(define (hash-table-cursor-key+value ht cur) + (values (hash-table-cursor-key ht cur) + (hash-table-cursor-value ht cur))) + +(define (hash-table-cursor-value-set! ht cur val) + (vector-set! (hash-table-values-vector ht) cur val)) + +(define (hash-table-cursor-at-end? ht cur) + (or (negative? cur) + (>= cur (vector-length (hash-table-keys-vector ht))) + (unfilled? (vector-ref (hash-table-keys-vector ht) cur)))) + +(define (hash-table-fold proc seed ht) + (let loop ((cur (hash-table-cursor-first ht)) + (acc seed)) + (if (hash-table-cursor-at-end? ht cur) + acc + (loop (hash-table-cursor-next ht cur) + (proc (hash-table-cursor-key ht cur) + (hash-table-cursor-value ht cur) + acc))))) + +(define (hash-table-fold-left proc seed ht) + (let loop ((cur (hash-table-cursor-first ht)) + (acc seed)) + (if (hash-table-cursor-at-end? ht cur) + acc + (loop (hash-table-cursor-next ht cur) + (proc acc + (hash-table-cursor-key ht cur) + (hash-table-cursor-value ht cur)))))) + +(define (hash-table-fold-right proc seed ht) + (let loop ((cur (hash-table-cursor-last ht)) + (acc seed)) + (if (hash-table-cursor-at-end? ht cur) + acc + (loop (hash-table-cursor-previous ht cur) + (proc (hash-table-cursor-key ht cur) + (hash-table-cursor-value ht cur) + acc))))) + +(define (hash-table-for-each proc ht) + (let loop ((cur (hash-table-cursor-first ht))) + (unless (hash-table-cursor-at-end? ht cur) + (call-with-values + (lambda () (hash-table-cursor-key+value ht cur)) + proc) + (loop (hash-table-cursor-next ht cur))))) + +(define (hash-table-map proc ht) + (let ((new-ht (hash-table-empty-copy ht))) + (hash-table-for-each (lambda (k v) + (hash-table-add! new-ht k (proc k v))) + ht) + new-ht)) + +(define (hash-table-map! proc ht) + (unless (hash-table-mutable? ht) + (assertion-violation 'hash-table-map! + "hash table is immutable" + ht)) + (let loop ((cur (hash-table-cursor-first ht))) + (if (hash-table-cursor-at-end? ht cur) + ht + (let ((new-val + (call-with-values + (lambda () (hash-table-cursor-key+value ht cur)) + proc))) + (hash-table-cursor-value-set! ht cur new-val) + (loop (hash-table-cursor-next ht cur)))))) + +(define (hash-table-map->list proc ht) + (hash-table-fold-right + (lambda (k v l) + (cons (proc k v) l)) + '() ht)) + +(define (hash-table->alist ht) + (hash-table-fold-left + (lambda (l k v) + (cons (cons k v) l)) + '() ht)) + +(define (hash-table-find proc ht failure) + (let loop ((cur (hash-table-cursor-first ht))) + (cond ((hash-table-cursor-at-end? ht cur) + (failure)) + ((call-with-values + (lambda () (hash-table-cursor-key+value ht cur)) + proc)) + (else (loop (hash-table-cursor-next ht cur)))))) + +(define (hash-table-count pred ht) + (hash-table-fold + (lambda (k v acc) + (if (pred k v) + (+ acc 1) + acc)) + 0 + ht)) + +(define (hash-table-prune! proc ht) + (unless (hash-table-mutable? ht) + (assertion-violation 'hash-table-prune! + "hash table is immutable" + ht)) + (let loop ((cur (hash-table-cursor-first ht)) (n-deleted 0)) + (if (hash-table-cursor-at-end? ht cur) + (begin + (hash-table-size-set! ht (- (hash-table-size ht) n-deleted)) + (hash-table-prune-dead-entries-at-end! ht) + (when (> (- (hash-table-next-entry ht) (hash-table-size ht)) + (* 1/3 (hash-table-size ht))) + (hash-table-prune-dead-entries! ht #f)) + n-deleted) + (let-values (((k v) (hash-table-cursor-key+value ht cur))) + (if (and (proc k v) + (hash-table-delete-one! ht k)) + (loop (hash-table-cursor-next ht cur) (+ n-deleted 1)) + (loop (hash-table-cursor-next ht cur) n-deleted)))))) + +(define (hash-table-copy ht mutable?) + (hash-table-prune-dead-entries! ht #f) + (if mutable? + (%make-hash-table (hash-table-type-test-function ht) + (hash-table-hash-function ht) + (hash-table-same?-function ht) + (hash-table-size ht) + (hash-table-next-entry ht) + (compact-array-copy (hash-table-compact-index ht)) + (hash-table-compact-index-max-fill ht) + (vector-copy (hash-table-keys-vector ht)) + (vector-copy (hash-table-values-vector ht)) + #t) + (let ((out-ht (hash-table-empty-copy ht))) + (hash-table-for-each (lambda (k v) + (hash-table-add! out-ht k v)) + ht) + (hash-table-immutablize! out-ht) + out-ht))) + +;; set-like operations +(define (hash-table-union! ht_1 ht_2) + (hash-table-for-each + (lambda (k v) + (unless (hash-table-contains? ht_1 k) + (hash-table-set! ht_2 k v))) + ht_2) + ht_1) + +(define (hash-table-intersection! ht_1 ht_2) + (hash-table-prune! + (lambda (k v) + (not (hash-table-contains? ht_2 k))) + ht_1) + ht_1) + +(define (hash-table-difference! ht_1 ht_2) + (hash-table-prune! + (lambda (k v) + (hash-table-contains? ht_2 k)) + ht_1) + ht_1) + +(define (hash-table-xor! ht_1 ht_2) + (hash-table-for-each + (lambda (k v) + (if (hash-table-contains? ht_1 k) + (hash-table-delete! ht_1 k) + (hash-table-set! ht_1 k v))) + ht_2) + ht_1) + +(define (hash-table= value=? ht_1 ht_2) + (not-on-r6rs + (unless (eqv? (hash-table-same?-function ht_1) + (hash-table-same?-function ht_2)) + (assertion-violation 'hash-table= + "hash tables have different equality predicates" + ht_1 ht_2))) + (and + ;; check every association in ht_1 has a corresponding association + ;; in ht_2 + (let loop ((cur (hash-table-cursor-first ht_1))) + (cond ((hash-table-cursor-at-end? ht_1 cur) #t) + ((and (hash-table-contains? ht_2 + (hash-table-cursor-key ht_1 cur)) + (value=? (hash-table-cursor-value ht_1 cur) + (hash-table-ref ht_2 (hash-table-cursor-key ht_1 cur)))) + (loop (hash-table-cursor-next ht_1 cur))) + (else #f))) + ;; also check there are no entries in ht_2 absent in ht_1 + (let loop ((cur (hash-table-cursor-first ht_2))) + (cond ((hash-table-cursor-at-end? ht_2 cur) #t) + ((hash-table-contains? ht_1 + (hash-table-cursor-key ht_2 cur)) + (loop (hash-table-cursor-next ht_1 cur))) + (else #f))))) + +;; public utility procedures, with implementations closely adapted +;; from Will Clinger’s original implementation + +;; not continuation-safe :-/ +(define hash-table-unfold + (case-lambda + ((stop? mapper successor seed comparator) + (hash-table-unfold stop? mapper successor seed comparator *default-k*)) + ((stop? mapper successor seed comparator k) + (let ((ht (make-hash-table comparator k))) + (let loop ((seed seed)) + (if (stop? seed) + ht + (call-with-values + (lambda () (mapper seed)) + (lambda (key val) + (hash-table-set! ht key val) + (loop (successor seed)))))))))) + +(define (hash-table-intern! ht key failure) + (unless (hash-table-mutable? ht) + (assertion-violation 'hash-table-intern! + "hash table is immutable" + ht)) + (if (hash-table-contains? ht key) + (hash-table-ref ht key) + (let ((val (failure))) + (hash-table-set! ht key val) + val))) + +(define (hash-table-update! ht key updater . rest) + (unless (hash-table-mutable? ht) + (assertion-violation 'hash-table-update! + "hash table is immutable" + ht)) + (hash-table-set! ht + key + (updater (apply hash-table-ref ht key rest)))) + +(define (hash-table-update!/default ht key updater default) + (unless (hash-table-mutable? ht) + (assertion-violation 'hash-table-update!/default + "hash table is immutable" + ht)) + (hash-table-set! ht key (updater (hash-table-ref/default ht key default)))) diff --git a/srfi/250/internal/r6rs-compact-arrays.scm b/srfi/250/internal/r6rs-compact-arrays.scm new file mode 100644 index 0000000..5d256c1 --- /dev/null +++ b/srfi/250/internal/r6rs-compact-arrays.scm @@ -0,0 +1,68 @@ +(define (make-compact-array size) + (cond ((fx (max-to #xFF)) + ((and (u16vector? sa) (u16vector-ref sa idx)) + => (max-to #xFFFF)) + ((and (u32vector? sa) (u32vector-ref sa idx)) + => (max-to #xFFFFFFFF)) + ((and (u64vector? sa) (u64vector-ref sa idx)) + => (max-to #xFFFFFFFFFFFFFFFF)))) + +(define (compact-array-set? sa idx) + (not (not (compact-array-ref sa idx)))) + +(define (compact-array-set! sa idx val) + (cond ((bytevector? sa) (bytevector-u8-set! sa idx val)) + ((u16vector? sa) (u16vector-set! sa idx val)) + ((u32vector? sa) (u32vector-set! sa idx val)) + ((u64vector? sa) (u64vector-set! sa idx val)))) + +(define (compact-array-delete! sa idx) + (cond ((bytevector? sa) + (bytevector-u8-set! sa idx #xFF)) + ((u16vector? sa) + (u16vector-set! sa idx #xFFFF)) + ((u32vector? sa) + (u32vector-set! sa idx #xFFFFFFFF)) + ((u64vector? sa) + (u64vector-set! sa idx #xFFFFFFFFFFFFFFFF)))) + +(define (compact-array-clear! sa) + (define len (compact-array-length sa)) + (cond ((bytevector? sa) + (let loop ((idx 0)) + (when (< idx len) + (bytevector-u8-set! sa idx #xFF) + (loop (+ idx 1))))) + ((u16vector? sa) + (let loop ((idx 0)) + (when (< idx len) + (u16vector-set! sa idx #xFFFF) + (loop (+ idx 1))))) + ((u32vector? sa) + (let loop ((idx 0)) + (when (< idx len) + (u32vector-set! sa idx #xFFFFFFFF) + (loop (+ idx 1))))) + ((u64vector? sa) + (let loop ((idx 0)) + (when (< idx len) + (u64vector-set! sa idx #xFFFFFFFFFFFFFFFF) + (loop (+ idx 1))))))) + +(define (compact-array-copy sa) + (define len (compact-array-length sa)) + (cond ((bytevector? sa) (bytevector-copy sa)) + ((u16vector? sa) + (let ((out (make-u16vector len))) + (let loop ((idx 0)) + (when (< idx len) + (u16vector-set! out idx (u16vector-ref sa idx)))))) + ((u32vector? sa) + (let ((out (make-u32vector len))) + (let loop ((idx 0)) + (when (< idx len) + (u32vector-set! out idx (u32vector-ref sa idx)))))) + ((u64vector? sa) + (let ((out (make-u64vector len))) + (let loop ((idx 0)) + (when (< idx len) + (u64vector-set! out idx (u64vector-ref sa idx)))))))) + +(define (compact-array-length sa) + (cond ((bytevector? sa) + (bytevector-length sa)) + ((u16vector? sa) + (u16vector-length sa)) + ((u32vector? sa) + (u32vector-length sa)) + ((u64vector? sa) + (u64vector-length sa)))) diff --git a/srfi/:250.sls b/srfi/:250.sls new file mode 100644 index 0000000..7c489ac --- /dev/null +++ b/srfi/:250.sls @@ -0,0 +1,61 @@ +(library (srfi :250) + (export + ;; Constructors + make-hash-table + hash-table + hash-table-unfold + alist->hash-table + ;; Predicates + hash-table? + hash-table-contains? + hash-table-empty? + hash-table-mutable? + ;; Accessors + hash-table-ref + hash-table-ref/default + hash-table-comparator + ;; Mutators + hash-table-add! + hash-table-replace! + hash-table-set! + hash-table-delete! + hash-table-intern! + hash-table-update! + hash-table-update!/default + hash-table-pop! + hash-table-clear! + ;; The whole hash table + hash-table-size + hash-table= + hash-table-find + hash-table-count + ;; Low-level iteration + hash-table-cursor-first + hash-table-cursor-last + hash-table-cursor-for-key + hash-table-cursor-next + hash-table-cursor-previous + hash-table-cursor-key + hash-table-cursor-value + hash-table-cursor-key+value + hash-table-cursor-value-set! + hash-table-cursor-at-end? + ;; Mapping and folding + hash-table-map + hash-table-map! + hash-table-for-each + hash-table-map->list + hash-table-fold + hash-table-fold-left + hash-table-fold-right + hash-table-prune! + ;; Copying and conversion + hash-table-copy + hash-table-empty-copy + hash-table->alist + ;; Hash tables as sets + hash-table-union! + hash-table-intersection! + hash-table-difference! + hash-table-xor!) + (import (srfi :250 hash-tables))) diff --git a/srfi/:250/hash-tables.sls b/srfi/:250/hash-tables.sls new file mode 100644 index 0000000..16cde66 --- /dev/null +++ b/srfi/:250/hash-tables.sls @@ -0,0 +1,124 @@ +(library (srfi :250 hash-tables) + (export + ;; Constructors + make-hash-table + (rename (prefilled-hash-table hash-table)) + hash-table-unfold + alist->hash-table + ;; Predicates + hash-table? + hash-table-contains? + hash-table-empty? + hash-table-mutable? + ;; Accessors + hash-table-ref + hash-table-ref/default + hash-table-comparator + ;; Mutators + hash-table-add! + hash-table-replace! + hash-table-set! + hash-table-delete! + hash-table-intern! + hash-table-update! + hash-table-update!/default + hash-table-pop! + hash-table-clear! + ;; The whole hash table + hash-table-size + hash-table= + hash-table-find + hash-table-count + ;; Low-level iteration + hash-table-cursor-first + hash-table-cursor-last + hash-table-cursor-for-key + hash-table-cursor-next + hash-table-cursor-previous + hash-table-cursor-key + hash-table-cursor-value + hash-table-cursor-key+value + hash-table-cursor-value-set! + hash-table-cursor-at-end? + ;; Mapping and folding + hash-table-map + hash-table-map! + hash-table-for-each + hash-table-map->list + hash-table-fold + hash-table-fold-left + hash-table-fold-right + hash-table-prune! + ;; Copying and conversion + hash-table-copy + hash-table-empty-copy + hash-table->alist + ;; Hash tables as sets + hash-table-union! + hash-table-intersection! + hash-table-difference! + hash-table-xor!) + + (import (except (rnrs (6)) vector-fill!) + (only (rnrs r5rs (6)) modulo) + (srfi :128 comparators) + (only (srfi :133 vectors) + vector-copy! + vector-fill!) + (srfi :250 internal include) + (srfi :250 internal immutable)) + + (define-syntax not-on-r6rs + (syntax-rules () + ((_ body_0 body_1 ...) (begin)))) + (define (void . ignored) (if #f #f)) + (define (hash-truncate h) (bitwise-and (abs h) #xFFFFFFFF)) + + (define-record-type (hash-table %make-hash-table hash-table?) + (fields (immutable type-test-function) + (immutable hash-function) + (immutable same?-function) + (mutable size) + (mutable next-entry) + (mutable compact-index) + (mutable compact-index-max-fill) + (mutable keys-vector) + (mutable values-vector) + (mutable mutable?)) + (opaque #t) + (sealed #t) + (nongenerative Hash-Table-BE0AFTGAdcwHkSOxhWtxQF+Ai1g)) + + (define-syntax define-sentinel + (syntax-rules () + ((_ name pred) + (begin + (define-record-type the-sentinel-type) + (define name (make-the-sentinel-type)) + (define (pred obj) (eq? obj name)))))) + (define-sentinel *unfilled* unfilled?) + (define-sentinel *deletion* deletion?) + + (define *default-k* 7) + (define *growth-rate* 3/2) + + (define (hash-table-immutablize! ht) + (hash-table-keys-vector-set! ht (vector->immutable-vector + (hash-table-keys-vector ht))) + (hash-table-values-vector-set! ht (vector->immutable-vector + (hash-table-values-vector ht))) + (hash-table-compact-index-set! ht (bytevector->immutable-bytevector + (hash-table-compact-index ht))) + (hash-table-mutable?-set! ht #f)) + + (define (vector-copy vec) + (define len (vector-length vec)) + (let ((out-vec (make-vector len))) + (let loop ((idx 0)) + (when (< idx len) + (vector-set! out-vec idx (vector-ref vec idx)) + (loop (+ idx 1)))) + out-vec)) + + (include "srfi/250/internal/r6rs-compact-arrays.scm") + (include "srfi/250/hash-tables.scm")) diff --git a/srfi/:250/internal/immutable.chezscheme.sls b/srfi/:250/internal/immutable.chezscheme.sls new file mode 100644 index 0000000..efd8943 --- /dev/null +++ b/srfi/:250/internal/immutable.chezscheme.sls @@ -0,0 +1,4 @@ +(library (srfi :250 internal immutable) + (export vector->immutable-vector + bytevector->immutable-bytevector) + (import (chezscheme))) diff --git a/srfi/:250/internal/immutable.sls b/srfi/:250/internal/immutable.sls new file mode 100644 index 0000000..b293adb --- /dev/null +++ b/srfi/:250/internal/immutable.sls @@ -0,0 +1,4 @@ +(library (srfi :250 internal immutable) + (export (rename (values vector->immutable-vector) + (values bytevector->immutable-bytevector))) + (import (rnrs (6)))) diff --git a/srfi/:250/internal/include.chezscheme.sls b/srfi/:250/internal/include.chezscheme.sls new file mode 100644 index 0000000..2f39b09 --- /dev/null +++ b/srfi/:250/internal/include.chezscheme.sls @@ -0,0 +1,5 @@ +;; prefer Chez’s native ‘include’, which tracks source location +;; information correctly +(library (srfi :250 internal include) + (export include) + (import (chezscheme))) diff --git a/srfi/:250/internal/include.sls b/srfi/:250/internal/include.sls new file mode 100644 index 0000000..029d866 --- /dev/null +++ b/srfi/:250/internal/include.sls @@ -0,0 +1,24 @@ +(library (srfi :250 internal include) + (export include) + (import (rnrs base) + (rnrs syntax-case) + (rnrs io ports)) + (define-syntax include + (lambda (x) + (define read-file + (lambda (fn k) + (let ((p (open-file-input-port fn + (file-options no-create) + 'block + (native-transcoder)))) + (let f ((x (get-datum p))) + (if (eof-object? x) + (begin (close-port p) '()) + (cons (datum->syntax k x) + (f (get-datum p)))))))) + (syntax-case x () + ((k filename) + (let ((fn (syntax->datum #'filename))) + (with-syntax (((exp ...) + (read-file fn #'k))) + #'(begin exp ...)))))))) diff --git a/srfi/srfi-250.scm b/srfi/srfi-250.scm new file mode 100644 index 0000000..2be33e7 --- /dev/null +++ b/srfi/srfi-250.scm @@ -0,0 +1,154 @@ +;; SRFI 250 implementation (with prelude) for Guile + +(define-module (srfi srfi-250) + #:use-module ((rnrs) + #:version (6)) + #:use-module ((scheme base) + #:select (modulo + vector-copy + vector-copy! + vector-fill!)) + #:use-module ((guile) #:select (include + procedure-name)) + #:use-module (ice-9 format) + #:use-module ((srfi srfi-9 gnu) #:select (set-record-type-printer!)) + #:use-module (srfi srfi-128) ; https://codeberg.org/pukkamustard/guile-srfi-128 + #:duplicates (last) + #:pure + #:declarative? #t + #:export (;; Constructors + make-hash-table + (prefilled-hash-table . hash-table) + hash-table-unfold + alist->hash-table + ;; Predicates + hash-table? + hash-table-contains? + hash-table-empty? + hash-table-mutable? + ;; Accessors + hash-table-ref + hash-table-ref/default + hash-table-comparator + ;; Mutators + hash-table-add! + hash-table-replace! + hash-table-set! + hash-table-delete! + hash-table-intern! + hash-table-update! + hash-table-update!/default + hash-table-pop! + hash-table-clear! + ;; The whole hash table + hash-table-size + hash-table= + hash-table-find + hash-table-count + ;; Low-level iteration + hash-table-cursor-first + hash-table-cursor-last + hash-table-cursor-for-key + hash-table-cursor-next + hash-table-cursor-previous + hash-table-cursor-key + hash-table-cursor-value + hash-table-cursor-key+value + hash-table-cursor-value-set! + hash-table-cursor-at-end? + ;; Mapping and folding + hash-table-map + hash-table-map! + hash-table-for-each + hash-table-map->list + hash-table-fold + hash-table-fold-left + hash-table-fold-right + hash-table-prune! + ;; Copying and conversion + hash-table-copy + hash-table-empty-copy + hash-table->alist + ;; Hash tables as sets + hash-table-union! + hash-table-intersection! + hash-table-difference! + hash-table-xor!)) + +(define-syntax not-on-r6rs + (syntax-rules () + ((_ body_0 body_1 ...) (begin body_0 body_1 ...)))) + +(define (void . ignored) (if #f #f)) +(define (hash-truncate h) (bitwise-and (abs h) #xFFFFFFFF)) +(define-record-type (hash-table %make-hash-table hash-table?) + (fields (immutable type-test-function) + (immutable hash-function) + (immutable same?-function) + (mutable size) + (mutable next-entry) + (mutable compact-index) + (mutable compact-index-max-fill) + (mutable keys-vector) + (mutable values-vector) + (mutable mutable?)) + (opaque #t) + (sealed #t) + (nongenerative Hash-Table-BE0AFTGAdcwHkSOxhWtxQF+Ai1g)) + +(define *unfilled* + (let () + (define-record-type (Unfilled make-unfilled unfilled?)) + (make-unfilled))) +(define (unfilled? obj) (eq? obj *unfilled*)) + +(define *deletion* + (let () + (define-record-type (Deletion make-deletion deletion?)) + (make-deletion))) +(define (deletion? obj) (eq? obj *deletion*)) + +(define *default-k* 7) +(define *growth-rate* 3/2) + +(define (hash-table-immutablize! ht) + (hash-table-mutable?-set! ht #f)) + +(include "250/internal/r6rs-compact-arrays.scm") +(include "250/hash-tables.scm") + +(set-record-type-printer! (record-type-descriptor hash-table) + (lambda (ht port) + (format port + "#= n 3) + (not (hash-table-cursor-at-end? ht cur))) + (display " ...)" port)) + ((hash-table-cursor-at-end? ht cur) + (display ")" port)) + (else + (if (> n 0) (display #\space port)) + (let ((pair (call-with-values + (lambda () (hash-table-cursor-key+value ht cur)) + cons))) + (write pair port) + (loop (+ n 1) + (hash-table-cursor-previous ht cur)))))) + (cond ((procedure-name (hash-table-type-test-function ht)) + => (lambda (name) + (format port ", key type ~s" name)))) + (cond ((procedure-name (hash-table-hash-function ht)) + => (lambda (name) + (format port ", hash fn ~s" name)))) + (cond ((procedure-name (hash-table-same?-function ht)) + => (lambda (name) + (format port ", equiv proc ~s" name)))) + (format port + ", load ~1,2f, ~d deleted>" + (/ (hash-table-next-entry ht) + (compact-array-length (hash-table-compact-index ht))) + (- (hash-table-next-entry ht) + (hash-table-size ht)))))