Efficiently serializing HashMaps and HashSets #25

@basvandijk

Hi Johan,

I would like an efficient way to serialize HashMaps and HashSets with cereal. I think going through lists is inefficient, because the hash of every key has to be recomputed when the structure is rebuilt from the list (I have not benchmarked this yet, so I could be wrong). These are the list-based instances:

instance (Serialize a, Hashable a, Eq a) => Serialize (HS.HashSet a) where
    get = HS.fromList <$> get
    put = put . HS.toList

instance (Serialize k, Serialize v, Eq k, Hashable k) => Serialize (HMS.HashMap k v) where
    get = HMS.fromList <$> get
    put = put . HMS.toList

So ideally I would like the following instances:

instance (Serialize k, Serialize v) => Serialize (FullList k v) where
    put (FL k v fl) = put k >> put v >> put fl
    get = liftM3 FL get get get

instance (Serialize k, Serialize v) => Serialize (List k v) where
    put Nil          = putWord8 0
    put (Cons k v l) = putWord8 1 >> put k >> put v >> put l
    get = do tag <- getWord8
             case tag of
               0 -> return Nil 
               1 -> liftM3 Cons get get get
               _ -> fail "Data.FullList.Lazy: Decoding error: unknown tag"

instance (Serialize k, Serialize v) => Serialize (HashMap k v) where
    put (Bin sm l r) = putWord8 0 >> put sm >> put l >> put r
    put (Tip h fl)   = putWord8 1 >> put h >> put fl
    put Nil          = putWord8 2

    get = do tag <- getWord8
             case tag of
               0 -> liftM3 Bin get get get
               1 -> liftM2 Tip get get
               2 -> return Nil
               _ -> fail "Data.HashMap.Common: Decoding error: unknown tag"
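
To show the tag-byte scheme in isolation, here is a minimal sketch that uses only base. It is not the unordered-containers representation: Tree, encodeTree, decodeTree and roundTrip are hypothetical names, the tree is a toy stand-in for the Bin/Tip/Nil constructors, and a plain list of bytes stands in for cereal's Put and Get. The point is only that the stored hash in each Tip is written out and read back as-is, so decoding never has to hash a key again.

```haskell
import Data.Word (Word8)

-- Hypothetical toy tree standing in for the internal HashMap constructors.
-- The first Word8 in Tip plays the role of the stored hash.
data Tree
  = Bin Tree Tree
  | Tip Word8 Word8 -- stored hash, payload
  | Nil
  deriving (Eq, Show)

-- Encode structurally: one tag byte per constructor, followed by its fields.
encodeTree :: Tree -> [Word8]
encodeTree (Bin l r) = 0 : encodeTree l ++ encodeTree r
encodeTree (Tip h v) = [1, h, v]
encodeTree Nil       = [2]

-- Decode one tree from the front of the input and return the leftover bytes.
-- The stored hash is read back directly; nothing is rehashed.
decodeTree :: [Word8] -> Either String (Tree, [Word8])
decodeTree (0 : rest) = do
  (l, rest')  <- decodeTree rest
  (r, rest'') <- decodeTree rest'
  return (Bin l r, rest'')
decodeTree (1 : h : v : rest) = Right (Tip h v, rest)
decodeTree (2 : rest)         = Right (Nil, rest)
decodeTree (1 : _)            = Left "truncated Tip"
decodeTree (t : _)            = Left ("unknown tag " ++ show t)
decodeTree []                 = Left "unexpected end of input"

-- Encode and decode again, rejecting any bytes left over after the tree.
roundTrip :: Tree -> Either String Tree
roundTrip t = case decodeTree (encodeTree t) of
  Right (t', []) -> Right t'
  Right (_, _)   -> Left "trailing bytes"
  Left err       -> Left err

main :: IO ()
main = print (roundTrip (Bin (Tip 3 10) (Bin Nil (Tip 7 20))))
```

The decoder mirrors the encoder constructor by constructor, which is the same shape as the get/put pairs above. It can only be written with access to the constructors.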

However, this requires access to the constructors, which is only possible in one of two ways:

  1. Add these instances to unordered-containers itself, which unfortunately adds a dependency on cereal.
  2. Export the constructors from a new Data.HashMap.Unsafe module. This module can be marked with {-# LANGUAGE Unsafe #-} so that Safe modules cannot import it. Users like me can then write the (orphan) instances in our own applications.

Do you have a preference?

Regards,

Bas
