Skip to content

Explore using weak pointers to clean up unused versions #16

@noughtmare

Description

@noughtmare

I've been experimenting with using weak pointers to remove unused versions from the history chain. If a sequence versions somewhere in the middle of the chain is unused, their diffs could be combined into an IntMap (see also #13) which has logarithmic time lookup, so this could significantly improve the time it takes to read from old versions of the array.

I tried to start with a simple experiment: implement a list whose elements disappear when they are no longer directly referenced. I think that nicely models the behaviour we want. Unfortunately, it does not work yet and I don't have any more time this weekend. Here's my attempt:

{-# LANGUAGE LambdaCase #-}
import System.Mem.Weak
import System.Mem
import System.IO
import Control.Concurrent
import Data.IORef
import Control.Monad
import Data.Coerce

newtype WeakList a = WL (IORef (WeakListData a))
data WeakListData a = Nil | Cons a !(Weak (WeakList a))

nil :: IO (WeakList a)
nil = WL <$> newIORef Nil

cons :: Show a => a -> WeakList a -> IO (WeakList a)
cons x xs@(WL ref) = do
  ref' <- newIORef Nil
  w <- readIORef ref >>= \case
    Nil -> mkWeakIORef ref (pure ())
    Cons y ys -> mkWeakIORef ref $ print y >> writeIORef ref' (Cons x ys) 
  writeIORef ref' (Cons x (coerce w))
  pure (WL ref')

toList :: WeakList a -> IO [a]
toList (WL ref) = readIORef ref >>= \case
  Nil -> pure []
  Cons x w -> deRefWeak w >>= \case
    Nothing -> pure [x]
    Just xs -> (x :) <$> toList xs

main = do
  hSetBuffering stdout LineBuffering
  xs <- cons 3 =<< cons 2 =<< cons (1 :: Int) =<< nil
  print =<< toList xs
  ys <- cons 6 =<< cons 5 =<< cons 4 xs
  print =<< toList ys
  performGC
  threadDelay 100000
  print =<< toList xs
  print =<< toList ys

It currently prints:

[3,2,1]
[6,5,4,3,2,1]
2
1
[3]
[6,5,4,3]

I wanted it to print:

[3,2,1]
[6,5,4,3,2,1]
5
4
2
1
[3]
[6,3]

Of course this probably would have way too much overhead, but this could show that it is possible to get the GC to help us in theory.

Metadata

Metadata

Assignees

No one assigned

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions