-
Notifications
You must be signed in to change notification settings - Fork 269
Lock hash tables during augmented assignment #4017
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: development
Are you sure you want to change the base?
Lock hash tables during augmented assignment #4017
Conversation
|
I feel like it would be cleaner to handle The think I can't remember is if we could have a circumstance where "#" or "." is overridden with something weird. |
|
I don't think # or . can be overridden at top-level (yet!) but I consider this to be higher priority than allowing top-level to override. This is extremely troublesome for me right now so I'll take any solution that works! |
|
. and # can't be overridden. I used to not like that, but now I kind of do. We have other operators (_, and I use @@ in the Python package for .) that we can use, keeping these two nice and low level for getting at the underlying list or hash table. I like Jay's idea. I'll play around with it soon |
|
(I deeply dislike @@ btw, you should just move the relevant types to the interpreter and use . I think) |
|
Also, I didn't mention it earlier, you have to make sure to evaluate the right hand side before locking on the table. This is a bit subtle about order of operations, but if the right hand side also uses the same hash table (which isn't crazy) then you would have the same recursive lock issue. I think this makes This is a compromise that I think is fine, but I want to point this out. In theory a custom operator can make this even worse if it looks into the table somehow, but that shouldn't happen with sensible operators. |
|
@d-torrance I can't help until after the semester but any luck with this? |
|
I haven't taken a stab at it yet, but hopefully soon! |
|
@jkyang92 in principle I agree with everything in your comment, but I struggle to see a clean way to implement it. Do you have a any suggestions? |
|
Inside of the In the hash table case, you can then:
The only danger with this is if the operator has a user defined method, it could in theory relock the hash table in step 4. You could also add a bit of code before step 1 to take a read lock on the hash table and check if the entry even exists before proceeding, I don't know that this expense is worth it. |
|
Another approach that would have fewer worries about deadlock/relocking would be to "black hole" the entry in the hash table when we are modifying it. This entails putting a placeholder entry into the hash table that indicates "this entry is being modified". The black hole then should indicate which thread is currently modifying it, and have appropriate condition variables/mutexes such that other threads can wait on it. (If creating one per black hole seems expensive you can do one per thread). It's then an error (which can be sent back to the user) if you try to modify (but not read?) an entry which the current thread has black holed, but any other thread will just wait on the black hole. To avoid deadlock, you'd need special handling of waiting on black holed entries while having created black holed entries, but as long as you allow reads through the black hole, this is not a huge issue. The advantages is that this allows not holding the lock while doing the actual operation and only when actually writing the entry back to the hash table (you still have to take a write lock to add the black hole). There are complicated questions about what sequencing of operations this corresponds to, and annoyingly, it requires a check for black holes at every write into the hash table. |
This is the beginnings of an attempt at fixing #3928.
Here's the idea: We'd like something like
x.a += 1to be atomic whenxis a mutable hash table. So at the beginning of augmented assignment, we look to see if the left hand side is binary code with.or#as its binary operator, and then check if the left hand side of that is a hash table. If so, go ahead and lock it, and stuff the hash table in a variable so we know how to unlock it when we're done.The big problem, however, is that we currently implement hash table mutexes with a rwlock, which isn't recursive. In particular, if we lock the mutex, we can't lock it again in the same thread or we deadlock. But we're definitely going to lock it at least one more time when we call
lookupto get the value at the given key.So in the current draft, M2 immediately aborts because it runs into this problem on startup.
A possible idea would be to switch from rwlock's to recursive mutexes for hash tables. But hash tables are so fundamental in M2 that I'd rather wait until after the release to try this to give us plenty of time to test any repercussions.
Cc: @mahrud