Persistent hashing for DAGs #547

@inducer


This is a follow-up to #546. I sympathize with wanting to hash DAGs persistently, and I don't see a good reason why we shouldn't. At the same time, needing to hash data that physically sits on a GPU is weird and fraught with all sorts of complications, as #546 demonstrates.

For a moment, I was tempted to say that we should insist that DataWrappers should only contain data on the host, but that's nonsense, because it'll destroy any type of eager-ish thaw/freeze operation by overwhelming it with data transfer cost of questionable utility. (How to cache those compiles is a separate question; let's have that discussion but maybe not now.)

The main place in which this hashing is relevant is pytato-to-loopy transforms/code generation. I previously assumed this was quick, but recent data from @matthiasdiener and @majosm perhaps suggests otherwise.

So why don't we just introduce a small mapper that transfers all the array data to the host before transforms/codegen and another that transfers them back after? Then, at least in between those two transfers, we can use all the hashing we need.
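To make the idea concrete, here is a rough, self-contained sketch. It deliberately does not use pytato's actual mapper classes: `FakeDeviceArray`, the toy `DataWrapper` and `Sum` node types, `map_data`, `to_host`, `to_device`, and `persistent_hash` are all hypothetical stand-ins invented for illustration, and NumPy arrays play the role of host data. The point is only the shape of the approach: transfer, hash freely, transfer back.

```python
import hashlib
from dataclasses import dataclass

import numpy as np


class FakeDeviceArray:
    """Hypothetical stand-in for an array living on a GPU."""

    def __init__(self, host_data):
        self._data = np.array(host_data)

    def get(self):
        # Mimics a device-to-host copy.
        return self._data.copy()


@dataclass(frozen=True, eq=False)
class DataWrapper:
    """Toy leaf node holding array data (not pytato's class)."""
    data: object


@dataclass(frozen=True, eq=False)
class Sum:
    """Toy interior node with two children."""
    left: object
    right: object


def map_data(expr, f, cache=None):
    """Rebuild the DAG, applying f to the data of every DataWrapper.

    Results are memoized by node identity so shared subexpressions
    stay shared in the rebuilt DAG.
    """
    if cache is None:
        cache = {}
    key = id(expr)
    if key in cache:
        return cache[key]
    if isinstance(expr, DataWrapper):
        result = DataWrapper(f(expr.data))
    elif isinstance(expr, Sum):
        result = Sum(map_data(expr.left, f, cache),
                     map_data(expr.right, f, cache))
    else:
        raise TypeError(f"unexpected node type: {type(expr).__name__}")
    cache[key] = result
    return result


def to_host(expr):
    return map_data(
        expr, lambda d: d.get() if isinstance(d, FakeDeviceArray) else d)


def to_device(expr):
    return map_data(
        expr, lambda d: FakeDeviceArray(d) if isinstance(d, np.ndarray) else d)


def persistent_hash(expr):
    """Content-based hash; only defined when all data is on the host."""
    h = hashlib.sha256()

    def rec(e):
        if isinstance(e, DataWrapper):
            if not isinstance(e.data, np.ndarray):
                raise TypeError("refusing to hash data that is not on the host")
            h.update(b"DataWrapper")
            h.update(str(e.data.dtype).encode())
            h.update(str(e.data.shape).encode())
            h.update(np.ascontiguousarray(e.data).tobytes())
        else:
            h.update(b"Sum(")
            rec(e.left)
            rec(e.right)
            h.update(b")")

    rec(expr)
    return h.hexdigest()


x = DataWrapper(FakeDeviceArray([1.0, 2.0]))
dag = Sum(x, x)

host_dag = to_host(dag)            # transfer before transforms/codegen
key = persistent_hash(host_dag)    # hashing is unproblematic here
device_dag = to_device(host_dag)   # transfer back afterwards
```

In this sketch the hash depends only on node structure and array contents, so it is stable across processes, and the only transfer cost is paid at the two boundaries around transforms/codegen rather than on every thaw/freeze.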

cc @matthiasdiener @kaushikcfd
