Replies: 1 comment
-
This code seems to work:

    import dask.dataframe as dd

    # client, manifest_path and make_fingerprint_feature are defined elsewhere
    def dask_worker2():
        ddf = dd.read_csv(manifest_path)
        ddf = ddf.repartition(npartitions=4)

        def mff_wrapper(dfd):
            # materialize one partition as a pandas DataFrame
            df = dfd.compute()
            return df.smiles.apply(make_fingerprint_feature)

        # one task per partition
        futures = client.map(mff_wrapper, ddf.to_delayed())
        results = client.gather(futures)
        return results

Is this a typical way to hand a partitioned dataframe to a distributed client?
-
Here are my attempts:
First I tried to do the above using dask_worker1, but realized this is an anti-pattern for dataframes with a large number of rows. So I made another one, dask_worker2, but it complains. Is there any good way to use a numpy array as the return type?