I have the following situation, roughly: I have a loop where I need to load some parameters and data on every iteration. In the loop, the parameters are cast to JAX arrays by a function that I have limited control over (i.e. it is wrapped up in library code), while the data is loaded as numpy arrays. There are two issues, which the code block below illustrates:
from functools import partial

import jax
import jax.numpy as jnp
import numpy as np
import pandas as pd
from jax import Array

def load_parameters(dataframe: pd.DataFrame, index: int):
    # Load parameters from dataframe as numpy
    parameters_numpy = f(dataframe, index)
    return jnp.asarray(parameters_numpy)

def preprocess_parameters(parameters: Array):
    # Preprocess parameters; low computational burden and
    # better suited for numpy
    parameters_preprocessed = g(parameters)
    return parameters_preprocessed

def shard_parameters(parameters: Array):
    # Run jax.device_put
    shard = ...
    return jax.device_put(parameters, shard)

def load_data(filename):
    data = np.asarray(…)  # load data as numpy array
    return data

@partial(jax.jit, …)  # add in_sharding? (see the sketch after the two alternatives below)
def do_compute(parameters, data, …):
    output = g(parameters, data)
    return output

dataframe = …
filenames = …
for index, filename in enumerate(filenames):
    parameters_jax = load_parameters(dataframe, index)
    parameters_jax = preprocess_parameters(parameters_jax)
    parameters_jax = shard_parameters(parameters_jax)
    data_numpy = load_data(filename)
    output = do_compute(parameters_jax, data_numpy, …)

The only thing I can think of, which feels like a hack, is to use the jax.default_device("cpu") context manager:

#
# Either load then convert back to numpy (are there array copies?)
#
dataframe = …
filenames = …
for index, filename in enumerate(filenames):
    with jax.default_device("cpu"):
        parameters = load_parameters(dataframe, index)
        parameters = jax.tree.map(lambda x: np.asarray(x), parameters)
    parameters = preprocess_parameters(parameters)
    parameters = shard_parameters(parameters)
    data_numpy = load_data(filename)
    # … or add jax.device_put
    output = do_compute(parameters, data_numpy, …)
#
# Or use CPU JAX for preprocessing
#
dataframe = …
filenames = …
for index, filename in enumerate(filenames):
    with jax.default_device("cpu"):
        parameters = load_parameters(dataframe, index)
        parameters = preprocess_parameters(parameters)
    parameters = shard_parameters(parameters)
    data_numpy = load_data(filename)
    # … or add jax.device_put (see the sketch below)
    output = do_compute(parameters, data_numpy, …)
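Not part of the original post, but to make the "add in_sharding?" and "… or add jax.device_put" comments above concrete, here is a minimal sketch of explicitly placing the inputs. The mesh, param_sharding, and data_sharding names, the single-axis mesh, and the two-argument do_compute are all assumptions for illustration:

from functools import partial

import numpy as np
import jax
from jax.sharding import Mesh, NamedSharding, PartitionSpec

# Assumed single-axis mesh over all visible devices (illustrative only).
mesh = Mesh(np.array(jax.devices()), ("devices",))
param_sharding = NamedSharding(mesh, PartitionSpec())          # parameters replicated
data_sharding = NamedSharding(mesh, PartitionSpec("devices"))  # data split along its leading axis

# One option: let jit place its inputs via in_shardings
# (shown for a two-argument do_compute; the real signature has more arguments).
@partial(jax.jit, in_shardings=(param_sharding, data_sharding))
def do_compute(parameters, data):
    return parameters @ data  # placeholder computation

# Another option: place the loaded numpy data explicitly before the call.
output = do_compute(parameters, jax.device_put(data_numpy, data_sharding))

Either way, the leading axis of the data must be divisible by the number of devices for the sharded layout shown here.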
Replies: 1 comment 5 replies
I think using jax.default_device('cpu') is a good approach for what you describe. Another option would be to use jax.experimental.io_callback within your main program to call back to the host and do any data loading and/or preprocessing with NumPy.
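Not spelled out in the reply, but a minimal sketch of the io_callback option might look like the following; the host_preprocess helper, the declared shape/dtype, and the placeholder computation are assumptions for illustration:

import numpy as np
import jax
from jax.experimental import io_callback

def host_preprocess(parameters):
    # Runs on the host and receives a NumPy array, so any NumPy code is fine here.
    return np.asarray(parameters) * 2.0  # placeholder for the real preprocessing

@jax.jit
def do_compute(parameters, data):
    # Declare the callback's output shape/dtype, then call back to the host.
    out_spec = jax.ShapeDtypeStruct(parameters.shape, parameters.dtype)
    parameters = io_callback(host_preprocess, out_spec, parameters)
    return parameters @ data  # placeholder computation

The callback is mainly useful when the NumPy step has to happen inside the jitted computation; if the preprocessing can stay outside jit, calling NumPy directly in the loop (as in the options above) avoids the callback entirely.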