|
17 | 17 |
|
18 | 18 | #' Experimental Arrow encoded arrays as R vectors |
19 | 19 | #' |
| 20 | +#' This experimental vctr class allows zero or more Arrow arrays to |
| 21 | +#' present as an R vector without converting them. This is useful for arrays |
| 22 | +#' with types that do not have a non-lossy R equivalent, and helps provide an |
| 23 | +#' intermediary object type where the default conversion is prohibitively |
| 24 | +#' expensive (e.g., a nested list of data frames). These objects will not |
| 25 | +#' survive many vctr transformations; however, they can be sliced without |
| 26 | +#' copying the underlying arrays. |
| 27 | +#' |
| 28 | +#' The nanoarrow_vctr is currently implemented similarly to `factor()`: its |
| 29 | +#' storage type is an `integer()` that is a sequence along the total length |
| 30 | +#' of the vctr and there are attributes that are required to resolve these |
| 31 | +#' indices to an array + offset. Sequences typically have a very compact |
| 32 | +#' representation in recent versions of R such that this has a cheap storage |
| 33 | +#' footprint even for large arrays. The attributes are currently: |
| 34 | +#' |
| 35 | +#' - `schema`: The [nanoarrow_schema][as_nanoarrow_schema] shared by each chunk. |
| 36 | +#' - `chunks`: A `list()` of `nanoarrow_array`. |
| 37 | +#' - `offsets`: An `integer()` vector beginning with `0` and followed by the |
| 38 | +#' cumulative length of each chunk. This allows the chunk index + offset |
| 39 | +#' to be resolved from a logical index with `O(log(n))` complexity. |
| 40 | +#' |
| 41 | +#' This implementation is preliminary and may change; however, the result of |
| 42 | +#' `as_nanoarrow_array_stream(some_vctr[begin:end])` should remain stable. |
| 43 | +#' |
20 | 44 | #' @param x An object that works with [as_nanoarrow_array_stream()]. |
21 | 45 | #' @param subclass An optional subclass of nanoarrow_vctr to prepend to the |
22 | 46 | #' final class name. |
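
The `offsets` lookup described in the documentation above can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: `resolve_chunk()` is a hypothetical helper that does not exist in nanoarrow, and it assumes 1-based logical indices and 0-based offsets within each chunk. It relies on base R's `findInterval()`, which uses a binary search over the sorted `offsets` vector.

```r
# Hypothetical helper (not part of nanoarrow): resolve 1-based logical
# indices to a chunk index plus a 0-based offset within that chunk.
resolve_chunk <- function(i, offsets) {
  # `offsets` begins with 0 and is followed by the cumulative chunk lengths
  i0 <- as.integer(i) - 1L
  # findInterval() locates each i0 in the sorted breakpoints by binary search
  chunk <- findInterval(i0, offsets)
  data.frame(chunk = chunk, offset = i0 - offsets[chunk])
}

# Three chunks of lengths 3, 2 and 4 give offsets 0, 3, 5, 9
offsets <- c(0L, cumsum(c(3L, 2L, 4L)))
resolve_chunk(c(1L, 3L, 4L, 9L), offsets)
```

Because the breakpoints are cumulative lengths, an index past the last element resolves to a chunk number greater than the number of chunks, so a real implementation would need a bounds check first.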