ChunkedArray¶
arro3.core.ChunkedArray ¶
An Arrow ChunkedArray.
__arrow_c_stream__ ¶
An implementation of the Arrow PyCapsule Interface. This dunder method should not be called directly, but enables zero-copy data transfer to other Python libraries that understand Arrow memory.
For example (as of pyarrow v16), you can call
pyarrow.chunked_array() to convert this array into a
pyarrow ChunkedArray without copying memory.
cast ¶
cast(target_type: ArrowSchemaExportable) -> ChunkedArray
Cast array values to another data type.
Parameters:
- target_type (ArrowSchemaExportable) – Type to cast array to.
chunk ¶
combine_chunks ¶
combine_chunks() -> Array
Flatten this ChunkedArray into a single non-chunked array.
equals ¶
equals(other: ArrowStreamExportable) -> bool
Return whether the contents of two chunked arrays are equal.
from_arrow classmethod ¶
from_arrow(input: ArrowArrayExportable | ArrowStreamExportable) -> ChunkedArray
Construct a ChunkedArray from an existing Arrow object.
It can be called on anything that exports the Arrow stream interface (has an
__arrow_c_stream__ method). All batches from the stream will be materialized
in memory.
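As a rough illustration of the duck typing involved, the sketch below checks which Arrow PyCapsule Interface method an object exposes before it would be handed to from_arrow. The helper `describe_arrow_input` and the class `FakeStreamProducer` are hypothetical names introduced only for this example; they are not part of arro3, and the check is an assumption about how a caller might validate input, not a description of what from_arrow does internally.

```python
def describe_arrow_input(obj) -> str:
    """Report which Arrow PyCapsule Interface method `obj` exposes.

    Hypothetical helper for illustration; not part of arro3.
    """
    if callable(getattr(obj, "__arrow_c_stream__", None)):
        return "stream"
    if callable(getattr(obj, "__arrow_c_array__", None)):
        return "array"
    return "unsupported"


class FakeStreamProducer:
    """Hypothetical stand-in for a library object that exports a stream.

    A real producer would return a PyCapsule named "arrow_array_stream";
    this one only exists so the duck-typing check has something to find.
    """

    def __arrow_c_stream__(self, requested_schema=None):
        raise NotImplementedError("illustration only")


print(describe_arrow_input(FakeStreamProducer()))
print(describe_arrow_input([1, 2, 3]))
```

A plain Python list exposes neither method, so under this sketch it would need to be converted to an Arrow object first.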
from_arrow_pycapsule classmethod ¶
from_arrow_pycapsule(capsule) -> ChunkedArray
Construct a ChunkedArray from a bare Arrow PyCapsule.
rechunk ¶
rechunk(*, max_chunksize: int | None = None) -> ChunkedArray
Rechunk a ChunkedArray with a maximum number of rows per chunk.
Parameters:
- max_chunksize (int | None, default: None) – The maximum number of rows per internal array. Defaults to None, which rechunks into a single array.
Returns:
- ChunkedArray – The rechunked ChunkedArray.
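The effect of max_chunksize can be pictured with a plain-Python model in which a chunked array is a list of lists. This is only a sketch of the documented behaviour, not arro3's implementation: `rechunk_model` is a hypothetical name, and the choice to fill each chunk greedily up to the limit is an assumption, since the description above only promises a maximum number of rows per chunk.

```python
from typing import List, Optional


def rechunk_model(chunks: List[list], max_chunksize: Optional[int] = None) -> List[list]:
    """Plain-list model of rechunking (hypothetical, not arro3 code)."""
    # Logical contents are the concatenation of all chunks, in order.
    flat = [value for chunk in chunks for value in chunk]
    if max_chunksize is None:
        # Documented default: everything ends up in a single array.
        return [flat]
    if max_chunksize <= 0:
        raise ValueError("max_chunksize must be a positive integer")
    # Assumption: fill each chunk up to the limit before starting the next.
    return [flat[i:i + max_chunksize] for i in range(0, len(flat), max_chunksize)]


chunks = [[1, 2, 3], [4], [5, 6, 7, 8]]
print(rechunk_model(chunks, max_chunksize=3))  # [[1, 2, 3], [4, 5, 6], [7, 8]]
print(rechunk_model(chunks))                   # [[1, 2, 3, 4, 5, 6, 7, 8]]
```

In both cases the logical sequence of values is unchanged; only the chunk boundaries move.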
slice ¶
slice(offset: int = 0, length: int | None = None) -> ChunkedArray
Compute a zero-copy slice of this ChunkedArray.
Parameters:
- offset (int, default: 0) – Offset from start of array to slice. Defaults to 0.
- length (int | None, default: None) – Length of slice. Defaults to None, which slices until the end of the array starting from offset.
Returns:
- ChunkedArray – New ChunkedArray.
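How offset and length map onto the underlying chunks can be sketched with the same list-of-lists model. `slice_model` is a hypothetical name and this is not arro3's implementation. Two assumptions are worth flagging: the model clamps an out-of-range offset or length to the array bounds, whereas the library may raise instead, and Python list slicing copies data, whereas an Arrow slice is a view over the existing buffers.

```python
from typing import List, Optional


def slice_model(chunks: List[list], offset: int = 0, length: Optional[int] = None) -> List[list]:
    """Plain-list model of slicing across chunk boundaries (hypothetical)."""
    if offset < 0:
        raise ValueError("offset must be non-negative")
    total = sum(len(chunk) for chunk in chunks)
    start = min(offset, total)  # assumption: clamp rather than raise
    stop = total if length is None else min(start + length, total)

    result = []
    position = 0  # logical index of the first element of the current chunk
    for chunk in chunks:
        chunk_start, chunk_end = position, position + len(chunk)
        lo = max(start, chunk_start)
        hi = min(stop, chunk_end)
        if lo < hi:
            # Keep only the part of this chunk that overlaps [start, stop).
            result.append(chunk[lo - chunk_start:hi - chunk_start])
        position = chunk_end
    return result


chunks = [[1, 2, 3], [4], [5, 6, 7, 8]]
print(slice_model(chunks, offset=2, length=4))  # [[3], [4], [5, 6]]
print(slice_model(chunks, offset=5))            # [[6, 7, 8]]
```

The model keeps the original chunk boundaries and simply trims the first and last overlapping chunks, which is what lets a real Arrow slice avoid copying.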