ChunkedArray

arro3.core.ChunkedArray

An Arrow ChunkedArray.

chunks property

chunks: list[Array]

Convert to a list of single-chunked arrays.

nbytes property

nbytes: int

Total number of bytes consumed by the elements of the chunked array.

null_count property

null_count: int

Number of null entries.

num_chunks property

num_chunks: int

Number of underlying chunks.

type property

type: DataType

Return data type of a ChunkedArray.

__arrow_c_stream__

__arrow_c_stream__(requested_schema: object | None = None) -> object

An implementation of the Arrow PyCapsule Interface. This dunder method should not be called directly, but enables zero-copy data transfer to other Python libraries that understand Arrow memory.

For example (as of pyarrow v16), you can call pyarrow.chunked_array() to convert this array into a pyarrow ChunkedArray without copying memory.

cast

cast(target_type: ArrowSchemaExportable) -> ChunkedArray

Cast array values to another data type.

Parameters:

  • target_type (ArrowSchemaExportable) –

    The data type to cast to.

chunk

chunk(i: int) -> Array

Select a chunk by its index.

Parameters:

  • i (int) –

    chunk index.

Returns:

  • Array –

    The chunk at index i.

combine_chunks

combine_chunks() -> Array

Flatten this ChunkedArray into a single non-chunked array.
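The relationship between num_chunks, chunk(i), length(), null_count and combine_chunks() can be illustrated with a small pure-Python model. This is a sketch only: ChunkedModel is a hypothetical class, not part of arro3, and it uses plain lists in place of Arrow arrays with None standing in for a null entry.

```python
class ChunkedModel:
    """Hypothetical stand-in for a chunked array; each chunk is a plain list."""

    def __init__(self, chunks):
        self.chunks = [list(chunk) for chunk in chunks]

    @property
    def num_chunks(self):
        # Number of underlying chunks.
        return len(self.chunks)

    @property
    def null_count(self):
        # None models a null entry in this sketch.
        return sum(value is None for chunk in self.chunks for value in chunk)

    def chunk(self, i):
        # Select a chunk by its index.
        return self.chunks[i]

    def length(self):
        # Total number of rows across all chunks.
        return sum(len(chunk) for chunk in self.chunks)

    def combine_chunks(self):
        # Flatten every chunk, in order, into one contiguous list.
        return [value for chunk in self.chunks for value in chunk]


model = ChunkedModel([[1, None], [3], [None, 5, 6]])
print(model.num_chunks)        # 3
print(model.chunk(1))          # [3]
print(model.length())          # 6
print(model.null_count)        # 2
print(model.combine_chunks())  # [1, None, 3, None, 5, 6]
```

The logical contents are the same before and after combining; only the physical layout changes from several chunks to one.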

equals

equals(other: ArrowStreamExportable) -> bool

Return whether the contents of two chunked arrays are equal.

from_arrow classmethod

Construct this from an existing Arrow object.

It can be called on anything that exports the Arrow stream interface (has an __arrow_c_stream__ method). All batches from the stream will be materialized in memory.

from_arrow_pycapsule classmethod

from_arrow_pycapsule(capsule) -> ChunkedArray

Construct this object from a bare Arrow PyCapsule.

length

length() -> int

Return length of a ChunkedArray.

rechunk

rechunk(*, max_chunksize: int | None = None) -> ChunkedArray

Rechunk a ChunkedArray with a maximum number of rows per chunk.

Parameters:

  • max_chunksize (int | None, default: None ) –

    The maximum number of rows per internal array. Defaults to None, which rechunks into a single array.

Returns:

  • ChunkedArray –

    The rechunked ChunkedArray.
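As an illustration of what max_chunksize means, the following pure-Python sketch models rechunking with plain lists. The function rechunk_model is hypothetical and not part of arro3. It assumes that rows keep their order and that every chunk except possibly the last holds exactly max_chunksize rows, which is one plausible reading of the description above.

```python
def rechunk_model(chunks, max_chunksize=None):
    """Model of rechunking; `chunks` is a list of lists standing in for Arrow arrays."""
    # Gather all rows in their original order.
    rows = [value for chunk in chunks for value in chunk]
    if not rows:
        return []
    # None means "a single chunk holding every row", as the default is documented.
    size = len(rows) if max_chunksize is None else max_chunksize
    if size <= 0:
        # Assumption: a non-positive chunk size is treated as an error here.
        raise ValueError("max_chunksize must be positive")
    # Cut the rows into consecutive pieces of at most `size` rows each.
    return [rows[start:start + size] for start in range(0, len(rows), size)]


chunks = [[1, 2, 3], [4], [5, 6, 7]]
print(rechunk_model(chunks, max_chunksize=2))  # [[1, 2], [3, 4], [5, 6], [7]]
print(rechunk_model(chunks))                   # [[1, 2, 3, 4, 5, 6, 7]]
```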

slice

slice(offset: int = 0, length: int | None = None) -> ChunkedArray

Compute a zero-copy slice of this ChunkedArray.

Parameters:

  • offset (int, default: 0 ) –

    Offset from start of array to slice. Defaults to 0.

  • length (int | None, default: None ) –

    Length of slice (default is until end of array starting from offset).

Returns:

  • ChunkedArray –

    The sliced ChunkedArray.
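To show how offset and length interact with chunk boundaries, here is a pure-Python sketch that models chunks as plain lists. The function slice_model is hypothetical and not part of arro3. Python list slices copy their data, so this models only which rows are selected, not the zero-copy behavior. Raising on an out-of-range offset is an assumption of the sketch; the library may behave differently.

```python
def slice_model(chunks, offset=0, length=None):
    """Model of slicing across chunks; `chunks` is a list of lists."""
    total = sum(len(chunk) for chunk in chunks)
    if not 0 <= offset <= total:
        # Assumption of this sketch, not documented library behavior.
        raise ValueError("offset out of range")
    # With length=None the slice runs to the end of the array.
    remaining = total - offset if length is None else min(length, total - offset)
    result = []
    for chunk in chunks:
        if remaining <= 0:
            break
        if offset >= len(chunk):
            # The slice starts after this chunk; skip it entirely.
            offset -= len(chunk)
            continue
        # Take the part of this chunk that overlaps the requested range.
        piece = chunk[offset:offset + remaining]
        result.append(piece)
        remaining -= len(piece)
        offset = 0
    return result


chunks = [[1, 2, 3], [4], [5, 6, 7]]
print(slice_model(chunks, offset=2, length=3))  # [[3], [4], [5]]
print(slice_model(chunks, offset=4))            # [[5, 6, 7]]
```

In this model a slice that spans a chunk boundary yields several smaller chunks rather than one merged chunk, which is how a slice can avoid copying the underlying buffers.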

to_numpy

to_numpy() -> NDArray

Copy this array to a numpy NDArray.