
ChunkedArray

arro3.core.ChunkedArray

An Arrow ChunkedArray.

chunks property

chunks: list[Array]

Convert to a list of single-chunked arrays.

field property

field: Field

Access the field stored on this ChunkedArray.

Note that this field usually has no name, but it may carry metadata indicating that this array is an extension (user-defined type) array.

nbytes property

nbytes: int

Total number of bytes consumed by the elements of the chunked array.

null_count property

null_count: int

Number of null entries.

num_chunks property

num_chunks: int

Number of underlying chunks.

type property

type: DataType

Return data type of a ChunkedArray.

__arrow_c_schema__

__arrow_c_schema__() -> object

An implementation of the Arrow PyCapsule Interface. This dunder method should not be called directly, but enables zero-copy data transfer to other Python libraries that understand Arrow memory.

This allows Arrow consumers to inspect the data type of this ChunkedArray. Then the consumer can ask the producer (in __arrow_c_stream__) to cast the exported data to a supported data type.

__arrow_c_stream__

__arrow_c_stream__(requested_schema: object | None = None) -> object

An implementation of the Arrow PyCapsule Interface. This dunder method should not be called directly, but enables zero-copy data transfer to other Python libraries that understand Arrow memory.

For example, as of pyarrow v16, you can pass this object to pyarrow.chunked_array() to convert it into a pyarrow ChunkedArray without copying memory.
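Consumers of the PyCapsule Interface usually detect a producer by duck typing rather than by checking for a concrete class. The sketch below illustrates that check. Both `exports_arrow_stream` and `FakeProducer` are hypothetical names introduced for illustration; neither is part of arro3 or pyarrow.

```python
def exports_arrow_stream(obj) -> bool:
    """Return True if obj advertises the Arrow C stream PyCapsule export."""
    return callable(getattr(obj, "__arrow_c_stream__", None))


class FakeProducer:
    """Hypothetical stand-in; a real producer returns a PyCapsule here."""

    def __arrow_c_stream__(self, requested_schema=None):
        raise NotImplementedError("illustration only")


print(exports_arrow_stream(FakeProducer()))  # object defines the dunder
print(exports_arrow_stream([1, 2, 3]))       # plain list does not
```

An object that passes this check is the kind of input that pyarrow.chunked_array() accepts in the conversion described above.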

cast

cast(target_type: ArrowSchemaExportable) -> ChunkedArray

Cast array values to another data type.

Parameters:

  • target_type (ArrowSchemaExportable) –

    Data type to cast the array values to.

chunk

chunk(i: int) -> Array

Select a chunk by its index.

Parameters:

  • i (int) –

    chunk index.

Returns:

  • Array –

    The chunk at index i.

combine_chunks

combine_chunks() -> Array

Flatten this ChunkedArray into a single non-chunked array.

equals

equals(other: ArrowStreamExportable) -> bool

Return whether the contents of two chunked arrays are equal.

from_arrow classmethod

Construct this from an existing Arrow object.

It can be called on anything that exports the Arrow stream interface (has an __arrow_c_stream__ method). All batches from the stream will be materialized in memory.

from_arrow_pycapsule classmethod

from_arrow_pycapsule(capsule) -> ChunkedArray

Construct this object from a bare Arrow PyCapsule.

length

length() -> int

Return length of a ChunkedArray.
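Like null_count and num_chunks, the length is an aggregate over the underlying chunks. The sketch below models a chunked array as a plain list of Python lists, with None standing in for a null. It is a conceptual illustration only and does not use arro3.

```python
# Model: each inner list is one chunk; None represents a null entry.
chunks = [[1, None, 3], [4, 5], [None]]

num_chunks = len(chunks)                                # number of chunks
length = sum(len(c) for c in chunks)                    # total row count
null_count = sum(v is None for c in chunks for v in c)  # total nulls

print(num_chunks, length, null_count)
```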

rechunk

rechunk(*, max_chunksize: int | None = None) -> ChunkedArray

Rechunk a ChunkedArray with a maximum number of rows per chunk.

Parameters:

  • max_chunksize (int | None, default: None ) –

    The maximum number of rows per internal array. Defaults to None, which rechunks into a single array.

Returns:

  • ChunkedArray –

    The rechunked ChunkedArray.
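The effect of max_chunksize can be sketched with a pure-Python model in which each chunk is a list. The helper name `rechunk_model` is hypothetical, and the sketch assumes that every chunk except possibly the last is filled up to the maximum; it is not the arro3 implementation.

```python
from itertools import chain


def rechunk_model(chunks, max_chunksize=None):
    """Regroup rows so that no chunk exceeds max_chunksize rows."""
    flat = list(chain.from_iterable(chunks))
    if max_chunksize is None:
        # Mirrors the documented default: a single chunk.
        return [flat]
    if max_chunksize <= 0:
        raise ValueError("max_chunksize must be positive")
    return [
        flat[start:start + max_chunksize]
        for start in range(0, len(flat), max_chunksize)
    ]


original = [[1, 2], [3], [4, 5, 6]]
print(rechunk_model(original, max_chunksize=4))
print(rechunk_model(original))
```

Row order is preserved; only the chunk boundaries move.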

slice

slice(offset: int = 0, length: int | None = None) -> ChunkedArray

Compute a zero-copy slice of this ChunkedArray.

Parameters:

  • offset (int, default: 0 ) –

    Offset from start of array to slice. Defaults to 0.

  • length (int | None, default: None ) –

Length of slice. Defaults to None, which slices from offset to the end of the array.

Returns:

  • ChunkedArray –

    The sliced ChunkedArray.
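Because a slice can start in one chunk and end in another, offset and length apply to the logical array rather than to any single chunk. The pure-Python model below illustrates that bookkeeping. The name `slice_model` is hypothetical, and Python list slicing copies data, whereas the real method is documented as zero-copy.

```python
def slice_model(chunks, offset=0, length=None):
    """Slice a list-of-lists model of a chunked array by logical row index."""
    total = sum(len(c) for c in chunks)
    if offset < 0 or offset > total:
        raise IndexError("offset out of bounds")
    # Rows still to emit; clamp so the slice never runs past the end.
    remaining = total - offset if length is None else min(length, total - offset)
    out = []
    for chunk in chunks:
        if remaining <= 0:
            break
        if offset >= len(chunk):
            # The slice starts after this chunk; skip it entirely.
            offset -= len(chunk)
            continue
        take = min(len(chunk) - offset, remaining)
        out.append(chunk[offset:offset + take])
        remaining -= take
        offset = 0  # later chunks are read from their first row
    return out


data = [[1, 2, 3], [4, 5], [6]]
print(slice_model(data, offset=2, length=3))
print(slice_model(data, offset=4))
```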

to_numpy

to_numpy() -> NDArray

Copy this array to a NumPy NDArray.

to_pylist

to_pylist() -> list[Any]

Convert to a list of native Python objects.