Table

arro3.core.Table

A collection of top-level, named, equal-length Arrow arrays.

chunk_lengths property

chunk_lengths: list[int]

The number of rows in each internal chunk.

column_names property

column_names: list[str]

Names of the table columns.

Returns:

  • list[str]

    The column names.

columns property

columns: list[ChunkedArray]

List of all columns in numerical order.

Returns:

  • list[ChunkedArray]

    All columns, in numerical order.

nbytes property

nbytes: int

Total number of bytes consumed by the elements of the table.

num_columns property

num_columns: int

Number of columns in this table.

num_rows property

num_rows: int

Number of rows in this table.

Due to the definition of a table, all columns have the same number of rows.

schema property

schema: Schema

Schema of the table and its columns.

Returns:

  • Schema

    The table's schema.

shape property

shape: tuple[int, int]

Dimensions of the table.

Returns:

  • tuple[int, int]

    (number of rows, number of columns)
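
A minimal sketch tying these properties together, assuming pyarrow (version 14 or later, for PyCapsule support) is installed to build the input arrays; from_pydict is documented below.

import pyarrow as pa
from arro3.core import Table

# pyarrow arrays implement the Arrow PyCapsule interface, so they can be
# passed anywhere arro3 expects Arrow-exportable input.
table = Table.from_pydict({
    "a": pa.array([1, 2, 3]),
    "b": pa.array(["x", "y", "z"]),
})

assert table.num_rows == 3
assert table.num_columns == 2
assert table.shape == (3, 2)            # (rows, columns)
assert table.column_names == ["a", "b"]
print(table.schema)                     # the table's Schema
print(table.chunk_lengths)              # rows per internal chunk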

__arrow_c_stream__

__arrow_c_stream__(requested_schema: object | None = None) -> object

An implementation of the Arrow PyCapsule Interface. This dunder method should not be called directly, but enables zero-copy data transfer to other Python libraries that understand Arrow memory.

For example, you can call pyarrow.table() to convert this Table into a pyarrow table, without copying memory.
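
A sketch of that hand-off, assuming pyarrow 14 or later (whose constructors consume the PyCapsule interface):

import pyarrow as pa
from arro3.core import Table

table = Table.from_pydict({"a": pa.array([1, 2, 3])})

# pyarrow.table() calls table.__arrow_c_stream__() under the hood;
# no data is copied.
pa_table = pa.table(table)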

add_column

add_column(
    i: int, field: str | ArrowSchemaExportable, column: ArrowStreamExportable
) -> Table

Add a column to the Table at the given position.

A new table is returned with the column added; the original table object is left unchanged.

Parameters:

  • i (int) –

    Index to place the column at.

  • field (str | ArrowSchemaExportable) –

    If a string is passed, the field name; the type is deduced from the column data.

  • column (ArrowStreamExportable) –

    Column data.

Returns:

  • Table

    New table with the passed column added.
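
A sketch of inserting a column at the front, assuming pyarrow builds the inputs (a pyarrow ChunkedArray exports the required Arrow stream interface):

import pyarrow as pa
from arro3.core import Table

table = Table.from_pydict({"a": pa.array([1, 2, 3])})
new_col = pa.chunked_array([pa.array(["x", "y", "z"])])

table2 = table.add_column(0, "b", new_col)
assert table2.column_names == ["b", "a"]
assert table.column_names == ["a"]   # the original table is unchanged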

append_column

append_column(
    field: str | ArrowSchemaExportable, column: ArrowStreamExportable
) -> Table

Append a column to the end of the columns.

Parameters:

  • field (str | ArrowSchemaExportable) –

    If a string is passed, the field name; the type is deduced from the column data.

  • column (ArrowStreamExportable) –

    Column data.

Returns:

  • Table

New table with the passed column added.
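
A short sketch, again using pyarrow to build the inputs:

import pyarrow as pa
from arro3.core import Table

table = Table.from_pydict({"a": pa.array([1, 2, 3])})
table2 = table.append_column("b", pa.chunked_array([[10, 20, 30]]))
assert table2.column_names == ["a", "b"]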

column

column(i: int | str) -> ChunkedArray

Select a single column from the Table.

Parameters:

  • i (int | str) –

    The index or name of the column to retrieve.

Returns:

  • ChunkedArray

    The selected column.
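
For example (a sketch, with pyarrow building the input):

import pyarrow as pa
from arro3.core import Table

table = Table.from_pydict({"a": pa.array([1, 2, 3])})

col_by_name = table.column("a")   # by name
col_by_index = table.column(0)    # by numeric index
# Both return the same ChunkedArray.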

combine_chunks

combine_chunks() -> Table

Make a new table by combining the chunks this table has.

All the underlying chunks in the ChunkedArray of each column are concatenated into zero or one chunk.

Returns:

  • Table

New table with one or zero chunks.
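
A sketch using from_batches (documented below) to build a two-chunk table, with pyarrow providing the record batches:

import pyarrow as pa
from arro3.core import Table

batches = [
    pa.record_batch([pa.array([1, 2])], names=["a"]),
    pa.record_batch([pa.array([3, 4])], names=["a"]),
]
table = Table.from_batches(batches)
assert table.chunk_lengths == [2, 2]

combined = table.combine_chunks()
assert combined.chunk_lengths == [4]   # a single chunk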

field

field(i: int | str) -> Field

Select a schema field by its column name or numeric index.

Parameters:

  • i (int | str) –

    The index or name of the field to retrieve.

Returns:

  • Field

    The selected field.

from_arrays classmethod

from_arrays(
    arrays: Sequence[ArrowArrayExportable | ArrowStreamExportable],
    *,
    names: Sequence[str] | None = None,
    schema: ArrowSchemaExportable | None = None,
    metadata: dict[str, str] | dict[bytes, bytes] | None = None
) -> Table

Construct a Table from Arrow arrays.

Parameters:

  • arrays (Sequence[ArrowArrayExportable | ArrowStreamExportable]) –

    Equal-length arrays that should form the table.

  • names (Sequence[str] | None, default: None ) –

    Names for the table columns. If not passed, schema must be passed. Defaults to None.

  • schema (ArrowSchemaExportable | None, default: None ) –

    Schema for the created table. If not passed, names must be passed. Defaults to None.

  • metadata (dict[str, str] | dict[bytes, bytes] | None, default: None ) –

    Optional metadata for the schema (if inferred). Defaults to None.

Returns:

  • Table

    The new table.
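
A sketch, assuming pyarrow is installed to build the input arrays:

import pyarrow as pa
from arro3.core import Table

table = Table.from_arrays(
    [pa.array([1, 2, 3]), pa.array(["x", "y", "z"])],
    names=["a", "b"],
)
assert table.column_names == ["a", "b"]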

from_arrow classmethod

Construct this object from an existing Arrow object.

It can be called on anything that exports the Arrow stream interface (__arrow_c_stream__) and yields a StructArray for each item. This constructor materializes all items from the iterator in memory at once. Use RecordBatchReader if you don't wish to materialize all batches in memory at once.

Parameters:

Returns:

  • Table

    The new table.

from_arrow_pycapsule classmethod

from_arrow_pycapsule(capsule) -> Table

Construct this object from a bare Arrow PyCapsule.

Parameters:

  • capsule –

    A PyCapsule holding a C ArrowArrayStream.

Returns:

  • Table

    The new table.

from_batches classmethod

from_batches(
    batches: Sequence[ArrowArrayExportable],
    *,
    schema: ArrowSchemaExportable | None = None
) -> Table

Construct a Table from a sequence of Arrow RecordBatches.

Parameters:

  • batches (Sequence[ArrowArrayExportable]) –

Sequence of RecordBatches to be converted; all schemas must be equal.

  • schema (ArrowSchemaExportable | None, default: None ) –

    If not passed, will be inferred from the first RecordBatch. Defaults to None.

Returns:

  • Table

    The new table.
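
A sketch, with pyarrow supplying the record batches (each chunk of the resulting table corresponds to one input batch):

import pyarrow as pa
from arro3.core import Table

batches = [
    pa.record_batch([pa.array([1, 2])], names=["a"]),
    pa.record_batch([pa.array([3, 4])], names=["a"]),
]
table = Table.from_batches(batches)
assert table.num_rows == 4
assert table.chunk_lengths == [2, 2]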

from_pydict classmethod

from_pydict(
    mapping: dict[str, ArrowArrayExportable | ArrowStreamExportable],
    *,
    schema: ArrowSchemaExportable | None = None,
    metadata: dict[str, str] | dict[bytes, bytes] | None = None
) -> Table

Construct a Table from Arrow arrays or columns.

Parameters:

  • mapping (dict[str, ArrowArrayExportable | ArrowStreamExportable]) –

    A mapping of column names to arrays or chunked arrays.

  • schema (ArrowSchemaExportable | None, default: None ) –

    Schema for the created table. If not passed, will be inferred from the mapping. Defaults to None.

  • metadata (dict[str, str] | dict[bytes, bytes] | None, default: None ) –

    Optional metadata for the schema (if inferred). Defaults to None.

Returns:

  • Table

    The new table.
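
A sketch, with pyarrow building the arrays and optional schema metadata attached:

import pyarrow as pa
from arro3.core import Table

table = Table.from_pydict(
    {
        "a": pa.array([1, 2, 3]),
        "b": pa.array([1.0, 2.0, 3.0]),
    },
    metadata={"source": "example"},
)
assert table.column_names == ["a", "b"]
print(table.schema)   # inferred schema carrying the "source" metadata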

rechunk

rechunk(*, max_chunksize: int | None = None) -> Table

Rechunk a table with a maximum number of rows per chunk.

Parameters:

  • max_chunksize (int | None, default: None ) –

    The maximum number of rows per internal RecordBatch. Defaults to None, which rechunks into a single batch.

Returns:

  • Table

    The rechunked table.
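
A sketch of both modes, assuming pyarrow builds the input:

import pyarrow as pa
from arro3.core import Table

table = Table.from_pydict({"a": pa.array(range(10))})

small = table.rechunk(max_chunksize=4)
assert small.chunk_lengths == [4, 4, 2]

single = small.rechunk()          # no max_chunksize: a single batch
assert single.chunk_lengths == [10]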

remove_column

remove_column(i: int) -> Table

Create a new Table with the indicated column removed.

Parameters:

  • i (int) –

    Index of column to remove.

Returns:

  • Table

    New table without the column.

rename_columns

rename_columns(names: Sequence[str]) -> Table

Create a new table with columns renamed to the provided names.

Parameters:

  • names (Sequence[str]) –

    List of new column names, in order.

Returns:

  • Table

    New table with renamed columns.

select

select(columns: Sequence[int] | Sequence[str]) -> Table

Select columns of the Table.

Returns a new Table with the specified columns, and metadata preserved.

Parameters:

  • columns (Sequence[int] | Sequence[str]) –

    The column names or integer indices to select.

Returns:

  • Table

    New table with the specified columns.
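
A sketch of selecting by name and by index, with pyarrow building the input:

import pyarrow as pa
from arro3.core import Table

table = Table.from_pydict({
    "a": pa.array([1, 2]),
    "b": pa.array([3, 4]),
    "c": pa.array([5, 6]),
})

by_name = table.select(["c", "a"])   # by name, in the given order
by_index = table.select([2, 0])      # the same selection, by index
assert by_name.column_names == ["c", "a"]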

set_column

set_column(
    i: int, field: str | ArrowSchemaExportable, column: ArrowStreamExportable
) -> Table

Replace a column in the Table at the given position.

Parameters:

  • i (int) –

    Index of the column to replace.

  • field (str | ArrowSchemaExportable) –

    If a string is passed, the field name; the type is deduced from the column data.

  • column (ArrowStreamExportable) –

    Column data.

Returns:

  • Table

    New table with the column replaced.

slice

slice(offset: int = 0, length: int | None = None) -> Table

Compute zero-copy slice of this table.

Parameters:

  • offset (int, default: 0 ) –

    Offset from the start of the table at which the slice begins. Defaults to 0.

  • length (int | None, default: None ) –

    Length of the slice. If None, the slice extends to the end of the table. Defaults to None.

Returns:

  • Table

The sliced table.
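
A sketch, with pyarrow building the input:

import pyarrow as pa
from arro3.core import Table

table = Table.from_pydict({"a": pa.array([1, 2, 3, 4, 5])})

middle = table.slice(1, 3)   # three rows starting at row 1, zero-copy
assert middle.num_rows == 3

tail = table.slice(2)        # offset only: from row 2 to the end
assert tail.num_rows == 3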

to_batches

to_batches() -> list[RecordBatch]

Convert Table to a list of RecordBatch objects.

Note that this method is zero-copy: it merely exposes the same data under a different API.

Returns:

  • list[RecordBatch]

    The table's data as a list of record batches.

to_reader

to_reader() -> RecordBatchReader

Convert the Table to a RecordBatchReader.

Note that this method is zero-copy: it merely exposes the same data under a different API.

Returns:

  • RecordBatchReader

    A reader over the table's batches.
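
A sketch of iterating over the reader, assuming pyarrow builds the input:

import pyarrow as pa
from arro3.core import Table

table = Table.from_pydict({"a": pa.array([1, 2, 3])})

reader = table.to_reader()
for batch in reader:          # one RecordBatch per internal chunk
    print(batch.num_rows)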

to_struct_array

to_struct_array() -> ChunkedArray

Convert to a chunked array of struct type.

Returns:

  • ChunkedArray

    A chunked array of struct type.

with_schema

with_schema(schema: ArrowSchemaExportable) -> Table

Assign a different schema onto this table.

The new schema must be compatible with the existing data; this does not cast the underlying data to the new schema. This is primarily useful for changing the schema metadata.

Parameters:

  • schema (ArrowSchemaExportable) –

    The new schema to assign to the table.

Returns:

  • Table

    New table with the assigned schema.
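
A sketch of replacing schema metadata only, assuming pyarrow builds both the input data and the new schema (a pyarrow Schema exports __arrow_c_schema__):

import pyarrow as pa
from arro3.core import Table

table = Table.from_pydict({"a": pa.array([1, 2, 3])})

# A compatible schema that only adds metadata; the data is not cast.
new_schema = pa.schema([pa.field("a", pa.int64())], metadata={"k": "v"})
table2 = table.with_schema(new_schema)
print(table2.schema)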