Table ¶
arro3.core.Table ¶
A collection of top-level named, equal-length Arrow arrays.
column_names property ¶
column_names: list[str]
Names of the table's columns.
columns property ¶
columns: list[ChunkedArray]
All columns of the table, in numerical order.
num_rows property ¶
num_rows: int
Number of rows in this table.
Due to the definition of a table, all columns have the same number of rows.
shape property ¶
shape: tuple[int, int]
Dimensions of the table: (number of rows, number of columns).
__arrow_c_stream__ ¶
An implementation of the Arrow PyCapsule Interface. This dunder method should not be called directly; it enables zero-copy data transfer to other Python libraries that understand Arrow memory. For example, you can call pyarrow.table() to convert this table into a pyarrow table, without copying memory.
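For instance, a minimal sketch of that round trip (assumes pyarrow 14+, which understands the PyCapsule Interface, and the Array/DataType constructors from arro3.core):

import pyarrow as pa
from arro3.core import Array, DataType, Table

# Build a small arro3 Table (see from_pydict below).
table = Table.from_pydict({"a": Array([1, 2, 3], DataType.int64())})

# pyarrow.table() consumes __arrow_c_stream__ under the hood, so no copy is made.
pa_table = pa.table(table)
assert pa_table.num_rows == 3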
add_column ¶
add_column(
i: int, field: str | ArrowSchemaExportable, column: ArrowStreamExportable
) -> Table
Add column to Table at position.
A new table is returned with the column added; the original table object is left unchanged.
Parameters:
- i (int) – Index to place the column at.
- field (str | ArrowSchemaExportable) – Name or Field describing the new column.
- column (ArrowStreamExportable) – Column data.
Returns:
- Table – New table with the passed column added.
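A sketch of inserting a column at the front of a table; it assumes a ChunkedArray can be built from a single Array to satisfy the ArrowStreamExportable bound:

from arro3.core import Array, ChunkedArray, DataType, Table

table = Table.from_pydict({"a": Array([1, 2, 3], DataType.int64())})
new_col = ChunkedArray([Array([10, 20, 30], DataType.int64())])

table2 = table.add_column(0, "b", new_col)
assert table2.column_names == ["b", "a"]
assert table.column_names == ["a"]  # the original table is unchanged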
append_column ¶
append_column(
field: str | ArrowSchemaExportable, column: ArrowStreamExportable
) -> Table
Append column at end of columns.
Parameters:
- field (str | ArrowSchemaExportable) – Name or Field describing the new column.
- column (ArrowStreamExportable) – Column data.
Returns:
- Table – New table with the passed column added.
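For example (same Array/ChunkedArray assumptions as the add_column sketch above):

from arro3.core import Array, ChunkedArray, DataType, Table

table = Table.from_pydict({"a": Array([1, 2, 3], DataType.int64())})
col = ChunkedArray([Array([7, 8, 9], DataType.int64())])

# The new column is always placed after the existing ones.
table2 = table.append_column("b", col)
assert table2.column_names == ["a", "b"]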
column ¶
column(i: int | str) -> ChunkedArray
Select a column from the table by its numeric index or by name.
combine_chunks ¶
combine_chunks() -> Table
Make a new table by combining the chunks this table has.
All the underlying chunks in the ChunkedArray of each column are concatenated into zero or one chunk.
Returns:
- Table – New Table with one or zero chunks.
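A sketch of the effect on chunking (assumes ChunkedArray exposes a num_chunks property, mirroring pyarrow):

from arro3.core import Array, ChunkedArray, DataType, Table

chunked = ChunkedArray([
    Array([1, 2], DataType.int64()),
    Array([3, 4], DataType.int64()),
])
table = Table.from_pydict({"a": chunked})
assert table.column("a").num_chunks == 2

# After combining, each column holds at most one contiguous chunk.
combined = table.combine_chunks()
assert combined.column("a").num_chunks <= 1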
field ¶
from_arrays classmethod ¶
from_arrays(
arrays: Sequence[ArrowArrayExportable | ArrowStreamExportable],
*,
names: Sequence[str] | None = None,
schema: ArrowSchemaExportable | None = None,
metadata: dict[str, str] | dict[bytes, bytes] | None = None
) -> Table
Construct a Table from Arrow arrays.
Parameters:
- arrays (Sequence[ArrowArrayExportable | ArrowStreamExportable]) – Equal-length arrays that should form the table.
- names (Sequence[str] | None, default: None) – Names for the table columns. If not passed, schema must be passed.
- schema (ArrowSchemaExportable | None, default: None) – Schema for the created table. If not passed, names must be passed.
- metadata (dict[str, str] | dict[bytes, bytes] | None, default: None) – Optional metadata for the schema (if inferred).
Returns:
- Table – New table.
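For example (a sketch assuming the DataType.int64() and DataType.string() constructors in arro3.core):

from arro3.core import Array, DataType, Table

ints = Array([1, 2, 3], DataType.int64())
strs = Array(["a", "b", "c"], DataType.string())

# Either names or schema must be given; here we pass names.
table = Table.from_arrays([ints, strs], names=["ints", "strs"])
assert table.num_rows == 3
assert table.column_names == ["ints", "strs"]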
from_arrow classmethod ¶
from_arrow(input: ArrowArrayExportable | ArrowStreamExportable) -> Table
Construct this object from an existing Arrow object.
It can be called on anything that exports the Arrow stream interface (__arrow_c_stream__) and yields a StructArray for each item. This Table will materialize all items from the iterator in memory at once. Use RecordBatchReader if you don't wish to materialize all batches in memory at once.
Parameters:
- input (ArrowArrayExportable | ArrowStreamExportable) – Arrow stream to use for constructing this object.
Returns:
- Table – Self.
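For example, importing from pyarrow (whose Table exports __arrow_c_stream__ in pyarrow 14+):

import pyarrow as pa
from arro3.core import Table

pa_table = pa.table({"a": [1, 2, 3], "b": ["x", "y", "z"]})

# Any object exporting the Arrow stream interface works here.
table = Table.from_arrow(pa_table)
assert table.num_rows == 3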
from_arrow_pycapsule classmethod ¶
from_arrow_pycapsule(capsule) -> Table
Construct this object from a bare Arrow PyCapsule.
Parameters:
- capsule – A PyCapsule wrapping a C ArrowArrayStream.
Returns:
- Table – New table constructed from the capsule's stream.
from_batches classmethod ¶
from_batches(
batches: Sequence[ArrowArrayExportable],
*,
schema: ArrowSchemaExportable | None = None
) -> Table
Construct a Table from a sequence of Arrow RecordBatches.
Parameters:
- batches (Sequence[ArrowArrayExportable]) – Sequence of RecordBatch to be converted; all schemas must be equal.
- schema (ArrowSchemaExportable | None, default: None) – If not passed, will be inferred from the first RecordBatch.
Returns:
- Table – New Table.
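A sketch using pyarrow record batches, which export __arrow_c_array__ in pyarrow 14+:

import pyarrow as pa
from arro3.core import Table

b1 = pa.record_batch({"a": [1, 2]})
b2 = pa.record_batch({"a": [3]})

# All batches must share one schema; it is inferred from the first batch here.
table = Table.from_batches([b1, b2])
assert table.num_rows == 3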
from_pydict classmethod ¶
from_pydict(
mapping: dict[str, ArrowArrayExportable | ArrowStreamExportable],
*,
schema: ArrowSchemaExportable | None = None,
metadata: dict[str, str] | dict[bytes, bytes] | None = None
) -> Table
Construct a Table from Arrow arrays or columns.
Parameters:
- mapping (dict[str, ArrowArrayExportable | ArrowStreamExportable]) – A mapping of strings to Arrays.
- schema (ArrowSchemaExportable | None, default: None) – If not passed, will be inferred from the mapping values.
- metadata (dict[str, str] | dict[bytes, bytes] | None, default: None) – Optional metadata for the schema (if inferred).
Returns:
- Table – New table.
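For example (same constructor assumptions as the from_arrays sketch above):

from arro3.core import Array, DataType, Table

table = Table.from_pydict(
    {
        "id": Array([1, 2, 3], DataType.int64()),
        "name": Array(["a", "b", "c"], DataType.string()),
    },
    metadata={"source": "example"},
)
assert table.num_rows == 3
assert table.column_names == ["id", "name"]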
rechunk ¶
remove_column ¶
rename_columns ¶
select ¶
set_column ¶
set_column(
i: int, field: str | ArrowSchemaExportable, column: ArrowStreamExportable
) -> Table
Replace column in Table at position.
Parameters:
- i (int) – Index of the column to replace.
- field (str | ArrowSchemaExportable) – Name or Field describing the new column.
- column (ArrowStreamExportable) – Column data.
Returns:
- Table – New table with the column replaced.
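For example (same Array/ChunkedArray assumptions as the add_column sketch above):

from arro3.core import Array, ChunkedArray, DataType, Table

table = Table.from_pydict({"a": Array([1, 2, 3], DataType.int64())})
replacement = ChunkedArray([Array([10, 20, 30], DataType.int64())])

# Replaces the column at index 0, renaming it in the process.
table2 = table.set_column(0, "a_new", replacement)
assert table2.column_names == ["a_new"]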
slice ¶
to_batches ¶
to_batches() -> list[RecordBatch]
Convert Table to a list of RecordBatch objects.
Note that this method is zero-copy; it merely exposes the same data under a different API.
Returns:
- list[RecordBatch] – The table's data as a list of record batches.
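For example:

from arro3.core import Array, DataType, Table

table = Table.from_pydict({"a": Array([1, 2, 3], DataType.int64())})

# Each batch is a zero-copy view over a slice of the table's data.
for batch in table.to_batches():
    print(batch.num_rows)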
to_reader ¶
to_reader() -> RecordBatchReader
Convert the Table to a RecordBatchReader.
Note that this method is zero-copy; it merely exposes the same data under a different API.
Returns:
- RecordBatchReader – A reader that yields the table's batches one at a time.
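A sketch of streaming consumption (assumes the reader is iterable over RecordBatches, as Arrow readers conventionally are):

from arro3.core import Array, DataType, Table

table = Table.from_pydict({"a": Array([1, 2, 3], DataType.int64())})
reader = table.to_reader()

# Consume batches one at a time instead of holding a second copy of the table.
for batch in reader:
    print(batch.num_rows)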
to_struct_array ¶
to_struct_array() -> ChunkedArray
Convert this table to a ChunkedArray of struct arrays.
with_schema ¶
with_schema(schema: ArrowSchemaExportable) -> Table
Assign a different schema onto this table.
The new schema must be compatible with the existing data; this does not cast the underlying data to the new schema. This is primarily useful for changing the schema metadata.
Parameters:
- schema (ArrowSchemaExportable) – The new schema to apply to this table's data.
Returns:
- Table – New table with the given schema.
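For example, attaching schema-level metadata without touching the data (the schema is built with pyarrow here, since with_schema accepts any ArrowSchemaExportable):

import pyarrow as pa
from arro3.core import Array, DataType, Table

table = Table.from_pydict({"a": Array([1, 2, 3], DataType.int64())})

# Same field name and type as the existing schema, plus metadata.
new_schema = pa.schema([pa.field("a", pa.int64())], metadata={"origin": "example"})
table2 = table.with_schema(new_schema)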