Table ¶
arro3.core.Table ¶
A collection of top-level named, equal-length Arrow arrays.
column_names property ¶
column_names: list[str]
Names of the table's columns.
columns property ¶
columns: list[ChunkedArray]
All columns of the table, in numerical order.
num_rows property ¶
num_rows: int
Number of rows in this table.
Due to the definition of a table, all columns have the same number of rows.
shape property ¶
shape: tuple[int, int]
Dimensions of the table: (number of rows, number of columns).
__arrow_c_stream__ ¶
An implementation of the Arrow PyCapsule Interface. This dunder method should not be called directly; it enables zero-copy data transfer to other Python libraries that understand Arrow memory. For example, you can call pyarrow.table() to convert this table into a pyarrow table, without copying memory.
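For instance, a minimal sketch of that round trip (assumes pyarrow 14+, which understands the PyCapsule Interface, and the Array/DataType constructors from arro3.core):

import pyarrow as pa
from arro3.core import Array, DataType, Table

# Build a small arro3 Table (see from_pydict below).
table = Table.from_pydict({"a": Array([1, 2, 3], DataType.int64())})

# pyarrow.table() consumes __arrow_c_stream__ under the hood, so no copy is made.
pa_table = pa.table(table)
assert pa_table.num_rows == 3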
add_column ¶
add_column(
i: int, field: str | ArrowSchemaExportable, column: ArrowStreamExportable
) -> Table
Add column to Table at position.
A new table is returned with the column added; the original table object is left unchanged.
Parameters:
- i (int) – Index to place the column at.
- field (str | ArrowSchemaExportable) – Name or Field describing the new column.
- column (ArrowStreamExportable) – Column data.
Returns:
- Table – New table with the passed column added.
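A sketch of inserting a column at the front of a table; it assumes a ChunkedArray can be built from a single Array to satisfy the ArrowStreamExportable bound:

from arro3.core import Array, ChunkedArray, DataType, Table

table = Table.from_pydict({"a": Array([1, 2, 3], DataType.int64())})
new_col = ChunkedArray([Array([10, 20, 30], DataType.int64())])

table2 = table.add_column(0, "b", new_col)
assert table2.column_names == ["b", "a"]
assert table.column_names == ["a"]  # the original table is unchanged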
append_column ¶
append_column(
field: str | ArrowSchemaExportable, column: ArrowStreamExportable
) -> Table
Append column at end of columns.
Parameters:
- field (str | ArrowSchemaExportable) – Name or Field describing the new column.
- column (ArrowStreamExportable) – Column data.
Returns:
- Table – New table with the passed column added.
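For example (same Array/ChunkedArray assumptions as the add_column sketch above):

from arro3.core import Array, ChunkedArray, DataType, Table

table = Table.from_pydict({"a": Array([1, 2, 3], DataType.int64())})
col = ChunkedArray([Array([7, 8, 9], DataType.int64())])

# The new column is always placed after the existing ones.
table2 = table.append_column("b", col)
assert table2.column_names == ["a", "b"]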
column ¶
column(i: int | str) -> ChunkedArray
Select a column from the table by its numeric index or by name.
combine_chunks ¶
combine_chunks() -> Table
Make a new table by combining the chunks this table has.
All the underlying chunks in the ChunkedArray of each column are concatenated into zero or one chunk.
Returns:
- Table – New Table with one or zero chunks.
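A sketch of the effect on chunking (assumes ChunkedArray exposes a num_chunks property, mirroring pyarrow):

from arro3.core import Array, ChunkedArray, DataType, Table

chunked = ChunkedArray([
    Array([1, 2], DataType.int64()),
    Array([3, 4], DataType.int64()),
])
table = Table.from_pydict({"a": chunked})
assert table.column("a").num_chunks == 2

# After combining, each column holds at most one contiguous chunk.
combined = table.combine_chunks()
assert combined.column("a").num_chunks <= 1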
field ¶
from_arrays classmethod ¶
from_arrays(
arrays: Sequence[ArrowArrayExportable | ArrowStreamExportable],
*,
names: Sequence[str] | None = None,
schema: ArrowSchemaExportable | None = None,
metadata: dict[str, str] | dict[bytes, bytes] | None = None
) -> Table
Construct a Table from Arrow arrays.
Parameters:
- arrays (Sequence[ArrowArrayExportable | ArrowStreamExportable]) – Equal-length arrays that should form the table.
- names (Sequence[str] | None, default: None) – Names for the table columns. If not passed, schema must be passed.
- schema (ArrowSchemaExportable | None, default: None) – Schema for the created table. If not passed, names must be passed.
- metadata (dict[str, str] | dict[bytes, bytes] | None, default: None) – Optional metadata for the schema (if inferred).
Returns:
- Table – New table.
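For example (a sketch assuming the DataType.int64() and DataType.string() constructors in arro3.core):

from arro3.core import Array, DataType, Table

ints = Array([1, 2, 3], DataType.int64())
strs = Array(["a", "b", "c"], DataType.string())

# Either names or schema must be given; here we pass names.
table = Table.from_arrays([ints, strs], names=["ints", "strs"])
assert table.num_rows == 3
assert table.column_names == ["ints", "strs"]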
from_arrow classmethod ¶
from_arrow(input: ArrowArrayExportable | ArrowStreamExportable) -> Table
Construct this object from an existing Arrow object.
It can be called on anything that exports the Arrow stream interface (__arrow_c_stream__) and yields a StructArray for each item. This Table will materialize all items from the iterator in memory at once. Use RecordBatchReader if you don't wish to materialize all batches in memory at once.
Parameters:
- input (ArrowArrayExportable | ArrowStreamExportable) – Arrow stream to use for constructing this object.
Returns:
- Table – Self.
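For example, importing from pyarrow (whose Table exports __arrow_c_stream__ in pyarrow 14+):

import pyarrow as pa
from arro3.core import Table

pa_table = pa.table({"a": [1, 2, 3], "b": ["x", "y", "z"]})

# Any object exporting the Arrow stream interface works here.
table = Table.from_arrow(pa_table)
assert table.num_rows == 3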
from_arrow_pycapsule classmethod ¶
from_arrow_pycapsule(capsule) -> Table
Construct this object from a bare Arrow PyCapsule.
Parameters:
- capsule – A PyCapsule wrapping a C ArrowArrayStream.
Returns:
- Table – New table constructed from the capsule's stream.
from_batches classmethod ¶
from_batches(
batches: Sequence[ArrowArrayExportable],
*,
schema: ArrowSchemaExportable | None = None
) -> Table
Construct a Table from a sequence of Arrow RecordBatches.
Parameters:
- batches (Sequence[ArrowArrayExportable]) – Sequence of RecordBatch to be converted; all schemas must be equal.
- schema (ArrowSchemaExportable | None, default: None) – If not passed, will be inferred from the first RecordBatch.
Returns:
- Table – New Table.
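A sketch using pyarrow record batches, which export __arrow_c_array__ in pyarrow 14+:

import pyarrow as pa
from arro3.core import Table

b1 = pa.record_batch({"a": [1, 2]})
b2 = pa.record_batch({"a": [3]})

# All batches must share one schema; it is inferred from the first batch here.
table = Table.from_batches([b1, b2])
assert table.num_rows == 3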
from_pydict classmethod ¶
from_pydict(
mapping: dict[str, ArrowArrayExportable | ArrowStreamExportable],
*,
schema: ArrowSchemaExportable | None = None,
metadata: dict[str, str] | dict[bytes, bytes] | None = None
) -> Table
Construct a Table from Arrow arrays or columns.
Parameters:
- mapping (dict[str, ArrowArrayExportable | ArrowStreamExportable]) – A mapping of strings to Arrays.
- schema (ArrowSchemaExportable | None, default: None) – If not passed, will be inferred from the mapping values.
- metadata (dict[str, str] | dict[bytes, bytes] | None, default: None) – Optional metadata for the schema (if inferred).
Returns:
- Table – New table.
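For example (same constructor assumptions as the from_arrays sketch above):

from arro3.core import Array, DataType, Table

table = Table.from_pydict(
    {
        "id": Array([1, 2, 3], DataType.int64()),
        "name": Array(["a", "b", "c"], DataType.string()),
    },
    metadata={"source": "example"},
)
assert table.num_rows == 3
assert table.column_names == ["id", "name"]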
rechunk ¶
remove_column ¶
rename_columns ¶
select ¶
set_column ¶
set_column(
i: int, field: str | ArrowSchemaExportable, column: ArrowStreamExportable
) -> Table
Replace column in Table at position.
Parameters:
- i (int) – Index of the column to replace.
- field (str | ArrowSchemaExportable) – Name or Field describing the new column.
- column (ArrowStreamExportable) – Column data.
Returns:
- Table – New table with the column replaced.
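For example (same Array/ChunkedArray assumptions as the add_column sketch above):

from arro3.core import Array, ChunkedArray, DataType, Table

table = Table.from_pydict({"a": Array([1, 2, 3], DataType.int64())})
replacement = ChunkedArray([Array([10, 20, 30], DataType.int64())])

# Replaces the column at index 0, renaming it in the process.
table2 = table.set_column(0, "a_new", replacement)
assert table2.column_names == ["a_new"]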
slice ¶
to_batches ¶
to_batches() -> list[RecordBatch]
Convert Table to a list of RecordBatch objects.
Note that this method is zero-copy; it merely exposes the same data under a different API.
Returns:
- list[RecordBatch] – The table's data as a list of record batches.
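For example:

from arro3.core import Array, DataType, Table

table = Table.from_pydict({"a": Array([1, 2, 3], DataType.int64())})

# Each batch is a zero-copy view over a slice of the table's data.
for batch in table.to_batches():
    print(batch.num_rows)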
to_reader ¶
to_reader() -> RecordBatchReader
Convert the Table to a RecordBatchReader.
Note that this method is zero-copy; it merely exposes the same data under a different API.
Returns:
- RecordBatchReader – A reader that yields the table's batches one at a time.
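A sketch of streaming consumption (assumes the reader is iterable over RecordBatches, as Arrow readers conventionally are):

from arro3.core import Array, DataType, Table

table = Table.from_pydict({"a": Array([1, 2, 3], DataType.int64())})
reader = table.to_reader()

# Consume batches one at a time instead of holding a second copy of the table.
for batch in reader:
    print(batch.num_rows)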
to_struct_array ¶
to_struct_array() -> ChunkedArray
Convert this table to a ChunkedArray of struct arrays.
with_schema ¶
with_schema(schema: ArrowSchemaExportable) -> Table
Assign a different schema onto this table.
The new schema must be compatible with the existing data; this does not cast the underlying data to the new schema. This is primarily useful for changing the schema metadata.
Parameters:
- schema (ArrowSchemaExportable) – The new schema to apply to this table's data.
Returns:
- Table – New table with the given schema.
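For example, attaching schema-level metadata without touching the data (the schema is built with pyarrow here, since with_schema accepts any ArrowSchemaExportable):

import pyarrow as pa
from arro3.core import Array, DataType, Table

table = Table.from_pydict({"a": Array([1, 2, 3], DataType.int64())})

# Same field name and type as the existing schema, plus metadata.
new_schema = pa.schema([pa.field("a", pa.int64())], metadata={"origin": "example"})
table2 = table.with_schema(new_schema)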