Table¶
arro3.core.Table ¶
A collection of top-level named, equal-length Arrow arrays.
column_names
property
¶
column_names: list[str]
Names of the table's columns.
columns
property
¶
columns: list[ChunkedArray]
The table's columns, as a list of ChunkedArrays.
num_rows
property
¶
num_rows: int
Number of rows in this table.
By definition, all columns of a table have the same number of rows.
shape
property
¶
shape: tuple[int, int]
Dimensions of the table: (number of rows, number of columns).
__arrow_c_schema__ ¶
__arrow_c_schema__() -> object
An implementation of the Arrow PyCapsule Interface. This dunder method should not be called directly, but enables zero-copy data transfer to other Python libraries that understand Arrow memory.
This allows Arrow consumers to inspect the data type of this Table. The consumer can then ask the producer (in __arrow_c_stream__) to cast the exported data to a supported data type.
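For example, a recent pyarrow (14.0+, which imports schemas via the PyCapsule Interface) can inspect the schema directly; constructing the table from a pyarrow array is an assumption for illustration:
import pyarrow as pa
from arro3.core import Table

table = Table.from_pydict({"a": pa.array([1, 2, 3])})
# pa.schema() consumes table.__arrow_c_schema__() under the hood
schema = pa.schema(table)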
__arrow_c_stream__ ¶
__arrow_c_stream__(requested_schema: object | None = None) -> object
An implementation of the Arrow PyCapsule Interface. This dunder method should not be called directly, but enables zero-copy data transfer to other Python libraries that understand Arrow memory.
For example, you can call pyarrow.table() to convert this Table into a pyarrow table, without copying memory.
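A minimal sketch (assumes pyarrow 14.0+, which imports streams via the PyCapsule Interface):
import pyarrow as pa
from arro3.core import Table

table = Table.from_pydict({"a": pa.array([1, 2, 3])})
# pa.table() consumes table.__arrow_c_stream__() without copying the data
pa_table = pa.table(table)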
add_column ¶
add_column(
i: int, field: str | ArrowSchemaExportable, column: ArrowStreamExportable
) -> Table
Add column to Table at position.
A new table is returned with the column added, the original table object is left unchanged.
Parameters:
- i (int) – Index to place the column at.
- field (str | ArrowSchemaExportable) – Field or name of the new column.
- column (ArrowStreamExportable) – Column data.
Returns:
- Table – New table with the passed column added.
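A minimal sketch; using pyarrow to build the input column is an assumption, since any object exporting the Arrow C stream interface works:
import pyarrow as pa
from arro3.core import Table

table = Table.from_pydict({"a": pa.array([1, 2, 3])})
# Insert a new string column "b" at position 0; the original table is unchanged
new_table = table.add_column(0, "b", pa.chunked_array([["x", "y", "z"]]))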
append_column ¶
append_column(
field: str | ArrowSchemaExportable, column: ArrowStreamExportable
) -> Table
Append column at end of columns.
Parameters:
- field (str | ArrowSchemaExportable) – Field or name of the new column.
- column (ArrowStreamExportable) – Column data.
Returns:
- Table – New table with the passed column added.
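A minimal sketch under the same assumptions as add_column above:
import pyarrow as pa
from arro3.core import Table

table = Table.from_pydict({"a": pa.array([1, 2, 3])})
# Append a float column "b" after the existing columns
new_table = table.append_column("b", pa.chunked_array([[1.0, 2.0, 3.0]]))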
column ¶
column(i: int | str) -> ChunkedArray
Select a column from the table by its numeric index or column name.
combine_chunks ¶
combine_chunks() -> Table
Make a new table by combining the chunks this table has.
All the underlying chunks in the ChunkedArray of each column are concatenated into zero or one chunk.
Returns:
- Table – New Table with one or zero chunks.
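A minimal sketch (pyarrow record batches as input are an assumption):
import pyarrow as pa
from arro3.core import Table

# Two batches yield two chunks per column; combine_chunks concatenates them
table = Table.from_batches(
    [pa.record_batch({"a": [1, 2]}), pa.record_batch({"a": [3, 4]})]
)
combined = table.combine_chunks()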
field ¶
from_arrays
classmethod
¶
from_arrays(
arrays: Sequence[ArrayInput | ArrowStreamExportable],
*,
names: Sequence[str],
schema: None = None,
metadata: dict[str, str] | dict[bytes, bytes] | None = None
) -> Table
from_arrays(
arrays: Sequence[ArrayInput | ArrowStreamExportable],
*,
names: None = None,
schema: ArrowSchemaExportable,
metadata: None = None
) -> Table
from_arrays(
arrays: Sequence[ArrayInput | ArrowStreamExportable],
*,
names: Sequence[str] | None = None,
schema: ArrowSchemaExportable | None = None,
metadata: dict[str, str] | dict[bytes, bytes] | None = None
) -> Table
Construct a Table from Arrow arrays.
Parameters:
- arrays (Sequence[ArrayInput | ArrowStreamExportable]) – Equal-length arrays that should form the table.
- names (Sequence[str] | None, default: None) – Names for the table columns. If not passed, schema must be passed. Defaults to None.
- schema (ArrowSchemaExportable | None, default: None) – Schema for the created table. If not passed, names must be passed. Defaults to None.
- metadata (dict[str, str] | dict[bytes, bytes] | None, default: None) – Optional metadata for the schema (if inferred). Defaults to None.
Returns:
- Table – New table.
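A minimal sketch (pyarrow arrays as ArrayInput are an assumption):
import pyarrow as pa
from arro3.core import Table

table = Table.from_arrays(
    [pa.array([1, 2, 3]), pa.array(["x", "y", "z"])],
    names=["id", "label"],
)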
from_arrow
classmethod
¶
from_arrow(input: ArrowArrayExportable | ArrowStreamExportable) -> Table
Construct this object from an existing Arrow object.
It can be called on anything that exports the Arrow stream interface (__arrow_c_stream__) and yields a StructArray for each item. This Table will materialize all items from the iterator in memory at once. Use RecordBatchReader if you don't wish to materialize all batches in memory at once.
Parameters:
- input (ArrowArrayExportable | ArrowStreamExportable) – Arrow stream to use for constructing this object.
Returns:
- Table – Self.
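For example, importing a pyarrow table (zero-copy, via its __arrow_c_stream__ export):
import pyarrow as pa
from arro3.core import Table

pa_table = pa.table({"a": [1, 2, 3], "b": ["x", "y", "z"]})
table = Table.from_arrow(pa_table)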
from_arrow_pycapsule
classmethod
¶
from_arrow_pycapsule(capsule) -> Table
Construct this object from a bare Arrow PyCapsule.
Parameters:
- capsule – A PyCapsule holding an Arrow C stream.
Returns:
- Table – The constructed table.
from_batches
classmethod
¶
from_batches(
batches: Sequence[ArrowArrayExportable],
*,
schema: ArrowSchemaExportable | None = None
) -> Table
Construct a Table from a sequence of Arrow RecordBatches.
Parameters:
- batches (Sequence[ArrowArrayExportable]) – Sequence of RecordBatch to be converted; all schemas must be equal.
- schema (ArrowSchemaExportable | None, default: None) – If not passed, will be inferred from the first RecordBatch. Defaults to None.
Returns:
- Table – New Table.
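A minimal sketch (pyarrow record batches are an assumption; anything exporting __arrow_c_array__ works):
import pyarrow as pa
from arro3.core import Table

batches = [pa.record_batch({"a": [1, 2]}), pa.record_batch({"a": [3, 4]})]
table = Table.from_batches(batches)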
from_pydict
classmethod
¶
from_pydict(
mapping: dict[str, ArrayInput | ArrowStreamExportable],
*,
schema: None = None,
metadata: dict[str, str] | dict[bytes, bytes] | None = None
) -> Table
from_pydict(
mapping: dict[str, ArrayInput | ArrowStreamExportable],
*,
schema: ArrowSchemaExportable,
metadata: None = None
) -> Table
from_pydict(
mapping: dict[str, ArrayInput | ArrowStreamExportable],
*,
schema: ArrowSchemaExportable | None = None,
metadata: dict[str, str] | dict[bytes, bytes] | None = None
) -> Table
Construct a Table from Arrow arrays or columns.
Parameters:
- mapping (dict[str, ArrayInput | ArrowStreamExportable]) – A mapping of strings to Arrays.
- schema (ArrowSchemaExportable | None, default: None) – If not passed, will be inferred from the Mapping values. Defaults to None.
- metadata (dict[str, str] | dict[bytes, bytes] | None, default: None) – Optional metadata for the schema (if inferred). Defaults to None.
Returns:
- Table – New table.
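A minimal sketch (pyarrow arrays as mapping values are an assumption):
import pyarrow as pa
from arro3.core import Table

table = Table.from_pydict(
    {"id": pa.array([1, 2, 3]), "name": pa.array(["a", "b", "c"])}
)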
rechunk ¶
remove_column ¶
rename_columns ¶
select ¶
set_column ¶
set_column(
i: int, field: str | ArrowSchemaExportable, column: ArrowStreamExportable
) -> Table
Replace column in Table at position.
Parameters:
- i (int) – Index of the column to replace.
- field (str | ArrowSchemaExportable) – Field or name of the new column.
- column (ArrowStreamExportable) – Column data.
Returns:
- Table – New table with the column at position i replaced.
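A minimal sketch under the same assumptions as add_column above:
import pyarrow as pa
from arro3.core import Table

table = Table.from_pydict({"a": pa.array([1, 2, 3])})
# Replace column 0, keeping its name but swapping in new data
new_table = table.set_column(0, "a", pa.chunked_array([[10, 20, 30]]))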
slice ¶
to_batches ¶
to_batches() -> list[RecordBatch]
Convert Table to a list of RecordBatch objects.
Note that this method is zero-copy: it merely exposes the same data under a different API.
Returns:
- list[RecordBatch] – The RecordBatches that make up this table.
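A minimal sketch:
import pyarrow as pa
from arro3.core import Table

table = Table.from_pydict({"a": pa.array([1, 2, 3])})
for batch in table.to_batches():
    print(batch.num_rows)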
to_reader ¶
to_reader() -> RecordBatchReader
Convert the Table to a RecordBatchReader.
Note that this method is zero-copy: it merely exposes the same data under a different API.
Returns:
- RecordBatchReader – A reader over the batches of this table.
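A minimal sketch; handing the reader to pyarrow assumes pyarrow 15.0+ (RecordBatchReader.from_stream):
import pyarrow as pa
from arro3.core import Table

table = Table.from_pydict({"a": pa.array([1, 2, 3])})
reader = table.to_reader()
pa_reader = pa.RecordBatchReader.from_stream(reader)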
to_struct_array ¶
to_struct_array() -> ChunkedArray
Convert the Table to a ChunkedArray of structs.
with_schema ¶
with_schema(schema: ArrowSchemaExportable) -> Table
Assign a different schema onto this table.
The new schema must be compatible with the existing data; this does not cast the underlying data to the new schema. This is primarily useful for changing the schema metadata.
Parameters:
- schema (ArrowSchemaExportable) – The new schema to assign to this table.
Returns:
- Table – New table with the updated schema.
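A minimal sketch; attaching metadata is a typical use, and the pyarrow schema is an assumption (any ArrowSchemaExportable works):
import pyarrow as pa
from arro3.core import Table

table = Table.from_pydict({"a": pa.array([1, 2, 3])})
new_schema = pa.schema([pa.field("a", pa.int64())], metadata={"source": "example"})
new_table = table.with_schema(new_schema)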