DataType¶
arro3.core.DataType ¶
An Arrow DataType.
bit_width
property
¶
bit_width: Literal[8, 16, 32, 64] | None
Returns the bit width of this type if it is a primitive type, or None if it is not a primitive type.
list_size
property
¶
list_size: int | None
The size of the list in the case of fixed-size lists.
Returns None if the data type is not a fixed-size list.
Examples:
from arro3.core import DataType
DataType.list(DataType.int32(), 2).list_size
# 2
Returns:
-
int | None
–The fixed list size, or None if the data type is not a fixed-size list.
time_unit
property
¶
time_unit: Literal['s', 'ms', 'us', 'ns'] | None
The time unit, if the data type has one.
__arrow_c_schema__ ¶
__arrow_c_schema__() -> object
An implementation of the Arrow PyCapsule Interface. This dunder method should not be called directly, but it enables zero-copy data transfer to other Python libraries that understand Arrow memory.
For example, you can call pyarrow.field() to convert this data type into a pyarrow field without copying memory.
binary
classmethod
¶
date32
classmethod
¶
date32() -> DataType
Create instance of 32-bit date (days since UNIX epoch 1970-01-01).
date64
classmethod
¶
date64() -> DataType
Create instance of 64-bit date (milliseconds since UNIX epoch 1970-01-01).
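The two date types differ only in the unit counted from the UNIX epoch. As a minimal sketch using only the Python standard library (the helpers date32_value and date64_value are hypothetical names for illustration, not part of arro3), the integer each type would store for a calendar date can be computed like this:

```python
from datetime import date

UNIX_EPOCH = date(1970, 1, 1)
MILLISECONDS_PER_DAY = 86_400_000


def date32_value(d: date) -> int:
    """Days since the UNIX epoch, the quantity a 32-bit date stores."""
    return (d - UNIX_EPOCH).days


def date64_value(d: date) -> int:
    """Milliseconds since the UNIX epoch, the quantity a 64-bit date stores."""
    return date32_value(d) * MILLISECONDS_PER_DAY


print(date32_value(date(2000, 1, 1)))  # 10957
print(date64_value(date(2000, 1, 1)))  # 946684800000
```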
decimal128
classmethod
¶
Create decimal type with precision and scale and 128-bit width.
Arrow decimals are fixed-point decimal numbers encoded as a scaled integer. The precision is the number of significant digits that the decimal type can represent; the scale is the number of digits after the decimal point (note the scale can be negative).
As an example, decimal128(7, 3) can exactly represent the numbers 1234.567 and -1234.567 (encoded internally as the 128-bit integers 1234567 and -1234567, respectively), but neither 12345.67 nor 123.4567.
decimal128(5, -3) can exactly represent the number 12345000 (encoded internally as the 128-bit integer 12345), but neither 123450000 nor 1234500.
If you need a precision higher than 38 significant digits, consider using decimal256.
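The precision and scale rules can be checked with a small standard-library sketch. The helper to_unscaled below is hypothetical (it is not an arro3 function) and only models the arithmetic described above: shift the value by the scale, require an integer result, and require that the integer fits in the given number of digits.

```python
from decimal import Decimal
from typing import Optional


def to_unscaled(value: str, precision: int, scale: int) -> Optional[int]:
    """Return the scaled integer for `value`, or None if the value is not
    exactly representable with the given precision and scale."""
    # Multiply by 10**scale (a negative scale divides instead).
    shifted = Decimal(value).scaleb(scale)
    # Digits beyond the scale would be lost, so the value is not exact.
    if shifted != shifted.to_integral_value():
        return None
    unscaled = int(shifted)
    # The integer must fit in `precision` significant digits.
    if len(str(abs(unscaled))) > precision:
        return None
    return unscaled


print(to_unscaled("1234.567", 7, 3))    # 1234567
print(to_unscaled("12345.67", 7, 3))    # None
print(to_unscaled("12345000", 5, -3))   # 12345
print(to_unscaled("1234500", 5, -3))    # None
```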
Parameters:
-
precision
(int
) –Must be between 1 and 38.
-
scale
(int
) –The number of digits after the decimal point. May be negative.
decimal256
classmethod
¶
Create decimal type with precision and scale and 256-bit width.
dictionary
classmethod
¶
dictionary(
index_type: ArrowSchemaExportable, value_type: ArrowSchemaExportable
) -> DataType
Dictionary (categorical, or simply encoded) type.
Parameters:
-
index_type
(ArrowSchemaExportable
) –The integer type of the dictionary indices.
-
value_type
(ArrowSchemaExportable
) –The type of the dictionary values.
Returns:
-
DataType
–The new dictionary data type.
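To illustrate what dictionary encoding means, the following standard-library sketch replaces each value with an integer index into a list of unique values. The helper dictionary_encode is a hypothetical name for illustration and not an arro3 function. The index list corresponds to index_type and the list of unique values to value_type.

```python
from typing import Dict, Hashable, List, Sequence, Tuple


def dictionary_encode(
    values: Sequence[Hashable],
) -> Tuple[List[int], List[Hashable]]:
    """Split `values` into integer indices and a list of unique values."""
    positions: Dict[Hashable, int] = {}
    indices: List[int] = []
    dictionary: List[Hashable] = []
    for value in values:
        if value not in positions:
            # First time this value is seen: give it the next free index.
            positions[value] = len(dictionary)
            dictionary.append(value)
        indices.append(positions[value])
    return indices, dictionary


indices, dictionary = dictionary_encode(["a", "b", "a", "c", "b"])
print(indices)     # [0, 1, 0, 2, 1]
print(dictionary)  # ['a', 'b', 'c']
```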
duration
classmethod
¶
equals ¶
equals(other: ArrowSchemaExportable, *, check_metadata: bool = False) -> bool
Return True if this type is equivalent to the passed value.
Parameters:
-
other
(ArrowSchemaExportable
) –The data type to compare against.
-
check_metadata
(bool
, default:False
) –Whether nested Field metadata equality should be checked as well. Defaults to False.
Returns:
-
bool
–True if the types are equivalent, False otherwise.
from_arrow
classmethod
¶
from_arrow(input: ArrowSchemaExportable) -> DataType
Construct this from an existing Arrow object.
It can be called on anything that exports the Arrow schema interface (has an __arrow_c_schema__ method).
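The "has an __arrow_c_schema__ method" requirement is a structural (duck-typed) check. As a rough sketch, it could be modelled with a runtime-checkable Protocol. The names SchemaExportable, FakeType, and exports_arrow_schema are hypothetical stand-ins for illustration, and this is not how arro3 itself is implemented.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class SchemaExportable(Protocol):
    """Hypothetical stand-in for objects that export an Arrow schema."""

    def __arrow_c_schema__(self) -> object: ...


class FakeType:
    """Illustration only: a real exporter would return a PyCapsule here."""

    def __arrow_c_schema__(self) -> object:
        raise NotImplementedError("illustration only")


def exports_arrow_schema(obj: object) -> bool:
    """True if `obj` has an __arrow_c_schema__ method."""
    return isinstance(obj, SchemaExportable)


print(exports_arrow_schema(FakeType()))  # True
print(exports_arrow_schema(42))          # False
```

The check only confirms that the method exists. Whether the returned capsule is valid is decided when the consumer actually imports it.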
from_arrow_pycapsule
classmethod
¶
from_arrow_pycapsule(capsule) -> DataType
Construct this object from a bare Arrow PyCapsule.
large_list
classmethod
¶
large_list(value_type: ArrowSchemaExportable) -> DataType
Create LargeListType instance from child data type or field.
This data type may not be supported by all Arrow implementations. Unless you need to represent data larger than 2**31 elements, you should prefer list().
Parameters:
-
value_type
(ArrowSchemaExportable
) –The child data type or field.
Returns:
-
DataType
–The new large list data type.
large_list_view
classmethod
¶
large_list_view(value_type: ArrowSchemaExportable) -> DataType
Create LargeListViewType instance from child data type or field.
This data type may not be supported by all Arrow implementations because it is an alternative to the ListType.
Parameters:
-
value_type
(ArrowSchemaExportable
) –The child data type or field.
Returns:
-
DataType
–The new large list view data type.
large_string
classmethod
¶
large_string() -> DataType
Create large UTF8 variable-length string type.
list
classmethod
¶
list(
value_type: ArrowSchemaExportable, list_size: int | None = None
) -> DataType
Create ListType instance from child data type or field.
Parameters:
-
value_type
(ArrowSchemaExportable
) –The child data type or field.
-
list_size
(int | None
, default:None
) –If list_size is None, return a variable-length list type. If list_size is provided, return a fixed-size list type.
Returns:
-
DataType
–The new list data type.
list_view
classmethod
¶
list_view(value_type: ArrowSchemaExportable) -> DataType
Create ListViewType instance from child data type or field.
This data type may not be supported by all Arrow implementations because it is an alternative to the ListType.
map
classmethod
¶
map(
key_type: ArrowSchemaExportable,
item_type: ArrowSchemaExportable,
keys_sorted: bool,
) -> DataType
Create MapType instance from key and item data types or fields.
Parameters:
-
key_type
(ArrowSchemaExportable
) –The key data type or field.
-
item_type
(ArrowSchemaExportable
) –The item data type or field.
-
keys_sorted
(bool
) –Whether the keys are sorted.
Returns:
-
DataType
–The new map data type.
month_day_nano_interval
classmethod
¶
month_day_nano_interval() -> DataType
Create instance of an interval type representing months, days and nanoseconds between two dates.
run_end_encoded
classmethod
¶
run_end_encoded(
run_end_type: ArrowSchemaExportable, value_type: ArrowSchemaExportable
) -> DataType
Create RunEndEncodedType from run-end and value types.
Parameters:
-
run_end_type
(ArrowSchemaExportable
) –The integer type of the run_ends array. Must be 'int16', 'int32', or 'int64'.
-
value_type
(ArrowSchemaExportable
) –The type of the values array.
Returns:
-
DataType
–The new run-end encoded data type.
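To show how the run_ends and values arrays relate, here is a minimal standard-library sketch of run-end encoding. The helpers run_end_encode and run_end_decode are hypothetical names for illustration, not arro3 functions. Each entry of run_ends is the cumulative logical index at which a run stops, which is why it needs an integer type.

```python
from itertools import groupby
from typing import Any, List, Sequence, Tuple


def run_end_encode(values: Sequence[Any]) -> Tuple[List[int], List[Any]]:
    """Collapse consecutive repeats into (run_ends, run_values)."""
    run_ends: List[int] = []
    run_values: List[Any] = []
    position = 0
    for value, group in groupby(values):
        # Advance by the length of this run; store the cumulative end.
        position += sum(1 for _ in group)
        run_ends.append(position)
        run_values.append(value)
    return run_ends, run_values


def run_end_decode(run_ends: Sequence[int], run_values: Sequence[Any]) -> List[Any]:
    """Expand (run_ends, run_values) back into the original sequence."""
    decoded: List[Any] = []
    start = 0
    for end, value in zip(run_ends, run_values):
        decoded.extend([value] * (end - start))
        start = end
    return decoded


ends, vals = run_end_encode(["a", "a", "a", "b", "b", "c"])
print(ends)  # [3, 5, 6]
print(vals)  # ['a', 'b', 'c']
```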
struct
classmethod
¶
struct(fields: Sequence[ArrowSchemaExportable]) -> DataType
Create StructType instance from fields.
A struct is a nested type parameterized by an ordered sequence of types (which can all be distinct), called its fields.
Parameters:
-
fields
(Sequence[ArrowSchemaExportable]
) –Each field must have a UTF8-encoded name, and these field names are part of the type metadata.
Returns:
-
DataType
–The new struct data type.
time32
classmethod
¶
time64
classmethod
¶
timestamp
classmethod
¶
Create instance of timestamp type with resolution and optional time zone.
Parameters:
-
unit
(Literal['s', 'ms', 'us', 'ns']
) –One of 's' [second], 'ms' [millisecond], 'us' [microsecond], or 'ns' [nanosecond].
-
tz
(str | None
, default:None
) –Time zone name. None indicates time zone naive. Defaults to None.
Returns:
-
DataType
–The new timestamp data type.
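The unit determines how finely the stored integer counts time since the UNIX epoch. The sketch below uses only the standard library and assumes a UTC-aware datetime as input. The helper to_ticks and the TICKS_PER_SECOND table are hypothetical names for illustration, not part of arro3.

```python
from datetime import datetime, timezone

# Number of ticks per second for each supported unit.
TICKS_PER_SECOND = {"s": 1, "ms": 1_000, "us": 1_000_000, "ns": 1_000_000_000}

UNIX_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_ticks(moment: datetime, unit: str) -> int:
    """Integer count of `unit` ticks between the UNIX epoch and `moment`."""
    delta = moment - UNIX_EPOCH
    # Work in integer microseconds to avoid floating-point rounding.
    micros = (delta.days * 86_400 + delta.seconds) * 1_000_000 + delta.microseconds
    # Coarser units drop the sub-unit remainder (floor division).
    return micros * TICKS_PER_SECOND[unit] // 1_000_000


moment = datetime(2000, 1, 1, tzinfo=timezone.utc)
print(to_ticks(moment, "s"))   # 946684800
print(to_ticks(moment, "ms"))  # 946684800000
```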