JSON

arro3.io.infer_json_schema

infer_json_schema(
    file: IO[bytes] | Path | str, *, max_records: int | None = None
) -> Schema

Infer the schema of a JSON file by reading records from the start of the input. max_records caps how many records are read; if it is None, the entire file is read.

Parameters:

  • file (IO[bytes] | Path | str) –

    The input JSON path or buffer.

  • max_records (int | None, default: None ) –

    The maximum number of records to read when inferring the schema. If not provided, the entire file is read to deduce field types.

Returns:

  • Schema

    The inferred Arrow schema.
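
The effect of max_records can be sketched in plain Python. This is a toy model using only the standard library, not arro3's implementation: the helper `sketch_infer_fields` is hypothetical, it assumes newline-delimited input, and it records Python type names rather than Arrow data types. It only illustrates why a small sample can miss fields that first appear later in the file.

```python
import io
import json
from itertools import islice


def sketch_infer_fields(buf, max_records=None):
    """Toy model of sampling-based inference (hypothetical helper)."""
    fields = {}
    # islice(buf, None) walks every line, mirroring max_records=None.
    for line in islice(buf, max_records):
        if not line.strip():
            continue
        for key, value in json.loads(line).items():
            fields.setdefault(key, set()).add(type(value).__name__)
    return fields


data = b'{"a": 1}\n{"a": 2, "b": "x"}\n'

# Only the first record is sampled, so field "b" is never seen.
sampled = sketch_infer_fields(io.BytesIO(data), max_records=1)

# The whole input is read, so both fields are found.
full = sketch_infer_fields(io.BytesIO(data))
```

A larger max_records (or None) trades read time for a more complete schema.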

arro3.io.read_json

read_json(
    file: IO[bytes] | Path | str,
    schema: ArrowSchemaExportable,
    *,
    batch_size: int | None = None
) -> RecordBatchReader

Read JSON data with a known schema into Arrow.

Parameters:

  • file (IO[bytes] | Path | str) –

    The JSON file or buffer to read from.

  • schema (ArrowSchemaExportable) –

    The Arrow schema representing the JSON data.

  • batch_size (int | None, default: None ) –

    Set the batch size (number of records to load at one time). Defaults to None.
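
As a rough illustration of what a batch size means, the following standard-library sketch groups records into fixed-size batches. It is not how arro3 reads data: `sketch_read_batches` is a hypothetical helper, it assumes newline-delimited input, and it yields Python lists rather than Arrow record batches.

```python
import io
import json
from itertools import islice


def sketch_read_batches(buf, batch_size):
    """Yield lists of at most batch_size decoded records (hypothetical helper)."""
    lines = (line for line in buf if line.strip())
    while True:
        batch = [json.loads(line) for line in islice(lines, batch_size)]
        if not batch:
            return
        yield batch


data = b'{"a": 1}\n{"a": 2}\n{"a": 3}\n{"a": 4}\n{"a": 5}\n'

# Five records with a batch size of two: the last batch is short.
sizes = [len(batch) for batch in sketch_read_batches(io.BytesIO(data), batch_size=2)]
```

Smaller batches lower peak memory use at the cost of more batches to iterate over.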

Returns:

  • RecordBatchReader

    A RecordBatchReader over the parsed JSON data.

arro3.io.write_json

write_json(
    data: ArrowStreamExportable | ArrowArrayExportable,
    file: IO[bytes] | Path | str,
    *,
    explicit_nulls: bool | None = None
) -> None

Write Arrow data to JSON.

By default, the writer omits keys with null values, for backward compatibility.

Parameters:

  • data (ArrowStreamExportable | ArrowArrayExportable) –

    The Arrow Table, RecordBatchReader, or RecordBatch to write.

  • file (IO[bytes] | Path | str) –

    The output file or buffer to write to.

  • explicit_nulls (bool | None, default: None ) –

    Whether to write keys with null values (True) or omit them (False). Defaults to omitting them.
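
The two null-handling modes can be pictured with a small standard-library sketch. This is not arro3 code: `sketch_encode` is a hypothetical helper that works on one Python dict, and the exact whitespace of the real writer's output is not specified here.

```python
import json


def sketch_encode(record, explicit_nulls=False):
    """Encode one record, optionally dropping null-valued keys (hypothetical helper)."""
    if not explicit_nulls:
        # Default behaviour described above: keys with null values are omitted.
        record = {key: value for key, value in record.items() if value is not None}
    return json.dumps(record, separators=(",", ":"))


row = {"id": 1, "name": None}

skipped = sketch_encode(row)                    # '{"id":1}'
kept = sketch_encode(row, explicit_nulls=True)  # '{"id":1,"name":null}'
```

Keeping explicit nulls is useful when a downstream consumer needs every record to carry the same set of keys.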

arro3.io.write_ndjson

write_ndjson(
    data: ArrowStreamExportable | ArrowArrayExportable,
    file: IO[bytes] | Path | str,
    *,
    explicit_nulls: bool | None = None
) -> None

Write Arrow data to newline-delimited JSON.

By default, the writer omits keys with null values, for backward compatibility.

Parameters:

  • data (ArrowStreamExportable | ArrowArrayExportable) –

    The Arrow Table, RecordBatchReader, or RecordBatch to write.

  • file (IO[bytes] | Path | str) –

    The output file or buffer to write to.

  • explicit_nulls (bool | None, default: None ) –

    Whether to write keys with null values (True) or omit them (False). Defaults to omitting them.
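
For readers unfamiliar with the format, newline-delimited JSON stores one JSON object per line. The standard-library sketch below shows that layout. It is not arro3's writer: `sketch_write_ndjson` is a hypothetical helper operating on Python dicts, and the real output's spacing may differ.

```python
import io
import json


def sketch_write_ndjson(records, out):
    """Write each record as one compact JSON object per line (hypothetical helper)."""
    for record in records:
        out.write(json.dumps(record, separators=(",", ":")).encode("utf-8"))
        out.write(b"\n")


buf = io.BytesIO()
sketch_write_ndjson([{"id": 1}, {"id": 2}], buf)

# Each line is a complete JSON document and can be decoded on its own.
lines = buf.getvalue().splitlines()
decoded = [json.loads(line) for line in lines]
```

Because each line stands alone, such files can be appended to and processed as a stream without parsing the whole file first.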