All functions

ArrayData

ArrayData class

Buffer

Buffer class

ChunkedArray

ChunkedArray class

Codec

Compression Codec class

CsvFileFormat

CSV dataset file format

CsvReadOptions CsvWriteOptions CsvParseOptions TimestampParser CsvConvertOptions JsonReadOptions JsonParseOptions

File reader options

CsvTableReader JsonTableReader

Arrow CSV and JSON table reader classes

DataType

DataType class

Dataset FileSystemDataset UnionDataset InMemoryDataset DatasetFactory FileSystemDatasetFactory

Multi-file datasets

DictionaryType

DictionaryType class

Expression

Arrow expressions

ExtensionArray

ExtensionArray class

ExtensionType

ExtensionType class

FeatherReader

FeatherReader class

Field

Field class

field()

Create a Field

FileFormat ParquetFileFormat IpcFileFormat

Dataset file formats

FileInfo

FileSystem entry info

FileSelector

File selector

FileSystem LocalFileSystem S3FileSystem GcsFileSystem SubTreeFileSystem

FileSystem classes

FileWriteOptions

Format-specific write options

FixedWidthType

FixedWidthType class

FragmentScanOptions CsvFragmentScanOptions ParquetFragmentScanOptions JsonFragmentScanOptions

Format-specific scan options

InputStream RandomAccessFile MemoryMappedFile ReadableFile BufferReader

InputStream classes

JsonFileFormat

JSON dataset file format

Message

Message class

MessageReader

MessageReader class

OutputStream FileOutputStream BufferOutputStream

OutputStream classes

ParquetArrowReaderProperties

ParquetArrowReaderProperties class

ParquetFileReader

ParquetFileReader class

ParquetFileWriter

ParquetFileWriter class

ParquetReaderProperties

ParquetReaderProperties class

ParquetWriterProperties

ParquetWriterProperties class

Partitioning DirectoryPartitioning HivePartitioning DirectoryPartitioningFactory HivePartitioningFactory

Define Partitioning for a Dataset

RecordBatch

RecordBatch class

RecordBatchReader RecordBatchStreamReader RecordBatchFileReader

RecordBatchReader classes

RecordBatchWriter RecordBatchStreamWriter RecordBatchFileWriter

RecordBatchWriter classes

Scalar

Arrow scalars

Scanner ScannerBuilder

Scan the contents of a dataset

Schema

Schema class

Table

Table class

acero arrow-functions arrow-verbs arrow-dplyr

Functions available in Arrow dplyr queries

Array DictionaryArray StructArray ListArray LargeListArray FixedSizeListArray MapArray

Array classes

arrow_array()

Create an Arrow Array

arrow_info() arrow_available() arrow_with_acero() arrow_with_dataset() arrow_with_substrait() arrow_with_parquet() arrow_with_s3() arrow_with_gcs() arrow_with_json()

Report information on the package's capabilities

as_arrow_array()

Convert an object to an Arrow Array

as_arrow_table()

Convert an object to an Arrow Table

as_chunked_array()

Convert an object to an Arrow ChunkedArray

as_data_type()

Convert an object to an Arrow DataType

as_record_batch()

Convert an object to an Arrow RecordBatch

as_record_batch_reader()

Convert an object to an Arrow RecordBatchReader

as_schema()

Convert an object to an Arrow Schema

buffer()

Create a Buffer

call_function()

Call an Arrow compute function

chunked_array()

Create a ChunkedArray

codec_is_available()

Check whether a compression codec is available

compression CompressedOutputStream CompressedInputStream

Compressed stream classes

concat_arrays() c(<Array>)

Concatenate zero or more Arrays

concat_tables()

Concatenate one or more Tables

copy_files()

Copy files between FileSystems

cpu_count() set_cpu_count()

Manage the global CPU thread pool in libarrow

create_package_with_all_dependencies()

Create a source bundle that includes all third-party dependencies

csv_convert_options()

CSV convert options

csv_parse_options()

CSV parsing options

csv_read_options()

CSV reading options

csv_write_options()

CSV writing options

int8() int16() int32() int64() uint8() uint16() uint32() uint64() float16() halffloat() float32() float() float64() boolean() bool() utf8() large_utf8() binary() large_binary() fixed_size_binary() string() date32() date64() time32() time64() duration() null() timestamp() decimal() decimal128() decimal256() struct() list_of() large_list_of() fixed_size_list_of() map_of()

Create Arrow data types

dataset_factory()

Create a DatasetFactory

dictionary()

Create a dictionary type

flight_connect()

Connect to a Flight server

flight_disconnect()

Explicitly close a Flight client

flight_get()

Get data from a Flight server

flight_put()

Send data to a Flight server

gs_bucket()

Connect to a Google Cloud Storage (GCS) bucket

hive_partition()

Construct Hive partitioning

infer_schema()

Extract a schema from an object

infer_type() type()

Infer the Arrow Array type from an R object

install_arrow()

Install or upgrade the Arrow library

install_pyarrow()

Install pyarrow for use with reticulate

io_thread_count() set_io_thread_count()

Manage the global I/O thread pool in libarrow

list_compute_functions()

List available Arrow C++ compute functions

list_flights() flight_path_exists()

See available resources on a Flight server

load_flight_server()

Load a Python Flight server

map_batches()

Apply a function to a stream of RecordBatches

match_arrow() is_in()

Value matching for Arrow objects

mmap_create()

Create a new read/write memory mapped file of a given size

mmap_open()

Open a memory mapped file

new_extension_type() new_extension_array() register_extension_type() reregister_extension_type() unregister_extension_type()

Extension types

open_dataset()

Open a multi-file dataset

open_delim_dataset() open_csv_dataset() open_tsv_dataset()

Open a multi-file dataset of CSV or other delimiter-separated format

read_delim_arrow() read_csv_arrow() read_csv2_arrow() read_tsv_arrow()

Read a CSV or other delimited file with Arrow

read_feather() read_ipc_file()

Read a Feather file (an Arrow IPC file)

read_ipc_stream()

Read Arrow IPC stream format

read_json_arrow()

Read a JSON file

read_message()

Read a Message from a stream

read_parquet()

Read a Parquet file

read_schema()

Read a Schema from a stream

record_batch()

Create a RecordBatch

register_scalar_function()

Register user-defined functions

s3_bucket()

Connect to an AWS S3 bucket

scalar()

Create an Arrow Scalar

schema()

Create a schema or extract one from an object

show_exec_plan()

Show the details of an Arrow Execution Plan

arrow_table()

Create an Arrow Table

to_arrow()

Create an Arrow object from a DuckDB connection

to_duckdb()

Create a (virtual) DuckDB table from an Arrow object

unify_schemas()

Combine and harmonize schemas

value_counts()

Tabulate value counts for Arrow objects, like base R table()

vctrs_extension_array() vctrs_extension_type()

Extension type for generic typed vectors

write_csv_arrow()

Write a CSV file to disk

write_dataset()

Write a dataset

write_delim_dataset() write_csv_dataset() write_tsv_dataset()

Write a dataset into partitioned flat files

write_feather() write_ipc_file()

Write a Feather file (an Arrow IPC file)

write_ipc_stream()

Write Arrow IPC stream format

write_parquet()

Write a Parquet file to disk

write_to_raw()

Write Arrow data to a raw vector
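
As a rough illustration of how a few of the functions indexed above fit together, the sketch below builds a small Table with arrow_table(), writes it with write_parquet(), and reads it back with read_parquet(). The column names and the temporary file path are made up for the example, and it assumes an arrow build with Parquet support (arrow_with_parquet() reports whether that is the case).

```r
library(arrow)

# Hypothetical example data; the column names are illustrative only.
tbl <- arrow_table(id = 1:3, label = c("a", "b", "c"))

# Write the Table to a temporary Parquet file.
path <- tempfile(fileext = ".parquet")
write_parquet(tbl, path)

# read_parquet() returns a data frame by default (as_data_frame = TRUE).
df <- read_parquet(path)

# Read the file again as an Arrow Table to inspect its schema.
file_schema <- read_parquet(path, as_data_frame = FALSE)$schema
print(file_schema)
```

Passing as_data_frame = FALSE keeps the result as an Arrow Table, which may be preferable when the data will be handed to other Arrow functions rather than used directly in R.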