# tableFromIPC(data[, options])
Decode Apache Arrow IPC data and return a new Table. The input binary data may be either an ArrayBuffer or Uint8Array. For Arrow data in the IPC ‘stream’ format, an array of Uint8Array values is also supported.
By default Flechette assumes input data is uncompressed. If input IPC data contains compressed buffers, an appropriate compression codec (for CompressionType.LZ4_FRAME or CompressionType.ZSTD) must be registered ahead of time using the setCompressionCodec method. Otherwise, an error is thrown.
- data (ArrayBuffer | Uint8Array | Uint8Array[]): The source byte buffer, or an array of buffers. If an array, each byte array may contain one or more self-contained messages. Messages may NOT span multiple byte arrays.
- options (ExtractionOptions): Options for controlling how values are transformed when extracted from an Arrow binary representation.
  - useBigInt (boolean): If true, extract 64-bit integers as JavaScript BigInt values. Otherwise (default false), coerce long integers to JavaScript number values, raising an error if the integer cannot be represented as a double precision floating point number.
  - useDate (boolean): If true, extract dates and timestamps as JavaScript Date objects. Otherwise, return numerical timestamp values (default false).
  - useDecimalInt (boolean): If true, extract decimal-type data as scaled integer values, where fractional digits are scaled to integer positions. Returned integers are BigInt values for decimal bit widths of 64 bits or higher and 32-bit integers (as JavaScript number) otherwise. If false, decimals are lossily converted to floating-point numbers (default).
  - useMap (boolean): If true, extract Arrow ‘Map’ values as JavaScript Map instances. Otherwise, return an array of [key, value] pairs compatible with both Map and Object.fromEntries (default false).
  - useProxy (boolean): If true, extract Arrow ‘Struct’ values and table row objects using zero-copy proxy objects that extract data from underlying Arrow batches. The proxy objects can improve performance and reduce memory usage, but do not support property enumeration (Object.keys, Object.values, Object.entries) or spreading ({ ...object }). Otherwise (default false), use standard JS objects for structs and table rows.

```javascript
import { tableFromIPC } from '@uwdata/flechette';

// fetch an Arrow IPC file and decode it into a table
const url = 'https://vega.github.io/vega-datasets/data/flights-200k.arrow';
const ipc = await fetch(url).then(r => r.arrayBuffer());
const table = tableFromIPC(ipc);
```
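The default handling of 64-bit integers described above (coerce to number, raise an error when the value is not representable) can be illustrated in plain JavaScript. This is a minimal sketch and not Flechette's implementation: `toNumber` is a hypothetical helper, and the round-trip comparison is an assumption about how "representable as a double" might be checked.

```javascript
// Hypothetical helper (not part of Flechette): coerce a BigInt to a number,
// throwing if the value does not survive a round trip through a double.
function toNumber(value) {
  const n = Number(value);
  if (BigInt(n) !== value) {
    throw new RangeError(`${value} cannot be represented exactly as a number`);
  }
  return n;
}

const small = toNumber(42n);      // exactly representable
const big = toNumber(2n ** 60n);  // powers of two are exactly representable

let failed = false;
try {
  toNumber(2n ** 53n + 1n);       // rounds when converted to a double
} catch (err) {
  failed = err instanceof RangeError;
}
```

With useBigInt set to true no such coercion is involved, since 64-bit integers are returned as BigInt values.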
# tableToIPC(table[, options])
Encode an Arrow table into Arrow IPC binary format and return the result as a Uint8Array. Both the IPC ‘stream’ and ‘file’ formats are supported. By default Flechette encodes uncompressed data. To perform compression, register a compression codec plugin using setCompressionCodec and then set the codec option.
- table (Table): The Arrow table to encode.
- options (object): Encoding options object.
  - format (string): Arrow 'stream' (the default) or 'file' (a.k.a. Feather V2) format.
  - codec (number): The compression codec type to apply. By default no compression is applied. If specified, this option must be one of the CompressionType values, and a corresponding codec plugin must already be registered using the setCompressionCodec method.

```javascript
import { tableToIPC } from '@uwdata/flechette';

// generate bytes in IPC stream format (the default)
const bytes = tableToIPC(table, { format: 'stream' });
```
```javascript
import { CompressionType, setCompressionCodec, tableToIPC } from '@uwdata/flechette';
import { ZstdCodec } from 'zstd-codec';

// register zstd compression codec
await new Promise((resolve) => {
  ZstdCodec.run((zstd) => {
    const codec = new zstd.Simple();
    setCompressionCodec(CompressionType.ZSTD, {
      encode: (data) => codec.compress(data),
      decode: (data) => codec.decompress(data)
    });
    resolve();
  });
});

// generate bytes in IPC stream format, with zstd compression of buffers
const bytes = tableToIPC(table, { codec: CompressionType.ZSTD });
```
```javascript
import { CompressionType, setCompressionCodec, tableToIPC } from '@uwdata/flechette';
import * as lz4 from 'lz4js';

// register lz4_frame compression codec
setCompressionCodec(CompressionType.LZ4_FRAME, {
  encode: (data) => lz4.compress(data),
  decode: (data) => lz4.decompress(data)
});

// generate bytes in IPC stream format, with lz4_frame compression of buffers
const bytes = tableToIPC(table, { codec: CompressionType.LZ4_FRAME });
```
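The 'stream' and 'file' formats mentioned above can be told apart by their leading bytes: an Arrow IPC file begins with the ASCII magic string "ARROW1", while a stream does not. The sketch below uses a hypothetical helper, `ipcFormat`, which is not part of Flechette and assumes its input is otherwise valid Arrow IPC data.

```javascript
// Magic bytes that open an Arrow IPC file: the ASCII string "ARROW1".
const MAGIC = Uint8Array.of(65, 82, 82, 79, 87, 49);

// Hypothetical helper (not part of Flechette): guess the IPC format of
// bytes that are assumed to be valid Arrow IPC data.
function ipcFormat(bytes) {
  if (bytes.length < MAGIC.length) return 'stream';
  const isFile = MAGIC.every((b, i) => bytes[i] === b);
  return isFile ? 'file' : 'stream';
}

// Synthetic inputs for illustration only; neither is a complete IPC payload.
const fileLike = Uint8Array.of(65, 82, 82, 79, 87, 49, 0, 0);
const streamLike = Uint8Array.of(255, 255, 255, 255, 0, 0, 0, 0);

const fileGuess = ipcFormat(fileLike);     // 'file'
const streamGuess = ipcFormat(streamLike); // 'stream'
```

Applied to the output of tableToIPC, such a check would be expected to report 'file' only when the format option was set to 'file'.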
# tableFromArrays(data[, options])
Create a new table from a set of named arrays. Data types for the resulting Arrow columns can be automatically inferred or specified using the types option. If the types option provides data types for only a subset of columns, the rest are inferred. Each input array must have the same length.
- data (object | array): The input data as a collection of named arrays. If object-valued, the object's keys are column names and the values are arrays or typed arrays. If array-valued, the data should consist of [name, array] pairs in the style of Object.entries.
- options (object): Options for building new tables and controlling how values are transformed when extracted from an Arrow binary representation.
  - types (object): An object mapping column names to data types.
  - maxBatchRows (number): The maximum number of rows to include in a single record batch. If the array lengths exceed this number, the resulting table will consist of multiple record batches.

```javascript
import { tableFromArrays } from '@uwdata/flechette';

// create table with inferred types
const table = tableFromArrays({
  ints: [1, 2, null, 4, 5],
  floats: [1.1, 2.2, 3.3, 4.4, 5.5],
  bools: [true, true, null, false, true],
  strings: ['a', 'b', 'c', 'b', 'a']
});
```
```javascript
import {
  bool, dictionary, float32, int32, tableFromArrays, tableToIPC, utf8
} from '@uwdata/flechette';

// create table with specified types
const table = tableFromArrays({
  ints: [1, 2, null, 4, 5],
  floats: [1.1, 2.2, 3.3, 4.4, 5.5],
  bools: [true, true, null, false, true],
  strings: ['a', 'b', 'c', 'b', 'a']
}, {
  types: {
    ints: int32(),
    floats: float32(),
    bools: bool(),
    strings: dictionary(utf8())
  }
});
```
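Because every input array must have the same length, it can help to check the inputs before building a table. The following is a minimal sketch: `assertSameLength` is a hypothetical helper, not part of Flechette, and it makes no claim about how Flechette itself reacts to mismatched lengths. It accepts either input shape described above (an object or an array of [name, array] pairs).

```javascript
// Hypothetical helper (not part of Flechette): check that all named arrays
// share one length. Accepts an object or an array of [name, array] pairs.
function assertSameLength(data) {
  const entries = Array.isArray(data) ? data : Object.entries(data);
  const [first, ...rest] = entries;
  if (!first) return 0;
  const length = first[1].length;
  for (const [name, array] of rest) {
    if (array.length !== length) {
      throw new Error(
        `Column "${name}" has length ${array.length}, expected ${length}`
      );
    }
  }
  return length;
}

// Plain arrays and typed arrays can be mixed.
const rows = assertSameLength({
  ints: [1, 2, null, 4, 5],
  floats: Float64Array.of(1.1, 2.2, 3.3, 4.4, 5.5)
});
```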
# columnFromArray(array[, type, options])
Create a new column from a provided data array. The data type for the column can be automatically inferred or specified using the type argument.
- array (Array | TypedArray): The input data as an Array or TypedArray.
- type (DataType): The data type for the column. If not specified, type inference is attempted.
- options (object): Options for building new columns and controlling how values are transformed when extracted from an Arrow binary representation.
  - maxBatchRows (number): The maximum number of rows to include in a single record batch. If the array length exceeds this number, the resulting column will consist of multiple record batches.

```javascript
import { columnFromArray, float32, int64 } from '@uwdata/flechette';

// create column with inferred type (here, float64)
columnFromArray([1.1, 2.2, 3.3, 4.4, 5.5]);

// create column with specified type
columnFromArray([1.1, 2.2, 3.3, 4.4, 5.5], float32());

// create column with specified type and options
columnFromArray(
  [1n, 32n, 2n << 34n], int64(),
  { maxBatchRows: 1000, useBigInt: true }
);
```
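The effect of maxBatchRows on batch layout can be worked out with simple arithmetic. The sketch below computes the row ranges that batches of at most maxBatchRows rows would cover. `batchRanges` is a hypothetical helper, not part of Flechette, and its default of Infinity (a single batch) is an assumption made for the sketch.

```javascript
// Hypothetical helper (not part of Flechette): compute [start, end) row
// ranges for batches holding at most maxBatchRows rows each.
function batchRanges(length, maxBatchRows = Infinity) {
  const ranges = [];
  for (let start = 0; start < length; start += maxBatchRows) {
    ranges.push([start, Math.min(start + maxBatchRows, length)]);
  }
  return ranges;
}

const ranges = batchRanges(2500, 1000);
// ranges: [[0, 1000], [1000, 2000], [2000, 2500]]
```

Here 2500 rows with a limit of 1000 yield three batches, the last of which is only partially filled.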
# columnFromValues(values[, type, options])
Create a new column by iterating over provided values. The values argument can either be an iterable collection or a visitor function that applies a callback to successive data values (akin to Array.forEach). The data type for the column can be automatically inferred or specified using the type argument.
- values (Iterable | (callback: (value: any) => void) => void): An iterable object or a visitor function that applies a callback to successive data values (akin to Array.forEach). In most cases, providing a visitor function will be more efficient than an iterator, so it is recommended for larger datasets.
- type (DataType): The data type for the column. If not specified, type inference is attempted.
- options (object): Options for building new columns and controlling how values are transformed when extracted from an Arrow binary representation.
  - maxBatchRows (number): The maximum number of rows to include in a single record batch. If the number of values exceeds this number, the resulting column will consist of multiple record batches.

```javascript
import { columnFromValues, float32, int64 } from '@uwdata/flechette';

// an iterable set of values
const set = new Set([1.1, 1.1, 2.2, 3.4]);

// create column with inferred type (here, float64)
columnFromValues(set);

// create column with specified type
columnFromValues(set, float32());

// create column using only values with odd-numbered indices
const values = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9];
columnFromValues(callback => {
  values.forEach((v, i) => { if (i % 2) callback(v); });
});
```
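The two accepted shapes of the values argument (an iterable or a forEach-style visitor function) can be consumed through one code path. The sketch below shows one way to do that. `visitValues` and `oddIndexed` are hypothetical names, not part of Flechette, and nothing here is claimed to mirror the library's internals.

```javascript
// Hypothetical helper (not part of Flechette): pass each value to callback,
// accepting either an iterable or a forEach-style visitor function.
function visitValues(values, callback) {
  if (typeof values === 'function') {
    values(callback);
  } else {
    for (const value of values) callback(value);
  }
}

// Visitor form: emit only values at odd-numbered indices.
const source = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9];
const fromVisitor = [];
visitValues(
  (callback) => source.forEach((v, i) => { if (i % 2) callback(v); }),
  (v) => fromVisitor.push(v)
);

// Iterable form: a generator yields the same values.
function* oddIndexed(array) {
  for (let i = 1; i < array.length; i += 2) yield array[i];
}
const fromIterable = [];
visitValues(oddIndexed(source), (v) => fromIterable.push(v));
```

Both forms deliver the values 1, 3, 5, 7, 9. The visitor form avoids creating an iterator result object for every value, which is one plausible reason it can be faster on large inputs.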
# tableFromColumns(columns[, useProxy])
Create a new table from a collection of columns. This method is useful for creating new tables using one or more pre-existing column instances. Otherwise, tableFromArrays should be preferred. Input columns are assumed to have the same record batch sizes.
- columns (object | array): The input columns as an object with name keys, or an array of [name, column] pairs.
- useProxy (boolean): Flag indicating if row proxy objects should be used to represent table rows (default false). Typically this should match the value of the useProxy extraction option used for column generation.

```javascript
import { columnFromArray, tableFromColumns } from '@uwdata/flechette';

// create table from columns with inferred types (here, bool and float64)
const table = tableFromColumns({
  bools: columnFromArray([true, true, null, false, true]),
  floats: columnFromArray([1.1, 2.2, 3.3, 4.4, 5.5])
});
```
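Row proxy objects trade convenience for speed: values are read from the underlying columns on access, but the objects do not support property enumeration or spreading. The sketch below shows one way such a proxy can be built and why enumeration comes back empty. It is an illustration under assumptions, not Flechette's implementation: `rowProxyClass` is a hypothetical helper and plain arrays stand in for Arrow columns.

```javascript
// Hypothetical sketch (not Flechette's implementation): a row proxy class
// whose column accessors are getters on the prototype, reading from column
// arrays on demand rather than copying values into each row object.
function rowProxyClass(columns) {
  class Row {
    constructor(index) {
      // Non-enumerable so the index does not show up as a data property.
      Object.defineProperty(this, 'index', { value: index });
    }
  }
  for (const name of Object.keys(columns)) {
    Object.defineProperty(Row.prototype, name, {
      get() { return columns[name][this.index]; }
    });
  }
  return Row;
}

const Row = rowProxyClass({ bools: [true, false], floats: [1.1, 2.2] });
const row = new Row(1);

const value = row.floats;       // 2.2, read from the column on access
const keys = Object.keys(row);  // [] since the getters are not own properties
const copy = { ...row };        // {} for the same reason
```

Code that needs Object.keys or spreading on rows would therefore leave useProxy at its default of false.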
# setCompressionCodec(type, codec)
Register a compression codec for compressing or decompressing Arrow buffer data. Flechette does not include compression codecs by default, but can be extended via plugins to handle compressed data. If an appropriate codec implementation is not registered, an error is thrown when attempting to compress or decompress data.
- type (number): The codec type id, one of the values of the CompressionType object.
- codec (object): The codec implementation as an object with encode (to compress) and decode (to decompress) functions, each of which takes a Uint8Array as input and returns a Uint8Array as output.

```javascript
import { CompressionType, setCompressionCodec } from '@uwdata/flechette';
import { ZstdCodec } from 'zstd-codec';

// register zstd compression codec
await new Promise((resolve) => {
  ZstdCodec.run((zstd) => {
    const codec = new zstd.Simple();
    setCompressionCodec(CompressionType.ZSTD, {
      encode: (data) => codec.compress(data),
      decode: (data) => codec.decompress(data)
    });
    resolve();
  });
});
```
```javascript
import { CompressionType, setCompressionCodec } from '@uwdata/flechette';
import * as lz4 from 'lz4js';

// register lz4_frame compression codec
setCompressionCodec(CompressionType.LZ4_FRAME, {
  encode: (data) => lz4.compress(data),
  decode: (data) => lz4.decompress(data)
});
```
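Since a codec is just an object with encode and decode functions over Uint8Array values, its contract can be sanity-checked before registration. The following is a minimal sketch: `checkCodec` is a hypothetical helper, not part of Flechette, and the pass-through codec exists only to exercise the check. It performs no compression and would not produce valid LZ4_FRAME or ZSTD buffers.

```javascript
// Hypothetical helper (not part of Flechette): verify that a codec object
// has encode and decode functions that return Uint8Arrays and that
// decode(encode(data)) reproduces the input bytes.
function checkCodec(codec, sample = Uint8Array.of(1, 2, 3, 4, 5)) {
  if (typeof codec?.encode !== 'function' || typeof codec?.decode !== 'function') {
    throw new TypeError('codec must provide encode and decode functions');
  }
  const encoded = codec.encode(sample);
  const decoded = codec.decode(encoded);
  if (!(encoded instanceof Uint8Array) || !(decoded instanceof Uint8Array)) {
    throw new TypeError('encode and decode must return Uint8Array values');
  }
  const same = decoded.length === sample.length
    && decoded.every((byte, i) => byte === sample[i]);
  if (!same) throw new Error('decode(encode(data)) did not reproduce data');
  return codec;
}

// Toy pass-through codec for illustration only. It performs no compression
// and is NOT a valid LZ4_FRAME or ZSTD codec.
const passThrough = {
  encode: (data) => data.slice(),
  decode: (data) => data.slice()
};
const checked = checkCodec(passThrough);
```

Because the helper returns the codec it was given, a real codec could be wrapped inline when calling setCompressionCodec, so that a broken implementation fails at registration time rather than during encoding or decoding.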