arquero

Arquero API Reference

Top-Level Table Verbs Op Functions Expressions Extensibility


Standard Functions

Standard library of table expression functions. The op object exports these as standard JavaScript functions that behave the same whether invoked inside or outside a table expression context.

Array Functions


# op.compact(array) · Source

Returns a new compacted array with invalid values (null, undefined, NaN) removed.

Examples

op.compact([1, null, 2, undefined, NaN, 3]) // [ 1, 2, 3 ]

# op.concat(…values) · Source

Merges two or more arrays in sequence, returning a new array.


# op.includes(array, value[, index]) · Source

Determines whether an array includes a certain value among its entries, returning true or false as appropriate.


# op.indexof(sequence, value) · Source

Returns the first index at which a given value can be found in the sequence (array or string), or -1 if it is not present.


# op.join(array[, delimiter]) · Source

Creates and returns a new string by concatenating all of the elements in an array (or an array-like object), separated by commas or a specified delimiter string. If the array has only one item, then that item will be returned without using the delimiter.


# op.lastindexof(sequence, value) · Source

Returns the last index at which a given value can be found in the sequence (array or string), or -1 if it is not present.


# op.length(sequence) · Source

Returns the length of the input sequence (array or string).


# op.pluck(array, property) · Source

Returns a new array in which the given property has been extracted for each element in the input array.


# op.reverse(sequence) · Source

Returns a new array or string with the element order reversed: the first sequence element becomes the last, and the last sequence element becomes the first. The input sequence is unchanged.


# op.sequence([start,] stop[, step]) · Source

Returns an array containing an arithmetic sequence from the start value to the stop value, in step increments. If step is positive, the last element is the largest start + i * step less than stop; if step is negative, the last element is the smallest start + i * step greater than stop. If the returned array would contain an infinite number of values, an empty range is returned.


# op.slice(sequence[, start, end]) · Source

Returns a copy of a portion of the input sequence (array or string) selected from start to end (end not included) where start and end represent the index of items in the sequence.


Date Functions


# op.now() · Source

Returns the current time as the number of milliseconds elapsed since January 1, 1970 00:00:00 UTC.


# op.timestamp(date) · Source

Returns the timestamp for a date as the number of milliseconds elapsed since January 1, 1970 00:00:00 UTC.


# op.datetime([year, month, date, hours, minutes, seconds, milliseconds]) · Source

Creates and returns a new Date value. If no arguments are provided, the current date and time are used.


# op.year(date) · Source

Returns the year of the specified date according to local time.


# op.quarter(date) · Source

Returns the zero-based quarter of the specified date according to local time.


# op.month(date) · Source

Returns the zero-based month of the specified date according to local time. A value of 0 indicates January, 1 indicates February, and so on.


# op.week(date[, firstday]) · Source

Returns the week number of the year (0-53) for the specified date according to local time. By default, Sunday is used as the first day of the week. All days in a new year preceding the first Sunday are considered to be in week 0.


# op.date(date) · Source

Returns the date (day of month) of the specified date according to local time.


# op.dayofyear(date) · Source

Returns the day of the year (1-366) of the specified date according to local time.


# op.dayofweek(date) · Source

Returns the Sunday-based day of the week (0-6) of the specified date according to local time. A value of 0 indicates Sunday, 1 indicates Monday, and so on.


# op.hours(date) · Source

Returns the hour of the day for the specified date according to local time.


# op.minutes(date) · Source

Returns the minute of the hour for the specified date according to local time.


# op.seconds(date) · Source

Returns the seconds of the minute for the specified date according to local time.


# op.milliseconds(date) · Source

Returns the milliseconds of the second for the specified date according to local time.


# op.utcdatetime([year, month, date, hours, minutes, seconds, milliseconds]) · Source

Creates and returns a new Date value using Coordinated Universal Time (UTC). If no arguments are provided, the current date and time are used.


# op.utcyear(date) · Source

Returns the year of the specified date according to Coordinated Universal Time (UTC).


# op.utcquarter(date) · Source

Returns the zero-based quarter of the specified date according to Coordinated Universal Time (UTC).


# op.utcmonth(date) · Source

Returns the zero-based month of the specified date according to Coordinated Universal Time (UTC). A value of 0 indicates January, 1 indicates February, and so on.


# op.utcweek(date[, firstday]) · Source

Returns the week number of the year (0-53) for the specified date according to Coordinated Universal Time (UTC). By default, Sunday is used as the first day of the week. All days in a new year preceding the first Sunday are considered to be in week 0.


# op.utcdate(date) · Source

Returns the date (day of month) of the specified date according to Coordinated Universal Time (UTC).


# op.utcdayofyear(date) · Source

Returns the day of the year (1-366) of the specified date according to Coordinated Universal Time (UTC).


# op.utcdayofweek(date) · Source

Returns the Sunday-based day of the week (0-6) of the specified date according to Coordinated Universal Time (UTC). A value of 0 indicates Sunday, 1 indicates Monday, and so on.


# op.utchours(date) · Source

Returns the hour of the day for the specified date according to Coordinated Universal Time (UTC).


# op.utcminutes(date) · Source

Returns the minute of the hour for the specified date according to Coordinated Universal Time (UTC).


# op.utcseconds(date) · Source

Returns the seconds of the minute for the specified date according to Coordinated Universal Time (UTC).


# op.utcmilliseconds(date) · Source

Returns the milliseconds of the second for the specified date according to Coordinated Universal Time (UTC).


# op.format_date(date[, shorten]) · Source

Returns an ISO 8601 formatted string for the given date in local timezone. The resulting string is compatible with parse_date and JavaScript’s built-in Date.parse.


# op.format_utcdate(date[, shorten]) · Source

Returns an ISO 8601 formatted string for the given date in Coordinated Universal Time (UTC). The resulting string is compatible with parse_date and JavaScript’s built-in Date.parse.


JSON Functions

Functions for parsing and generating strings formatted using JavaScript Object Notation (JSON).


# op.parse_json(value) · Source

Parses a string value in JSON format, constructing the JavaScript value or object described by the string.


# op.to_json(value) · Source

Converts a JavaScript object or value to a JSON string.


Math Functions


# op.bin(value, min, max, step[, offset]) · Source

Truncate a value to a bin boundary. Useful for creating equal-width histograms. Values outside the [min, max] range will be mapped to -Infinity (value < min) or +Infinity (value > max).


# op.random() · Source

Return a random floating point number between 0 (inclusive) and 1 (exclusive). By default uses Math.random. Use the seed method to instead use a seeded random number generator.


# op.is_nan(value) · Source

Tests if the input value is not a number (NaN); equivalent to Number.isNaN. The method will return true only if the input value is an actual numeric NaN value; it will return false for other types (booleans, strings, etc.).

Examples

op.is_nan(NaN) // true
op.is_nan(0/0) // true
op.is_nan(op.sqrt(-1)) // true
op.is_nan('foo') // false
op.is_nan(+'foo') // true, coerce to number first
op.is_nan(true) // false
op.is_nan(+true) // false, booleans coerce to numbers
op.is_nan(undefined) // false
op.is_nan(+undefined) // true, coerce to number first
op.is_nan(null) // false
op.is_nan(+null) // false, null coerces to zero

# op.is_finite(value) · Source

Tests if the input value is finite; equivalent to Number.isFinite.


# op.abs(value) · Source

Returns the absolute value of the input value; equivalent to Math.abs.


# op.cbrt(value) · Source

Returns the cube root value of the input value; equivalent to Math.cbrt.


# op.ceil(value) · Source

Returns the ceiling of the input value, the nearest integer equal to or greater than the input; equivalent to Math.ceil.


# op.clz32(value) · Source

Returns the number of leading zero bits in the 32-bit binary representation of a number value; equivalent to Math.clz32.


# op.exp(value) · Source

Returns evalue, where e is Euler’s number, the base of the natural logarithm; equivalent to Math.exp.


# op.expm1(value) · Source

Returns evalue - 1, where e is Euler’s number, the base of the natural logarithm; equivalent to Math.expm1.


# op.floor(value) · Source

Returns the floor of the input value, the nearest integer equal to or less than the input; equivalent to Math.floor.


# op.fround(value) · Source

Returns the nearest 32-bit single precision float representation of the input number value; equivalent to Math.fround. Useful for translating between 64-bit Number values and values from a Float32Array.


# op.greatest(…values) · Source

Returns the greatest (maximum) value among the input values; equivalent to Math.max. This is not an aggregate function, see op.max to compute a maximum value across multiple rows.


# op.least(…values) · Source

Returns the least (minimum) value among the input values; equivalent to Math.min. This is not an aggregate function, see op.min to compute a minimum value across multiple rows.


# op.log(value) · Source

Returns the natural logarithm (base e) of a number value; equivalent to Math.log.


# op.log10(value) · Source

Returns the base 10 logarithm of a number value; equivalent to Math.log10.


# op.log1p(value) · Source

Returns the natural logarithm (base e) of 1 + a number value; equivalent to Math.log1p.


# op.log2(value) · Source

Returns the base 2 logarithm of a number value; equivalent to Math.log2.


# op.pow(base, exponent) · Source

Returns the base raised to the exponent power, that is, baseexponent; equivalent to Math.pow.


# op.round(value) · Source

Returns the value of a number rounded to the nearest integer; ; equivalent to Math.round.


# op.sign(value) · Source

Returns either a positive or negative +/- 1, indicating the sign of the input value; equivalent to Math.sign.


# op.sqrt(value) · Source

Returns the square root of the input value; equivalent to Math.sqrt.


# op.trunc(value) · Source

Returns the integer part of a number by removing any fractional digits; equivalent to Math.trunc.


# op.degrees(radians) · Source

Converts the input radians value to degrees.


# op.radians(radians) · Source

Converts the input degrees value to radians.


# op.acos(value) · Source

Returns the arc-cosine (in radians) of a number value; equivalent to Math.acos.


# op.acosh(value) · Source

Returns the hyperbolic arc-cosine of a number value; equivalent to Math.acosh.


# op.asin(value) · Source

Returns the arc-sine (in radians) of a number value; equivalent to Math.asin.


# op.asinh(value) · Source

Returns the hyperbolic arc-sine of a number value; equivalent to Math.asinh.


# op.atan(value) · Source

Returns the arc-tangent (in radians) of a number value; equivalent to Math.atan.


# op.atan2(y, x) · Source

Returns the angle in the plane (in radians) between the positive x-axis and the ray from (0, 0) to the point (x, y); ; equivalent to Math.atan2.


# op.atanh(value) · Source

Returns the hyperbolic arc-tangent of a number value; equivalent to Math.atanh.


# op.cos(value) · Source

Returns the cosine (in radians) of a number value; equivalent to Math.cos.


# op.cosh(value) · Source

Returns the hyperbolic cosine of a number value; equivalent to Math.cosh.


# op.sin(value) · Source

Returns the sine (in radians) of a number value; equivalent to Math.sin.


# op.sinh(value) · Source

Returns the hyperbolic sine of a number value; equivalent to Math.sinh.


# op.tan(value) · Source

Returns the tangent (in radians) of a number value; equivalent to Math.tan.


# op.tanh(value) · Source

Returns the hyperbolic tangent of a number value; equivalent to Math.tanh.


Object Functions


# op.equal(a, b) · Source

Compare two values for equality, using join semantics in which null !== null. If the inputs are object-valued, a deep equality check of array entries or object key-value pairs is performed. The method is helpful within custom join condition expressions.


# op.has(object, key) · Source

Returns a boolean indicating whether the object has the specified key as its own property (as opposed to inheriting it). If the object is a Map or Set instance, the has method will be invoked directly on the object, otherwise Object.hasOwnProperty is used.


# op.keys(object) · Source

Returns an array of a given object’s own enumerable property names. If the object is a Map instance, the keys method will be invoked directly on the object, otherwise Object.keys is used.


# op.values(object) · Source

Returns an array of a given object’s own enumerable property values. If the object is a Map or Set instance, the values method will be invoked directly on the object, otherwise Object.values is used.


# op.entries(object) · Source

Returns an array of a given object’s own enumerable string-keyed property [key, value] pairs. If the object is a Map or Set instance, the entries method will be invoked directly on the object, otherwise Object.entries is used.


# op.object(entries) · Source

Returns a new object given an iterable entries argument of [key, value] pairs. This method is Arquero’s version of the standard Object.fromEntries method.


# op.recode(value, map[, fallback]) · Source

Recodes an input value to an alternative value, based on a provided value map object. If a fallback value is specified, it will be returned when the input value is not found in the map; otherwise, the input value is returned unchanged.

Examples

// recode values in a derive statement
table.derive({ val: d => op.recode(d.val, { 'opt:a': 'A', 'opt:b': 'B' }) })
// define value map externally, bind as parameter
const map = { 'opt:a': 'A', 'opt:b': 'B' };
table
  .params({ map })
  .derive({ val: (d, $) => op.recode(d.val, $.map, '?') })
// using a Map object, bind as parameter
const map = new Map().set('opt:a', 'A').set('opt:b', 'B');
table
  .params({ map })
  .derive({ val: (d, $) => op.recode(d.val, $.map, '?') })

# op.row_object([…columns]) · Source

Generate a new object containing the data for the current table row. The new object maps from column name keys to table values for the current row. The optional columns list indicates which columns to include in the object; if unspecified, all columns are included by default.

This method can only be invoked within a single-table expression. Calling this method in a multi-table expression (such as for a join) results in an error. An error will also result if any provided column names are specified using dynamic lookups of table column values.

Examples

aq.table({ a: [1, 3], b: [2, 4] })
  .derive({ row: op.row_object() })
  .get('row', 0); // { a: 1, b: 2 }
// rollup a table into an array of row objects
table.rollup({ rows: d => op.array_agg(op.row_object()) })


String Functions


# op.parse_date(value) · Source

Parses a string value and returns a Date instance. Beware: this method uses JavaScript’s Date.parse() functionality, which is inconsistently implemented across browsers. That said, ISO 8601 formatted strings such as those produced by format_date and format_utcdate should be supported across platforms. Note that “bare” ISO date strings such as "2001-01-01" are interpreted by JavaScript as indicating midnight of that day in Coordinated Universal Time (UTC), not local time. To indicate the local timezone, an ISO string can include additional time components and no Z suffix: "2001-01-01T00:00".


# op.parse_float(value) · Source

Parses a string value and returns a floating point number.


# op.parse_int(value[, radix]) · Source

Parses a string value and returns an integer of the specified radix (the base in mathematical numeral systems).


# op.endswith(value, search[, length]) · Source

Determines whether a string value ends with the characters of a specified search string, returning true or false as appropriate.


# op.match(value, regexp[, index]) · Source

Retrieves the result of matching a string value against a regular expression regexp. If no index is specified, returns an array whose contents depend on the presence or absence of the regular expression global (g) flag, or null if no matches are found. If the g flag is used, all results matching the complete regular expression will be returned, but capturing groups will not. If the g flag is not used, only the first complete match and its related capturing groups are returned.

If specified, the index looks up a value of the resulting match. If index is a number, the corresponding index of the result array is returned. If index is a string, the value of the corresponding named capture group is returned, or null if there is no such group.

Examples

// returns ['1', '2', '3']
op.match('1 2 3', /\d+/g)
// returns '2' (index into match array)
op.match('1 2 3', /\d+/g, 1)
// returns '3' (index of capture group)
op.match('1 2 3', /\d+ \d+ (\d+)/, 1)
// returns '2' (named capture group)
op.match('1 2 3', /\d+ (?<digit>\d+)/, 'digit')

# op.normalize(value[, form]) · Source

Returns the Unicode normalization form of the string value.


# op.padend(value, length[, fill]) · Source

Pad a string value with a given fill string (applied from the end of value and repeated, if needed) so that the resulting string reaches a given length.


# op.padstart(value, length[, fill]) · Source

Pad a string value with a given fill string (applied from the start of value and repeated, if needed) so that the resulting string reaches a given length.


# op.lower(value) · Source

Returns the string value converted to lower case.


# op.upper(value) · Source

Returns the string value converted to upper case.


# op.repeat(value, number) · Source

Returns a new string which contains the specified number of copies of the value string concatenated together.


# op.replace(value, pattern, replacement) · Source

Returns a new string with some or all matches of a pattern replaced by a replacement. The pattern can be a string or a regular expression, and the replacement must be a string. If pattern is a string, only the first occurrence will be replaced; to make multiple replacements, use a regular expression pattern with a g (global) flag.


# op.split(value, separator[, limit]) · Source

Divides a string value into an ordered list of substrings based on a separator pattern, puts these substrings into an array, and returns the array.


# op.startswith(value, search[, position]) · Source

Determines whether a string value starts with the characters of a specified search string, returning true or false as appropriate.


# op.substring(value[, start, end]) · Source

Returns the part of the string value between the start and end indexes, or to the end of the string.


# op.trim(value) · Source

Returns a new string with whitespace removed from both ends of the input value string. Whitespace in this context is all the whitespace characters (space, tab, no-break space, etc.) and all the line terminator characters (LF, CR, etc.).


Aggregate Functions

Aggregate table expression functions for summarizing values. If invoked outside a table expression context, column (field) inputs must be column name strings, and the operator will return a corresponding table expression.


# op.any(field) · Source

Aggregate function returning an arbitrary observed value (typically the first encountered).


# op.bins(field[, maxbins, nice, minstep, step]) · Source

Aggregate function for calculating a binning scheme in terms of the minimum bin boundary, maximum bin boundary, and step size.


# op.count() · Source

Aggregate function to count the number of records (rows).


# op.distinct(field) · Source

Aggregate function to count the number of distinct values.


# op.valid(field) · Source

Aggregate function to count the number of valid values. Invalid values are null, undefined, or NaN.


# op.invalid(field) · Source

Aggregate function to count the number of invalid values. Invalid values are null, undefined, or NaN.


# op.max(field) · Source

Aggregate function for the maximum value. For a non-aggregate version, see op.greatest.


# op.min(field) · Source

Aggregate function for the minimum value. For a non-aggregate version, see op.least.


# op.sum(field) · Source

Aggregate function to sum values.


# op.product(field) · Source

Aggregate function to multiply values.


# op.mean(field) · Source

Aggregate function for the mean (average) value. This operator is a synonym for average.


# op.average(field) · Source

Aggregate function for the average (mean) value. This operator is a synonym for mean.


# op.mode(field) · Source

Aggregate function to determine the mode (most frequent) value.


# op.median(field) · Source

Aggregate function for the median value. This operation is a shorthand for the quantile value at p = 0.5.


# op.quantile(field, p) · Source

Aggregate function to compute the quantile boundary of a data field for a probability threshold. The median is the value of quantile at p = 0.5.


# op.stdev(field) · Source

Aggregate function for the sample standard deviation.


# op.stdevp(field) · Source

Aggregate function for the population standard deviation.


# op.variance(field) · Source

Aggregate function for the sample variance.


# op.variancep(field) · Source

Aggregate function for the population variance.


# op.corr(field1, field2) · Source

Aggregate function for the product-moment correlation between two variables. To instead compute a rank correlation, compute the average ranks for each variable and then apply this function to the result.


# op.covariance(field1, field2) · Source

Aggregate function for the sample covariance between two variables.


# op.covariancep(field1, field2) · Source

Aggregate function for the population covariance between two variables.


# op.array_agg(field) · Source

Aggregate function to collect an array of field values. The resulting aggregate is an array (one per group) containing all observed values. The order of values is sensitive to any orderby criteria.

Examples

aq.table({ v: [1, 2, 3, 1] })
  .rollup({ a: op.array_agg('v') }) // a: [ [1, 2, 3, 1] ]

# op.array_agg_distinct(field) · Source

Aggregate function to collect an array of distinct (unique) field values. The resulting aggregate is an array (one per group) containing all unique values. The order of values is sensitive to any orderby criteria.

Examples

aq.table({ v: [1, 2, 3, 1] })
  .rollup({ a: op.array_agg_distinct('v') }) // a: [ [1, 2, 3] ]

# op.object_agg(key, value) · Source

Aggregate function to create an object given input key and value fields. The resulting aggregate is an object (one per group) with keys and values defined by the input fields. For any keys that occur multiple times in a group, the most recently observed value is used. The order in which keys and values are observed is sensitive to any orderby criteria.

Examples

aq.table({ k: ['a', 'b', 'a'], v: [1, 2, 3] })
  .rollup({ o: op.object_agg('k', 'v') }) // o: [ { a: 3, b: 2 } ]

# op.map_agg(key, value) · Source

Aggregate function to create a Map given input key and value fields. The resulting aggregate is a Map (one per group) with keys and values defined by the input fields. For any keys that occur multiple times in a group, the most recently observed value is used. The order in which keys and values are observed is sensitive to any orderby criteria.

Examples

aq.table({ k: ['a', 'b', 'a'], v: [1, 2, 3] })
  .rollup({ m: op.map_agg('k', 'v') }) // m: [ new Map([['a', 3], ['b', 2]]) ]

# op.entries_agg(key, value) · Source

Aggregate function to create an array in the style of Object.entries given input key and value fields. The resulting aggregate is an array (one per group) with [key, value] arrays defined by the input fields, and may include duplicate keys. The order of entries is sensitive to any orderby criteria.

Examples

aq.table({ k: ['a', 'b', 'a'], v: [1, 2, 3] })
  .rollup({ e: op.entries_agg('k', 'v') }) // e: [ [['a', 1], ['b', 2], ['a', 3]] ]


Window Functions

Window table expression functions applicable over ordered table rows. If invoked outside a table expression context, column (field) inputs must be column name strings, and the operator will return a corresponding table expression.


# op.row_number() · Source

Window function to assign consecutive row numbers, starting from 1.


# op.rank() · Source

Window function to assign a rank to each value in a group, starting from 1. Peer values are assigned the same rank. Subsequent ranks reflect the number of prior values: if the first two values tie for rank 1, the third value is assigned rank 3.


# op.avg_rank() · Source

Window function to assign a fractional (average) rank to each value in a group, starting from 1. Peer values are assigned the average of their indices: if the first two values tie, both will be assigned rank 1.5.


# op.dense_rank() · Source

Window function to assign a dense rank to each value in a group, starting from 1. Peer values are assigned the same rank. Subsequent ranks do not reflect the number of prior values: if the first two values tie for rank 1, the third value is assigned rank 2.


# op.percent_rank() · Source

Window function to assign a percentage rank to each value in a group. The percent is calculated as (rank - 1) / (group_size - 1).


# op.cume_dist() · Source

Window function to assign a cumulative distribution value between 0 and 1 to each value in a group.


# op.ntile(num) · Source

Window function to assign a quantile (e.g., percentile) value to each value in a group. Accepts an integer parameter indicating the number of buckets to use (e.g., 100 for percentiles, 5 for quintiles).


# op.lag(field[, offset, defaultValue]) · Source

Window function to assign a value that precedes the current value by a specified number of positions. If no such value exists, returns a default value instead.


# op.lead(field[, offset, defaultValue]) · Source

Window function to assign a value that follows the current value by a specified number of positions. If no such value exists, returns a default value instead.


# op.first_value(field) · Source

Window function to assign the first value in a sliding window frame.


# op.last_value(field) · Source

Window function to assign the last value in a sliding window frame.


# op.nth_value(field[, nth]) · Source

Window function to assign the nth value in a sliding window frame (counting from 1), or undefined if no such value exists.


# op.fill_down(field[, defaultValue]) · Source

Window function to fill in missing values with preceding values. Returns the value at the current window position if it is valid (not null, undefined, or NaN), otherwise returns the first preceding valid value. If no such value exists, returns the default value.


# op.fill_up(field[, defaultValue]) · Source

Window function to fill in missing values with subsequent values. Returns the value at the current window position if it is valid (not null, undefined, or NaN), otherwise returns the first subsequent valid value. If no such value exists, returns the default value.