Top-Level | Table | Verbs | Op Functions | Expressions | Extensibility |
# table.derive(values[, options]) · Source
Derive new column values based on the provided expressions.
false
) indicating if the original columns should be dropped, leaving only the derived columns. If true
, the before and after options are ignored.Examples
table.derive({ sumXY: d => d.x + d.y })
table.derive({ z: d => d.x * d.y }, { before: 'x' })
# table.filter(criteria) · Source
Filter a table to a subset of rows based on the input criteria. The resulting table provides a filtered view over the original data; no data copy is made. To create a table that copies only filtered data to new data structures, call reify on the output table.
Examples
table.filter(d => op.abs(d.value) < 5)
# table.slice([start, end]) · Source
Extract rows with indices from start to end (end not included), where start and end represent per-group ordered row numbers in the table. The table row indices are determined by the current orderby settings. The start and end arguments are applied separately to each group, as determined by groupby.
Examples
// slice the table to include all rows except for the first and last
table.slice(1, -1)
// extract (up to) the first two rows of each group
table.groupby('colA').slice(0, 2)
# table.groupby(…keys) · Source
Group table rows based on a set of column values. Subsequent operations that are sensitive to grouping (such as aggregate functions) will operate over the grouped rows. To undo grouping, use ungroup.
Examples
table.groupby('colA', 'colB')
table.groupby({ key: d => d.colA + d.colB })
Ungroup a table, removing any grouping criteria. Undoes the effects of groupby.
Examples
table.ungroup()
# table.orderby(…keys) · Source
Order table rows based on a set of column values. Subsequent operations sensitive to ordering (such as window functions) will operate over sorted values. The resulting table provides an view over the original data, without any copying. To create a table with sorted data copied to new data strucures, call reify on the result of this method. To undo ordering, use unorder.
Examples
// order by column 'a' in ascending order, than 'b' in descending order
table.orderby('a', aq.desc('b'))
// same as above, but with object syntax
// key order is significant, but the key names are ignored
table.orderby({ a: 'a', b: aq.desc('b') )})
// orderby accepts table expressions as well as column names
table.orderby(aq.desc(d => d.a))
Unorder a table, removing any sorting criteria. Undoes the effects of orderby.
Examples
table.unorder()
# table.rollup(values) · Source
Rollup a table to produce an aggregate summary. Often used in conjunction with groupby. To produce counts only, count provides a convenient shortcut.
Examples
table.groupby('colA').rollup({ mean: d => op.mean(d.colB) })
table.groupby('colA').rollup({ mean: op.median('colB') })
# table.count([options]) · Source
Count the number of values in a group. This method is a shorthand for rollup with a count aggregate function.
'count'
).Examples
table.groupby('colA').count()
table.groupby('colA').count({ as: 'num' })
# table.sample(size[, options]) · Source
Generate a table from a random sample of rows. If the table is grouped, perform stratified sampling by sampling separately from each group.
false
) to sample with replacement.true
) to ensure randomly ordered rows.Examples
// sample 50 rows without replacement
table.sample(50)
// sample 100 rows with replacement
table.sample(100, { replace: true })
// stratified sampling with dynamic sample size
table.groupby('colA').sample(aq.frac(0.5))
// sample twice the number of records in each group, with replacement
table.groupby('colA').sample(aq.frac(2), { replace: true })
# table.select(…columns) · Source
Select a subset of columns into a new table, potentially renaming them.
Examples
table.select('colA', 'colB')
table.select(aq.not('colB', 'colC'))
table.select({ colA: 'newA', colB: 'newB' })
# table.relocate(columns, options) · Source
Relocate a subset of columns to change their positions, also potentially renaming them.
Examples
// place colY and colZ immediately after colX
table.relocate(['colY', 'colZ'], { after: 'colX' })
// place all columns but colB and colC immediately before
// the position of colA prior to relocation
table.relocate(not('colB', 'colC'), { before: 'colA' })
// place colA and colB immediately after colC, while also
// respectively renaming them as newA and newB
table.relocate({ colA: 'newA', colB: 'newB' }, { after: 'colC' })
# table.rename(columns) · Source
Rename one or more columns, preserving column order. The columns input should be an object or Map instance that maps existing column names to new column names. Use the names()
helper function to create a rename map based on integer column indices.
Examples
// rename colA to colA2
table.rename({ colA: 'colA2' })
// rename 'old col' to 'new col'
table.rename({ 'old col': 'new col' })
// rename colA and colB
table.rename({ colA: 'colA2', colB: 'colB2' })
// rename colA and colB, alternate syntax
table.rename({ colA: 'colA2' }, { colB: 'colB2' })
// rename the first two columns (by index) to 'colA2' and 'colB2'
table.rename(aq.names('colA2', 'colB2'))
# table.reify([indices]) · Source
Create a new fully-materialized instance of this table. All filter and orderby settings are removed from the new table. Instead, the data itself is filtered and ordered as needed to produce new backing data columns.
Examples
// materialize any internal filtering and ordering
table.reify()
# table.cross(other[, values, options]) · Source
Produce the Cartesian cross product of two tables. The output table has one row for every pair of input table rows. Beware that outputs may be quite large, as the number of output rows is the product of the input row counts. This method is a convenient shorthand for a join in which the join criteria is always true.
['_1', '_2']
).Examples
table.cross(other)
table.cross(other, [['leftKey', 'leftVal'], ['rightVal']])
# table.join(other[, on, values, options]) · Source
Join two tables, extending the columns of one table with values from the other table. The current table is considered the “left” table in the join, and the new table input is considered the “right” table in the join. By default an inner join is performed, removing all rows that do not match the join criteria. To perform left, right, or full outer joins, use the join_left, join_right, or join_full methods, or provide an options argument.
null !== null
). Use the op.equal function to handle these cases.false
) indicating a left outer join. If both left and right are true, indicates a full outer join.false
) indicating a right outer join. If both the left and right are true, indicates a full outer join.['_1', '_2']
).Examples
table.join(other, 'keyShared')
table.join(other, ['keyL', 'keyR'])
table.join(other, (a, b) => op.equal(a.keyL, b.keyR))
# table.join_left(other[, on, values, options]) · Source
Perform a left outer join on two tables. Rows in the left table that do not match a row in the right table will be preserved. This method is a convenient shorthand with fixed options {left: true, right: false}
passed to join.
null !== null
). Use the op.equal function to handle these cases.['_1', '_2']
).Examples
table.join_left(other, 'keyShared')
table.join_left(other, ['keyL', 'keyR'])
table.join_left(other, (a, b) => op.equal(a.keyL, b.keyR))
# table.join_right(other[, on, values, options]) · Source
Perform a right outer join on two tables. Rows in the right table that do not match a row in the left table will be preserved. This method is a convenient shorthand with fixed options {left: false, right: true}
passed to join.
null !== null
). Use the op.equal function to handle these cases.['_1', '_2']
).Examples
table.join_right(other, 'keyShared')
table.join_right(other, ['keyL', 'keyR'])
table.join_right(other, (a, b) => op.equal(a.keyL, b.keyR))
# table.join_full(other[, on, values, options]) · Source
Perform a full outer join on two tables. Rows in either the left or right table that do not match a row in the other will be preserved. This method is a convenient shorthand with fixed options {left: true, right: true}
passed to join.
null !== null
). Use the op.equal function to handle these cases.['_1', '_2']
).Examples
table.join_full(other, 'keyShared')
table.join_full(other, ['keyL', 'keyR'])
table.join_full(other, (a, b) => op.equal(a.keyL, b.keyR))
# table.lookup(other, on, …values) · Source
Lookup values from a secondary table and add them as new columns. A lookup occurs upon matching key values for rows in both tables. If the secondary table has multiple rows with the same key, only the last observed instance will be considered in the lookup. Lookup is similar to join_left, but with a streamlined syntax and the added constraint of allowing at most one match only.
Example
table.lookup(other, ['key1', 'key2'], 'value1', 'value2')
# table.semijoin(other[, on]) · Source
Perform a semi-join, filtering the left table to only rows that match a row in the right table.
Similar to the filter verb, the resulting table provides a filtered view over the original data; no data copy is made. To create a table that copies only semi-joined data to new data structures, call reify on the output table.
null !== null
). Use the op.equal function to handle these cases.Examples
table.semijoin(other)
table.semijoin(other, 'keyShared')
table.semijoin(other, ['keyL', 'keyR'])
table.semijoin(other, (a, b) => op.equal(a.keyL, b.keyR))
# table.antijoin(other[, on]) · Source
Perform an anti-join, filtering the left table to only rows that do not match a row in the right table.
Similar to the filter verb, the resulting table provides a filtered view over the original data; no data copy is made. To create a table that copies only anti-joined data to new data structures, call reify on the output table.
null !== null
). Use the op.equal function to handle these cases.Examples
table.antijoin(other)
table.antijoin(other, 'keyShared')
table.antijoin(other, ['keyL', 'keyR'])
table.antijoin(other, (a, b) => op.equal(a.keyL, b.keyR))
# table.dedupe(…keys) · Source
De-duplicate table rows by removing repeated row values.
Examples
// remove rows that duplicate all column values
table.dedupe()
// remove rows that duplicate the 'a' and 'b' columns
table.dedupe('a', 'b')
// remove rows that duplicate the absolute value of column 'a'
table.dedupe({ abs: d => op.abs(d.a) })
# table.impute(values[, options]) · Source
Impute missing values or rows. Any of null
, undefined
, or NaN
are considered missing values.
The expand option additionally imputes new rows for missing combinations of values. All combinations of expand values (the full cross product) are considered for each group (if specified by groupby). New rows are added for any combination of expand and groupby values not already contained in the table; the additional columns are populated with imputed values (if specified in values) or are otherwise undefined
.
The output table persists a groupby specification. If the expand option is specified, a reified table is returned without any filter or orderby settings.
null
, undefined
, or NaN
).Examples
// replace missing values in column 'v' with zeros
table.impute({ v: () => 0 })
// replace missing values in column 'v' with the mean of non-missing values
table.impute({ v: d => op.mean(d.v) })
// replace missing values in column 'v' with zeros
// impute rows based on all combinations of values in columns 'x' and 'y'
table.impute({ v: () => 0 }, { expand: ['x', 'y'] })
# table.fold(values[, options]) · Source
Fold one or more columns into two key-value pair columns. The fold transform is an inverse of the pivot transform. The resulting table has two new columns, one containing the column names (named “key”) and the other the column values (named “value”). The number of output rows equals the original row count multiplied by the number of folded columns.
['key', 'value']
.Examples
table.fold('colA')
table.fold(['colA', 'colB'])
table.fold(aq.range(5, 8))
# table.pivot(keys, values[, options]) · Source
Pivot columns into a cross-tabulation. The pivot transform is an inverse of the fold transform. The resulting table has new columns for each unique combination of the provided keys, populated with the provided values. The provided values must be aggregates, as a single set of keys may include more than one row. If string-valued, the any aggregate is used. If only one values column is defined, the new pivoted columns are named using key values directly. Otherwise, input value column names are included as a component of the output column names.
Infinity
).'_'
).'_'
).true
) for alphabetical sorting of new column names.Examples
// pivot the values in the 'key' column to be new column names
// using the 'value' column as the new column values
// the any() aggregate combines multiple values with the same key
table.pivot('key', 'value')
// pivot lowercase values of the 'key' column to be new column names
// use the sum of corresponding 'value' entris as new column values
table.pivot(
{ key: d => op.lower(d.key) },
{ value: d => op.sum(d.value) }
)
// pivot on key column 'type' and value columns ['x', 'y']
// generates: { x_a: [1], x_b: [2], y_a: [3], y_b: [4] }
aq.table({ type: ['a', 'b'], x: [1, 2], y: [3, 4 ]})
.pivot('type', ['x', 'y'])
// pivot on the combination of the keys 'foo' and 'bar' for the values of 'x' and 'y'
aq.table({ foo: ['a', 'b'], bar: ['u', 'v'], x: [1, 2], y: [3, 4 ]})
.pivot(['foo', 'bar'], ['x', 'y'])
# table.spread(values[, options]) · Source
Spread array elements into a set of new columns. Output columns are named either according to the as option or using a combination of the input colum names and array index.
true
) indicating if input columns to the spread operation should be dropped in the output table.Infinity
).Examples
// generate new columns 'text_1', 'text_2', etc. by splitting on whitespace
// the input column 'text' is dropped from the output
table.spread({ text: d => op.split(d.text, ' ') })
// generate new columns 'text_1', 'text_2', etc. by splitting on whitespace
// the input column 'text' is retained in the output
table.spread({ text: d => op.split(d.text, ' ') }, { drop: false })
// spread the 'arrayCol' column across a maximum of 100 new columns
table.spread('arrayCol', { limit: 100 })
// extract the first two 'arrayCol' entries into 'value1', 'value2' columns
table.spread('arrayCol', { as: ['value1', 'value2'] })
# table.unroll(values[, options]) · Source
Unroll one or more array-valued columns into new rows. If more than one array value is used, the number of new rows is the smaller of the limit and the largest length. Values for all other columns are copied over.
Infinity
).false
) or column name for adding zero-based array index values as an output column. If true
, a new column named “index” will be added. If string-valued, a new column with the given name will be added.Examples
table.unroll('colA', { limit: 1000 })
table.unroll('colA', { limit: 1000, index: 'idxnum' })
# table.concat(…tables) · Source
Concatenate multiple tables into a single table, preserving all rows. This transformation mirrors the UNION_ALL operation in SQL. It is similar to union but retains all rows, without removing duplicates. Only named columns in this table are included in the output.
Examples
table.concat(other)
table.concat(other1, other2)
table.concat([other1, other2])
# table.union(…tables) · Source
Union multiple tables into a single table, deduplicating all rows. This transformation mirrors the UNION operation in SQL. It is similar to concat but suppresses duplicate rows with values identical to another row. Only named columns in this table are included in the output.
Examples
table.union(other)
table.union(other1, other2)
table.union([other1, other2])
# table.intersect(…tables) · Source
Intersect multiple tables, keeping only rows with matching values for all columns in all tables, and deduplicates the rows. This transformation is similar to a series of one or more semijoin calls, but additionally suppresses duplicate rows.
Examples
table.intersect(other)
table.intersect(other1, other2)
table.intersect([other1, other2])
# table.except(…tables) · Source
Compute the set difference with multiple tables, keeping only rows in this table whose values do not occur in the other tables. This transformation is similar to a series of one or more antijoin calls, but additionally suppresses duplicate rows.
Examples
table.except(other)
table.except(other1, other2)
table.except([other1, other2])