Aggregate Functions
SQL aggregate function expressions.
AggregateNode
The AggregateNode
class represents a SQL AST node for an aggregate function call. Users should not need to instantiate AggregateNode
instances directly, but instead should use aggregate function methods such as count()
, sum()
, etc.
distinct
AggregateNode.distinct()
Returns a new AggregateNode instance that applies the aggregation over distinct values only.
where
AggregateNode.where(filter)
Returns a new AggregateNode instance filtered according to a Boolean-valied filter expression.
window
AggregateNode.window()
Returns a windowed version of this aggregate function as a new WindowNode instance.
partitionby
AggregateNode.partitionby(...expressions)
Provide one or more expressions by which to partition a windowed version of this aggregate function and returns a new WindowNode instance.
orderby
AggregateNode.orderby(...expressions)
Provide one or more expressions by which to sort a windowed version of this aggregate function and returns a new WindowNodw instance.
rows
AggregateNode.rows(expression)
Provide a window "rows" frame specification as an array or array-valued expression and returns a windowed version of this aggregate function as a new WindowNode instance. A "rows" window frame is insensitive to peer rows (those that are tied according to the orderby criteria). The frame expression should evaluate to a two-element array indicating the number of preceding or following rows. A zero value (0
) indicates the current row. A non-finite value (including null
and undefined
) indicates either unbounded preceding row (for the first array entry) or unbounded following rows (for the second array entry).
range
AggregateNode.range(expression)
Provide a window "range" frame specification as an array or array-valued expression and returns a windowed version of this aggregate function as a new WindowNode instance. A "range" window grows to include peer rows (those that are tied according to the orderby criteria). The frame expression should evaluate to a two-element array indicating the number of preceding or following rows. A zero value (0
) indicates the current row. A non-finite value (including null
and undefined
) indicates either unbounded preceding row (for the first array entry) or unbounded following rows (for the second array entry).
count
count()
Create an aggregate function that counts the number of records.
avg
avg(expression)
Create an aggregate function that calculates the average of the input expression.
mad
mad(expression)
Create an aggregate function that calculates the median absolute deviation (MAD) of the input expression.
max
max(expression)
Create an aggregate function that calculates the maximum of the input expression.
min
min(expression)
Create an aggregate function that calculates the minimum of the input expression.
sum
sum(expression)
Create an aggregate function that calculates the sum of the input expression.
product
product(expression)
Create an aggregate function that calculates the product of the input expression.
median
median(expression)
Create an aggregate function that calculates the average of the input expression.
quantile
quantile(expression, p)
Create an aggregate function that calculates the p-th quantile of the input expression. For example, p = 0.5 computes the median, while 0.25 computes the lower inter-quartile range boundary.
mode
mode(expression)
Create an aggregate function that calculates the mode of the input expression.
variance
variance(expression)
Create an aggregate function that calculates the sample variance of the input expression.
stddev
stddev(expression)
Create an aggregate function that calculates the sample standard deviation of the input expression.
skewness
skewness(expression)
Create an aggregate function that calculates the skewness of the input expression.
kurtosis
kurtosis(expression)
Create an aggregate function that calculates the kurtosis of the input expression.
entropy
entropy(expression)
Create an aggregate function that calculates the entropy of the input expression.
varPop
varPop(expression)
Create an aggregate function that calculates the population variance of the input expression.
stddevPop
stddevPop(expression)
Create an aggregate function that calculates the population standard deviation of the input expression.
corr
corr(a, b)
Create an aggregate function that calculates the correlation between the input expressions a and b.
covarPop
covarPop(a, b)
Create an aggregate function that calculates the population covariance between the input expressions a and b.
regrIntercept
regrIntercept(y, x)
Create an aggregate function that returns the intercept of the fitted linear regression model that predicts the target expression y based on the predictor expression x.
regrSlope
regrSlope(y, x)
Create an aggregate function that returns the slope of the fitted linear regression model that predicts the target expression y based on the predictor expression x.
regrCount
regrCount(y, x)
Create an aggregate function that returns the count of non-null values used to fit the linear regression model that predicts the target expression y based on the predictor expression x.
regrR2
regrR2(y, x)
Create an aggregate function that returns the R^2 value of the fitted linear regression model that predicts the target expression y based on the predictor expression x.
regrSXX
regrSXX(y, x)
Create an aggregate function that returns the SXX value (regrCount(y, x) * varPop(x)
) of the fitted linear regression model that predicts the target expression y based on the predictor expression x.
regrSYY
regrSYY(y, x)
Create an aggregate function that returns the SYY value (regrCount(y, x) * varPop(y)
) of the fitted linear regression model that predicts the target expression y based on the predictor expression x.
regrSXY
regrSXY(y, x)
Create an aggregate function that returns the SXY (regrCount(y, x) * covarPop(y, x)
) value of the fitted linear regression model that predicts the target expression y based on the predictor expression x.
regrAvgX
regrAvgX(y, x)
Create an aggregate function that returns the average x value of the data used to fit the linear regression model that predicts the target expression y based on the predictor expression x.
regrAvgY
regrAvgY(y, x)
Create an aggregate function that returns the average x value of the data used to fit the linear regression model that predicts the target expression y based on the predictor expression x.
first
first(expression)
Create an aggregate function that calculates the first observed value of the input expression.
last
last(expression)
Create an aggregate function that calculates the last observed value of the input expression.
argmax
argmax(arg, value)
Create an aggregate function that returns the expression arg corresponding to the maximum value of the expression value.
argmin
argmin(arg, value)
Create an aggregate function that returns the expression arg corresponding to the minimum value of the expression value.
stringAgg
stringAgg(expression)
Create an aggregate function that returns the string concatenation of the input expression values.
arrayAgg
arrayAgg(expression)
Create an aggregate function that returns a list of the input expression values.