﻿ Statistical Functions

# Statistical Functions

The functions that relate to statistical operations are:

AVEDEV(value1,value2,...)

AVEDEV(column)

This function calculates the average of the absolute deviations of the specified values from their mean.

Argument can be a double-precision floating-point value, an integer value, or an array (cell range or datasheet column) of these. Up to 255 arguments may be included. You can use a single array (cell range or column) instead of a list of values. You can use multiple arrays (cell ranges or columns) as well.

AVERAGE(value1,value2,...)

AVERAGE(column)

This function calculates the average of the specified numeric values.

Argument can be a double-precision floating-point value, an integer value, or an array (cell range or datasheet column) of these. Up to 255 arguments may be included. You can use a single array (cell range or column) instead of a list of values. You can use multiple arrays (cell ranges or columns) as well.

AVERAGEA(value1,value2,...)

AVERAGEA(column)

This function calculates the average of the specified values, including text or logical values as well as numeric values.

Argument can be a double-precision floating-point value, an integer value, or an array (cell range or datasheet column) of these. Up to 255 arguments may be included. You can use a single array (cell range or column) instead of a list of values. You can use multiple arrays (cell ranges or columns) as well.

This function calculates the cumulative beta distribution function.

 argument description x Value at which to evaluate the function, between the values of lower and upper alpha Alpha parameter of the distribution beta Beta parameter of the distribution lower [Optional] Lower bound of the interval for x; 0 if omitted upper [Optional] Upper bound of the interval for x; 1 if omitted

BETAINV(prob,alpha,beta,lower,upper)

This function calculates the inverse of the cumulative beta distribution function.

BINOMDIST(x,n,p,cumulative)

This function calculates the individual term binomial distribution probability.

 argument Description x Number representing the number of successes in trials; if not an integer, the number is truncated n Number representing the number of independent trials; if not an integer, the number is truncated p Probability of success on each trial; number between 0 and 1 cumulative Logical value that determines the form of the function; if TRUE, then this function returns the cumulative distribution function, which is the probability that there are at most x successes; if FALSE, it returns the probability mass function, which is the probability that there are x successes

The binomial probability mass function is calculated as follows:

CHIDIST(value,deg)

This function calculates the one-tailed probability of the chi-squared distribution.

CHIINV(prob,deg)

This function calculates the inverse of the one-tailed probability of the chi-squared distribution.

CHITEST(obs_array,exp_array)

This function calculates the test for independence from the chi-squared distribution.

obs_array: Array or column of observed values to test against expected values

exp_array: Array or column of expected values against which to test observed values

CONFIDENCE(alpha,stdev,size)

This function returns confidence interval for a population mean.

example: CONFIDENCE(0.05,3.5,150) gives the result 0.560106363

CORREL(array1,array2)

CORREL(column1,column2)

This function returns the correlation coefficient of the two sets of data. Data can be arrays or datasheet columns.

COUNT(value1,value2,...)

COUNT(column)

This function returns the number of cells that contain numbers.

The arguments may be separate values, an array or a column of values. Up to 255 arguments of individual cells may be included.

COUNTA(value1,value2,...)

COUNTA(column)

This function returns the number of number of cells that contain numbers, text, or logical values.

The arguments may be separate values, an array or a column of values. Up to 255 arguments of individual cells may be included.

COVAR(array1,array2)

COVAR(column1,column2)

This function returns the covariance, which is the average of the products of deviations for each data point pair in two sets of numbers. Data can be arrays or datasheet columns.

CRITBINOM(n,p,alpha)

This function returns the criterion binomial, the smallest value for which the cumulative binomial distribution is greater than or equal to a criterion value.

DEVSQ(value1,value2, ...)

DEVSQ(column)

This function calculates the sum of the squares of deviations of data points (or of an array of data points) from their sample mean.

Argument can be a double-precision floating-point value, an integer value, or an array (cell range or datasheet column) of these. Up to 255 arguments may be included. You can use a single array (cell range or column) instead of a list of values. You can use multiple arrays (cell ranges or columns) as well.

The sum of squared deviations is calculated as follows, where n is the number of values.

EXPONDIST(value,lambda,cumulative)

This function returns the exponential distribution or the probability density.

 argument description value Value of the function; must be positive or zero lambda Parameter value; must be greater than zero cumulative Logical value indicating whether to return the cumulative distribution; set to TRUE to return the cumulative distribution; set to FALSE to return the probability density

The cumulative distribution is calculated as follows:

where x is the value argument, lambda is the lambda argument.

The probability density is calculated as follows:

where x is the value argument, lambda is the lambda argument

FDIST(value,degnum,degden)

This function calculates the F probability distribution, to see degrees of diversity between two sets of data.

FINV(p,degnum,degden)

This function returns the inverse of the F probability distribution.

FISHER(value)

This function returns the Fisher transformation for a specified value.

The Fisher transformation is calculated as follows:

where x is the value argument

FISHERINV(value)

This function returns the inverse of the Fisher transformation for a specified value.

The inverse Fisher transformation is calculated as follows:

where y is the value argument.

FORECAST(value,Yarray,Xarray)

FORECAST(value,YColumn,XColumn)

This function calculates a future value using existing values.

 argument description value Value for which to predict the future dependent value Yarray An array or column of known dependent values (y’s) Xarray An array or column of known independent values (x’s)

This function is calculated as follows:

where v is the value argument, Y is the Yarray or YColumn argument, X is the Xarray or XColumn argument, and n is the size of the arrays.

FREQUENCY(dataarray,binarray)

This function calculates how often values occur within a range of values. This function returns a vertical array of numbers.

 argument description dataarray Array of values or a reference to a set of values for which to count frequencies binarray Array of intervals or a reference to intervals into which to group the values of dataarray

FTEST(array1,array2)

FTEST(column1,column2)

This function returns the result of an F-test, which returns the one-tailed probability that the variances in two arrays or columns are not significantly different.

This function returns the gamma distribution.

 argument description x Value at which to evaluate the distribution alpha Alpha parameter of the distribution beta Beta parameter of the distribution cumulative Logical value that determines the form of the function If cumulative is TRUE, then this function returns the cumulative distribution function; if FALSE, it returns the probability mass function.

The equation for this function is:

GAMMAINV(p,alpha,beta)

This function returns the inverse of the gamma cumulative distribution.

GAMMALN(value)

This function returns the natural logarithm of the Gamma function, G(x).

The equation for this function is:

where x is the value argument.

GEOMEAN(value1,value2,...)

GEOMEAN(column)

This function returns the geometric mean of a set of positive data.

You can specify a set of numeric values. You can also use a single column or an array instead of arguments separated by commas. If column or an array contains text, logical values, or empty cells, the function ignores those values; however, the function includes in calculations cells with the value zero. This function can have up to 255 arguments.

GROWTH(y,x,newx,constant)

This function calculates predicted exponential growth. This function returns the y values for a series of new x values that are specified by using existing x and y values.

HARMEAN(value1,value2,...)

HARMEAN(column)

This function returns the harmonic mean of a data set.

You can specify a set of numeric values. You can also use a single column or an array instead of arguments separated by commas. If column or an array contains text, logical values, or empty cells, the function ignores those values; however, the function includes in calculations cells with the value zero. This function can have up to 255 arguments.

HYPGEOMDIST(x,n,M,N)

This function returns the hypergeometric distribution.

 argument description x An integer representing the number of successes in the sample n An integer representing the size of the sample M An integer representing the number of successes in the population N An integer representing the size of the population

The equation for this function is:

INTERCEPT(dependent,independent)

This function returns the coordinates of a point at which a line intersects the y-axis, by using existing x values and y values.

 argument description dependent An array or column of known dependent values (y’s) independent An array or column of known independent values (x’s)

The equation for this function is:

where Y is the column or array of dependent variables, X is the column or array of independent variables, and n is the size of the arrays.

KURT(value1,value2,value3,value4,...)

KURT(column)

This function returns the kurtosis of a data set.

For the arguments, you can use numbers, arrays, or columns. If an array or column argument contains text, logical values, or empty cells, the function ignores those values; however, the function includes cells with the value zero in its calculations. You must provide four or more value arguments. You may provide up to 255 arguments.

LARGE(array,n)

LARGE(column,n)

This function returns the nth largest value in a data set, where n is specified.

LINEST(y,x,constant,stats)

This function calculates the statistics for a line.

LOGEST(y,x,constant,stats)

This function calculates an exponential curve that fits the data and returns an array of values that describes the curve.

This function returns the inverse of the lognormal cumulative distribution function of x, where LN(x) is normally distributed with the specified mean and standard deviation.

LOGNORMDIST(x,mean,stdev)

This function returns the cumulative natural log normal distribution of x, where LN(x) is normally distributed with the specified mean and standard deviation. Analyze data that has been logarithmically transformed with this function.

MAX(value1,value2,...)

MAX(column)

This function returns the maximum value, the greatest value, of all the values in a column or alist of arguments. You can use a single column instead of a list of values. You can use multiple columns as well.

MAXA(value1,value2,...)

MAXA(column)

This function returns the largest value in a column or a list of arguments, including text and logical values. You can use a single column instead of a list of values. You can use multiple columns as well.

MEDIAN(value1,value2,...)

MEDIAN(column)

This function returns the median, the number in the middle of the provided set of numbers; that is, half the numbers have values that are greater than the median, and half have values that are less than the median. You can use a single column instead of a list of values. You can use multiple columns as well.

MIN(value1,value2,...)

MIN(column)

This function returns the minimum value, the least value, of all the values in a column or a list of arguments. You can use a single column instead of a list of values. You can use multiple columns as well.

MINA(value1,value2,...)

MINA(column)

This function returns the minimum value in a column or a list of arguments, including text and logical values. You can use a single column instead of a list of values. You can use multiple columns as well.

MODE(value1,value2,...)

MODE(column)

This function returns the most frequently occurring value in a set of data. You can use a single column instead of a list of values. You can use multiple columns as well.

NEGBINOMDIST(x,r,p)

This function returns the negative binomial distribution.

NORMDIST(x,mean,stdev,cumulative)

This function returns the normal cumulative distribution for the specified mean and standard deviation.

NORMINV(prob,mean,stdev)

This function returns the inverse of the normal cumulative distribution for the given mean and standard deviation.

NORMSDIST(value)

This function returns the standard normal cumulative distribution function.

NORMSINV(prob)

This function returns the inverse of the standard normal cumulative distribution. The distribution has a mean of zero and a standard deviation of one.

PEARSON(array_ind,array_dep)

This function returns the Pearson product moment correlation coefficient, a dimensionless index between -1.0 to 1.0 inclusive indicative of the linear relationship of two arrays or columns of data.

PERCENTILE(array,n)

PERCENTILE(column,n)

This function returns the nth percentile of values in a column or a range.

 argument description array Array or column of values representing the data n Value representing the percentile value between 0 and 1

PERCENTRANK(array,n,sigdig)

PERCENTRANK(column,n,sigdig)

This function returns the rank of a value in a data set as a percentage of the data set.

 argument description array Array or column of data with numeric values that defines the relative ranking n Value for which you want to find the rank in percentage sigdig [Optional] Number of significant digits for the ranked percentage value; if omitted, the calculation used three significant digits; if not an integer, number is truncated

PERMUT(k,n)

This function returns the number of possible permutations for a specified number of items.

The equation for this function is:

where k and n are defined in the arguments.

POISSON(nevents,mean,cumulative)

This function returns the Poisson distribution.

 argument description nevents Number of events. Provide an integer, or the value is truncated. The number must be greater than zero. mean Expected numeric value The number must be greater than zero. cumulative Set to TRUE to return the cumulative Poisson probability that the number of random events occurring is between zero and nevents inclusive. Set to FALSE to return the Poisson probability mass function that the number of events occurring is exactly nevents.

The cumulative Poisson probability is calculated as follows:

The Poisson probability mass function is calculated as follows:

where x is the number of events (nevents argument), mu is the mean (mean argument).

PROB(array,probs,lower,upper)

PROB(column,probs,lower,upper)

This function returns the probability that values in a column or range are between two limits.

 argument description array Array or column of numeric values, which has corresponding probs probs Probabilities associated with the numeric values in array or column lower Lower limit on the numeric value for which you want a probability upper [Optional] Upper limit on the numeric value for which you want a probability; if omitted, returns the probability of result equal to lower limit

QUARTILE(array,quart)

QUARTILE(column,quart)

This function returns which quartile (which quarter or 25 percent) of a data set a value is.

argument

description

array

Array or column of numeric values for which you want the quartile value

quart

Quartile value for the array (see below)

 if quart is... then this function returns the... 0 Minimum value 1 First quartile (25th percentile) 2 Median value (50th percentile) 3 Third quartile (75th percentile) 4 Maximum value

RANK(number,array,order)

RANK(number,column,order)

This function returns the rank of a number in a set of numbers. If you were to sort the set, the rank of the number would be its position in the list.

 argument description number Number whose rank you want to return array Column that contains the set of numbers order [Optional] How the number is ranked, either in descending order (0 or omitted) or ascending order (non-zero value)

RSQ(array_dep,array_ind)

RSQ(column_dep,column_ind)

This function returns the square of the Pearson product moment correlation coefficient (R-squared) through data points in known y's and known x's. The columns must be the same size.

SKEW(number1,number2,...)

This function returns the skewness of a distribution.

SLOPE(array_dep,array_ind)

SLOPE(column_dep,column_ind)

This function calculates the slope of a linear regression. The columns must be the same size.

SMALL(array,n)

SMALL(column,n)

This function returns the nth smallest value in a data set, where n is specified.

 argument description array Array or Column from which to return the nth largest value n The position (from the largest value) for which to return the value (for example, 5 to return the fifth largest value). Must be equal to or less than the number of items in the column.

STANDARDIZE(x,mean,stdev)

This function returns a normalized value from a distribution characterized by mean and standard deviation.

STDEV(value1,value2,...)

STDEV(column)

This function returns the standard deviation for a set of numbers. Argument can be a cell, a cell range or a column of the datasheet. This function can have up to 255 arguments.

The standard deviation is calculated using the "non-biased" or "n-1" method.

The equation for calculating the standard deviation is:

where x is the value and n is the number of values.

STDEVA(value1,value2,...)

STDEVA(column)

This function returns the standard deviation for a set of numbers, text, or logical values. Argument can be a cell, a cell range or a column of the datasheet. This function can have up to 255 arguments.

STDEVP(value1,value2)

STDEVP(column)

This function returns the standard deviation for an entire specified population (of numeric values). Argument can be a cell, a cell range or a column of the datasheet. This function can have up to 255 arguments.

The standard deviation is calculated using the "biased" or "n" method.

The equation for calculating the standard deviation for a population is:

where x is the value and n is the number of values.

STDEVPA(value1,value2,...)

STDEVPA(column)

This function returns the standard deviation for an entire specified population, including text or logical values as well as numeric values.  Argument can be a cell, a cell range or a column of the datasheet. This function can have up to 255 arguments.

STEYX(array_dep,array_ind)

STEYX(column_dep,column_ind)

This function returns the standard error of the predicted y value for each x. The standard error is a measure of the amount of error in the prediction of y for a value of x. The columns must be the same size.

TDIST(x,deg,tails)

This function returns the probability for the t-distribution.

TINV(prog,deg)

This function returns the t-value of the student's t-distribution as a function of the probability and the degrees of freedom.

TREND(y,x,newx,constant)

This function returns values along a linear trend. This function fits a straight line to the arrays known x and y values. Trend returns the y values along that line for the array of specified new x values.

 argument description y Set of y values that are known in the relationship y=mx+b x (Optional) X is an optional set of x values that may be known in the relationship y=mx+b newx New x values for which this functions returns the corresponding y values constant Logical value that specifies whether to force the constant b to equal 0

TRIMMEAN(array,percent)

TRIMMEAN(column,percent)

This function returns the mean of a subset of data excluding the top and bottom data.

 argument description array Column or Array of values to trim and find the mean percent Fractional amount of data in array to trim (to exclude from calculation)

TTEST(array1,array2,tails,type)

TTEST(column1,column2,tails,type)

This function returns the probability associated with a t-test.

 argument description array1 Column or Array of values in first data set array2 Column or Array of values in second data set tails Number of tails type Type of t-test to perform (1, 2, or 3)

example: TTEST({2,2,2,3,4},{2,3,3,4,5},1,2) gives the result 0.126036

VAR(value1,value2,...)

VAR(column)

This function returns the variance based on a sample of a population, which uses only numeric values.

Each argument can be a double-precision floating-point value, an integer value, or a column of the datasheet. Up to 255 arguments may be included. You can use a single array instead of a column. You can use multiple columns as well.

This function uses the following equation to calculate the variance, where n is the number of values.

where x is the value and n is the number of values.

VARA(value1,value2,...)

VARA(column)

This function returns the variance based on a sample of a population, which includes numeric, logical, or text values.

Each argument can be a double-precision floating-point value, an integer value, or a column of the datasheet. Up to 255 arguments may be included. You can use a single array instead of a column. You can use multiple columns as well.

VARP(value1,value2,...)

VARP(column)

This function returns variance based on the entire population, which uses only numeric values.

Each argument can be a double-precision floating-point value, an integer value, or a column of the datasheet. Up to 255 arguments may be included. You can use a single array instead of a column. You can use multiple columns as well.

This function uses the following equation to calculate the variance, where n is the number of values.

where x is the value and n is the number of values.

VARPA(value1,value2,...)

VARPA(column)

This function returns variance based on the entire population, which includes numeric, logical, or text values.

Each argument can be a double-precision floating-point value, an integer value, or a column of the datasheet. Up to 255 arguments may be included. You can use a single array instead of a column. You can use multiple columns as well.

WEIBULL(x,alpha,beta,cumulative)

This function returns the two-parameter Weibull distribution, often used in reliability analysis.

 argument description x Value at which to evaluate the distribution alpha Scale parameter of the distribution, represented by alpha beta Shape parameter of the distribution, represented by beta cumulative Logical value that determines the form of the function If cumulative is TRUE, then this function returns the cumulative distribution function; if FALSE, it returns the probability mass function.

ZTEST(array,x,sigma)

ZTEST(column,x,sigma)

This function returns the significance value of a z-test. The z-test generates a standard score for x with respect to the set of data and returns the two-tailed probability for the normal distribution.

 argument description array Column or Array of data to test x Value at which to test sigma [Optional] Known standard deviation for the population; if omitted, the calculation uses the sample standard deviation

The equation for calculating the z-test is as follows, where n is the number of data points.