Glossary

alternative hypothesis

In hypothesis testing, the null hypothesis and an alternative hypothesis (the experimenter’s prediction) are put forward. If the data provide sufficiently strong evidence against the null hypothesis, it is rejected in favor of the alternative hypothesis.

analysis of variance (ANOVA)

A hypothesis-testing procedure that is used to evaluate mean differences between two or more treatments (or populations).

area in the tails of the distribution

The proportion of the distribution that falls in the tails of a normal curve. The area in the tail of the distribution associated with a particular z score can be found in Appendix A, column C.

area under the curve

The proportion of the distribution that is bounded by a single z score or a pair of z scores. The area under the curve bounded by a single z score can be found in Appendix A, column B.

arithmetic mean

Perhaps the most common measure of central tendency, the mean is the mathematical average of the scores in a sample.

bell curve

The bell curve is a symmetrical distribution in which there is a single peak at the center and tails that extend equally out to each side. The bell curve represents a normal distribution.

between-groups variability

The variability that arises from differences between groups, which includes treatment effects and error.

bimodal distribution

A distribution with two distinct peaks that lie roughly symmetrically on either side of the center point.

bin width

The width of class intervals in a histogram.

Bonferroni test

A Bonferroni test is one of the simplest post hoc analyses. It is a series of t tests performed on each pair of group means with a modified alpha level.
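
As a minimal sketch (group scores and the alpha level here are invented), the Bonferroni adjustment divides alpha by the number of pairwise comparisons, and each t test is evaluated against that modified alpha:

```python
from itertools import combinations

from scipy import stats

# Hypothetical scores for three treatment groups
groups = {
    "A": [4, 5, 6, 5, 7],
    "B": [6, 7, 8, 7, 9],
    "C": [5, 5, 6, 6, 5],
}

alpha = 0.05
pairs = list(combinations(groups, 2))
adjusted_alpha = alpha / len(pairs)  # Bonferroni: alpha divided by number of comparisons

for g1, g2 in pairs:
    t, p = stats.ttest_ind(groups[g1], groups[g2])
    print(f"{g1} vs {g2}: p = {p:.4f}, "
          f"significant at alpha = {adjusted_alpha:.4f}? {p < adjusted_alpha}")
```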

box plots

One of the more effective graphical summaries of a data set, the box plot generally shows the median, 25th and 75th percentiles, and outliers.

categorical variables

Also known as qualitative variables, categorical variables cannot be quantified, or measured numerically. Instead, they are measured on a nominal or ordinal scale.

central limit theorem

A mathematical theorem which states that, for random samples drawn from a population with a given mean and variance, the sampling distribution of the sample mean approaches a normal distribution as the sample size increases.
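
A short simulation can make the theorem concrete. In the sketch below (the population shape, sample sizes, and number of replications are all arbitrary choices), repeated samples are drawn from a skewed population; the sample means cluster around the population mean, and their spread shrinks as the sample size grows:

```python
import numpy as np

rng = np.random.default_rng(42)

# A skewed (exponential) population -- clearly not normal, mean = 2.0
population = rng.exponential(scale=2.0, size=100_000)

for n in (5, 30, 100):
    sample_means = [rng.choice(population, size=n).mean() for _ in range(2_000)]
    print(f"n = {n:3d}: mean of sample means = {np.mean(sample_means):.3f}, "
          f"SD of sample means = {np.std(sample_means):.3f}")
# The sample means center on the population mean, their spread shrinks
# roughly as 1/sqrt(n), and a histogram of them looks increasingly normal.
```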

central tendency

The center or middle of a distribution. There are many measures of central tendency. The most common are the mean, median, and mode.

chi-square (test)

A nonparametric test designed to understand the frequency distribution of a single categorical variable or find a relationship between two categorical variables.
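
As a minimal sketch (all frequencies invented), both uses of the chi-square test are available in scipy: a goodness-of-fit test on a single categorical variable, and a test for independence on a contingency table:

```python
import numpy as np
from scipy import stats

# Goodness of fit: do observed frequencies match expected frequencies? (invented)
observed = [18, 22, 20, 40]
expected = [25, 25, 25, 25]  # expected total must match the observed total
chi2, p = stats.chisquare(observed, f_exp=expected)
print(f"Goodness of fit: chi2 = {chi2:.2f}, p = {p:.4f}")

# Test for independence on a 2x2 contingency table (invented)
table = np.array([[30, 10],
                  [20, 40]])
chi2, p, dof, _ = stats.chi2_contingency(table)
print(f"Independence: chi2 = {chi2:.2f}, df = {dof}, p = {p:.4f}")
```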

confidence interval

A range of scores likely (i.e., with a certain degree of confidence) to contain the parameter being estimated.
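
For example, a 95% confidence interval for a population mean can be computed from a sample using the t distribution. A minimal sketch with invented scores:

```python
import numpy as np
from scipy import stats

sample = np.array([12.1, 11.4, 13.0, 12.6, 11.9, 12.8, 12.2])  # hypothetical scores

mean = sample.mean()
sem = stats.sem(sample)  # estimated standard error of the mean
low, high = stats.t.interval(0.95, df=len(sample) - 1, loc=mean, scale=sem)
print(f"point estimate = {mean:.2f}, 95% CI = [{low:.2f}, {high:.2f}]")
```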

confound variables

Two or more variables are confounded if their effects cannot be separated because they vary together. For example, if a study on the effect of light inadvertently manipulated heat along with light, then light and heat would be confounded.

contingency table

A frequency table that shows the frequency of each category in one variable, contingent upon the specific category or level of the other variable.

continuous variables

Numerical variables that can take on any value in a certain range. Time and distance are continuous; gender, SAT score, and “time rounded to the nearest second” are not.

control

The group in an experimental study that is not receiving the treatment being tested.

convenience sampling

A sampling strategy in which participants are recruited for their easy availability (e.g., college students). A sample obtained through convenience sampling should not be considered a representative sample.

correlation matrices

Tables displaying correlation coefficients that describe the relationships among multiple variables.

covariance

When variables vary together; that is, when one score changes, the other score also changes in a predictable or consistent way.

critical value

The value that marks the boundary of a specific rejection region (also called the critical region); a test statistic more extreme than the critical value falls in the rejection region.

curvilinear models

Forms of regression that can fit curved, rather than straight-line, patterns in the data.

curvilinear relationship

A relationship in which a line through the middle of the points in a scatter plot will be curved rather than straight.

data

A collection of values to be used for statistical analysis. Data is the plural form of datum.

degrees of freedom

The number of independent pieces of information that go into the estimate. In general, the degrees of freedom for an estimate is equal to the number of values minus the number of parameters estimated en route to the estimate in question.

dependent variable

A variable that measures the experimental outcome. In most experiments, the effects of the independent variable on the dependent variables are observed. For example, if a study investigated the effectiveness of an experimental treatment for depression, then the measure of depression would be the dependent variable.

descriptive statistics

A set of statistics—such as the mean, standard deviation, and skew—that describe a distribution.

difference score

The change in a single variable over time: the score at Time 2 minus the score at Time 1.

discrete variables

Variables that exist in indivisible units. Quantitative discrete variables are measured in whole numbers that mark discrete points on the scale.

dispersion

The extent to which values differ from one another; that is, how much they vary. Dispersion is also called variability or spread.

distribution of sample means

The set of means from all the possible random samples of a specific size selected from a specific population. The distribution of sample means is an example of a sampling distribution.

effect size

A statistic that indicates how large, important, or meaningful a statistically significant effect is.

error

The difference between a measured or calculated value and a true one.

event

Any specific outcome that could happen.

expected values

The expected value of a statistic is the mean of the sampling distribution of the statistic.

experimental

The group in a study that is receiving the treatment being tested.

experimental research

Research that involves the use of random assignment to treatment conditions and manipulation of the independent variable.

factorial ANOVA

An analysis of variance that uses multiple grouping variables, instead of just one, to look for group mean differences.

frequency polygons

A frequency polygon is a graphical representation of a distribution that is similar in appearance to a line graph. Frequency polygons can be grouped or ungrouped.

grand mean

The mean of a group of averages.

group mean differences

Research studies concerned with group mean differences determine whether, on average, a person from Group A is higher or lower or different on some variable than a person from Group B. A key criterion in such studies is that the groups must be mutually exclusive.

grouping variable

Also called the independent variable, a grouping variable is used to categorize data into groups. It predicts or explains the values in the outcome variable.

histogram

A graphical representation of a distribution that is similar in appearance to a bar chart. It partitions the variable on the x-axis into various contiguous class intervals of (usually) equal widths. The heights of the bars represent the class frequencies.

homogeneity of variance

The assumption that the true population variance for each group is the same and any difference in the observed sample variances is due to random chance.

hypothesis

A prediction that is tested in a research study.

independent samples (t test)

The analysis of two samples that are selected from two populations, where the values from one population are not related in any way to the values from the other population.

independent variable

A variable that is manipulated by the experimenter, as opposed to a dependent variable. Most experiments consist of observing the effect of the independent variable(s) on the dependent variable(s).

inferential statistics

The branch of statistics concerned with drawing conclusions about a population from a sample. This is generally done through random sampling, followed by inferences made about central tendency, or any of a number of other aspects of a population.

interquartile range (IQR)

The range of the middle 50% of the scores in a distribution; computed by subtracting the 25th percentile from the 75th percentile. The interquartile range is a robust measure of variability.
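
A quick sketch of the computation, using an arbitrary set of scores:

```python
import numpy as np

scores = np.array([2, 4, 4, 5, 6, 7, 8, 9, 12, 15])  # hypothetical data
q25, q75 = np.percentile(scores, [25, 75])
iqr = q75 - q25  # 75th percentile minus 25th percentile
print(f"IQR = {q75} - {q25} = {iqr}")
```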

interval scale

A numerical scale in which the distance between scores on the scale is consistent (equal) and for which the zero is relative (rather than absolute).

inverse relationship

In an inverse relationship, variables are related but move in opposite directions when they change: as one variable goes up, the other variable goes down.

law of large numbers

A mathematical theorem that states: as sample size increases, the probability that a sample mean is an accurate representation of the true population mean also increases.

least squares error solution (linear regression)

The equation of the line that produces the smallest possible sum of squared errors (squared so that positive and negative errors do not cancel when summed, just as in the standard deviation), relative to any other straight line that could be drawn through the data.
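
As an illustrative sketch (data invented), the least squares line found by np.polyfit can be compared against any other straight line by their sums of squared errors:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])  # hypothetical data

slope, intercept = np.polyfit(x, y, deg=1)  # the least squares fit

def sse(b1, b0):
    """Sum of squared errors for the line y = b1 * x + b0."""
    return float(np.sum((y - (b1 * x + b0)) ** 2))

print(f"least squares: y = {slope:.2f}x + {intercept:.2f}, "
      f"SSE = {sse(slope, intercept):.3f}")
print(f"another line:  y = 2.00x + 0.50, SSE = {sse(2.0, 0.5):.3f}")
# No other straight line can produce a smaller SSE than the least squares line.
```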

lie factor

The ratio of the size of the effect shown in a graph to the size of the effect shown in the data. This term was coined by Edward Tufte, who suggested that lie factors greater than 1.05 or less than 0.95 produce unacceptable distortion.

line of best fit

The central tendency of our scatter plot. The term best fit means that the line is as close to all points (with each point representing both variables for a single person) in the scatter plot as possible, with a balance of scores above and below the line.

linear relationship

There is a perfect linear relationship between two variables if a scatter plot of the points falls on a straight line. The relationship is linear even if the points diverge from the line, as long as the divergence is random rather than being systematic.

magnitude

How strong or how consistent the relationship between variables is. Higher numbers mean greater magnitude, which means a stronger relationship.

margin of error

The difference between the statistic used to estimate a parameter and the endpoints of the confidence interval. For example, if the statistic were 0.6 and the confidence interval ranged from 0.4 to 0.8, then the margin of error would be ±.20. Unless otherwise specified, the 95% confidence interval is used.

marginal values

In a contingency table, marginal values are the total values for a single category of one variable, added up across levels of the other variable.

matched pairs

Two samples that are matched or dependent in some way.

mean

See arithmetic mean.

mean square

A sample variance; that is, the mean of the squared deviations, computed by dividing a sum of squares by its degrees of freedom.

mean squared error (linear regression)

The average squared difference between the estimated values and the actual values.

median

The median is a popular measure of central tendency. It is the 50th percentile of a distribution.

mode

A measure of central tendency, the mode is the most frequent value in a distribution.

moderation models

Forms of regression in which the relationship between two variables changes depending on the level of a third variable.

multiple regression

Linear regression in which two or more predictor (independent) variables are used to predict the dependent variable.

negative relationship

A negative or inverse relationship means that as the value of one variable increases, the other decreases.

no relationship

There is no relationship between variables X and Y if the hypothetical line drawn through the points on a scatter plot has a slope of zero; in other words, values of X are not associated with the values of Y.

nominal scale

A scale in which no ordering is implied, and addition/subtraction and multiplication/division would be inappropriate for a variable. Variables measured on a nominal scale have no natural ordering, even if they are coded using numbers (e.g., for eye color 1 = blue, 2 = brown, 3 = hazel, etc.).

non-experimental research

Research that involves observing things as they occur naturally and recording observations as data. Also known as correlational research.

nonparametric tests

Tests in which there are no population parameters to estimate or distributions to test against. All chi-square tests are nonparametric.

normal distribution

One of the most common continuous distributions, a normal distribution is sometimes referred to as a bell-shaped distribution, a bell curve, or a Gaussian curve. If the mean is 0 and the standard deviation is 1, the distribution is referred to as the “standard normal distribution.”

null hypothesis

A null hypothesis is a hypothesis tested in significance testing. It is typically the hypothesis that a parameter is zero or that a difference between parameters is zero. For example, the null hypothesis might be that the difference between population means is zero.

observed effect

The difference between what is observed in a sample and what was expected based on the population from which that sample was drawn.

ordinal scale

A set of ordered values in which there is no set distance between scale values; for example, asking someone to indicate how much education they completed by asking them to circle one of the following: did not complete high school, high school diploma, some college, college degree, professional degree.

outcome variable

Also known as the dependent variable, the outcome variable is thought to change as a function of changes in a predictor (independent) variable.

outlier

An atypical, infrequent observation; a value that deviates extremely from the center of the distribution. There is no universally agreed-upon criterion for defining an outlier, and outliers should be discarded only with extreme caution. However, one should always assess the effects of outliers on the statistical conclusions.

p value or probability value

In significance testing, the probability value is the probability of obtaining a statistic as different or more different from the parameter specified in the null hypothesis as the statistic obtained in the experiment. The probability value is computed assuming the null hypothesis is true. The lower the probability value, the stronger the evidence that the null hypothesis is false. Traditionally, the null hypothesis is rejected if the probability value is below .05.

point estimate

A single number (rather than a range of numbers) that is used to estimate a parameter.

pooled variance

A weighted average of two variances—the weights being determined by sample size—that can then be used when calculating standard error.
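
A minimal sketch of the calculation (samples invented), with each sample variance weighted by its degrees of freedom:

```python
import numpy as np

sample1 = np.array([10, 12, 9, 11, 13])   # hypothetical scores, n = 5
sample2 = np.array([8, 9, 7, 10, 9, 8])   # hypothetical scores, n = 6

n1, n2 = len(sample1), len(sample2)
var1 = sample1.var(ddof=1)  # sample variances (n - 1 in the denominator)
var2 = sample2.var(ddof=1)

# Weighted average of the two variances, weighted by degrees of freedom
pooled = ((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2)
print(f"pooled variance = {pooled:.3f}")
```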

population

The complete set of observations a researcher is interested in. Contrast this with a sample, which is a subset of a population. Inferential statistics are computed from sample data in order to make inferences about the population.

positive relationship

A positive relationship exists between variables X and Y when larger values of X are associated with larger values of Y (and smaller values with smaller values). Graphically, a positive relationship is indicated when a regression line drawn through the center of the points on a scatter plot has a positive slope.

post hoc test

A test that is conducted after an ANOVA with more than two treatment conditions where the null hypothesis was rejected. The purpose of post hoc tests is to determine exactly which treatment conditions are significantly different.

probability

The likelihood of a statistical result; formally, the number of outcomes that satisfy specific criteria divided by the total number of possible outcomes.

qualitative variables

Also known as categorical variables, qualitative variables cannot be quantified, or measured numerically. Instead, they are measured on a nominal or ordinal scale. Variables that are not qualitative are known as quantitative variables.

quantitative variables

Variables that are measured on a numeric or quantitative scale or that can be ordered in some fashion. Ordinal, interval, and ratio scales are quantitative. A country’s population, a person’s shoe size, or a car’s speed are all quantitative variables. Variables that are not quantitative are known as qualitative variables.

quasi-experimental research

Research that involves manipulating the independent variable but not randomly assigning people to groups.

random error

Any deviations between a person and that person’s group mean caused only by chance. Random error is a component of within-groups variability.

range

The difference between the maximum and minimum values of a variable or distribution. The range is the simplest measure of variability.

range restriction

Failure to capture the full range of a variable’s potential scores.

ratio scale

A numerical scale in which the distance between scores on the scale is consistent (equal) and for which the zero is absolute (rather than relative).

rejection region

The region in which an experimenter can reject the null hypothesis, provided the test statistic falls into that region; also called the critical region.

related samples (t test)

The analysis of two scores that are related in a systematic way within people or within pairs. Also called paired samples, matched pairs, repeated measures, dependent measures, and dependent samples, among other names.

repeated measures ANOVA

An analysis of variance that measures each study subject three or more times to look for a change. A repeated measures ANOVA is an extension of a related samples t test.

residual

The distance between the actual value of the Y variable and the predicted value of the Y variable.

robustness

Something is robust if it holds up well in the face of adversity. A measure of central tendency or variability is considered robust if it is not greatly affected by a few extreme scores. A statistical test is considered robust if it works well in spite of moderate violations of the assumptions on which it is based.

sample

A subset of a population, often taken for the purpose of statistical inference.

sampling bias

Sampling bias occurs when participants are not selected at random or when they have an unequal probability of being selected for participation in a study.

sampling distribution

The distribution of a statistic (such as the sample mean) obtained through repeated sampling from a larger population.

sampling error

The discrepancy between a parameter and the statistic used to estimate it.

scale (of a distribution)

How far apart the values of a distribution are (their spread) and where they are located (their central tendency).

Scheffé test

A post hoc test for ANOVA that uses an F ratio to evaluate the significance of the difference between any two treatment conditions. The Scheffé test is one of the safest of all possible post hoc tests.

significance level

In significance testing, the significance level is the highest value of a probability value for which the null hypothesis is rejected. Common significance levels are .05 and .01. If the .05 level is used, then the null hypothesis is rejected if the probability value is less than or equal to .05. Also called the α level or simply α (“alpha”).

simple random sampling

A process of selecting a subset of a population for the purposes of statistical inference in which every member of the population is equally likely to be chosen.

skew

A distribution is skewed if one tail extends out further than the other, making the distribution asymmetrical. A distribution has a positive skew (is skewed to the right) if the tail to the right is longer. A distribution has a negative skew (is skewed to the left) if the tail to the left is longer.

sources of variability

The reasons that scores differ from one another. For example, in an ANOVA, we define two sources of variability: between-groups and within-groups variability.

Spearman’s rho

A statistic that expresses the relationship between two variables on a scale from –1 to +1. This correlation coefficient is designed to be used with ordinal data rather than continuous data and, unlike the Pearson correlation, does not assume a linear relationship.
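
A minimal sketch using scipy, with hypothetical rankings from two judges:

```python
from scipy import stats

# Hypothetical rankings of six contestants by two judges (ordinal data)
judge1 = [1, 2, 3, 4, 5, 6]
judge2 = [2, 1, 4, 3, 6, 5]

rho, p = stats.spearmanr(judge1, judge2)
print(f"Spearman's rho = {rho:.3f}, p = {p:.3f}")
```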

spread

The extent to which values differ from one another; that is, how much they vary. Spread is also called variability or dispersion.

standard deviation

The standard deviation is a widely used measure of variability. It is computed by taking the square root of the variance.

standard error

The standard deviation of the sampling distribution of a statistic.

standard error of the estimate

The average size of the residual, or the average distance from a researcher’s predictions to the actual observed values.

standard normal distribution

A normal distribution that has a mean of 0 and a standard deviation of 1; also known as the unit normal distribution.

standardization

The process of transforming any normal distribution into a standard normal distribution by converting all of the raw scores in the distribution into standard scores (z scores).
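
For example (raw scores invented), standardization subtracts the mean from each score and divides by the standard deviation:

```python
import numpy as np

raw = np.array([52.0, 60.0, 45.0, 70.0, 58.0])  # hypothetical raw scores
z = (raw - raw.mean()) / raw.std()  # subtract the mean, divide by the SD
print(z.round(2))                    # the standard (z) scores
print(round(z.mean(), 10), z.std())  # ~0 and 1 after standardization
```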

statistical power

The probability of correctly rejecting a false null hypothesis (i.e., not making a Type II error).

statistical significance

A result is statistically significant when it would be unlikely to occur if the null hypothesis were true, that is, when its probability value falls below the significance level. Generally, in psychology we look for p < .05 to indicate that a mean difference or relationship is statistically significant.

statistics

A range of techniques and procedures for analyzing, interpreting, displaying, and making decisions based on sample data.

stem-and-leaf display

A quasi-graphical representation of numerical data. Generally, all but the final digit of each value is a stem, and the final digit is the leaf. The stems are placed in a vertical list, with each matched leaf on one side.

stratified random sampling

In stratified random sampling, the population is divided into a number of subgroups (or strata). Random samples are then taken from each subgroup with sample sizes proportional to the size of the subgroup in the population.

sum of squares

The sum of squared deviations, or differences, between scores and the mean in a numeric dataset.
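
A quick sketch of the computation with invented scores:

```python
import numpy as np

scores = np.array([3, 5, 7, 9])  # hypothetical data; mean = 6
deviations = scores - scores.mean()
ss = np.sum(deviations ** 2)  # (-3)^2 + (-1)^2 + 1^2 + 3^2
print(f"SS = {ss}")           # 20.0
```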

systematic variability

Variation in observations that results from factors related to the experimental differences between groups.

test for goodness of fit

A chi-square test that assesses whether the observed frequencies in a sample fit the frequencies in a known distribution.

test for independence

A chi-square test that assesses whether the values of each categorical variable (that is, the frequency of their levels) are related to or independent of the values of the other categorical variable. This type of analysis is performed on contingency tables.

test statistic

An inferential statistic used to test a null hypothesis.

Tukey’s honestly significant difference (HSD)

A popular post hoc analysis for ANOVA that makes adjustments based on the number of comparisons; however, unlike the Bonferroni test, it makes adjustments to the test statistic when running the comparisons of two groups.

Type I error

Rejecting the null hypothesis when it is actually true; a false positive.

Type II error

Failing to reject the null hypothesis when it is actually false; a false negative.

variability

The extent to which values differ from one another; that is, how much they vary. Variability can also be thought of as how spread out or dispersed a distribution is.

variable

Something that can take on different values. For example, different subjects in an experiment weigh different amounts. Therefore “weight” is a variable in the experiment. Or, subjects may be given different doses of a drug. This would make “dosage” a variable. Variables can be dependent or independent, qualitative or quantitative, and continuous or discrete.

variance

The variance is a widely used measure of variability. It is defined as the mean squared deviation of scores from the mean.
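
A quick sketch (scores invented) showing the population formula (divide by n) alongside the sample estimate (divide by n − 1):

```python
import numpy as np

scores = np.array([2, 4, 6, 8])  # hypothetical data; mean = 5

pop_var = scores.var(ddof=0)     # mean squared deviation (divide by n): 5.0
sample_var = scores.var(ddof=1)  # sample estimate (divide by n - 1): ~6.67
print(pop_var, sample_var)
```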

whiskers

Vertical lines ending in a horizontal stroke that are added to box plots to indicate the spread of the data points. Whiskers are drawn from the upper and lower hinges to the upper and lower adjacent values.

within-groups variability

The variability that arises from differences that occur within each group due to individual differences among participants.

z score

The number of standard deviations a score is from the mean of its population. When the scores (or sample means) in the population are normally distributed, the z table can be used to find probabilities for obtaining a given z score.
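
As a minimal sketch (values invented), a z score can be computed directly, and scipy’s normal distribution can stand in for the z table:

```python
from scipy import stats

score, mu, sigma = 130.0, 100.0, 15.0  # hypothetical score, population mean and SD
z = (score - mu) / sigma               # number of SDs above the mean
tail = 1 - stats.norm.cdf(z)           # upper-tail area, as a z table would give
print(f"z = {z:.2f}, P(Z > {z:.2f}) = {tail:.4f}")  # z = 2.00, tail ~ 0.0228
```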

License

Introduction to Statistics in the Psychological Sciences Copyright © 2021 by Linda R. Cote Ph.D.; Rupa G. Gordon Ph.D.; Chrislyn E. Randell Ph.D.; Judy Schmitt; and Helena Marvin is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License, except where otherwise noted.
