Table of Contents
- Research Designs
- Descriptive Statistics
- Correlational Statistics
- Inferential Statistics
- Links
- References
According to Gall, Borg, and Gall (1996), there are generally three types of quantitative research designs: descriptive/comparative, correlational, and experimental.
- Descriptive/comparative - This design can include the use of surveys, direct observation, and longitudinal data. The purpose of this design is to describe a phenomenon in an objective, quantifiable way. Typically, it involves descriptive statistics.
- Correlational - This design examines how one variable relates to another, often as a first step toward investigating a possible causal relationship. A correlation on its own does not establish causation. It normally deals with probability data.
- Experimental - This design manipulates one or more variables, known as the independent variable(s), in an attempt to cause a change in another variable or set of variables, known as the dependent variable(s). While this design normally involves descriptive statistics, it also includes inferential statistics.
There are two general types of descriptive statistics:
- Measures of central tendency - Mean (average), Median (midpoint), Mode (most common value)
- Measures of variability - Variance (average squared difference from the mean) and Standard Deviation (square root of the variance; the typical distance of scores from the mean)
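As a rough illustration of these measures, here is a minimal sketch using Python's standard `statistics` module. The quiz scores are made up for the example, and the population versions of variance and standard deviation are used (an assumption; for a sample you would normally use `statistics.variance` and `statistics.stdev`, which divide by n - 1).

```python
import statistics

# Hypothetical quiz scores, invented for illustration only.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

mean = statistics.mean(scores)            # arithmetic average
median = statistics.median(scores)        # midpoint of the sorted scores
mode = statistics.mode(scores)            # most common score
variance = statistics.pvariance(scores)   # mean squared deviation from the mean
std_dev = statistics.pstdev(scores)       # square root of the variance

print("mean:", mean, "median:", median, "mode:", mode)
print("variance:", variance, "standard deviation:", std_dev)
```

Note that the three measures of central tendency do not have to agree, which is one reason to report more than one of them.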
There are three general types of data: nominal, ordinal, and interval-ratio data.
- Nominal - Categories (e.g., male, female)
- Ordinal - Information that is in rank order (e.g., class rank)
- Interval-ratio - Numerical data that is not ranked or categorized and for which there is an equal interval between steps (e.g., with time, there are equal intervals between seconds, minutes, hours, etc.). This is the most desirable type of data from a statistical standpoint because you can do more with it.
Symmetry (skewness) and kurtosis describe the shape of the curve created by the distribution of data points. This is another way of describing the data.
- Skewness - The horizontal shape of the distribution is referred to as skewness. If the mean is greater than the median and mode (a long tail to the right), the curve is said to be positively skewed. If the mean is less than the median and mode (a long tail to the left), the data is negatively skewed. If the mean is the same as the median and mode, the data is symmetrically distributed.
- Kurtosis - The vertical characteristics of the curve are described as kurtosis (not a disease). A peaked curve (like gothic architecture) is referred to as leptokurtic, while a flat curve is called platykurtic. A normal curve is mesokurtic.
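To make the two shape measures concrete, the sketch below computes them from standardized moments. The function names and data are hypothetical, and the moment-based formulas are one common convention (an assumption; statistical packages often apply small-sample corrections that give slightly different numbers).

```python
import statistics

def skewness(data):
    """Moment-based skewness: about 0 if symmetric, > 0 if the tail is to the right."""
    mean = statistics.mean(data)
    sd = statistics.pstdev(data)
    return sum(((x - mean) / sd) ** 3 for x in data) / len(data)

def excess_kurtosis(data):
    """Moment-based kurtosis minus 3, so a normal curve scores about 0."""
    mean = statistics.mean(data)
    sd = statistics.pstdev(data)
    return sum(((x - mean) / sd) ** 4 for x in data) / len(data) - 3

# Hypothetical data sets, invented for illustration only.
symmetric = [1, 2, 3, 4, 5]
right_tailed = [1, 2, 2, 3, 3, 3, 4, 10]      # one very high score
left_tailed = [-x for x in right_tailed]      # mirror image

print(skewness(symmetric), skewness(right_tailed), skewness(left_tailed))
print(excess_kurtosis(symmetric))  # negative, i.e. flatter than a normal curve
```

In the right-tailed set the mean (3.5) is pulled above the median (3), which is the pattern described for positive skew.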
The following chart summarizes these points and shows why interval-ratio data is the most flexible. The chart was adapted from a handout by Dr. Ed Yoder (Penn State University), which in turn was adapted from a class handout from J.R. Warmbrod (Ohio State University).
| Type of Measure | Nominal Data | Ordinal Data | Interval-Ratio Data |
|---|---|---|---|
| Central Tendency | Mode | Mode, Median | Mode, Median, Mean (if skewness < +/- 1) |
| Variability | Frequency of categories | Semi-interquartile range (SIQR) | Variance, SD, Range |
| Symmetry | N/A | N/A | Positively skewed (+); Negatively skewed (-); Symmetrical (0) |
| Kurtosis | N/A | N/A | Mesokurtic - normal curve (0) |
Note that mean, variance, standard deviation, and range are only used to describe interval-ratio data.
Correlational statistics "is used to make predictions and to study relationships between variables" (Gall, Borg, & Gall, 1996: p. 409). Prediction here means using variables that were measured in the past to estimate a future outcome, which first requires establishing that a relationship exists. A relationship can appear in a sample due to pure chance (coincidence), and even a real relationship does not necessarily involve causality. There are generally two sets of correlational methods, depending on the number of variables under study: bivariate and multivariate.
- Bivariate Statistics - Involves the comparison of two variables (e.g., correlation between the number of hours the student spends studying and test scores). These statistics are normally used to establish that a relationship exists between two variables.
- Multivariate Statistics - Involves three or more variables (e.g., high test scores are explained by the number of hours spent studying, class attendance, completion of homework, etc.). These statistics are typically used to identify a set of likely root causes (e.g., factor analysis).
Bivariate Statistics
The following chart provides guidance regarding what correlation method you should use for the type of data that you have. The chart was adapted from a handout by Dr. Ed Yoder (Penn State University).
Measure of Linear Relationship, by Scale of Measurement
| Variable 1 | Variable 2: Nominal | Variable 2: Ordinal | Variable 2: Interval-Ratio |
|---|---|---|---|
| Nominal | Phi coefficient (2x2 table); Cramer's statistic (RxC table) | Rank-biserial coefficient | Point-biserial coefficient |
| Ordinal | Rank-biserial coefficient | Spearman rank coefficient; Kendall Tau coefficient | Convert interval scores to ranks and calculate Spearman rank-correlation or Kendall Tau |
| Interval-Ratio | Point-biserial coefficient | Convert interval scores to ranks and calculate Spearman rank-correlation or Kendall Tau | Pearson product-moment coefficient |
For more information, see Hopkins & Glass (1978) and Glass & Stanley (1970).
Also, there are more detailed tables in Gall, Borg, & Gall (1996: p. 428).
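The "convert interval scores to ranks and calculate Spearman" cells in the chart can be sketched as follows. This is only an illustration: the function names and data are hypothetical, and the shortcut formula used here assumes there are no tied values (with ties you would need average ranks instead).

```python
def ranks(values):
    """Rank values from 1 (smallest) upward; assumes no tied values."""
    position = {v: i + 1 for i, v in enumerate(sorted(values))}
    return [position[v] for v in values]

def spearman_rho(x, y):
    """Spearman rank correlation via the sum of squared rank differences."""
    n = len(x)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(x), ranks(y)))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))

# Hypothetical data: class rank (ordinal) and test score (interval-ratio).
class_rank = [1, 2, 3, 4, 5]
test_scores = [95, 88, 90, 70, 65]

rho = spearman_rho(class_rank, test_scores)
print(round(rho, 2))
```

The coefficient comes out strongly negative here only because a class rank of 1 means "best", so low rank numbers go with high scores.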
Linear vs. Non-Linear Data - Note that correlations normally assume that the data have a linear relationship, one that can be expressed with the formula y=mx+b. The plots below demonstrate the difference between linear and non-linear data sets.
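The linearity assumption matters in practice. The sketch below, with hypothetical function names and invented data, fits y=mx+b by least squares and computes the Pearson coefficient for a linear data set and for a curved one. The curved data are perfectly related (y is x squared), yet the linear correlation is zero.

```python
def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def fit_line(x, y):
    """Least-squares slope m and intercept b for y = m*x + b."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    m = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
    return m, my - m * mx

# Hypothetical linear data: scores rise 2 points per hour studied.
hours = [1, 2, 3, 4, 5]
linear_scores = [52, 54, 56, 58, 60]

# Hypothetical non-linear data: y = x squared, symmetric around zero.
x_values = [-2, -1, 0, 1, 2]
curved_values = [4, 1, 0, 1, 4]

print(fit_line(hours, linear_scores))        # slope and intercept
print(pearson_r(hours, linear_scores))       # close to 1
print(pearson_r(x_values, curved_values))    # 0 despite a perfect curve
```

A coefficient near zero therefore means "no linear relationship", not "no relationship", which is why it is worth plotting the data first.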

Multivariate Statistics
While bivariate statistics are useful, the majority of issues in education are too complicated to be boiled down to single-cause scenarios. In general, the phenomena that we study have multiple causes and multiple effects. As a result, multivariate correlation statistics are very popular in educational research, especially the versatile multiple regression technique.
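To give a feel for multiple regression, here is a minimal sketch that predicts one criterion variable from two predictors by ordinary least squares. Everything in it is an assumption made for illustration: the function names are hypothetical, the data are invented so that they fall exactly on score = 40 + 3(hours) + 2(classes attended), and in real work you would use a statistics package rather than hand-rolled algebra.

```python
def solve_linear_system(a, b):
    """Solve a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix, so row operations apply to both sides at once.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution, from the last unknown to the first.
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - known) / m[r][r]
    return x

def multiple_regression(predictors, criterion):
    """Least-squares coefficients [intercept, b1, b2, ...] via the normal equations."""
    rows = [[1.0] + list(p) for p in predictors]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, criterion)) for i in range(k)]
    return solve_linear_system(xtx, xty)

# Hypothetical students: (hours studied, classes attended) and test score.
predictors = [(1, 8), (2, 6), (3, 9), (4, 7), (5, 10)]
test_scores = [59, 58, 67, 66, 75]

coefficients = multiple_regression(predictors, test_scores)
print([round(c, 3) for c in coefficients])  # intercept, hours weight, attendance weight
```

Each weight estimates how much the score changes per unit of one predictor while the other is held constant, which is what a pair of separate bivariate correlations cannot tell you.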
The following table summarizes the various multivariate techniques available to you. It was adapted from Gall, Borg, and Gall (1996: p. 433):
| Statistic | Use |
|---|---|
| Multiple Regression | Calculate r between a single criterion variable and a combination of two or more predictor variables |
| Discriminant Analysis | Calculate r between 2 or more predictor variables and a single criterion variable involving categories |
| Canonical Correlation | Predict a combination of several criterion variables from a combination of several predictor variables (similar to MANOVA in which independent variable is a composite of 2 or more). |
| Path Analysis | Test theories about hypothesized causal links between correlated variables (unique in that you must have formulated a causal theory first). |
| Structural Equation Modeling | Test theories about hypothesized causal links between variables that are correlated (yields more valid and reliable measures of the variables to be analyzed than path analysis) |
| Factor Analysis | Reduce a large number of variables to a few factors by combining variables that are moderately or highly correlated with each other |
| Differential Analysis | Examine r's between variables among homogeneous subgroups within a sample (can be used to identify moderator variables that improve a measure's predictive validity) |
"At the heart of any results section is information from which the researchers will ultimately draw conclusions about answers to the research problem" (Sowell & Casey, 1982: p. 128). Inferential statistics is the key to unlocking the generalizability of the results. Whether the data is correlational or experimental, it is critical to know the significance level of the results.
Significance and Alpha Levels - The significance level, also called the alpha level, is the risk you are willing to accept of concluding that an effect exists when the results are really due to random chance (rather than your theory). An alpha of .05 means accepting a 5% risk of that kind of error and corresponds to a .95 confidence level. An alpha of 1% (.01) corresponds to a .99 confidence level and is a stricter standard, leaving less room for the results to be attributed to error. The alpha level is set by you, the researcher, before you run the analysis, but .05 alpha levels are commonly used in educational research.
Probability or P-value - When you generate your results and run some form of inferential statistical analysis on the data, you will come up with a probability (a P-value). The P-value is the probability of getting results at least as extreme as yours if random chance alone were at work. If that P-value is lower than the alpha level that you have set, then the results are significant, meaning that random chance is an unlikely explanation for them. For example, if your P-value is .03 and alpha is .05, then the results are significant. If the P-value is .056, then the results are not significant and are considered inconclusive (random chance cannot be ruled out). When it comes to P-values, less is definitely better, since a small P-value means that chance alone would rarely produce results like yours. I think of the P-value as a weed: you can't kill it completely, but you can sure try!
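One way to see what a P-value is counting is an exact permutation test, sketched below. This technique is not covered elsewhere on this page; it is used here only because it computes the probability directly by reshuffling the group labels in every possible way. The function name and the recall scores are hypothetical.

```python
from itertools import combinations

def permutation_p_value(group_a, group_b):
    """Two-sided exact permutation test on the difference between group means."""
    pooled = group_a + group_b
    n_a, n_b = len(group_a), len(group_b)
    observed = abs(sum(group_a) / n_a - sum(group_b) / n_b)
    total = sum(pooled)
    extreme = 0
    splits = 0
    # Try every way of relabeling n_a of the pooled scores as "group A".
    for chosen in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in chosen)
        difference = abs(sum_a / n_a - (total - sum_a) / n_b)
        if difference >= observed - 1e-12:   # at least as extreme as what we saw
            extreme += 1
        splits += 1
    return extreme / splits

# Hypothetical recall scores, invented for illustration only.
color_coded = [9, 10, 11, 12]
plain = [5, 6, 7, 8]

alpha = 0.05
p_value = permutation_p_value(color_coded, plain)
print(round(p_value, 4), "significant" if p_value < alpha else "not significant")
```

Only 2 of the 70 possible relabelings separate the groups as cleanly as the real data, so chance alone would rarely produce this result and the P-value falls below the .05 alpha level.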
T-ratio and F-ratio - Often, you will see F-ratios and T-ratios associated with your probabilities. These are the test statistics from which the probabilities were computed. For a given number of degrees of freedom, a larger ratio (in absolute value) yields a smaller P-value. Chi square is another popular test statistic.
Degrees of Freedom - The degrees of freedom (df) are combined with the F-value, T-value, Chi Square value, etc. to calculate your P-value. In the simplest case (e.g., a one-sample t-test), the df is your sample size minus one. You'll report that as well. So, your results will look something like this (the numbers are made up to show the format): "The analysis revealed significant effects of color-coding on recall tasks, F(1, 52) = 4.85, P < .05." In English, this means that you ran an analysis of variance (ANOVA) with 1 degree of freedom between groups (two groups minus one) and 52 degrees of freedom within groups (54 participants minus two groups), your F-value is 4.85, and your P-value is less than alpha (.05).
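To show where the two numbers inside F( , ) come from, here is a minimal one-way ANOVA sketch. The function name and the recall scores are hypothetical: two invented groups of 27 participants each, chosen so that the degrees of freedom come out as 1 and 52. Converting the F-value to a P-value needs an F distribution table or a statistics package, which is left out here.

```python
def one_way_anova(groups):
    """Return (F-ratio, df between groups, df within groups) for a one-way ANOVA."""
    all_values = [x for g in groups for x in g]
    grand_mean = sum(all_values) / len(all_values)
    means = [sum(g) / len(g) for g in groups]
    # Variability of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variability of individual scores around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within

# Hypothetical recall scores for 27 participants per condition.
color_coded = [6, 7, 8] * 9
plain = [5, 6, 7] * 9

f_ratio, df_between, df_within = one_way_anova([color_coded, plain])
print(f"F({df_between}, {df_within}) = {f_ratio:.2f}")
```

The first df depends only on the number of groups and the second on the number of participants, which is why both are reported alongside the F-value.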
Null Hypothesis - Often, you'll read about the null and alternate hypotheses. This is just one of those things that scientists have made up to confuse people who are not in our special club. The alternate hypothesis is what your theory predicts, while the null hypothesis is exactly the opposite. You predict there will be a difference, so the null predicts there will be no difference. It's like that annoying friend who always takes the opposite viewpoint just to be a thorn in your side.
Rejecting the Null - More of that confusing stuff. If you get significant results (i.e., your P-value is lower than the alpha level you set), then we're kicking that annoying friend of yours to the curb by rejecting the null hypothesis. Don't think any further about it because it just gets confusing. I suggest you just focus on those P-values and your alpha level.
Independent vs. Dependent Sample - Some sample groups consist of discrete subcategories (e.g., male and female), while others are related to each other (e.g., pretest and post-test scores). As the chart below indicates, the inferential statistics that you use will depend on whether or not the sample group is independent and on what type of data you are collecting.
The following table summarizes the various types of inferential statistical analysis techniques available to you if you are looking at just one dependent variable. It was adapted from a handout provided by Ed Yoder (Penn State University).
| Type of Sample | Nominal Data | Ordinal Data | Interval-Ratio Data |
|---|---|---|---|
| Independent Sample | Chi Square Test | Mann-Whitney U Test | T-Test |
| Dependent Sample | McNemar Test | Wilcoxon Matched-Pairs Signed-Ranks Test | Correlated t-Test |
For more information about these, refer to Hinkle, Wiersma, & Jurs (1988), Marascuilo & Serlin (1988), and Pagano (1981).
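For interval-ratio data, the difference between the independent and dependent rows of the chart can be sketched as follows. The function names and scores are hypothetical, and the independent version assumes equal variances in the two groups (the pooled-variance form, which is one common choice). Only the t-ratio and df are computed; looking up the P-value is left to a table or a statistics package.

```python
import statistics

def independent_t(a, b):
    """Pooled-variance t-ratio and df for two independent samples."""
    n_a, n_b = len(a), len(b)
    pooled_var = ((n_a - 1) * statistics.variance(a) +
                  (n_b - 1) * statistics.variance(b)) / (n_a + n_b - 2)
    std_error = (pooled_var * (1 / n_a + 1 / n_b)) ** 0.5
    return (statistics.mean(a) - statistics.mean(b)) / std_error, n_a + n_b - 2

def correlated_t(before, after):
    """t-ratio and df for paired scores, based on each person's own change."""
    diffs = [y - x for x, y in zip(before, after)]
    n = len(diffs)
    std_error = statistics.stdev(diffs) / n ** 0.5
    return statistics.mean(diffs) / std_error, n - 1

# Hypothetical pretest and post-test scores for the same five students.
pretest = [60, 65, 70, 75, 80]
posttest = [64, 68, 75, 78, 85]

t_paired, df_paired = correlated_t(pretest, posttest)
t_indep, df_indep = independent_t(pretest, posttest)
print(round(t_paired, 2), df_paired)
print(round(t_indep, 2), df_indep)
```

With these invented scores the correlated t-ratio is far larger in absolute value than the independent one, because pairing removes the large differences between students and leaves only each student's gain. Treating dependent samples as if they were independent would hide that effect.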
http://ericae.net/testcol.htm
* Search for an instrument to measure a particular phenomenon. Developing and testing your own can require a good deal of statistical work and can distract you from your research question(s).
http://www.statlets.com/free/samsize1.htm
Sample size calculation tool
http://www.mhhe.com/socscience/education/edstat/resources.html
Educational Research and Statistics Link
http://www.utexas.edu/cc/stat/world/Education.html
Statistical Services Internet Statistics Resources
http://llanes.panam.edu/research/advisor/statistics.html
Quantitative Advisor
http://www.academicpress.com/ssr
Social Science Research
http://www.qualitative-research.net/fqs/fqs-e/inhalt1-01-e.htm
Qualitative and Quantitative Research: Conjunctions and Divergences
http://www.library.miami.edu/netguides/psymeth.html
Research Methods in the Social Sciences: Internet Resource List - University of Miami
http://www.education.uconn.edu/siegle/research/Qualitative/qualquan.htm
Qualitative vs. Quantitative - Del Siegle
Gall, M.D., Borg, W.R., & Gall, J.P. (1996). Educational research: An introduction (6th ed.). London: Longman.
Glass, G.V., & Stanley, J.C. (1970). Statistical methods in education and psychology. Englewood Cliffs, NJ: Prentice-Hall.
Hinkle, D.E., Wiersma, W., & Jurs, S.G. (1988). Applied statistics for the behavioral sciences. Boston, MA: Houghton Mifflin Company.
Hopkins, K.D., & Glass, G.V. (1978). Basic statistics for the behavioral sciences. Englewood Cliffs, NJ: Prentice-Hall.
Marascuilo, L.A., & Serlin, R.C. (1988). Statistical methods for the social and behavioral sciences. New York, NY: W.H. Freeman Company.
Pagano, R.R. (1981). Understanding statistics in the behavioral sciences. St. Paul, MN: West Publishing Company.
Sowell, E.J., & Casey, R.J. (1982). Analyzing educational research. Belmont, CA: Wadsworth.




