The data we obtain from
surveys or from other research approaches are the raw data - the actual
responses given or made by each member of one or more groups of individuals
that interest us. In order to better understand these data, we look for
ways to summarize them. We talk about what percentage chose "X" or "Y",
what was the average (mean) rating, or similar summary statements that we can put
into text or tables to describe both what we found in our sample and,
hopefully, what that tells us about the people who interest us in general.
Which statistics are most appropriate for your data will depend on the measurement
scale used in collecting information on each particular item. There are
four types of measurement scales.
As the name implies, nominal scales are really just names. Hair color, gender,
and marital status are examples of nominal variables. There is no
ordering among the categories (blonde hair is not more or less than
brown, various industry classifications are not "more" or "less") and averaging is not appropriate for this type of data. The
measures used to describe this type of data are the percentages that fall into
each category or the mode (the most commonly selected category).
Nominal scales are for
classification - they are not "measures" in the true sense of the term as they
do not represent "quantities", "magnitudes", "frequencies", or the
like. They are simply classifications. You use nominal scales when
the categories are exhaustive (include all alternatives, even though one may
be "other") and mutually exclusive (none fall into more than one category).
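As a minimal sketch of the summaries described for nominal data, the following computes the percentage of responses in each category and the mode. The function name `summarize_nominal` and the sample hair-color responses are hypothetical, chosen only for illustration.

```python
from collections import Counter


def summarize_nominal(responses):
    """Return (percentage per category, list of modal categories)."""
    if not responses:
        raise ValueError("responses must not be empty")
    counts = Counter(responses)
    total = len(responses)
    # Percentage of all responses that fall into each category.
    percentages = {cat: 100.0 * n / total for cat, n in counts.items()}
    # The mode is the most commonly selected category; ties are all reported.
    top = max(counts.values())
    modes = sorted(cat for cat, n in counts.items() if n == top)
    return percentages, modes


# Hypothetical responses to a hair-color item.
hair = ["brown", "blonde", "brown", "black", "brown", "blonde", "red", "black"]
percentages, modes = summarize_nominal(hair)
print(percentages)
print(modes)
```

Note that no mean is computed: the categories have no order or magnitude, so only counts, percentages, and the mode are meaningful.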
Ordinal scales reflect ordered categories (e.g., small, medium, or large). We can
say one category reflects more of the attribute we are measuring than does
another, or that the top category reflects more than those below it, but we
cannot say how much more. "Satisfaction", for example, can be rated from
less to more highly satisfied, but we are not sure if the difference between
being "Very Satisfied" and being "Satisfied" is really equivalent to the
difference between being "Dissatisfied" and being "Very Dissatisfied."
This limits the math, and therefore the statistics, we can use with this type of
data. Averages, for example, are not really appropriate
here. As a result, the statistical tests for this type of data tend to
use other approaches, such as looking at rankings across the categories.
You can summarize
these data as the percentage of respondents who fall into each category or
as the mode, the most commonly selected category. The mode can be used to express the
"middle" of the distribution; however, the frequency distribution
(percentages in each category) will generally suffice. There is no measure of variability
for this type of data.
Most items on surveys
have ordinal scale alternatives as selections that the respondents can choose
from. We generally use 5-point scales that include negative (e.g.,
"Very Dissatisfied", "Dissatisfied"), neutral ("Neutral"), and positive
alternatives ("Satisfied", "Very Satisfied"). Five-point scales are
sufficient for showing change in most instances and we find these easier to
express in writing in reports than we do some 7-point scales.
With interval scales we have ordered categories that are equidistant from each
other. If the categories are labeled A, B, and C, we can say A < B < C
and (C-B) = (B-A). If this is starting to look like algebra, then you
can appreciate that we can use more complex mathematics in analyzing this type
of data and, as a result, apply more powerful
analytical procedures. Unfortunately, questionnaires rarely generate
this type of data. Analyses of variance and
t-tests are examples of the types of tests for group differences that are
commonly used with interval scale data.
These data can also be
summarized as the percentage of respondents who fall into each category.
The median can be used to indicate the center or mid-point of the
distribution and the interquartile range can be used as an indication of
variability in the data.
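A short sketch of these ordinal summaries follows, assuming the responses have been coded 1 to 5 in category order. The labels, the sample responses, and the choice of `statistics.median_low` (which always returns an actual category rather than an average of two) are illustrative assumptions.

```python
from statistics import median_low, quantiles

# Hypothetical 5-point satisfaction scale, coded 1..5 in category order.
LABELS = ["Very Dissatisfied", "Dissatisfied", "Neutral",
          "Satisfied", "Very Satisfied"]

# Hypothetical coded responses from 11 respondents.
responses = [1, 2, 2, 3, 3, 4, 4, 4, 4, 5, 5]

# Middle of the distribution, reported as one of the observed categories.
median_code = median_low(responses)

# Quartiles; the distance between the first and third is the interquartile range.
q1, _, q3 = quantiles(responses, n=4)
iqr = q3 - q1

print("Median category:", LABELS[median_code - 1])
print("Interquartile range (in scale points):", iqr)
```

The numeric codes here only carry order, so the interquartile range is best read as "the middle half of respondents fell between these two categories" rather than as a true distance.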
Some users of statistics feel
comfortable applying statistics for interval scales to these data if items
are summed to produce a total score (e.g., a satisfaction or a loyalty
index) or there is a wide range of ordered categories, but purists are
uncomfortable with this approach. There are, however, a variety of
statistical procedures (referred to as non-parametric statistics) that can
be used to test for the significance of differences between groups or
subgroups or to assess the significance of changes that occur over time.
For this type of data,
we can compute mean (average) scores or medians and percentiles.
Typically, the median is preferred if the data are skewed (biased towards
lower or higher scores) or the range can go to very high values (as in housing
costs or income levels) as the median is less affected by skew and outliers
than is the mean. Measures of the variance, the standard deviation, or
the median absolute deviation can be used to express variability in the
responses. Confidence intervals can be computed for sample means to
express the range within which the population mean is likely to lie.
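The sketch below illustrates these summaries on a small, deliberately skewed sample. The income figures are made up, the helper names are hypothetical, and the confidence interval uses a normal approximation that is an assumption suited to large samples; with a sample this small, a t-based interval would be wider.

```python
from statistics import NormalDist, mean, median, stdev


def median_absolute_deviation(values):
    """Median of the absolute deviations from the median."""
    center = median(values)
    return median(abs(v - center) for v in values)


def mean_confidence_interval(values, level=0.95):
    """Approximate CI for the mean (normal approximation, an assumption)."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    m = mean(values)
    margin = z * stdev(values) / len(values) ** 0.5
    return m - margin, m + margin


# Hypothetical incomes in thousands; one very high value skews the data.
incomes = [30, 32, 35, 38, 40, 41, 45, 50, 250]

avg = mean(incomes)
mid = median(incomes)
sd = stdev(incomes)
mad = median_absolute_deviation(incomes)
ci_low, ci_high = mean_confidence_interval(incomes)

print(f"mean={avg:.1f} median={mid} sd={sd:.1f} MAD={mad}")
print(f"approx. 95% CI for the mean: ({ci_low:.1f}, {ci_high:.1f})")
```

In this sample the single outlier pulls the mean well above the median and inflates the standard deviation, while the median and the median absolute deviation barely move.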
These are interval scales with true zero points. Sometimes it seems this
only matters to physicists, who invented the Kelvin scale (which has an
absolute 0) so that they could use forms of mathematics that are not possible
with measures taken in Fahrenheit and Celsius, which are interval level
measures. For practical purposes in the behavioral sciences, we
would generally be happy if we could get true interval level measurement and
it is unlikely we will ever measure attitudes, opinions, or intentions with
anywhere near this level of precision.
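A small sketch can make the role of the true zero point concrete. The two temperatures are arbitrary examples: differences are the same in Celsius and Kelvin, but ratios only make sense on the Kelvin scale.

```python
def celsius_to_kelvin(celsius):
    """Convert a Celsius temperature to Kelvin."""
    return celsius + 273.15


# Two arbitrary example temperatures.
cool_c, warm_c = 10.0, 20.0
cool_k, warm_k = celsius_to_kelvin(cool_c), celsius_to_kelvin(warm_c)

# Differences are meaningful on an interval scale and agree across scales.
diff_c = warm_c - cool_c
diff_k = warm_k - cool_k

# Ratios depend on where zero sits, so they differ between the two scales.
ratio_c = warm_c / cool_c   # suggests "twice as hot", which is not meaningful
ratio_k = warm_k / cool_k   # the physically meaningful ratio

print(f"difference: {diff_c:.2f} C vs {diff_k:.2f} K")
print(f"ratio: {ratio_c:.3f} (Celsius) vs {ratio_k:.3f} (Kelvin)")
```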
The statistics for
testing for group differences or changes over time that are available for this
type of data (e.g., the t-test, the Analysis of Variance or ANOVA, etc.) tend
to be more powerful. In large samples, there is the risk that even
trivial differences will be statistically significant. For this reason,
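some researchers advocate the use of confidence intervals, which show the size of a difference as well as its significance. When the
confidence interval for the difference between two means does not span 0,
the difference is statistically significant at the corresponding level (for example, a 95% interval corresponds to p < .05).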
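To illustrate how sample size interacts with significance, the sketch below computes an approximate confidence interval for the difference between two means from summary statistics. The means, standard deviations, and sample sizes are invented, the function names are hypothetical, and the normal approximation is an assumption suited to large samples.

```python
from math import sqrt
from statistics import NormalDist


def difference_ci(mean_a, sd_a, n_a, mean_b, sd_b, n_b, level=0.95):
    """Approximate CI for (mean_a - mean_b), using a normal approximation."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    diff = mean_a - mean_b
    # Standard error of the difference between two independent means.
    se = sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    return diff - z * se, diff + z * se


def excludes_zero(interval):
    """True when the whole interval lies on one side of 0."""
    low, high = interval
    return low > 0 or high < 0


# Hypothetical: the same tiny difference (3.52 vs 3.50 on a 5-point scale)
# observed with 100 respondents per group and with 50,000 per group.
small = difference_ci(3.52, 1.0, 100, 3.50, 1.0, 100)
large = difference_ci(3.52, 1.0, 50_000, 3.50, 1.0, 50_000)

print("n=100 per group:   ", small, "excludes 0:", excludes_zero(small))
print("n=50,000 per group:", large, "excludes 0:", excludes_zero(large))
```

With the larger samples the interval excludes 0, so the difference is statistically significant, yet the interval also shows that the difference is at most a few hundredths of a scale point.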
Use the highest level of measurement that is available to you. It is
better to express age in years than it is to express it in categories
(children, adolescents, young adults, etc.) as you have more power in
analyzing the data. Also, you can always break continuous data down into
categories for reporting purposes, but you cannot turn categorical data into
continuous data later if you change your mind.
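As a brief illustration of breaking continuous data down for reporting, the sketch below groups ages in years into categories. The cutoffs and the "older adults" label are assumptions made up for this example; the original age in years is kept, so different groupings remain possible later.

```python
from bisect import bisect_right

# Hypothetical lower bounds for each group after the first (assumed cutoffs).
CUTOFFS = [13, 18, 35]
LABELS = ["children", "adolescents", "young adults", "older adults"]


def age_group(age_in_years):
    """Map an age in years to a reporting category."""
    return LABELS[bisect_right(CUTOFFS, age_in_years)]


# Hypothetical ages collected as continuous data.
ages = [8, 13, 17, 18, 29, 35, 62]
groups = [age_group(a) for a in ages]

for age, group in zip(ages, groups):
    print(age, "->", group)
```

The reverse mapping is not possible: a respondent recorded only as "young adults" cannot be assigned an exact age afterwards.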