# Introduction to Using Statistics

Author/Creation: Candice Chovanec Melzow, September 2010.
Summary:  Provides an overview of using statistics in projects, reports, or papers.
Learning Objectives:  To select statistics from reputable sources by considering the source and its background. To report statistics ethically by providing appropriate context (e.g., the base of the percentage).

Statistics are one of the strongest types of evidence that writers can use, provided they are used properly. Statistics can help “prove” your argument because they provide quantitative, concrete information about a subject, which can in turn be interpreted to support your position. Let’s discuss a few guidelines for using statistics effectively and ethically.

Use statistics that come only from reliable sources.
Statistics are very easy to manipulate, so they should be used with caution. You must check the reputation of the source from which the statistics were taken. If the source is not reputable, then avoid using statistics from it. If you are not sure whether a source is reputable, it is always best to check with your instructor or a librarian.

Also keep in mind that a source’s author will sometimes use statistics that he or she obtained from another source. The source that you have is called the secondary source; the source where the information originally appeared is called the primary source. Whenever possible, go back to the primary, or original, source, especially when using statistics.

Understand the background.
Since statistics are very easy to manipulate, it is imperative that you check on the background of the study from which the statistics were derived. Ask yourself the following questions:

1. Who conducted the research? (What motivations do the researchers have?)
2. Were the statistics derived from a poll, a survey, or a clinical study?
3. How big was the survey group?
4. What were the geographic location, gender, ethnicity, and income level of the people studied?

Asking these types of questions may help you to detect biases or limitations in the data. In some cases, those biases or limitations may mean that the research is no longer useful for your paper, project, or report. In other cases, you may be able to use the data as long as you acknowledge its limitations.

Exercise caution when using averages.
If you are using statistics that discuss “averages,” you must include the details that clearly identify or define what is meant by the term “average.” The word “average” can refer to any of three measures: the mean, the median, or the mode, and these can give different results for the same data. Let’s take a look at the definition of each of these terms in conjunction with a set of numbers: 68, 74, 76, 85, 85, 92.

The mean is calculated by adding all of the numbers in the set and then dividing that total by the quantity of numbers in the set. The mean is what most people have in mind when they say “average.”

68 + 74 + 76 + 85 + 85 + 92 = 480; 480/6 = 80

The mean of the set is 80.
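The arithmetic above can be checked with a few lines of Python. This is only a minimal sketch; the variable names are illustrative and not part of the original handout.

```python
from statistics import mean

# The example set of numbers from the text.
scores = [68, 74, 76, 85, 85, 92]

# Step 1: add all of the numbers in the set.
total = sum(scores)

# Step 2: divide that total by the quantity of numbers in the set.
count = len(scores)
manual_mean = total / count

# The standard library's mean() should agree with the manual calculation.
library_mean = mean(scores)

print("total:", total)
print("count:", count)
print("mean:", manual_mean)
```

Doing the calculation by hand and with `statistics.mean` side by side is one way to confirm that a reported “average” really is a mean.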

The median, on the other hand, is the “middle” number when the set is arranged in order from smallest to largest and contains an odd quantity of numbers. If the set contains an even quantity of numbers, then the median is found by taking the mean of its two middle numbers.

68, 74, 76, 85, 85, 92

76 + 85 = 161; 161/2 = 80.5

The median for the set is 80.5.
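The median calculation for this set can be sketched in Python as follows. The list is deliberately given out of order to show that the numbers must be sorted first; the variable names are illustrative assumptions.

```python
from statistics import median

# The example set, deliberately unsorted.
scores = [92, 68, 85, 74, 85, 76]

# The median is defined on the values in sorted order.
ordered = sorted(scores)
n = len(ordered)
mid = n // 2

if n % 2 == 1:
    # Odd quantity of numbers: take the single middle value.
    manual_median = ordered[mid]
else:
    # Even quantity of numbers: take the mean of the two middle values.
    manual_median = (ordered[mid - 1] + ordered[mid]) / 2

# The standard library's median() sorts internally and should agree.
library_median = median(scores)

print("ordered:", ordered)
print("median:", manual_median)
```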

The mode is the most frequently occurring number in a set. Keep in mind that a set of numbers can have more than one mode.

68, 74, 76, 85, 85, 92

The mode for the set is 85.
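A short Python sketch can find the mode and also illustrate the point that a set may have more than one. The second list, `two_mode_scores`, is a hypothetical set invented for this illustration and does not come from the text; `multimode` requires Python 3.8 or later.

```python
from collections import Counter
from statistics import multimode

# The example set from the text.
scores = [68, 74, 76, 85, 85, 92]

# Count how often each number occurs, then keep the most frequent one(s).
counts = Counter(scores)
highest_count = max(counts.values())
manual_modes = sorted(n for n, c in counts.items() if c == highest_count)

# multimode() returns every value tied for most frequent.
library_modes = multimode(scores)

# Hypothetical set (not from the text) in which two values tie.
two_mode_scores = [68, 74, 74, 85, 85, 92]
tied_modes = multimode(two_mode_scores)

print("modes of example set:", manual_modes)
print("modes of hypothetical set:", tied_modes)
```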

Note that the mean, median, and mode give three different values (80, 80.5, and 85) for the same set of numbers. Therefore, when statistics feature an “average,” it is best to find out which “average” (mean, median, or mode) is meant in that particular case.

Give the bases of all percentages that you use.
Percentages are derived from specific bases, so you must always give the base when reporting them. Without its base, a percentage is meaningless.

For example, if a university report states that “Students who participated in minimalist tutoring sessions received scores that were 70% higher,” this leaves the audience asking “Seventy percent higher than what?” The reason for confusion here is that a base is not given.
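To see why the base matters, consider the following Python sketch. The scores in it are hypothetical numbers invented for illustration and are not taken from any actual report; `percent_increase` is likewise an illustrative helper, not an established function.

```python
def percent_increase(base, new_value):
    """Return how much higher new_value is than base, as a percentage of base."""
    if base == 0:
        raise ValueError("A percentage cannot be computed from a base of zero.")
    return (new_value - base) / base * 100

# Hypothetical scenario A: scores rise from a base of 50 points to 85 points.
increase_a = percent_increase(50, 85)

# Hypothetical scenario B: scores rise from a base of 10 points to 17 points.
increase_b = percent_increase(10, 17)

# Both scenarios are "70% higher," yet the actual gains differ greatly.
gain_a = 85 - 50
gain_b = 17 - 10

print("A:", increase_a, "percent higher, a gain of", gain_a, "points")
print("B:", increase_b, "percent higher, a gain of", gain_b, "points")
```

In both hypothetical scenarios the percentage is the same, but a reader who knows the base can tell a 35-point gain from a 7-point gain.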