A WebQuest for exploring relationships between variables using simple linear regression. ![]()
Simple linear regression is a statistical method used to show the relationship between two variables (Weisberg, 2005).
You start with pairs of values called X and Y. These can be any numerical values. The goal is to
see if there is a relationship between X and Y and to determine if it is a strong relationship.
Simple Linear Regression is useful in a wide set of applications and industries, and since it is simple
to use, the method is a good starting point to analyze existing data and then predict Y values for any X. The only
rule of thumb is to pick X values within your minimum and maximum.
Using values outside of your minimum and maximum X is called extrapolation, and can yield bad results.
The X value is called the independent variable and the Y value is called the dependent variable. In common language,
the Y values depend on what X values you choose. While you can determine if a relationship exists with Simple Linear Regression,
the relationship is not necessarily a cause and effect relationship.
It takes more study to determine cause and effect relationships.
The equation that best represents your data as a line on the chart is called
the Simple Linear Regression Equation. It takes the form of Y = slope times X plus the y-intercept.
The slope is defined as the change in Y when X changes 1 unit of whatever you are measuring.
The y-intercept is the Y value when X has a value of zero. In the chart above,
1.042 is the value for the slope and .057 is the value for the y-intercept.
The slope and y-intercept are the two values we need to create the linear regression equation.
The nice thing about Microsoft Excel (R) is that it can calculate the values for you while producing the chart.
The R squared value is defined as the coefficient of determination.
R squared shows the amount of variation in Y that can be determined from the variation in X.
If the value is nearer to 1, it shows a strong relationship between X values and Y values.
If the value is nearer to 0 (Zero), it shows a weak relationship between X values and Y values.
(Bewick, Cheek, & Ball, 2003).
In this lesson, you will be able to use Microsoft Excel (R) to:
In order to master the skill, you can run the video and audio tutorial lesson as many times as you need to.
First you'll run
the spreadsheet tutorial...
Check your results against Y = 10.63 times X - 21264 as the simple regression equation
and the predicted Y value = 40.646 when X = 2004.2
If your results are different, you might be using a different number of decimal values in your calculation or the results.
If the numbers are way off, run the tutorial again to figure out how to run the process.
In a normal course, this is a good opportunity to have a student turn in results for a grade,
but as research, just compare your results to make sure you performed the function correctly.
Check your results against Y = -1.093 times X + 7.547 as the simple regression equation
and the predicted Y value = 4.858 when X = 2.46
If your results are different, it could be the number of decimal places. If way off, run the tutorial again to figure out how to run the process.
In a normal course, this is a good opportunity to have a student turn in results for a grade,
but as research, just compare your results to make sure you performed the function correctly.
Check your results against Y = .1 times X + 1 as the simple regression equation
and the predicted Y value = 3.04 when X = 20.4
If your results are different, it could be the number of decimal places. If way off, run the tutorial again to figure out how to run the process.
In a normal course, this is a good opportunity to have a student turn in results for a grade,
but as research, just compare your results to make sure you performed the function correctly.
Check your results against Y = -1.03 times X - 111.4 as the simple regression equation
and the predicted Y value = -414.22 when X = 294
If your results are different, it could be the number of decimal places. If way off, run the tutorial again to figure out how to run the process.
In a normal course, this is a good opportunity to have a student turn in results for a grade,
but as research, just compare your results to make sure you performed the function correctly.
There is a maximum of 8 points for this lesson.
1
2
3
4
To master the skill of performing Simple Linear Regression, a person needs to be able to derive the Simple Linear Regression equation and to be able to predict a Y value for a given X value. Microsoft Excel (R) is used in this skills training to make the calculations quick and easy. If you reached at least a score of 6 from the rubric above, you are ready to move to the next stage. If not, repeat the tutorial and try again.
The image and data used in this WebQuest are original material.
Teach Yourself Statistics (2009). AP Statistics tutorial: Least squares linear regression.
Retrieved from http://stattrek.com/AP-Statistics-1/Regression.aspx
CAUSEweb (2009). The Consortium to Advance Undergraduate Statistics Education. Retrieved from
http://www.causeweb.org
Bewick, V., Cheek, L., & Ball, J. (2003). Statistics review 7: Correlation and regression. Retrieved from http://ccforum.com/content/7/6/451
Weisberg, S. (2005). Applied Linear Regression, 3rd Ed. published by Wiley/Interscience in 2005 (ISBN 0-471-66379-4).
For important guidelines on how to teach statistics for college level students, see the following reference:
GAISE (2005). The American Statistical Association Web site Guidelines for Assessment and
Instruction in Statistics Education (GAISE) College report. Retrieved from
http://www.amstat.org/education/GAISE
|