Difference between standardized and unstandardized regression models
Uploaded: 3 years ago
Category: Statistics and Probability
Type: Lecture Notes
Filename: 4552160.ppt
(750 kB)
Page Count: 55
Transcript
Associations among Continuous Variables
3 characteristics of a relationship
Direction
Degree of association
Form
Regression & Correlation
Correlation
Correlation - Definition
Correlation: a statistical technique that measures and describes the degree of linear relationship between two variables
Pearson’s r
A value ranging from -1.00 to 1.00 indicating the strength and direction of the linear relationship.
Absolute value indicates strength
+/- indicates direction
Deviation Score Formula
Z-score formula
Hypothesis testing with r
Hypotheses
H0: ρ = 0
HA: ρ ≠ 0
Practice
Linear Regression
But how do we describe the line?
If two variables are linearly related it is possible to develop a simple equation to predict one variable from the other
The outcome variable is designated the Y variable, and the predictor variable is designated the X variable
E.g. centigrade to Fahrenheit:
F = 32 + 1.8C
this formula gives a specific straight line
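As a minimal sketch of how one intercept and one slope pin down a specific straight line, the snippet below evaluates the conversion formula at two temperatures. The function name `predict` is a hypothetical helper, not something from the slides.

```python
def predict(a: float, b: float, x: float) -> float:
    """Return the predicted value Y' = a + b*X for intercept a and slope b."""
    return a + b * x

# Centigrade to Fahrenheit: intercept a = 32, slope b = 1.8
freezing = predict(32, 1.8, 0)    # water freezes at 0 C
boiling = predict(32, 1.8, 100)   # water boils at 100 C
print(freezing, boiling)
```

Changing either constant gives a different line; changing only X moves along the same line.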
The Linear Equation
Slope and Intercept
Equation of the line: Y’ = a + bX
The slope b: the amount of change in y with one unit change in x
The intercept a: the value of y when x is zero
When there is no linear association (r = 0), the regression line is horizontal: b = 0.
When the correlation is perfect (r = ±1.00), all the points fall along a straight line with a slope $b = \pm s_Y / s_X$.
When there is some linear association (0 < |r| < 1), the regression line fits as close to the points as possible and has a slope $b = r \, (s_Y / s_X)$.
Where did this line come from?
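Regression lines
Unstandardized Regression Line
Equation of the line: $Y' = a + bX$
The slope: $b = r \, (s_Y / s_X)$
The intercept: $a = \bar{Y} - b\bar{X}$
Standardized Regression Line
Equation of the line: $Z_{Y'} = \beta Z_X$
The slope: $\beta = r$
The intercept: 0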
Exercise: Revisit the seed data
Calculate:
r =
b =
a =
β =
Write the regression equation:
Write the standardized equation:
Exercise: Revisit the seed data
Calculate:
r = .866
b = .375
a = 3.125
β = .866
Write the regression equation: Y’ = 3.125 + .375X
Write the standardized equation: ZY’ = .866 ZX
Overview
Correlation
-Definition
-Deviation Score Formula, Z score formula
-Hypothesis Test
Regression
Intercept and Slope
Unstandardized Regression Line
Standardized Regression Line
Hypothesis Tests
Direction
Positive(+)
Negative (-)
Degree of association
Between –1 and 1
Absolute values signify strength
Form
Linear
Non-linear
Positive: large values of X go with large values of Y, and small values of X go with small values of Y (e.g. IQ and SAT)
Negative: large values of X go with small values of Y, and small values of X go with large values of Y (e.g. speed and accuracy)
Strong (tight cloud)
Weak (diffuse cloud)
Linear
Non-linear
What is the best fitting straight line? Regression Equation: Y = a + bX
How closely are the points clustered around the line? Pearson’s r
Dataset
Obs X Y
A 1 1
B 1 3
C 3 2
D 4 5
E 6 4
F 7 5
[Scatterplot of Y against X, split into four quadrants by the mean of X and the mean of Y: above or below average on X, above or below average on Y]
The Logic of Correlation
Cross-Product = $(X - \bar{X})(Y - \bar{Y})$
For a strong positive association, the cross-products will mostly be positive
For a strong negative association, the cross-products will mostly be negative
For a weak association, the cross-products will be mixed
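The quadrant logic can be checked numerically. The sketch below takes the six-observation dataset (A to F) from these notes and computes each point's cross-product, assuming the cross-product is the product of the two deviations from the means. Variable names are illustrative only.

```python
xs = [1, 1, 3, 4, 6, 7]   # X for observations A..F
ys = [1, 3, 2, 5, 4, 5]   # Y for observations A..F

mean_x = sum(xs) / len(xs)
mean_y = sum(ys) / len(ys)

# One cross-product per observation: (X - mean of X) * (Y - mean of Y)
cross_products = [(x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)]
sp = sum(cross_products)

for obs, cp in zip("ABCDEF", cross_products):
    print(obs, round(cp, 3))
print("SP =", round(sp, 3))
```

Every point here sits in the "below/below" or "above/above" quadrant, so every cross-product is positive, which is the pattern described for a positive association.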
Deviation score formula
SP (sum of products) = $\sum (X - \bar{X})(Y - \bar{Y})$
Obs   Femur (X)  Humerus (Y)  X – mean  Y – mean  SP     SSX     SSY
A     38         41           -20.2     -25       505    408.04  625
B     56         63           -2.2      -3        6.6    4.84    9
C     59         70           0.8       4         3.2    .64     16
D     64         72           5.8       6         34.8   33.64   36
E     74         84           15.8      18        284.4  249.64  324
mean  58.2       66.00
sum                                               834    696.8   1010
$r = \frac{SP}{\sqrt{SS_X \, SS_Y}} = \frac{834}{\sqrt{696.8 \times 1010}} = .99$
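The femur and humerus numbers can be reproduced with a short sketch of the deviation score approach, assuming r is SP divided by the square root of SSX times SSY. The helper name `deviation_sums` is hypothetical.

```python
import math

femur = [38, 56, 59, 64, 74]      # X, observations A..E
humerus = [41, 63, 70, 72, 84]    # Y, observations A..E

def deviation_sums(xs, ys):
    """Return (SP, SSX, SSY) computed from deviations around the means."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return sp, ss_x, ss_y

sp, ss_x, ss_y = deviation_sums(femur, humerus)
r = sp / math.sqrt(ss_x * ss_y)
print(round(sp, 1), round(ss_x, 1), round(ss_y, 1), round(r, 2))
```

The three sums match the table totals (834, 696.8, 1010) up to floating-point rounding.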
For a strong positive association, the SP will be a big positive number
For a strong negative association, the SP will be a big negative number
For a weak association, the SP will be a small number (+ and – will cancel each other out)
Z score formula
Standardized cross-products
Obs   Femur  Humerus  ZX      ZY      ZXZY
A     38     41       -1.530  -1.573  2.408
B     56     63       -0.167  -0.189  0.031
C     59     70       0.061   0.252   0.015
D     64     72       0.439   0.378   0.166
E     74     84       1.197   1.133   1.356
mean  58.2   66.00
s     13.20  15.89
Σ ZXZY = 3.976
r = .99
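A sketch of the z-score route on the same femur and humerus data follows. It assumes the standard deviations use n – 1 in the denominator, which is what the tabled values s = 13.20 and s = 15.89 imply, and that r is the sum of the standardized cross-products divided by n – 1. The helper `z_scores` is a hypothetical name.

```python
import statistics

femur = [38, 56, 59, 64, 74]
humerus = [41, 63, 70, 72, 84]

def z_scores(values):
    """Standardize values using the sample standard deviation (n - 1)."""
    m = statistics.mean(values)
    s = statistics.stdev(values)
    return [(v - m) / s for v in values]

zx = z_scores(femur)
zy = z_scores(humerus)
products = [a * b for a, b in zip(zx, zy)]   # standardized cross-products
r = sum(products) / (len(femur) - 1)
print(round(sum(products), 3), round(r, 2))
```

Small differences from the tabled ZXZY entries come from the slides rounding each z-score before multiplying.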
Formulas for r
Z score formula: $r = \frac{\sum Z_X Z_Y}{n - 1}$
Deviations formula: $r = \frac{SP}{\sqrt{SS_X \, SS_Y}}$
Interpretation of R
A measure of strength of association: how closely do the points cluster around a line?
A measure of the direction of association: is it positive or negative?
Interpretation of R
r = .10 very small association, not usually reliable
r = .20 small association
r = .30 typical size for personality and social studies
r = .40 moderate association
r = .60 you are a research rock star
r = .80 hmm, are you for real?
Interpretation of R-squared
The amount of covariation compared to the amount of total variation
“The percent of total variance that is shared variance”
E.g. “If r = .80, then X explains 64% of the variability in Y” (and vice versa)
Test statistic = r
Or just use table E.2 to find critical values of r
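The slides point to a table of critical values of r. An equivalent route, sketched here as an assumption rather than as the method used in the lecture, is the standard t test of H0: ρ = 0 with df = n – 2, where t = r·√(n – 2) / √(1 – r²). The function name `t_for_r` is hypothetical, and the example reuses the reported r = .99 with n = 5 from the femur and humerus data.

```python
import math

def t_for_r(r: float, n: int) -> float:
    """t statistic for testing H0: rho = 0, with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

r, n = 0.99, 5
t = t_for_r(r, n)
df = n - 2
print(round(t, 2), df)
```

The result would then be compared against a tabled critical t for df = 3 (3.182 for a two-tailed test at α = .05).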
Obs  Alcohol (X)  Tobacco (Y)
A    6.47         4.03
B    6.13         3.76
C    6.19         3.77
D    4.89         3.34
E    5.63         3.47
SP = .64
SSX = 1.55
SSY = .30
Properties of R
A standardized statistic – will not change if you change the units of X or Y (because it is based on z-scores)
The same whether X is correlated with Y or vice versa
Fairly unstable with small n
Vulnerable to outliers
Has a skewed distribution
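Two of these properties are easy to confirm numerically. The sketch below, with a hypothetical helper `pearson_r`, checks that r is unchanged when X is converted to other units (using the 32 + 1.8X conversion as an arbitrary linear rescaling) and when X and Y swap roles.

```python
import math

def pearson_r(xs, ys):
    """Pearson's r from the deviation score formula SP / sqrt(SSX * SSY)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return sp / math.sqrt(ss_x * ss_y)

femur = [38, 56, 59, 64, 74]
humerus = [41, 63, 70, 72, 84]

r_xy = pearson_r(femur, humerus)
r_yx = pearson_r(humerus, femur)                  # roles swapped
rescaled = [32 + 1.8 * x for x in femur]          # X in different units
r_rescaled = pearson_r(rescaled, humerus)
print(round(r_xy, 4), round(r_yx, 4), round(r_rescaled, 4))
```

A rescaling with a negative multiplier would flip the sign of r but leave its absolute value unchanged.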
F = 32 + 1.8(C)
General form is Y = a + bX
The prediction equation: Y’ = a + bX
Where
a = intercept
b = slope
X = the predictor
Y = the criterion
a and b are constants in a given line; X and Y change
When b changes…
When a changes…
When both a and b change…
The slope is influenced by r, but is not the same as r
When b = 0, the prediction is the same at every X; in the height and age example, our best estimate of age is 29.5 at all heights.
It is a straight line which is drawn through a scatterplot, to summarize the relationship between X and Y
It is the line that minimizes the sum of squared deviations $\sum (Y' - Y)^2$
We call these vertical deviations “residuals”
Minimizing the squared vertical distances, or “residuals”
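To illustrate what "minimizing the squared residuals" means, the sketch below fits a line to the six RawM/RawD pairs used in these notes (taken here as X and Y), then nudges the intercept and slope and checks that the sum of squared residuals only gets larger. The formulas b = SP / SSX and a = mean of Y minus b times mean of X are the usual least-squares solution; the function names are hypothetical.

```python
xs = [1, 1, 5, 5, 9, 9]
ys = [3, 4, 4, 6, 6, 7]

def fit_line(xs, ys):
    """Least-squares intercept a and slope b for predicting Y from X."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    b = sp / ss_x
    a = mean_y - b * mean_x
    return a, b

def sum_squared_residuals(a, b, xs, ys):
    """Sum of squared vertical distances between each Y and the line."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

a, b = fit_line(xs, ys)
best = sum_squared_residuals(a, b, xs, ys)

# Try nearby lines: every one of them should fit worse than the fitted line.
nudged = [sum_squared_residuals(a + da, b + db, xs, ys)
          for da in (-0.5, 0.0, 0.5) for db in (-0.1, 0.0, 0.1)
          if (da, db) != (0.0, 0.0)]
print(a, b, best, min(nudged))
```

The fitted values agree with the exercise answers a = 3.125 and b = .375.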
Properties of b (slope)
An unstandardized statistic – will change if you change the units of X or Y.
Depends on whether Y is regressed on X or vice versa
A person 1 stdev above the mean on height would be how many stdevs above the mean on weight?
Properties of β (standardized slope)
A standardized statistic – will not change if you change the units of X or Y.
Is equal to r, in simple linear regression
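The two properties of the unstandardized slope b can be seen with a few lines of Python. This is a sketch on the six RawM/RawD pairs from these notes, with a hypothetical helper `slope`; the factor of 10 is an arbitrary change of units chosen for illustration.

```python
xs = [1, 1, 5, 5, 9, 9]
ys = [3, 4, 4, 6, 6, 7]

def slope(predictor, outcome):
    """Unstandardized slope b = SP / SS of the predictor."""
    mean_p = sum(predictor) / len(predictor)
    mean_o = sum(outcome) / len(outcome)
    sp = sum((p - mean_p) * (o - mean_o) for p, o in zip(predictor, outcome))
    ss_p = sum((p - mean_p) ** 2 for p in predictor)
    return sp / ss_p

b_y_on_x = slope(xs, ys)                       # Y regressed on X
b_x_on_y = slope(ys, xs)                       # X regressed on Y
b_rescaled = slope([x * 10 for x in xs], ys)   # X measured in smaller units
print(b_y_on_x, b_x_on_y, b_rescaled)
```

The slope changes with the direction of prediction and with the units, while the product of the two directional slopes equals r squared (.375 × 2 = .75 = .866²).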
RawM  RawD  ZM      ZD
1     3     -1.118  -1.291
1     4     -1.118  -0.645
5     4     0       -0.645
5     6     0       0.645
9     6     1.118   0.645
9     7     1.118   1.291
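A sketch of the standardized regression on these six pairs: both variables are converted to z-scores (assuming the n – 1 standard deviation, which reproduces the tabled ZM and ZD values), and the least-squares line is then fitted to the z-scores. The names `z_scores`, `beta`, and `intercept` are illustrative.

```python
import statistics

raw_m = [1, 1, 5, 5, 9, 9]   # predictor (RawM)
raw_d = [3, 4, 4, 6, 6, 7]   # outcome (RawD)

def z_scores(values):
    """Standardize values using the sample standard deviation (n - 1)."""
    m = statistics.mean(values)
    s = statistics.stdev(values)
    return [(v - m) / s for v in values]

zm = z_scores(raw_m)
zd = z_scores(raw_d)

# Least-squares line through the z-scores (both means are zero).
sp = sum(a * b for a, b in zip(zm, zd))
ss = sum(a ** 2 for a in zm)
beta = sp / ss
intercept = statistics.mean(zd) - beta * statistics.mean(zm)
r = sp / (len(zm) - 1)
print(round(beta, 3), round(intercept, 3), round(r, 3))
```

The standardized slope comes out equal to r with an intercept of zero, so a case 1 standard deviation above the mean on the predictor is predicted to be r standard deviations above the mean on the outcome.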
Regression Coefficients Table
Predictor   Unstandardized Coefficient  Standard error  Standardized Coefficient  t  sig
Intercept   a                           SEa             -
Variable X  b                           SEb             β                         t  sig
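The entries of such a table can be filled in by hand for a small dataset. The sketch below uses the six RawM/RawD pairs from these notes and the usual simple-regression formulas for the standard errors (an assumption, since the slides do not show them): MSE = SSE / (n – 2), SEb = √(MSE / SSX), and SEa = √(MSE · (1/n + mean of X squared / SSX)). The "sig" column needs a t distribution and is left out.

```python
import math

xs = [1, 1, 5, 5, 9, 9]
ys = [3, 4, 4, 6, 6, 7]
n = len(xs)

mean_x = sum(xs) / n
mean_y = sum(ys) / n
sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
ss_x = sum((x - mean_x) ** 2 for x in xs)
ss_y = sum((y - mean_y) ** 2 for y in ys)

b = sp / ss_x                       # unstandardized slope
a = mean_y - b * mean_x             # intercept
beta = sp / math.sqrt(ss_x * ss_y)  # standardized slope, equal to r here

# Residual variance and standard errors of the two coefficients
sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
mse = sse / (n - 2)
se_b = math.sqrt(mse / ss_x)
se_a = math.sqrt(mse * (1 / n + mean_x ** 2 / ss_x))
t_b = b / se_b

print("Intercept ", round(a, 3), round(se_a, 3))
print("Variable X", round(b, 3), round(se_b, 3), round(beta, 3), round(t_b, 3))
```

The t for the slope equals the t obtained from r·√(n – 2) / √(1 – r²), which is why testing b and testing r give the same answer in simple regression.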
Summary
Correlation: Pearson’s r
Unstandardized Regression Line
Standardized Regression Line
Exercise in Excel
Calculate:
r =
b =
a =
β =
Write the regression equation:
Write the standardized equation:
Sketch the scatterplot and regression line
X  Y
1  3
1  4
5  4
5  6
9  6
9  7
X  Y
1  -1.5
1  0
2  -3
4  -4
7  -2
9  -6
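The exercise is meant for Excel, but the same quantities can be checked in a few lines of Python. This sketch assumes the second X/Y dataset (the one with negative Y values) is the data intended for the exercise, which the transcript does not state; the function name `describe_line` is hypothetical.

```python
import math

xs = [1, 1, 2, 4, 7, 9]
ys = [-1.5, 0, -3, -4, -2, -6]

def describe_line(xs, ys):
    """Return r, b, a and beta for predicting Y from X."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    r = sp / math.sqrt(ss_x * ss_y)
    b = sp / ss_x
    a = mean_y - b * mean_x
    beta = r  # in simple linear regression the standardized slope is r
    return r, b, a, beta

r, b, a, beta = describe_line(xs, ys)
print(round(r, 3), round(b, 3), round(a, 3), round(beta, 3))
```

The negative r and b indicate a line that slopes downward, which is what the sketch of the scatterplot should show.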