Regression Analysis Question with Solution Sample Assignment
PROBLEM
1. Assume you have noted the following prices for books and the number of pages that each book contains.
Book | Pages (x) | Price (y)
A | 500 | $7.00
B | 700 | 7.50
C | 750 | 9.00
D | 590 | 6.50
E | 540 | 7.50
F | 650 | 7.00
G | 480 | 4.50
a. Develop a least-squares estimated regression line.
b. Compute the coefficient of determination and explain its meaning.
c. Compute the correlation coefficient between the price and the number of pages. Test to see if x and y are related. Use α = 0.10.
ANS:
a. ŷ = 1.0416 + 0.0099x
b. r² = 0.5629; the regression equation accounts for 56.29% of the total sum of squares
c. r_xy = 0.75; t = 2.54 > 2.015 (df = 5); p-value is between 0.05 and 0.10 (Excel's result: p-value = 0.052); reject H0 and conclude x and y are related
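The answers to Problem 1 can be reproduced from the raw data. The following is a minimal sketch in plain Python; the function name `simple_regression` and the dictionary keys are illustrative choices, not part of the original solution.

```python
import math

# Data from Problem 1: pages (x) and price in dollars (y)
pages = [500, 700, 750, 590, 540, 650, 480]
price = [7.00, 7.50, 9.00, 6.50, 7.50, 7.00, 4.50]

def simple_regression(x, y):
    """Least-squares fit of y on x with r^2, r, and the t statistic for the slope."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sst = sum((yi - y_bar) ** 2 for yi in y)
    b1 = sxy / sxx                 # slope
    b0 = y_bar - b1 * x_bar        # intercept
    r2 = (b1 * sxy) / sst          # SSR / SST
    r = math.copysign(math.sqrt(r2), b1)      # r takes the sign of the slope
    t = r * math.sqrt((n - 2) / (1 - r2))     # test statistic with n - 2 df
    return {"b0": b0, "b1": b1, "r2": r2, "r": r, "t": t}

fit = simple_regression(pages, price)
print(fit)
```

The t statistic is then compared with the critical value quoted in the answer (2.015 for df = 5 at α = 0.10, two-tailed).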
2. Assume you have noted the following prices for books and the number of pages that each book contains.
Book | Pages (x) | Price (y)
A | 500 | $7.00
B | 700 | 7.50
C | 750 | 9.00
D | 590 | 6.50
E | 540 | 7.50
F | 650 | 7.00
G | 480 | 4.50
a. Perform an F test and determine if the price and the number of pages of the books are related. Let α = 0.01.
b. Perform a t test and determine if the price and the number of pages of the books are related. Let α = 0.01.
c. Develop a 90% confidence interval for estimating the average price of books that contain 800 pages.
d. Develop a 90% confidence interval to estimate the price of a specific book that has 800 pages.
ANS:
a. F = 6.439 < 16.26; p-value is between 0.05 and 0.10 (Excel's result: p-value = 0.052); do not reject H0; there is not enough evidence to conclude that x and y are related
b. t = 2.5376 < 4.032; p-value is between 0.05 and 0.10 (Excel's result: p-value = 0.052); do not reject H0; there is not enough evidence to conclude that x and y are related
c. $7.29 to $10.63 (rounded)
d. $5.62 to $12.31 (rounded)
3. The following data represent the number of flash drives sold per day at a local computer shop and their prices.
Price (x) | Units Sold (y)
$34 | 3
36 | 4
32 | 6
35 | 5
30 | 9
38 | 2
40 | 1
a. Develop a least-squares regression line and explain what the slope of the line indicates.
b. Compute the coefficient of determination and comment on the strength of relationship between x and y.
c. Compute the sample correlation coefficient between the price and the number of flash drives sold. Use α = 0.01 to test the relationship between x and y.
ANS:
a. ŷ = 29.7857 - 0.7286x. The slope indicates that as the price goes up by $1, the number of units sold goes down by 0.7286 units.
b. r² = 0.8556; the regression equation accounts for 85.56% of the total sum of squares
c. r_xy = -0.92; t = -5.44 < -4.032 (df = 5); p-value < 0.01 (Excel's result: p-value = 0.0028); reject H0 and conclude x and y are related
4. The following data represent the number of flash drives sold per day at a local computer shop and their prices.
Price (x) | Units Sold (y)
$34 | 3
36 | 4
32 | 6
35 | 5
30 | 9
38 | 2
40 | 1
a. Perform an F test and determine if the price and the number of flash drives sold are related. Let α = 0.01.
b. Perform a t test and determine if the price and the number of flash drives sold are related. Let α = 0.01.
ANS:
a. F = 29.624 > 16.26; p-value < 0.01 (Excel's result: p-value = 0.0028); reject H0; x and y are related
b. t = -5.4428 < -4.032; p-value < 0.01 (Excel's result: p-value = 0.0028); reject H0; x and y are related
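The F and t statistics for the flash drive data can be checked directly. The sketch below uses plain Python with illustrative variable names; it also shows that, in simple linear regression, F equals the square of the slope's t statistic, which is why both tests report the same p-value.

```python
import math

# Data from Problems 3 and 4: price (x) and units sold per day (y)
price = [34, 36, 32, 35, 30, 38, 40]
units = [3, 4, 6, 5, 9, 2, 1]

n = len(price)
x_bar = sum(price) / n
y_bar = sum(units) / n
sxx = sum((x - x_bar) ** 2 for x in price)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(price, units))
sst = sum((y - y_bar) ** 2 for y in units)

b1 = sxy / sxx              # slope
ssr = b1 * sxy              # regression sum of squares
sse = sst - ssr             # residual sum of squares
mse = sse / (n - 2)         # mean square error, n - 2 degrees of freedom

f_stat = (ssr / 1) / mse            # F = MSR / MSE
t_stat = b1 / math.sqrt(mse / sxx)  # t = b1 / s_b1

print(round(f_stat, 3), round(t_stat, 4))
```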
5. Shown below is a portion of an Excel output for regression analysis relating Y (dependent variable) and X (independent variable).
ANOVA
Source | df | SS
Regression | 1 | 110
Residual | 8 | 74
Total | 9 | 184

Predictor | Coefficients | Standard Error
Intercept | 39.222 | 5.943
x | -0.5556 | 0.1611
a. What has been the sample size for the above?
b. Perform a t test and determine whether or not X and Y are related. Let α = 0.05.
c. Perform an F test and determine whether or not X and Y are related. Let α = 0.05.
d. Compute the coefficient of determination.
e. Interpret the meaning of the value of the coefficient of determination that you found in d. Be very specific.
ANS:
a through d
Summary Output
Regression Statistics
Multiple R | 0.7732
R Square | 0.5978
Adjusted R Square | 0.5476
Standard Error | 3.0414
Observations | 10

ANOVA
Source | df | SS | MS | F | Significance F
Regression | 1 | 110 | 110 | 11.892 | 0.009
Residual | 8 | 74 | 9.25 | |
Total | 9 | 184 | | |

Predictor | Coefficients | Standard Error | t Stat | P-value
Intercept | 39.222 | 5.942 | 6.600 | 0.000
x | -0.556 | 0.161 | -3.448 | 0.009
e. 59.783% of the variability in Y is explained by the variability in X.
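The completed output in Problem 5 follows from the partial table by arithmetic alone. A minimal sketch, assuming a hypothetical helper named `complete_anova` (not an Excel or library function):

```python
import math

def complete_anova(df_reg, ss_reg, df_res, ss_res, slope, se_slope):
    """Fill in a simple-regression summary from the df, SS, and slope columns."""
    n = df_reg + df_res + 1            # total df is n - 1
    ms_reg = ss_reg / df_reg
    ms_res = ss_res / df_res
    ss_total = ss_reg + ss_res
    r2 = ss_reg / ss_total
    return {
        "n": n,
        "F": ms_reg / ms_res,
        "r2": r2,
        "adj_r2": 1 - (1 - r2) * (n - 1) / df_res,
        "std_error": math.sqrt(ms_res),   # standard error of the estimate
        "t_slope": slope / se_slope,
    }

# Values given in the partial output for Problem 5
out = complete_anova(1, 110, 8, 74, -0.5556, 0.1611)
print(out)
```

The p-values (Significance F and P-value columns) still require an F or t distribution table or software; this sketch only covers the quantities that follow from division and square roots.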
6. Shown below is a portion of a computer output for regression analysis relating Y (dependent variable) and X (independent variable).
ANOVA
Source | df | SS
Regression | 1 | 24.011
Residual | 8 | 67.989

Predictor | Coefficients | Standard Error
Intercept | 11.065 | 2.043
x | -0.511 | 0.304
a. What has been the sample size for the above?
b. Perform a t test and determine whether or not X and Y are related. Let α = 0.05.
c. Perform an F test and determine whether or not X and Y are related. Let α = 0.05.
d. Compute the coefficient of determination.
e. Interpret the meaning of the value of the coefficient of determination that you found in d. Be very specific.
ANS:
a through d
Summary Output
Regression Statistics
Multiple R | 0.511
R Square | 0.261
Adjusted R Square | 0.169
Standard Error | 2.915
Observations | 10

ANOVA
Source | df | SS | MS | F | Significance F
Regression | 1 | 24.011 | 24.011 | 2.825 | 0.131
Residual | 8 | 67.989 | 8.499 | |
Total | 9 | 92 | | |

Predictor | Coefficients | Standard Error | t Stat | P-value
Intercept | 11.065 | 2.043 | 5.415 | 0.001
x | -0.511 | 0.304 | -1.681 | 0.131
e. 26.1% of the variability in Y is explained by the variability in X.
7. Part of an Excel output relating X (independent variable) and Y (dependent variable) is shown below. Fill in all the blanks marked with "?".
Summary Output
Regression Statistics
Multiple R | 0.1347
R Square | ?
Adjusted R Square | ?
Standard Error | 3.3838
Observations | ?

ANOVA
Source | df | SS | MS | F | Significance F
Regression | ? | 2.7500 | ? | ? | 0.632
Residual | ? | ? | 11.45 | |
Total | 14 | ? | | |

Predictor | Coefficients | Standard Error | t Stat | P-value
Intercept | 8.6 | 2.2197 | ? | 0.0019
x | 0.25 | 0.5101 | ? | 0.632
ANS:
Summary Output
Regression Statistics
Multiple R | 0.1347
R Square | 0.0181
Adjusted R Square | -0.0574
Standard Error | 3.384
Observations | 15

ANOVA
Source | df | SS | MS | F | Significance F
Regression | 1 | 2.750 | 2.75 | 0.2402 | 0.6322
Residual | 13 | 148.850 | 11.45 | |
Total | 14 | 151.600 | | |

Predictor | Coefficients | Standard Error | t Stat | P-value
Intercept | 8.6 | 2.2197 | 3.8744 | 0.0019
x | 0.25 | 0.5101 | 0.4901 | 0.6322
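Each blank in Problem 7 follows from the entries that are given. The sketch below walks through them in plain Python; the variable names are illustrative.

```python
# Entries given in the partial output for Problem 7
multiple_r = 0.1347
df_total = 14
ss_reg = 2.75
ms_res = 11.45
intercept, se_intercept = 8.6, 2.2197
slope, se_slope = 0.25, 0.5101

n = df_total + 1                 # Observations
df_reg = 1                       # one independent variable
df_res = df_total - df_reg
ms_reg = ss_reg / df_reg
ss_res = ms_res * df_res         # SS = MS * df
ss_total = ss_reg + ss_res
f_stat = ms_reg / ms_res
r_square = multiple_r ** 2
adj_r_square = 1 - (1 - r_square) * (n - 1) / (n - 2)
t_intercept = intercept / se_intercept
t_slope = slope / se_slope

print(n, df_res, ss_res, ss_total, f_stat, r_square, adj_r_square,
      t_intercept, t_slope)
```

A negative adjusted R Square, as here, simply signals that the model explains less variation than would be expected from one predictor by chance.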
8. Shown below is a portion of a computer output for a regression analysis relating Y (dependent variable) and X (independent variable).
ANOVA
Source | df | SS
Regression | 1 | 115.064
Residual | 13 | 82.936
Total | |

Predictor | Coefficients | Standard Error
Intercept | 15.532 | 1.457
x | -1.106 | 0.261
a. Perform a t test using the p-value approach and determine whether or not Y and X are related. Let α = 0.05.
b. Using the p-value approach, perform an F test and determine whether or not X and Y are related.
c. Compute the coefficient of determination and fully interpret its meaning. Be very specific.
ANS:
a and b
Summary Output
Regression Statistics
Multiple R | 0.7623
R Square | 0.5811
Adjusted R Square | 0.5489
Standard Error | 2.5258
Observations | 15

ANOVA
Source | df | SS | MS | F | Significance F
Regression | 1 | 115.064 | 115.064 | 18.036 | 0.001
Residual | 13 | 82.936 | 6.380 | |
Total | 14 | 198 | | |

Predictor | Coefficients | Standard Error | t Stat | P-value
Intercept | 15.532 | 1.457 | 10.662 | 0.000
x | -1.106 | 0.261 | -4.247 | 0.001
c. 58.11% of the variability in Y is explained by the variability in X.
9. Part of an Excel output relating X (independent variable) and Y (dependent variable) is shown below. Fill in all the blanks marked with "?".
Summary Output
Regression Statistics
Multiple R | ?
R Square | 0.5149
Adjusted R Square | ?
Standard Error | 7.3413
Observations | 11

ANOVA
Source | df | SS | MS | F | Significance F
Regression | ? | ? | ? | ? | 0.0129
Residual | ? | ? | ? | |
Total | ? | 1000 | | |

Predictor | Coefficients | Standard Error | t Stat | P-value
Intercept | ? | 29.4818 | 3.7946 | 0.0043
x | ? | 0.7000 | -3.0911 | 0.0129
ANS:
Summary Output
Regression Statistics
Multiple R | 0.7176
R Square | 0.5149
Adjusted R Square | 0.4611
Standard Error | 7.3413
Observations | 11

ANOVA
Source | df | SS | MS | F | Significance F
Regression | 1 | 514.9455 | 514.9455 | 9.5546 | 0.0129
Residual | 9 | 485.0545 | 53.8949 | |
Total | 10 | 1000.0000 | | |

Predictor | Coefficients | Standard Error | t Stat | P-value
Intercept | 111.8727 | 29.4818 | 3.7946 | 0.0043
x | -2.1636 | 0.7000 | -3.0911 | 0.0129
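Problem 9 works in the opposite direction from Problem 7: R Square and the total sum of squares are given, and the sums of squares and coefficients are recovered from them. A minimal sketch with illustrative variable names follows. Because it starts from the rounded R Square of 0.5149, the results agree with the answer table only to about two or three decimal places.

```python
import math

# Entries given in the partial output for Problem 9
r_square = 0.5149
ss_total = 1000
n = 11
se_intercept, t_intercept = 29.4818, 3.7946
se_slope, t_slope = 0.7000, -3.0911

multiple_r = math.sqrt(r_square)      # Excel reports the positive root
adj_r_square = 1 - (1 - r_square) * (n - 1) / (n - 2)
ss_reg = r_square * ss_total          # R^2 = SSR / SST
ss_res = ss_total - ss_reg
ms_res = ss_res / (n - 2)
f_stat = (ss_reg / 1) / ms_res
intercept = t_intercept * se_intercept   # coefficient = t Stat * Standard Error
slope = t_slope * se_slope

print(multiple_r, adj_r_square, ss_reg, ss_res, ms_res, f_stat, intercept, slope)
```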
10. Shown below is a portion of a computer output for a regression analysis relating Y (demand) and X (unit price).
ANOVA
Source | df | SS
Regression | 1 | 5048.818
Residual | 46 | 3132.661
Total | 47 | 8181.479

Predictor | Coefficients | Standard Error
Intercept | 80.390 | 3.102
X | -2.137 | 0.248
a. Perform a t test and determine whether or not demand and unit price are related. Let α = 0.05.
b. Perform an F test and determine whether or not demand and unit price are related. Let α = 0.05.
c. Compute the coefficient of determination and fully interpret its meaning. Be very specific.
d. Compute the coefficient of correlation and explain the relationship between demand and unit price.
ANS:
a and b
Summary Output
Regression Statistics
Multiple R | 0.786
R Square | 0.617
Adjusted R Square | 0.609
Standard Error | 8.252
Observations | 48

ANOVA
Source | df | SS | MS | F | Significance F
Regression | 1 | 5048.818 | 5048.818 | 74.137 | 0.000
Residual | 46 | 3132.661 | 68.101 | |
Total | 47 | 8181.479 | | |

Predictor | Coefficients | Standard Error | t Stat | P-value
Intercept | 80.390 | 3.102 | 25.916 | 0.000
X | -2.137 | 0.248 | -8.610 | 0.000
c. R² = 0.617; 61.7% of the variability in demand is explained by the variability in price.
d. R = -0.786; since the slope is negative, the coefficient of correlation is also negative, indicating that as unit price increases demand decreases.
11. Shown below is a portion of a computer output for a regression analysis relating supply (Y in thousands of units) and unit price (X in thousands of dollars).
ANOVA
Source | df | SS
Regression | 1 | 354.689
Residual | 39 | 7035.262

Predictor | Coefficients | Standard Error
Intercept | 54.076 | 2.358
X | 0.029 | 0.021
a. What has been the sample size for this problem?
b. Perform a t test and determine whether or not supply and unit price are related. Let α = 0.05.
c. Perform an F test and determine whether or not supply and unit price are related. Let α = 0.05.
d. Compute the coefficient of determination and fully interpret its meaning. Be very specific.
e. Compute the coefficient of correlation and explain the relationship between supply and unit price.
f. Predict the supply (in units) when the unit price is $50,000.
ANS:
a through c
Regression Statistics
Multiple R | 0.219
R Square | 0.048
Adjusted R Square | 0.024
Standard Error | 13.431
Observations | 41

ANOVA
Source | df | SS | MS | F | Significance F
Regression | 1 | 354.689 | 354.689 | 1.966 | 0.169
Residual | 39 | 7035.262 | 180.391 | |
Total | 40 | 7389.951 | | |

Predictor | Coefficients | Standard Error | t Stat | P-value
Intercept | 54.076 | 2.358 | 22.938 | 0.000
X | 0.029 | 0.021 | 1.402 | 0.169
d. R² = 0.048; 4.8% of the variability in supply is explained by the variability in price.
e. R = 0.219; since the slope is positive, as unit price increases so does supply.
f. supply = 54.076 + 0.029(50) = 55.526 (55,526 units)
12. Given below are four observations collected in a regression study on two variables x (independent variable) and y (dependent variable).
x | y
2 | 4
6 | 7
9 | 8
9 | 9
a. Develop the least squares estimated regression equation.
b. At 95% confidence, perform a t test and determine whether or not the slope is significantly different from zero.
c. Perform an F test to determine whether or not the model is significant. Let α = 0.05.
d. Compute the coefficient of determination.
ANS:
Regression Statistics
Multiple R | 0.977
R Square | 0.955
Adjusted R Square | 0.932
Standard Error | 0.564
Observations | 4

ANOVA
Source | df | SS | MS | F | Significance F
Regression | 1 | 13.364 | 13.364 | 42.000 | 0.023
Residual | 2 | 0.636 | 0.318 | |
Total | 3 | 14 | | |

Predictor | Coefficients | Standard Error | t Stat | P-value
Intercept | 2.864 | 0.698 | 4.104 | 0.055
X | 0.636 | 0.098 | 6.481 | 0.023
a. ŷ = 2.864 + 0.636x
b. p-value < 0.05; reject H0
c. p-value < 0.05; reject H0
d. 0.955
13. Given below are five observations collected in a regression study on two variables, x (independent variable) and y (dependent variable).
x | y
2 | 4
3 | 4
4 | 3
5 | 2
6 | 1
a. Develop the least squares estimated regression equation.
b. At 95% confidence, perform a t test and determine whether or not the slope is significantly different from zero.
c. Perform an F test to determine whether or not the model is significant. Let α = 0.05.
d. Compute the coefficient of determination.
e. Compute the coefficient of correlation.
ANS:
Regression Statistics
Multiple R | 0.970
R Square | 0.941
Adjusted R Square | 0.922
Standard Error | 0.365
Observations | 5

ANOVA
Source | df | SS | MS | F | Significance F
Regression | 1 | 6.4 | 6.400 | 48.000 | 0.006
Residual | 3 | 0.4 | 0.133 | |
Total | 4 | 6.8 | | |

Predictor | Coefficients | Standard Error | t Stat | P-value
Intercept | 6.000 | 0.490 | 12.247 | 0.001
X | -0.800 | 0.115 | -6.928 | 0.006
a. ŷ = 6 - 0.8x
b. p-value < 0.05; reject H0
c. p-value < 0.05; reject H0
d. 0.941
e. -0.970
14. Below you are given a partial computer output based on a sample of 8 observations, relating an independent variable (x) and a dependent variable (y).
Predictor | Coefficient | Standard Error
Intercept | 13.251 | 10.77
X | 0.803 | 0.385

Analysis of Variance
SOURCE | SS
Regression |
Error (Residual) | 41.674
Total | 71.875
a. Develop the estimated regression line.
b. At α = 0.05, test for the significance of the slope.
c. At α = 0.05, perform an F test.
d. Determine the coefficient of determination.
ANS:
a. ŷ = 13.251 + 0.803x
b. t = 2.086; p-value is between 0.05 and 0.10 (critical t = 2.447); do not reject H0
c. F = 4.348; p-value is between 0.05 and 0.10 (critical F = 5.99); do not reject H0
d. 0.42
15. Below you are given a partial computer output based on a sample of 8 observations, relating an independent variable (x) and a dependent variable (y).
Predictor | Coefficient | Standard Error
Intercept | -9.462 | 7.032
x | 0.769 | 0.184

Analysis of Variance
SOURCE | SS
Regression | 400
Error (Residual) | 138
a. Develop the estimated regression line.
b. At α = 0.05, test for the significance of the slope.
c. At α = 0.05, perform an F test.
d. Determine the coefficient of determination.
ANS:
a. ŷ = -9.462 + 0.769x
b. t = 4.17; p-value (actual p-value using Excel = 0.0059) < 0.05; reject H0
c. F = 17.39; p-value (actual p-value using Excel = 0.0059) < 0.05; reject H0
d. 0.743
16. The following data represent a company's yearly sales volume and its advertising expenditure over a period of 8 years.
Sales in Millions of Dollars (Y) | Advertising in $10,000s (X)
15 | 32
16 | 33
18 | 35
17 | 34
16 | 36
19 | 37
19 | 39
24 | 42
a. Develop a scatter diagram of sales versus advertising and explain what it shows regarding the relationship between sales and advertising.
b. Use the method of least squares to compute an estimated regression line between sales and advertising.
c. If the company's advertising expenditure is $400,000, what are the predicted sales? Give the answer in dollars.
d. What does the slope of the estimated regression line indicate?
e. Compute the coefficient of determination and fully interpret its meaning.
f. Use the F test to determine whether or not the regression model is significant at α = 0.05.
g. Use the t test to determine whether the slope of the regression model is significant at α = 0.05.
h. Develop a 95% confidence interval for predicting the average sales for the years when $400,000 was spent on advertising.
i. Compute the correlation coefficient.
ANS:
a. The scatter diagram shows a positive relation between sales and advertising.
b. ŷ = -10.42 + 0.7895X
c. $21,160,000
d. As advertising is increased by $10,000, sales are expected to increase by $789,500.
e. 0.8459; 84.59% of variation in sales is explained by variation in advertising
f. F = 32.93; p-value (actual p-value using Excel = 0.0012) < 0.05; reject H0; it is significant (critical F = 5.99)
g. t = 5.74; p-value (actual p-value using Excel = 0.0012) < 0.05; reject H0; significant (critical t = 2.447)
h. $19,460,000 to $22,860,000
i. 0.9197
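Parts c and d of Problem 16 depend on keeping the units straight: X is measured in units of $10,000 and Y in millions of dollars. The sketch below uses the rounded coefficients from the answer to part b; the function name `predicted_sales_dollars` is a hypothetical helper introduced for illustration.

```python
# Rounded coefficients from the answer to part b
B0 = -10.42     # intercept, in millions of dollars of sales
B1 = 0.7895     # millions of dollars of sales per $10,000 of advertising

def predicted_sales_dollars(advertising_dollars):
    """Convert advertising to units of $10,000, predict, and convert back to dollars."""
    x = advertising_dollars / 10_000
    y_hat_millions = B0 + B1 * x
    return y_hat_millions * 1_000_000

sales = predicted_sales_dollars(400_000)   # part c
slope_dollars = B1 * 1_000_000             # part d: change in sales per extra $10,000

print(round(sales), round(slope_dollars))
```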
17. Given below are five observations collected in a regression study on two variables x (independent variable) and y (dependent variable).
x | y
10 | 7
20 | 5
30 | 4
40 | 2
50 | 1
a. Develop the least squares estimated regression equation.
b. At 95% confidence, perform a t test and determine whether or not the slope is significantly different from zero.
c. Perform an F test to determine whether or not the model is significant. Let α = 0.05.
d. Compute the coefficient of determination.
e. Compute the coefficient of correlation.
ANS:
a. ŷ = 8.3 - 0.15x
b. t = -15; p-value (actual p-value using Excel = 0.0001) < 0.05; reject H0 (critical t = 3.18)
c. F = 225; p-value (actual p-value using Excel = 0.0001) < 0.05; reject H0 (critical F = 10.13)
d. 0.9868
e. -0.9934 (negative, because the slope is negative)
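The coefficient of correlation carries the sign of the slope, so taking the square root of the coefficient of determination gives only its magnitude. A minimal sketch that computes r for Problem 17 directly from the data (variable names are illustrative):

```python
import math

# Data from Problem 17
x = [10, 20, 30, 40, 50]
y = [7, 5, 4, 2, 1]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)
syy = sum((yi - y_bar) ** 2 for yi in y)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

b1 = sxy / sxx                    # slope
r = sxy / math.sqrt(sxx * syy)    # sample correlation, same sign as the slope
r2 = r ** 2                       # coefficient of determination

print(b1, r, r2)
```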
18. Below you are given a partial computer output based on a sample of 14 observations, relating an independent variable (x) and a dependent variable (y).
Predictor | Coefficient | Standard Error
Constant | 6.428 | 1.202
X | 0.470 | 0.035

Analysis of Variance
SOURCE | SS
Regression | 958.584
Error (Residual) |
Total | 1021.429
a. Develop the estimated regression line.
b. At α = 0.05, test for the significance of the slope.
c. At α = 0.05, perform an F test.
d. Determine the coefficient of determination.
e. Determine the coefficient of correlation.
ANS:
a. ŷ = 6.428 + 0.47x
b. t = 13.529; p-value (actual p-value using Excel = 0.0000) < 0.05; reject H0 (critical t = 2.179)
c. F = 183.04; p-value (actual p-value using Excel = 0.0000) < 0.05; reject H0 (critical F = 4.75)
d. 0.938
e. 0.968
19. Below you are given a partial computer output based on a sample of 21 observations, relating an independent variable (x) and a dependent variable (y).
Predictor | Coefficient | Standard Error
Constant | 30.139 | 1.181
X | -0.252 | 0.022

Analysis of Variance
SOURCE | SS
Regression | 1,759.481
Error | 259.186
a. Develop the estimated regression line.
b. At α = 0.05, test for the significance of the slope.
c. At α = 0.05, perform an F test.
d. Determine the coefficient of determination.
e. Determine the coefficient of correlation.
ANS:
a. ŷ = 30.139 - 0.252X
b. t = -11.357; p-value (almost zero) < α = 0.05; reject H0 (critical t = 2.093)
c. F = 128.982; p-value (almost zero) < α = 0.05; reject H0 (critical F = 4.38)
d. 0.872
e. -0.934
20. An automobile dealer wants to see if there is a relationship between monthly sales and the interest rate. A random sample of 4 months was taken. The results of the sample are presented below. The estimated least squares regression equation is
ŷ = 75.061 - 6.254X
Monthly Sales (Y) | Interest Rate in Percent (X)
22 | 9.2
20 | 7.6
10 | 10.4
45 | 5.3
a. Obtain a measure of how well the estimated regression line fits the data.
b. You want to test to see if there is a significant relationship between the interest rate and monthly sales at the 1% level of significance. State the null and alternative hypotheses.
c. At 99% confidence, test the hypotheses.
d. Construct a 99% confidence interval for the average monthly sales for all months with a 10% interest rate.
e. Construct a 99% confidence interval for the monthly sales of one month with a 10% interest rate.
ANS:
a. R² = 0.8687
b. H0: β1 = 0
Ha: β1 ≠ 0
c. test statistic t = -3.64; p-value is between 0.05 and 0.10 (critical t = 9.925); do not reject H0
d. -33.151 to 58.199; since sales cannot be negative, 0 to 58.199
e. -67.068 to 92.116; since sales cannot be negative, 0 to 92.116
21. Jason believes that the sales of coffee at his coffee shop depend upon the weather. He has taken a sample of 6 days. Below you are given the results of the sample.
Cups of Coffee Sold | Temperature
350 | 50
200 | 60
210 | 70
100 | 80
60 | 90
40 | 100
a. Which variable is the dependent variable?
b. Compute the least squares estimated line.
c. Compute the correlation coefficient between temperature and the sales of coffee.
d. Is there a significant relationship between the sales of coffee and temperature? Use a 0.05 level of significance. Be sure to state the null and alternative hypotheses.
e. Predict sales for a 90-degree day.
ANS:
a. Cups of coffee sold
b. ŷ = 605.714 - 5.943X
c. -0.95197 (negative, because the slope is negative)
d. H0: β1 = 0
Ha: β1 ≠ 0
t = -6.218; p-value (actual p-value using Excel = 0.0034) < α = 0.05; reject H0 (critical t = 2.776)
e. 70.8 or 71 cups
22. Researchers have collected data on the hours of television watched in a day and the age of a person. You are given the data below.
Hours of Television | Age
1 | 45
3 | 30
4 | 22
3 | 25
6 | 5
a. Determine which variable is the dependent variable.
b. Compute the least squares estimated line.
c. Is there a significant relationship between the two variables? Use a 0.05 level of significance. Be sure to state the null and alternative hypotheses.
d. Compute the coefficient of determination. How would you interpret this value?
ANS:
a. Hours of Television
b. ŷ = 6.564 - 0.1246X
c. H0: β1 = 0
Ha: β1 ≠ 0
t = -12.018; p-value (actual p-value using Excel = 0.0002) < α = 0.05; reject H0 (critical t = 3.18)
d. 0.98 (rounded); 98% of variation in hours of watching television is explained by variation in age.
23. Given below are seven observations collected in a regression study on two variables, X (independent variable) and Y (dependent variable).
X | Y
2 | 12
3 | 9
6 | 8
7 | 7
8 | 6
7 | 5
9 | 2
a. Develop the least squares estimated regression equation.
b. At 95% confidence, perform a t test and determine whether or not the slope is significantly different from zero.
c. Perform an F test to determine whether or not the model is significant. Let α = 0.05.
d. Compute the coefficient of determination.
ANS:
a. ŷ = 13.75 - 1.125X
b. t = -5.196; p-value (actual p-value using Excel = 0.0001) < α = 0.05; reject H0 (critical t = 2.571)
c. F = 27; p-value (actual p-value using Excel = 0.0001) < α = 0.05; reject H0 (critical F = 6.61)
d. 0.844
24. The owner of a retail store randomly selected the following weekly data on profits and advertising cost.
Week | Advertising Cost ($) | Profit ($)
1 | 0 | 200
2 | 50 | 270
3 | 250 | 420
4 | 150 | 300
5 | 125 | 325
a. Write down the appropriate linear relationship between advertising cost and profits. Which is the dependent variable? Which is the independent variable?
b. Calculate the least squares estimated regression line.
c. Predict the profits for a week when $200 is spent on advertising.
d. At 95% confidence, test to determine if the relationship between advertising costs and profits is statistically significant.
e. Calculate the coefficient of determination.
ANS:
a. E(Y) = β0 + β1X, where Y is profit (dependent variable) and X is advertising cost (independent variable)
b. ŷ = 210.0676 + 0.80811X
c. $371.69
d. t = 6.496; p-value (actual p-value using Excel = 0.0013) < α = 0.05; reject H0; relationship is significant (critical t = 3.182)
e. 0.9336
25. The owner of a bakery wants to analyze the relationship between the expenditure of a customer and the customer's income. A sample of 5 customers is taken and the following information was obtained.
Expenditure (Y) | Income in Thousands (X)
0.45 | 20
10.75 | 19
5.40 | 22
7.80 | 25
5.60 | 14
The least squares estimated line is ŷ = 4.348 + 0.0826X.
a. Obtain a measure of how well the estimated regression line fits the data.
b. You want to test to see if there is a significant relationship between expenditure and income at the 5% level of significance. Be sure to state the null and alternative hypotheses.
c. Construct a 95% confidence interval estimate for the average expenditure for all customers with an income of $20,000.
d. Construct a 95% confidence interval estimate for the expenditure of one customer whose income is $20,000.
ANS:
a. R² = 0.0079
b. H0: β1 = 0
Ha: β1 ≠ 0
t = 0.154; p-value (actual p-value using Excel = 0.8871) > α = 0.05; do not reject H0 (critical t = 3.182)
c. -0.185 to 12.185
d. -9.151 to 21.151
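The two intervals in Problem 25 differ only in the standard error used: the interval for the mean uses s·sqrt(h), while the interval for a single customer uses s·sqrt(1 + h), where h = 1/n + (x_p - x̄)²/Σ(x - x̄)². The sketch below takes the critical value 3.182 quoted in the answer as given; the function name `intervals` is illustrative. Both intervals are centered on the prediction ŷ = 6.0 at X = 20, so the interval for the mean has a negative lower limit.

```python
import math

# Data from Problem 25: income in thousands (x) and expenditure (y)
income = [20, 19, 22, 25, 14]
expenditure = [0.45, 10.75, 5.40, 7.80, 5.60]
T_CRIT = 3.182   # critical t for 95% confidence with df = 3, as quoted in the answer

n = len(income)
x_bar = sum(income) / n
y_bar = sum(expenditure) / n
sxx = sum((x - x_bar) ** 2 for x in income)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(income, expenditure))
sst = sum((y - y_bar) ** 2 for y in expenditure)

b1 = sxy / sxx
b0 = y_bar - b1 * x_bar
sse = sst - b1 * sxy
s = math.sqrt(sse / (n - 2))     # standard error of the estimate

def intervals(x_p):
    """Return (confidence interval for the mean, prediction interval) at x_p."""
    y_hat = b0 + b1 * x_p
    h = 1 / n + (x_p - x_bar) ** 2 / sxx
    m_mean = T_CRIT * s * math.sqrt(h)
    m_pred = T_CRIT * s * math.sqrt(1 + h)
    return (y_hat - m_mean, y_hat + m_mean), (y_hat - m_pred, y_hat + m_pred)

ci_mean, pi_single = intervals(20)
print(ci_mean, pi_single)
```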
26. Below you are given information on annual income and years of college education.
Income (In Thousands) | Years of College
28 | 0
40 | 3
36 | 2
28 | 1
48 | 4
a. Develop the least squares regression equation.
b. Estimate the yearly income of an individual with 6 years of college education.
c. Compute the coefficient of determination.
d. Use a t test to determine whether the slope is significantly different from zero. Let α = 0.05.
e. At 95% confidence, perform an F test and determine whether or not the model is significant.
ANS:
a. ŷ = 25.6 + 5.2X
b. $56,800
c. 0.939
d. t = 6.789; p-value (actual p-value using Excel = 0.0008) < α = 0.05; reject H0; significant (critical t = 3.182)
e. F = 46.091; p-value (actual p-value using Excel = 0.0008) < α = 0.05; reject H0; significant (critical F = 10.13)
27. Below you are given information on a woman's age and her annual expenditure on purchase of books.
Age | Annual Expenditure ($)
18 | 210
22 | 180
21 | 220
28 | 280
a. Develop the least squares regression equation.
b. Compute the coefficient of determination.
c. Use a t test to determine whether the slope is significantly different from zero. Let α = 0.05.
d. At 95% confidence, perform an F test and determine whether or not the model is significant.
ANS:
a. ŷ = 54.834 + 7.536X
b. R² = 0.568
c. t = 1.621; p-value (actual p-value using Excel = 0.2464) > α = 0.05; do not reject H0; not significant (critical t = 4.303)
d. F = 2.628; p-value (actual p-value using Excel = 0.2464) > α = 0.05; do not reject H0; not significant (critical F = 18.51)
28. The following sample data contains the number of years of college and the current annual salary for a random sample of heavy equipment salespeople.
Years of College | Annual Income (In Thousands)
2 | 20
2 | 23
3 | 25
4 | 26
3 | 28
1 | 29
4 | 27
3 | 30
4 | 33
4 | 35
a. Which variable is the dependent variable? Which is the independent variable?
b. Determine the least squares estimated regression line.
c. Predict the annual income of a salesperson with one year of college.
d. Test if the relationship between years of college and income is statistically significant at the 0.05 level of significance.
e. Calculate the coefficient of determination.
f. Calculate the sample correlation coefficient between income and years of college. Interpret the value you obtain.
ANS:
a. Y (dependent variable) is annual income and X (independent variable) is years of college
b. ŷ = 21.6 + 2X
c. $23,600
d. The relationship is not statistically significant since t = 1.51; p-value (actual p-value using Excel = 0.1696) > α = 0.05 (critical t = 2.306)
e. 0.222
f. 0.471; there is a positive correlation between years of college and annual income
29. The following data shows the yearly income (in $1,000) and age of a sample of seven individuals.
Income (in $1,000) | Age
20 | 18
24 | 20
24 | 23
25 | 34
26 | 24
27 | 27
34 | 27
a. Develop the least squares regression equation.
b. Estimate the yearly income of a 30-year-old individual.
c. Compute the coefficient of determination.
d. Use a t test to determine whether the slope is significantly different from zero. Let α = 0.05.
e. At 95% confidence, perform an F test and determine whether or not the model is significant.
ANS:
a. ŷ = 16.204 + 0.3848X
b. $27,748
c. 0.2266
d. t = 1.21; p-value (actual p-value using Excel = 0.2803) > α = 0.05; not significant (critical t = 2.571)
e. F = 1.46; p-value (actual p-value using Excel = 0.2803) > α = 0.05; not significant (critical F = 6.61)
30. The following data show the results of an aptitude test (Y) and the grade point average of 10 students.
Aptitude Test Score (Y) | GPA (X)
26 | 1.8
31 | 2.3
28 | 2.6
30 | 2.4
34 | 2.8
38 | 3.0
41 | 3.4
44 | 3.2
40 | 3.6
43 | 3.8
a. Develop a least squares estimated regression line.
b. Compute the coefficient of determination and comment on the strength of the regression relationship.
c. Is the slope significant? Use a t test and let α = 0.05.
d. At 95% confidence, test to determine if the model is significant (i.e., perform an F test).
ANS:
a. ŷ = 8.171 + 9.4564X
b. 0.83; there is a fairly strong relationship
c. t = 6.25; p-value (actual p-value using Excel = 0.0002) < α = 0.05; it is significant (critical t = 2.306)
d. F = 39.07; p-value (actual p-value using Excel = 0.0002) < α = 0.05; it is significant (critical F = 5.32)
31. Shown below is a portion of the computer output for a regression analysis relating sales (Y in millions of dollars) and advertising expenditure (X in thousands of dollars).
Predictor | Coefficient | Standard Error
Constant | 4.00 | 0.800
X | 0.12 | 0.045

Analysis of Variance
SOURCE | DF | SS
Regression | 1 | 1,400
Error | 18 | 3,600
a. What has been the sample size for the above?
b. Perform a t test and determine whether or not advertising and sales are related. Let α = 0.05.
c. Compute the coefficient of determination.
d. Interpret the meaning of the value of the coefficient of determination that you found in Part c. Be very specific.
e. Use the estimated regression equation and predict sales for an advertising expenditure of $4,000. Give your answer in dollars.
ANS:
a. 20
b. t = 2.66; p-value is between 0.01 and 0.02; they are related (critical t = 2.101)
c. R² = 0.28
d. 28% of variation in sales is explained by variation in advertising expenditure.
e. $4,480,000
32. A company has recorded data on the daily demand for its product (Y in thousands of units) and the unit price (X in hundreds of dollars). A sample of 15 days demand and associated prices resulted in the following data.
ΣX = 75 | Σ(Y - Ȳ)(X - X̄) = -59
ΣY = 135 | Σ(X - X̄)² = 94
Σ(Y - Ȳ)² = 100 | SSE = 62.9681
a. Using the above information, develop the least-squares estimated regression line and write the equation.
b. Compute the coefficient of determination.
c. Perform an F test and determine whether or not there is a significant relationship between demand and unit price. Let α = 0.05.
d. Would the demand ever reach zero? If yes, at what price would the demand be zero?
ANS:
a. ŷ = 12.138 - 0.6277X
b. R² = 0.3703
c. F = 7.65; p-value is between 0.01 and 0.025; reject H0 and conclude that demand and unit price are related (critical F = 4.67)
d. Yes, at $1,934
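Problem 32 supplies only summary sums, which is enough to obtain every answer without the raw data. A minimal sketch with illustrative variable names; recall that X is in hundreds of dollars, so the zero-demand price is converted back to dollars at the end.

```python
# Summary quantities given in Problem 32
n = 15
sum_x, sum_y = 75, 135
sxy = -59          # sum of (Y - Ybar)(X - Xbar)
sxx = 94           # sum of (X - Xbar)^2
sst = 100          # sum of (Y - Ybar)^2
sse = 62.9681

b1 = sxy / sxx                           # slope
b0 = sum_y / n - b1 * (sum_x / n)        # intercept = Ybar - b1 * Xbar
r2 = 1 - sse / sst                       # coefficient of determination
ssr = sst - sse
f_stat = (ssr / 1) / (sse / (n - 2))     # F = MSR / MSE

# Demand is zero where b0 + b1 * X = 0; X is in hundreds of dollars
zero_demand_price_dollars = (-b0 / b1) * 100

print(b0, b1, r2, f_stat, zero_demand_price_dollars)
```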
33. A regression and correlation analysis resulted in the following information regarding an independent variable (x) and a dependent variable (y).
ΣX = 42 | Σ(Y - Ȳ)(X - X̄) = 37
ΣY = 63 | Σ(X - X̄)² = 84
n = 7 | Σ(Y - Ȳ)² = 28
a. Develop the least squares estimated regression equation.
b. At 95% confidence, perform a t test and determine whether or not the slope is significantly different from zero.
c. Perform an F test to determine whether or not the model is significant. Let α = 0.05.
d. Compute the coefficient of determination.
ANS:
a. ŷ = 6.3571 + 0.4405x
b. p-value < 0.05; reject H0
c. p-value < 0.05; reject H0
d. 0.582