Department of Mathematics and Mathematical Statistics, Umeå University, 22 October 2011
Assignment 1
Mariam Shirdel ([email protected])
Kvalitetsteknik och försöksplanering (Quality Engineering and Design of Experiments), 7.5 hp


1 Problem 1: H10.42

Consider the hypothesis test $H_0: \sigma_1^2 = \sigma_2^2$ against $H_1: \sigma_1^2 > \sigma_2^2$. Suppose that the sample sizes are $n_1 = 20$ and $n_2 = 8$ and that $s_1^2 = 4.5$ and $s_2^2 = 2.3$. Use $\alpha = 0.01$.

a. State the hypothesis and identify the claim.
b. Find the critical value.
c. Compute the test value.
d. Make the decision.
e. Summarize the results.

1.1 a

The null hypothesis is $H_0: \sigma_1^2 = \sigma_2^2$ (claim) and the alternative hypothesis is $H_1: \sigma_1^2 > \sigma_2^2$.

1.2 b

The critical value is given by $f_{\alpha, n_1-1, n_2-1}$. In our case $\alpha = 0.01$, $n_1 - 1 = 19$ and $n_2 - 1 = 7$. Looking up $f_{0.01,19,7}$ in table IV on page 468 of Engineering Statistics, fourth edition, C. Montgomery, Wiley, 2007, one finds that $f_{0.01,19,7} \approx 6.18$.

1.3 c

The test value is given by $f_0 = \frac{s_1^2}{s_2^2} = \frac{4.5}{2.3} \approx 1.96$.

1.4 d

Reject the null hypothesis if $f_0 > f_{\alpha, n_1-1, n_2-1}$. In this case the condition reads $1.96 > 6.18$, which is false. So the null hypothesis is not rejected at the 0.01 level of significance.
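The decision rule can also be checked numerically. The following is a minimal Python sketch; the function name `upper_tail_f_test` is only illustrative, and the critical value 6.18 is the table value quoted above, not something the code computes.

```python
def upper_tail_f_test(s1_sq, s2_sq, f_crit):
    """Return the test value f0 = s1^2 / s2^2 and whether H0 is rejected."""
    f0 = s1_sq / s2_sq
    return f0, f0 > f_crit


# Sample variances from the problem statement; critical value taken from the F table.
f0, reject_h0 = upper_tail_f_test(4.5, 2.3, 6.18)
print(f"f0 = {f0:.2f}, reject H0: {reject_h0}")
```

With these inputs the test value stays well below the critical value, so the sketch reports that the null hypothesis is not rejected.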

1.5 e

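Since the test value $f_0 \approx 1.96$ is less than the critical value $f_{0.01,19,7} \approx 6.18$, we cannot reject the null hypothesis, which states that the two variances are equal. There is not enough evidence at $\alpha = 0.01$ to conclude that $\sigma_1^2 > \sigma_2^2$.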


2 Problem 2: H113.11

Three different relaxation techniques are given to randomly selected patients in an effort to reduce their stress levels. A special instrument has been designed to measure the percentage of stress reduction in each person. The data are shown in the table. At $\alpha = 0.05$, can one conclude that there is a difference in the means of the percentages?

Technique I   Technique II   Technique III
3             12             15
10            12             14
5             17             18
1             13             14
13            18             20
3             9              22
4             14             16

Assume that all variables are normally distributed, that the samples are independent, and that the population variances are equal.

a. State the hypothesis and identify the claim.
b. Find the critical value.
c. Compute the test value.
d. Make the decision.
e. Summarize the results and explain where the differences in the means are.

2.1 a

The null hypothesis is $H_0: \tau_1 = \tau_2 = \tau_3 = 0$ and the alternative hypothesis is $H_1: \tau_i \neq 0$ for at least one $i$, where our claim is $H_1$.

If we do an ANOVA we find that


General Linear Model: Pecentage versus Technique

Factor Type Levels Values

Technique fixed 3 Technique I; Technique II; Technique III

Analysis of Variance for Pecentage, using Adjusted SS for Tests

Source DF Seq SS Adj SS Adj MS F P

Technique 2 481.52 481.52 240.76 19.06 0.000

Error 18 227.43 227.43 12.63

Total 20 708.95

S = 3.55456 R-Sq = 67.92% R-Sq(adj) = 64.36%

Grouping Information Using Bonferroni Method and 95.0% Confidence

Technique N Mean Grouping

Technique III 7 17.0 A

Technique II 7 13.6 A

Technique I 7 5.6 B

Means that do not share a letter are significantly different.

Bonferroni 95.0% Simultaneous Confidence Intervals

Response Variable Pecentage

All Pairwise Comparisons among Levels of Technique

Technique = Technique I subtracted from:

Technique Lower Center Upper ---+---------+---------+---------+---

Technique II 2.986 8.000 13.01 (---------*---------)

Technique III 6.414 11.429 16.44 (---------*---------)

---+---------+---------+---------+---

0.0 5.0 10.0 15.0

Technique = Technique II subtracted from:

Technique Lower Center Upper ---+---------+---------+---------+---

Technique III -1.586 3.429 8.443 (---------*---------)

---+---------+---------+---------+---

0.0 5.0 10.0 15.0

Bonferroni Simultaneous Tests

Response Variable Pecentage

All Pairwise Comparisons among Levels of Technique

Technique = Technique I subtracted from:

Difference SE of Adjusted

Technique of Means Difference T-Value P-Value

Technique II 8.000 1.900 4.211 0.0016

Technique III 11.429 1.900 6.015 0.0000

Technique = Technique II subtracted from:

Difference SE of Adjusted

Technique of Means Difference T-Value P-Value

Technique III 3.429 1.900 1.805 0.2637


2.2 b

The critical value $f_{\alpha, a-1, a(n-1)}$ is $f_{0.05,2,18} = 3.55$, from table IV on page 466 of Engineering Statistics, fourth edition, C. Montgomery, Wiley, 2007, for $\alpha = 0.05$, $a = 3$ and $n = 7$.

2.3 c

From the ANOVA table we see that the test value is $f_0 = 19.06$.
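The entries of the ANOVA table can be recomputed from the raw data. The sketch below is a plain-Python one-way fixed-effects ANOVA; the helper name `one_way_anova` is an assumption made for illustration and is not part of the Minitab session.

```python
def one_way_anova(groups):
    """Return (SS_treatments, SS_error, F) for a one-way fixed-effects ANOVA."""
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)
    # Between-group variation: group size times squared deviation of the group mean.
    ss_treat = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    # Within-group variation: squared deviations from each group's own mean.
    ss_error = sum((v - sum(g) / len(g)) ** 2 for g in groups for v in g)
    df_treat = len(groups) - 1
    df_error = len(all_values) - len(groups)
    f0 = (ss_treat / df_treat) / (ss_error / df_error)
    return ss_treat, ss_error, f0


technique_1 = [3, 10, 5, 1, 13, 3, 4]
technique_2 = [12, 12, 17, 13, 18, 9, 14]
technique_3 = [15, 14, 18, 14, 20, 22, 16]

ss_treat, ss_error, f0 = one_way_anova([technique_1, technique_2, technique_3])
print(f"SS_treat = {ss_treat:.2f}, SS_error = {ss_error:.2f}, F = {f0:.2f}")
```

The sums of squares and the F ratio should agree with the Technique and Error rows of the Minitab table up to rounding.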

2.4 d

Since $f_0 = 19.06 > f_{0.05,2,18} = 3.55$, the null hypothesis is rejected. The same conclusion follows from the P-value in the ANOVA table, which is reported as 0.000 (that is, below 0.0005): the null hypothesis is rejected whenever the P-value is less than $\alpha$, and here $\alpha = 0.05$. Hence $H_0$ is rejected and at least one mean is different from the others.

2.5 e

Doing an ANOVA one sees that the P-value, reported as 0.000, is lower than the level of significance $\alpha = 0.05$, and that the test value, $f_0 = 19.06$, is much bigger than the critical value, $f_{0.05,2,18} = 3.55$; thus the null hypothesis is rejected. This means that at least one mean is different from the others.

If one looks at the Bonferroni simultaneous tests in the Minitab output, comparing Technique I to both II and III, one can see that the adjusted P-values (0.0016 and 0.0000) are smaller than the level of significance ($\alpha = 0.05$), so the hypotheses of equal means can be rejected; thus the mean of Technique I differs from those of II and III. The adjusted P-value for Technique II compared to III (0.2637) is larger than $\alpha$, so the hypothesis of equal means is not rejected, and no difference between these two means can be established. One can conclude that there is a difference in the means of the percentages for Technique I compared to the other techniques.
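The pairwise t-values behind the Bonferroni comparisons can be sketched as follows. The critical value `T_CRIT` is an assumption: it is back-calculated from the half-width (5.014) and standard error (1.900) of the Minitab simultaneous intervals, and is not computed by the code.

```python
from itertools import combinations
from math import sqrt

data = {
    "Technique I": [3, 10, 5, 1, 13, 3, 4],
    "Technique II": [12, 12, 17, 13, 18, 9, 14],
    "Technique III": [15, 14, 18, 14, 20, 22, 16],
}

means = {name: sum(v) / len(v) for name, v in data.items()}
n_total = sum(len(v) for v in data.values())
ss_error = sum((x - means[name]) ** 2 for name, v in data.items() for x in v)
ms_error = ss_error / (n_total - len(data))  # pooled variance estimate, 18 df

# Assumed Bonferroni critical value t_{0.05/(2*3), 18}, inferred from the output.
T_CRIT = 2.639

t_values = {}
for a, b in combinations(data, 2):
    se = sqrt(ms_error * (1 / len(data[a]) + 1 / len(data[b])))
    t_values[(a, b)] = (means[b] - means[a]) / se
    verdict = "differ" if abs(t_values[(a, b)]) > T_CRIT else "no significant difference"
    print(f"{b} - {a}: t = {t_values[(a, b)]:.3f} ({verdict})")
```

Only the two comparisons involving Technique I exceed the assumed critical value, which matches the grouping letters in the output.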


3 Problem 3: H14.10

The strength of concrete used in commercial construction tends to vary from one batch to another. Consequently, small test cylinders of concrete sampled from a batch are "cured" for periods up to about 28 days in temperature- and moisture-controlled environments before strength measurements are made. Concrete is then "bought and sold on the basis of strength test cylinders" (ASTM C 31 Standard Test Method for Making and Curing Concrete Test Specimens in the Field). The accompanying data resulted from an experiment carried out to compare three different curing methods with respect to compressive strength (MPa). Analyze this data.

Batch   Method A   Method B   Method C
1       30.7       33.7       30.5
2       29.1       30.6       32.6
3       30.0       32.2       30.5
4       31.9       34.6       33.5
5       30.5       33.0       32.4
6       26.9       29.3       27.8
7       28.2       28.4       30.7
8       32.4       32.4       33.6
9       26.6       29.5       29.2
10      28.6       29.4       33.2

3.1 Analysis

Doing an ANOVA in Minitab leads to


General Linear Model: Result versus Batch; Method

Factor Type Levels Values

Batch fixed 10 1; 2; 3; 4; 5; 6; 7; 8; 9; 10

Method fixed 3 A; B; C

Analysis of Variance for Result, using Adjusted SS for Tests

Source DF Seq SS Adj SS Adj MS F P

Batch 9 93.659 93.659 10.407 11.36 0.000

Method 2 19.125 19.125 9.562 10.43 0.001

Error 18 16.495 16.495 0.916

Total 29 129.279

S = 0.957292 R-Sq = 87.24% R-Sq(adj) = 79.44%

Grouping Information Using Bonferroni Method and 95.0% Confidence

Method N Mean Grouping

B 10 31.3 A

C 10 31.0 A

A 10 29.5 B

Means that do not share a letter are significantly different.

Bonferroni 95.0% Simultaneous Confidence Intervals

Response Variable Result

All Pairwise Comparisons among Levels of Method

Method = A subtracted from:

Method Lower Center Upper --+---------+---------+---------+----

B 0.6901 1.820 2.950 (--------*---------)

C 0.4001 1.530 2.660 (---------*--------)

--+---------+---------+---------+----

-1.2 0.0 1.2 2.4

Method = B subtracted from:

Method Lower Center Upper --+---------+---------+---------+----

C -1.420 -0.2900 0.8399 (---------*--------)

--+---------+---------+---------+----

-1.2 0.0 1.2 2.4


Bonferroni Simultaneous Tests

Response Variable Result

All Pairwise Comparisons among Levels of Method

Method = A subtracted from:

Difference SE of Adjusted

Method of Means Difference T-Value P-Value

B 1.820 0.4281 4.251 0.0014

C 1.530 0.4281 3.574 0.0065

Method = B subtracted from:

Difference SE of Adjusted

Method of Means Difference T-Value P-Value

C -0.2900 0.4281 -0.6774 1.000

[Figure: Residual Plots for Result, with four panels: Normal Probability Plot (Percent versus Residual), Versus Fits (Residual versus Fitted Value), Histogram (Frequency versus Residual) and Versus Order (Residual versus Observation Order).]

The model seems to fit the data well. From the versus fits plot one can see that the residuals are scattered randomly, and the histogram has the desired bell shape of a normal distribution. The normal probability plot looks good too, with the residuals following a straight line. Thus, the normal distribution is an appropriate model for the residuals. The coefficient of determination, $R^2 = 0.8724$, is also high, meaning that the model accounts for about 87% of the variability in the data.

From the ANOVA table one can see that the P-values for both Batch (P-value = 0.000) and Method (P-value = 0.001) are below the level of significance, $\alpha = 0.05$, so the null hypotheses that the means are the same across batches and across methods can be rejected. At least one mean differs from the others.

If one looks at the Bonferroni simultaneous tests in the Minitab output, comparing Method A to both B and C, the adjusted P-values are 0.0014 (A versus B) and 0.0065 (A versus C). Both are lower than the level of significance ($\alpha = 0.05$), so the hypotheses of equal means can be rejected; thus the mean of Method A differs from those of B and C. The adjusted P-value for Method B compared to C is 1.000, so the hypothesis of equal means is not rejected, and no difference between the means of these two methods can be established. One can conclude that there is a difference in the means of the methods, but only when comparing Method A with the others.

4 Problem 4: H11.105

A study is done to see whether there is a relationship between a student's grade point average and the number of hours the student watches television each week. The data are shown here. If there is a significant relationship, predict the GPA of a student who watches television 9 hours per week.

Hours, x   6     10    8     15    5     6     12
GPA, y     2.4   4     3.2   1.6   3.7   3     3.5

a. Estimate the intercept $\beta_0$ and slope $\beta_1$ regression coefficients. Write the estimated regression line.
c. Compute $SS_E$ and estimate the variance.
f. Compute the coefficient of determination, $R^2$. Comment on the value.
g. Use a t-test to test for significance of the intercept and slope coefficients at $\alpha = 0.05$. Give the P-values of each and comment on your results.
h. Construct the ANOVA table and test for significance of regression using the P-value. Comment on your results and their relationship to your results in part g.
i. Construct 95% CIs on the intercept and slope. Comment on the relationship of these CIs and your findings in part g and h.
j. Perform model adequacy checks. Do you believe the model provides an adequate fit?
k. Compute the sample correlation coefficient and test for its significance at $\alpha = 0.05$. Give the P-value and comment on your results in part g and h.

4.1 a

Doing a regression analysis in Minitab leads to


Regression Analysis: GPA versus Hours,x

The regression equation is

GPA = 3.83 - 0.0871 Hours,x

Predictor Coef SECoef T P

Constant 3.8286 0.8781 4.36 0.007

Hours,x -0.08710 0.09256 -0.94 0.390

S = 0.832309 R-Sq = 15.0% R-Sq(adj) = 0.0%

Analysis of Variance

Source DF SS MS F P

Regression 1 0.6135 0.6135 0.89 0.390

Residual Error 5 3.4637 0.6927

Total 6 4.0771

[Figure: Fitted Line Plot, GPA = 3.829 - 0.08710 Hours,x (GPA versus Hours,x); S = 0.832309, R-Sq = 15.0%, R-Sq(adj) = 0.0%.]

[Figure: Residual Plots for GPA, with four panels: Normal Probability Plot (Percent versus Residual), Versus Fits (Residual versus Fitted Value), Histogram (Frequency versus Residual) and Versus Order (Residual versus Observation Order).]

where the estimated intercept is $\hat\beta_0 = 3.8286$ and the estimated slope is $\hat\beta_1 = -0.08710$. The estimated regression line is GPA = 3.83 − 0.0871 Hours,x, which can be seen in the fitted line plot.
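As a cross-check of the Minitab estimates, the least-squares formulas $\hat\beta_1 = S_{xy}/S_{xx}$ and $\hat\beta_0 = \bar{y} - \hat\beta_1\bar{x}$ can be evaluated directly. This is a minimal sketch in plain Python; the variable names are chosen here for illustration.

```python
hours = [6, 10, 8, 15, 5, 6, 12]
gpa = [2.4, 4, 3.2, 1.6, 3.7, 3, 3.5]

n = len(hours)
x_bar = sum(hours) / n
y_bar = sum(gpa) / n

# Corrected sums of squares and cross-products.
s_xx = sum((x - x_bar) ** 2 for x in hours)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(hours, gpa))

slope = s_xy / s_xx
intercept = y_bar - slope * x_bar
print(f"GPA = {intercept:.4f} + ({slope:.5f}) * Hours")
```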

4.2 c

From the table one can see that $SS_E = 3.4637$. The estimate of the variance is also found in the table: $\hat\sigma^2 = MS_E = 0.6927$.

4.3 f

From the table, one can also see that the coefficient of determination is $R^2 = 0.15$. This is a really small $R^2$: the model explains only about 15% of the variability in $y$, which indicates that one should try to find another model that accounts for more of it.
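The variance estimate and $R^2$ follow directly from the sums of squares in the ANOVA table. The short sketch below uses the rounded table values as inputs; the adjusted $R^2$ formula $1 - MS_E / (SS_T/(n-1))$ is the standard one and is assumed to be what Minitab reports.

```python
# Sums of squares as printed in the Minitab ANOVA table (rounded values).
ss_regression = 0.6135
ss_error = 3.4637
ss_total = 4.0771
n = 7

ms_error = ss_error / (n - 2)                    # estimate of sigma^2
r_sq = ss_regression / ss_total                  # coefficient of determination
r_sq_adj = 1 - ms_error / (ss_total / (n - 1))   # adjusted R^2

print(f"MS_E = {ms_error:.4f}, R^2 = {r_sq:.3f}, adjusted R^2 = {r_sq_adj:.3f}")
```

The adjusted value comes out slightly negative, which would be consistent with the R-Sq(adj) = 0.0% shown in the output if negative values are displayed as zero.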

4.4 g

The test statistic values, $t_0$, for $\beta_0$ and $\beta_1$ are presented in the table: $t_{0,\beta_0} = 4.36$ with P-value = 0.007 and $t_{0,\beta_1} = -0.94$ with P-value = 0.390.

The hypotheses are

$$H_0: \beta_1 = 0, \quad H_1: \beta_1 \neq 0 \qquad \text{and} \qquad H_0: \beta_0 = 0, \quad H_1: \beta_0 \neq 0.$$

To be able to reject a null hypothesis, the rejection criterion $|t_0| > t_{\alpha/2,n-2}$ has to be satisfied. In this case $\alpha = 0.05$ and $n = 7$. This leads to $t_{0.025,5} = 2.571$, which is found from table II on page 462 in Engineering Statistics, fourth edition, C. Montgomery, Wiley, 2007. Thus $|t_{0,\beta_0}| = 4.36 > t_{0.025,5} = 2.571$ and the null hypothesis can be rejected, which means that the hypothesis that the intercept is zero is rejected. On the other hand, $|t_{0,\beta_1}| = 0.94 < t_{0.025,5} = 2.571$ and the null hypothesis cannot be rejected, which means that the data give no significant evidence of a linear relationship between the predictor and the response variable. This is also clear from the P-values. The P-value for $\beta_0$, 0.007, is lower than the level of significance of 0.05, while the P-value for $\beta_1$, 0.390, is not.

4.5 h

The ANOVA table can be seen in 4.1 a. The computed value for the test for significance of regression is $f_0 = 0.89$ with a P-value = 0.39. Thus, the null hypothesis, that the slope of the regression line is zero, cannot be rejected. This result is consistent with the result in 4.4 g.
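In simple linear regression the F-test for significance of regression and the t-test on the slope are the same test, since $f_0 = t_0^2$. The sketch below checks this relation using the rounded values printed in the Minitab output, so agreement is only expected up to rounding.

```python
# Values as printed in the Minitab regression output.
slope, se_slope = -0.08710, 0.09256
ms_regression, ms_error = 0.6135, 0.6927

t0 = slope / se_slope            # t statistic for H0: beta1 = 0
f0 = ms_regression / ms_error    # F statistic for significance of regression

print(f"t0 = {t0:.2f}, t0^2 = {t0 ** 2:.2f}, f0 = {f0:.2f}")
```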

4.6 i

The confidence intervals on the intercept, $\beta_0$, and the slope, $\beta_1$, are

$$\hat\beta_0 - t_{\alpha/2,n-2}\, se(\hat\beta_0) \le \beta_0 \le \hat\beta_0 + t_{\alpha/2,n-2}\, se(\hat\beta_0)$$

$$\hat\beta_1 - t_{\alpha/2,n-2}\, se(\hat\beta_1) \le \beta_1 \le \hat\beta_1 + t_{\alpha/2,n-2}\, se(\hat\beta_1)$$

where $\alpha = 0.05$, $n = 7$, $t_{0.025,5} = 2.571$, $se(\hat\beta_0) = 0.8781$ and $se(\hat\beta_1) = 0.09256$, which can be found in the table. Inserting the values in the confidence interval equations one gets

$$3.8286 - 2.571 \cdot 0.8781 \le \beta_0 \le 3.8286 + 2.571 \cdot 0.8781$$

$$-0.08710 - 2.571 \cdot 0.09256 \le \beta_1 \le -0.08710 + 2.571 \cdot 0.09256$$

that is,

$$1.571 \le \beta_0 \le 6.086$$

$$-0.3251 \le \beta_1 \le 0.1509$$

This result agrees with the results in 4.4 g and 4.5 h. Zero is not included in the confidence interval for $\beta_0$, but it is included in the confidence interval for $\beta_1$.
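The interval arithmetic can be reproduced with a few lines of Python. The helper `confidence_interval` is a hypothetical name used for illustration, and the critical value 2.571 is the table value quoted above.

```python
def confidence_interval(estimate, std_error, t_crit):
    """Two-sided interval: estimate +/- t_crit * std_error."""
    half_width = t_crit * std_error
    return estimate - half_width, estimate + half_width


T_CRIT = 2.571  # t_{0.025,5} from the t table

ci_intercept = confidence_interval(3.8286, 0.8781, T_CRIT)
ci_slope = confidence_interval(-0.08710, 0.09256, T_CRIT)

for name, (lower, upper) in (("beta0", ci_intercept), ("beta1", ci_slope)):
    print(f"{lower:.4f} <= {name} <= {upper:.4f}, contains zero: {lower <= 0 <= upper}")
```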

4.7 j

From the plots one can conclude that the model is inadequate. The histogram does not have a nice bell shape, so the residuals do not appear to be normally distributed. The residual versus fitted value plot shows a bit of a pattern; the residuals are not scattered randomly, so this plot agrees with the conclusion that the model is inadequate. If one also looks at the normal probability plot one sees that the residuals show curvature, almost like a sine curve, which is also an indication that the model is inadequate. Thus, the conclusion is that the model does not provide an adequate fit.

4.8 k

The sample correlation coefficient has magnitude $|r| = \sqrt{R^2}$ and takes the sign of the estimated slope. The value of $R^2$ is found from the table, and is 0.15. Since the slope is negative, $r \approx -0.3873$. To test for the significance of the sample correlation coefficient at $\alpha = 0.05$ one can test the following

$$H_0: \rho = 0, \qquad H_1: \rho \neq 0$$

where $\rho$ is the population correlation coefficient, which is equivalent to testing if the slope is zero. The test statistic then becomes

$$t_0 = \frac{r\sqrt{n-2}}{\sqrt{1-r^2}} = \frac{-0.3873 \cdot \sqrt{7-2}}{\sqrt{1-0.3873^2}} \approx -0.94.$$

Once again $|t_0| > t_{\alpha/2,n-2}$ has to be satisfied to be able to reject the null hypothesis. Since $0.94 < 2.571$, the null hypothesis cannot be rejected. The statistic coincides with the t-value for the slope, so the P-value is the same, approximately 0.39.

This result is in agreement with the previous results in 4.4 g and 4.5 h.
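The correlation test can also be sketched directly from the data. The code below computes $r = S_{xy}/\sqrt{S_{xx}S_{yy}}$, which carries the sign of the slope, and evaluates the t statistic with $n - 2$ degrees of freedom; the critical value 2.571 is the table value used earlier.

```python
from math import sqrt

hours = [6, 10, 8, 15, 5, 6, 12]
gpa = [2.4, 4, 3.2, 1.6, 3.7, 3, 3.5]

n = len(hours)
x_bar = sum(hours) / n
y_bar = sum(gpa) / n
s_xx = sum((x - x_bar) ** 2 for x in hours)
s_yy = sum((y - y_bar) ** 2 for y in gpa)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(hours, gpa))

r = s_xy / sqrt(s_xx * s_yy)                  # sample correlation coefficient
t0 = r * sqrt(n - 2) / sqrt(1 - r ** 2)       # test statistic for H0: rho = 0

T_CRIT = 2.571  # t_{0.025,5} from the t table
print(f"r = {r:.4f}, t0 = {t0:.3f}, reject H0: {abs(t0) > T_CRIT}")
```

The statistic equals the t-value of the slope in the regression output, which is why the two tests lead to the same decision.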


5 Problem 5: H12.4

The electric power consumed each month by a chemical plant is thought to be related to the average ambient temperature ($x_1$), the number of days in the month ($x_2$), the average product purity ($x_3$), and the tons of product produced ($x_4$). The past year's historical data are available and are presented in the following table:

y     x1   x2   x3   x4
240   25   24   91   100
236   31   21   90   95
270   45   24   88   110
274   60   25   87   88
301   65   25   91   94
316   72   26   94   99
300   80   25   87   97
296   84   25   86   96
267   75   24   88   110
276   60   25   91   105
288   50   25   90   100
261   38   23   89   98

a. Estimate the regression coefficients. Write the multiple linear regression model. Comment on the relationship found between the set of independent variables and the dependent variable.
b. Compute the residuals.
c. Compute $SS_E$ and estimate the variance.
d. Compute the coefficient of determination, $R^2$, and adjusted coefficient of multiple determination, $R^2_{Adjusted}$. Comment on their values.
e. Construct the ANOVA table and test for significance of regression. Comment on your results.
f. Find the standard error of the individual coefficients.
g. Use a t-test to test for significance of the individual coefficients at $\alpha = 0.05$. Comment on your results.
h. Construct 95% CIs on the individual coefficients. Compare your results with those found in part g and comment.
i. Perform a model adequacy check, including computing studentized residuals and Cook's distance measure for each of the observations. Comment on your results.
j. Compute the variance inflation factor and comment on the presence of multicollinearity.

5.1 a

Doing a regression analysis in Minitab leads to


Regression Analysis: y versus x1; x2; x3; x4

The regression equation is

y = - 123 + 0.757 x1 + 7.52 x2 + 2.48 x3 - 0.481 x4

Predictor Coef SE Coef T P VIF

Constant -123.1 157.3 -0.78 0.459

x1 0.7573 0.2791 2.71 0.030 2.323

x2 7.519 4.010 1.87 0.103 2.161

x3 2.483 1.809 1.37 0.212 1.335

x4 -0.4811 0.5552 -0.87 0.415 1.009

S = 11.7866 R-Sq = 85.2% R-Sq(adj) = 76.8%

Analysis of Variance

Source DF SS MS F P

Regression 4 5600.5 1400.1 10.08 0.005

Residual Error 7 972.5 138.9

Total 11 6572.9

Source DF Seq SS

x1 1 4233.4

x2 1 1027.8

x3 1 234.9

x4 1 104.3

Obs x1 y Fit SE Fit Residual St Resid

1 25.0 240.00 254.10 7.90 -14.10 -1.61

2 31.0 236.00 236.01 10.78 -0.01 -0.00

3 45.0 270.00 256.98 8.21 13.02 1.54

4 60.0 274.00 283.96 8.59 -9.96 -1.23

5 65.0 301.00 294.80 5.81 6.20 0.61

6 72.0 316.00 312.66 9.54 3.34 0.48

7 80.0 300.00 294.78 6.07 5.22 0.52

8 84.0 296.00 295.81 7.33 0.19 0.02

9 75.0 267.00 279.70 8.61 -12.70 -1.58

10 60.0 276.00 285.72 5.45 -9.72 -0.93

11 50.0 288.00 278.07 5.19 9.93 0.94

12 38.0 261.00 252.42 5.33 8.58 0.82

RESI1 COOK1

-14.0984 0.424963

-0.0084 0.000003

13.0164 0.446758

-9.9636 0.345562

6.2044 0.023536

3.3411 0.088498

5.2208 0.019223

0.1936 0.000056

-12.7023 0.571651

-9.7166 0.047088

9.9336 0.042414

8.5795 0.034191


[Figure: Residual Plots for y, with four panels: Normal Probability Plot (Percent versus Residual), Versus Fits (Residual versus Fitted Value), Histogram (Frequency versus Residual) and Versus Order (Residual versus Observation Order).]

where the fitted multiple linear regression model is found to be

$$\hat{y} = -123 + 0.757x_1 + 7.52x_2 + 2.48x_3 - 0.481x_4.$$

The corresponding estimated regression coefficients are $\hat\beta_0 = -123$, $\hat\beta_1 = 0.757$, $\hat\beta_2 = 7.52$, $\hat\beta_3 = 2.48$ and $\hat\beta_4 = -0.481$. From the model one can see that, with the other regressors held fixed, the consumed electric power ($y$) is estimated to go up by 0.757 for each unit increase in the average ambient temperature ($x_1$), to go up by 7.52 for each additional day in the month ($x_2$), to go up by 2.48 for each unit increase in the average product purity ($x_3$), and to go down by 0.481 for each additional ton of product produced ($x_4$).
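The coefficients can be cross-checked without Minitab by solving the least-squares normal equations. The following is a sketch in plain Python, not the method Minitab uses internally: the regressors are centered first and the resulting 4 x 4 system is solved with a small Gaussian elimination routine (`solve_linear_system`, a helper written here for illustration).

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


# Rows are (y, x1, x2, x3, x4) from the problem statement.
rows = [
    (240, 25, 24, 91, 100), (236, 31, 21, 90, 95), (270, 45, 24, 88, 110),
    (274, 60, 25, 87, 88), (301, 65, 25, 91, 94), (316, 72, 26, 94, 99),
    (300, 80, 25, 87, 97), (296, 84, 25, 86, 96), (267, 75, 24, 88, 110),
    (276, 60, 25, 91, 105), (288, 50, 25, 90, 100), (261, 38, 23, 89, 98),
]
y = [r[0] for r in rows]
X = [r[1:] for r in rows]
n, k = len(rows), 4

x_bar = [sum(col) / n for col in zip(*X)]
y_bar = sum(y) / n
Xc = [[v - mean for v, mean in zip(row, x_bar)] for row in X]  # centered regressors
yc = [v - y_bar for v in y]                                    # centered response

# Normal equations for the slopes: (Xc' Xc) b = Xc' yc.
A = [[sum(r[i] * r[j] for r in Xc) for j in range(k)] for i in range(k)]
g = [sum(r[i] * t for r, t in zip(Xc, yc)) for i in range(k)]
slopes = solve_linear_system(A, g)
intercept = y_bar - sum(b * mean for b, mean in zip(slopes, x_bar))

fitted = [intercept + sum(b * v for b, v in zip(slopes, row)) for row in X]
residuals = [obs - fit for obs, fit in zip(y, fitted)]
sse = sum(e * e for e in residuals)
sst = sum(t * t for t in yc)
r_sq = 1 - sse / sst
r_sq_adj = 1 - (sse / (n - k - 1)) / (sst / (n - 1))

print("intercept:", round(intercept, 1), "slopes:", [round(b, 4) for b in slopes])
print("SSE:", round(sse, 1), "R^2:", round(r_sq, 3), "adjusted R^2:", round(r_sq_adj, 3))
```

Centering the regressors is a design choice made for numerical stability; it leaves the fitted slopes unchanged and the intercept is recovered from the means afterwards.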

5.2 b

The residuals for each observation can be seen in the regression analysis from Minitab in the Residual column and also in the RESI1 column.

5.3 c

From the table one can see that $SS_E = 972.5$. The estimate of the variance is also found in the table: $\hat\sigma^2 = MS_E = 138.9$.

5.4 d

The coefficient of determination, $R^2$, can be found in the table and has the value $R^2 = 0.852$. The adjusted coefficient of multiple determination, $R^2_{Adjusted}$, can be found in the table and has the value $R^2_{Adjusted} = 0.768$. Both values indicate that the model fits the data well.

5.5 e

The ANOVA table can be seen in 5.1 a. The hypotheses are

$$H_0: \beta_1 = \beta_2 = \beta_3 = \beta_4 = 0, \qquad H_1: \text{at least one } \beta_j \neq 0.$$

If $f_0 > f_{\alpha,k,n-p}$ the null hypothesis can be rejected. The computed value for the test for significance of regression is $f_0 = 10.08$ with a P-value = 0.005. Here $\alpha = 0.05$, $n = 12$, $p = 5$ and $k = 4$, which leads to $f_{0.05,4,7} = 4.12$ from table IV on page 466 in Engineering Statistics, fourth edition, C. Montgomery, Wiley, 2007. Since $f_0 = 10.08 > f_{0.05,4,7} = 4.12$ and the P-value $= 0.005 < \alpha = 0.05$, one can conclude that the null hypothesis is rejected and at least one of the regressor variables ($x_j$) is linearly related to the response ($y$).
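The F statistic can be recomputed from the sums of squares in the ANOVA table. This short sketch uses the rounded table values as inputs, and the critical value 4.12 is the table value quoted above.

```python
# Sums of squares as printed in the Minitab ANOVA table.
ss_regression, ss_error = 5600.5, 972.5
n, k = 12, 4          # observations and regressors
p = k + 1             # number of model parameters including the intercept

ms_regression = ss_regression / k
ms_error = ss_error / (n - p)
f0 = ms_regression / ms_error

F_CRIT = 4.12  # f_{0.05,4,7} from the F table
print(f"f0 = {f0:.2f}, reject H0: {f0 > F_CRIT}")
```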

5.6 f

The standard error of the individual coefficients can be found in the column SE Coef in the table.


5.7 g

The hypotheses are, for each coefficient,

$$H_0: \beta_j = 0, \qquad H_1: \beta_j \neq 0.$$

The t-values and P-values for the t-test at $\alpha = 0.05$ are found in the table. The t-value for $x_1$ is $t_0 = 2.71$ with P-value = 0.030, which indicates that the regressor $x_1$ contributes significantly to the model. The t-value for $x_2$ is $t_0 = 1.87$ with P-value = 0.103, the t-value for $x_3$ is $t_0 = 1.37$ with P-value = 0.212 and the t-value for $x_4$ is $t_0 = -0.87$ with P-value = 0.415. This indicates that the null hypothesis cannot be rejected for $x_2$, $x_3$ and $x_4$, so these regressors are candidates for removal from the model. One should refit the model, removing the least significant regressor one at a time, until all the remaining regressors are significant.

5.8 h

A 95% CI for an individual coefficient is given by

$$\hat\beta_j - t_{\alpha/2,n-p}\, se(\hat\beta_j) \le \beta_j \le \hat\beta_j + t_{\alpha/2,n-p}\, se(\hat\beta_j)$$

where $t_{0.025,7} = 2.365$ from table II on page 462 in Engineering Statistics, fourth edition, C. Montgomery, Wiley, 2007. The standard error for each coefficient, $se(\hat\beta_j)$, is taken from the table.

$$0.0972 \le \beta_1 \le 1.417$$

$$-1.965 \le \beta_2 \le 17.00$$

$$-1.795 \le \beta_3 \le 6.761$$

$$-1.794 \le \beta_4 \le 0.8319$$

Zero is included in the CIs for $\beta_j$ for $j = 2, 3, 4$, but not in the CI for $\beta_1$. This agrees with the result in 5.7 g.
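The four intervals can be reproduced from the Coef and SE Coef columns of the Minitab output. A minimal sketch follows; the dictionary layout is chosen here for illustration, and 2.365 is the table value quoted above.

```python
T_CRIT = 2.365  # t_{0.025,7} from the t table

# (estimate, standard error) pairs from the Coef and SE Coef columns.
coefficients = {
    "x1": (0.7573, 0.2791),
    "x2": (7.519, 4.010),
    "x3": (2.483, 1.809),
    "x4": (-0.4811, 0.5552),
}

intervals = {}
for name, (estimate, std_error) in coefficients.items():
    half_width = T_CRIT * std_error
    intervals[name] = (estimate - half_width, estimate + half_width)
    lower, upper = intervals[name]
    print(f"{name}: [{lower:.4f}, {upper:.4f}], contains zero: {lower <= 0 <= upper}")
```

An interval that contains zero corresponds to a coefficient whose t-test is not significant at the same level.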

5.9 i

From the histogram one can see that the residuals do not have the desired bell shape of a normal distribution. The residuals in the versus fits plot look random enough. The fourth point in the normal probability plot deviates a bit too much, but other than that it looks like a good enough fit.

When looking at the studentized residuals in the column St Resid in the table, one can see that none of the studentized residuals is large enough to indicate the presence of outliers.

The Cook's distance values are found in the column COOK1 in the table. Since none of the values is bigger than unity, the points are not very influential.
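The studentized residuals and Cook's distances can be approximately recovered from the Residual and SE Fit columns, using the leverage $h_{ii} = (\text{SE Fit}_i)^2 / MS_E$, $r_i = e_i / \sqrt{MS_E(1 - h_{ii})}$ and $D_i = \frac{r_i^2}{p} \cdot \frac{h_{ii}}{1 - h_{ii}}$. The sketch below works from the rounded table values, so small deviations from the St Resid and COOK1 columns are expected.

```python
from math import sqrt

MS_ERROR = 138.9   # from the ANOVA table
P = 5              # number of model parameters including the intercept

# Residual and SE Fit columns from the Minitab output, observations 1 to 12.
residuals = [-14.10, -0.01, 13.02, -9.96, 6.20, 3.34, 5.22, 0.19, -12.70, -9.72, 9.93, 8.58]
se_fit = [7.90, 10.78, 8.21, 8.59, 5.81, 9.54, 6.07, 7.33, 8.61, 5.45, 5.19, 5.33]

diagnostics = []
for e, se in zip(residuals, se_fit):
    leverage = se ** 2 / MS_ERROR                      # h_ii
    studentized = e / sqrt(MS_ERROR * (1 - leverage))  # r_i
    cook = studentized ** 2 / P * leverage / (1 - leverage)
    diagnostics.append((leverage, studentized, cook))

for obs, (h, r, d) in enumerate(diagnostics, start=1):
    print(f"obs {obs:2d}: h = {h:.3f}, r = {r:6.2f}, D = {d:.3f}")

max_cook = max(d for _, _, d in diagnostics)
print("largest Cook's distance:", round(max_cook, 3))
```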


5.10 j

The variance inflation factors, VIF, are computed in column VIF in the table. All the VIFs are below 10 and they are all rather small, so there is no apparent problem with multicollinearity.
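The VIF of a regressor is $1/(1 - R_j^2)$, where $R_j^2$ comes from regressing $x_j$ on the other regressors. The sketch below implements this definition in plain Python; the helpers `solve_linear_system` and `r_squared` are written here for illustration and are not how Minitab necessarily computes the values.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def r_squared(response, columns):
    """R^2 from regressing response on columns (intercept handled by centering)."""
    n = len(response)
    y_c = [v - sum(response) / n for v in response]
    cols_c = [[v - sum(c) / n for v in c] for c in columns]
    k = len(cols_c)
    a = [[sum(p * q for p, q in zip(cols_c[i], cols_c[j])) for j in range(k)] for i in range(k)]
    g = [sum(p * q for p, q in zip(cols_c[i], y_c)) for i in range(k)]
    b = solve_linear_system(a, g)
    fitted = [sum(b[j] * cols_c[j][i] for j in range(k)) for i in range(n)]
    ss_error = sum((t - f) ** 2 for t, f in zip(y_c, fitted))
    ss_total = sum(t * t for t in y_c)
    return 1 - ss_error / ss_total


x = {
    "x1": [25, 31, 45, 60, 65, 72, 80, 84, 75, 60, 50, 38],
    "x2": [24, 21, 24, 25, 25, 26, 25, 25, 24, 25, 25, 23],
    "x3": [91, 90, 88, 87, 91, 94, 87, 86, 88, 91, 90, 89],
    "x4": [100, 95, 110, 88, 94, 99, 97, 96, 110, 105, 100, 98],
}

vif = {}
for name in x:
    others = [col for other, col in x.items() if other != name]
    vif[name] = 1 / (1 - r_squared(x[name], others))
    print(f"VIF({name}) = {vif[name]:.3f}")
```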
