Download ppt - Koch, M.: Basic Statistics© Springer-Verlag Berlin Heidelberg 2003 In: Wenclawiak, Koch, Hadjicostas (eds.) Quality Assurance in Analytical Chemistry –

In: Wenclawiak, Koch, Hadjicostas (eds.) Quality Assurance in Analytical Chemistry – Training and Teaching

Koch, M.: Basic Statistics © Springer-Verlag Berlin Heidelberg 2003

Basic Statistics

Michael Koch



Statistics

Are a tool to answer questions: How accurate is the analyses? How many analyses do I have to make to

overcome problems of inhomogeneity or imprecision?

Does the product meet the specification? With what confidence is the limit

exceeded?



Analytical results vary

Because of unavoidable deviation during measurement

Because of inhomogeneity between the subsamples

It is very important to differentiate between these two cases

Because the measures to be taken can be different



Variations

Are always present

If there seems to be none, the resolution is not high enough

322322322322322322322322322322322322322322322322322322322322

322,3322,5322,3322,2321,7321,8321,7322,2321,6322,4322,4321,5322,4322,1322,2322,5321,7321,8321,9322,4

322,29322,49322,28322,17321,67321,76321,75322,17321,58322,36322,40321,53322,42322,05322,16322,47321,73321,76321,85322,41



Histogram

For the description of the variability Divide range of possible measurements

into a number of groups Count observations in each group Draw a boxplot of the frequencies



Histogram

0

2

4

6

8

10

12

14

16

18

20

45,75 46,25 46,75 47,25 47,75 48,25 48,75 49,25 49,75 50,25 50,75

mg/g

fre

qu

en

cy



Distribution

If the number of values rises and the groups have a smaller range, the histogramm changes into a distribution curve



Cumulative Distribution

The cumulative distribution is obtained by summing up all frequencies

It can be used to find the proportion of values that fall above or below a certain value

For distributions (or histograms) with a maximum, the cumulative distribution curve shows the typical S-shape



Cumulative Distribution

0

20

40

60

80

100

45,75 46,75 47,75 48,75 49,75 50,75 mg/g

cu

mu

lati

ve f

req

ue

nc

y [%

]



Population – Sample

It is important to know, whether all data (the population) or only a subset (a sample) are known.

PopulationSampleAll inhabitants of a town

A selection of 1000 inhabitants of a townAny number of measurements of copper in a soil

Not possible



Distribution Descriptives

The data distribution can be described with four characteristics: Measures of location Measures of dispersion Skewness Kurtosis



Location – Arithmetic Mean

Best estimation of the population mean µ for random samples from a population

xx

n

ii

n

1



Location – Median

Sort data by value Median is the central value of this series

The median is robust, i.e. it is not affected by extreme values or outliers



Location – Mode

The most frequent value

Large number of values necessary It is possible to have two or more

modes



Location – Other Means

Geometric mean Harmonic Mean Robust Mean (e.g. Huber mean) ...



Dispersion – Variance of a population

The mean of the squares of the deviations of the individual values from the population mean.

2

2

1

( )x

n

ii

n



Dispersion – Variance of a sample

Variance of population with (n-1) replacing n and the arithmetic mean of the data (estimated mean) replacing the population mean µ

s

x x

n

ii

n

2

2

1

1

( )



Dispersion – standard deviation

The positive square root of the variance

Population

( )x

n

ii

n2

1

Sample

s

x x

n

ii

n

( ) 21

1



Dispersion – standard deviation of the mean

possible means of a data set are more closely clustered than the data

sdmn

where n is the number of

measurements for the mean



Dispersion – relative standard deviation

measure of the spread of data compared with the mean

RSDs

x



Skewness

The skewness of a distribution represents the degree of symmetry Analytical data close to the detection

limit often are skewed because negative concentrations are not possible



Kurtosis

The peakedness of a distribution. Flat platykurtic Very peaked leptokurtic In between mesokurtic



Normal distribution - I

First studied in the 18th century by Carl Friedrich Gauss

He found that the distribution of errors could be closely approximated by a curve called the „normal curve of errors“



Normal distribution - II

bell shaped completely determined by µ and σ

µ

y ex

1

2

2

22

( )



Normal distribution – important properties

the curve is symmetrical about µ the greater the value of σ the greater the spread of the curve approximately 68% (68,27%) of the data lies within µ±1σ approximately 95 % (95,45%) of the data lies within µ±2σ approximately 99,7 % (99,73%) of the data lies within µ±3σ

µµµµ



Normal distribution – a useful model

With the mean and the standard deviation area under the curve can be defined

These areas can be interpreted as proportions of observations falling within these ranges defined by µ and σ



One-tailed / two-tailed questions

One-tailed e.g. a limit for the

specification of a product

It is only of interest, whether a certain limit is exceeded or not

Two-tailed e.g. the analysis of a

reference material we want to know,

whether a measured value lies within or outside the certified value ± a certain range



One-sided probabilities

µ-k µ µ+k

Less than µ + k , P(<)Greater than µ + k , P(>)

k P(<) P(>)1 0.841 0.1591.64 0.950 0.0501.96 0.975 0.0252 0.977 0.0233 0.998 0.002

One-sided Probabilities



Two-sided probabilities

µ-k µ µ+k

Area within the range µ ± k , P(in)Area outside the range µ ± k , P(out)

k P(in) P(out)1 0.683 0.3171.64 0.900 0.1001.96 0.950 0.0502 0.954 0.0463 0.997 0.003

Two-sided Probabilities



Population / sample small random samples of results from a

normal distribution may not look normal at first sight e.g. samples from a population of 996 data

sample of 10 values

0

0,5

1

1,5

2

2,5

3

3,5

42 50 58 66 74 82 90 98 106

114

122

130

138

146

154

value

fre

qu

en

cy

sample of 20 values

0

0,5

1

1,5

2

2,5

3

3,5

42 50 58 66 74 82 90 98 106

114

122

130

138

146

154

value

fre

qu

en

cy

sample of 50 values

0

1

2

3

4

5

6

42 50 58 66 74 82 90 98 106

114

122

130

138

146

154

value

fre

qu

en

cy

sample of 100 values

0

1

2

3

4

5

6

7

8

value

fre

qu

en

cy


0

2

4

6

8

10

12

value

fre

qu

en

cy


0

5

10

15

20

25

value

fre

qu

en

cy

all 996 values

05

1015202530354045

value

freq

uenc

y

Population: µ=100, σ=19,75Sample size mean s

10 99,6 20,48

20 101,8 18,14

50 97,76 19,13

100 96,52 20,14

249 98,63 20,10

498 97,81 20,49



Confidence limits / intervals

The confidence limits describe the range within which we expect with given confidence the true value to lie.



Confidence limits for the population mean

If the population standard deviation σ and the estimated mean are known:

CL for the population mean, estimated from n measurements on a confidence level of 95%

CL xn

1 96.



Confidence limits for the population mean

If the standard deviation has to be estimated from a sample standard deviation s (this is normally the case):

Additional uncertainty use of Student t distribution

CL x ts

n

Where t is the two-tailed tabulated value for (n-1) degrees of freedom and the requested confidence limit



t-valuesCritical values for the Student t-Testd.o.f. \ 1T 60% 75% 80% 85% 90% 95% 97,50% 99% 99,50% 99,90% 99,95%d.o.f. \ 2T 20% 50% 60% 70% 80% 90% 95% 98% 99% 99,80% 99,90%

1 0,3249 1,000 1,376 1,963 3,078 6,314 12,71 31,82 63,66 318,3 636,62 0,2887 0,8165 1,061 1,386 1,886 2,920 4,303 6,965 9,925 22,33 31,603 0,2767 0,7649 0,9785 1,250 1,638 2,353 3,182 4,541 5,841 10,21 12,944 0,2707 0,7407 0,9410 1,190 1,533 2,132 2,776 3,747 4,604 7,173 8,6105 0,2672 0,7267 0,9195 1,156 1,476 2,015 2,571 3,365 4,032 5,893 6,8696 0,2648 0,7176 0,9057 1,134 1,440 1,943 2,447 3,143 3,707 5,208 5,9597 0,2632 0,7111 0,8960 1,119 1,415 1,895 2,365 2,998 3,499 4,785 5,4088 0,2619 0,7064 0,8889 1,108 1,397 1,860 2,306 2,896 3,355 4,501 5,0419 0,2610 0,7027 0,8834 1,100 1,383 1,833 2,262 2,821 3,250 4,297 4,781

10 0,2602 0,6998 0,8791 1,093 1,372 1,812 2,228 2,764 3,169 4,144 4,58711 0,2596 0,6974 0,8755 1,088 1,363 1,796 2,201 2,718 3,106 4,025 4,43712 0,2590 0,6955 0,8726 1,083 1,356 1,782 2,179 2,681 3,055 3,930 4,31813 0,2586 0,6938 0,8702 1,079 1,350 1,771 2,160 2,650 3,012 3,852 4,22114 0,2582 0,6924 0,8681 1,079 1,345 1,761 2,145 2,624 2,977 3,787 4,14015 0,2579 0,6912 0,8662 1,074 1,341 1,753 2,131 2,602 2,947 3,733 4,07316 0,2576 0,6901 0,8647 1,071 1,337 1,746 2,120 2,583 2,921 3,686 4,01517 0,2573 0,6892 0,8633 1,069 1,333 1,740 2,110 2,567 2,898 3,646 3,96518 0,2571 0,6884 0,8620 1,067 1,330 1,734 2,201 2,552 2,878 3,610 3,92219 0,2569 0,6876 0,8610 1,066 1,328 1,729 2,093 2,539 2,861 3,579 3,88320 0,2567 0,6870 0,8600 1,064 1,325 1,725 2,086 2,528 2,845 3,552 3,850



Confidence limits – depending on n

Confidence Limits are strongly dependent on the number of measurements

µ

CLn=1

µ

CLn=1

CLn=3

µ

CLn=1

CLn=3

CLn=10



Trueness – Precision – Accuracy - Error Trueness

the closeness of agreement between the true value and the sample mean

Precision the degree of agreement between the results of

repeated measurements Accuracy

the sum of both, trueness and precision Error

the difference between one measurement and the true value



Linkage

Improvingaccuracy

Improving precision

Error

Impr

ovin

g tr

uene

ss

from: LGC, vamstat II



Significance Testing - ?

Typical Situation: We have analyzed a reference material

and we want to know, if the bias between the mean of measurements and the certified value of a reference material is „significant“ or just due to random errors.

This is a typical task for significance testing



Significance testing

a decision is made about a population based on observations made from a sample of the

population on a certain level of confidence

there can never be any guarantee that the decision is absolutely correct

on a 95% confidence level there is a probability of 5% that the decision is wrong



Before significance testing

inspect a graphical display of the data before using any statistics

are there any data that are not suitable? is a significance test really necessary? compare the statistical result with a

commonsense visual appraisal a difference may be statistically significant,

but not of any importance, because it is small



The most important thing in statistics

No blind use



The stages of significance testing - I

Null hypothesis a statement about the population, e.g.:

There is no bias (µ = xtrue)

Alternative hypothesis H1

the opposite of the null hypothesis, e.g.: there is a bias (µ ≠ xtrue)



The stages of significance testing - II

Critical value the tabulated value of the test statistics,

generally at the 95% confidence level. If the calculated value is greater than the

tabulated value than the test result indicates a significant difference

Evaluation of the test statistics

Decisions and conclusions



Significance test example – the question

Is the bias between the mean of measurements and the certified value of a reference material significant or just due to random errors?



Significance test example – graphical explanation

x ref

95% confidence interval for the population mean µ

x

significant

x ref

95% confidence interval for the population mean µ

x

not significant

the reference value is outside the 95% confidence limits of the data set

the bias is unlikely to result from random variations in the results

the bias is significant

the reference value is inside the 95% confidence limits of the data set

the bias is likely to result from random variations in the results

the bias is not significant



Significance test example - calculation

The Null hypothesis we assume there is no bias: x=xref

Calculation of confidence limits x

t s

nx

t s

n

With the Null hypothesis x

t s

nx x

t s

nre f

t t

x x

s

n

critica l observed

ref

or




1 0,3249 1,000 1,376 1,963 3,078 6,314 12,71 31,82 63,66 318,3 636,62 0,2887 0,8165 1,061 1,386 1,886 2,920 4,303 6,965 9,925 22,33 31,603 0,2767 0,7649 0,9785 1,250 1,638 2,353 3,182 4,541 5,841 10,21 12,944 0,2707 0,7407 0,9410 1,190 1,533 2,132 2,776 3,747 4,604 7,173 8,6105 0,2672 0,7267 0,9195 1,156 1,476 2,015 2,571 3,365 4,032 5,893 6,8696 0,2648 0,7176 0,9057 1,134 1,440 1,943 2,447 3,143 3,707 5,208 5,9597 0,2632 0,7111 0,8960 1,119 1,415 1,895 2,365 2,998 3,499 4,785 5,4088 0,2619 0,7064 0,8889 1,108 1,397 1,860 2,306 2,896 3,355 4,501 5,0419 0,2610 0,7027 0,8834 1,100 1,383 1,833 2,262 2,821 3,250 4,297 4,781

10 0,2602 0,6998 0,8791 1,093 1,372 1,812 2,228 2,764 3,169 4,144 4,58711 0,2596 0,6974 0,8755 1,088 1,363 1,796 2,201 2,718 3,106 4,025 4,43712 0,2590 0,6955 0,8726 1,083 1,356 1,782 2,179 2,681 3,055 3,930 4,31813 0,2586 0,6938 0,8702 1,079 1,350 1,771 2,160 2,650 3,012 3,852 4,22114 0,2582 0,6924 0,8681 1,079 1,345 1,761 2,145 2,624 2,977 3,787 4,14015 0,2579 0,6912 0,8662 1,074 1,341 1,753 2,131 2,602 2,947 3,733 4,07316 0,2576 0,6901 0,8647 1,071 1,337 1,746 2,120 2,583 2,921 3,686 4,01517 0,2573 0,6892 0,8633 1,069 1,333 1,740 2,110 2,567 2,898 3,646 3,96518 0,2571 0,6884 0,8620 1,067 1,330 1,734 2,201 2,552 2,878 3,610 3,92219 0,2569 0,6876 0,8610 1,066 1,328 1,729 2,093 2,539 2,861 3,579 3,88320 0,2567 0,6870 0,8600 1,064 1,325 1,725 2,086 2,528 2,845 3,552 3,850



Significance test example – conclusions I

If we find tcritical > tobserved

conforming to our Null hypothesis

we can say: we cannot reject the Null hypothesis we accept the Null hypothesis we find no significant bias at the 95% level we find no measurable bias (under the given

experimental conditions) but not: there is no bias



Significance test example – conclusions II

If we find tcritical < tobserved

not conforming to our Null hypothesis

we can say: we reject the null hypothesis we find significant bias at the 95% level but not: there is a bias, because there is

roughly a 5% chance of rejecting the Null hypothesis when it is true.



One-sample t-test

This test compares the mean of analytical results with a stated value

The question may be two-tailed as in the previous example

or one-tailed e.g.: is the copper content of an alloy below the

specification? for one-tailed questions other t-values are used




1 0,3249 1,000 1,376 1,963 3,078 6,314 12,71 31,82 63,66 318,3 636,62 0,2887 0,8165 1,061 1,386 1,886 2,920 4,303 6,965 9,925 22,33 31,603 0,2767 0,7649 0,9785 1,250 1,638 2,353 3,182 4,541 5,841 10,21 12,944 0,2707 0,7407 0,9410 1,190 1,533 2,132 2,776 3,747 4,604 7,173 8,6105 0,2672 0,7267 0,9195 1,156 1,476 2,015 2,571 3,365 4,032 5,893 6,8696 0,2648 0,7176 0,9057 1,134 1,440 1,943 2,447 3,143 3,707 5,208 5,9597 0,2632 0,7111 0,8960 1,119 1,415 1,895 2,365 2,998 3,499 4,785 5,4088 0,2619 0,7064 0,8889 1,108 1,397 1,860 2,306 2,896 3,355 4,501 5,0419 0,2610 0,7027 0,8834 1,100 1,383 1,833 2,262 2,821 3,250 4,297 4,781

10 0,2602 0,6998 0,8791 1,093 1,372 1,812 2,228 2,764 3,169 4,144 4,58711 0,2596 0,6974 0,8755 1,088 1,363 1,796 2,201 2,718 3,106 4,025 4,43712 0,2590 0,6955 0,8726 1,083 1,356 1,782 2,179 2,681 3,055 3,930 4,31813 0,2586 0,6938 0,8702 1,079 1,350 1,771 2,160 2,650 3,012 3,852 4,22114 0,2582 0,6924 0,8681 1,079 1,345 1,761 2,145 2,624 2,977 3,787 4,14015 0,2579 0,6912 0,8662 1,074 1,341 1,753 2,131 2,602 2,947 3,733 4,07316 0,2576 0,6901 0,8647 1,071 1,337 1,746 2,120 2,583 2,921 3,686 4,01517 0,2573 0,6892 0,8633 1,069 1,333 1,740 2,110 2,567 2,898 3,646 3,96518 0,2571 0,6884 0,8620 1,067 1,330 1,734 2,201 2,552 2,878 3,610 3,92219 0,2569 0,6876 0,8610 1,066 1,328 1,729 2,093 2,539 2,861 3,579 3,88320 0,2567 0,6870 0,8600 1,064 1,325 1,725 2,086 2,528 2,845 3,552 3,850



Two-sample t-test

This test compares two independent sets of data (analytical results)

The question may be two-tailed or one-tailed as well two-tailed: are the results of two methods

significantly different one-tailed: are the results of method A

significantly lower than the results of method B



two-sample t-test – t-value

The formula for the calculation of the t-value is somewhat more complex:

21111

)(

21

2221

21

21

21

nnnsns

nn

xxtobserved

The degree of freedom for the tabulated value tcritical is

d.o.f = n1+n2-2



two-sample t-test - validity

The two-sample t-test is valid only if there is no significant difference between the both standard deviations s1 and s2.

This can (and should) be tested with the F-test



F-test

The F-test is a method of testing whether two independent estimates of variance are significantly different.

The F-value is calculated according to Fobserved=sA

2/sB2, with sA

2 > sB2

The critical value of F depends on the degrees of freedom dofA=nA-1 and dofB=nB-1 and can be taken from a table



nA=10

nB=10

F-values

1 2 3 4 5 6 7 8 9 10

1 647,79 799,50 864,16 899,58 921,85 937,11 948,22 956,66 963,29 968,63

2 38,51 39,00 39,17 39,25 39,30 39,33 39,36 39,37 39,39 39,40

3 17,44 16,04 15,44 15,10 14,89 14,74 14,62 14,54 14,47 14,42

4 12,22 10,65 9,98 9,61 9,36 9,20 9,07 8,98 8,91 8,84

5 10,01 8,43 7,76 7,39 7,15 6,98 6,85 6,76 6,68 6,62

6 8,81 7,26 6,60 6,23 5,99 5,82 5,70 5,60 5,52 5,46

7 8,07 6,54 5,89 5,52 5,29 5,12 5,00 4,90 4,82 4,76

8 7,57 6,06 5,42 5,05 4,82 4,65 4,53 4,43 4,36 4,30

9 7,21 5,72 5,08 4,72 4,48 4,32 4,20 4,10 4,03 3,96

10 6,94 5,46 4,83 4,47 4,24 4,07 3,95 3,86 3,78 3,72

degrees of freedom for numerator (A)

deg

rees

of

free

dom

fo

r de

num

erat

or (

B)

F statistics for P=0.05



F-test - decision

If Fobserved < Fcritical than the variances sA2

and sB2 are not significantly different

on the chosen confidence level



Paired t-test

Two different methods are applied for measuring different samples

sample 1sample 2

sample 3sample 4

sample 5sample 6

0

0,5

1

1,5

2

2,5

3

Method A

Method B

Two-sample t-test is inappropriate, because differences between samples ar much greater than a bias between the two methods

paired t-test



Paired t-test - principle

The differences between both methods are calculated

A one-sample t-test is applied on the differences to decide whether the mean difference is significantly different from zero.



Paired t-test – test statistics

The t-value is calculated according to:

n

s

x

observed diff

diff

t

The critical t-value is taken from the t-table for the desired confidence level



Statistics

There are many statistical tools for the analyst to interpret analytical data

Statistics are not a „cure-all“ that can answer all questions

If the analyst uses statistics with a commonsense, this is a powerful tool to help him to answer his customers or his own questions.



Recommended literature and training software

LGC, VAMSTAT II, Statistics Training for Valid Analytical Measurements. CD-ROM, Laboratory of the Government Chemist, Teddington 2000.

H. Lohninger: Teach/Me - Data Analysis. Multimedia teachware. Springer-Verlag 1999.

J.C. Miller and J.N. Miller: Statistics for Analytical Chemistry. Ellis Horwood Ltd., Chicester 1988

T.J. Farrant: Practical Statistics for the Analytical Scientist – a Bench Guide. Royal Society of Chemistry for the Laboratory of the Government Chemist (LGC), Teddington 1997