RP-04 Laboratory: Monitoring and Adjustment of Calibration Intervals for Mass Standards

N. Dupuis-Désormeaux, Senior Engineer, Gravimetry
August 2002

RP-04: Monitoring and Adjustment of Calibration Intervals for Mass Standards, in PDF format, 305 KB

Amendment, in PDF Format, 195 KB


Table of Contents

1. Abstract

2. Rationale

3. Detailed Steps

3.1 Establishment of the Minimum Sample Size

3.1.1 Maximum Variance of the Data
3.1.2 Maximum Tolerable Error
3.1.3 Z-statistic Corresponding to the Confidence Level
3.1.4 Minimum Sample Size Computation

3.2 Computation of the As-Found Values

3.3 Determination of Uncertainties

3.4 Establishment of Reliability Targets

3.5 Formulation of Statistical Hypotheses

3.6 Analysis of Data

3.6.1 Maximum Allowable Number of Defects
3.6.2 Minimum Number of Defects
3.6.3 Examples

3.7 Adjustment of the Calibration Interval

4. Summary - At a Glance

5. References

Abstract

Measurement Canada recognizes that it is important to ensure that the standards calibration activities are well documented and monitored. This is especially pertinent given the private sector’s increasing use of quality assurance and quality control programs.

This document addresses exclusively one quality control aspect of Measurement Canada’s mass calibration activities: the monitoring and adjustment of calibration intervals.

Monitoring:

An evaluation mechanism is suggested to determine if the number of masses found to be “out-of-tolerance” at the end of their calibration cycle is statistically significant. This verification is done through the analysis of “as-found” values that have been accumulated by the Calibration Standards Laboratory (CSL).

Adjustment of the Calibration Intervals:

A method for shortening or lengthening the calibration intervals is proposed when the above analysis reveals significant differences from expected results.

Rationale

Various acceptance sampling methods are used to estimate the maximum (and minimum) number of defects to be (statistically) expected.

For the monitoring of mass calibration activities, two evaluation tools are most appropriate: a maximum likelihood estimate method and the method proposed in ISO 2859-1.

A) Given a known sample size, confidence level and reliability target, a maximum likelihood estimate method with the probability density function of the normal distribution is used. This is similar to the method of ISO 7966: 1993 (E) when used with proportions.


The seven steps involved in the method are as follows:

1) Establishment of the Sample Size
2) Computation of the “As-found” Values
3) Determination of Uncertainties
4) Establishment of Reliability Targets
5) Formulation of Statistical Hypotheses
6) Analysis of Data
7) Adjustment of the Calibration Interval

These steps will be explained in detail in the following section.

B) The ISO 2859-1 standard with multiple sampling plans (Table IV-4) can also be used.

Although the ISO 2859-1 method offers a more advanced technique for computing acceptance quality levels, the benefits of a more elaborate sampling plan appear to be limited at this time.

Detailed Steps


3.1 Establishment of the Minimum Sample Size (N)

The determination of the sample size is crucial to any statistical analysis since it defines the confidence that we will have in the results. The analysis is conclusive only if at least the recommended number of points is sampled.

The maximum variance of the process, the maximum tolerable error between sample mean and true population average, and the confidence level must be known before the minimum sample size can be calculated.


The sample size is a function of:

- a) the maximum variance of the data under study (σ²Max);
- b) the maximum tolerable error (E) between the observed sample mean and the true population average; and
- c) the confidence level (expressed as a Zα/2 value) with which we can say that the population average is contained within the sample mean ± E.

This relationship can be expressed as:

$$N = \frac{Z_{\alpha/2}^{2}\,\sigma_{Max}^{2}}{E^{2}}$$


3.1.1 Maximum Variance of the Data (σ²Max)

The maximum variance σ²Max of the values observed for the correction from nominal can be estimated based on experience. In the past, the values for the correction have fluctuated between +1 1/3 tolerance and −1 1/3 tolerance. Therefore, the range of values observed can be expressed as |Max − min| = |1 1/3 tolerance − (−1 1/3 tolerance)| = |4/3 + 4/3| tolerance = 8/3 tolerance. If we treat this fluctuation as a rectangular distribution, we have

$$\sigma_{Max}^{2} = \frac{(\text{range})^{2}}{12} = \frac{(8/3\ \text{tolerance})^{2}}{12} = \frac{16}{27}\ \text{tolerance}^{2} \approx 0.593\ \text{tolerance}^{2}$$
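The variance of a rectangular distribution is its full width squared, divided by 12. As a minimal sketch, assuming corrections expressed in units of the tolerance and a hypothetical helper name `rectangular_variance`, the value for the 8/3 tolerance range can be checked with exact fractions:

```python
from fractions import Fraction


def rectangular_variance(lower, upper):
    """Variance of a rectangular (uniform) distribution on [lower, upper]."""
    width = Fraction(upper) - Fraction(lower)
    return width ** 2 / 12


# Corrections assumed to range from -4/3 to +4/3 of the tolerance.
sigma2_max = rectangular_variance(Fraction(-4, 3), Fraction(4, 3))
print(sigma2_max, float(sigma2_max))
```

The result is in units of tolerance squared, so it can be used directly with an error E that is also expressed as a fraction of the tolerance.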

3.1.2 Maximum Tolerable Error (E)

The maximum tolerable error E is the maximum difference that we are willing to accept between the average value obtained for the sample and the “true” average of the population. In our case, we are gathering data on the correction from nominal value and thus will obtain an average value for this correction.

For mass calibrations, we have that the maximum expanded combined uncertainty 2ucombined_max must be smaller than 1/3 the applicable tolerance on that mass; therefore, the maximum ucombined_max (at 1σ) on the measurement is no greater than 1/6 tolerance.

It is reasonable to assume that we want our experimental average to be no farther than 1/6 tolerance from the true average of the population. Thus we set E = ucombined_max = 1/6 tolerance. However, when the calculated value of the combined uncertainty ucombined (see 3.3) is known, this value can be used instead of the maximum combined uncertainty ucombined_max discussed above.

This means that we require a sample of “N” points (see 3.1.4) in order to ensure that the “true” average correction from nominal value is no greater than 1/6 tolerance away from the observed average correction from nominal.


Technical Note:

Since the average correction is c = (Σxi)/n, where xi are the individual corrections, the uncertainty of the observed average correction “c” can be expressed as:

$$(u(c))^{2} = \left(\frac{\partial c}{\partial x_{1}}u(x_{1})\right)^{2} + \left(\frac{\partial c}{\partial x_{2}}u(x_{2})\right)^{2} + \left(\frac{\partial c}{\partial x_{3}}u(x_{3})\right)^{2} + \dots + \left(\frac{\partial c}{\partial x_{n}}u(x_{n})\right)^{2}$$

Because each partial derivative ∂c/∂xi equals 1/n, and because we know that MAX(u(xi)) = u(x1) = u(x2) = (...) = u(xn) = 1/6 tolerance, we have that

$$(u(c))^{2} = \left(\frac{1}{n}\right)^{2}\sum_{i=1}^{n}(u(x_{i}))^{2} = \left(\frac{1}{n}\right)^{2} n\,(u(x_{i}))^{2} = \frac{1}{n}(u(x_{i}))^{2}$$

Therefore, at 1σ, u(c) = σ(c) = σ(xi)/√n represents the chance variations of the sample mean from the “true” mean. This is consistent with the Central Limit Theorem, which states that the standard deviation of the means (of samples of ‘n’ points) is equal to the standard deviation of the entire population divided by the square root of the number of points in each sample.
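The propagation above can be checked numerically. The following is a minimal sketch, assuming equal individual uncertainties of 1/6 tolerance and n = 100 points; the function name `uncertainty_of_mean` is illustrative only:

```python
import math


def uncertainty_of_mean(individual_uncertainties):
    """Combine uncertainties u(x_i) into u(c) for c = sum(x_i) / n."""
    n = len(individual_uncertainties)
    # Each sensitivity coefficient dc/dx_i equals 1/n.
    return math.sqrt(sum((u / n) ** 2 for u in individual_uncertainties))


n = 100
u_each = 1 / 6  # in units of the tolerance
u_mean = uncertainty_of_mean([u_each] * n)
print(u_mean, u_each / math.sqrt(n))
```

Both printed values should agree, which illustrates that the term-by-term propagation reduces to σ(xi)/√n when all points share the same uncertainty.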


3.1.3 Z-statistic (Zα/2) Corresponding to the Confidence Level

Finally, we set the confidence level at 95%.

This implies that if we want to be 95% certain that the true population average will be within ± E of the observed sample average, we must collect at least “N” points. Because we are considering both possibilities, +E and −E, we use a two-sided Z-statistic Zα/2.

For a confidence level (c.l.) of 95%, we have α = 1 − c.l. = 0.05, or α/2 = 0.025; this α/2 corresponds to a Zα/2 of 1.96.
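The Z values quoted in this document can be reproduced from the standard normal distribution. This is a minimal sketch using Python's standard library; the helper names `z_two_sided` and `z_one_sided` are illustrative and not part of the source:

```python
from statistics import NormalDist


def z_two_sided(confidence_level):
    """Z value leaving alpha/2 in each tail of the standard normal."""
    alpha = 1 - confidence_level
    return NormalDist().inv_cdf(1 - alpha / 2)


def z_one_sided(confidence_level):
    """Z value leaving alpha in a single tail of the standard normal."""
    return NormalDist().inv_cdf(confidence_level)


print(round(z_two_sided(0.95), 2))  # two-sided value used for the sample size
print(round(z_one_sided(0.95), 3))  # one-sided value used in the defect analysis
```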

3.1.4 Minimum Sample Size Computation


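The minimum sample size is therefore computed as follows:

$$N = \frac{Z_{\alpha/2}^{2}\,\sigma_{Max}^{2}}{E^{2}} = \frac{(1.96)^{2}\,(16/27\ \text{tolerance}^{2})}{(1/6\ \text{tolerance})^{2}}$$

$$N = 3.8416 \times \frac{16}{27} \times 36 \approx 81.95$$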

A number of N ≥ 82 is computed from the above. With n = 100, we have sufficient points to ensure that the confidence level is respected. It should be borne in mind that if fewer than 82 points are sampled, the confidence level will be directly affected. Further, if the sample size is smaller than 30 points the analytic techniques presented in this paper are inappropriate and should not be used.
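As a minimal sketch of this computation, assuming the inputs from sections 3.1.1 to 3.1.3 (variance 16/27 tolerance², E = 1/6 tolerance, Z = 1.96) and a hypothetical function name `minimum_sample_size`:

```python
import math


def minimum_sample_size(z, variance, max_error):
    """Smallest integer N with N >= z^2 * variance / max_error^2."""
    return math.ceil(z ** 2 * variance / max_error ** 2)


# Variance and error are both expressed in units of the tolerance,
# so the tolerance itself cancels out of the ratio.
n_min = minimum_sample_size(z=1.96, variance=16 / 27, max_error=1 / 6)
print(n_min)
```

If a calculated combined uncertainty smaller than 1/6 tolerance is used for E, as section 3.1.2 allows, the same function returns a correspondingly larger minimum sample size.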

Please note that ISO 7966:1993(E) could also have been used to determine the sample size.


ISO 7966:1993(E)

The alpha risk and the beta risk are both set at 0.05 because we want to be 95% certain of the results of our findings. These risks are one-sided; hence, Zα = Zβ = 1.645.

Zp0 = Z0.05 = 1.645 (one-sided): accept the lot if fewer than 5% of standards are outside the control limits.
Zp1 = Z0.10 = 1.282 (one-sided): reject the lot if more than 10% of standards are outside the control limits.

USL = 0.5 tolerance and LSL = −0.5 tolerance, because outside these values the mass will need to be adjusted.

S = σwithin = random variability of each data point, which in our case corresponds to a maximum value of 1/6 of the tolerance ≈ 0.17 tolerance. Please note that this parameter is different from the standard deviation expected in the population as described in section 3.1.1.

APL (acceptable process level):
USL − Zp0 S = USL − 1.645 S = 0.5 tolerance − 1.645 (0.17 tolerance) = 0.2203 tolerance
LSL + Zp0 S = LSL + 1.645 S = −0.5 tolerance + 1.645 (0.17 tolerance) = −0.2203 tolerance
Masses with corrections within ± 0.2203 tolerance of nominal have a 95% chance of being accepted (and fewer than 5% will be over the ± 0.5 tolerance limit).

RPL (rejectable process level):
USL − Zp1 S = USL − 1.282 S = 0.5 tolerance − 1.282 (0.17 tolerance) = 0.2821 tolerance
LSL + Zp1 S = LSL + 1.282 S = −0.5 tolerance + 1.282 (0.17 tolerance) = −0.2821 tolerance
Masses with corrections more than 0.2821 tolerance away from nominal have a 95% chance of being rejected (and more than 10% will exceed the ± 0.5 tolerance limit).

(RPL − APL)² = (0.2821 tolerance − 0.2203 tolerance)² = (0.0618 tolerance)²

N = (Zα + Zβ)² σwithin² ÷ (RPL − APL)² = (3.29)² (0.17 tolerance)² ÷ (0.0618 tolerance)² ≈ 82
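The ISO 7966 style computation above can be sketched as follows. This is an illustration under the stated inputs only (S = 0.17 tolerance, USL = 0.5 tolerance); the function name `iso7966_sample_size` is hypothetical and is not an API of any standard or library:

```python
def iso7966_sample_size(z_alpha, z_beta, z_p0, z_p1, s, usl):
    """Return (APL, RPL, N) for the upper limit, in units of the tolerance."""
    apl = usl - z_p0 * s  # acceptable process level
    rpl = usl - z_p1 * s  # rejectable process level
    n = (z_alpha + z_beta) ** 2 * s ** 2 / (rpl - apl) ** 2
    return apl, rpl, n


apl, rpl, n = iso7966_sample_size(
    z_alpha=1.645, z_beta=1.645, z_p0=1.645, z_p1=1.282, s=0.17, usl=0.5
)
print(round(apl, 4), round(rpl, 4), round(n, 1))
```

Carrying unrounded intermediate values gives N of roughly 82.1, whereas the rounded difference of 0.0618 used in the text gives roughly 81.9; both are consistent with the figure of about 82. Note also that S cancels out of the ratio, so N depends only on the four Z values.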


3.2 Computation of the “As-found” Values

Full calibration procedures are provided in RP-01 Laboratory Calibration Procedures for Standards of Mass.

Using the usual calibration procedures outlined in RP-01 Laboratory Calibration Procedures for Standards of Mass, determine the “as-found” value of at least the same number of standards as the minimum sample size determined in 3.1.4 above (i.e., n ≥ N = 82).


The “as-found” values are obtained prior to cleaning or adjusting the standards.


3.3 Determination of Uncertainties

Full computation of the uncertainties for mass calibration activities can be found in RP-02 Determination of Mass Calibration Values and Related Uncertainties.

Compute or note the associated uncertainty for the nominal value and class of the mass standard under evaluation. If this information is not available, then a maximum uncertainty (at 1σ) of 1/6 tolerance can be used; this is discussed in section 3.1.2.

3.4 Establishment of Reliability Targets

Parameters for analysis:

- Reliability target: 90% of population must fall within control limits
- Control Limits: ± ½ tolerance from nominal
- Confidence Level: 95%

The reliability target is set such that: at the end of their calibration cycle, 90% of the standards must fall within the control limits.

The control limits are set on the assumption that we want the values for the correction from nominal to be within ± ½ tolerance from nominal when the masses return for calibration. This also accounts for an expanded uncertainty in the determination of the correction equal to ± 1/3 tolerance.


In other words, a mass is called a defect if, when it returns for calibration, its mass value is beyond the control limits of ± ½ tolerance from nominal. The number of total defects observed is called Xdefects.
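Counting defects from a list of as-found corrections can be sketched as below. The data values, the 1.0 mg tolerance and the function name `count_defects` are hypothetical and serve only to illustrate the ± ½ tolerance rule:

```python
def count_defects(corrections, tolerance, control_fraction=0.5):
    """Count corrections lying beyond +/- control_fraction * tolerance."""
    limit = control_fraction * tolerance
    # A value exactly on the control limit is not counted as a defect.
    return sum(1 for c in corrections if abs(c) > limit)


# Hypothetical as-found corrections (mg) for standards with a 1.0 mg tolerance.
corrections_mg = [0.10, -0.62, 0.48, 0.51, -0.05, 0.50]
x_defects = count_defects(corrections_mg, tolerance=1.0)
print(x_defects)
```

Treating a value that sits exactly on the limit as acceptable is a design choice made here, based on the word “beyond” in the definition; the source does not address that boundary case explicitly.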


The confidence level means that Measurement Canada can be 95% certain that the analysis performed will detect when more than 10% of the standards are outside the control limits at the end of their calibration period.

3.5 Formulation of Statistical Hypotheses

After the reliability target, control limits and confidence level have been established, a comparison criterion is defined via statistical hypotheses. The “as-found” data are then compared to these statistical hypotheses.

In hypothesis testing, a null hypothesis is formulated and compared against its alternative hypothesis.

In our case,

- the null hypothesis Ho is: no more than 10% of the population will be outside the control limits;
- its alternative hypothesis H1 is: more than 10% of the population will be outside the control limits.

Note: the population is the total number of active standards of the same nominal value, class and usage.

This can be written as:

Ho: X ≤ (10%) M out of control limits
H1: X > (10%) M out of control limits
where X is the total number of defects within a population of M standards.

3.6 Analysis of Data

The “as-found” data are now compared against the statistical hypotheses. This step is crucial as it determines if the number of observed “good” points (within the control limits) is sufficient to not-reject the Ho hypothesis. Likewise, the number of observed points that are “out of control limits” can be used to test our hypotheses.

To determine if the null hypothesis is to be rejected in favor of the alternative hypothesis, we must calculate what is called a test statistic. Hence, we are now ready to compare our sample data to our expected results by means of the test statistic.


Technical note:

Using a maximum likelihood estimate and the probability mass function of the binomial distribution, we can estimate what fraction of items will still be in-tolerance at the end of the calibration period. Note that a Bernoulli trial is used since there are only two possible outcomes: either the points are within control limits or they are outside these limits. Note also that when using the (continuous) normal distribution instead of the (discrete) binomial distribution, a correction for continuity is necessary when the sample is small (n smaller than 30). Further, if the following conditions are met, the normal distribution can be used instead of the binomial distribution:

If n ≥ 30 OR
If np ≥ 5 AND n(1 − p) ≥ 5

where n is the number of points sampled and p is the probability of success (or failure) expected.
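The conditions above translate directly into a small check. This is a minimal sketch; the function name `normal_approximation_ok` is illustrative:

```python
def normal_approximation_ok(n, p):
    """True if n >= 30, or if both n*p >= 5 and n*(1 - p) >= 5."""
    return n >= 30 or (n * p >= 5 and n * (1 - p) >= 5)


print(normal_approximation_ok(100, 0.1))  # large sample
print(normal_approximation_ok(20, 0.1))   # small sample, few expected defects
print(normal_approximation_ok(20, 0.5))   # small sample, balanced proportions
```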

The Z test statistic will be used in our analysis. It is based on the normal distribution. When using a normal approximation, it is necessary to know the sample mean and standard deviation. Regarding proportions, the mean is expressed as μ = np and the standard deviation is expressed as σ = √(npq), where n is the number of points sampled, p is the proportion of defects expected and q = 1 − p is the proportion of non-defective items expected.

The Z statistic is expressed as:

$$Z = \frac{X - np}{\sqrt{npq}}$$

where

X = the number of defects found in the sample
n = the total number of points sampled
p = the expected (target) probability of defects

The Z statistic is compared to the Zα value for expected results corresponding to the confidence level of our analysis. In fact we are computing a critical value for testing the null hypothesis by means of Zα against which the results are compared. If the calculated Z value falls within the acceptance region, the null hypothesis is not to be rejected.


As explained in the technical note above, the Z test statistic will be used in our analysis. It is based on the normal distribution. For a 95% confidence level, the corresponding one-sided Zα value is 1.645.
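The one-sided test can be sketched as follows, assuming the normal approximation Z = (X − np)/√(npq) with q = 1 − p and the critical value 1.645. The function names and the defect counts in the usage lines are hypothetical:

```python
import math


def z_statistic(x_defects, n, p):
    """Normal-approximation test statistic for an observed defect count."""
    q = 1 - p
    return (x_defects - n * p) / math.sqrt(n * p * q)


def reject_null(x_defects, n, p=0.1, z_alpha=1.645):
    """True if the defect count is significantly above the target proportion."""
    return z_statistic(x_defects, n, p) > z_alpha


# Hypothetical counts out of n = 100 sampled standards.
print(z_statistic(16, 100, 0.1), reject_null(16, 100))
print(z_statistic(12, 100, 0.1), reject_null(12, 100))
```

With 16 defects the statistic exceeds the critical value and Ho is rejected; with 12 defects it does not, so Ho is not rejected.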

3.6.1 Maximum Allowable Number of Defects

The following equation is used to calculate the maximum allowable number of defects to be observed in the sample of “n” points:


$$X_{max} = Z_{\alpha}\sqrt{npq} + np = 1.645\sqrt{n(0.1)(0.9)} + n(0.1)$$

Note: It is given that p = 10% = 0.1. This is because we want no more than 10% defects in the entire population.

The total number of defects observed Xdefects should be no greater than the value of Xmax above; otherwise, the calibration interval should be shortened according to section 3.7.


3.6.2 Minimum Number of Defects

The same process can be used to decide if the number of defects is significantly lower, in the statistical sense, than the 10% mark. In this case, we have:


$$X_{min} = -Z_{\alpha}\sqrt{npq} + np = -1.645\sqrt{n(0.1)(0.9)} + n(0.1)$$

The total number of defects observed Xdefects should be no smaller than the value of Xmin above; otherwise, perhaps the calibration interval should be lengthened.
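Both limits come from the same expression, np ± 1.645 √(npq). A minimal sketch, assuming q = 1 − p and a hypothetical function name `defect_bounds`, also shows how the limits map onto whole defect counts:

```python
import math


def defect_bounds(n, p, z_alpha=1.645):
    """Return (x_min, x_max) = n*p -/+ z_alpha * sqrt(n*p*(1 - p))."""
    spread = z_alpha * math.sqrt(n * p * (1 - p))
    return n * p - spread, n * p + spread


x_min, x_max = defect_bounds(n=100, p=0.1)
# Whole-number counts that satisfy x_min <= X <= x_max.
lowest_ok, highest_ok = math.ceil(x_min), math.floor(x_max)
print(x_min, x_max, lowest_ok, highest_ok)
```

Rounding the lower limit up and the upper limit down keeps every accepted count inside the computed limits, which matches the “no smaller than Xmin” and “no greater than Xmax” wording of sections 3.6.1 and 3.6.2.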


3.6.3 Examples

Here are a few examples of the calculations involved:

Example 1: Reliability Target q = 90% , or p = 10%. In other words, 90% of the standards are within control limits at the end of their calibration cycle. Points sampled n=100.

1.645 √(n * 0.1 * 0.9) + (n * 0.1) = X max
1.645 √(100 * 0.1 * 0.9) + (100 * 0.1) = 14.935, rounded down to 14

−1.645 √(n * 0.1 * 0.9) + (n * 0.1) = X min
−1.645 √(100 * 0.1 * 0.9) + (100 * 0.1) = 5.065, rounded up to 6

The above implies that the total number of defects observed Xdefects out of the n=100 standards sampled must be no smaller than 6 and no greater than 14.

Example 2: Reliability Target q = 80% , or p = 20%. In other words, 80% of the standards are within control limits at the end of their calibration cycle. Points sampled n = 100.

1.645 √(n * 0.2 * 0.8) + (n * 0.2) = X max
1.645 √(100 * 0.2 * 0.8) + (100 * 0.2) = 26.58, rounded down to 26

−1.645 √(n * 0.2 * 0.8) + (n * 0.2) = X min
−1.645 √(100 * 0.2 * 0.8) + (100 * 0.2) = 13.42, rounded up to 14

The above implies that the total number of defects observed Xdefects out of the n=100 standards sampled must be no smaller than 14 and no greater than 26.

Example 3: Reliability Target q = 70% , or p = 30%. In other words, 70% of the standards are within control limits at the end of their calibration cycle. Points sampled n=100.

1.645 √(n * 0.3 * 0.7) + (n * 0.3) = X max
1.645 √(100 * 0.3 * 0.7) + (100 * 0.3) = 37.54, rounded down to 37

−1.645 √(n * 0.3 * 0.7) + (n * 0.3) = X min
−1.645 √(100 * 0.3 * 0.7) + (100 * 0.3) = 22.46, rounded up to 23

The above implies that the total number of defects observed Xdefects out of the n=100 standards sampled must be no smaller than 23 and no greater than 37.

3.7 Adjustment of the Calibration Interval


The method used to determine the new calibration interval is based on an exponential reliability model and is similar to Method A-3 described in the document Establishment and Adjustment of Calibration Intervals, Recommended Practice RP-1, January 1996, produced by the National Conference of Standards Laboratories.


The calibration interval may need to be shortened (or lengthened) if the above analysis detects more (or fewer) defects than is statistically expected. The interval is then adjusted using an exponential reliability model as follows:


Equation

where

I1 = the revised calibration interval
I0 = the present calibration interval
R = the target (expected) reliability in terms of numbers of defects per population
= (maximum number of defects allowed) ÷ (total population of standards)
R0 = the observed reliability for the interval I0
= (number defects found at I0) ÷ (number of points sampled at I0)
= Xdefects / n
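For orientation only, the following sketch shows the relation that follows from a generic exponential reliability model R(t) = exp(−λt): if the observed reliability at interval I0 is R0, the interval at which the model predicts reliability R is I0 · ln(R) / ln(R0). This is an assumption about the form of the model, not a reproduction of the equation in this document. In particular, the sketch takes R and R0 as in-tolerance fractions (1 minus the defect fraction), which differs from the wording of the definitions above; the function name and the numbers are hypothetical.

```python
import math


def revised_interval(current_interval, target_reliability, observed_reliability):
    """Interval at which exp(-lambda * t) reaches the target reliability.

    Assumes reliabilities are in-tolerance fractions strictly between 0 and 1.
    """
    for r in (target_reliability, observed_reliability):
        if not 0 < r < 1:
            raise ValueError("reliabilities must lie strictly between 0 and 1")
    return current_interval * math.log(target_reliability) / math.log(observed_reliability)


# Hypothetical: 12-month interval, 90% target, 80% observed in-tolerance.
shorter = revised_interval(12, 0.90, 0.80)
# Hypothetical: 12-month interval, 90% target, 95% observed in-tolerance.
longer = revised_interval(12, 0.90, 0.95)
print(shorter, longer)
```

Under these assumptions, an observed reliability below the target shortens the interval and one above the target lengthens it.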

Please note that lengthening calibration intervals requires modifications to the Weights and Measures Act and Regulations and should not take place unless approved by Senior Management.


Calibration intervals can be shortened as required using the above exponential reliability model.

Summary - At a Glance

To re-establish calibration intervals, the following calculations must be made:

a) Select n ≥ N standards out of the total population of standards and determine their as-found values prior to cleaning and adjustment.

b) Use the following equation to compute the maximum number of defects to be observed:

1.645 √(n * p * q) + (n * p) = X max

If the reliability target is 90%, then the equation to use is:

1.645 √(n * 0.1 * 0.9) + (n * 0.1) = X max

c) Use the following equation to compute the minimum number of defects to be observed:

−1.645 √(n * p * q) + (n * p) = X min

If the reliability target is 90%, then the equation to use is:

−1.645 √(n * 0.1 * 0.9) + (n * 0.1) = X min

d) Count the number of standards found to have a correction from nominal that is beyond the control limits of ± 1/2 tolerance. This number is the total number of defects observed Xdefects.

e) If Xdefects < Xmin or if Xdefects > Xmax, discuss the issue with the Director of the CSL and calculate the new calibration interval using the following expression:

Equation
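Steps b) to e) can be sketched end to end as below, stopping short of the interval formula itself. The data, the 1.0 mg tolerance, the function name `review_interval` and the wording of the returned actions are hypothetical:

```python
import math


def review_interval(corrections, tolerance, p=0.1, z_alpha=1.645):
    """Count defects and compare the count against X min and X max."""
    n = len(corrections)
    x_defects = sum(1 for c in corrections if abs(c) > 0.5 * tolerance)
    spread = z_alpha * math.sqrt(n * p * (1 - p))
    x_min, x_max = n * p - spread, n * p + spread
    if x_defects > x_max:
        action = "more defects than expected: consider shortening"
    elif x_defects < x_min:
        action = "fewer defects than expected: consider lengthening"
    else:
        action = "within expected limits: no change indicated"
    return x_defects, x_min, x_max, action


# Hypothetical sample: 100 as-found corrections (mg), 18 beyond +/- 0.5 mg.
sample = [0.6] * 18 + [0.1] * 82
print(review_interval(sample, tolerance=1.0))
```

The sketch assumes the sample already satisfies n ≥ N from step a); it does not check the sample size.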


References

1. Chao, Lincoln L., Introduction to Statistics, Brooks/Cole Publishing Company, Monterey, California, 1980.

2. ISO Standards Handbook, Statistical methods for quality control, vol. 1: Terminology and symbols, Acceptance sampling, fourth edition, 1995.

3. ISO Standards Handbook, Statistical methods for quality control, vol. 2: Measurement methods and results, Interpretation of statistical data, Process control, fourth edition, 1995.

4. Joint International Committee ISO/IEC/OIML/BIPM- TAG-4, Guide to the Expression of Uncertainty in Measurement, first edition, 1993.

5. Johnson, Robert, Elementary Statistics, third edition, North Scituate, Massachusetts: Duxbury Press, 1980.

6. Taylor, John K., and Oppermann, Henry V., Handbook for the Quality Assurance of Metrological Measurements, NBS Handbook 145, 1986.

7. Wonnacott, Thomas H., and Wonnacott, Ronald J., Introductory Statistics, New York: John Wiley & Sons, Inc., 1969.