ANOVA Variance Component – Urgent
 This topic has 9 replies, 6 voices, and was last updated 12 years ago by Remi.


September 13, 2006 at 4:27 am #44588
I am quite confused about an ANOVA result and the corresponding variance components analysis. Here is the case: a one-factor ANOVA shows a significant difference between factor levels (20 samples per level), but at the same time the variance components analysis shows that the variation due to the factor covers only 30%. I am lost here and wonder: shouldn't it be a high percentage? How do we explain this?
Would appreciate your help.
Thanks. Holly
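Holly's situation can be reproduced in a minimal sketch (constructed, hypothetical data; scipy assumed): a one-way ANOVA that is clearly significant while the factor accounts for only about 30% of total variation.

```python
# Hypothetical illustration: a significant one-way ANOVA where the
# factor still explains only a modest share of total variation.
import numpy as np
from scipy import stats

within = np.linspace(-2.5, 2.5, 20)                  # within-level scatter, mean 0
levels = [within + 0.0, within + 1.2, within + 2.4]  # three factor levels, 20 obs each

f_stat, p_value = stats.f_oneway(*levels)

grand = np.concatenate(levels)
ss_total = ((grand - grand.mean()) ** 2).sum()
ss_between = sum(len(g) * (g.mean() - grand.mean()) ** 2 for g in levels)
pct_factor = ss_between / ss_total                   # share of variation due to the factor

print(f"p = {p_value:.4g}, factor explains {pct_factor:.0%} of total variation")
```

The p-value is well below 0.05 while the factor's share of the sum of squares is only about 29%, exactly the pattern described in the question: "significant" measures evidence against equal means, not the size of the variance contribution.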
September 13, 2006 at 7:06 am #143162
Adrian.P.Smith
“Significant” in this context means statistical significance (i.e., the dataset produces a sufficiently low p-value), not “large”. You need to look for other factors to explain the rest of your variation.
Rgds,
Adrian
September 14, 2006 at 7:37 am #143240
Professionals,
Need your help.
Does anyone know where to find articles on applying ANOVA to sampling plans for control charts? Or would you share your experience with me on this?
Thanks a lot.
Holly
September 15, 2006 at 3:31 am #143312
Holly,
Unless I have missed a big chunk of statistics, the three concepts are not directly related to each other. Sampling is needed for both ANOVA and process capability studies, but with ANOVA you are looking primarily at experimental design. Control chart design is based on different considerations regarding selection of the sample size, control limits, and frequency of testing (see Montgomery: Introduction to Statistical Quality Control, 3rd edition).
In regard to sampling you must specify both sample size and frequency of sampling.
Sample size is driven by the following considerations: the size of the shift in the process you want to detect; the allocation of sampling effort (small samples at short intervals or larger samples at longer intervals); the average run length, i.e., the average number of points that must be plotted before a point indicates an out-of-control condition; and random sampling versus systematic sampling.
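The average run length mentioned above can be sketched numerically. This is a rough illustration (scipy assumed, not from the original post): for a Shewhart chart with L-sigma limits, the ARL is the reciprocal of the probability that a plotted point falls outside the limits.

```python
# Average run length (ARL) for a Shewhart X-bar chart with L-sigma limits.
# ARL = 1 / P(point outside limits); a mean shift of delta (in units of the
# process sigma) with subgroup size n moves the plotted mean by delta*sqrt(n).
from scipy.stats import norm

def arl(delta=0.0, n=1, L=3.0):
    """Expected number of points plotted until a signal."""
    z = delta * n ** 0.5
    p_signal = norm.sf(L - z) + norm.cdf(-L - z)
    return 1.0 / p_signal

print(round(arl(0.0)))       # in-control ARL for 3-sigma limits, ~370
print(round(arl(1.0, n=4)))  # 1-sigma shift detected with subgroups of 4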
Many statisticians will argue with some of Montgomery’s points (he is an engineer, not a statistician), but it will be of value to review his book, pp. 132–146; a detailed discussion of each of the points mentioned above is provided in those pages. Wheeler and Chambers (Understanding Statistical Process Control, 1992) discuss the importance of sampling and the difference in sampling methods between “population” statistics and “time-related” statistics in Chapter 3 (the two terms are mine: Wheeler and Chambers contrast data from a “well-defined phenomenon” with data used for predictions of what will occur in the future).
In general, I prefer Wheeler and Chambers’ approach, but your question is a general one about sampling and control charts. ANOVA has very different purposes than control chart techniques. I hope this helps.
September 15, 2006 at 3:44 am #143313
Thanks Hans,
Let me explain my question further and you will have a full picture of it.
In a training material, most of the sampling plans are nested, and ANOVA is used very often to identify and quantify the contributors to the total variation in a fully nested design, and thus to determine a sampling plan for a control chart: the sample size and frequency are dictated by the factors found to be significant in the variance components analysis. Samples should be selected so that if assignable causes are present, the chance for differences between subgroups is maximized, while the chance for differences due to those assignable causes within a subgroup is minimized.
Has anyone used nested ANOVA to design the sampling plan for a control chart?
Thanks.
Holly
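The fully nested analysis described above can be sketched with the classical method-of-moments estimates. This is a minimal, hypothetical example (balanced design, synthetic data, numpy assumed): for lots with parts nested within lots, the within variance is MS_within and the between variance is (MS_between − MS_within) / parts-per-lot.

```python
# Method-of-moments variance components for a balanced, fully nested design:
# hypothetical lots (between) with parts nested within each lot.
import numpy as np

rng = np.random.default_rng(1)
lots, parts = 10, 5
lot_effect = rng.normal(0, 2.0, size=lots)                       # between-lot sd = 2
data = lot_effect[:, None] + rng.normal(0, 1.0, (lots, parts))   # within-lot sd = 1

lot_means = data.mean(axis=1)
grand = data.mean()
ms_between = parts * ((lot_means - grand) ** 2).sum() / (lots - 1)
ms_within = ((data - lot_means[:, None]) ** 2).sum() / (lots * (parts - 1))

var_within = ms_within
var_between = max((ms_between - ms_within) / parts, 0.0)  # clip negative estimates
total = var_within + var_between
print(f"between-lot: {var_between / total:.0%}, within-lot: {var_within / total:.0%}")
```

The percentage breakdown is what would drive the sampling plan: a dominant between-lot component argues for sampling more lots less intensively, while a dominant within-lot component argues for larger subgroups.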
September 15, 2006 at 4:27 am #143314
Holly,
Yes, there is a whole discussion about “statistical control as a nested design” (the design is analyzed via ANOVA for nested designs; now your original post makes sense). When you Google for it you will find some literature, but I assume the training materials will cover this approach from a practical point of view. I have not had experience with this approach. Maybe someone else with more background in manufacturing applications will help you out. I assume Robert will have something to say about it. Regards.
September 15, 2006 at 12:14 pm #143318
John Noguera
Holly,
You have to be careful in comparing results from different analyses. One-factor ANOVA assumes that your factor is fixed, and the null hypothesis is equality of means. On the other hand, variance components analysis assumes that your factor(s) are random, and the null hypothesis is that a variance component equals zero.
Having said that, using your variance components study to determine how to apply control charts, you now know that 70% of your variation is “unexplained.” So are there other factors you can consider, such as temporal, location, operator, equipment, etc.? If you are able to reanalyze or redo your SOV (sources of variation) study, you can include these to determine the largest component and Pareto the variance components. Hans mentioned specialized control charts for variance components, which could then be applied.
If you are just getting started with SPC, I would probably keep it simple and use a classical Xbar & S chart, looking for assignable causes. As you mature in the use of the tool, you can then look at more advanced techniques such as variance components control charts.
November 26, 2009 at 1:59 pm #187088
Much of the Six Sigma DMAIC methodology is concerned with finding differences: Do people do a certain job the same way, or are there differences? Will a particular change make a difference in the output? Are there differences in where and when a problem occurs?
In most cases, the answer to all these questions is yes. People will do things differently. Process changes will affect output. A problem will appear in some places and not others.
That is why the more important question is often “Does the difference really matter?” (Or, as statisticians would say, “Are the differences significant?”) When trying to compare results across different processes, sites, operators, etc., a hypothesis-testing tool that can help answer that question is the analysis of variance (ANOVA).
While the theory behind ANOVA can get complicated, the good news for Six Sigma practitioners with little experience is that most of the analysis is done automatically by statistical software, so no one has to crunch a lot of numbers. Better still, the software usually produces a simple chart that visually depicts the degree of difference between the items being compared, making it easy to interpret and explain to others.
A simple case study shows ANOVA in action.
The Question: Which Site Is Fastest?

Table 1: Collected Data (time in minutes to complete five loan applications)

Site A   Site B   Site C
  15       28       26
  17       25       23
  18       24       20
  19       27       17
  24       25       21

In order to optimize the loan application process across three branches, a company wants to know which of the three locations handles the process most efficiently. Once it determines which site is consistently fastest, the company plans to study what that site does and adapt what it learns to the other sites. Table 1 shows a sample of the data collected. (In real life, more than five data points per location would likely be collected, but this is a simple example to illustrate the principles.)
A quick glance at this data would probably lead to the conclusion that Site B is considerably slower than Site A. (The differences are usually much harder to detect when there are a lot more data points.) But is it different from Site C? And are A and C really different?
The ANOVA Analysis
To understand the calculations performed in an ANOVA test, a person would need to study statistical topics like “degrees of freedom” and “sum of squares.” Fortunately, to interpret the results, a person only needs to understand three basic concepts:

Mean: The mathematical average of a set of values.
Standard deviation: A value that represents a typical amount of variation in a set of data. (“Sigma” is the statistical notation used to represent one standard deviation; the term “six sigma” is used to indicate that a process is so good that six standard deviations, three above and three below the mean, fit within the specification limits.)
p-value: A term used in hypothesis testing to indicate how likely it is that the observed differences could have occurred by chance if the items being compared were actually the same. A low p-value (often anything below 0.05) indicates that this is very unlikely. (Or, as non-statisticians would say, “They are different.”)
The output from the statistical software is in two parts. Figure 1 shows the first portion.

Figure 1: Numerical Output — One-Way ANOVA

As can be seen, the p-value here is 0.007, a very small value. That shows that the three sites are not all the same, but it does not indicate in what ways they differ. For that, the second part of the ANOVA output needs to be examined (Figure 2).
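The p-value reported above can be reproduced from the Table 1 data. This is a sketch using scipy; the article's software is not named, but the one-way ANOVA result is the same.

```python
# One-way ANOVA on the three loan-application sites from Table 1.
from scipy import stats

site_a = [15, 17, 18, 19, 24]
site_b = [28, 25, 24, 27, 25]
site_c = [26, 23, 20, 17, 21]

f_stat, p_value = stats.f_oneway(site_a, site_b, site_c)
print(f"F = {f_stat:.2f}, p = {p_value:.3f}")  # p = 0.007, matching Figure 1
```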
Figure 2: Graphical Output — Boxplot
The graphical output from the ANOVA analysis is easy to interpret once the format being used by the statistical program is understood. The example in Figure 2 is a boxplot, typical output from statistical software.
The two key features of a boxplot are the location of the circles, denoting the mean or average for each site, and the range of the shaded gray boxes, which are drawn at plus and minus one standard deviation. Compare where the circle (average) for each item falls relative to the gray boxes for the other items. If the two overlap, then they are not “statistically different.” If they do not overlap, it can be concluded that they are different.
In this case, for example, the circle (average) for Site C falls within the values marked by the gray box for Site A. So based on this data, Site A is not statistically different from Site C. However, the circle (average) for Site B does not fall within the graybox values for either Site A or Site C, so it is significantly different from those sites.
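The overlap rule described above can be checked directly from the Table 1 numbers. This is a sketch of that reading of the chart (standard library only), not a substitute for a proper comparison test.

```python
# Does one site's mean fall within another site's mean +/- 1 standard deviation?
# This mirrors the visual overlap check described for the boxplot in Figure 2.
import statistics

sites = {
    "A": [15, 17, 18, 19, 24],
    "B": [28, 25, 24, 27, 25],
    "C": [26, 23, 20, 17, 21],
}
summary = {k: (statistics.mean(v), statistics.stdev(v)) for k, v in sites.items()}

def overlaps(x, y):
    """True if site x's mean lies within site y's mean +/- 1 stdev box."""
    mean_x = summary[x][0]
    mean_y, sd_y = summary[y]
    return mean_y - sd_y <= mean_x <= mean_y + sd_y

print(overlaps("C", "A"))  # C's mean falls inside A's box: not distinguishable
print(overlaps("B", "A"))  # B's mean falls outside A's box: different
```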
Acting on the Results of ANOVA
Knowing that the goal was to optimize the loan application times, what path should be taken, given these results? Odds are that there are major differences in how Site B handles the loan applications compared to Site A and Site C. At the very least, the company would want to bring Site B up to the speed of the other two sites. Thus, the first step would be to compare the loan application processes across all three sites and see how Site B differs in its policies or procedures. Once all three sites were operating the same way, then the company can look for further improvements across the board.
Conclusion: An Aid for the Improve Phase
In Six Sigma projects, one of the biggest challenges is often determining whether the observed differences are significant enough to warrant action. One often-overlooked tool that helps project teams reach definitive conclusions is ANOVA. Analysis of variance is appropriate whenever continuous data from two or more groups or categories are being compared.
A better understanding of the calculations used to generate the numerical and graphical results can be found in the book Statistics for Experimenters by George Box et al. Or, those using ANOVA for the first time should be able to get help setting up the data in a statistical software program from an experienced Black Belt or Master Black Belt.
However, as shown in the example, both the numerical and graphical output from the ANOVA tests are easy to interpret. The knowledge gained will help the project team plan its improvement approach.
November 26, 2009 at 2:35 pm #187090
Sorry Sathya, but your explanation of the boxplot conclusions is WRONG.
Duplicate the datasets several times (make the sample size artificially high). The boxplots will not look different; the mean and standard deviation do not change; but the p-values will get arbitrarily low: the larger you make N, the smaller the p-value gets.
Remi
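Remi's point is easy to demonstrate on the article's own Table 1 data. This sketch (scipy assumed) replicates each sample several times: the group means are unchanged and the spread is essentially unchanged, yet the p-value shrinks with every replication.

```python
# Replicating the same observations inflates N without adding information:
# the boxplots stay the same, but the ANOVA p-value keeps dropping.
from scipy import stats

site_a = [15, 17, 18, 19, 24]
site_b = [28, 25, 24, 27, 25]
site_c = [26, 23, 20, 17, 21]

for copies in (1, 2, 4, 8):
    groups = [g * copies for g in (site_a, site_b, site_c)]  # duplicate the lists
    _, p = stats.f_oneway(*groups)
    print(f"{copies} copies: n per site = {5 * copies}, p = {p:.2g}")
```

This is why visual overlap on a boxplot is not a significance test: the picture carries no information about sample size, while the p-value depends on it directly.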
The forum ‘General’ is closed to new topics and replies.