Introduction to Statistics with M&Ms Using EXCEL

Share the mean and standard deviation of all the sets of M&Ms.  Compute the mean of the means for plain and peanut M&Ms.  The standard deviations can be combined by squaring them, adding the squares together, dividing by the number of sets, and then taking the square root.  Use these class means and standard deviations to plot the Gaussian distributions for plain M&Ms and peanut M&Ms.

You need to get four sets of data from other students: two sets of the same kind of M&Ms as yours, and two of the other kind.  You will do F and t tests for these sets against your own set.  When you test a set that matches the type you have, say plain vs plain, the F test should say that the variances are the same, and you will then do a t test assuming equal variances.  When you test two different kinds of M&Ms, plain vs peanut, the F test should say that the variances are not equal, and you should then do a t test assuming unequal variances.

Sample and Measurements.  Each student will receive a bag containing about 20 M&Ms; they may be plain or peanut M&Ms.  Measure the mass of each M&M in grams on the new Mettler-Toledo XS204 balances in the balance room.  In your notebook record the color and mass of each M&M.  Record all digits from the balance!  Be sure to make the M&Ms "disappear" after you have weighed them.

Analysis.  1 Descriptive Statistics:  Compute the "descriptive statistics" of your data.  This will usually include the count or sample size, the average, the standard deviation, the standard error (aka the standard deviation in the mean), and the 95% confidence limits.  The minimum, the maximum, and the range are also useful.

    Excel.  Probably the easiest way to do this is to use the "Descriptive Statistics" function in the Data Analysis Tool Pak in Excel.  Place all your data in a column of a worksheet.  Then click on "Data Analysis ..." in the Tools menu of Excel, and select "Descriptive Statistics" from the alphabetical list.  The Descriptive Stats routine will want you to select a range for your data.  Check the box for "Summary Statistics".  The "New Worksheet Ply" option means that the output will be placed on a brand new sheet in the workbook.  That probably isn't what you want to do, but go ahead and give it a try.  A better option is to push the "Output Range" button, which will cause the output to be placed on the current worksheet; however, you must select a 15 x 2 set of cells for the output.  Then click the OK button and the results will spill out onto the sheet.  There will be a lot of results!  Remove unnecessary statistics such as "skewness" from the table before you copy it into Word.
If the Data Analysis Tool Pak is not available on your computer, you can use the functions in the spreadsheet.  The 95% confidence limits are a range defined by the average plus or minus t times the standard error:
                                  \langle x \rangle \pm \frac{ts}{\sqrt{n}}
The correct value of t can be obtained using TINV(0.05, n-1), where n-1 is the number of degrees of freedom.
Do NOT use the function CONFIDENCE in EXCEL to compute confidence limits!
The range can be computed as MAX - MIN.
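As a cross-check on the spreadsheet, the same descriptive statistics can be sketched in Python.  This is only an illustration under stated assumptions: the function name describe and the sample masses are made up, and the t value is passed in by hand (it is what TINV(0.05, n-1) would return) because the Python standard library has no t distribution.

```python
import math
import statistics

def describe(masses, t_crit):
    """Summary statistics for a list of masses in grams.

    t_crit is the two-tailed Student t value for n-1 degrees of freedom,
    i.e. the number TINV(0.05, n-1) gives in Excel.  It is supplied by
    the caller rather than computed here.
    """
    n = len(masses)
    mean = statistics.mean(masses)
    s = statistics.stdev(masses)      # sample standard deviation (n-1 divisor)
    se = s / math.sqrt(n)             # standard error of the mean
    half_width = t_crit * se          # the t*s/sqrt(n) term
    return {
        "count": n,
        "mean": mean,
        "std_dev": s,
        "std_error": se,
        "ci_low": mean - half_width,
        "ci_high": mean + half_width,
        "min": min(masses),
        "max": max(masses),
        "range": max(masses) - min(masses),
    }

# Made-up masses for illustration only, not real measurements.
example = [0.8712, 0.9023, 0.8835, 0.9141, 0.8650]
# 2.776 is the tabulated two-tailed 95% t value for 4 degrees of freedom.
stats = describe(example, t_crit=2.776)
```

The dictionary keys mirror the rows you would keep from the Excel "Descriptive Statistics" output.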

Use the mean and standard deviation to plot a Gaussian curve for your M&Ms using NORMDIST.  Once you have used the data to find the curve parameters, you don't need the data for plotting.  The function is defined by its mean and standard deviation, so there is no need to try to compute the function for each M&M mass; the function should be plotted over whatever range of numbers lets you see the graph best.   
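The idea that the curve depends only on the mean and standard deviation, not on the individual masses, can be sketched in Python as follows.  The function name gaussian_curve, the choice of plotting over the mean plus or minus four standard deviations, and the example parameters are all assumptions made for illustration; each y value corresponds to NORMDIST(x, mean, std_dev, FALSE) in Excel.

```python
from statistics import NormalDist

def gaussian_curve(mean, std_dev, n_points=101, width=4.0):
    """Return (xs, ys) for a normal curve from mean - width*std_dev
    to mean + width*std_dev, evaluated at n_points evenly spaced x values."""
    dist = NormalDist(mu=mean, sigma=std_dev)
    lo = mean - width * std_dev
    step = 2 * width * std_dev / (n_points - 1)
    xs = [lo + i * step for i in range(n_points)]
    ys = [dist.pdf(x) for x in xs]    # probability density at each x
    return xs, ys

# Hypothetical mean and standard deviation, chosen only to show the call.
xs, ys = gaussian_curve(mean=0.88, std_dev=0.03)
```

The x values are generated from the curve parameters alone, which is why the original data are no longer needed for plotting.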

Sort the M&M data by color.  Do a t test on the two most frequent colors in your set of M&Ms.  For example, test whether the average masses of the blue and yellow M&Ms are significantly different.  Use the instructions below for the t test assuming equal variances.
 

                2 F tests and t tests:  The F test answers the question "Is my standard deviation significantly different from the standard deviation of another set of data?"  The t test answers the question "Is my mean value significantly different from the mean value of another set of data?"  You must compare the standard deviation and mean of your data with those of four other sets of data: two that are the same type of M&Ms as your own, and two that are the other type of M&Ms!  For example, if you have peanut M&Ms, you will compare your data set against two sets of plain M&Ms and also against two sets of peanut M&Ms.  You must always do the F test first, because its result will determine how you do the t test.

    Excel.  You will find that the F test is available in the Data Analysis Tool Pak.  You must have both sets of data on the same spreadsheet to perform the F test.  A 3 x 10 set of cells is needed to contain the output.  Consult your Quant book to interpret the results correctly, because they can be confusing.  The test will give you two F values, F and Fcritical.  If F is farther from 1.00 than Fcritical, then the standard deviations are different.  Arrange the test so that F comes out greater than 1.00, then check to see if F>Fcrit; if so, then the standard deviations are significantly different.  If not, then they are not significantly different.
   If the Data Analysis Tool Pak is not available on your computer, you can use the FTEST function in EXCEL.  The function returns α, the probability of an insignificant difference, but it does not give the F values.  Use FTEST(range1, range2) to find α.  Use FINV(α, df1, df2) to find the value of F and FINV(0.05, df1, df2) to find the value of Fcrit.  The dfs are the number of degrees of freedom in each data set, n-1.  The first value df1 must be the set with the greater standard deviation.  If α<0.05, then the difference is significant, and you should find that F>Fcrit.
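The rule "arrange the test so that F comes out greater than 1.00" can be sketched in Python as below.  This is a minimal illustration, not the Excel routine itself: it computes F directly as the ratio of the two sample variances, the function names and data are made up, and Fcrit is passed in from a table or from FINV(0.05, df1, df2) because the standard library cannot compute it.

```python
import statistics

def f_statistic(set_a, set_b):
    """Return (F, df1, df2) with the larger variance in the numerator,
    so that F is always at least 1."""
    var_a = statistics.variance(set_a)    # sample variance, s squared
    var_b = statistics.variance(set_b)
    if var_a >= var_b:
        return var_a / var_b, len(set_a) - 1, len(set_b) - 1
    return var_b / var_a, len(set_b) - 1, len(set_a) - 1

def variances_differ(f_value, f_crit):
    """True when F exceeds the critical value supplied by the caller."""
    return f_value > f_crit

# Made-up masses for illustration only.
plain = [0.86, 0.88, 0.90, 0.92]
peanut = [2.3, 2.5, 2.7, 2.9]
f_value, df1, df2 = f_statistic(plain, peanut)
# 9.28 is the tabulated 5% critical value of F for (3, 3) degrees of freedom.
different = variances_differ(f_value, f_crit=9.28)
```

Because the function swaps the sets when needed, df1 always belongs to the set with the greater standard deviation, as the instructions require.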
The t test is also in the Data Analysis Tool Pak; however, you must select the t test based on the results of the F test!  Choose either  t Test: Two Sample Assuming Equal Variances or t Test: Two Sample Assuming Unequal Variances depending on whether the F test showed no difference or a difference.  You will need a 3 x 14 set of cells for the output.  Again, consult your Quant book to interpret the results.  There will be three T values: T stat, T critical one-tail, and T critical two-tail.  Ignore the one-tail value.  If the absolute value of T stat is greater than T critical two-tail, then the means are significantly different. 
   If the Data Analysis Tool Pak is not available on your computer, you can use the TTEST function in EXCEL.  The function returns α, the probability of an insignificant difference in the means, which is often called the "P-value".  Use TTEST(range1, range2, 2, 2) to do an Equal Variance test, and TTEST(range1, range2, 2, 3) to do an Unequal Variance test.  If α<0.05, then the difference in the means is significant; if not, then the means are not significantly different.  To compute tcrit, use TINV(0.05, df) and to compute tstat, use TINV(α, df).  For the t test df will be n1 + n2 - 2, the combined degrees of freedom.  If α<0.05, then you should find that tstat>tcrit.
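For readers who want to see what the two kinds of t test compute, here is a hedged Python sketch.  The function names and data are invented for illustration.  The equal-variance version uses the pooled variance with n1 + n2 - 2 degrees of freedom as described above; the unequal-variance version uses the standard Welch formula for the degrees of freedom, which is an assumption added here and is not spelled out in this handout.  The critical t value is supplied by the caller, for example from TINV(0.05, df).

```python
import math
import statistics

def t_equal_variance(a, b):
    """Pooled-variance t statistic and its degrees of freedom, n1 + n2 - 2."""
    n1, n2 = len(a), len(b)
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * statistics.variance(a)
                  + (n2 - 1) * statistics.variance(b)) / df
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(
        pooled_var * (1 / n1 + 1 / n2))
    return t, df

def t_unequal_variance(a, b):
    """Welch t statistic with Welch-Satterthwaite degrees of freedom."""
    n1, n2 = len(a), len(b)
    q1 = statistics.variance(a) / n1
    q2 = statistics.variance(b) / n2
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(q1 + q2)
    df = (q1 + q2) ** 2 / (q1 ** 2 / (n1 - 1) + q2 ** 2 / (n2 - 1))
    return t, df

def means_differ(t_stat, t_crit):
    """Two-tailed decision: compare |t stat| with the critical value."""
    return abs(t_stat) > t_crit

# Made-up masses for illustration only.
mine = [0.86, 0.88, 0.90, 0.92]
theirs = [0.87, 0.89, 0.91, 0.93]
t_pooled, df_pooled = t_equal_variance(mine, theirs)
t_welch, df_welch = t_unequal_variance(mine, theirs)
# 2.447 is the tabulated two-tailed 95% t value for 6 degrees of freedom.
significant = means_differ(t_pooled, t_crit=2.447)
```

Which of the two functions applies is decided by the F test, exactly as with the two Excel t test options.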
  
                3 Pooled Data:  All the M&M data should be pooled so that the mean and standard deviation for all plain M&Ms and for all peanut M&Ms can be computed for the entire class.  You could merge all the data into one big spreadsheet, but all you really have to do is calculate a weighted mean and a weighted standard deviation.  This requires the mean x_i, the standard deviation s_i, and the number of data n_i in each set of data.  Square the standard deviation s_i to get the variance Var_i of each set.  Then
                                          wt.av = \frac{\sum n_i x_i}{\sum n_i}            wt.std.dev = \sqrt{\frac{\sum Var_i}{\#sets}}
Use the weighted mean and weighted standard deviation to construct normal Gaussian curves for each population.  These curves are now functions, so there is no need to try to compute them for each M&M mass; they should be plotted over whatever range of numbers lets you see the graphs.  The "right" ranges won't be the same for the two types of M&Ms!  But they both need to appear on the same graph.
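The pooling formulas above can be sketched in Python as follows.  The function name pool_sets and the class summaries are hypothetical; the calculation simply follows the two formulas as written, weighting the means by n_i and taking the square root of the average variance.

```python
import math

def pool_sets(summaries):
    """Combine per-student summaries into class values.

    summaries is a list of (n_i, mean_i, std_dev_i) tuples, one per data set.
    Returns (weighted mean, weighted standard deviation).
    """
    total_n = sum(n for n, _, _ in summaries)
    weighted_mean = sum(n * mean for n, mean, _ in summaries) / total_n
    # Average of the variances (s_i squared) over the number of sets.
    mean_variance = sum(s ** 2 for _, _, s in summaries) / len(summaries)
    return weighted_mean, math.sqrt(mean_variance)

# Made-up summaries for two sets of plain M&Ms, for illustration only.
class_plain = [(20, 0.88, 0.03), (10, 0.91, 0.04)]
wt_av, wt_std_dev = pool_sets(class_plain)
```

Only the three summary numbers from each student are needed, so the raw data never have to be merged into one spreadsheet.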

Report.  From the MoCoSin network, you should also look at the lab report form for this experiment.  Under Catwoman\chemistry\courses\ch220, look at the lab report called MandMs.doc.  It has already been written for you--to some extent.  There are places in the document file where you are allowed to enter data or conclusions.  Do not think that you can just cut and paste results from the descriptive statistics, the F test, or the t test directly from Excel into the lab report document.  You must edit the results--remove any numbers that are irrelevant to the reader, such as skewness or T critical one-tail.