Monday, April 27, 2009

Too many data!!!!

I finally finished entering my data into Excel, and it was very time-consuming. It took me about 3 hours to enter 300 data points, because for each city on my list I had to use Google Maps to find out how far it is from New Orleans. I had to do that about 75 times for the different places that I have. I then used the Chart Wizard in Excel to create my histogram. I needed to do a five-number summary of my data, but I did not know how to do that, so I used the amazing search engine Google to find out how to compute the five-number summary in Excel. Just a note: a five-number summary is a list of the minimum, the first quartile (25th percentile), the median, the third quartile (75th percentile), and the maximum of my data. I found an awesome website that helped me with this problem: http://www.stat.unc.edu/teach/rosu/Stat31/E1_43.html. It is so easy to compute the five-number summary of a data set. All I have to do is type the QUARTILE function into an Excel cell, highlight all my data, and pick which of the five numbers I want (0 for the minimum through 4 for the maximum), and voila, my five-number summary is given to me. As for the box plot, I will use the website that I found last week. Just a note: a box plot is a graph of the five-number summary, and my box plot will show outliers. I also needed to find the mean and the standard deviation of my data, and I did not remember how to do that in Excel. So again I used the amazing search engine Google to find out how to do this. http://www.gifted.uconn.edu/siegle/research/Normal/stdexcel.htm is a very thorough and good website. It teaches how to find the mean and the standard deviation step by step with pictures. Just a note: the mean is the average value of a data set, and the standard deviation is a measure of variability or dispersion around the mean. The more spread out the data, the higher the standard deviation.
I also needed to do a one-sample t test on my data. A one-sample t test checks a claim made about the mean of a population. It is used for a simple random sample of size n from a population that has the Normal distribution with mean μ and standard deviation σ. The test statistic has the t distribution with n-1 degrees of freedom. I used Google to find out how to do a one-sample t test in Excel, but I could not find anything, so I used the official website for Excel: http://office.microsoft.com/en-us/excel/HP052016751033.aspx?pid=CH062528011033. I had to install the Analysis ToolPak, which took some time and was confusing. So I used another website, http://www.bioss.ac.uk/smart/unix/mbasexc/slides/sl18.htm, and it really helped. All in all, I am glad that I have finished putting together all my data analysis, and I am ready to move on to creating a “hot” PowerPoint.
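
For readers who want to check these numbers outside Excel, here is a minimal Python sketch of the same calculations. The distances and the claimed mean of 40 miles are made up for illustration and are not the data from this project. As far as I can tell, `method="inclusive"` interpolates the same way Excel's QUARTILE does, but treat that as an assumption.

```python
from math import sqrt
from statistics import mean, quantiles, stdev

# Hypothetical distances in miles from New Orleans (not the real data).
distances = [10, 20, 30, 40, 50, 60, 70, 80, 90]

# Five-number summary: minimum, first quartile, median, third quartile, maximum.
q1, q2, q3 = quantiles(distances, n=4, method="inclusive")
five_number_summary = (min(distances), q1, q2, q3, max(distances))

average = mean(distances)
spread = stdev(distances)  # sample standard deviation (divides by n - 1)

# One-sample t statistic for a made-up claim that the mean is 40 miles.
claimed_mean = 40
n = len(distances)
t_statistic = (average - claimed_mean) / (spread / sqrt(n))
degrees_of_freedom = n - 1

print("Five-number summary:", five_number_summary)
print("Mean:", average, "Standard deviation:", round(spread, 2))
print("t =", round(t_statistic, 3), "with", degrees_of_freedom, "degrees of freedom")
```

The t statistic still has to be compared against a t table (or a statistics package) with n-1 degrees of freedom to get a P-value; the standard library does not include the t distribution.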

Wednesday, April 8, 2009

Analyzing Data

Now I am ready to analyze my data. First of all, I need to gather all my data into one pile, since I have it all over the place. Then I have to spend hours upon hours calculating the distance in miles from each place to New Orleans. I will use Google Maps to help me calculate the miles. After that I have to put all of my data into Excel. This, too, will be a time-consuming process. I will have two columns: one column will have the number of miles, and the other column will have the number of people. After that I will have to arrange the data in ascending order by miles. This is an extremely easy process because all I have to do is click a button, and voila, the data will be in ascending order by miles. Then I had to figure out how I was going to make a box plot. Well, I figured out that making a box plot in Excel is very complicated. So I used this website http://www.shodor.org/interactivate/activities/BoxPlot/?version=1.6.0_10&browser=Mozilla&vendor=Sun_Microsystems_Inc. to help me with making a box plot. This interactive box plot really simplifies the process of making one. All I have to do is copy and paste the data into it, and tada, I get a box plot that can identify outliers. Another graph I need to do is a histogram. I have learned how to make a histogram in Excel before, so it will not be too hard. The only problem I might have is figuring out the scale for my categories. I want to use a scale that seems appropriate for my data and does not significantly skew it. I also want a scale that lets my data look close to a Normal distribution. However, I am thinking that I might have to exclude outliers so that my data have a Normal distribution. I will probably encounter a lot of problems with analyzing my data, but I will try my best to do what I want to do.
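
The sorting, outlier, and histogram steps above can also be sketched in a few lines of Python. The rows below are invented for illustration, the 500-mile bin width is an arbitrary choice, and the 1.5 × IQR rule for outliers is the common textbook convention; I am assuming, not confirming, that the interactive box plot uses the same rule.

```python
from collections import Counter
from statistics import quantiles

# Hypothetical (miles, people) rows, standing in for the two Excel columns.
rows = [(900, 4), (80, 12), (350, 7), (4800, 1), (530, 5), (190, 9)]

# Arrange the rows in ascending order by miles.
rows_sorted = sorted(rows, key=lambda row: row[0])

# Expand to one distance per person so each person counts once.
distances = [miles for miles, people in rows_sorted for _ in range(people)]

# Flag outliers with the 1.5 * IQR rule (an assumed convention).
q1, _, q3 = quantiles(distances, n=4, method="inclusive")
iqr = q3 - q1
low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
outliers = sorted({d for d in distances if d < low_fence or d > high_fence})

# Histogram counts using equal-width bins, keyed by each bin's lower edge.
bin_width = 500
histogram = Counter((d // bin_width) * bin_width for d in distances)

print("Outliers:", outliers)
for lower_edge in sorted(histogram):
    print(f"{lower_edge}-{lower_edge + bin_width - 1} miles: {histogram[lower_edge]} people")
```

Trying a few different values of `bin_width` is a quick way to see how much the choice of scale changes the shape of the histogram.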

Monday, April 6, 2009

Data Collection?

I am already done with my data collection. I asked about 300 people what city and state they are from. I got people from all around the United States. There were people from New York, Florida, California, Illinois, and many more. There were even people from other countries such as Romania, Singapore, Germany, Japan, and Korea. I had lots of fun while collecting my data. I got to learn a little bit more about my customers and their cultures. The next thing I need to do is analyze my data. I have found a good website to help me in this process. Here is the link: http://home.ubalt.edu/ntsbarsh/stat-data/Topics.htm#rrtopic. This website gave me all sorts of information I need to analyze my data. It gave me information on things such as the Central Limit Theorem, which basically states that if a random sample has more than about 30 observations, then the distribution of the sample mean approximately follows the Normal distribution. This website also talked about removing outliers, because outliers can greatly affect the distribution of a data set. It also talks about significance tests and confidence intervals and what they are used for. There is also a very interesting section in this website called Kind of Lies: Lies, Damned Lies and Statistics. When I started taking AP Statistics, we had to read a book called How to Lie with Statistics by Darrell Huff. It was a very interesting book, and it taught me how to look at statistical facts more closely than I did before. Well, this website gave examples of “statistical lies,” and it helped remind me that I should not do this in my data analysis. I am planning to do a box plot, a histogram, and a one-sample t test. I am also planning to give the five-number summary, the outliers, the mean, and the standard deviation of my data. I hope that my data analysis will help people understand my data better.
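
The Central Limit Theorem is easier to believe after watching it happen. Below is a small simulation sketch in Python; the skewed population (an exponential distribution with a mean of 500), the sample size of 30, and the 2,000 repetitions are all assumptions chosen for illustration and have nothing to do with the survey data.

```python
import random
from math import sqrt
from statistics import mean, stdev

random.seed(2009)  # fixed seed so the run is repeatable

POPULATION_MEAN = 500.0  # an exponential population has SD equal to its mean
SAMPLE_SIZE = 30
NUM_SAMPLES = 2000

def draw_sample_mean():
    """Draw one random sample from the skewed population and return its mean."""
    sample = [random.expovariate(1 / POPULATION_MEAN) for _ in range(SAMPLE_SIZE)]
    return mean(sample)

sample_means = [draw_sample_mean() for _ in range(NUM_SAMPLES)]

mean_of_means = mean(sample_means)
sd_of_means = stdev(sample_means)
expected_sd = POPULATION_MEAN / sqrt(SAMPLE_SIZE)  # sigma / sqrt(n)

# For a Normal distribution, about 68% of values fall within one SD of the mean.
share_within_one_sd = sum(
    abs(m - POPULATION_MEAN) <= expected_sd for m in sample_means
) / NUM_SAMPLES

print("Mean of the sample means:", round(mean_of_means, 1))
print("SD of the sample means:", round(sd_of_means, 1), "expected about", round(expected_sd, 1))
print("Share within one SD:", round(share_within_one_sd, 3))
```

Even though the individual values are strongly skewed, the sample means should cluster around 500 in a roughly bell-shaped pattern, which is the point of the theorem.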

 