Statistics 2301 Project Description

You may do this project in a group of NO MORE THAN 4 PEOPLE or by yourself.

Calendar

Between now and March 8: Select your project topic, collect your data, and type the data into Excel.

Before you leave for spring break: Designate a group member to e-mail me a brief (two- to five-sentence) description of the topic, along with an Excel file containing the data (the data need not be complete, but you need something). Meeting the deadline will give you 5 points on your final project.

Tuesday, April 17 (optional): Submit a complete rough draft of the project report to me, either via e-mail or hard copy. By “complete”, I mean that all of the sections below are written, and grammatical or spelling errors are minimal. The more you have written, and the neater it is, the better suggestions I can make. I will read it and make suggestions by Friday, April 20. No rough drafts received after 3 p.m. on April 17 will be read.

Tuesday, May 1: Final draft due at 3 p.m. in HARD COPY.

Description

For the end-of-semester project, you are to collect and analyze your own data.  In particular, you are to examine the classified section of the newspaper (online classified websites like Craigslist are OK, too) in order to determine a fair asking price for a particular item or service.  You will need to gather prices and some characteristic of the item for at least 20 of the same item for sale.  For example, suppose you want to determine a fair price for a particular car (say, a Toyota Camry).  You might look in the classifieds for the prices of Camrys, the model year, and the mileage.  If you’re interested in shopping for a desk, you need to collect prices for at least 20 desks, along with other variables that might determine the price (age, type of wood used, etc.).  You need at least one quantitative variable other than price and one qualitative variable (e.g. brand, region of the country where the item is listed, etc.). The qualitative variable should have only two categories (or you can reduce a multi-category variable to only two categories). This restriction is necessary because we will not analyze data with more than two categories in this class.
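If you prefer to see the two-category reduction spelled out, the following is a minimal Python sketch of one way to do it. The listings, the set of hardwoods, and the function name to_two_categories are all hypothetical and chosen only for illustration; the same recoding can be done by hand or in Excel.

```python
# Hypothetical listings: (price in dollars, age in years, type of wood).
listings = [
    (120, 5, "oak"),
    (80, 12, "pine"),
    (200, 2, "walnut"),
    (60, 15, "particle board"),
]

# Assumption: "hardwood" vs. "other" is only one possible two-way split.
HARDWOODS = {"oak", "walnut", "maple", "cherry"}

def to_two_categories(wood):
    """Reduce a multi-category wood type to a two-category variable."""
    return "hardwood" if wood in HARDWOODS else "other"

# Keep price and age as they are; replace the wood type with its two-level version.
recoded = [(price, age, to_two_categories(wood)) for price, age, wood in listings]

for row in recoded:
    print(row)
```

Whatever split you choose, explain in the Data Collection section why it makes sense for your item.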

Once the data are collected, you will need to write a SHORT report (maximum of 7 pages, including graphics) with five sections as described below.

Introduction: Briefly describe the data you collected and the question you want to answer.  Without giving away too much, talk about the conclusions of the study.  Hint:  Sometimes it works best to write the introduction last.

Data Collection: Describe the data for the study. What variables were collected and why?  How was each variable measured (in other words, how might the person placing the ad have known the value of the variable)?  What problems did you encounter along the way and how did you solve them?  For example, maybe you wanted to gather the price, age, and brand for sets of golf clubs, but for several of the advertisements you found, the age was missing. What did you do?

Descriptive Statistics: Analyze the data using basic descriptive statistics (means, medians, etc.).  Please don’t report all of the descriptive statistics that Excel will give you. Do not include all of the decimal places given, either. Two or three decimal places will suffice. If you include irrelevant statistics and/or many decimal places, large numbers of points will be deducted.  Report only those statistics that are appropriate for the data.  Display an appropriate graph (bar chart, histogram, boxplot, etc.) and describe the distribution of the data.  A scatter plot of the data is essential, since we want to see how the price varies with some quantitative characteristic of the item for sale.
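As an illustration of reporting only a few relevant statistics with a sensible number of decimal places, here is a minimal Python sketch. The prices are made up, and the helper name summarize is hypothetical; Excel will produce the same values.

```python
import statistics

# Hypothetical asking prices (dollars) for eight used desks.
prices = [120, 80, 200, 60, 150, 95, 110, 175]

def summarize(values, places=2):
    """Return a short summary rounded to a fixed number of decimal places."""
    return {
        "n": len(values),
        "mean": round(statistics.mean(values), places),
        "median": round(statistics.median(values), places),
        "std_dev": round(statistics.stdev(values), places),  # sample standard deviation
    }

summary = summarize(prices)
for name, value in summary.items():
    print(f"{name}: {value}")
```

The same helper can be applied separately to the prices in each level of the qualitative variable.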

The Model: Calculate the least-squares regression line, the correlation, and R-squared for your data.  Examine how the qualitative variable you selected changes the price-quantitative variable relationship. Get confidence intervals for the price based on the levels of the qualitative variable selected. Describe the conclusion of any relevant tests. Which variables in the regression were statistically significant, and what does that mean in the context of the problem?
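For reference, the sketch below shows one way to compute the least-squares line, the correlation, and R-squared in Python, first for all listings and then separately for each level of the qualitative variable. The car data and the function name least_squares are hypothetical, and fitting a separate line per group is only one possible way to examine the effect of the qualitative variable. Confidence intervals and significance tests are not included here.

```python
import math

def least_squares(x, y):
    """Return slope, intercept, correlation r, and R-squared for y regressed on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    # With a single predictor, R-squared equals the square of the correlation.
    return slope, intercept, r, r ** 2

# Hypothetical listings: (mileage in thousands, price in $1000s, seller type).
cars = [
    (30, 18.5, "dealer"), (60, 14.0, "dealer"), (90, 10.5, "dealer"),
    (45, 14.5, "private"), (75, 11.0, "private"), (120, 6.5, "private"),
]

# Fit one line to all listings, then one line per level of the qualitative variable.
fits = {"all": least_squares([c[0] for c in cars], [c[1] for c in cars])}
for seller in ("dealer", "private"):
    group = [c for c in cars if c[2] == seller]
    fits[seller] = least_squares([c[0] for c in group], [c[1] for c in group])

for label, (slope, intercept, r, r2) in fits.items():
    print(f"{label}: price = {intercept:.2f} + {slope:.3f}(mileage), "
          f"r = {r:.2f}, R-squared = {r2:.2f}")
```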

Conclusion: Wrap up the content of the previous three sections.  What was the conclusion about your problem from the data analysis?  Were your original ideas/guesses supported by the data?  What would you do differently if you had to do it again?  What are the weaknesses of your model and to what population does it apply?  Could the model apply more generally?  Give reasons for each answer.

Grading Guidelines

Please do not write your report as a list of answers to the questions I have posed in the outline of the sections above.  The questions for each section are there to stimulate your thought.  If you do not answer some of them, but still adequately describe your model, analyze the results and interpret them, then you will still receive a good grade.  If you answer every question in a list (e.g.  My conclusion was ______. My original ideas were supported.  I would gather more data if I had to do it again, etc.), you will receive a poor grade.

Your project will be graded by how well you address each of the following:

  • Correct use of statistical language
  • Correct calculations (for any that you perform)
  • Correct graphs (for any that you create – includes proper labeling and interpretation)
  • Accurate summaries of information and correct conclusions
  • Clarity of your writing (see general guidelines below)

The top four items are worth 80% of the grade, and the bottom one is worth 10%.  The other 10% comes from how well you meet the deadlines given above.  Note that writing clarity is only 10% of the grade, but a project that is unorganized and sloppy will be graded severely.

A more detailed rubric is attached to this document. Please print the rubric and attach it to the final draft of your project (2 points for doing this).

Note: Anyone caught plagiarizing the work of anyone else will be given a grade of zero for the project and an F for the course, and will be taken in front of the Honor Council.

General Guidelines

Proofread and correct silly mistakes!!!! In this age of computer spell checkers, misspelled words are intolerable. A general rule of thumb is to aim for at least (not at most – at least) three drafts of the paper.  Read the drafts aloud as you go over them.

If you paper has more than three spelling errors (that includes use of homonyms and words spell checkers will not catch – see below), I will return it to you. (By the way, did you catch the error in the first sentence of this paragraph? Such errors count!) You will be charged 5 points per day for every day it takes to correct the mistakes and return the paper.

The tips below are meant to help you correct the most common mistakes made in typewritten papers.

Style Tips

Remember to spell check.  
You will still need to proofread, though. Spell checkers cannot correct your mistake if you type “there” instead of “their” or “sever” instead of “severe.”

Come up with a creative title.  “My Project” or “Statistics 2301 Project” will not do.

Capitalization.  Some words are capitalized in common usage.  Examples: Internet, Excel, Word (or MS Word) – in general, any names of software packages.

Be careful with homonyms.  
Their vs. there.  
It’s (= it is) vs. its (indicates possession)
 Principle vs. principal

Write your first draft with time to spare
 You will have fresher eyes for catching errors and improving the style if you take a break before making revisions.

Everyone is singular.  Correct: Everyone turned in his (or her or his/her) project on time.  Incorrect: Everyone turned in their project on time.

Avoid overused words, such as “basically” and “hopefully”

Data are plural
 “These data are” rather than “This data is”

Don’t start a sentence with a symbol.
1996 marked the first year of production. (incorrect)
X varied more than Y. (incorrect)

Use parentheses to indicate multiplication.
4,000(1.1) (correct)
4,000*1.1 (incorrect)
4,000 x 1.1 (incorrect)

Use a superscript for power.
0.5² (correct)
0.5^2 (incorrect)

Put a space before and after an = sign.  
x = y (correct) 
x=y (incorrect)

Put a 0 in front of a decimal point if needed.
0.5 (correct)
.5 (incorrect)

Use an em-dash (—) in place of a double hyphen (--).
“The economy will boom—at least until the election—and profits should be robust.”

Use smart quotes instead of dumb quotes.  
“The eagle has landed.” (correct)  "The eagle has landed." (incorrect)

Grammar Tips

Avoid split infinitives.  
I need to carefully check my work. (incorrect)  
I need to check my work carefully. (correct)

Avoid hopefully.
Hopefully, the dog will be able to safely cross the road. (incorrect)
I hope that the dog will be able to cross the road safely. (correct)

Hyphens.
Well-built car vs. reasonably priced car (both correct)
Good-fielding shortstop vs. wonderfully talented shortstop (both correct)

That vs. Which 
This class is in the Carnegie Building, which was constructed in 1929. (correct) 
This is the only Pomona building that was constructed in 1929. (correct)

Avoid “I believe” and “I think.”
I believe that we should use a histogram to display these data. (incorrect)
We used a histogram to display these data. (correct)
Figure 1 uses a histogram to display these data. (correct)
Remember: data = plural form of datum

Passive vs. Active 
It is to be hoped that the pattern in these data is apparent. (incorrect) 
These data show a clear pattern. (correct)  
Upon close inspection, it can be seen clearly that there is no relationship between presidential ages and life expectancies. (incorrect)
  Clearly, life expectancies are not correlated with presidential ages. (correct)

A repetitive structure can be deadly; parallel construction can be effective.
We went to the library and found the data. We talked to the reference librarian and she was very helpful. We went to the computer center and graphed the data. (incorrect)
We came. We saw. We conquered. (correct)

Vary sentence lengths and structure.
Jasper will be a terrific employee—easily among the best we have sent you in the past five years. Hire him. Promote him. Treasure him.

Citing Sources

You will have to cite the publications/websites/newspapers from which you obtained your data.  Other than that, I don’t expect that it will be necessary to cite any sources.  But, if you do, here are some guidelines.  As long as you are consistent, any style of referencing sources is acceptable.

Parenthetical Citations

Author’s name is mentioned in the sentence.
  According to H.G. Wells, “one day statistical thinking will be as necessary to efficient citizenship as the ability to read and write” (102).

Author’s name is not mentioned in sentence.
  “One of the difficulties with the classical approach to probabilities is that there may be compelling evidence that the possible events are not equally likely” (Smith 75).

Works Cited

Book 
Last name of author, First name. Title of book. Place of publication: Publisher, Year.
Example: Smith, Gary. Statistical Reasoning. United States: McGraw-Hill, 1994.

Author with an editor 
Last name of author, First name. Title of book. Ed. Editor’s Name. Place of publication: Publisher, Year.
Example: Reiser, Paul. Couplehood. Ed. Rob Weisbach. New York: Bantam Books, 1994.

Work in an anthology
 Last name of author, First name. “Title.” Name of anthology. Ed. Name of editor. Place of publication: Publisher, Year. Page number/s.
Example: Friedman, Milton, and Leonard J. Savage. “The Utility Analysis of Choices Involving Risk.” Readings in Price Theory. Eds. J.G. Stigler and K.E. Boulding. Homewood, Illinois: Richard D. Irwin, Inc., 1952. 318.

Article in a monthly magazine
 Last name, First name. “Title of article.” Name of magazine Month Year: Page number/s.
Example: Dash, Judy. “You Take Care of Your Family.” Family Circle May 1998: 97-99.

Article in a Journal 
Last name, First name. “Title of article.” Name of journal Volume number.Issue number (Year): Page number/s.
Example: Modigliani, Franco. “New Developments on the Oligopoly Front.” Journal of Political Economy 66.3 (1958): 204-17.

Online Information 
Last name, First name. “Title of article.” Name of source Date of source. Online. Name of internet provider. Date of access.
Example: Morgenstern, Oskar. “Demand Theory Reconsidered.” Quarterly Journal of Economics. Feb. 1995. Online. Microsoft Network. 8 Jan. 1999.

Government Publications 
Name of government. Name of government agency. Name of material. Place of publication: Publisher, Year.
Example: United States. Bureau of Labor Statistics. Unemployment Trends in 1998. Washington: GPO, 1999.

In the spirit of citing sources, the above was adapted from StatSite at Pomona College, http://www.economics.pomona.edu/StatSite/framepg.html and from Bob delMas at University of Minnesota.

A rubric for the project is given on the following page.  Please attach it to your final project.


Names of Students in Group:

Item                                                                  Points Possible   Points Received

Data Gathered
  Quantitative variable                                                      5              ____
  Qualitative variable                                                       5              ____
  Reasonable explanation for choices of variables                           10              ____

Calculations
  Appropriate for type of variable                                           5              ____
  Descriptive statistics given for each level of qualitative variable       5              ____
  Confidence intervals correctly calculated and interpreted                  5              ____
  Descriptive statistics correctly calculated                               10              ____
  Regression model correctly calculated                                     10              ____

Appropriate graphs
  Scatter plot                                                               8              ____
  Histograms or bar charts as appropriate for other variables                7              ____
  Correct labeling on all graphs                                             5              ____

Accurate interpretation of graphs and calculations                          10              ____

Organization and spelling
  Spelling errors                                                            2              ____
  Grammatical errors                                                         2              ____
  Good transition between ideas                                              3              ____
  Topics well organized                                                      3              ____

Deadlines
  Data submitted on time                                                     3              ____
  Rubric attached                                                            2              ____

Total Points                                                               100              ____

 
