Math 6 Final Exam
Fall 1998 - Hartlaub
December 14, 1998

1. The University of California at Berkeley was charged with having discriminated against women in the graduate admissions process for the fall quarter of 1973. The table below identifies the number of acceptances and denials for both male and female applicants in each of the six largest graduate programs at the institution at that time:

             Men Accepted   Men Denied   Women Accepted   Women Denied
Program A         511           314             89              19
Program B         352           208             17               8
Program C         120           205            202             391
Program D         137           270            132             243
Program E          53           138             95             298
Program F          22           351             24             317
Total            1195          1486            559            1276

(5) a. Start by ignoring the program distinction, collapsing the data into a two-way table of gender by admission status. To do this, find the total number of men accepted and denied and the total number of women accepted and denied.

          Admitted   Denied   Total
Men
Women
Total

(10) b. Consider for the moment just the men applicants. Of the men who applied to one of these programs, what proportion were admitted? Now consider the women applicants; what proportion of them were admitted? Do these proportions seem to support the claim that men were given preferential treatment in admissions decisions? Set up and test the appropriate hypotheses.

(10) c. Suppose that the proportion of women accepted by Program B in 1972 was .55. Did this proportion significantly increase in 1973? Conduct the appropriate hypothesis test.

(10) d. To try to isolate the program or programs responsible for the mistreatment of women applicants, calculate the proportion of men and the proportion of women within each program who were admitted. Record your results in a table like the one below.

            Proportion of men admitted   Proportion of women admitted
Program A
Program B
Program C
Program D
Program E
Program F

(10) e. Does it seem as if any program is responsible for the large discrepancy between men and women in the overall proportions admitted?
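The bookkeeping behind parts (a), (b), and (d) can be sketched in a few lines of Python. This is only an illustrative check, not an official solution: the dictionary layout and the helper names `collapse` and `rate` are assumptions made for this sketch, and the hypothesis tests requested in parts (b) and (c) are not carried out here.

```python
# Counts copied from the admissions table: (accepted, denied) per program.
men = {"A": (511, 314), "B": (352, 208), "C": (120, 205),
       "D": (137, 270), "E": (53, 138), "F": (22, 351)}
women = {"A": (89, 19), "B": (17, 8), "C": (202, 391),
         "D": (132, 243), "E": (95, 298), "F": (24, 317)}


def collapse(counts):
    """Sum the accepted and denied counts over all programs."""
    accepted = sum(a for a, _ in counts.values())
    denied = sum(d for _, d in counts.values())
    return accepted, denied


def rate(accepted, denied):
    """Proportion admitted among all applicants in a group."""
    return accepted / (accepted + denied)


men_total = collapse(men)
women_total = collapse(women)
print("Overall  men:", round(rate(*men_total), 3),
      " women:", round(rate(*women_total), 3))

# Program-by-program proportions for the table in part (d).
for program in men:
    print("Program", program,
          " men:", round(rate(*men[program]), 3),
          " women:", round(rate(*women[program]), 3))
```

Comparing the overall line with the program-by-program lines is the comparison that part (e) asks about.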


2. Ninety-two percent of the trees planted by a landscaping firm survive. Suppose that the firm plants 20 trees on the Kenyon campus and we are willing to assume that these 20 trees are representative of all trees grown by this firm.

(5) a. How many of the 20 trees can we expect to die?

(5) b. What is the standard deviation of the number of trees that will die?

(5) c. What is the probability that three or fewer trees will die?

(5) d. What is the probability that 18 or more trees will survive?

(5) e. What is the probability that exactly 18 trees will survive?
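Answers to this problem can be checked numerically with a short binomial calculation. The sketch below assumes that the 20 trees survive or die independently, each with death probability 1 - 0.92 = 0.08; the helper names `binom_pmf` and `binom_cdf` are introduced here for illustration only.

```python
from math import comb, sqrt

n = 20          # trees planted
p_die = 0.08    # 1 - 0.92, the complement of the stated survival rate


def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(binom_pmf(i, n, p) for i in range(k + 1))


expected_deaths = n * p_die                      # part (a)
sd_deaths = sqrt(n * p_die * (1 - p_die))        # part (b)
p_three_or_fewer_die = binom_cdf(3, n, p_die)    # part (c)
# "18 or more survive" is the same event as "2 or fewer die".
p_18_or_more_survive = binom_cdf(2, n, p_die)    # part (d)
p_exactly_18_survive = binom_pmf(2, n, p_die)    # part (e)

print(expected_deaths, sd_deaths)
print(p_three_or_fewer_die, p_18_or_more_survive, p_exactly_18_survive)
```

Working with the number of deaths rather than the number of survivors keeps every part on the same random variable.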


3. If 30% of all students entering a certain university drop out during or at the end of their first year, find the following values for this year's entering class of 1800:


(6) a. The mean and standard deviation of the number of students who are expected to drop out during or at the end of their first year;

(6) b. The mean and standard deviation of the proportion of students who are expected to drop out during or at the end of their first year;

(10) c. The probability that more than 600 will drop out during or at the end of their first year.


4. The pulse rates for 13 adult women were

83, 58, 70, 56, 76, 64, 80, 76, 70, 97, 68, 78, and 108.

(5) a. Identify the population parameter of interest to the researchers.

(5) b. Find a 95% confidence interval for the appropriate parameter.

(5) c. Interpret the confidence interval in part (b). That is, explain this interval in plain language that your friends would be able to understand.

(5) d. How many adult women would the researchers need to examine to obtain a 95% confidence interval of length 5 if they assume that the standard deviation of the pulse rates in the population is 15?
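The sample-size calculation in part (d) can be sketched as follows. The sketch assumes that "length" means the full width of the interval (twice the margin of error) and that a z-interval with known standard deviation is intended; the function name `sample_size_for_length` is hypothetical.

```python
from math import ceil
from statistics import NormalDist


def sample_size_for_length(sigma, length, confidence=0.95):
    """Smallest n for which a z-interval has at most the given full length."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # about 1.96 for 95%
    margin = length / 2                                  # half-width of the interval
    return ceil((z * sigma / margin) ** 2)               # always round up


print(sample_size_for_length(sigma=15, length=5))
```

The result is rounded up because rounding down would give an interval slightly longer than requested.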


5. Each Saturday in the summer, a student earns extra money by playing the guitar and singing for donations on the town square. The collections vary from week to week, with a mean of $60 and a standard deviation of $20. The student is trying to plan ahead for the five Saturdays of the month of August and is willing to assume that the money to be received on each of the five Saturdays may be represented as an independent observation of a random variable with these values as its mean and standard deviation.

(5) a. Find the mean value of the total expected to be received for these five Saturdays.

(5) b. Find the standard deviation of this total.

(5) c. Find the probability that the student collects a total of at least $280 during the month of August.

(5) d. Find the mean value of the average received per Saturday for each of the five days in August.

(5) e. Find the standard deviation of this average.

(5) f. Find the probability that the average earnings are between $50 and $65.
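A numerical check for this problem might look like the sketch below. The means and standard deviations in parts (a), (b), (d), and (e) follow from independence alone. The probabilities in parts (c) and (f) additionally assume that the total and the average are approximately normal, which is an assumption of this sketch and not something the problem states.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 60, 20, 5    # weekly mean, weekly SD, number of Saturdays

mean_total = n * mu                 # part (a)
sd_total = sigma * sqrt(n)          # part (b): variances add under independence
mean_avg = mu                       # part (d)
sd_avg = sigma / sqrt(n)            # part (e)

# Normal approximation (an assumption; only five observations are summed).
total = NormalDist(mean_total, sd_total)
average = NormalDist(mean_avg, sd_avg)
p_total_at_least_280 = 1 - total.cdf(280)              # part (c)
p_avg_between = average.cdf(65) - average.cdf(50)      # part (f)

print(mean_total, sd_total, mean_avg, sd_avg)
print(p_total_at_least_280, p_avg_between)
```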


6. Gasoline pumped from a supplier's pipeline is supposed to have an octane rating of 87.5. On 13 consecutive days a sample was taken and analyzed with the following results.

88.6, 86.4, 87.2, 88.4, 87.2, 87.6, 86.8, 86.1, 87.4, 87.3, 86.4, 86.6, and 87.1

(10) a. Is there sufficient evidence to show that these octane readings were taken from gasoline with a mean octane significantly less than 87.5 at the 0.05 level?

(5) b. List three methods of checking the normal distribution assumption that you made in conducting the test in part (a).

(5) c. Comment on the validity of the normal distribution assumption for the octane ratings in this problem. (i.e., use at least one of the methods you listed in part (b) to check the normal assumption.)


7. The management of a factory wants to try out a new assembly technique for one of its large assembly lines. Fifteen employees are randomly selected, and the number of units each employee assembles in one week is noted. These 15 employees are then taught the new technique. Management records the number of units each of these employees assembles in one week using the new technique. The data are in the file p:\data\math\stats\assemble.mtw

(5) a. State the appropriate null and alternative hypotheses for testing to see if the new technique is effective in increasing worker productivity.

(5) b. Find and interpret the p-value for the appropriate test.

(5) c. State your conclusions in plain English if α = .01.

(5) d. Find a 90% confidence interval for the mean increase in productivity.


8. The fertility rate of a country is defined as the number of children a woman citizen bears, on average, in her lifetime. The dataset p:\data\math\stats\fertility.mtw gives the fertility rate for several developing countries, along with contraceptive prevalence (measured as the percentage of married women who use contraception).

(5) a. Is there any association between fertility rate and contraceptive prevalence?

(5) b. Find and interpret the value of the correlation coefficient for fertility rate and contraceptive prevalence.

(5) c. Find the least squares regression equation for predicting fertility rate from contraceptive prevalence.

(5) d. Interpret the value of the slope parameter in the least squares regression line.

(3) e. Predict the fertility rate for a country where the contraceptive prevalence is 42.

(3) f. What is the value of the residual for the country with a fertility rate of 6.5 and a contraceptive prevalence of 28?

(5) g. Plot the residuals against contraceptive prevalence. Comment on the appearance of the plot and any implications this may have to the adequacy of the linear model.


9. A company bakes computer chips in two ovens, oven A and oven B. The chips are randomly assigned to an oven and hundreds of chips are baked each hour. The percentage of defective chips coming from these ovens for each hour of production throughout the day is shown below and provided in p:\data\math\stats\oven.mtw.

Percentage of Defective Chips

Hour   Oven A   Oven B
  1      45       36
  2      32       37
  3      34       33
  4      31       34
  5      35       33
  6      37       32
  7      31       33
  8      30       30
  9      27       24

(5) a. Do you think it is reasonable to assume that all 18 of the observed percentages are independent? Explain.

(15) b. Does there appear to be a difference between oven A and oven B with respect to the mean percentages of defective chips produced? Give appropriate statistical evidence to support your answer.


10. Rainfall specimens from various sites in New Zealand were analyzed to determine the amount of sulfur in the rainwater. The sulfur concentrations are provided in p:\data\math\stats\sulfur.mtw. The sites were classified according to whether they were closer to the east or the west coast of the island. Is there sufficient evidence to conclude that the mean sulfur concentration in rainwater differs for the eastern and western parts of the island?

(5) a. State H0 and Ha.

(5) b. Which type of t procedure is appropriate: one sample, matched pairs, two-sample, or pooled? List your assumptions and explain.

(5) c. Using α = 0.05, carry out the appropriate t test. Give the p-value and report your conclusion.

(5) d. Give a 90% confidence interval for the appropriate parameter(s).

(5) e. Can the interval from part (d) be used to conduct a test of the hypotheses in part (a)? If so, explain how. If not, explain why not.


Math 6 Final Exam

Spring 1996 - Hartlaub
May 9, 1996

1. Suppose that the students in an introductory statistics class are surveyed, and that this is a stem-and-leaf plot of their ages:

1 | 6 7 7 7
1 | 8 8 8 8 9 9 9 9 9
2 | 0 0 0 0 0 1 1 1 1
2 | 2 2 2 3 3 3
2 | 4 4 5
2 |
2 | 9
3 | 0


(3) a. How many students are in the class?

(2) b. How old is the youngest student?

(2) c. How old is the oldest student?

(3) d. Is the distribution symmetric or skewed?


2. Consider the number of deaths caused by 15 major earthquakes:

8, 62, 81, 115, 146, 250, 1000, 1000, 1200, 1300, 1621, 4000, 4200, 40000, 55000.

(5) a. Calculate the mean and standard deviation for the number of deaths.

(10) b. Do you think that the mean and standard deviation provide appropriate summaries of these data? If so, explain why. If not, suggest measures of center and spread that are appropriate for these data.


3. An economic forecaster suggests that interest rates will be 9% next year and that the uncertainty in the forecast is indicated by a standard deviation of 2 percentage points. Assume that the actual interest rate is normally distributed with this mean and standard deviation.

(5) a. Find the probability that the forecaster is correct to within 1 percentage point on either side of the 9% forecasted.

(5) b. Find the probability that interest rates will be no larger than 12.5%.

(5) c. Find the probability that interest rates will be 8% or smaller.


4. The Nation, a weekly magazine of politics and the arts, occasionally sponsors poetry contests. The results of one were disputed by a reader, whose comments follow:


"Your poetry competition and its result were remarkable. Competitiveness is one of those traits of our present system I'd think you'd eschew. But what really irks me is your result: Four female poets win a competition cosponsored by a publication with a female poetry editor. Yes, there were males on the judging panel just as the queens of old had eunuchs to attend them. Does The Nation mean to tell us that there were no entries by male poets that remotely approached the quality (however that's judged) of the winners? I can imagine the screams from the gallery if the results were as one-sided in the other direction ..."


The editor responded that the contest had been judged anonymously and that none of the judges knew the names or genders of the authors they were judging. She went on to make a brief argument that sometimes things like this just happen.


Suppose that equal numbers of males and females enter the contest and that the total number of entries is large enough so that there is no effective difference between sampling with and without replacement.


(5) a. If we select four winners randomly with respect to gender, what is the probability that they are all female?

(5) b. If we select four winners randomly with respect to gender, what is the probability that they are all male?

(5) c. If we select four winners randomly with respect to gender, what is the probability that they are all the same gender?

(5) d. Suppose we ran this hypothetical poetry contest annually for 80 years. In about how many years would we expect all the winners to be of the same gender?

(5) e. Using your solutions to parts (a)-(d), describe your feelings about the results of the poetry contest.


5. Suppose that an instructor grades quizzes by standing at the top of the stairs and throwing the quizzes down. Quizzes that land on the top third of the stairs are given A's and the rest are given C's. There is a .4 chance of getting an A. Over the course of the semester, ten sets of quizzes are graded this way.

(5) a. How many different ways can a student get exactly 5 A's?

(5) b. What is the probability of getting exactly 5 A's?

(5) c. What is the probability of getting at most 5 A's?

(5) d. What is the probability of getting at least 5 A's?


6. Suppose that 50% of the adolescents in a certain city have been arrested at one time or another. You take a random sample of 100. Calculate the probability that fewer than 40 have ever been arrested by using:

(5) a. the binomial distribution,

(10) b. the normal approximation with the continuity correction.
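The two calculations can be compared side by side. This is a sketch under the problem's stated setup (n = 100, p = 0.5), reading "fewer than 40" as X ≤ 39; the variable names are illustrative.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 100, 0.5

# (a) Exact binomial: "fewer than 40" means X <= 39, so sum k = 0..39.
exact = sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(40))

# (b) Normal approximation with continuity correction:
#     P(X <= 39) is approximated by P(Y <= 39.5), Y ~ Normal(np, sqrt(np(1-p))).
mu = n * p
sigma = sqrt(n * p * (1 - p))
approx = NormalDist(mu, sigma).cdf(39.5)

print(exact, approx)
```

The continuity correction moves the cutoff from 39 to 39.5 so that the whole bar for X = 39 is included in the approximating area.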


7. (20) A researcher honestly believes that a new medication reduces blood pressure. However, a test on five people did not show a significant average reduction in blood pressure. Is it possible that the medication is effective? What might be done to find out?


8. Here are some measurements of the wing lengths of killer bees from French Guiana.

8.56, 8.51, 8.51, 8.57, 8.96, 8.82, 8.39, 8.54, 8.62

(5) a. Identify the population parameter of interest to the researchers.

(5) b. Find a 95% confidence interval for the appropriate parameter.

(5) c. Interpret the confidence interval in part (b). That is, explain this interval in plain language that your friends would be able to understand.

(5) d. How many killer bees would the researchers need to examine to obtain a 95% confidence interval of length .2 if they assume that the standard deviation of the wing lengths in the population is .18?


9. Each Saturday in the summer, a student earns extra money by playing the guitar and singing for donations on the town square. The collections vary from week to week, with a mean of $58 and a standard deviation of $23. The student is trying to plan ahead for the five Saturdays of the month of August and is willing to assume that the money to be received on each of the five Saturdays may be represented as an independent observation of a random variable with these values as its mean and standard deviation.


(5) a. Find the mean value of the total expected to be received for these five Saturdays.

(5) b. Find the standard deviation of this total.

(5) c. Find the probability that the student collects a total of at least $250 during the month of August.

(5) d. Find the mean value of the average received per Saturday for each of the five days in August.

(5) e. Find the standard deviation of this average.

(5) f. Find the probability that the average earnings are not between $50 and $65.


10. Suppose that 75% of all students using the Roth classroom write their names on their diskettes. A random sample of 300 students is selected.

(5) a. What is the mean value of the proportion among the 300 students sampled who have their names on their diskettes?

(5) b. What is the standard deviation of the sample proportion?

(10) c. What is the chance that at least 78% of those sampled have their names on their diskettes?


11. Levy et al. (1993) studied the effect of exercise on heart filling rate in older (age 60 to 82) and younger (age 24 to 32) healthy males. (Exercise improved heart fill rates for both groups.) As part of the study, they looked at the relationship between maximum oxygen consumption - the ability to exercise - and the fill rate at the start of the study. Data from this study are in the file p:\data\math\stats\heart.mtw. Retrieve the data and answer the questions below.

(5) a. Is there any association between the heart fill rate and oxygen utilization capacity in younger men? in older men?

(3) b. Find the value of the correlation coefficient for the heart fill rate and oxygen utilization capacity for younger men.

(5) c. Repeat part (b) for older men and compare the values of r.

(10) d. Find the least squares regression equation for predicting heart fill rate from oxygen utilization capacity for older men.

(5) e. Predict the heart fill rate for an older man whose oxygen utilization capacity is 28.

(5) f. What is the value of the residual for the gentleman with fill rate 75 and oxygen capacity 27?

(10) g. Explain how to obtain the least squares line for predicting heart fill rate from oxygen utilization capacity for younger men by using only descriptive statistics.


12. Use the data from the previous exercise (heart.mtw) to test the following hypotheses. In both cases, be sure to state your hypotheses using appropriate population parameters, specify the value of the test statistic, find the p-value, and state your conclusions in terms of the practical problem of interest. Make sure that you specify and check all of your assumptions!

(15) a. Test the null hypothesis that there is no difference in the oxygen utilization capacities for younger and older men against the alternative that the capacities are higher for younger men.

(15) b. Test the null hypothesis that the heart fill rate for older men is equal to 115 against the alternative that the fill rates are lower than 115.

(15) c. Do you think that there is a significant difference between the heart fill rates for older and younger men? Explain how to check your hunch by using confidence intervals.


13. Give a short (two or three sentences) explanation for each of the following.

(10) a. Normal Probability Plots.

(10) b. Simpson's Paradox.

(10) c. The Central Limit Theorem.

(10) d. The difference between a parameter and a statistic.

(10) e. The difference between causation and association.

(10) f. The importance of randomization in design and sampling.


Math 6 Final Exam

Fall 1995 - Hartlaub
December 15, 1995

1. (10) Suppose that the present speeds on Ohio highways are normally distributed with a mean of 68 miles per hour and a standard deviation of 5 miles per hour. If the state police decide to ticket the fastest 20% of the motorists, how fast could you drive without risking a ticket? (Be careful on your way home for the holidays!)
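The cutoff speed is the 80th percentile of the stated normal distribution, since the fastest 20% lie above it. A minimal sketch of that lookup, using the standard library's `NormalDist` (the variable names are illustrative):

```python
from statistics import NormalDist

speeds = NormalDist(mu=68, sigma=5)   # distribution given in the problem
cutoff = speeds.inv_cdf(0.80)         # speed with 80% of motorists below it

print(round(cutoff, 2))
```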


2. A radio station that plays classical music has a "by request" program each Saturday evening. The percentages of requests for composers on a particular night are as follows.

Bach           5%     Mozart        21%
Beethoven     26%     Schubert      12%
Brahms         9%     Schumann       7%
Dvorak         2%     Tchaikovsky   14%
Mendelssohn    3%     Wagner         1%

Suppose that one of the requests is randomly selected.

(5) a. What is the probability that the request is for a composer whose last name begins with the letter B?

(5) b. What is the probability that the request is not for one of the two composers with an S as the first letter in their last name?

(5) c. Neither Bach nor Wagner wrote any symphonies. What is the probability that the request is for a composer who wrote at least one symphony?

3. Thirty percent of all automobiles undergoing a headlight inspection at a certain inspection station fail the inspection.

(5) a. Among 15 randomly selected cars, what is the probability that at most five fail the inspection?

(5) b. Among 15 randomly selected cars, what is the probability that exactly five fail the inspection?

(5) c. Among 15 randomly selected cars, what is the probability that at least 5 fail the inspection?

(10) d. Among 25 randomly selected cars, what is the mean value of the number that pass the inspection, and what is the standard deviation of the number that pass the inspection?


4. (10) How many households should be surveyed to form a 95% confidence interval of width $400 for the mean income in a suburb of a large city if the standard deviation is $500?

5. After all students have left the classroom, a professor notices that four copies of the text were left under desks. At the beginning of the next lecture, the professor distributes the four books in a completely random fashion to each of the four students who claim to have left their books. One possible outcome is 1 receives 2's book, 2 receives 4's book, 3 receives his or her own book, and 4 receives 1's book. This outcome can be abbreviated (2, 4, 3, 1).

(10) a. List the 23 other possible outcomes. (A tree diagram may be helpful.)

(5) b. Which outcomes are contained in the event that exactly two of the books are returned to their correct owners? Assuming equally likely outcomes, what is the probability of this event?

(5) c. What is the probability that exactly one of the four students receives his or her own book?

(5) d. What is the probability that exactly three of the students receive their own books?
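A hand-written list of outcomes can be checked by brute-force enumeration. The sketch below uses the problem's notation, where position i of an outcome holds the owner of the book that student i receives, and assumes all 24 outcomes are equally likely; the helper name `matches` is hypothetical.

```python
from collections import Counter
from itertools import permutations

students = (1, 2, 3, 4)
outcomes = list(permutations(students))   # all 24 ways to hand back the books


def matches(outcome):
    """Number of students who receive their own book in this outcome."""
    return sum(1 for student, book in zip(students, outcome) if student == book)


# Tally outcomes by how many books went back to the correct owner.
tally = Counter(matches(outcome) for outcome in outcomes)
for k in range(5):
    print(k, "correct:", tally[k], "outcomes, probability", tally[k] / len(outcomes))
```

For the example outcome in the problem statement, `matches((2, 4, 3, 1))` counts only student 3.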


6. (15) Managers assume that twenty-five percent of the customers entering a grocery store between 5:00 pm and 9:00 pm use an express checkout. Approximate the probability that more than 78 of 300 randomly selected customers will use the express checkout.


7. A water-quality control board reports that the water is unsafe for drinking if the mean nitrate concentration exceeds 30 ppm.

(10) a. Water specimens from a well will be analyzed and appropriate hypotheses tested to determine if the well should be closed. If your drinking water comes from this well, what hypotheses would you want the water-quality board to test? Explain.

(15) b. Suppose that results of an appropriate test indicate that the well described in part (a) above should be closed. Now, the water control board is concerned that there may be a serious problem in four of the housing complexes surrounding this well. Large housing developments are located to the North, South, East, and West of the contaminated well. Assume that each home has its own well. Give a detailed description of a design that you would recommend to the water-quality control board. That is, design an experiment and list the statistical methods you would use to analyze the data and investigate potential widespread problems with the nitrate concentrations in the four developments surrounding the contaminated well.

8. (10) Suppose the null distribution of the test statistic Z is normal with mean µ and standard deviation 1. Also, if H0 is true, then µ = 0. The observed value of the test statistic Z in a two-tailed test is -1.48. Find the p-value and state your conclusion.


9. Ten high school seniors taking the ACT test received the following scores: 28, 26, 30, 24, 25, 29, 31, 26, 23, and 27. Several faculty members have stated that this group of ten students is an unusually talented group of individuals. In the past, ACT scores have been normally distributed with a mean of µ = 25.

(10) a. Test H0: µ = 25 versus H1: µ > 25 at level α = .05.

(5) b. List three methods of checking the normal distribution assumption that you made in conducting the test in part (a).

(10) c. Comment on the validity of the normal distribution assumption for the ACT scores in this problem. (i.e., use at least one of the methods you listed in part (b) to check the normal assumption.)


10. A teacher randomly selects ten students to participate in a week of training designed to improve their typing speed. The teacher times their speed before and after the course, to see if the course is worth the time and expense. You will find the results in the file P:\DATA\MATH\STATS\TYPE.MTW.

(10) a. Retrieve the Minitab worksheet and find a 96% confidence interval for the mean difference in typing speed.

(5) b. Explain how to use the confidence interval in part (a) to conduct a two-sided hypothesis test of H0: µdiff = 0 against the appropriate two-sided alternative at level α = .04.

(10) c. Test the null hypothesis that the training session does not improve typing speed against the alternative that the training session does improve typing speed using a significance level of α = .02. Please state your conclusions in terms of the typing problem of interest.


11. Although our class periods are exactly 50 minutes in length, my actual lecture time on a particular day can be represented by a random variable with mean 51.5 minutes and standard deviation 1.5 minutes. Suppose that times of different lectures are independent of one another and I give 36 lectures during the semester.

(10) a. What is the probability that my lecture on any one particular day, say Monday, Dec. 18, will be less than 50 minutes in length?

(10) b. What is the probability that my average lecture time over the entire semester will be less than 50 minutes in length?


12. In 1989, the results of a study on the effect of alcoholism on muscular strength were reported. The report contained muscular strength test measurements and estimates of total lifetime consumption of alcohol for each of the 50 alcoholic men who were carefully selected from a larger group of alcoholic patients to form a homogeneous group. These men consumed mainly wine, beer, and brandy. The strength measurements, which are listed as the dependent variable in this study, are the strength of the deltoid muscle in each person's nondominant arm. This strength measurement was determined by making five measurements over a 20 minute period to measure muscular force against a fixed resistance. Retrieve the file P:\DATA\MATH\STATS\ALCOHOL.MTW, which contains these 50 pairs of observations, and use Minitab to answer the following questions.

(5) a. Is there any association between the total lifetime dose of alcohol (X) and muscular strength (Y)?

(5) b. Find the value of the correlation coefficient for the strength and alcohol consumption data.

(10) c. Explain the method of least squares.

(5) d. Find the least squares regression equation for predicting muscular strength from total lifetime alcohol consumption.

(5) e. Interpret the value of the slope parameter in the least squares regression line.

(5) f. Predict the muscular strength for a male whose total lifetime consumption of alcohol is 30.8.

(5) g. What is the value of the residual for the individual with muscular strength 21.1 and total lifetime consumption of alcohol 20.0?


13. Suppose that 55% of all students using the Roth classroom write their names on their diskettes. A random sample of 300 students is selected.

(5) a. What is the mean value of the proportion among the 300 students sampled who have their names on their diskettes?

(5) b. What is the standard deviation of the sample proportion?

(10) c. What is the chance that at most 51% of those sampled have their names on their diskettes?


14. Give a short (one or two sentences) explanation for each of the following.

(5) a. The five number summary.

(5) b. Simpson's Paradox.

(5) c. The central limit theorem.

(5) d. The difference between a parameter and a statistic.

(5) e. The difference between the sample mean and the sample median.