Project:  Hazardous Waste

Name _______________________        Name____________________________

1. Obtain the Data
Your instructor will provide you with a printed copy showing RCRA waste and population data for 1 of 12 states. Enter (or download) the data into the following TI-83+ lists:

RCRA data: 1991=L1, 1993=L2, 1995=L3, 1997=L4
Population data = L5

Compute "mean RCRA waste" for each county, and store the results in list L6.  Transfer the mean values to the data sheet. What are the units of measure for the mean?
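For anyone checking calculator work in software, the averaging step can be sketched in Python. This is only an illustration: the county names and waste amounts below are made up, and the dictionary layout is an assumption, not part of the data sheet.

```python
# Hypothetical waste amounts for three made-up counties (not real data).
# The four entries mirror the calculator lists: L1=1991, L2=1993, L3=1995, L4=1997.
rcra = {
    "County A": [120.0, 80.0, 100.0, 100.0],
    "County B": [4.0, 6.0, 5.0, 5.0],
    "County C": [0.0, 10.0, 20.0, 30.0],
}

# Same arithmetic as (L1 + L2 + L3 + L4) / 4 stored in L6 on the calculator.
mean_waste = {county: sum(values) / len(values) for county, values in rcra.items()}

for county, value in mean_waste.items():
    print(f"{county}: {value:.1f}")
```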

2. Analyze the data
Find the county with the biggest change in RCRA waste generation from one biennial report to the next. Which county is it? How much waste did it report in the first year, and how much in the next report? What is the percent change from one biennium to the next?

Such extreme changes in hazardous waste production may not seem reasonable; perhaps the numbers are in error. But maybe not! Give one reasonable explanation for why RCRA waste generation might change so much in one biennium.
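As a reminder of the arithmetic, percent change compares the difference between two reports to the earlier report. The sketch below uses made-up numbers, and the function name `percent_change` is only an illustrative choice.

```python
def percent_change(old, new):
    """Percent change from old to new, measured relative to old."""
    if old == 0:
        # Dividing by an earlier value of 0 is undefined.
        raise ValueError("percent change is undefined when the earlier value is 0")
    return (new - old) / old * 100.0

# Hypothetical amounts for one made-up county in two consecutive reports.
earlier, later = 50.0, 400.0
change = percent_change(earlier, later)
print(f"{change:.0f}% change")
```

Note that a county reporting zero waste in the earlier year has no defined percent change, which is worth keeping in mind when scanning the data sheet.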

Use your TI-83+ to make a frequency histogram of the mean RCRA waste values. (For review information, consult Chapter 3 in your text.)  Sketch the histogram on graph paper. Label axes appropriately.

Are the mean RCRA waste values normally distributed?  How can you easily tell without doing any computations?

Compute the mean and standard deviation of the mean RCRA waste values.  Use the TI-83+ for assistance. Is the standard deviation less than, equal to, or greater than the mean?

In your opinion, is the standard deviation "small", "medium" or "large"?  Explain briefly.

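Compute the following 7 numbers: xbar - 3s, xbar - 2s, xbar - s, xbar, xbar + s, xbar + 2s, xbar + 3s. (Here xbar is the mean and s is the standard deviation of the mean RCRA waste values.) Do any of the 7 numbers come out negative? ___________ If so, do these numbers have any physical meaning? Can you have negative mean RCRA waste in reality? What do the negative numbers tell you?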

Sometimes there are data that seem to be "way out of bounds." These numbers can be accurate, or they can be caused by error. In either case they tend to dominate the calculations. Statisticians call these numbers outliers. For this project, an outlier is a value that lies more than 3 standard deviations away from the mean. Are there any outliers in your mean RCRA waste values? If so, what are the names of the counties?
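The seven numbers and the outlier check can be sketched in Python as follows. The sketch assumes the seven numbers are the mean together with the values 1, 2, and 3 standard deviations on either side of it, and that the sample standard deviation (Sx on the TI-83+) is the one intended. The county names and values are made up.

```python
from statistics import mean, stdev

# Hypothetical mean-waste values for 15 made-up counties (not real data).
waste = [1.0, 2.0, 3.0, 4.0, 5.0, 1.0, 2.0, 3.0, 4.0, 5.0,
         1.0, 2.0, 3.0, 4.0, 200.0]
counties = [f"County {i}" for i in range(1, len(waste) + 1)]

xbar = mean(waste)
s = stdev(waste)  # sample standard deviation (assumed to match Sx)

# Mean minus 3, 2, 1 standard deviations, the mean itself, then plus 1, 2, 3.
seven = [xbar + k * s for k in range(-3, 4)]

# Worksheet rule: an outlier lies more than 3 standard deviations from the mean.
outliers = [c for c, x in zip(counties, waste) if abs(x - xbar) > 3 * s]

print([round(v, 1) for v in seven])
print(outliers)
```

With strongly skewed values like these, the lower end of the seven numbers falls below zero even though no county can report negative waste.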

3. Per Capita Waste
The EPA hires you as a consultant to impose fines on counties that are "environmentally bad." Your supervisor suggests that counties that generate the most RCRA waste should be fined the most. Discuss why this system might not be fair.

Another way to set fines is to look at waste per person rather than waste per county. In other words, fine the counties that have the highest mean RCRA waste per capita (per person). Compute the mean RCRA waste per capita for each county. Convert the result so that the units are pounds per person. (Note: 1 ton = 2000 pounds.) Store the final result in L7 and record it on the data sheet.
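The per capita conversion can be sketched in Python as shown below. The sketch assumes the mean waste values are recorded in tons (as the conversion note suggests); the counties, tonnages, and populations are made up.

```python
POUNDS_PER_TON = 2000  # conversion factor given in the worksheet

# Hypothetical (mean waste in tons, population) pairs for made-up counties.
counties = {
    "County A": (100.0, 50_000),
    "County B": (5.0, 1_000),
}

# Same arithmetic as L6 * 2000 / L5 stored in L7 on the calculator.
per_capita_lb = {
    name: tons * POUNDS_PER_TON / population
    for name, (tons, population) in counties.items()
}

for name, pounds in per_capita_lb.items():
    print(f"{name}: {pounds:.1f} pounds per person")
```

In this made-up example the county with far less total waste has the higher per capita figure, which is the kind of reversal the per capita approach is meant to expose.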

Use your TI-83+ to make a frequency histogram of the per capita mean RCRA waste values.  Sketch the histogram on separate graph paper. Label axes appropriately.

What is the mean of the mean per capita RCRA waste? What is the standard deviation?   (Use correct symbols when writing values.)

Is the standard deviation large, medium or small compared to the mean?

Measuring spread in skewed data using standard deviation is problematic because standard deviation is often many times bigger than the mean. Has normalization by population "improved" the standard deviation of the data?  In other words, is the per capita waste data less skewed than the unnormalized waste values?

4. Transform the data
When data are skewed to the right, we can often make the distribution more symmetrical by taking the logarithm of the data. Do this now: take the log of the mean per capita RCRA value for each county, and store the results in list L8. Record the logged values on your data sheet. Then sketch a frequency histogram of the logged values. Include units and labels.
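A minimal Python sketch of the log transform follows. It assumes a base-10 logarithm (the LOG key on the TI-83+); the worksheet does not say which base to use, and the per capita values are made up.

```python
import math

# Hypothetical per capita values in pounds per person (not real data).
per_capita = [0.1, 1.0, 10.0, 100.0, 1000.0]

# Base-10 logarithm of each value, like log(L7) stored in L8.
logged = [math.log10(x) for x in per_capita]

print([round(v, 2) for v in logged])
```

Values that span several orders of magnitude become evenly spaced after logging. A county with a per capita value of exactly 0 has no logarithm (`math.log10(0)` raises `ValueError`), so such a county would need separate handling.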

How does the histogram of the transformed data (log of the per capita mean RCRA values) compare to the two histograms that you sketched previously?

Compute the mean and standard deviation for the transformed data. Include units of measure.

Is the standard deviation less than, equal to, or greater than the mean?

Is the standard deviation "small", "medium" or "large", as compared to the mean?  Explain briefly.

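For the transformed data, calculate the 7 numbers: xbar - 3s, xbar - 2s, xbar - s, xbar, xbar + s, xbar + 2s, xbar + 3s, where xbar and s are now the mean and standard deviation of the logged values. Use these 7 numbers to determine if the transformed data are normally distributed. Show work.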

5. Carrots and Sticks
You have transformed the county data into a distribution that is closer to normal. Now you come up with the following idea to impose waste fines.  Based on the transformed data, impose the highest fines on counties that lie more than 3 standard deviations above the mean, impose moderate fines on counties that lie between 2 and 3 standard deviations above the mean, impose small fines on counties that lie between 1 and 2 standard deviations above the mean, and very small fines for those counties between the mean and 1 standard deviation above the mean.  On your data sheet, under the column "st. dev. category", indicate which counties are in the categories:   ">3", "2 to 3", "1 to 2", or "0 to 1".

To reward counties that produce the least amount of RCRA waste per person, you will give waste credits that can be sold in the market.   On your data sheet, for those counties whose RCRA wastes are below the mean, mark categories "<-3", "-3 to -2", "-2 to -1", and "-1 to 0".
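The category labels can be sketched as a small Python function. Its input is the number of standard deviations a county lies from the mean, that is, (value - mean) / standard deviation. The function name `sd_category` and the handling of values that fall exactly on a boundary are assumptions; the worksheet does not say which category a boundary value belongs to.

```python
import math

def sd_category(z):
    """Label a county by how many standard deviations it lies from the mean."""
    if z > 3:
        return ">3"
    if z < -3:
        return "<-3"
    if z >= 0:
        # Fine categories; exactly 3 is kept in "2 to 3" (assumed).
        low = min(math.floor(z), 2)
        return f"{low} to {low + 1}"
    # Credit categories; exactly -3 is kept in "-3 to -2" (assumed).
    high = max(math.ceil(z), -2)
    return f"{high - 1} to {high}"

# Hypothetical distances from the mean, in standard deviations.
for z in (0.5, 1.7, 2.5, 3.4, -0.5, -1.5, -2.5, -3.2):
    print(z, sd_category(z))
```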

Now you get good results with this penalty and reward system.  Overall, polluters are given monetary incentives to improve their standard deviation score.  In fact, you suggest that all states take up your system.  Your boss likes the idea, but she has some questions:

Is it possible that in some state most of the counties would be in the "above 3" or "below -3" categories?  This could be seen as politically "heavy handed", with lots of money flowing back and forth in fines and credits.  What is your answer?

How would this system work with a state like South Dakota, whose mean per capita RCRA waste is very low?  Won't most of the counties in South Dakota be getting pollution credits?

You've convinced your boss that this system will work, but now she has a third question.  When two counties lie in the same standard deviation category they are penalized or rewarded the same, even if their mean RCRA waste per capita numbers are different.  Is there some way to refine the rewards and incentives so that there is a continuous scale?

A continuous scale can be based on "z-scores" for each county.  A z-score is a number that indicates how many standard deviations each county lies above or below the mean.  Z-scores are computed with the simple formula: z = (x - xbar) / s.  Here x is each county's logged per capita mean RCRA waste, xbar is the mean of the logged per capita wastes, and s is their standard deviation.  A z-score is positive if the county lies above the mean, and negative if it lies below.  Fill out the last column on the data sheet with the z-score for each county; round to 2 decimal places.
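The z-score calculation can be sketched in Python as follows. The logged values are made up, and the sketch assumes s is the sample standard deviation (Sx on the TI-83+).

```python
from statistics import mean, stdev

# Hypothetical logged per capita values for five made-up counties.
logged = [-1.0, 0.0, 1.0, 2.0, 3.0]

xbar = mean(logged)
s = stdev(logged)  # sample standard deviation (assumed)

# z = (x - xbar) / s, rounded to 2 decimal places as the worksheet asks.
z_scores = [round((x - xbar) / s, 2) for x in logged]

print(z_scores)
```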

Your boss thinks your z-score idea is great.  She now gives you enough money to impose fines and give credits.  She suggests a $100,000 fine or credit per z-score (fines for positive z-scores, credits for negative z-scores).  Will your agency lose money, earn money, or break even?  Explain in detail.