
IBM Watson Project


Estimating Health Effects of Exposure to Ambient

PM2.5 and Ozone

A Quantitative Data Analysis Report

Authors: Ankita Zaveri, Manidipa Banerjee, Wanda Benson, Nikisha Suraparaju

Table of Contents

1. Introduction

2. Methodology

a. AQI Median and AQI Maximum

b. Ozone and PM2.5 Concentration for AQI

c. Mortality Rate

d. Prediction

3. Results and Discussion

a. AQI Analysis

b. Analysis of Ozone and PM2.5

c. Analysis of Deaths due to Respiratory Diseases

d. Prediction Analysis

4. Conclusion

5. References


Introduction

Air quality, in the simplest terms, is the quality of the air that surrounds us. Poor air quality describes air contaminated by high levels of harmful pollutants such as Sulfur Dioxide (SO2), Nitrogen Dioxide (NO2), Ozone (O3), Particulate Matter (PM2.5), and Carbon Monoxide (CO). These pollutants result largely from the burning of fossil fuels such as oil, coal, and gasoline. Through everyday activities, humans burn these fuels, increasing harmful emissions, polluting the air, and lowering air quality.

Air quality levels differ by location. To compare air quality across locations, an Air Quality Index (AQI) is used. An AQI shows how polluted the air is and indicates whether the air quality is poor. Each nation has its own AQI with its own standards. According to AirNow (2016), United States AQI values range from 0 to 500; the higher the AQI value, the more polluted the area. The values are split into six categories: “good” (0 to 50), “moderate” (51 to 100), “unhealthy for sensitive groups” (101 to 150), “unhealthy” (151 to 200), “very unhealthy” (201 to 300), and “hazardous” (301 to 500). These six categories describe the level of danger to one’s health.
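The category ranges listed above can be expressed as a simple lookup. The following is a minimal sketch in Python, not part of the original analysis; the function name `aqi_category` is hypothetical, and the two sample values are the California averages reported later in this report.

```python
# Upper bound of each U.S. AQI category, as listed by AirNow (2016).
AQI_CATEGORIES = [
    (50, "good"),
    (100, "moderate"),
    (150, "unhealthy for sensitive groups"),
    (200, "unhealthy"),
    (300, "very unhealthy"),
    (500, "hazardous"),
]


def aqi_category(aqi):
    """Return the category name for a U.S. AQI value between 0 and 500."""
    if not 0 <= aqi <= 500:
        raise ValueError("AQI must be between 0 and 500")
    # The first upper bound that the value does not exceed decides the category.
    for upper_bound, name in AQI_CATEGORIES:
        if aqi <= upper_bound:
            return name


print(aqi_category(46))   # good
print(aqi_category(285))  # very unhealthy
```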

For many years, poor air quality has been a major environmental concern because of its health effects. This study looks in depth at two of the five pollutants mentioned earlier: Ozone (O3) and Particulate Matter (PM2.5). “The ozone layer found high in the upper atmosphere (the stratosphere) shields us from much of the sun’s ultraviolet radiation. However, ozone air pollution at ground level where we can breathe it (in the troposphere) causes serious health problems.” (State of the Air, 2014) O3 is not emitted directly by human activity, but it is strongly influenced by what we do in our day-to-day lives. When nitrogen oxides and volatile organic compounds (often emitted by vehicles and combustion) react in the presence of sunlight, they form ground-level ozone smog. (State of the Air, 2014) This makes O3 “the most complex, difficult to control, and pervasive of the six principal air pollutants.” (Environmental Protection Agency, 2016a) Particulate Matter (PM2.5), on the other hand, is a mixture of liquid and solid particles in the air. These particles range in size and are products of dust, burned fuels and other substances, smoke, and much more.

Both O3 and PM2.5 pose a threat to our respiratory health that can affect us in more ways than we realize. Respiratory diseases linked to these pollutants can even result in death. Across the 50 U.S. states, air quality levels vary by location and year. Using the U.S. AQI, we can examine average median and maximum AQI levels and see how they correlate with respiratory-related deaths over a period of time.


Methodology

We used a quantitative research methodology for this study. Online repositories such as AirNow, CDC WONDER, and the Environmental Protection Agency provide AQI median and maximum levels for all states, along with data on respiratory diseases and the percentage of deaths or people at risk, which we correlate with the AQI data. The data are organized below according to the scope of the project.

1. AQI Median and AQI Maximum

To find the relation between AQI Median and AQI Maximum, we considered data for five consecutive years (2011-2015) and examined how AQI relates to two groups of states: those with high and those with low levels of both O3 and PM2.5.

Figure 1: AQI Median and Maximum vs States- Dashboard


Fig 1.1 shows the AQI levels in all 50 states as a GeoMap visualization. The map shows the averages of AQI Median and AQI Maximum as point values and heat points, respectively. The adjacent figure shows the same averages for the years mentioned. Using this figure, we can examine the correlation between AQI Median and AQI Maximum across states.

Figs 1.2 and 1.3 show the top 10 states with the highest average AQI maximum and AQI median values. The overall report is based on facts and figures gathered from disparate sources, which helped us aggregate the states with the highest AQI median and maximum levels.
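The aggregation behind a "top states by average" view can be sketched as follows. The dashboards in this report were built in IBM Watson Analytics, so this Python version is only an illustration under stated assumptions: the records are hypothetical placeholders (not the report's data), and `top_states_by_average` is an assumed helper name.

```python
from collections import defaultdict

# Hypothetical records: (state, year, AQI median). Not the report's data.
records = [
    ("State A", 2011, 40), ("State A", 2012, 44),
    ("State B", 2011, 30), ("State B", 2012, 32),
    ("State C", 2011, 50), ("State C", 2012, 46),
]


def top_states_by_average(rows, n):
    """Average the value per state across years and return the n highest."""
    values = defaultdict(list)
    for state, _year, value in rows:
        values[state].append(value)
    averages = {state: sum(v) / len(v) for state, v in values.items()}
    # Sort states by their multi-year average, highest first.
    ranked = sorted(averages.items(), key=lambda item: item[1], reverse=True)
    return ranked[:n]


top_two = top_states_by_average(records, 2)
print(top_two)
```

Running the same function once on AQI median and once on AQI maximum gives the two ranked lists that the figures compare.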

2. Ozone and PM2.5 Concentration for AQI

To obtain specific data on the pollutants behind AQI levels that lead to respiratory diseases, we found data indicating that Ozone and PM2.5 are the two main pollutants associated with the increased number of lung diseases in all the states mentioned.

Figure 2: Ozone and PM2.5 Vs Twenty States- Dashboard


Fig 2.1 depicts the O3 and PM2.5 heat regions for ten consecutive years (2006-2015), where we considered the 4th maximum 8-hour Ozone level and the weighted mean of the 24-hour PM2.5 level as the relevant factors.

The year-wise comparison between these two pollutants in Fig 2.2 presents data that are not identifiable in Fig 2.1.

Fig 2.3 and 2.3 depict the Ozone and PM2.5 levels for the states that we aggregated in Fig 1.3 and 1.4, respectively. According to Fann et al. (2011), PM2.5 and O3 are directly related to increased mortality. We therefore wanted to distinguish the states that have the highest and lowest levels of those pollutants.

3. Mortality Rate

Furthermore, we wanted to build a view showing the mortality rate relative to the crude rate for the states mentioned.

Figure 3: Mortality Rate (due to respiratory diseases) Vs Twenty States- Dashboard


Fig 3.1 presents the combined yearly death rate in the states that have the highest and lowest O3 and PM2.5 levels, whereas Fig 3.2 depicts the crude rate (deaths per 100,000 population) due to respiratory diseases caused by the pollutants.

Fig 3.3 and 3.4 group the states that have the highest levels of O3 and PM2.5 and, consequently, a higher rate of deaths due to respiratory diseases.
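For reference, a crude rate is the number of deaths divided by the population, scaled to 100,000 people. The sketch below shows the arithmetic in Python; the death count and population are hypothetical placeholders, and `crude_rate` is an assumed helper name rather than anything from CDC WONDER.

```python
def crude_rate(deaths, population, per=100_000):
    """Deaths per `per` people (crude rate, not age-adjusted)."""
    if population <= 0:
        raise ValueError("population must be positive")
    return deaths * per / population


# Hypothetical state: 4,000 respiratory deaths in a population of 5,000,000.
rate = crude_rate(deaths=4_000, population=5_000_000)
print(rate)  # 80.0 deaths per 100,000 population
```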

4. Prediction

IBM Watson Analytics can provide efficient predictive analysis of the gathered data, so we used the software to estimate the mortality rate in the states with respect to Crude Rate.

Figure 4: Mortality Rate vs States - Predictive Dashboard

We set “Percentage of Deaths” as the “target” value in order to obtain a general prediction of the mortality rate for those states.


Results and Discussion

1. AQI Analysis

Fig. 1 shows us the geographic distribution of AQI median and AQI maximum

concentrations across the United States over a period of five years (2011-2015). The

interactive component of Fig 1 allows us to make multiple observations of AQI trends

for various states.

Figure 1: California (interactive dashboard)

The above figure shows the relationship between AQI median and AQI maximum for the state of California over a span of five years. When we select California in the map visualization, the rest of the figures in the dashboard update to display state-wise statistics. California has the highest average AQI median (46) and AQI maximum (285) of all the states. The Fig 1.1 bubble visualization shows the yearly trend for AQI median and AQI maximum. From these visualizations we can deduce an overall reduction of about 15% in California’s AQI median, from 34 in 2011 to 29 in 2015.

Figure 1: New Mexico (interactive dashboard)

Similarly, the above figure shows detailed statistics for New Mexico. The Fig 1.2 point visualization shows a sharply declining trend in AQI maximum, from 209 in 2011 to 89 in 2015, a significant reduction of 57%.
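The percentage reductions quoted for California and New Mexico follow from the start and end values given in the text. A minimal Python sketch of the arithmetic (the function name `percent_reduction` is ours, not from the dashboards):

```python
def percent_reduction(start, end):
    """Percentage drop from `start` to `end`, relative to `start`."""
    return (start - end) / start * 100


ca = percent_reduction(34, 29)    # California AQI median, 2011 to 2015
nm = percent_reduction(209, 89)   # New Mexico AQI maximum, 2011 to 2015

print(f"California: {ca:.1f}%")   # California: 14.7%
print(f"New Mexico: {nm:.1f}%")   # New Mexico: 57.4%
```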

Figs 1.2 and 1.3 show detailed visuals for the ten states with the highest AQI median and AQI maximum. Because California is the only state common to both figures, we conclude that AQI median and AQI maximum are not necessarily positively correlated: a state with a high AQI maximum can have a very low AQI median, and vice versa. In the above figure, New Mexico, a state with a significantly high AQI maximum, does not have a high median. We considered only the AQI median variable for further analysis, as it shows a more stable trend over the years.


2. Analysis of Ozone and PM2.5 levels

Fig. 2 shows visualizations of the concentration of ground-level Ozone and PM2.5 in various states over a span of ten years (2006-2015). The AQI median of a state is positively correlated with its ozone and PM2.5 levels: five of the ten states with high AQI median levels also have high ozone and PM2.5 levels.
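One way to quantify a positive correlation of this kind is the Pearson correlation coefficient. The report itself relies on the Watson Analytics dashboards rather than on an explicit coefficient, so the following Python sketch is only an illustration; the sample values are hypothetical and `pearson_r` is an assumed helper name.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length (at least 2)")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant data")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-state values, NOT the report's data.
aqi_median = [30, 35, 40, 45, 50]
ozone_level = [0.060, 0.064, 0.066, 0.071, 0.075]

r = pearson_r(aqi_median, ozone_level)
print(round(r, 3))
```

A value of r close to +1 indicates that states with a higher AQI median also tend to have higher pollutant levels.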

Figure 2: California (interactive dashboard)

The above figure helps us analyze the statistics for the state of New Mexico with detailed visualizations. New Mexico is among the states with the lowest ozone and PM2.5 levels, with an Ozone 4th Max of 26 (41.7% lower than California) and a PM2.5 Weighted Mean of 38 (74.8% lower than California). The ozone and PM2.5 trends in the bubble visualization show a reduction of 26% from 2006 to 2015.

Figs. 2.2 and 2.3 help us categorize the states according to their highest and lowest levels of ozone and PM2.5. The air quality of these states can be compared more clearly by noting whether each belongs to the high or the low ozone and PM2.5 category.


3. Analysis of Deaths due to Respiratory Diseases

The twenty states with the highest and lowest levels of ozone and PM2.5 are further analyzed by observing their crude death rate per 100,000 due to respiratory diseases from 2006 to 2014. The overall rate across all the states increased from 79.8 to 86.4 per 100,000 between 2006 and 2014.

Figure 3: 2006 Statistics (interactive dashboard)

The above figure shows an interactive analysis of the dashboard for the year 2006. In Fig 3.1, we can see that the death rate across all twenty states does not fluctuate much.


Figure 3: 2014 Statistics (interactive dashboard)

Comparing the statistics for 2006 with those of 2014, we can see the difference in death rates over the years. States with higher ozone and PM2.5 levels have a marginally higher number of deaths; similarly, states with lower ozone and PM2.5 levels have a lower number of deaths. The comparison of these statistics gives us insight into the relation between ozone and PM2.5 levels and the mortality rate due to respiratory diseases.


4. Prediction Analysis

Because the data we acquired were of excellent quality, it was straightforward to generate predictions for the target variable.

Figure 5: Data Quality Report

The percentage of deaths varies symmetrically from a minimum of 0.04 to a maximum of 0.12.

The percentage of deaths is predicted

based on the distributive variance of the

mortality rate data gathered for the

previous years.

Figure 6: Percentage of Deaths


Figure 6: Predicted Data - Highest Percentage of Deaths Vs States

The predictive data in the graph provide a range of mean percentage of deaths for the states where we found both high and low levels of O3 and PM2.5. They also identify Kentucky as having the highest mean value of deaths.

The statistical details show the high and low mean values of deaths.

Additionally, they reveal unusually low death rates in Alaska that might be a concern.

Figure 7: Statistical Analysis


5. Additional Information

Figure 8: Percentage of Deaths Vs Crude Rate

The figure shows how the percentage of deaths is predicted as a continuous target, which gives the predictive logic a linear regression approach. Crude rate reveals that there are thirty-six records/states where the average percentage of deaths will be more than 88%.
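To illustrate what a linear regression on a continuous target involves, the sketch below fits an ordinary least-squares line that predicts percentage of deaths from crude rate. The report does not describe the model Watson Analytics builds internally, so plain least squares is used here as a stand-in; the data points are hypothetical, and `fit_line` is an assumed helper name.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length (at least 2)")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical (crude rate, percentage of deaths) pairs, NOT the report's data.
crude_rates = [40.0, 60.0, 80.0, 100.0]
pct_deaths = [0.05, 0.07, 0.09, 0.11]

slope, intercept = fit_line(crude_rates, pct_deaths)

# Predict the target for a crude rate that is not in the sample.
predicted = slope * 90.0 + intercept
print(round(predicted, 3))
```

Treating the target as continuous is what makes a regression fit appropriate here; a categorical target would call for a classification model instead.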


Conclusion

Based on our results, we conclude that O3 and PM2.5 levels definitely have an effect on the number of respiratory-related deaths. Comparing Figure 3.3 and Figure 3.4, it is clear that the number of deaths is relatively high in the states with the highest O3 and PM2.5 levels and drastically lower in the states with the lowest O3 and PM2.5 levels. Figure 3.2 shows that the O3 and PM2.5 levels for all 50 states are increasing, although with fluctuations. This tells us that if these levels continue to increase, deaths from respiratory illnesses will increase as well.

As the trend is heading in a negative direction, it is best that we find ways to lower O3 and PM2.5 levels in hopes of decreasing the number of respiratory-related deaths per year. Government agencies in each state have developed ways to monitor and improve their air quality by aligning themselves with the requirements of the Clean Air Act (CAA), last amended by Congress in 1990. The Clean Air Act was and still is administered by the Environmental Protection Agency (EPA). The EPA as a whole “provides guidance and technical assistance to assist state planning, issues national emissions standards for new stationary sources, and reviews state plans to ensure that they comply with the Act. Preconstruction permits are required for major new and modified stationary sources.” (Environmental Protection Agency, 2016b)

One thing we cannot do is compare one state’s air quality program to another’s, because every state differs in how developed its environment is. A largely rural state and a largely urban state will have significantly different programs. The one step that can definitely be taken to improve AQI levels is for each state to continue following the CAA guidelines. “The Clean Air Act calls for state, local, federal and tribal governments to implement the Act in partnership to reduce pollution. Roles vary depending on the nature of the air pollution problem.” (Environmental Protection Agency, 2016b) As long as each state complies with the rules and regulations of the CAA, the growth of poor AQI levels should slow down, possibly to the point where AQI levels start decreasing. If particular areas get worse, the only suggestions would be to make the CAA guidelines stricter altogether or to revise the individual programs for those specific areas. These suggestions could possibly result in a lower number of respiratory-related deaths.

In arriving at our results, we faced quite a few limitations. First, finding the right data on health effects of poor AQI levels, or on any health effects, was fairly difficult. Data were available, but recent health-effects data on any topic were scarce. Where we did find recent health data, they covered perhaps one year, which is not enough to make predictions with. Second, because hospitals keep patient data confidential, we did not have access to data that could have given us better evidence. Because of these limitations, we were left with data sets that were somewhat inconsistent with each other, which also limited the comparisons we could make.

Because this research is in its initial stages, we were not financially equipped to purchase private health data from hospitals or medical agencies. Anyone extending this research might have to purchase data to discover new findings and correlations. By doing so, they may be able to compare O3 and PM2.5 levels to other health risks and diseases such as asthma, chronic obstructive pulmonary disease, and cardiovascular diseases. Extensive research on this topic could help in making stronger predictions for the future.


References

AirNow. (2016, January 28). Air Quality Index (AQI) Basics. Retrieved March 30, 2016,

from https://airnow.gov/index.cfm?action=aqibasics.aqi

American Lung Association (2014). State of the Air 2014. Retrieved from http://www.stateoftheair.org/2014/assets/ALA-SOTA-2014-Full.pdf

"CDC WONDER." CDC WONDER. Web. Retrieved 31 Mar. 2016, from https://wonder.cdc.gov/

Caiazzo, Fabio, Akshay Ashok, Ian A. Waitz, Steve H.L. Yim, and Steven R.H. Barrett. "Air Pollution and Early Deaths in the United States. Part I: Quantifying the Impact of Major Sectors in 2005." Atmospheric Environment 79 (2013): 198-208. Web.

Environmental Protection Agency. (2016a, February 22). Ozone (O3). Retrieved March

30, 2016, from https://www3.epa.gov/airtrends/aqtrnd95/o3.html

Environmental Protection Agency. (2016b, March 29). The Clean Air Act: A Partnership Among Governments. Retrieved March 31, 2016, from https://www.epa.gov/clean-air-act-overview/clean-air-act-partnership-among-governments

Fann, Neal, Amy D. Lamson, Susan C. Anenberg, Karen Wesson, David Risley, and

Bryan J. Hubbell. "Estimating the National Public Health Burden Associated with

Exposure to Ambient PM2.5 and Ozone." Risk Analysis 32.1 (2011): 81-95. Web.

HCUPnet: A Tool for Identifying, Tracking, and Analyzing National Hospital Statistics.

Web. Retrieved 31 Mar. 2016, from

https://hcupnet.ahrq.gov/HCUPnet.jsp?Id=B4147418EBDCAFAC