Poverty is a multidimensional issue that is difficult to describe based on income alone. The United States is still affected by poverty and will likely continue to be.
In the United States, poverty is defined by an income threshold: an individual under age 65 who earns less than $13,064 per year, or an individual aged 65 or older who earns less than $12,043 per year, is considered to be living in poverty 1. Additional thresholds apply depending on family size.
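As a minimal sketch of how the two individual thresholds quoted above would be applied, the helper below compares an annual income against the threshold for the person's age group. The function name and constants are hypothetical and are not part of the Census methodology or of this project's code; family-size thresholds are not covered.

```python
# Thresholds for a single individual, as quoted in the text above (dollars per year).
THRESHOLD_UNDER_65 = 13_064
THRESHOLD_65_PLUS = 12_043


def below_poverty_threshold(annual_income, age):
    """Return True if a single individual's income falls below the threshold for their age group."""
    threshold = THRESHOLD_65_PLUS if age >= 65 else THRESHOLD_UNDER_65
    return annual_income < threshold


# The same income can fall on different sides of the line depending on age.
print(below_poverty_threshold(12_500, age=30))
print(below_poverty_threshold(12_500, age=70))
```

The same income of $12,500 is below the threshold for someone under 65 but above it for someone aged 65 or older, which is why the age split matters when counting.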
The United Nations, however, describes poverty more comprehensively, looking beyond its economic impact: "Poverty is also social, political and cultural. Moreover, it is considered to undermine human rights - economic (the right to work and have an adequate income), social (access to health care and education), political (freedom of thought, expression and association) and cultural (the right to maintain one's cultural identity and be involved in a community's cultural life) 2 ."
For the purposes of this project, poverty will be analyzed in economic terms as defined by the United States Census Bureau. The Census poverty data for 2017 was downloaded for use in this project. Variables and descriptions are included in the workbook. Overall counts and percentages are included per state and county, as well as for the United States as a whole.
The purpose of this project is to visualize poverty by county and state in the United States, to better understand the magnitude of this issue.
Please download the Excel workbook "PovertyEstimates.xlsx" at: https://www.ers.usda.gov/data-products/county-level-data-sets/download-data/
Poverty estimates for the U.S., States, and counties, 2017 (see second tab in this workbook for variable name descriptions)
Source: U.S. Census Bureau, Model-based Small Area Income & Poverty Estimates (SAIPE) - https://www.census.gov/programs-surveys/saipe.html
#Import numpy, pandas, seaborn, matplotlib and cufflinks
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
%matplotlib inline
import cufflinks as cf
#Read in the dataset. The FIPS codes are read as strings because leading zeros are otherwise dropped.
#Adjust the filename if your download has a different extension (.xls vs .xlsx).
pov=pd.read_excel("PovertyEstimates.xls", dtype={'FIPSTATE':str, 'FIPSCOUNTY':str})
pov_states=pd.read_excel("Poverty States.xlsx")
pov_us=pd.read_excel("United States Poverty.xlsx")
#Change Columns of interest to strings (they are easier to work with as strings)
pov[['State', 'Area_Name']] = pov[['State', 'Area_Name']].astype(str)
pov.columns
#View the first 5 rows to validate the load.
pov.head(5)
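To illustrate why the FIPS columns are read as strings, here is a small sketch on a toy frame (the rows are illustrative values, not taken from the workbook). When the codes are stored as integers, joining them produces ambiguous identifiers; padding each part back to its fixed width (2 digits for the state, 3 for the county) restores the standard 5-digit code.

```python
import pandas as pd

# Toy frame standing in for the FIPS columns after an integer read; illustrative values only.
toy = pd.DataFrame({"FIPSTATE": [1, 6, 48], "FIPSCOUNTY": [1, 37, 0]})

# Joined without padding, the leading zeros are lost and the codes have inconsistent lengths.
as_int = toy["FIPSTATE"].astype(str) + toy["FIPSCOUNTY"].astype(str)

# Padding each part to its fixed width gives proper 5-digit FIPS codes.
as_fips = toy["FIPSTATE"].astype(str).str.zfill(2) + toy["FIPSCOUNTY"].astype(str).str.zfill(3)

print(as_int.tolist())
print(as_fips.tolist())
```

A county code of '000' in the last row marks a state-level record rather than a county, which is the convention the county filter later in this notebook relies on.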
In this dataset, we rely heavily on two columns: 'POVALL_2017', the overall count of people in poverty, and 'PCTPOVALL_2017', the overall percentage of people in poverty for that particular area.
These two columns of interest have only one NaN (missing) value, which belongs to the United States row, as it does not have an overall poverty percentage associated with it. Given this, NaN values will not be dropped. For data pertaining to states and counties, there are no missing values.
#Missing values in data
count_nan = len(pov) - pov.count()
count_nan
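To see which rows carry the missing values, and not only how many there are per column, something like the following could be used. It is a sketch on made-up data: the column names match the dataset, but the rows and numbers are hypothetical.

```python
import numpy as np
import pandas as pd

# Toy data: one national row with a missing percentage and two complete county rows.
toy = pd.DataFrame({
    "Area_Name": ["Nation", "County A", "County B"],
    "POVALL_2017": [300, 100, 200],
    "PCTPOVALL_2017": [np.nan, 12.5, 18.0],
})

# Missing values per column (equivalent to len(toy) - toy.count()).
missing_per_column = toy.isna().sum()

# Rows that contain at least one missing value.
rows_with_missing = toy[toy.isna().any(axis=1)]

print(missing_per_column)
print(rows_with_missing["Area_Name"].tolist())
```

Listing the affected rows makes it easy to confirm that a missing value belongs to an aggregate row and not to a state or county.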
#Descriptive Statistics
pov[['POVALL_2017', 'PCTPOVALL_2017']].describe()
First, we will look at poverty overall within the United States, and then take a look into state and county.
#Count of persons living in poverty in the United States as a whole, along with the total percentage.
overall_count=pov_us['POVALL_2017'].max()
overall_pct_us=pov_us['PCTPOVALL_2017'].max()
print("Overall Estimated poverty count by persons in The United States: {}".format(overall_count))
print("Overall Estimated poverty percentage in The United States: {}".format(overall_pct_us))
# Pie chart to show poverty
labels = 'Living Above Poverty Threshold', 'Living Below Poverty Threshold'
sizes = [86.6, 13.4]
colors = ['lightblue', 'dodgerblue']
explode = (0, 0.1)
fig1, ax1 = plt.subplots()
ax1.pie(sizes, explode=explode, colors=colors,labels=labels, autopct='%1.1f%%',textprops={'color':"black"},
shadow=True, startangle=90)
ax1.axis('equal')
fig = plt.gcf()
fig.set_size_inches(12,8)
plt.title("Living Above vs. Below the Poverty Threshold, United States, 2017", color='black',bbox=dict(facecolor='none', edgecolor='black', boxstyle='round,pad=1'))
plt.show()
In 2017, the population of the United States was 324,459,463 3. An estimated 42,583,651 persons (13.4% of the total population) were living in poverty. The pie chart above conveys how large a share of the population this is: more than 1 out of every 8 people in the United States lived in poverty.
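The "1 in N" phrasing follows directly from the percentage. As a quick check, using only the 13.4% figure quoted above:

```python
pct_in_poverty = 13.4  # percentage quoted in the text above

# A rate of p percent corresponds to one person in every 100 / p.
one_in_n = 100 / pct_in_poverty

print("About 1 in {:.1f} people".format(one_in_n))
```

Since 1 in 8 would correspond to 12.5%, a rate of 13.4% is slightly more than 1 person in 8.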
Next, we look at poverty by state.
#States included in output
overall_count_state=pov_states['POVALL_2017'].count()
#States with an overall percentage of poverty greater than or equal to 10%, and 20%
overall_pct_state_10=len(pov_states[pov_states['PCTPOVALL_2017'] >= 10.0])
overall_pct_state_20=len(pov_states[pov_states['PCTPOVALL_2017'] >= 20.0])
#10 States with the highest poverty counts and percentages
pct_state=pov_states.sort_values('PCTPOVALL_2017', ascending=False)[:10]
cnt_state=pov_states.sort_values('POVALL_2017', ascending=False)[:10]
print("Total States included: {}".format(overall_count_state))
print("States with overall poverty greater than or equal to 10%: {}".format(overall_pct_state_10))
print("States with overall poverty greater than or equal to 20%: {}".format(overall_pct_state_20))
#10 States with the highest count of persons living in poverty
print(cnt_state.State)
#10 States with the highest percentage of population living in poverty
print(pct_state.State)
plt.figure(figsize=(20,12))
ax = sns.barplot(x="State", y="POVALL_2017", data=pov_states,
palette="Blues_d")
plt.xlabel("States")
plt.ylabel("Poverty Counts")
plt.title("State by Poverty Counts, 2017")
plt.show()
sorted_pov_states = pov_states.sort_values('PCTPOVALL_2017')
plt.figure(figsize=(20,12))
ax = sns.barplot(x="State", y="PCTPOVALL_2017", data=sorted_pov_states,
palette="Blues_d")
plt.xlabel("States")
plt.ylabel("Poverty Percentages")
plt.title("State by Poverty Percentages, 2017")
plt.show()
Many states have a poverty percentage greater than or equal to 10%. There are no states that have a poverty percentage equal to or greater than 20%. Provided above are visualizations for state by poverty counts (individuals living in poverty) and state by poverty percentages (percentage of state's population that is living in poverty), and the results are vastly different.
The 10 states with the highest poverty counts overall are: California, Texas, Florida, New York, Ohio, Illinois, Pennsylvania, Georgia, North Carolina, and Michigan. The 10 states with the highest percentage of poverty are: Mississippi, Louisiana, New Mexico, West Virginia, Kentucky, Alabama, District of Columbia, Arkansas, Oklahoma, and South Carolina.
A likely explanation for the vast differences is population size. New Mexico, for instance, has just over 2 million residents in total. A large portion of the state is still very rural, with few industries and opportunities to earn higher wages in those areas. California is home to nearly 40 million residents, so far more individuals there can fall at or below the poverty threshold, yet the overall percentage of Californians living in poverty is lower than in New Mexico.
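The effect can be reproduced with a toy example. The populations and poverty counts below are hypothetical and chosen only to show how ranking by count and ranking by percentage can disagree; they are not taken from the dataset.

```python
import pandas as pd

# Hypothetical populations and poverty counts, for illustration only.
toy = pd.DataFrame({
    "State": ["Large state", "Small state"],
    "population": [40_000_000, 2_000_000],
    "in_poverty": [5_000_000, 400_000],
})

# Percentage of each state's population living in poverty.
toy["pct_in_poverty"] = 100 * toy["in_poverty"] / toy["population"]

top_by_count = toy.sort_values("in_poverty", ascending=False).iloc[0]["State"]
top_by_pct = toy.sort_values("pct_in_poverty", ascending=False).iloc[0]["State"]

print(toy[["State", "in_poverty", "pct_in_poverty"]])
print("Highest count:", top_by_count)
print("Highest percentage:", top_by_pct)
```

The large state leads on the raw count while the small state leads on the percentage, which is the same pattern seen between the two state-level charts.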
This section focuses on poverty by county, in particular the counties that are the most impoverished.
#Count the state-level and national rows (county code '000'), then keep only the counties
extract_states=(pov['FIPSCOUNTY'] == '000').sum()
print(extract_states)
counties_only=pov[pov['FIPSCOUNTY']!='000']
# Total number of counties
sum_counties=len(counties_only)
# Number of counties where the poverty percentage is at or above 10%, 20%, 30%, 40%, 50%
n_greater_10 = len(counties_only[counties_only['PCTPOVALL_2017'] >= 10.0])
n_greater_20 = len(counties_only[counties_only['PCTPOVALL_2017'] >= 20.0])
n_greater_30 = len(counties_only[counties_only['PCTPOVALL_2017'] >= 30.0])
n_greater_40 = len(counties_only[counties_only['PCTPOVALL_2017'] >= 40.0])
n_greater_50 = len(counties_only[counties_only['PCTPOVALL_2017'] >= 50.0])
print("Total counties: {}".format(sum_counties))
print("Counties with poverty equal to or greater than 10%: {}".format(n_greater_10))
print("Counties with poverty equal to or greater than 20%: {}".format(n_greater_20))
print("Counties with poverty equal to or greater than 30%: {}".format(n_greater_30))
print("Counties with poverty equal to or greater than 40%: {}".format(n_greater_40))
print("Counties with poverty equal to or greater than 50%: {}".format(n_greater_50))
#The counties that have the highest count of individuals living in poverty.
pct_county=counties_only.sort_values('PCTPOVALL_2017',ascending=False)[:10]
cnt_county=counties_only.sort_values('POVALL_2017',ascending=False)[:10]
print (cnt_county[['Area_Name', 'State', 'POVALL_2017']])
#The counties that have the highest percentage of population living in poverty.
print (pct_county[['Area_Name', 'State', 'PCTPOVALL_2017']])
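The threshold counts above repeat the same expression five times. One possible way to condense them, sketched here on a made-up stand-in for counties_only (the percentages are hypothetical), is to loop over the thresholds and also report each count as a share of all counties:

```python
import pandas as pd

# Stand-in for counties_only; the percentages are made up for illustration.
toy_counties = pd.DataFrame({"PCTPOVALL_2017": [8.0, 12.5, 21.0, 33.3, 52.1]})

thresholds = [10, 20, 30, 40, 50]

# Number of counties at or above each threshold.
counts = {t: int((toy_counties["PCTPOVALL_2017"] >= t).sum()) for t in thresholds}

for t, n in counts.items():
    share = 100 * n / len(toy_counties)
    print("Counties with poverty >= {}%: {} ({:.0f}% of counties)".format(t, n, share))
```

Reporting the share alongside the count makes statements such as "over 80% of counties" directly checkable from the output.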
The dataset includes a total of 3,142 counties. As with the state-level poverty counts, counties located in highly populated states such as California and Texas have the highest numbers of individuals living in poverty.
The vast majority of counties (over 80%) had at least 10% of residents living in poverty in 2017. Fewer counties had 20% or more of residents living in poverty. Only two counties had 50% or more of residents living in poverty: Ziebach County and Todd County, both located in South Dakota. Of the 10 counties with the highest poverty percentages in the United States, 5 are located in South Dakota. Below is an interactive map of poverty in the United States for 2017.
#import plotly in order to visualize data in map form
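try:
    import plotly.plotly as py  # plotly versions before 4
except ImportError:
    import chart_studio.plotly as py  # plotly 4+ moved this module to the chart_studio package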
import plotly.figure_factory as ff
from plotly.offline import download_plotlyjs, init_notebook_mode, plot, iplot
import warnings
warnings.filterwarnings('ignore')
pov['FIPSTATE'] = pov['FIPSTATE'].apply(lambda x: str(x).zfill(2))
pov['FIPSCOUNTY'] = pov['FIPSCOUNTY'].apply(lambda x: str(x).zfill(3))
pov['FIPStxt'] = pov['FIPSTATE'] + pov['FIPSCOUNTY']
colorscale = ["#f7fbff", "#ebf3fb", "#deebf7", "#d2e3f3", "#c6dbef", "#b3d2e9", "#9ecae1",
"#85bcdb", "#6baed6", "#57a0ce", "#1f77b4","#4292c6", "#3082be", "#2171b5", "#1361a9",
"#08519c", "#0b4083", "#08306b"
]
endpts = list(np.linspace(1, 50, len(colorscale) - 1))
fips = pov['FIPStxt'].tolist()
values = pov['PCTPOVALL_2017'].tolist()
fig = ff.create_choropleth(
fips = fips, values = values, scope = ['usa'],
binning_endpoints = endpts, colorscale = colorscale,
state_outline={'color': 'rgb(15, 15, 55)','width': .5},
show_state_data = True,
show_hover = True, centroid_marker = {
'opacity': 0
},
asp = 3.0,
title = 'USA by County Poverty Percentages, 2017 ',
legend_title = '% Poverty'
)
py.iplot(fig, filename = 'choropleth_full_usa')
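The relationship between the colorscale and the binning endpoints can be checked separately from the map. The sketch below only illustrates that 17 evenly spaced endpoints between 1 and 50 split the percentages into 18 intervals, one per color; it uses NumPy's digitize as a stand-in and makes no claim about how plotly assigns colors internally. The sample percentages are illustrative.

```python
import numpy as np

n_colors = 18  # length of the colorscale list used for the map
endpts = np.linspace(1, 50, n_colors - 1)  # 17 endpoints define 18 intervals

# np.digitize returns the index of the interval each value falls into:
# 0 means below the first endpoint, len(endpts) means at or above the last one.
sample_pcts = np.array([0.5, 13.4, 25.0, 56.7])  # illustrative percentages
bin_index = np.digitize(sample_pcts, endpts)

print(len(endpts), "endpoints ->", len(endpts) + 1, "intervals")
print(bin_index.tolist())
```

Because the last endpoint is 50, every county at or above 50% lands in the same top interval, so the darkest color does not distinguish between them.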
In conclusion, poverty is an issue that warrants concern. It affects a large portion of the population of the United States and can have lasting effects on the areas where it is concentrated. Special care should be given to understanding the multifaceted issues that can cause poverty and to addressing them respectfully.