Exploratory Data Analysis of World Happiness Report
Photo by Tabea Damm on Unsplash |
Everyone has a favorite country that stays close to their heart wherever they go, or one they want to visit at least once in a lifetime. Some of the places I wish to visit are Australia, England, Italy, Switzerland and the USA, and I have numerous reasons in mind, like each country's infrastructure, natural scenery, food and a lot more. However, I have always wondered which country is the happiest in the world, and why. Is it a strong economy or mutual respect that makes people in a country happy, or is it health or a higher per capita income? I couldn't get answers to these questions before I stepped into the world of data science, but now that I have the skills to explore the data, why not discover what makes people in a country happy?
The Dataset
I found the World Happiness Report dataset on Kaggle. Click here to explore it yourself! It has 5 files containing World Happiness data from 2015 to 2019. We will be using the 2019 dataset, which has 9 columns.
Limitations
The dataset was last updated two years ago, in 2019. Therefore, the happiness ranks and the reasons associated with them might have changed.
The Project
1. Data Preparation
Start by importing pandas and loading the dataset. Before diving deeper, let's explore the data. happiness_df.shape gives the number of rows and columns in our dataset, where happiness_df is our DataFrame and .shape is the attribute (not a method, so no parentheses) that returns them as a (rows, columns) tuple.
Shape of our dataset |
We have 156 rows and 9 columns.
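Since the original code screenshots are not reproduced here, the loading step might look roughly like the sketch below. The four rows are made up for illustration and stand in for the Kaggle file, and the column names are assumptions modelled on the headings used in this article. With the real download you would pass the file path to pd.read_csv instead (the name 2019.csv is an assumption).

```python
import io

import pandas as pd

# Stand-in for the Kaggle file: four made-up rows with assumed column names.
csv_text = """Overall rank,Country or region,Score,GDP per capita,Social support,Healthy life expectancy,Freedom to make life choices,Generosity,Perceptions of corruption
1,Country A,7.5,1.35,1.55,0.98,0.60,0.20,0.35
2,Country B,6.1,1.10,1.40,0.85,0.50,0.15,0.10
3,Country C,4.8,0.70,1.10,0.60,0.40,0.25,0.08
4,Country D,3.2,0.30,0.60,0.30,0.20,0.18,0.05
"""

# With the real download this would be something like pd.read_csv("2019.csv").
happiness_df = pd.read_csv(io.StringIO(csv_text))

# .shape is an attribute holding a (rows, columns) tuple.
rows, columns = happiness_df.shape
print(f"{rows} rows, {columns} columns")
```

On this toy sample the shape is 4 rows by 9 columns; on the real 2019 file it is the 156 by 9 reported above.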
Now let's look at the names of those 9 columns. happiness_df.columns will help us do this.
Column names |
We will use happiness_df.info() to check the datatype and non-null count of each column.
Datatype and non-null count |
Every column has a non-null count of 156, meaning there are no null values in any column. All the columns except Country or region are numeric.
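The same checks can also be made explicit instead of being read off the info() printout. This is a minimal sketch on a made-up three-row sample; only the column names mirror the article, and the variable names null_counts and numeric_columns are my own.

```python
import pandas as pd

# Made-up rows for illustration; only the column names mirror the article.
happiness_df = pd.DataFrame({
    "Country or region": ["Country A", "Country B", "Country C"],
    "Score": [7.5, 6.1, 4.8],
    "GDP per capita": [1.35, 1.10, 0.70],
})

print(list(happiness_df.columns))  # column names
happiness_df.info()                # dtypes and non-null counts, printed to stdout

# Count missing values per column explicitly.
null_counts = happiness_df.isnull().sum()
print(null_counts)

# List the columns pandas treats as numeric.
numeric_columns = happiness_df.select_dtypes(include="number").columns.tolist()
print(numeric_columns)
```

A column of zeros from isnull().sum() confirms what the non-null counts imply, and select_dtypes shows which columns can be used in the statistics and correlations later on.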
2. Data Exploration
happiness_df.describe() gives basic statistics for all the numeric columns of a dataset. Let's use it to have a look at ours.
Statistics |
This is a very simple dataset, and so are the analyses, so we don't actually need all these statistics. However, the minimum and maximum values of score, GDP and the other variables might amaze you.
Now let's have a look at the top five and bottom five countries in our dataset.
Head of our dataset |
Tail of our dataset |
Finland is the happiest country and South Sudan is the least happy.
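Reading the happiest and least happy country from head() and tail() works because the file is already ordered by rank. As a hedged sketch on made-up rows, the same answer can be looked up in a way that does not depend on row order, using idxmax() and idxmin() on the Score column.

```python
import pandas as pd

# Made-up rows, already in rank order like the real file.
happiness_df = pd.DataFrame({
    "Country or region": ["Country A", "Country B", "Country C",
                          "Country D", "Country E", "Country F"],
    "Score": [7.5, 6.9, 6.1, 5.0, 4.1, 3.2],
})

print(happiness_df.describe())  # count, mean, std, min, quartiles, max
print(happiness_df.head(2))     # first rows
print(happiness_df.tail(2))     # last rows

# Order-independent lookup of the extremes.
happiest = happiness_df.loc[happiness_df["Score"].idxmax(), "Country or region"]
least_happy = happiness_df.loc[happiness_df["Score"].idxmin(), "Country or region"]
print(happiest, least_happy)
```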
3. Data Analysis and Visualisation
First, import all the libraries we are going to need and preset our chart styles.
Libraries and style preset |
Happiest countries
Let's look at the scores of the 20 happiest countries and plot them in a bar plot.
happiest_countries = happiness_df[['Country or region', 'Score']].sort_values('Score', ascending = False).head(20) will give us the top 20 happiest countries.
Happiest countries |
Plot the results in a bar plot.
Happiest countries chart |
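The chart code itself is not shown above, so here is one way the selection and the bar plot could be written with matplotlib. The data is made up, top_n is 3 instead of 20 to suit the tiny sample, and the "ggplot" style is only an assumed stand-in for the article's style preset.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the script runs anywhere; not needed in a notebook
import matplotlib.pyplot as plt
import pandas as pd

plt.style.use("ggplot")  # assumed style preset; the article's own preset is not shown

# Made-up, deliberately unsorted rows.
happiness_df = pd.DataFrame({
    "Country or region": ["Country A", "Country B", "Country C", "Country D", "Country E"],
    "Score": [5.0, 7.5, 3.2, 6.1, 4.1],
})

top_n = 3  # the article uses 20
happiest_countries = (
    happiness_df[["Country or region", "Score"]]
    .sort_values("Score", ascending=False)
    .head(top_n)
)

fig, ax = plt.subplots(figsize=(8, 4))
ax.barh(happiest_countries["Country or region"], happiest_countries["Score"])
ax.invert_yaxis()  # put the highest score at the top
ax.set_xlabel("Score")
ax.set_title("Happiest countries")
fig.tight_layout()
```

Horizontal bars are a design choice here: with 20 country names, labels on the y-axis stay readable without rotation.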
Least happy countries
Let's do the same with the least happy countries. least_happy = happiness_df[['Country or region', 'Score']].sort_values('Score', ascending = False).tail(20) keeps the descending sort and takes the last 20 rows, which are the 20 lowest scores.
Plotting the results, we get:
Least happy countries |
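As a small aside, pandas also offers nsmallest() for this kind of selection. The sketch below, on made-up rows, shows that it picks the same countries as the sort-then-tail approach, only listed from the lowest score upward.

```python
import pandas as pd

# Made-up, deliberately unsorted rows.
happiness_df = pd.DataFrame({
    "Country or region": ["Country A", "Country B", "Country C", "Country D", "Country E"],
    "Score": [5.0, 7.5, 3.2, 6.1, 4.1],
})

n = 2  # the article uses 20

# Approach from the article: sort descending, keep the last n rows.
least_happy = (
    happiness_df[["Country or region", "Score"]]
    .sort_values("Score", ascending=False)
    .tail(n)
)

# Alternative: ask directly for the n smallest scores (lowest score first).
least_happy_alt = happiness_df.nsmallest(n, "Score")[["Country or region", "Score"]]

print(least_happy)
print(least_happy_alt)
```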
Score vs GDP per capita
First, have a look at the countries with the highest GDP per capita. highest_gdp = happiness_df[['Country or region', 'GDP per capita', 'Score']].sort_values('GDP per capita', ascending=False).head(20)
Countries with highest GDP per capita |
Surprised to see that Qatar, which has the highest GDP per capita, is not among the happiest countries? Let's explore the relationship between happiness score and GDP per capita.
Subplots |
Line chart and scatter plot |
It's hard to see a clear pattern between the two in these charts, right?
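The plotting code for the two panels is not reproduced above. A rough sketch of how they might be built follows; the assumption is that the line chart draws both columns in rank order and the scatter plot puts GDP per capita against score. The rows are made up, and the Pearson coefficient at the end is an extra of mine that puts a number next to the visual impression.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; not needed in a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Made-up rows in rank order.
happiness_df = pd.DataFrame({
    "Country or region": ["Country A", "Country B", "Country C", "Country D", "Country E"],
    "Score": [7.5, 6.1, 5.0, 4.1, 3.2],
    "GDP per capita": [1.35, 1.10, 0.90, 0.50, 0.30],
})

fig, (line_ax, scatter_ax) = plt.subplots(1, 2, figsize=(12, 4))

# Left panel: both columns plotted against row position (rank order).
line_ax.plot(happiness_df["Score"], label="Score")
line_ax.plot(happiness_df["GDP per capita"], label="GDP per capita")
line_ax.set_xlabel("Row position (rank order)")
line_ax.legend()

# Right panel: one point per country.
scatter_ax.scatter(happiness_df["GDP per capita"], happiness_df["Score"])
scatter_ax.set_xlabel("GDP per capita")
scatter_ax.set_ylabel("Score")
fig.tight_layout()

# Pearson correlation between the two columns (always between -1 and 1).
pearson_r = happiness_df["Score"].corr(happiness_df["GDP per capita"])
print(f"Pearson r = {pearson_r:.2f}")
```

The same three steps can be repeated for each of the other features by swapping the column name.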
Score vs Social support
Repeat the steps above and explore the relationship between happiness score and social support in a country.
countries with highest social support |
Finland, the happiest country, also has the second highest social support. Other countries among the happiest also appear here, such as Denmark, Norway, Iceland, the Netherlands and New Zealand.
Score vs Healthy life expectancy
Singapore has the highest healthy life expectancy, but it's not among the happiest countries. However, other countries here, such as Switzerland, Canada and Australia, are among the happiest.
countries with highest healthy life expectancy |
Line chart and scatter plot |
In the scatter plot of healthy life expectancy we can see clusters as well, along with some outliers.
Score vs Freedom to make life choices
Uzbekistan has the highest freedom to make life choices but is not among the top 20 happiest countries, whereas Finland, Denmark, Norway and others are present here.
countries with highest freedom to make life choices |
Line chart and scatter plot |
Can you see some clusters along with outliers?
Score vs Generosity
Notice that, alongside the happiest countries, Syria, one of the least happy countries, is among those with the highest generosity.
countries with highest generosity |
Line chart and scatter plot |
See the clusters near happiness scores of 4, 5 and 7?
Score vs Perceptions of corruption
4. Impactful features?
We have analysed all the features and their relationship with the happiness score and rank of a country, but we still can't answer 'what makes people in a country happy?'. Let's plot a heatmap of the correlations between all the features in our dataset.
Correlation is a statistical measure of how strongly two variables or features move together. Let's see which variables have the highest correlation with happiness score.
happiness_df.corr() will give the correlation. Assign the result to a variable; we have assigned it to happiness_relationship to use later. (In recent pandas versions, the text column Country or region has to be excluded, for example with happiness_df.corr(numeric_only=True).)
sns.heatmap(happiness_relationship, annot = True) will give us the desired results. Here, happiness_relationship is our correlation matrix and annot = True prints the data value in each cell of the heatmap.
heatmap of correlation |
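Put together, the correlation and heatmap steps might look like the sketch below. The rows are made up and only three numeric columns are included to keep it short. numeric_only=True is used because pandas 2.x raises an error when corr() meets a text column such as Country or region.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; not needed in a notebook
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Made-up rows; the real dataset has more numeric columns.
happiness_df = pd.DataFrame({
    "Country or region": ["Country A", "Country B", "Country C", "Country D", "Country E"],
    "Score": [7.5, 6.1, 5.0, 4.1, 3.2],
    "GDP per capita": [1.35, 1.10, 0.90, 0.50, 0.30],
    "Generosity": [0.20, 0.10, 0.25, 0.12, 0.18],
})

# Pairwise Pearson correlations between the numeric columns only.
happiness_relationship = happiness_df.corr(numeric_only=True)

fig, ax = plt.subplots(figsize=(6, 5))
# annot=True writes the correlation value inside each cell.
sns.heatmap(happiness_relationship, annot=True, ax=ax)
fig.tight_layout()
```

The result is a square matrix with 1.0 on the diagonal, since every column correlates perfectly with itself.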
GDP per capita, social support and healthy life expectancy each have a correlation of 0.75 or higher with score, which indicates a strong positive relationship.
Let's have a look at the charts of these variables.
Score vs GDP per capita |
Score vs Social support |
Score vs Healthy life expectancy |
Look how each of these variables increases with score. Hence, of all the features, these three appear to have the highest impact on the happiness score of a country.
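Instead of reading the values off the heatmap, the features can also be ranked by their correlation with Score directly. This is a minimal sketch on made-up rows; the names score_correlations and strong are hypothetical, and the 0.75 cut-off is the threshold used above.

```python
import pandas as pd

# Made-up rows; only two features are included to keep the sketch short.
happiness_df = pd.DataFrame({
    "Country or region": ["Country A", "Country B", "Country C", "Country D", "Country E"],
    "Score": [7.5, 6.1, 5.0, 4.1, 3.2],
    "GDP per capita": [1.35, 1.10, 0.90, 0.50, 0.30],
    "Generosity": [0.20, 0.10, 0.25, 0.12, 0.18],
})

# Correlation of every numeric feature with Score, strongest first.
score_correlations = (
    happiness_df.corr(numeric_only=True)["Score"]
    .drop("Score")  # a column's correlation with itself is always 1
    .sort_values(ascending=False)
)

# Keep only the features at or above the 0.75 threshold.
strong = score_correlations[score_correlations >= 0.75]

print(score_correlations)
print(strong)
```

Keep in mind that a high correlation shows that two columns move together, which on its own does not prove that one causes the other.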
5. Conclusions
- Finland, the happiest country in the world, doesn't have the highest GDP per capita. However, GDP per capita has a strong positive relationship with the happiness score.
- Qatar has the highest GDP per capita but ranks 28th in the World Happiness Report.
- Countries with the highest happiness scores also have high social support and healthy life expectancy. However, some countries that are not among the happiest also have high social support and healthy life expectancy.
- GDP per capita, social support and healthy life expectancy have a strong positive relationship with happiness score, suggesting that these factors together make people in a country happy.
Using simple Python syntax, we have identified the countries with the happiest people and the likely reasons behind their happiness. In the upcoming articles, we will explore more complicated datasets and answer even more complicated questions. Till then, happy exploring.
Thank you for reading. Feel free to give any feedback!