Photo by Tabea Damm on Unsplash

Everyone has a favorite country that stays close to their heart wherever they go, or one they want to visit at least once in a lifetime. Some of the places I wish to visit are Australia, England, Italy, Switzerland and the USA, and I have numerous reasons in mind: each country's infrastructure, natural scenery, food and a lot more. However, I have always wondered which country is the happiest in the world, and why. Is it a strong economy or mutual respect that makes people in a country happy, or is it health or a higher per capita income? I couldn't answer these questions before I stepped into the world of data science, but now that I have the skills to explore the data, why not discover what makes people in a country happy?

The Dataset

I found the World Happiness Report dataset on Kaggle. Click here to explore it yourself! It has 5 files that contain World Happiness data from 2015 to 2019. We will be using the 2019 dataset, which has 9 columns.

Limitations

The dataset was last updated two years ago, in 2019. Therefore, the happiness ranks and the reasons associated with them might have changed since then.

The Project

1. Data Preparation

Start by importing pandas and loading the dataset. Now, let's explore the data before diving deeper into it. happiness_df.shape gives the total number of rows and columns in our dataset, where happiness_df is our dataset and .shape is the attribute that holds its shape as a (rows, columns) tuple.

Shape of our dataset

We have 156 rows and 9 columns.
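The loading step can be sketched roughly as below. To keep the snippet runnable without the Kaggle download, it reads a tiny stand-in CSV with made-up rows and only three of the nine columns; the file name 2019.csv mentioned in the comment is an assumption about how the downloaded file is named.

```python
import io

import pandas as pd

# Made-up stand-in rows (not real report values). With the real download,
# this would be pd.read_csv("2019.csv"); that file name is an assumption.
sample_csv = io.StringIO(
    "Country or region,Score,GDP per capita\n"
    "Country A,7.5,1.30\n"
    "Country B,6.1,1.10\n"
    "Country C,3.2,0.35\n"
)
happiness_df = pd.read_csv(sample_csv)

# .shape is an attribute (no parentheses) holding (rows, columns)
n_rows, n_cols = happiness_df.shape
print(f"{n_rows} rows, {n_cols} columns")
```

On the full 2019 file the same two lines at the end would report the 156 rows and 9 columns mentioned above.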

Now let's look at the names of those 9 columns. happiness_df.columns will help us do this.

Column names


We will use happiness_df.info() to check the datatype and non-null count of each column.

Datatype and non-null count

See that every column has a non-null count of 156, meaning there are 0 null values in each column. All the columns except Country or region are numeric.
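If you prefer to check these two facts in code rather than read them off the printed summary, something like the following should work. The frame here is a small stand-in with made-up values, not the real dataset.

```python
import pandas as pd

# Stand-in frame with made-up values; swap in the real happiness_df
happiness_df = pd.DataFrame({
    "Country or region": ["Country A", "Country B", "Country C"],
    "Score": [7.5, 6.1, 3.2],
    "Social support": [1.5, 1.4, 0.6],
})

# Prints dtype and non-null count for every column
happiness_df.info()

# Number of missing values per column
null_counts = happiness_df.isnull().sum()

# Names of the numeric columns only
numeric_cols = happiness_df.select_dtypes(include="number").columns.tolist()

print(null_counts)
print(numeric_cols)
```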

2. Data Exploration

happiness_df.describe() gives basic statistics for all the numeric columns of a dataset. Let's use it to have a look at ours.

Statistics

This is a very simple dataset, and so is the analysis. Therefore, we don't actually need all these statistics. However, the minimum and maximum score, GDP and other variables might amaze you.
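Since only the minimum and maximum are of interest here, one option is to pull just those two rows out of the describe() result. This is a small sketch on a stand-in frame with made-up values.

```python
import pandas as pd

# Stand-in frame with made-up values; swap in the real happiness_df
happiness_df = pd.DataFrame({
    "Country or region": ["Country A", "Country B", "Country C"],
    "Score": [7.5, 6.1, 3.2],
    "GDP per capita": [1.30, 1.10, 0.35],
})

# describe() skips the text column and summarises the numeric ones
stats = happiness_df.describe()

# Keep only the rows we care about
min_max = stats.loc[["min", "max"]]
print(min_max)
```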

Now let's have a look at the top five and bottom five countries in our dataset.

Head of our dataset

Tail of our dataset

Finland is the happiest country and South Sudan is the least happy.
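head() and tail() only show the happiest and least happy countries because the file is already ordered by rank. A sketch that does not rely on the row order could use idxmax() and idxmin() instead; the frame below is a stand-in with made-up values.

```python
import pandas as pd

# Stand-in frame with made-up values, ordered from happiest to least happy
happiness_df = pd.DataFrame({
    "Country or region": ["Country A", "Country B", "Country C", "Country D"],
    "Score": [7.5, 6.1, 4.4, 3.2],
})

top_rows = happiness_df.head(2)     # first rows of the frame
bottom_rows = happiness_df.tail(2)  # last rows of the frame

# Order-independent lookup of the extremes
happiest = happiness_df.loc[happiness_df["Score"].idxmax(), "Country or region"]
least_happy = happiness_df.loc[happiness_df["Score"].idxmin(), "Country or region"]

print(happiest, least_happy)
```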

3. Data Analysis and Visualisation

First import all the libraries we are going to need and preset our chart styles.

Libraries and style preset

Happiest countries

Let's look at the scores of the top 20 happiest countries and plot them in a bar plot.

happiest_countries = happiness_df[['Country or region', 'Score']].sort_values('Score', ascending = False).head(20) will give us the top 20 happiest countries.

Happiest countries

Plot the results in a bar plot


Happiest countries chart
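The sort-then-plot step might look roughly like this. It uses plain matplotlib, which may differ from the plotting calls behind the chart above, and a stand-in frame with made-up values; TOP_N is a hypothetical name, set to 3 here where the article uses 20.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Stand-in frame with made-up values; swap in the real happiness_df
happiness_df = pd.DataFrame({
    "Country or region": ["Country A", "Country B", "Country C",
                          "Country D", "Country E"],
    "Score": [6.1, 7.5, 3.2, 5.0, 4.4],
})

TOP_N = 3  # hypothetical name; the article uses 20 on the full dataset

# Sort by score, highest first, and keep the first TOP_N rows
happiest_countries = (
    happiness_df[["Country or region", "Score"]]
    .sort_values("Score", ascending=False)
    .head(TOP_N)
)

# Horizontal bars keep long country names readable
fig, ax = plt.subplots(figsize=(8, 4))
ax.barh(happiest_countries["Country or region"], happiest_countries["Score"])
ax.invert_yaxis()  # put the happiest country at the top
ax.set_xlabel("Score")
ax.set_title("Happiest countries")
fig.tight_layout()
```

In a notebook, drop the matplotlib.use("Agg") line and the chart renders inline.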

Least happy countries

Let's do the same with the least happy countries. least_happy = happiness_df[['Country or region', 'Score']].sort_values('Score', ascending = False).tail(20) gives the 20 least happy countries. Plotting the results, we get

Least happy countries

Score vs GDP per capita

First have a look at the countries with the highest GDP per capita. highest_gdp = happiness_df[['Country or region', 'GDP per capita', 'Score']].sort_values('GDP per capita', ascending=False).head(20)

Countries with highest GDP per capita

Amazed to see that Qatar, which has the highest GDP per capita, is not among the happiest countries? Let's explore the relationship between happiness score and GDP per capita.

Subplots


Line chart and scatter plot

Can't see a clear pattern between the two, right?
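A pair of side-by-side charts like the ones above can be built with plt.subplots. In this sketch the line chart uses only the highest-GDP subset and the scatter plot uses every row; that split is an assumption, as are the made-up values in the stand-in frame.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Stand-in frame with made-up values; swap in the real happiness_df
happiness_df = pd.DataFrame({
    "Country or region": ["Country A", "Country B", "Country C",
                          "Country D", "Country E"],
    "Score": [7.5, 6.1, 3.2, 6.4, 4.4],
    "GDP per capita": [1.30, 1.10, 0.35, 1.60, 0.80],
})

# Countries with the highest GDP per capita (the article keeps 20)
highest_gdp = (
    happiness_df[["Country or region", "GDP per capita", "Score"]]
    .sort_values("GDP per capita", ascending=False)
    .head(3)
)

# One row, two columns of axes: line chart on the left, scatter on the right
fig, (ax_line, ax_scatter) = plt.subplots(1, 2, figsize=(12, 4))

ax_line.plot(highest_gdp["GDP per capita"], highest_gdp["Score"], marker="o")
ax_line.set_xlabel("GDP per capita")
ax_line.set_ylabel("Score")
ax_line.set_title("Line chart")

ax_scatter.scatter(happiness_df["GDP per capita"], happiness_df["Score"])
ax_scatter.set_xlabel("GDP per capita")
ax_scatter.set_ylabel("Score")
ax_scatter.set_title("Scatter plot")

fig.tight_layout()
```

The same pattern can be reused for the other features by swapping the column name.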

Score vs Social support

Repeat the steps above and explore the relationship between happiness score and social support in a country.

Countries with highest social support

Finland, the happiest country, has the second highest social support as well. Other countries that are among the happiest are also present here, such as Denmark, Norway, Iceland, the Netherlands and New Zealand.

Line chart and scatter plot

Can you see some clusters in scatter plot?

Score vs Healthy life expectancy

Singapore has the highest healthy life expectancy, but it's not among the happiest countries. However, other countries such as Switzerland, Canada and Australia are among the happiest countries.

Countries with highest healthy life expectancy

Line chart and scatter plot

In the scatter plot of healthy life expectancy we can see clusters as well, with some outliers.

Score vs Freedom to make life choices

Uzbekistan, which has the highest freedom to make life choices, is not among the top 20 happiest countries, but Finland, Denmark, Norway and others are present here.

Countries with highest freedom to make life choices


Line chart and scatter plot

Can you see some clusters along with outliers?

Score vs Generosity

See that, along with the happiest countries, Syria, which is one of the least happy countries, is among the countries with the highest generosity.

Countries with highest generosity

Line chart and scatter plot

See the clusters near happiness score 4, 5 and 7?

Score vs Perceptions of corruption

Remember Singapore from the bar chart of highest healthy life expectancy? It has the highest perceptions of corruption value as well. Isn't that strange?

Countries with highest perceptions of corruption


Line chart and scatter plot

This again makes no concrete sense. Let's look at the broader picture in the next step.

4. Impactful features?

We have analysed all the features and their relationship with the happiness score and rank of a country, but we still can't answer 'what makes people in a country happy?'. Let's plot a heatmap of the correlation between all the features in our dataset.

Correlation is a statistical measure of how strongly two variables or features move together. Let's see which variables have the highest correlation with the happiness score.

happiness_df.corr() will give the correlation (on recent pandas versions, pass numeric_only=True so that the text column Country or region is skipped). Assign the correlation to a variable; we have assigned it to happiness_relationship to use later.

sns.heatmap(happiness_relationship, annot = True) will give us the desired result. Here, happiness_relationship is our correlation matrix and annot = True prints the data value in each cell of the heatmap.

heatmap of correlation
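To read the strongest relationships off the matrix without squinting at the heatmap, the Score column can be sorted directly. This sketch selects the numeric columns first, because recent pandas versions raise an error when corr() meets a text column. The frame holds made-up values and only two features, and the 0.75 cut-off is simply the threshold used in this article.

```python
import pandas as pd

# Stand-in frame with made-up values; swap in the real happiness_df
happiness_df = pd.DataFrame({
    "Country or region": ["Country A", "Country B", "Country C",
                          "Country D", "Country E"],
    "Score": [7.5, 6.1, 3.2, 6.4, 4.4],
    "GDP per capita": [1.30, 1.10, 0.35, 1.60, 0.80],
    "Generosity": [0.15, 0.30, 0.20, 0.10, 0.25],
})

# Pearson correlation between every pair of numeric columns
happiness_relationship = happiness_df.select_dtypes(include="number").corr()

# Correlation of each feature with Score, strongest first
score_corr = (
    happiness_relationship["Score"]
    .drop("Score")  # a column always correlates perfectly with itself
    .sort_values(ascending=False)
)

# Features at or above the threshold used in the article
strong_features = score_corr[score_corr >= 0.75].index.tolist()

print(score_corr)
print(strong_features)

# With seaborn installed, the matrix can be drawn as in the article:
# sns.heatmap(happiness_relationship, annot=True)
```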

See that GDP per capita, social support and healthy life expectancy each have a correlation of 0.75 or higher with score, which indicates a strong positive relationship.

Let's have a look at the charts of these variables.

Score vs GDP per capita


Score vs Social support

Score vs Healthy life expectancy

Look how each of these variables increases with score. Hence, these three variables have the strongest relationship with the happiness score of a country.

5. Conclusions

  • Finland, the happiest country in the world, doesn't have the highest GDP per capita. However, GDP per capita has a strong positive relationship with the happiness score
  • Qatar has the highest GDP per capita, with the 28th rank in the World Happiness Report
  • Countries with the highest happiness scores also have the highest social support and healthy life expectancy. However, some countries that are not among the happiest also have high social support and healthy life expectancy
  • GDP per capita, social support and healthy life expectancy have a strong positive relationship with the happiness score, meaning these factors together make people in a country happy

Using simple Python syntax, we have identified the countries with the happiest people and the reasons behind their happiness. In the upcoming articles, we will explore more complicated datasets and answer even more complicated questions. Till then, happy exploring.

Thank you for reading. Feel free to give any feedback!



