Exploring the Relationship Between GDP and FDI: A Data-Driven Analysis


Introduction

In a world increasingly defined by the movement of capital and economic interdependence, foreign direct investment (FDI) and gross domestic product (GDP) are two of the most telling indicators of a nation’s economic health. GDP reflects the total value of goods and services produced within a country, while FDI measures the flow of investments from foreign entities into domestic enterprises.

But are these two metrics connected? Do wealthier countries naturally attract more foreign investment? This article dives into these questions using a data-driven approach — combining global economic data, performing statistical analysis, and visualizing insights through Python.

Prerequisites

Before diving in, ensure you have a few technical foundations covered:

  • Python 3.7+ installed on your system

  • Jupyter Notebook or Google Colab for running code interactively

  • Basic familiarity with pandas, matplotlib, and scipy.stats

  • Understanding of concepts like data cleaning, merging datasets, and correlation analysis

Problem Statement

Foreign Direct Investment (FDI) has long been viewed as a catalyst for economic growth. However, the strength and nature of its relationship with GDP vary widely across countries and time periods.

Our challenge is to explore whether, for the year 2013, countries with higher GDP also tend to attract more FDI, and whether that relationship is statistically significant.

Aim

The primary objective of this analysis is to investigate whether a correlation exists between a country's GDP and the FDI it attracts. Specifically, we aim to:

  • Clean and prepare World Bank datasets for analysis

  • Merge GDP and FDI data from 2013 into a unified dataset

  • Calculate the statistical correlation between these indicators

  • Visualize the relationship through an effective scatterplot

  • Identify outliers and interpret the practical implications of our findings

Data Methodology

Step 1: Getting the Data

The analysis uses two World Bank datasets:

  • GDP (Gross Domestic Product) – indicator code: NY.GDP.MKTP.CD

  • FDI (Foreign Direct Investment, net inflows) – indicator code: BX.KLT.DINV.CD.WD

These datasets were imported and filtered for the year 2013:

import pandas as pd
import warnings
warnings.simplefilter('ignore', FutureWarning)

YEAR = 2013
fdi = pd.read_csv('API_BX.KLT.DINV.CD.WD_DS2_en_csv_v2.csv', skiprows=4)
gdp = pd.read_csv('API_NY.GDP.MKTP.CD_DS2_en_csv_v2.csv', skiprows=4)

# Reshape to Year–Value format
fdi_melted = fdi.melt(id_vars=['Country Name', 'Country Code'], var_name='Year', value_name='FDI_Value')
gdp_melted = gdp.melt(id_vars=['Country Name', 'Country Code'], var_name='Year', value_name='GDP_Value')

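# Filter for 2013 and drop missing values.
# .copy() returns independent frames, so adding columns later
# does not trigger pandas' SettingWithCopyWarning.
fdi_2013 = fdi_melted[(fdi_melted['Year'] == str(YEAR)) & (fdi_melted['FDI_Value'].notna())].copy()
gdp_2013 = gdp_melted[(gdp_melted['Year'] == str(YEAR)) & (gdp_melted['GDP_Value'].notna())].copy()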

Step 2: Transforming the Data

Since both datasets are reported in US dollars, we converted them into millions of British pounds (£m) using the 2013 average exchange rate (1 GBP = 1.564768 USD):

def roundToMillions(value):
    return round(value / 1_000_000)

def usdToGBP(usd):
    return usd / 1.564768

fdi_2013['FDI (£m)'] = fdi_2013['FDI_Value'].apply(usdToGBP).apply(roundToMillions)
gdp_2013['GDP (£m)'] = gdp_2013['GDP_Value'].apply(usdToGBP).apply(roundToMillions)

This made the figures more readable and comparable across countries.
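As a quick check on the arithmetic, here is a minimal sketch of the same conversion done on whole columns instead of with apply(). The helper name usd_to_gbp_millions and the dollar figures are made up for illustration; only the exchange rate comes from the step above.

```python
import pandas as pd

USD_PER_GBP = 1.564768  # 2013 average rate quoted in this article


def usd_to_gbp_millions(usd: pd.Series) -> pd.Series:
    """Convert US dollar amounts to whole millions of pounds."""
    return (usd / USD_PER_GBP / 1_000_000).round().astype("int64")


# Made-up dollar figures, chosen so the results are easy to verify by hand
sample = pd.DataFrame({
    "country": ["A", "B", "C"],
    "FDI_Value": [1_564_768_000.0, -782_384_000.0, 3_129_536.0],
})
sample["FDI (£m)"] = usd_to_gbp_millions(sample["FDI_Value"])
print(sample)
```

Dividing by the rate (rather than multiplying) is what turns dollars into pounds here, because the rate is expressed as dollars per pound.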

Step 3: Combining the Data

To analyze the relationship, the GDP and FDI datasets were merged using an inner join on the common country column:

fdi_clean = fdi_2013[['Country Name', 'FDI (£m)']].rename(columns={'Country Name': 'country'})
gdp_clean = gdp_2013[['Country Name', 'GDP (£m)']].rename(columns={'Country Name': 'country'})

merged_data = pd.merge(fdi_clean, gdp_clean, on='country', how='inner')
print(merged_data.head())

Sample output:

country        FDI (£m)   GDP (£m)
Afghanistan          31     12,875
Angola           -4,550     84,574
Albania             801      8,178
UAE               6,240    255,769
Argentina         6,277    352,784
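An inner join silently drops any country that appears in only one dataset. The sketch below shows one way to see what would be dropped, assuming you join on the 'Country Code' column rather than the name. The three-letter codes and figures are invented for illustration and are not taken from the World Bank files.

```python
import pandas as pd

# Invented rows, for illustration only
fdi_demo = pd.DataFrame({
    "Country Code": ["AAA", "BBB", "CCC"],
    "Country Name": ["Country A", "Country B", "Country C"],
    "FDI (£m)": [120, -45, 900],
})
gdp_demo = pd.DataFrame({
    "Country Code": ["AAA", "BBB", "DDD"],
    "GDP (£m)": [15_000, 230_000, 4_200],
})

# An outer join with indicator=True labels each row as
# 'both', 'left_only' or 'right_only'
check = pd.merge(fdi_demo, gdp_demo, on="Country Code", how="outer",
                 validate="one_to_one", indicator=True)
unmatched = check.loc[check["_merge"] != "both", "Country Code"].tolist()
print("Codes present in only one dataset:", unmatched)

# The inner join then keeps only the rows found in both frames
merged_demo = pd.merge(fdi_demo, gdp_demo, on="Country Code",
                       how="inner", validate="one_to_one")
print(merged_demo)
```

The validate="one_to_one" argument makes pandas raise an error if a code is duplicated on either side, which is a cheap guard against accidentally multiplying rows.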

The Analysis Process

Step 4: Measuring Correlation

To quantify the strength of the relationship, the Spearman rank correlation coefficient was calculated:

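from scipy.stats import spearmanr

corr, p_value = spearmanr(merged_data['GDP (£m)'], merged_data['FDI (£m)'])
print(f"The correlation is {corr:.4f}")
print(f"The p-value is {p_value:.2e}")
# Report significance at the conventional 5% level
if p_value < 0.05:
    print("The correlation is statistically significant.")
else:
    print("The correlation is not statistically significant.")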

Output:

The correlation is 0.6385
The p-value is 7.48e-21
The correlation is statistically significant.

Because Spearman's coefficient compares rankings rather than raw values, a coefficient of 0.64 indicates a moderate to strong positive relationship: countries that rank higher on GDP generally rank higher on FDI inflows too. The very low p-value means a correlation this strong would be very unlikely to appear by chance if the two were unrelated.
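To make the choice of Spearman more concrete, here is a small sketch on synthetic, heavily skewed data (not the World Bank figures). It shows that Spearman's coefficient is simply Pearson's coefficient computed on ranks, and lets you compare it with Pearson on the raw values. The variable names and the way the data is generated are my own assumptions.

```python
import numpy as np
import pandas as pd
from scipy.stats import pearsonr, spearmanr

# Synthetic, skewed data standing in for GDP and FDI
rng = np.random.default_rng(0)
gdp_synth = pd.Series(rng.lognormal(mean=10, sigma=2, size=200))
fdi_synth = gdp_synth * 0.03 * rng.lognormal(mean=0, sigma=1, size=200)

rho, p_rho = spearmanr(gdp_synth, fdi_synth)
r_on_ranks, _ = pearsonr(gdp_synth.rank(), fdi_synth.rank())
r_raw, _ = pearsonr(gdp_synth, fdi_synth)

print(f"Spearman rho:          {rho:.4f}")
print(f"Pearson on ranks:      {r_on_ranks:.4f}")  # matches Spearman
print(f"Pearson on raw values: {r_raw:.4f}")       # driven by the largest points
```

Working on ranks keeps a handful of giant economies from dominating the result, which is one reason a rank correlation suits data that spans several orders of magnitude.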

Visualizing the Relationship

Step 5: Visualizing the Data

A scatter plot helps visualize the relationship more intuitively:

import matplotlib.pyplot as plt

merged_data.plot(
    x='GDP (£m)', 
    y='FDI (£m)', 
    kind='scatter', 
    grid=True, 
    logx=True, 
    figsize=(10, 4),
    title='FDI vs GDP (2013)'
)
plt.xlabel('GDP (£m) [log scale]')
plt.ylabel('FDI (£m)')
plt.show()

The logarithmic scale on the x-axis is crucial here because GDP values span several orders of magnitude—from millions to trillions of pounds. Without it, smaller economies would be crammed into an unreadable cluster. Most smaller economies cluster toward the lower-left, while major economies like the US, China, and the UK sit at the upper-right, confirming the correlation visually.
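If you want to see the effect of the log scale for yourself, the following sketch draws the same points twice, once with a linear x-axis and once with a logarithmic one. The five data points are invented for illustration and are not taken from the merged dataset.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Invented points spanning several orders of magnitude
demo = pd.DataFrame({
    "GDP (£m)": [500, 8_000, 90_000, 1_700_000, 10_000_000],
    "FDI (£m)": [20, 800, -4_500, 30_000, 180_000],
})

fig, (ax_lin, ax_log) = plt.subplots(1, 2, figsize=(10, 4), sharey=True)
demo.plot(x="GDP (£m)", y="FDI (£m)", kind="scatter", grid=True,
          ax=ax_lin, title="Linear x-axis")
demo.plot(x="GDP (£m)", y="FDI (£m)", kind="scatter", grid=True,
          logx=True, ax=ax_log, title="Log x-axis")
fig.tight_layout()
plt.show()
```

Only the x-axis is made logarithmic. FDI can be negative, and a plain log scale cannot display negative values, so the y-axis stays linear.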

Project Insights

After completing this analysis, several important insights emerged:

  1. Moderate to strong, but far from perfect, correlation: The Spearman coefficient of 0.64 confirms that larger economies generally attract more FDI, but economic size alone leaves much of the variation unexplained

  2. Outliers reveal the story: Countries like Belgium, Switzerland, and Finland show negative FDI despite being wealthy nations. Because the indicator records net inflows, a negative figure means foreign investors withdrew more capital from these mature economies than they put in during 2013

  3. Small economies vary wildly: Nations with similar GDP levels show dramatically different FDI levels, indicating that factors beyond economic size matter—political stability, natural resources, business-friendly policies, and geographic location all play roles

  4. The logarithmic pattern: When GDP is plotted logarithmically, we see that most small economies cluster in the lower-left, while economic giants like the US, China, and Germany occupy the upper-right
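The outlier checks behind these insights can be sketched in a few lines. The example below flags negative FDI and ranks countries by FDI as a share of GDP. The column 'FDI share of GDP (%)' is something I am introducing for illustration, and the rows are invented rather than taken from the 2013 data.

```python
import pandas as pd

# Invented rows, for illustration only
demo = pd.DataFrame({
    "country": ["A", "B", "C", "D", "E"],
    "GDP (£m)": [300_000, 12_000, 450_000, 5_000, 1_000_000],
    "FDI (£m)": [-15_000, 600, 9_000, 1_500, 40_000],
})

# Countries where foreign investors withdrew more than they invested
negative_fdi = demo[demo["FDI (£m)"] < 0]
print(negative_fdi)

# FDI relative to the size of the economy highlights small
# countries that attract unusually large inflows
demo["FDI share of GDP (%)"] = 100 * demo["FDI (£m)"] / demo["GDP (£m)"]
ranked = demo.sort_values("FDI share of GDP (%)", ascending=False)
print(ranked)
```

Looking at FDI relative to GDP is one way to compare a small economy with a large one on equal terms, since absolute inflows will almost always favour the larger country.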

Challenges Faced

Like many real-world analyses, this project wasn’t without challenges:

1. Data Format Complexity

The World Bank's wide-format CSV files required careful reshaping. Getting the melt() function parameters right took several iterations, especially ensuring the year filtering worked correctly.
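Here is a minimal sketch of the reshaping on a tiny hand-made frame. It assumes the wide files also carry 'Indicator Name' and 'Indicator Code' columns and passes them as id_vars, so that only the year columns are melted. The two rows and their values are invented.

```python
import pandas as pd

# Tiny stand-in for a wide-format file: one column per year
wide = pd.DataFrame({
    "Country Name": ["Country A", "Country B"],
    "Country Code": ["AAA", "BBB"],
    "Indicator Name": ["GDP (current US$)"] * 2,
    "Indicator Code": ["NY.GDP.MKTP.CD"] * 2,
    "2012": [1.0e9, 5.0e10],
    "2013": [1.2e9, None],  # missing value for Country B
})

# Keep every non-year column as an identifier so only years are melted
id_cols = ["Country Name", "Country Code", "Indicator Name", "Indicator Code"]
long = wide.melt(id_vars=id_cols, var_name="Year", value_name="GDP_Value")

# Convert the year labels to numbers so the filter compares like with like
long["Year"] = pd.to_numeric(long["Year"], errors="coerce")
gdp_one_year = long[(long["Year"] == 2013) & long["GDP_Value"].notna()].copy()
print(gdp_one_year)
```

Converting 'Year' to a number is optional. Comparing against the string '2013' works just as well, as long as the comparison type matches the column type.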

2. Handling Negative Values

Negative FDI values initially seemed like data errors but turned out to be economically meaningful. Because the indicator measures net inflows, a negative value means foreign investors withdrew more capital than they invested that year, which is a legitimate (if unusual) measurement.

3. Country Name Inconsistencies

Removing aggregate entries (for example World, Caribbean groupings, etc.) was quite a hassle: I had to compile a list of all the countries in the world and filter the data down to it before carrying out the analysis.
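The filtering step can be sketched as an allow-list check. In the example below, COUNTRY_CODES is a deliberately short, hypothetical allow-list (a real one would contain every country code), and the figures for the two aggregate rows are placeholders rather than real values.

```python
import pandas as pd

data = pd.DataFrame({
    "Country Name": ["World", "Albania", "Euro area", "Angola"],
    "Country Code": ["WLD", "ALB", "EMU", "AGO"],
    # Placeholder values for the two aggregate rows; not real data
    "GDP (£m)": [1_000_000, 8_178, 500_000, 84_574],
})

# Hypothetical allow-list; extend it to cover every country you need
COUNTRY_CODES = {"AFG", "AGO", "ALB", "ARE", "ARG"}

countries_only = data[data["Country Code"].isin(COUNTRY_CODES)].copy()
print(countries_only)
```

Filtering on the three-letter code rather than the name avoids problems with spelling variants of country names.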

4. Scale Visualization

My first scatterplot was nearly unreadable because the linear scale compressed most data points. Switching to a logarithmic scale was essential for revealing the actual distribution.

Conclusion

This analysis confirms a statistically significant, positive relationship between Foreign Direct Investment and Gross Domestic Product in 2013. Countries with larger economies tend to attract greater FDI inflows, which is consistent with a link between economic size and investor confidence, although a correlation alone cannot show which one drives the other.

However, the relationship isn’t absolute — smaller nations can outperform expectations through strategic policies, natural advantages, or openness to trade.

If you're interested in extending this analysis, consider:

  • Analyzing trends across multiple years rather than a single snapshot

  • Incorporating additional variables like political stability indices or ease-of-doing-business rankings

  • Investigating sector-specific FDI patterns
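For the multi-year idea, one possible starting point is to compute the Spearman correlation separately for each year. The sketch below does this on synthetic data in long format. The column names 'Year', 'GDP' and 'FDI' and the generated values are assumptions for illustration only.

```python
import numpy as np
import pandas as pd
from scipy.stats import spearmanr

# Synthetic long-format panel: 50 made-up countries per year
rng = np.random.default_rng(42)
frames = []
for year in [2011, 2012, 2013]:
    gdp = rng.lognormal(mean=10, sigma=2, size=50)
    fdi = gdp * 0.03 * rng.lognormal(mean=0, sigma=1, size=50)
    frames.append(pd.DataFrame({"Year": year, "GDP": gdp, "FDI": fdi}))
panel = pd.concat(frames, ignore_index=True)

# One Spearman test per year
rows = []
for year, group in panel.groupby("Year"):
    rho, p = spearmanr(group["GDP"], group["FDI"])
    rows.append({"Year": year, "spearman_rho": rho, "p_value": p, "n": len(group)})
by_year = pd.DataFrame(rows)
print(by_year)
```

With real data, the melted World Bank frames already have this long shape before the 2013 filter is applied, so the same loop could run over them after merging on country and year.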

Data analysis isn't just about running statistical tests—it's about asking interesting questions, handling messy real-world data, and interpreting results in context. I hope this walkthrough inspires you to explore your own economic datasets and uncover the stories hidden within the numbers!

N

Thanks for the tutorial. You use Jupyter; what other option do you recommend besides Jupyter?

O

Hi, thank you too

I would recommend Google Colab; it also works fine.
