Exploring the Relationship Between GDP and FDI: A Data-Driven Analysis
Introduction
In a world increasingly defined by the movement of capital and economic interdependence, foreign direct investment (FDI) and gross domestic product (GDP) are two of the most telling indicators of a nation’s economic health. GDP reflects the total value of goods and services produced within a country, while FDI measures the flow of investments from foreign entities into domestic enterprises.
But are these two metrics connected? Do wealthier countries naturally attract more foreign investment? This article dives into these questions using a data-driven approach — combining global economic data, performing statistical analysis, and visualizing insights through Python.
Prerequisites
Before diving in, ensure you have a few technical foundations covered:
Python 3.7+ installed on your system
Jupyter Notebook or Google Colab for running code interactively
Basic familiarity with pandas, matplotlib, and scipy.stats
Understanding of concepts like data cleaning, merging datasets, and correlation analysis
Problem Statement
Foreign Direct Investment (FDI) has long been viewed as a catalyst for economic growth. However, the strength and nature of its relationship with GDP vary widely across countries and time periods.
Our challenge is to explore whether, for the year 2013, countries with higher GDP also tend to attract more FDI, and whether this relationship is statistically significant.
Aim
The primary objective of this analysis is to investigate whether a correlation exists between a country's GDP and the FDI it attracts. Specifically, we aim to:
Clean and prepare World Bank datasets for analysis
Merge GDP and FDI data from 2013 into a unified dataset
Calculate the statistical correlation between these indicators
Visualize the relationship through an effective scatterplot
Identify outliers and interpret the practical implications of our findings
Data Methodology
Step 1: Getting the Data
The analysis uses two World Bank datasets:
GDP (Gross Domestic Product) – indicator code: NY.GDP.MKTP.CD
FDI (Foreign Direct Investment, net inflows) – indicator code: BX.KLT.DINV.CD.WD
These datasets were imported and filtered for the year 2013:
import pandas as pd
import warnings
warnings.simplefilter('ignore', FutureWarning)
YEAR = 2013
fdi = pd.read_csv('API_BX.KLT.DINV.CD.WD_DS2_en_csv_v2.csv', skiprows=4)
gdp = pd.read_csv('API_NY.GDP.MKTP.CD_DS2_en_csv_v2.csv', skiprows=4)
# Reshape to Year–Value format
fdi_melted = fdi.melt(id_vars=['Country Name', 'Country Code'], var_name='Year', value_name='FDI_Value')
gdp_melted = gdp.melt(id_vars=['Country Name', 'Country Code'], var_name='Year', value_name='GDP_Value')
# Filter for 2013 and drop missing values
fdi_2013 = fdi_melted[(fdi_melted['Year'] == str(YEAR)) & (fdi_melted['FDI_Value'].notna())]
gdp_2013 = gdp_melted[(gdp_melted['Year'] == str(YEAR)) & (gdp_melted['GDP_Value'].notna())]
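To make the reshaping step easier to follow, here is a minimal sketch of melt() on a toy table in the same wide layout. The two countries and their values are hypothetical placeholders, not World Bank figures.

```python
import pandas as pd

# Toy wide-format table: one column per year (hypothetical countries and values)
wide = pd.DataFrame({
    'Country Name': ['Aland', 'Borduria'],
    'Country Code': ['ALD', 'BRD'],
    '2012': [1.0, 3.0],
    '2013': [2.0, None],
})

# One row per (country, year) pair; the year column headers become values
long_df = wide.melt(id_vars=['Country Name', 'Country Code'],
                    var_name='Year', value_name='Value')

# Year labels are strings after melt, so compare against '2013', not 2013
rows_2013 = long_df[(long_df['Year'] == '2013') & long_df['Value'].notna()]
print(rows_2013)
```

Because the year labels come from column headers, they stay as strings, which is why the filter above compares against str(YEAR).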
Step 2: Transforming the Data
Since both datasets were reported in US dollars, we converted them into millions of British pounds (£m) using the 2013 average exchange rate (1 GBP = 1.564768 USD):
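def roundToMillions(value):
    # Express a raw currency amount in whole millions
    return round(value / 1_000_000)
def usdToGBP(usd):
    # 2013 average rate: 1 GBP = 1.564768 USD
    return usd / 1.564768
# Work on copies so the new columns are not assigned to slices of the melted frames
fdi_2013 = fdi_2013.copy()
gdp_2013 = gdp_2013.copy()
fdi_2013['FDI (£m)'] = fdi_2013['FDI_Value'].apply(usdToGBP).apply(roundToMillions)
gdp_2013['GDP (£m)'] = gdp_2013['GDP_Value'].apply(usdToGBP).apply(roundToMillions)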
This made the figures more readable and comparable across countries.
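The same conversion can also be written as a single vectorized expression instead of two apply() calls. This is only a sketch of an alternative, assuming a numeric column. The dollar amounts below are illustrative, and USD_PER_GBP is a name introduced here for the rate quoted above.

```python
import pandas as pd

USD_PER_GBP = 1.564768  # 2013 average rate used in this article

# Illustrative dollar amounts, not real World Bank figures
usd = pd.Series([1_564_768_000.0, 3_129_536_000.0, -782_384_000.0])

# Divide by the rate to get pounds, then by one million to get £m
gbp_millions = (usd / USD_PER_GBP / 1_000_000).round()
print(gbp_millions.tolist())
```

The vectorized form returns floats rather than integers, so a final .astype(int) may be wanted once missing values have been dropped.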
Step 3: Combining the Data
To analyze the relationship, the GDP and FDI datasets were merged using an inner join on the common country column:
fdi_clean = fdi_2013[['Country Name', 'FDI (£m)']].rename(columns={'Country Name': 'country'})
gdp_clean = gdp_2013[['Country Name', 'GDP (£m)']].rename(columns={'Country Name': 'country'})
merged_data = pd.merge(fdi_clean, gdp_clean, on='country', how='inner')
print(merged_data.head())
Sample output:
| country | FDI (£m) | GDP (£m) |
| Afghanistan | 31 | 12,875 |
| Angola | -4,550 | 84,574 |
| Albania | 801 | 8,178 |
| UAE | 6,240 | 255,769 |
| Argentina | 6,277 | 352,784 |
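An inner join silently drops any country that appears in only one dataset. One way to see what was dropped is an outer join with pandas' indicator option, sketched below on small illustrative frames. 'Atlantis' and 'Ruritania' are hypothetical entries added to show the mismatch.

```python
import pandas as pd

# Small illustrative frames; 'Atlantis' and 'Ruritania' are hypothetical
fdi_demo = pd.DataFrame({'country': ['Albania', 'Angola', 'Atlantis'],
                         'FDI (£m)': [801, -4550, 10]})
gdp_demo = pd.DataFrame({'country': ['Albania', 'Angola', 'Ruritania'],
                         'GDP (£m)': [8178, 84574, 500]})

# An outer join with indicator=True labels each row by where it was found
check = pd.merge(fdi_demo, gdp_demo, on='country', how='outer', indicator=True)

# Rows not marked 'both' are the ones an inner join would drop
dropped = check[check['_merge'] != 'both']
print(dropped[['country', '_merge']])
```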
The Analysis Process
Step 4: Measuring Correlation
To quantify the strength of the relationship, the Spearman rank correlation coefficient was calculated:
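from scipy.stats import spearmanr
corr, p_value = spearmanr(merged_data['GDP (£m)'], merged_data['FDI (£m)'])
print(f"The correlation is {corr:.4f}")
print(f"The p-value is {p_value:.2e}")
# Assumed 5% significance threshold
if p_value < 0.05:
    print("The correlation is statistically significant.")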
Output:
The correlation is 0.6385
The p-value is 7.48e-21
The correlation is statistically significant.
A coefficient of 0.64 indicates a moderate to strong positive correlation — meaning, generally, countries with higher GDP attract higher FDI inflows. The very low p-value confirms that this relationship is statistically significant.
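For intuition about what the Spearman coefficient measures, it can be computed by hand as the Pearson correlation of the ranks. The sketch below uses made-up numbers and a hypothetical helper, rank_correlation. It also shows that rescaling both columns, as the currency conversion does, leaves the coefficient unchanged because the ranks stay the same.

```python
import pandas as pd

# Illustrative values only, not taken from the World Bank data
gdp = pd.Series([100, 2000, 50, 30000, 700])
fdi = pd.Series([5, -40, 2, 900, 60])

def rank_correlation(x, y):
    """Spearman's rho computed as the Pearson correlation of the ranks."""
    return x.rank().corr(y.rank())

rho = rank_correlation(gdp, fdi)

# Dividing by a positive constant keeps the ordering, so rho is unchanged
rho_converted = rank_correlation(gdp / 1.564768, fdi / 1.564768)
print(rho, rho_converted)
```

This rank-based behaviour is also why a few extremely large economies do not dominate the result the way they would with a Pearson correlation on the raw values.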
Visualizing the Relationship
Step 5: Visualizing the Data
A scatter plot helps visualize the relationship more intuitively:
import matplotlib.pyplot as plt
merged_data.plot(
    x='GDP (£m)',
    y='FDI (£m)',
    kind='scatter',
    grid=True,
    logx=True,
    figsize=(10, 4),
    title='FDI vs GDP (2013)'
)
plt.xlabel('GDP (£m) [log scale]')
plt.ylabel('FDI (£m)')
plt.show()

The logarithmic scale on the x-axis is crucial here because GDP values span several orders of magnitude—from millions to trillions of pounds. Without it, smaller economies would be crammed into an unreadable cluster. Most smaller economies cluster toward the lower-left, while major economies like the US, China, and the UK sit at the upper-right, confirming the correlation visually.
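To see why the log scale matters, the rough sketch below compares where a value lands along the axis, as a fraction of its width, under a linear and a logarithmic scale. The GDP values are illustrative round numbers rather than real figures.

```python
import math

# Illustrative GDP values in £m spanning several orders of magnitude
gdp_values = [100, 10_000, 1_000_000, 10_000_000]
low, high = min(gdp_values), max(gdp_values)
log_span = math.log10(high) - math.log10(low)

# Position of each value along the axis, from 0 (left edge) to 1 (right edge)
linear_positions = [(v - low) / (high - low) for v in gdp_values]
log_positions = [(math.log10(v) - math.log10(low)) / log_span for v in gdp_values]

for v, lin, lg in zip(gdp_values, linear_positions, log_positions):
    print(f"{v:>12,}  linear={lin:.4f}  log={lg:.2f}")
```

On the linear axis the two smallest values sit within the first 0.1% of the width, while the log axis spreads them evenly.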
Project Insights
After completing this analysis, several important insights emerged:
Strong but not perfect correlation: The 0.639 correlation confirms that larger economies generally attract more FDI, but it's far from a perfect relationship.
Outliers reveal the story: Countries like Belgium, Switzerland, and Finland show negative FDI despite being wealthy nations. Because the indicator records net inflows, a negative value means foreign investors withdrew more capital than they invested during the year.
Small economies vary wildly: Nations with similar GDP levels show dramatically different FDI levels, indicating that factors beyond economic size matter: political stability, natural resources, business-friendly policies, and geographic location all play roles.
The logarithmic pattern: When GDP is plotted logarithmically, we see that most small economies cluster in the lower-left, while economic giants like the US, China, and Germany occupy the upper-right.
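Outliers like these can be picked out with a simple filter. The sketch below reuses the five sample rows shown earlier, so it only illustrates the approach. The 'FDI % of GDP' column is an extra measure introduced here to compare economies of different sizes.

```python
import pandas as pd

# The five sample rows shown earlier in this article
sample = pd.DataFrame({
    'country': ['Afghanistan', 'Angola', 'Albania', 'UAE', 'Argentina'],
    'FDI (£m)': [31, -4550, 801, 6240, 6277],
    'GDP (£m)': [12875, 84574, 8178, 255769, 352784],
})

# Countries with negative net FDI inflows
negative_fdi = sample[sample['FDI (£m)'] < 0]

# FDI as a percentage of GDP lets economies of different sizes be compared
sample['FDI % of GDP'] = (100 * sample['FDI (£m)'] / sample['GDP (£m)']).round(2)
ranked = sample.sort_values('FDI % of GDP', ascending=False)

print(negative_fdi['country'].tolist())
print(ranked[['country', 'FDI % of GDP']])
```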
Challenges Faced
Like many real-world analyses, this project wasn’t without challenges:
1. Data Format Complexity
The World Bank's wide-format CSV files required careful reshaping. Getting the melt() function parameters right took several iterations, especially ensuring the year filtering worked correctly.
2. Handling Negative Values
Negative FDI values initially seemed like data errors but turned out to be economically meaningful. It took some research to understand that a negative net inflow, where foreign investors withdraw more than they invest, is a legitimate (if unusual) FDI measurement.
3. Country Name Inconsistencies
Removing aggregate entries (for example, World and regional groupings such as the Caribbean) was quite a hassle. I had to compile a list of all the countries in the world and filter the data against it while carrying out the analysis.
4. Scale Visualization
My first scatterplot was nearly unreadable because the linear scale compressed most data points. Switching to a logarithmic scale was essential for revealing the actual distribution.
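The aggregate filtering described under challenge 3 is not shown in the code above. A minimal sketch of one way to do it is an allow-list of country codes, as below. COUNTRY_CODES is a hypothetical, shortened list, and the rows are illustrative.

```python
import pandas as pd

# Hypothetical allow-list; a real one would hold every country code
COUNTRY_CODES = {'AFG', 'AGO', 'ALB', 'ARE', 'ARG'}

# Illustrative rows mixing countries with aggregate entries (values are made up)
raw = pd.DataFrame({
    'Country Name': ['Afghanistan', 'World', 'Albania', 'Caribbean small states'],
    'Country Code': ['AFG', 'WLD', 'ALB', 'CSS'],
    'Value': [1.0, 2.0, 3.0, 4.0],
})

# Keep only rows whose code is on the allow-list
countries_only = raw[raw['Country Code'].isin(COUNTRY_CODES)]
print(countries_only['Country Name'].tolist())
```

Filtering on the three-letter code rather than the name avoids problems with spelling variants of country names.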
Conclusion
This analysis confirms a statistically significant, positive relationship between Foreign Direct Investment and Gross Domestic Product in 2013. Countries with larger economies tend to attract greater FDI inflows, which is consistent with a link between economic size and investor confidence, although a single-year correlation cannot show which one drives the other.
However, the relationship isn’t absolute — smaller nations can outperform expectations through strategic policies, natural advantages, or openness to trade.
If you're interested in extending this analysis, consider:
Analyzing trends across multiple years rather than a single snapshot
Incorporating additional variables like political stability indices or ease-of-doing-business rankings
Investigating sector-specific FDI patterns
Data analysis isn't just about running statistical tests—it's about asking interesting questions, handling messy real-world data, and interpreting results in context. I hope this walkthrough inspires you to explore your own economic datasets and uncover the stories hidden within the numbers!
