World Spotify Data Analysis
The goal of this mini-project is to investigate how Spotify prices vary globally. In one of the courses I am taking, "Globalization and Economic Policy" (PP418), we discussed the principle that, in the absence of transport costs, product differentiation, and other frictions, the price of a good should converge across markets. This concept is known as the law of one price. One might initially assume that digital products, which have low delivery frictions, would exhibit similar prices globally. However, the opposite appears to be true. This project aims to explore this puzzle further.
Findings
My investigation revealed substantial variation in Spotify prices across countries. The focus of the analysis was the price of an individual Spotify subscription. Key statistics include:
- Mean monthly price: $5.67 USD
- Standard deviation: $3.14 USD
- Price range: $0.84 to $15.40 USD
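Summary statistics like these take only a few lines of Python. The sketch below uses a short made-up list of prices rather than the real dataset, so its output will not match the figures above; `prices_usd` is a hypothetical name.

```python
import statistics

# Hypothetical monthly prices in USD; the real analysis used the scraped dataset.
prices_usd = [0.84, 2.99, 4.99, 5.99, 10.99, 15.40]

mean_price = statistics.mean(prices_usd)
std_price = statistics.stdev(prices_usd)  # sample standard deviation (n - 1)
price_range = (min(prices_usd), max(prices_usd))

print(f"Mean: ${mean_price:.2f}")
print(f"Standard deviation: ${std_price:.2f}")
print(f"Range: ${price_range[0]:.2f} to ${price_range[1]:.2f}")
```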
Interestingly, Spotify prices were strongly correlated with GDP per capita, which is consistent with a price discrimination narrative in which Spotify tailors prices to each country's willingness to pay. A regression of Spotify price on GDP per capita found a relationship that is statistically significant well beyond the 1% level.
The results showed:
- T-value: 12.88
- P-value: 2e-16
- Effect size: For every $1 increase in GDP per capita, the Spotify price increased by $0.00007406 USD.
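A coefficient this small is easier to read when rescaled. As a quick worked check using the effect size quoted above, the arithmetic can be sketched as follows (`coef_per_dollar` and `effect_per_10k` are illustrative names only).

```python
# Effect size quoted above: USD change in price per $1 of GDP per capita.
coef_per_dollar = 0.00007406

# Rescale to a $10,000 difference in GDP per capita.
effect_per_10k = coef_per_dollar * 10_000

print(f"Per $10,000 of GDP per capita: ${effect_per_10k:.2f}")  # $0.74
```

In other words, a country that is $10,000 richer per person would be expected to pay roughly 74 cents more per month under this simple model.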
When region-specific dummy variables were added to the model, similar results were observed. Full regression results are presented below.
──────────────────────────────────────────────────────────────────────
                                          (1)                 (2)
                                ──────────────────────────────────────
(Intercept)                       4.23759370 ***      2.84854979 ***
                                 (0.19895952)        (0.24292333)
GDP.per.capita..current.US..      0.00007315 ***      0.00005132 ***
                                 (0.00000559)        (0.00000502)
RegionAmericas                                        2.35366651 ***
                                                     (0.39132262)
RegionAsia                                            0.84890086 *
                                                     (0.37868532)
RegionEurope                                          3.80082692 ***
                                                     (0.40909629)
RegionOceania                                         3.54120627 ***
                                                     (0.53240655)
                                ──────────────────────────────────────
N                               174                 174
R2                                0.49861273          0.70566480
logLik                         -382.24840726       -335.90702626
AIC                             770.49681452        685.81405253
──────────────────────────────────────────────────────────────────────
*** p < 0.001; ** p < 0.01; * p < 0.05.
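To make the second column concrete, the sketch below plugs the model (2) coefficients from the table into a small prediction function. It assumes Africa is the omitted baseline region (it is the only region in the country list without a dummy in the table), and the GDP figure in the usage line is made up for illustration.

```python
# Coefficients copied from column (2) of the table above.
INTERCEPT = 2.84854979
GDP_COEF = 0.00005132
REGION_EFFECTS = {
    "Africa": 0.0,  # assumed baseline: no dummy appears for it in the table
    "Americas": 2.35366651,
    "Asia": 0.84890086,
    "Europe": 3.80082692,
    "Oceania": 3.54120627,
}


def predicted_price(gdp_per_capita: float, region: str) -> float:
    """Return the fitted monthly price in USD for a country."""
    return INTERCEPT + GDP_COEF * gdp_per_capita + REGION_EFFECTS[region]


# Hypothetical European country with a GDP per capita of $40,000.
print(round(predicted_price(40_000, "Europe"), 2))
```

Read this way, each region coefficient is the price gap relative to the baseline region at the same GDP per capita.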
Theories
While these findings do not disprove the law of one price, several factors may explain the observed price differences:
- Price discrimination: Spotify appears to adjust prices based on each country’s willingness to pay.
- Product differentiation: The Spotify song catalogue may vary significantly between countries and regions, meaning the product being sold is not entirely uniform.
- Market competition: The degree of competition in music streaming services may differ across regions, influencing Spotify's pricing strategy.
- Technical barriers: Spotify likely requires a local payment method, such as a local bank account, to purchase the service, which makes it hard for consumers to buy a subscription in a cheaper country.
Code
I’m neither a programmer nor a data scientist, so I’m not claiming this is the most efficient method, but here’s how I acquired the data. I ran a loop to scrape each Spotify pricing plan page and used the Beautiful Soup package to extract the HTML containing the price for individual plans. From there, I cleaned the extracted data to retain only numeric values.
Unfortunately, a significant amount of manual cleaning was required afterward. First, some of the scraping failed, particularly for languages that did not use Latin characters. Second, I had to manually check which countries priced their plans in USD, EUR, or AUD rather than the local currency. Lastly, some countries, such as India, priced Spotify subscriptions for a minimum of two months, which required adjustments.
After addressing these issues, I had a dataset with Spotify prices in various currency units.
import requests
from bs4 import BeautifulSoup
import pandas as pd
import re
countries = [
# Africa
{"country": "Algeria", "iso_code": "DZ"},
{"country": "Angola", "iso_code": "AO"},
{"country": "Benin", "iso_code": "BJ"},
{"country": "Botswana", "iso_code": "BW"},
{"country": "Burkina Faso", "iso_code": "BF"},
{"country": "Burundi", "iso_code": "BI"},
{"country": "Cameroon", "iso_code": "CM"},
{"country": "Cape Verde", "iso_code": "CV"},
{"country": "Chad", "iso_code": "TD"},
{"country": "Comoros", "iso_code": "KM"},
{"country": "Côte d'Ivoire", "iso_code": "CI"},
{"country": "Democratic Republic of the Congo", "iso_code": "CD"},
{"country": "Djibouti", "iso_code": "DJ"},
{"country": "Egypt", "iso_code": "EG"},
{"country": "Ethiopia", "iso_code": "ET"},
{"country": "Equatorial Guinea", "iso_code": "GQ"},
{"country": "Eswatini", "iso_code": "SZ"},
{"country": "Gabon", "iso_code": "GA"},
{"country": "Gambia", "iso_code": "GM"},
{"country": "Ghana", "iso_code": "GH"},
{"country": "Guinea", "iso_code": "GN"},
{"country": "Guinea-Bissau", "iso_code": "GW"},
{"country": "Kenya", "iso_code": "KE"},
{"country": "Lesotho", "iso_code": "LS"},
{"country": "Liberia", "iso_code": "LR"},
{"country": "Libya", "iso_code": "LY"},
{"country": "Madagascar", "iso_code": "MG"},
{"country": "Malawi", "iso_code": "MW"},
{"country": "Mali", "iso_code": "ML"},
{"country": "Mauritania", "iso_code": "MR"},
{"country": "Mauritius", "iso_code": "MU"},
{"country": "Morocco", "iso_code": "MA"},
{"country": "Mozambique", "iso_code": "MZ"},
{"country": "Namibia", "iso_code": "NA"},
{"country": "Niger", "iso_code": "NE"},
{"country": "Nigeria", "iso_code": "NG"},
{"country": "Republic of the Congo", "iso_code": "CG"},
{"country": "Rwanda", "iso_code": "RW"},
{"country": "São Tomé and Príncipe", "iso_code": "ST"},
{"country": "Senegal", "iso_code": "SN"},
{"country": "Seychelles", "iso_code": "SC"},
{"country": "Sierra Leone", "iso_code": "SL"},
{"country": "South Africa", "iso_code": "ZA"},
{"country": "Tanzania", "iso_code": "TZ"},
{"country": "Togo", "iso_code": "TG"},
{"country": "Tunisia", "iso_code": "TN"},
{"country": "Uganda", "iso_code": "UG"},
{"country": "Zambia", "iso_code": "ZM"},
{"country": "Zimbabwe", "iso_code": "ZW"},
# Asia
{"country": "Armenia", "iso_code": "AM"},
{"country": "Azerbaijan", "iso_code": "AZ"},
{"country": "Bahrain", "iso_code": "BH"},
{"country": "Bangladesh", "iso_code": "BD"},
{"country": "Bhutan", "iso_code": "BT"},
{"country": "Brunei Darussalam", "iso_code": "BN"},
{"country": "Cambodia", "iso_code": "KH"},
{"country": "Georgia", "iso_code": "GE"},
{"country": "Hong Kong", "iso_code": "HK"},
{"country": "India", "iso_code": "IN"},
{"country": "Indonesia", "iso_code": "ID"},
{"country": "Iraq", "iso_code": "IQ"},
{"country": "Israel", "iso_code": "IL"},
{"country": "Japan", "iso_code": "JP"},
{"country": "Jordan", "iso_code": "JO"},
{"country": "Kuwait", "iso_code": "KW"},
{"country": "Kyrgyzstan", "iso_code": "KG"},
{"country": "Lao People's Democratic Republic", "iso_code": "LA"},
{"country": "Lebanon", "iso_code": "LB"},
{"country": "Macao", "iso_code": "MO"},
{"country": "Malaysia", "iso_code": "MY"},
{"country": "Maldives", "iso_code": "MV"},
{"country": "Mongolia", "iso_code": "MN"},
{"country": "Nepal", "iso_code": "NP"},
{"country": "Oman", "iso_code": "OM"},
{"country": "Pakistan", "iso_code": "PK"},
{"country": "Palestine", "iso_code": "PS"},
{"country": "Philippines", "iso_code": "PH"},
{"country": "Qatar", "iso_code": "QA"},
{"country": "Saudi Arabia", "iso_code": "SA"},
{"country": "Singapore", "iso_code": "SG"},
{"country": "South Korea", "iso_code": "KR"},
{"country": "Sri Lanka", "iso_code": "LK"},
{"country": "Taiwan", "iso_code": "TW"},
{"country": "Tajikistan", "iso_code": "TJ"},
{"country": "Thailand", "iso_code": "TH"},
{"country": "Timor-Leste", "iso_code": "TL"},
{"country": "United Arab Emirates", "iso_code": "AE"},
{"country": "Uzbekistan", "iso_code": "UZ"},
{"country": "Vietnam", "iso_code": "VN"},
# Europe
{"country": "Albania", "iso_code": "AL"},
{"country": "Andorra", "iso_code": "AD"},
{"country": "Austria", "iso_code": "AT"},
{"country": "Belarus", "iso_code": "BY"},
{"country": "Belgium", "iso_code": "BE"},
{"country": "Bosnia and Herzegovina", "iso_code": "BA"},
{"country": "Bulgaria", "iso_code": "BG"},
{"country": "Croatia", "iso_code": "HR"},
{"country": "Cyprus", "iso_code": "CY"},
{"country": "Czech Republic", "iso_code": "CZ"},
{"country": "Denmark", "iso_code": "DK"},
{"country": "Estonia", "iso_code": "EE"},
{"country": "Finland", "iso_code": "FI"},
{"country": "France", "iso_code": "FR"},
{"country": "Germany", "iso_code": "DE"},
{"country": "Greece", "iso_code": "GR"},
{"country": "Hungary", "iso_code": "HU"},
{"country": "Iceland", "iso_code": "IS"},
{"country": "Ireland", "iso_code": "IE"},
{"country": "Italy", "iso_code": "IT"},
{"country": "Kazakhstan", "iso_code": "KZ"},
{"country": "Latvia", "iso_code": "LV"},
{"country": "Liechtenstein", "iso_code": "LI"},
{"country": "Lithuania", "iso_code": "LT"},
{"country": "Luxembourg", "iso_code": "LU"},
{"country": "Malta", "iso_code": "MT"},
{"country": "Moldova", "iso_code": "MD"},
{"country": "Monaco", "iso_code": "MC"},
{"country": "Montenegro", "iso_code": "ME"},
{"country": "Netherlands", "iso_code": "NL"},
{"country": "North Macedonia", "iso_code": "MK"},
{"country": "Norway", "iso_code": "NO"},
{"country": "Poland", "iso_code": "PL"},
{"country": "Portugal", "iso_code": "PT"},
{"country": "Romania", "iso_code": "RO"},
{"country": "Serbia", "iso_code": "RS"},
{"country": "Slovakia", "iso_code": "SK"},
{"country": "Slovenia", "iso_code": "SI"},
{"country": "Spain", "iso_code": "ES"},
{"country": "Sweden", "iso_code": "SE"},
{"country": "Switzerland", "iso_code": "CH"},
{"country": "Turkey", "iso_code": "TR"},
{"country": "Ukraine", "iso_code": "UA"},
{"country": "United Kingdom", "iso_code": "GB"},
# North America
{"country": "Antigua and Barbuda", "iso_code": "AG"},
{"country": "Bahamas", "iso_code": "BS"},
{"country": "Barbados", "iso_code": "BB"},
{"country": "Belize", "iso_code": "BZ"},
{"country": "Canada", "iso_code": "CA"},
{"country": "Costa Rica", "iso_code": "CR"},
{"country": "Dominica", "iso_code": "DM"},
{"country": "Dominican Republic", "iso_code": "DO"},
{"country": "El Salvador", "iso_code": "SV"},
{"country": "Grenada", "iso_code": "GD"},
{"country": "Guatemala", "iso_code": "GT"},
{"country": "Haiti", "iso_code": "HT"},
{"country": "Honduras", "iso_code": "HN"},
{"country": "Jamaica", "iso_code": "JM"},
{"country": "Mexico", "iso_code": "MX"},
{"country": "Nicaragua", "iso_code": "NI"},
{"country": "Panama", "iso_code": "PA"},
{"country": "Saint Kitts and Nevis", "iso_code": "KN"},
{"country": "Saint Lucia", "iso_code": "LC"},
{"country": "Saint Vincent and the Grenadines", "iso_code": "VC"},
{"country": "Trinidad and Tobago", "iso_code": "TT"},
{"country": "United States", "iso_code": "US"},
# South America
{"country": "Argentina", "iso_code": "AR"},
{"country": "Bolivia", "iso_code": "BO"},
{"country": "Brazil", "iso_code": "BR"},
{"country": "Chile", "iso_code": "CL"},
{"country": "Colombia", "iso_code": "CO"},
{"country": "Ecuador", "iso_code": "EC"},
{"country": "Guyana", "iso_code": "GY"},
{"country": "Paraguay", "iso_code": "PY"},
{"country": "Peru", "iso_code": "PE"},
{"country": "Suriname", "iso_code": "SR"},
{"country": "Uruguay", "iso_code": "UY"},
{"country": "Venezuela", "iso_code": "VE"},
# Oceania
{"country": "Australia", "iso_code": "AU"},
{"country": "Fiji", "iso_code": "FJ"},
{"country": "Kiribati", "iso_code": "KI"},
{"country": "Marshall Islands", "iso_code": "MH"},
{"country": "Micronesia", "iso_code": "FM"},
{"country": "Nauru", "iso_code": "NR"},
{"country": "New Zealand", "iso_code": "NZ"},
{"country": "Palau", "iso_code": "PW"},
{"country": "Papua New Guinea", "iso_code": "PG"},
{"country": "Samoa", "iso_code": "WS"},
{"country": "Solomon Islands", "iso_code": "SB"},
{"country": "Tonga", "iso_code": "TO"},
{"country": "Tuvalu", "iso_code": "TV"},
{"country": "Vanuatu", "iso_code": "VU"}
]
base_url = "https://www.spotify.com/{iso_code}/premium/#plans"
def scrape_pricing(url):
    try:
        response = requests.get(url, timeout=10)
        response.raise_for_status()
        soup = BeautifulSoup(response.text, "html.parser")
        price_element = soup.find("p", class_="sc-71cce616-6 fQVkUv")
        if not price_element:
            return "Pricing not found in HTML"
        raw_price_text = price_element.get_text(strip=True)
        # Extract only numeric values, ignoring currency symbols and other characters
        match = re.search(r"\d+(?:[\.,]\d+)*", raw_price_text)
        if match:
            # Strip commas and dots so only digits remain; the result is a
            # string, and any decimal separator is dropped ("4.99" -> "499")
            numeric_price = match.group(0).replace(",", "").replace(".", "")
            return numeric_price
        else:
            return "N/A"
    except Exception as e:
        return f"An error occurred: {str(e)}"
results = []
for country in countries:
    iso_code = country["iso_code"].lower()
    url = base_url.format(iso_code=iso_code)
    print(f"Scraping {country['country']} ({iso_code}): {url}")
    pricing = scrape_pricing(url)
    results.append({
        "Country": country["country"],
        "ISO Code": iso_code.upper(),
        "Pricing": pricing
    })
df = pd.DataFrame(results)
# Ensure the 'Pricing' column only contains numeric values
df["Pricing"] = df["Pricing"].apply(lambda x: re.sub(r"[^\d]", "", x) if isinstance(x, str) else x)
output_file = "spotify_pricing_cleaned.csv"
df.to_csv(output_file, index=False)
print(f"\nResults saved to {output_file}")
print("\nScraped Results:")
print(df)
Scraping Algeria (dz): https://www.spotify.com/dz/premium/#plans
Scraping Angola (ao): https://www.spotify.com/ao/premium/#plans
Scraping Benin (bj): https://www.spotify.com/bj/premium/#plans
Scraping Botswana (bw): https://www.spotify.com/bw/premium/#plans
Scraping Burkina Faso (bf): https://www.spotify.com/bf/premium/#plans
Scraping Burundi (bi): https://www.spotify.com/bi/premium/#plans
Scraping Cameroon (cm): https://www.spotify.com/cm/premium/#plans
Scraping Cape Verde (cv): https://www.spotify.com/cv/premium/#plans
Scraping Chad (td): https://www.spotify.com/td/premium/#plans
Scraping Comoros (km): https://www.spotify.com/km/premium/#plans
Scraping Côte d'Ivoire (ci): https://www.spotify.com/ci/premium/#plans
Scraping Democratic Republic of the Congo (cd): https://www.spotify.com/cd/premium/#plans
Scraping Djibouti (dj): https://www.spotify.com/dj/premium/#plans
Scraping Egypt (eg): https://www.spotify.com/eg/premium/#plans
Scraping Ethiopia (et): https://www.spotify.com/et/premium/#plans
Scraping Equatorial Guinea (gq): https://www.spotify.com/gq/premium/#plans
Scraping Eswatini (sz): https://www.spotify.com/sz/premium/#plans
Scraping Gabon (ga): https://www.spotify.com/ga/premium/#plans
Scraping Gambia (gm): https://www.spotify.com/gm/premium/#plans
Scraping Ghana (gh): https://www.spotify.com/gh/premium/#plans
Scraping Guinea (gn): https://www.spotify.com/gn/premium/#plans
Scraping Guinea-Bissau (gw): https://www.spotify.com/gw/premium/#plans
Scraping Kenya (ke): https://www.spotify.com/ke/premium/#plans
Scraping Lesotho (ls): https://www.spotify.com/ls/premium/#plans
Scraping Liberia (lr): https://www.spotify.com/lr/premium/#plans
Scraping Libya (ly): https://www.spotify.com/ly/premium/#plans
Scraping Madagascar (mg): https://www.spotify.com/mg/premium/#plans
Scraping Malawi (mw): https://www.spotify.com/mw/premium/#plans
Scraping Mali (ml): https://www.spotify.com/ml/premium/#plans
Scraping Mauritania (mr): https://www.spotify.com/mr/premium/#plans
Scraping Mauritius (mu): https://www.spotify.com/mu/premium/#plans
Scraping Morocco (ma): https://www.spotify.com/ma/premium/#plans
Scraping Mozambique (mz): https://www.spotify.com/mz/premium/#plans
Scraping Namibia (na): https://www.spotify.com/na/premium/#plans
Scraping Niger (ne): https://www.spotify.com/ne/premium/#plans
Scraping Nigeria (ng): https://www.spotify.com/ng/premium/#plans
Scraping Republic of the Congo (cg): https://www.spotify.com/cg/premium/#plans
Scraping Rwanda (rw): https://www.spotify.com/rw/premium/#plans
Scraping São Tomé and Príncipe (st): https://www.spotify.com/st/premium/#plans
Scraping Senegal (sn): https://www.spotify.com/sn/premium/#plans
Scraping Seychelles (sc): https://www.spotify.com/sc/premium/#plans
Scraping Sierra Leone (sl): https://www.spotify.com/sl/premium/#plans
Scraping South Africa (za): https://www.spotify.com/za/premium/#plans
Scraping Tanzania (tz): https://www.spotify.com/tz/premium/#plans
Scraping Togo (tg): https://www.spotify.com/tg/premium/#plans
Scraping Tunisia (tn): https://www.spotify.com/tn/premium/#plans
Scraping Uganda (ug): https://www.spotify.com/ug/premium/#plans
Scraping Zambia (zm): https://www.spotify.com/zm/premium/#plans
Scraping Zimbabwe (zw): https://www.spotify.com/zw/premium/#plans
Scraping Armenia (am): https://www.spotify.com/am/premium/#plans
Scraping Azerbaijan (az): https://www.spotify.com/az/premium/#plans
Scraping Bahrain (bh): https://www.spotify.com/bh/premium/#plans
Scraping Bangladesh (bd): https://www.spotify.com/bd/premium/#plans
Scraping Bhutan (bt): https://www.spotify.com/bt/premium/#plans
Scraping Brunei Darussalam (bn): https://www.spotify.com/bn/premium/#plans
Scraping Cambodia (kh): https://www.spotify.com/kh/premium/#plans
Scraping Georgia (ge): https://www.spotify.com/ge/premium/#plans
Scraping Hong Kong (hk): https://www.spotify.com/hk/premium/#plans
Scraping India (in): https://www.spotify.com/in/premium/#plans
Scraping Indonesia (id): https://www.spotify.com/id/premium/#plans
Scraping Iraq (iq): https://www.spotify.com/iq/premium/#plans
Scraping Israel (il): https://www.spotify.com/il/premium/#plans
Scraping Japan (jp): https://www.spotify.com/jp/premium/#plans
Scraping Jordan (jo): https://www.spotify.com/jo/premium/#plans
Scraping Kuwait (kw): https://www.spotify.com/kw/premium/#plans
Scraping Kyrgyzstan (kg): https://www.spotify.com/kg/premium/#plans
Scraping Lao People's Democratic Republic (la): https://www.spotify.com/la/premium/#plans
Scraping Lebanon (lb): https://www.spotify.com/lb/premium/#plans
Scraping Macao (mo): https://www.spotify.com/mo/premium/#plans
Scraping Malaysia (my): https://www.spotify.com/my/premium/#plans
Scraping Maldives (mv): https://www.spotify.com/mv/premium/#plans
Scraping Mongolia (mn): https://www.spotify.com/mn/premium/#plans
Scraping Nepal (np): https://www.spotify.com/np/premium/#plans
Scraping Oman (om): https://www.spotify.com/om/premium/#plans
Scraping Pakistan (pk): https://www.spotify.com/pk/premium/#plans
Scraping Palestine (ps): https://www.spotify.com/ps/premium/#plans
Scraping Philippines (ph): https://www.spotify.com/ph/premium/#plans
Scraping Qatar (qa): https://www.spotify.com/qa/premium/#plans
Scraping Saudi Arabia (sa): https://www.spotify.com/sa/premium/#plans
Scraping Singapore (sg): https://www.spotify.com/sg/premium/#plans
Scraping South Korea (kr): https://www.spotify.com/kr/premium/#plans
Scraping Sri Lanka (lk): https://www.spotify.com/lk/premium/#plans
Scraping Taiwan (tw): https://www.spotify.com/tw/premium/#plans
Scraping Tajikistan (tj): https://www.spotify.com/tj/premium/#plans
Scraping Thailand (th): https://www.spotify.com/th/premium/#plans
Scraping Timor-Leste (tl): https://www.spotify.com/tl/premium/#plans
Scraping United Arab Emirates (ae): https://www.spotify.com/ae/premium/#plans
Scraping Uzbekistan (uz): https://www.spotify.com/uz/premium/#plans
Scraping Vietnam (vn): https://www.spotify.com/vn/premium/#plans
Scraping Albania (al): https://www.spotify.com/al/premium/#plans
Scraping Andorra (ad): https://www.spotify.com/ad/premium/#plans
Scraping Austria (at): https://www.spotify.com/at/premium/#plans
Scraping Belarus (by): https://www.spotify.com/by/premium/#plans
Scraping Belgium (be): https://www.spotify.com/be/premium/#plans
Scraping Bosnia and Herzegovina (ba): https://www.spotify.com/ba/premium/#plans
Scraping Bulgaria (bg): https://www.spotify.com/bg/premium/#plans
Scraping Croatia (hr): https://www.spotify.com/hr/premium/#plans
Scraping Cyprus (cy): https://www.spotify.com/cy/premium/#plans
Scraping Czech Republic (cz): https://www.spotify.com/cz/premium/#plans
Scraping Denmark (dk): https://www.spotify.com/dk/premium/#plans
Scraping Estonia (ee): https://www.spotify.com/ee/premium/#plans
Scraping Finland (fi): https://www.spotify.com/fi/premium/#plans
Scraping France (fr): https://www.spotify.com/fr/premium/#plans
Scraping Germany (de): https://www.spotify.com/de/premium/#plans
Scraping Greece (gr): https://www.spotify.com/gr/premium/#plans
Scraping Hungary (hu): https://www.spotify.com/hu/premium/#plans
Scraping Iceland (is): https://www.spotify.com/is/premium/#plans
Scraping Ireland (ie): https://www.spotify.com/ie/premium/#plans
Scraping Italy (it): https://www.spotify.com/it/premium/#plans
Scraping Kazakhstan (kz): https://www.spotify.com/kz/premium/#plans
Scraping Latvia (lv): https://www.spotify.com/lv/premium/#plans
Scraping Liechtenstein (li): https://www.spotify.com/li/premium/#plans
Scraping Lithuania (lt): https://www.spotify.com/lt/premium/#plans
Scraping Luxembourg (lu): https://www.spotify.com/lu/premium/#plans
Scraping Malta (mt): https://www.spotify.com/mt/premium/#plans
Scraping Moldova (md): https://www.spotify.com/md/premium/#plans
Scraping Monaco (mc): https://www.spotify.com/mc/premium/#plans
Scraping Montenegro (me): https://www.spotify.com/me/premium/#plans
Scraping Netherlands (nl): https://www.spotify.com/nl/premium/#plans
Scraping North Macedonia (mk): https://www.spotify.com/mk/premium/#plans
Scraping Norway (no): https://www.spotify.com/no/premium/#plans
Scraping Poland (pl): https://www.spotify.com/pl/premium/#plans
Scraping Portugal (pt): https://www.spotify.com/pt/premium/#plans
Scraping Romania (ro): https://www.spotify.com/ro/premium/#plans
Scraping Serbia (rs): https://www.spotify.com/rs/premium/#plans
Scraping Slovakia (sk): https://www.spotify.com/sk/premium/#plans
Scraping Slovenia (si): https://www.spotify.com/si/premium/#plans
Scraping Spain (es): https://www.spotify.com/es/premium/#plans
Scraping Sweden (se): https://www.spotify.com/se/premium/#plans
Scraping Switzerland (ch): https://www.spotify.com/ch/premium/#plans
Scraping Turkey (tr): https://www.spotify.com/tr/premium/#plans
Scraping Ukraine (ua): https://www.spotify.com/ua/premium/#plans
Scraping United Kingdom (gb): https://www.spotify.com/gb/premium/#plans
Scraping Antigua and Barbuda (ag): https://www.spotify.com/ag/premium/#plans
Scraping Bahamas (bs): https://www.spotify.com/bs/premium/#plans
Scraping Barbados (bb): https://www.spotify.com/bb/premium/#plans
Scraping Belize (bz): https://www.spotify.com/bz/premium/#plans
Scraping Canada (ca): https://www.spotify.com/ca/premium/#plans
Scraping Costa Rica (cr): https://www.spotify.com/cr/premium/#plans
Scraping Dominica (dm): https://www.spotify.com/dm/premium/#plans
Scraping Dominican Republic (do): https://www.spotify.com/do/premium/#plans
Scraping El Salvador (sv): https://www.spotify.com/sv/premium/#plans
Scraping Grenada (gd): https://www.spotify.com/gd/premium/#plans
Scraping Guatemala (gt): https://www.spotify.com/gt/premium/#plans
Scraping Haiti (ht): https://www.spotify.com/ht/premium/#plans
Scraping Honduras (hn): https://www.spotify.com/hn/premium/#plans
Scraping Jamaica (jm): https://www.spotify.com/jm/premium/#plans
Scraping Mexico (mx): https://www.spotify.com/mx/premium/#plans
Scraping Nicaragua (ni): https://www.spotify.com/ni/premium/#plans
Scraping Panama (pa): https://www.spotify.com/pa/premium/#plans
Scraping Saint Kitts and Nevis (kn): https://www.spotify.com/kn/premium/#plans
Scraping Saint Lucia (lc): https://www.spotify.com/lc/premium/#plans
Scraping Saint Vincent and the Grenadines (vc): https://www.spotify.com/vc/premium/#plans
Scraping Trinidad and Tobago (tt): https://www.spotify.com/tt/premium/#plans
Scraping United States (us): https://www.spotify.com/us/premium/#plans
Scraping Argentina (ar): https://www.spotify.com/ar/premium/#plans
Scraping Bolivia (bo): https://www.spotify.com/bo/premium/#plans
Scraping Brazil (br): https://www.spotify.com/br/premium/#plans
Scraping Chile (cl): https://www.spotify.com/cl/premium/#plans
Scraping Colombia (co): https://www.spotify.com/co/premium/#plans
Scraping Ecuador (ec): https://www.spotify.com/ec/premium/#plans
Scraping Guyana (gy): https://www.spotify.com/gy/premium/#plans
Scraping Paraguay (py): https://www.spotify.com/py/premium/#plans
Scraping Peru (pe): https://www.spotify.com/pe/premium/#plans
Scraping Suriname (sr): https://www.spotify.com/sr/premium/#plans
Scraping Uruguay (uy): https://www.spotify.com/uy/premium/#plans
Scraping Venezuela (ve): https://www.spotify.com/ve/premium/#plans
Scraping Australia (au): https://www.spotify.com/au/premium/#plans
Scraping Fiji (fj): https://www.spotify.com/fj/premium/#plans
Scraping Kiribati (ki): https://www.spotify.com/ki/premium/#plans
Scraping Marshall Islands (mh): https://www.spotify.com/mh/premium/#plans
Scraping Micronesia (fm): https://www.spotify.com/fm/premium/#plans
Scraping Nauru (nr): https://www.spotify.com/nr/premium/#plans
Scraping New Zealand (nz): https://www.spotify.com/nz/premium/#plans
Scraping Palau (pw): https://www.spotify.com/pw/premium/#plans
Scraping Papua New Guinea (pg): https://www.spotify.com/pg/premium/#plans
Scraping Samoa (ws): https://www.spotify.com/ws/premium/#plans
Scraping Solomon Islands (sb): https://www.spotify.com/sb/premium/#plans
Scraping Tonga (to): https://www.spotify.com/to/premium/#plans
Scraping Tuvalu (tv): https://www.spotify.com/tv/premium/#plans
Scraping Vanuatu (vu): https://www.spotify.com/vu/premium/#plans
Results saved to spotify_pricing_cleaned.csv
Scraped Results:
Country ISO Code Pricing
0 Algeria DZ 499
1 Angola AO 299
2 Benin BJ 429
3 Botswana BW 299
4 Burkina Faso BF 299
.. ... ... ...
176 Samoa WS 499
177 Solomon Islands SB 499
178 Tonga TO 499
179 Tuvalu TV 1199
180 Vanuatu VU 499
[181 rows x 3 columns]
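Note that the scraper strips both commas and dots, so a price such as 4.99 is stored as 499 and the decimal place has to be restored afterward. One possible way to avoid that is sketched below, under the assumption that a final separator followed by one or two digits marks the decimal part. The helper name `parse_price` and the sample strings are hypothetical and not part of the original notebook.

```python
import re
from typing import Optional


def parse_price(text: str) -> Optional[float]:
    """Extract the first price in `text`, keeping its decimal part.

    Assumption: a final separator followed by one or two digits is a
    decimal separator; any other comma or dot is a thousands separator.
    """
    match = re.search(r"\d+(?:[.,]\d+)*", text)
    if match is None:
        return None
    number = match.group(0)
    decimal = re.search(r"[.,](\d{1,2})$", number)
    if decimal:
        whole = re.sub(r"[.,]", "", number[: decimal.start()])
        return float(f"{whole}.{decimal.group(1)}")
    return float(re.sub(r"[.,]", "", number))


print(parse_price("US$4.99 / month"))  # 4.99
print(parse_price("12,99 € / Monat"))  # 12.99
print(parse_price("Rp 54.990"))        # 54990.0
```

The heuristic would misread currencies quoted with three decimal places, so any parsed values would still deserve a spot check.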
Converting to USD
After obtaining my dataset, I needed to convert all prices to USD to ensure comparability. First, I identified the currency code for each country, excluding those already in EUR, USD, or AUD. To simplify this step, I used ChatGPT to generate a list of currency codes.
Next, I utilised a free API from ExchangeRate-API, which allowed me to convert prices into USD based on the respective currency codes. The Python code below demonstrates the conversion process.
import pandas as pd
import requests
response = requests.get('https://v6.exchangerate-api.com/v6/26e098c4aabde83f585245b6/latest/USD')
data = response.json()
rates = data['conversion_rates']
df = pd.read_csv('spotify_prices_usd_4.csv')
# Convert Pricing to float
df['Pricing'] = pd.to_numeric(df['Pricing'], errors='coerce')
def convert_to_usd(row):
    if row['Currency'] == 'USD':
        return row['Pricing']
    try:
        return row['Pricing'] / rates[row['Currency']]
    except KeyError:
        print(f"Warning: No rate found for {row['Currency']}")
        return None
df['Price_USD'] = df.apply(convert_to_usd, axis=1)
df.to_csv('spotify_prices_usd_5.csv', index=False)
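The division in `convert_to_usd` works because the rates are requested with USD as the base, so each rate is read here as units of local currency per 1 USD. A self-contained sketch with made-up rates (not live values) illustrates the direction of the conversion; `sample_rates` and `to_usd` are hypothetical names.

```python
from typing import Optional

# Made-up rates for illustration: units of each currency per 1 USD.
sample_rates = {"USD": 1.0, "EUR": 0.5, "INR": 80.0}


def to_usd(price: float, currency: str, rates: dict) -> Optional[float]:
    """Convert a local-currency price to USD, or None if the rate is missing."""
    rate = rates.get(currency)
    if rate is None:
        return None
    return price / rate


print(to_usd(120.0, "INR", sample_rates))  # 1.5
print(to_usd(5.0, "EUR", sample_rates))    # 10.0
print(to_usd(9.99, "XYZ", sample_rates))   # None
```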
Analysis
Before conducting the final analysis, I needed to combine the price data with GDP per capita data from the World Bank. I used ChatGPT to assist with this step. However, the ISO codes for countries in the two datasets did not align, so I had to match the data based on country names. This required some manual cleaning, as certain country names differed between the datasets.
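A sketch of that name-based matching, using tiny made-up tables, might look like the following. The column names, the `name_fixes` mapping, and the sample rows are all hypothetical, and the GDP figures are placeholders rather than World Bank values. The `indicator=True` option of the pandas `merge` method is a convenient way to list rows that failed to match.

```python
import pandas as pd

# Placeholder data; names and numbers are illustrative only.
prices = pd.DataFrame({
    "Country": ["Turkey", "Vietnam", "France"],
    "Price_USD": [1.0, 2.0, 3.0],
})
gdp = pd.DataFrame({
    "Country Name": ["Turkiye", "Viet Nam", "France"],
    "GDP per capita (current US$)": [100.0, 200.0, 300.0],
})

# Manual fixes for names that differ between the two sources.
name_fixes = {"Turkiye": "Turkey"}
gdp["Country"] = gdp["Country Name"].replace(name_fixes)

merged = prices.merge(gdp, on="Country", how="left", indicator=True)

# Rows flagged "left_only" still need a manual name fix.
unmatched = merged.loc[merged["_merge"] == "left_only", "Country"].tolist()
print(unmatched)
```

Here "Vietnam" remains unmatched because no fix for "Viet Nam" was supplied, which is the kind of case that required manual cleaning.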
Once the data was properly merged, the regression analysis was relatively straightforward.
import pandas as pd
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import OneHotEncoder
import statsmodels.api as sm
spotify_data = pd.read_csv('Spotify_Prices_Merged_with_GDP_Data__By_Country_Name_.csv')
# Simple regression
gdp_per_capita = spotify_data['GDP per capita (current US$)'].values.reshape(-1, 1)
spotify_prices = spotify_data['Price_USD'].values
valid_data_mask = ~(np.isnan(gdp_per_capita).any(axis=1) | np.isnan(spotify_prices))
gdp_clean = gdp_per_capita[valid_data_mask]
prices_clean = spotify_prices[valid_data_mask]
# GDP only model
X_gdp = sm.add_constant(gdp_clean)
X_gdp = pd.DataFrame(X_gdp, columns=['const', 'GDP_per_capita'])
gdp_only_model = sm.OLS(prices_clean, X_gdp).fit()
print("Simple Regression Results (GDP Only):")
print(gdp_only_model.summary())
# Multiple regression with regions
region_encoder = OneHotEncoder(drop='first')
region_dummy_vars = region_encoder.fit_transform(spotify_data[['Region']]).toarray()
region_variable_names = region_encoder.get_feature_names_out(['Region'])
gdp_and_region_vars = np.column_stack([gdp_per_capita, region_dummy_vars])
gdp_and_region_clean = gdp_and_region_vars[valid_data_mask]
# Create DataFrame with named columns for the full model
X_full = sm.add_constant(gdp_and_region_clean)
X_full = pd.DataFrame(X_full,
columns=['const', 'GDP_per_capita'] + list(region_variable_names))
full_model = sm.OLS(prices_clean, X_full).fit()
print("\nMultiple Regression Results (GDP + Regions):")
print(full_model.summary())
Simple Regression Results (GDP Only):
OLS Regression Results
==============================================================================
Dep. Variable: y R-squared: 0.491
Model: OLS Adj. R-squared: 0.488
Method: Least Squares F-statistic: 166.0
Date: Sat, 25 Jan 2025 Prob (F-statistic): 5.09e-27
Time: 16:33:54 Log-Likelihood: -384.66
No. Observations: 174 AIC: 773.3
Df Residuals: 172 BIC: 779.6
Df Model: 1
Covariance Type: nonrobust
==================================================================================
coef std err t P>|t| [0.025 0.975]
----------------------------------------------------------------------------------
const 4.2054 0.202 20.846 0.000 3.807 4.604
GDP_per_capita 7.306e-05 5.67e-06 12.883 0.000 6.19e-05 8.43e-05
==============================================================================
Omnibus: 22.466 Durbin-Watson: 1.352
Prob(Omnibus): 0.000 Jarque-Bera (JB): 108.440
Skew: -0.148 Prob(JB): 2.84e-24
Kurtosis: 6.856 Cond. No. 4.26e+04
==============================================================================
Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
[2] The condition number is large, 4.26e+04. This might indicate that there are
strong multicollinearity or other numerical problems.
Multiple Regression Results (GDP + Regions):
OLS Regression Results
==============================================================================
Dep. Variable: y R-squared: 0.682
Model: OLS Adj. R-squared: 0.673
Method: Least Squares F-statistic: 72.14
Date: Sat, 25 Jan 2025 Prob (F-statistic): 5.06e-40
Time: 16:33:54 Log-Likelihood: -343.69
No. Observations: 174 AIC: 699.4
Df Residuals: 168 BIC: 718.3
Df Model: 5
Covariance Type: nonrobust
========================================================================================
coef std err t P>|t| [0.025 0.975]
----------------------------------------------------------------------------------------
const 2.8453 0.254 11.201 0.000 2.344 3.347
GDP_per_capita 5.244e-05 5.25e-06 9.982 0.000 4.21e-05 6.28e-05
Region_Asia 0.8306 0.396 2.098 0.037 0.049 1.612
Region_Europe 3.6181 0.428 8.457 0.000 2.774 4.463
Region_North America 2.3391 0.409 5.716 0.000 1.531 3.147
Region_Oceania 3.5290 0.557 6.338 0.000 2.430 4.628
==============================================================================
Omnibus: 35.733 Durbin-Watson: 1.959
Prob(Omnibus): 0.000 Jarque-Bera (JB): 149.933
Skew: -0.660 Prob(JB): 2.77e-33
Kurtosis: 7.352 Cond. No. 1.83e+05
==============================================================================
Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
[2] The condition number is large, 1.83e+05. This might indicate that there are
strong multicollinearity or other numerical problems.