World Spotify Data Analysis
The goal of this mini-project is to investigate how Spotify prices vary globally. In one of the courses I am taking, "Globalization and Economic Policy" (PP418), we discussed the principle that, in the absence of transport costs, product differentiation, and other frictions, prices should converge. This concept is known as the law of one price. One might initially assume that digital products, which have low delivery frictions, would exhibit similar prices globally. However, the opposite appears to be true. This project aims to explore this puzzle further.
Findings
My investigation revealed substantial variation in Spotify prices across countries. The focus of the analysis was the price of an individual Spotify subscription. Key statistics include:
- Mean monthly price: $5.67 USD
- Standard deviation: $3.14 USD
- Price range: $0.84 to $15.40 USD
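Summary figures like these can be reproduced from the converted dataset with a few lines of pandas. This is only a sketch: it assumes a `Price_USD` column like the one created in the conversion step, the helper name `summarize_prices` is hypothetical, and the small table of prices is made-up illustrative data, not the scraped dataset.

```python
import pandas as pd


def summarize_prices(df: pd.DataFrame, column: str = "Price_USD") -> pd.Series:
    """Return the mean, sample standard deviation, minimum and maximum of a price column."""
    return df[column].agg(["mean", "std", "min", "max"])


# Illustrative values only (not the real scraped prices).
example = pd.DataFrame({"Price_USD": [2.0, 4.0, 6.0, 8.0]})
stats = summarize_prices(example)
print(stats.round(2))
```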
Interestingly, there was a strong correlation between Spotify prices and GDP per capita, supporting a price discrimination narrative where Spotify tailors prices to each country's willingness to pay. A regression of Spotify price on GDP per capita found a statistically significant relationship at a confidence level exceeding 99%.
The results showed:
- T-value: 12.88
- P-value: 2e-16
- Effect size: For every $1 increase in GDP per capita, the Spotify price increased by $0.00007406 USD, or roughly $0.74 for every additional $10,000 of GDP per capita.
When region-specific dummy variables were added to the model, similar results were observed. Full regression results are presented below.
──────────────────────────────────────────────────────────────────
                                    (1)               (2)
──────────────────────────────────────────────────────────────────
(Intercept)                      4.23759370 ***    2.84854979 ***
                                (0.19895952)      (0.24292333)
GDP.per.capita..current.US..     0.00007315 ***    0.00005132 ***
                                (0.00000559)      (0.00000502)
RegionAmericas                                     2.35366651 ***
                                                  (0.39132262)
RegionAsia                                         0.84890086 *
                                                  (0.37868532)
RegionEurope                                       3.80082692 ***
                                                  (0.40909629)
RegionOceania                                      3.54120627 ***
                                                  (0.53240655)
──────────────────────────────────────────────────────────────────
N                              174               174
R2                               0.49861273        0.70566480
logLik                        -382.24840726     -335.90702626
AIC                            770.49681452      685.81405253
──────────────────────────────────────────────────────────────────
*** p < 0.001; ** p < 0.01; * p < 0.05.
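To make the coefficients concrete, the fitted values from the table can be plugged into the regression equation by hand. The sketch below does this for both columns. The function names and the example GDP figure are illustrative, and it assumes that Africa is the omitted baseline region in column (2), since it is the only region without its own dummy.

```python
# Coefficients copied from the regression table.
MODEL_1 = {"intercept": 4.23759370, "gdp": 0.00007315}
MODEL_2 = {
    "intercept": 2.84854979,
    "gdp": 0.00005132,
    # Region dummies; Africa is ASSUMED to be the omitted baseline (effect 0).
    "regions": {
        "Africa": 0.0,
        "Americas": 2.35366651,
        "Asia": 0.84890086,
        "Europe": 3.80082692,
        "Oceania": 3.54120627,
    },
}


def predict_model_1(gdp_per_capita: float) -> float:
    """Predicted monthly price in USD from the GDP-only model."""
    return MODEL_1["intercept"] + MODEL_1["gdp"] * gdp_per_capita


def predict_model_2(gdp_per_capita: float, region: str) -> float:
    """Predicted monthly price in USD from the GDP + region model."""
    return (
        MODEL_2["intercept"]
        + MODEL_2["gdp"] * gdp_per_capita
        + MODEL_2["regions"][region]
    )


gdp = 10_000  # hypothetical GDP per capita in current US$
print(round(predict_model_1(gdp), 2))
print(round(predict_model_2(gdp, "Europe"), 2))
```

Under these assumptions, the region dummies shift the whole price schedule up or down, while the GDP coefficient sets how quickly the price rises with income within a region.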
Theories
While these findings do not disprove the law of one price, several factors may explain the observed price differences:
- Price discrimination: Spotify appears to adjust prices based on each country’s willingness to pay.
- Product differentiation: The Spotify song catalogue may vary significantly between countries and regions, meaning the product being sold is not entirely uniform.
- Market competition: The degree of competition in music streaming services may differ across regions, influencing Spotify's pricing strategy.
- Technical barriers: Spotify likely requires a local bank account to purchase the service, which is a substantial barrier to buying it at another country's lower price.
Code
I’m neither a programmer nor a data scientist, so I’m not claiming this is the most efficient method, but here’s how I acquired the data. I ran a loop to scrape each Spotify pricing plan page and used the Beautiful Soup package to extract the HTML containing the price for individual plans. From there, I cleaned the extracted data to retain only numeric values.
Unfortunately, a significant amount of manual cleaning was required afterward. First, some of the scraping failed, particularly for languages that did not use Latin characters. Second, I had to manually check which countries priced their plans in USD, EUR, or AUD rather than the local currency. Lastly, some countries, such as India, priced Spotify subscriptions for a minimum of two months, which required adjustments.
After addressing these issues, I had a dataset with Spotify prices in various currency units.
import requests
from bs4 import BeautifulSoup
import pandas as pd
import re
countries = [
# Africa
{"country": "Algeria", "iso_code": "DZ"},
{"country": "Angola", "iso_code": "AO"},
{"country": "Benin", "iso_code": "BJ"},
{"country": "Botswana", "iso_code": "BW"},
{"country": "Burkina Faso", "iso_code": "BF"},
{"country": "Burundi", "iso_code": "BI"},
{"country": "Cameroon", "iso_code": "CM"},
{"country": "Cape Verde", "iso_code": "CV"},
{"country": "Chad", "iso_code": "TD"},
{"country": "Comoros", "iso_code": "KM"},
{"country": "Côte d'Ivoire", "iso_code": "CI"},
{"country": "Democratic Republic of the Congo", "iso_code": "CD"},
{"country": "Djibouti", "iso_code": "DJ"},
{"country": "Egypt", "iso_code": "EG"},
{"country": "Ethiopia", "iso_code": "ET"},
{"country": "Equatorial Guinea", "iso_code": "GQ"},
{"country": "Eswatini", "iso_code": "SZ"},
{"country": "Gabon", "iso_code": "GA"},
{"country": "Gambia", "iso_code": "GM"},
{"country": "Ghana", "iso_code": "GH"},
{"country": "Guinea", "iso_code": "GN"},
{"country": "Guinea-Bissau", "iso_code": "GW"},
{"country": "Kenya", "iso_code": "KE"},
{"country": "Lesotho", "iso_code": "LS"},
{"country": "Liberia", "iso_code": "LR"},
{"country": "Libya", "iso_code": "LY"},
{"country": "Madagascar", "iso_code": "MG"},
{"country": "Malawi", "iso_code": "MW"},
{"country": "Mali", "iso_code": "ML"},
{"country": "Mauritania", "iso_code": "MR"},
{"country": "Mauritius", "iso_code": "MU"},
{"country": "Morocco", "iso_code": "MA"},
{"country": "Mozambique", "iso_code": "MZ"},
{"country": "Namibia", "iso_code": "NA"},
{"country": "Niger", "iso_code": "NE"},
{"country": "Nigeria", "iso_code": "NG"},
{"country": "Republic of the Congo", "iso_code": "CG"},
{"country": "Rwanda", "iso_code": "RW"},
{"country": "São Tomé and Príncipe", "iso_code": "ST"},
{"country": "Senegal", "iso_code": "SN"},
{"country": "Seychelles", "iso_code": "SC"},
{"country": "Sierra Leone", "iso_code": "SL"},
{"country": "South Africa", "iso_code": "ZA"},
{"country": "Tanzania", "iso_code": "TZ"},
{"country": "Togo", "iso_code": "TG"},
{"country": "Tunisia", "iso_code": "TN"},
{"country": "Uganda", "iso_code": "UG"},
{"country": "Zambia", "iso_code": "ZM"},
{"country": "Zimbabwe", "iso_code": "ZW"},
# Asia
{"country": "Armenia", "iso_code": "AM"},
{"country": "Azerbaijan", "iso_code": "AZ"},
{"country": "Bahrain", "iso_code": "BH"},
{"country": "Bangladesh", "iso_code": "BD"},
{"country": "Bhutan", "iso_code": "BT"},
{"country": "Brunei Darussalam", "iso_code": "BN"},
{"country": "Cambodia", "iso_code": "KH"},
{"country": "Georgia", "iso_code": "GE"},
{"country": "Hong Kong", "iso_code": "HK"},
{"country": "India", "iso_code": "IN"},
{"country": "Indonesia", "iso_code": "ID"},
{"country": "Iraq", "iso_code": "IQ"},
{"country": "Israel", "iso_code": "IL"},
{"country": "Japan", "iso_code": "JP"},
{"country": "Jordan", "iso_code": "JO"},
{"country": "Kuwait", "iso_code": "KW"},
{"country": "Kyrgyzstan", "iso_code": "KG"},
{"country": "Lao People's Democratic Republic", "iso_code": "LA"},
{"country": "Lebanon", "iso_code": "LB"},
{"country": "Macao", "iso_code": "MO"},
{"country": "Malaysia", "iso_code": "MY"},
{"country": "Maldives", "iso_code": "MV"},
{"country": "Mongolia", "iso_code": "MN"},
{"country": "Nepal", "iso_code": "NP"},
{"country": "Oman", "iso_code": "OM"},
{"country": "Pakistan", "iso_code": "PK"},
{"country": "Palestine", "iso_code": "PS"},
{"country": "Philippines", "iso_code": "PH"},
{"country": "Qatar", "iso_code": "QA"},
{"country": "Saudi Arabia", "iso_code": "SA"},
{"country": "Singapore", "iso_code": "SG"},
{"country": "South Korea", "iso_code": "KR"},
{"country": "Sri Lanka", "iso_code": "LK"},
{"country": "Taiwan", "iso_code": "TW"},
{"country": "Tajikistan", "iso_code": "TJ"},
{"country": "Thailand", "iso_code": "TH"},
{"country": "Timor-Leste", "iso_code": "TL"},
{"country": "United Arab Emirates", "iso_code": "AE"},
{"country": "Uzbekistan", "iso_code": "UZ"},
{"country": "Vietnam", "iso_code": "VN"},
# Europe
{"country": "Albania", "iso_code": "AL"},
{"country": "Andorra", "iso_code": "AD"},
{"country": "Austria", "iso_code": "AT"},
{"country": "Belarus", "iso_code": "BY"},
{"country": "Belgium", "iso_code": "BE"},
{"country": "Bosnia and Herzegovina", "iso_code": "BA"},
{"country": "Bulgaria", "iso_code": "BG"},
{"country": "Croatia", "iso_code": "HR"},
{"country": "Cyprus", "iso_code": "CY"},
{"country": "Czech Republic", "iso_code": "CZ"},
{"country": "Denmark", "iso_code": "DK"},
{"country": "Estonia", "iso_code": "EE"},
{"country": "Finland", "iso_code": "FI"},
{"country": "France", "iso_code": "FR"},
{"country": "Germany", "iso_code": "DE"},
{"country": "Greece", "iso_code": "GR"},
{"country": "Hungary", "iso_code": "HU"},
{"country": "Iceland", "iso_code": "IS"},
{"country": "Ireland", "iso_code": "IE"},
{"country": "Italy", "iso_code": "IT"},
{"country": "Kazakhstan", "iso_code": "KZ"},
{"country": "Latvia", "iso_code": "LV"},
{"country": "Liechtenstein", "iso_code": "LI"},
{"country": "Lithuania", "iso_code": "LT"},
{"country": "Luxembourg", "iso_code": "LU"},
{"country": "Malta", "iso_code": "MT"},
{"country": "Moldova", "iso_code": "MD"},
{"country": "Monaco", "iso_code": "MC"},
{"country": "Montenegro", "iso_code": "ME"},
{"country": "Netherlands", "iso_code": "NL"},
{"country": "North Macedonia", "iso_code": "MK"},
{"country": "Norway", "iso_code": "NO"},
{"country": "Poland", "iso_code": "PL"},
{"country": "Portugal", "iso_code": "PT"},
{"country": "Romania", "iso_code": "RO"},
{"country": "Serbia", "iso_code": "RS"},
{"country": "Slovakia", "iso_code": "SK"},
{"country": "Slovenia", "iso_code": "SI"},
{"country": "Spain", "iso_code": "ES"},
{"country": "Sweden", "iso_code": "SE"},
{"country": "Switzerland", "iso_code": "CH"},
{"country": "Turkey", "iso_code": "TR"},
{"country": "Ukraine", "iso_code": "UA"},
{"country": "United Kingdom", "iso_code": "GB"},
# North America
{"country": "Antigua and Barbuda", "iso_code": "AG"},
{"country": "Bahamas", "iso_code": "BS"},
{"country": "Barbados", "iso_code": "BB"},
{"country": "Belize", "iso_code": "BZ"},
{"country": "Canada", "iso_code": "CA"},
{"country": "Costa Rica", "iso_code": "CR"},
{"country": "Dominica", "iso_code": "DM"},
{"country": "Dominican Republic", "iso_code": "DO"},
{"country": "El Salvador", "iso_code": "SV"},
{"country": "Grenada", "iso_code": "GD"},
{"country": "Guatemala", "iso_code": "GT"},
{"country": "Haiti", "iso_code": "HT"},
{"country": "Honduras", "iso_code": "HN"},
{"country": "Jamaica", "iso_code": "JM"},
{"country": "Mexico", "iso_code": "MX"},
{"country": "Nicaragua", "iso_code": "NI"},
{"country": "Panama", "iso_code": "PA"},
{"country": "Saint Kitts and Nevis", "iso_code": "KN"},
{"country": "Saint Lucia", "iso_code": "LC"},
{"country": "Saint Vincent and the Grenadines", "iso_code": "VC"},
{"country": "Trinidad and Tobago", "iso_code": "TT"},
{"country": "United States", "iso_code": "US"},
# South America
{"country": "Argentina", "iso_code": "AR"},
{"country": "Bolivia", "iso_code": "BO"},
{"country": "Brazil", "iso_code": "BR"},
{"country": "Chile", "iso_code": "CL"},
{"country": "Colombia", "iso_code": "CO"},
{"country": "Ecuador", "iso_code": "EC"},
{"country": "Guyana", "iso_code": "GY"},
{"country": "Paraguay", "iso_code": "PY"},
{"country": "Peru", "iso_code": "PE"},
{"country": "Suriname", "iso_code": "SR"},
{"country": "Uruguay", "iso_code": "UY"},
{"country": "Venezuela", "iso_code": "VE"},
# Oceania
{"country": "Australia", "iso_code": "AU"},
{"country": "Fiji", "iso_code": "FJ"},
{"country": "Kiribati", "iso_code": "KI"},
{"country": "Marshall Islands", "iso_code": "MH"},
{"country": "Micronesia", "iso_code": "FM"},
{"country": "Nauru", "iso_code": "NR"},
{"country": "New Zealand", "iso_code": "NZ"},
{"country": "Palau", "iso_code": "PW"},
{"country": "Papua New Guinea", "iso_code": "PG"},
{"country": "Samoa", "iso_code": "WS"},
{"country": "Solomon Islands", "iso_code": "SB"},
{"country": "Tonga", "iso_code": "TO"},
{"country": "Tuvalu", "iso_code": "TV"},
{"country": "Vanuatu", "iso_code": "VU"}
]
base_url = "https://www.spotify.com/{iso_code}/premium/#plans"
def scrape_pricing(url):
    """Fetch a Spotify premium page and return the individual-plan price as a string of digits,
    or a short message if the price cannot be found."""
    try:
        response = requests.get(url, timeout=10)
        response.raise_for_status()
        soup = BeautifulSoup(response.text, "html.parser")
        # Class name as it appeared on the page when scraped; generated names like this can change
        price_element = soup.find("p", class_="sc-71cce616-6 fQVkUv")
        if not price_element:
            return "Pricing not found in HTML"
        raw_price_text = price_element.get_text(strip=True)
        # Extract the first numeric token, ignoring currency symbols and other characters
        match = re.search(r"\d+(?:[\.,]\d+)*", raw_price_text)
        if match:
            # Strip commas and dots, leaving only digits
            # (this also drops the decimal point, so "4.99" becomes "499")
            numeric_price = match.group(0).replace(",", "").replace(".", "")
            return numeric_price
        else:
            return "N/A"
    except Exception as e:
        return f"An error occurred: {str(e)}"

results = []
for country in countries:
    iso_code = country["iso_code"].lower()
    url = base_url.format(iso_code=iso_code)
    print(f"Scraping {country['country']} ({iso_code}): {url}")
    pricing = scrape_pricing(url)
    results.append({
        "Country": country["country"],
        "ISO Code": iso_code.upper(),
        "Pricing": pricing
    })

df = pd.DataFrame(results)

# Keep digits only in the 'Pricing' column; failure messages become empty strings
# or stray digits (e.g. an HTTP status code) and need manual review
df["Pricing"] = df["Pricing"].apply(lambda x: re.sub(r"[^\d]", "", x) if isinstance(x, str) else x)

output_file = "spotify_pricing_cleaned.csv"
df.to_csv(output_file, index=False)
print(f"\nResults saved to {output_file}")
print("\nScraped Results:")
print(df)
Scraping Algeria (dz): https://www.spotify.com/dz/premium/#plans
Scraping Angola (ao): https://www.spotify.com/ao/premium/#plans
Scraping Benin (bj): https://www.spotify.com/bj/premium/#plans
...
Scraping Tuvalu (tv): https://www.spotify.com/tv/premium/#plans
Scraping Vanuatu (vu): https://www.spotify.com/vu/premium/#plans

Results saved to spotify_pricing_cleaned.csv

Scraped Results:
             Country ISO Code Pricing
0            Algeria       DZ     499
1             Angola       AO     299
2              Benin       BJ     429
3           Botswana       BW     299
4       Burkina Faso       BF     299
..               ...      ...     ...
176            Samoa       WS     499
177  Solomon Islands       SB     499
178            Tonga       TO     499
179           Tuvalu       TV    1199
180          Vanuatu       VU     499

[181 rows x 3 columns]
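Because the scraper strips both commas and dots, a price displayed as 4.99 would be saved as 499, which adds to the manual cleaning. A possible alternative, sketched here as an assumption rather than what was actually run, is to guess which separator is the decimal mark. The helper name `parse_price` and its heuristic (a final group of one or two digits is treated as decimals, three digits as a thousands group) are hypothetical, the example strings are invented, and the heuristic would still misread some currencies.

```python
import re
from typing import Optional


def parse_price(text: str) -> Optional[float]:
    """Extract the first number from a price string and return it as a float.

    Heuristic (an assumption, not a general rule): if the last separator is
    followed by one or two digits it is the decimal mark; otherwise every
    separator is treated as a thousands separator.
    """
    match = re.search(r"\d+(?:[.,]\d+)*", text)
    if match is None:
        return None
    token = match.group(0)
    last_sep = max(token.rfind("."), token.rfind(","))
    if last_sep == -1:
        return float(token)  # plain integer such as "119"
    decimals = token[last_sep + 1:]
    if len(decimals) in (1, 2):
        whole = re.sub(r"[.,]", "", token[:last_sep])
        return float(f"{whole}.{decimals}")
    # Three or more trailing digits: assume thousands separators only.
    return float(re.sub(r"[.,]", "", token))


# Invented example strings in a few common formats.
for raw in ["US$4.99", "9,99 €", "Rp 54.990", "1,199.00", "no price here"]:
    print(raw, "->", parse_price(raw))
```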
Converting to USD
After obtaining my dataset, I needed to convert all prices to USD to ensure comparability. First, I identified the currency code for each country, excluding those already in EUR, USD, or AUD. To simplify this step, I used ChatGPT to generate a list of currency codes.
Next, I utilised a free API from ExchangeRate-API, which allowed me to convert prices into USD based on the respective currency codes. The Python code below demonstrates the conversion process.
import pandas as pd
import requests
response = requests.get('https://v6.exchangerate-api.com/v6/26e098c4aabde83f585245b6/latest/USD')
data = response.json()
rates = data['conversion_rates']
df = pd.read_csv('spotify_prices_usd_4.csv')
# Convert Pricing to float
df['Pricing'] = pd.to_numeric(df['Pricing'], errors='coerce')
def convert_to_usd(row):
    if row['Currency'] == 'USD':
        return row['Pricing']
    try:
        return row['Pricing'] / rates[row['Currency']]
    except KeyError:
        print(f"Warning: No rate found for {row['Currency']}")
        return None
df['Price_USD'] = df.apply(convert_to_usd, axis=1)
df.to_csv('spotify_prices_usd_5.csv', index=False)
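The same conversion can also be written without a row-wise `apply`, by mapping each currency code to its rate and dividing the whole column at once. In this sketch the rates are made-up round numbers for illustration (not real exchange rates), expressed as units of local currency per 1 USD, which is why the price is divided by the rate. `add_usd_prices` is a hypothetical helper name.

```python
import pandas as pd


def add_usd_prices(df: pd.DataFrame, rates: dict) -> pd.DataFrame:
    """Return a copy of df with a Price_USD column.

    `rates` maps a currency code to units of that currency per 1 USD.
    Currencies missing from `rates` produce NaN instead of raising.
    """
    out = df.copy()
    per_usd = out["Currency"].map(rates)  # NaN where no rate is known
    out["Price_USD"] = out["Pricing"] / per_usd
    return out


# Illustrative rates and prices only (not real data).
example_rates = {"USD": 1.0, "EUR": 0.5, "AUD": 2.0}
example = pd.DataFrame({
    "Country": ["A", "B", "C", "D"],
    "Currency": ["USD", "EUR", "AUD", "XXX"],
    "Pricing": [10.0, 5.0, 12.0, 7.0],
})

converted = add_usd_prices(example, example_rates)
missing = converted.loc[converted["Price_USD"].isna(), "Currency"].tolist()
print(converted)
print("Currencies without a rate:", missing)
```

Listing the currencies that end up as NaN gives a quick check for mistyped or unsupported currency codes before the data goes into the regression.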
Analysis
Before conducting the final analysis, I needed to combine the price data with GDP per capita data from the World Bank. I used ChatGPT to assist with this step. However, the ISO codes for countries in the two datasets did not align, so I had to match the data based on country names. This required some manual cleaning, as certain country names differed between the datasets.
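Matching on names can fail silently, so it helps to list the names that did not match before fixing them by hand. The sketch below uses the `indicator` option of `DataFrame.merge` for this. The tiny example tables, the GDP values, the `NAME_FIXES` mapping, and the helper name `merge_by_name` are all illustrative assumptions, not the actual files used here.

```python
import pandas as pd

# Illustrative inputs only; the GDP numbers are placeholders, not World Bank data.
prices = pd.DataFrame({
    "Country": ["France", "South Korea"],
    "Price_USD": [11.0, 8.0],
})
gdp = pd.DataFrame({
    "Country Name": ["France", "Korea, Rep."],
    "GDP per capita (current US$)": [40000.0, 30000.0],
})

# Hypothetical manual mapping from the price table's names to the GDP table's names.
NAME_FIXES = {"South Korea": "Korea, Rep."}


def merge_by_name(prices: pd.DataFrame, gdp: pd.DataFrame, name_fixes: dict):
    """Left-join prices onto GDP by country name and report unmatched countries."""
    left = prices.assign(
        match_name=prices["Country"].map(lambda name: name_fixes.get(name, name))
    )
    merged = left.merge(
        gdp, how="left", left_on="match_name", right_on="Country Name", indicator=True
    )
    unmatched = merged.loc[merged["_merge"] == "left_only", "Country"].tolist()
    return merged.drop(columns=["match_name", "_merge"]), unmatched


_, unmatched_before = merge_by_name(prices, gdp, {})
merged, unmatched_after = merge_by_name(prices, gdp, NAME_FIXES)
print("Unmatched without fixes:", unmatched_before)
print("Unmatched with fixes:", unmatched_after)
```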
Once the data was properly merged, the regression analysis was relatively straightforward.
import pandas as pd
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import OneHotEncoder
import statsmodels.api as sm
spotify_data = pd.read_csv('Spotify_Prices_Merged_with_GDP_Data__By_Country_Name_.csv')
# Simple regression
gdp_per_capita = spotify_data['GDP per capita (current US$)'].values.reshape(-1, 1)
spotify_prices = spotify_data['Price_USD'].values
valid_data_mask = ~(np.isnan(gdp_per_capita).any(axis=1) | np.isnan(spotify_prices))
gdp_clean = gdp_per_capita[valid_data_mask]
prices_clean = spotify_prices[valid_data_mask]
# GDP only model
X_gdp = sm.add_constant(gdp_clean)
X_gdp = pd.DataFrame(X_gdp, columns=['const', 'GDP_per_capita'])
gdp_only_model = sm.OLS(prices_clean, X_gdp).fit()
print("Simple Regression Results (GDP Only):")
print(gdp_only_model.summary())
# Multiple regression with regions
region_encoder = OneHotEncoder(drop='first')
region_dummy_vars = region_encoder.fit_transform(spotify_data[['Region']]).toarray()
region_variable_names = region_encoder.get_feature_names_out(['Region'])
gdp_and_region_vars = np.column_stack([gdp_per_capita, region_dummy_vars])
gdp_and_region_clean = gdp_and_region_vars[valid_data_mask]
# Create DataFrame with named columns for the full model
X_full = sm.add_constant(gdp_and_region_clean)
X_full = pd.DataFrame(X_full,
                      columns=['const', 'GDP_per_capita'] + list(region_variable_names))
full_model = sm.OLS(prices_clean, X_full).fit()
print("\nMultiple Regression Results (GDP + Regions):")
print(full_model.summary())
Simple Regression Results (GDP Only):
                            OLS Regression Results
==============================================================================
Dep. Variable:                      y   R-squared:                       0.491
Model:                            OLS   Adj. R-squared:                  0.488
Method:                 Least Squares   F-statistic:                     166.0
Date:                Sat, 25 Jan 2025   Prob (F-statistic):           5.09e-27
Time:                        16:33:54   Log-Likelihood:                -384.66
No. Observations:                 174   AIC:                             773.3
Df Residuals:                     172   BIC:                             779.6
Df Model:                           1
Covariance Type:            nonrobust
==================================================================================
                     coef    std err          t      P>|t|      [0.025      0.975]
----------------------------------------------------------------------------------
const              4.2054      0.202     20.846      0.000       3.807       4.604
GDP_per_capita  7.306e-05   5.67e-06     12.883      0.000    6.19e-05    8.43e-05
==============================================================================
Omnibus:                       22.466   Durbin-Watson:                   1.352
Prob(Omnibus):                  0.000   Jarque-Bera (JB):              108.440
Skew:                          -0.148   Prob(JB):                     2.84e-24
Kurtosis:                       6.856   Cond. No.                     4.26e+04
==============================================================================

Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
[2] The condition number is large, 4.26e+04. This might indicate that there are
strong multicollinearity or other numerical problems.

Multiple Regression Results (GDP + Regions):
                            OLS Regression Results
==============================================================================
Dep. Variable:                      y   R-squared:                       0.682
Model:                            OLS   Adj. R-squared:                  0.673
Method:                 Least Squares   F-statistic:                     72.14
Date:                Sat, 25 Jan 2025   Prob (F-statistic):           5.06e-40
Time:                        16:33:54   Log-Likelihood:                -343.69
No. Observations:                 174   AIC:                             699.4
Df Residuals:                     168   BIC:                             718.3
Df Model:                           5
Covariance Type:            nonrobust
========================================================================================
                           coef    std err          t      P>|t|      [0.025      0.975]
----------------------------------------------------------------------------------------
const                    2.8453      0.254     11.201      0.000       2.344       3.347
GDP_per_capita        5.244e-05   5.25e-06      9.982      0.000    4.21e-05    6.28e-05
Region_Asia              0.8306      0.396      2.098      0.037       0.049       1.612
Region_Europe            3.6181      0.428      8.457      0.000       2.774       4.463
Region_North America     2.3391      0.409      5.716      0.000       1.531       3.147
Region_Oceania           3.5290      0.557      6.338      0.000       2.430       4.628
==============================================================================
Omnibus:                       35.733   Durbin-Watson:                   1.959
Prob(Omnibus):                  0.000   Jarque-Bera (JB):              149.933
Skew:                          -0.660   Prob(JB):                     2.77e-33
Kurtosis:                       7.352   Cond. No.                     1.83e+05
==============================================================================

Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
[2] The condition number is large, 1.83e+05. This might indicate that there are
strong multicollinearity or other numerical problems.
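The t-value reported in the findings is the coefficient divided by its standard error. As a sanity check, the sketch below recomputes it from the rounded figures in the GDP-only summary, then shows the same calculation done from scratch with NumPy on synthetic data. The data-generating numbers are arbitrary assumptions chosen only to resemble the scale of the real variables, and `ols_with_t_values` is a hypothetical helper, not part of statsmodels.

```python
import numpy as np

# Rounded coefficient and standard error for GDP_per_capita from the GDP-only summary.
reported_t = 7.306e-05 / 5.67e-06
print(f"t from rounded summary values: {reported_t:.2f}")


def ols_with_t_values(x: np.ndarray, y: np.ndarray):
    """Fit y = a + b*x by ordinary least squares; return coefficients, standard errors, t-values."""
    X = np.column_stack([np.ones(len(x)), x])       # add the intercept column
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)    # least-squares coefficients
    residuals = y - X @ beta
    dof = len(y) - X.shape[1]                       # observations minus parameters
    sigma2 = residuals @ residuals / dof            # residual variance
    cov = sigma2 * np.linalg.inv(X.T @ X)           # classical (nonrobust) covariance
    se = np.sqrt(np.diag(cov))
    return beta, se, beta / se


# Synthetic data (assumed values, not the Spotify dataset).
rng = np.random.default_rng(0)
gdp = rng.uniform(500, 80_000, size=174)
price = 4.2 + 7.3e-5 * gdp + rng.normal(0, 2, size=174)

beta, se, t_values = ols_with_t_values(gdp, price)
print("slope:", beta[1], "std err:", se[1], "t:", t_values[1])
```

The value recomputed from the summary lands close to the reported 12.883; the small gap comes from using rounded inputs.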