Correlation in the Crypto Market
What is this post about?
I’m currently listening to the Life and Work Principles by Ray Dalio, the famous founder and manager of the Bridgewater hedge fund. In it he briefly touches upon the successes of his investment strategies, one of which focused on investing in uncorrelated stocks. This got me thinking about the generally high correlation in the cryptocurrency market, so I decided to see if there are any tokens that behave differently from the rest. Hence, here we are: below I quickly recap the code, the findings and the takeaways from this little intellectual exercise.
Tools and the Game Plan
To get the historical cryptocurrency data I use the freely accessible CoinGecko API and its Python 3 wrapper, pycoingecko. The game plan is to look at cryptocurrencies that have been traded for at least 3 years, and construct a portfolio consisting of uncorrelated tokens. Apart from the CoinGecko API, I use the pandas, numpy, matplotlib, seaborn and datetime packages. The code is quite heavily commented, so hopefully the interested reader can follow all the steps just by reading through it.
# Imports
#--------
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sb
import datetime as dt
#---------------
# CoinGecko API
#---------------
from pycoingecko import CoinGeckoAPI
cg = CoinGeckoAPI()
# Current data for top 500 coins by Market Cap - 100 coins per page
#------------------------------------------------------------------
coins = pd.DataFrame(cg.get_coins_markets(vs_currency = 'usd'))
for page in range(2,6):
    coins = pd.concat([coins, pd.DataFrame(cg.get_coins_markets(vs_currency = 'usd', page = page))])
coins = coins.reset_index(drop = True)
coins
| | id | symbol | name | image | current_price | market_cap | market_cap_rank | fully_diluted_valuation | total_volume | high_24h | ... | total_supply | max_supply | ath | ath_change_percentage | ath_date | atl | atl_change_percentage | atl_date | roi | last_updated |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | bitcoin | btc | Bitcoin | https://assets.coingecko.com/coins/images/1/la... | 37978.000000 | 723476227588 | 1 | 7.984846e+11 | 2.291839e+10 | 38747.000000 | ... | 2.100000e+07 | 2.100000e+07 | 69045.000000 | -44.91950 | 2021-11-10T14:24:11.849Z | 67.810000 | 55984.30690 | 2013-07-06T00:00:00.000Z | None | 2022-05-01T07:56:55.013Z |
1 | ethereum | eth | Ethereum | https://assets.coingecko.com/coins/images/279/... | 2773.960000 | 334692571096 | 2 | NaN | 1.430243e+10 | 2838.580000 | ... | NaN | NaN | 4878.260000 | -43.05475 | 2021-11-10T14:24:19.604Z | 0.432979 | 641487.36811 | 2015-10-20T00:00:00.000Z | {'times': 96.5436231164556, 'currency': 'btc',... | 2022-05-01T07:55:34.686Z |
2 | tether | usdt | Tether | https://assets.coingecko.com/coins/images/325/... | 1.000000 | 83223328899 | 3 | NaN | 5.450827e+10 | 1.008000 | ... | 8.315288e+10 | NaN | 1.320000 | -24.33295 | 2018-07-24T00:00:00.000Z | 0.572521 | 74.86646 | 2015-03-02T00:00:00.000Z | None | 2022-05-01T07:56:39.281Z |
3 | binancecoin | bnb | BNB | https://assets.coingecko.com/coins/images/825/... | 383.420000 | 64449760501 | 4 | 6.444976e+10 | 1.641067e+09 | 399.140000 | ... | 1.681370e+08 | 1.681370e+08 | 686.310000 | -44.11107 | 2021-05-10T07:24:17.097Z | 0.039818 | 963213.88135 | 2017-10-19T00:00:00.000Z | None | 2022-05-01T07:55:51.903Z |
4 | usd-coin | usdc | USD Coin | https://assets.coingecko.com/coins/images/6319... | 0.997881 | 49148736774 | 5 | NaN | 4.843504e+09 | 1.008000 | ... | 4.925730e+10 | NaN | 1.170000 | -14.83687 | 2019-05-08T00:40:28.300Z | 0.891848 | 11.98245 | 2021-05-19T13:14:05.611Z | None | 2022-05-01T07:56:16.585Z |
500 rows × 26 columns
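The paging loop above can also be wrapped into a small helper. Below is a minimal sketch of that idea, not the code used in this post: `fetch_top_coins` and `fake_page` are hypothetical names, and the stub stands in for the API (it only assumes that a page request returns a list of dicts) so that the snippet runs offline.

```python
import pandas as pd

def fetch_top_coins(fetch_page, n_pages=5):
    """Collect n_pages pages of market data into a single DataFrame.

    fetch_page: callable taking a 1-based page number and returning a list of dicts.
    """
    frames = [pd.DataFrame(fetch_page(page)) for page in range(1, n_pages + 1)]
    return pd.concat(frames, ignore_index=True)

# Hypothetical stand-in for the API: 100 made-up coins per page
def fake_page(page, per_page=100):
    start = (page - 1) * per_page
    return [{"id": f"coin-{i}", "market_cap_rank": i + 1}
            for i in range(start, start + per_page)]

top = fetch_top_coins(fake_page)
print(top.shape)
```

With the real client, the same helper could presumably be called as `fetch_top_coins(lambda p: cg.get_coins_markets(vs_currency = 'usd', page = p))`.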
# Keeping only coins with 3+ year history on the market
#------------------------------------------------------
from datetime import datetime, timezone
#--------------------------------------
jan19 = datetime.strptime('2019-01-01 00:00:00', '%Y-%m-%d %H:%M:%S') # 1st Jan 2019
jan19 = jan19.replace(tzinfo=timezone.utc).isoformat() # formatting like in the API (note: dt is the datetime module, not a date)
coins["atl_date"] = pd.to_datetime(coins["atl_date"]) # changing type to datetime
coins = coins.loc[coins["atl_date"] < jan19] # keeping only coins older than 3 years
coins = coins.reset_index(drop = True)
coins.head(6)
| | id | symbol | name | image | current_price | market_cap | market_cap_rank | fully_diluted_valuation | total_volume | high_24h | ... | total_supply | max_supply | ath | ath_change_percentage | ath_date | atl | atl_change_percentage | atl_date | roi | last_updated |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | bitcoin | btc | Bitcoin | https://assets.coingecko.com/coins/images/1/la... | 38357.000000 | 730178553834 | 1 | 8.059056e+11 | 2.004939e+10 | 38801.000000 | ... | 2.100000e+07 | 2.100000e+07 | 69045.000000 | -44.45523 | 2021-11-10T14:24:11.849Z | 67.810000 | 56457.03434 | 2013-07-06 00:00:00+00:00 | None | 2022-04-30T18:13:40.062Z |
1 | ethereum | eth | Ethereum | https://assets.coingecko.com/coins/images/279/... | 2793.330000 | 336986058000 | 2 | NaN | 1.253951e+10 | 2841.350000 | ... | NaN | NaN | 4878.260000 | -42.75991 | 2021-11-10T14:24:19.604Z | 0.432979 | 644809.25105 | 2015-10-20 00:00:00+00:00 | {'times': 96.33730439594328, 'currency': 'btc'... | 2022-04-30T18:12:48.747Z |
2 | tether | usdt | Tether | https://assets.coingecko.com/coins/images/325/... | 1.001000 | 83254158899 | 3 | NaN | 4.428167e+10 | 1.002000 | ... | 8.315288e+10 | NaN | 1.320000 | -24.30444 | 2018-07-24T00:00:00.000Z | 0.572521 | 74.93234 | 2015-03-02 00:00:00+00:00 | None | 2022-04-30T18:12:11.563Z |
3 | binancecoin | bnb | BNB | https://assets.coingecko.com/coins/images/825/... | 390.840000 | 65826723112 | 4 | 6.582672e+10 | 1.150386e+09 | 399.800000 | ... | 1.681370e+08 | 1.681370e+08 | 686.310000 | -43.06657 | 2021-05-10T07:24:17.097Z | 0.039818 | 981217.09381 | 2017-10-19 00:00:00+00:00 | None | 2022-04-30T18:14:11.273Z |
4 | ripple | xrp | XRP | https://assets.coingecko.com/coins/images/44/l... | 0.610433 | 29353852829 | 7 | 6.102008e+10 | 2.304907e+09 | 0.628349 | ... | 1.000000e+11 | 1.000000e+11 | 3.400000 | -82.13775 | 2018-01-07T00:00:00.000Z | 0.002686 | 22498.40152 | 2014-05-22 00:00:00+00:00 | None | 2022-04-30T18:13:46.614Z |
5 | dogecoin | doge | Dogecoin | https://assets.coingecko.com/coins/images/5/la... | 0.131988 | 17501590218 | 12 | NaN | 8.279380e+08 | 0.137505 | ... | NaN | NaN | 0.731578 | -81.97014 | 2021-05-08T05:08:23.458Z | 0.000087 | 151679.95328 | 2015-05-06 00:00:00+00:00 | None | 2022-04-30T18:13:24.449Z |
6 | litecoin | ltc | Litecoin | https://assets.coingecko.com/coins/images/2/la... | 99.170000 | 6971161593 | 22 | 8.343390e+09 | 5.248294e+08 | 101.510000 | ... | 8.400000e+07 | 8.400000e+07 | 410.260000 | -75.78326 | 2021-05-10T03:13:07.904Z | 1.150000 | 8547.97098 | 2015-01-14 00:00:00+00:00 | None | 2022-04-30T18:14:31.331Z |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
48 rows × 26 columns
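The date filter above uses the all-time-low date as a stand-in for a coin's age. The same comparison can be sketched on toy data with a timezone-aware pandas timestamp as the cutoff. The coin names are made up and only the date format follows the API output shown earlier.

```python
import pandas as pd

# Toy data mimicking the API's ISO-8601 date strings (coin names are made up)
toy = pd.DataFrame({
    "id": ["old-coin", "new-coin"],
    "atl_date": ["2015-03-02T00:00:00.000Z", "2021-05-19T13:14:05.611Z"],
})

cutoff = pd.Timestamp("2019-01-01", tz="UTC")                 # timezone-aware cutoff
toy["atl_date"] = pd.to_datetime(toy["atl_date"], utc=True)  # parse strings as UTC datetimes
old = toy.loc[toy["atl_date"] < cutoff].reset_index(drop=True)
print(old["id"].tolist())
```

Comparing two timezone-aware values avoids relying on pandas to parse a string on the right-hand side of the comparison.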
In the next step I first create the price_history dataframe from the historical prices of bitcoin, after which I iteratively populate it with new columns corresponding to all the other coins. I use the time index returned by the API to match the correct rows; wherever a coin has no price for a given timestamp, the entry is set to NaN.
# Historical Price Data for up to 2000 days before 30/04/2022
#------------------------------------------------------------
coin_ids = coins['id']
bitcoin = cg.get_coin_market_chart_by_id(id = 'bitcoin', days = '2000', vs_currency = 'usd')["prices"]
price_history = pd.DataFrame(bitcoin, columns = ["index", "bitcoin"])
for coin_id in coin_ids[1:]:
    coin = cg.get_coin_market_chart_by_id(id = coin_id, days = '2000', vs_currency = 'usd')["prices"]
    coin = pd.DataFrame(coin, columns = ["index", coin_id])
    price_history = price_history.join(coin.set_index("index"), on = "index")
price_history
| | index | bitcoin | ethereum | tether | binancecoin | ripple | dogecoin | litecoin | tron | bitcoin-cash | ... | steem | numeraire | verge | ark | asd | aragon | stratis | maidsafecoin | iexec-rlc | augur |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 1478563200000 | 708.940000 | 10.890106 | 1.000000 | NaN | 0.008239 | 0.000228 | 3.838584 | NaN | NaN | ... | 0.133704 | NaN | 0.000021 | NaN | NaN | NaN | 0.069805 | 0.079964 | NaN | 4.790000 |
1 | 1478649600000 | 721.177500 | 10.664918 | 1.000000 | NaN | 0.008087 | 0.000229 | 3.849884 | NaN | NaN | ... | 0.148576 | NaN | 0.000022 | NaN | NaN | NaN | 0.064342 | 0.077728 | NaN | 4.450000 |
2 | 1478736000000 | 713.214143 | 10.519281 | 0.999997 | NaN | 0.008137 | 0.000230 | 3.812256 | NaN | NaN | ... | 0.153180 | NaN | 0.000021 | NaN | NaN | NaN | 0.069549 | 0.076865 | NaN | 4.900000 |
3 | 1478822400000 | 715.642500 | 10.293087 | 1.000000 | NaN | 0.008062 | 0.000226 | 3.817239 | NaN | NaN | ... | 0.127085 | NaN | 0.000021 | NaN | NaN | NaN | 0.087305 | 0.076074 | NaN | 4.840000 |
4 | 1478908800000 | 703.760000 | 9.664325 | 1.000000 | NaN | 0.008047 | 0.000223 | 3.754012 | NaN | NaN | ... | 0.122182 | NaN | 0.000021 | NaN | NaN | NaN | 0.086133 | 0.076165 | NaN | 4.850000 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
1996 | 1651017600000 | 38134.215451 | 2806.748836 | 1.000501 | 385.027613 | 0.642326 | 0.138032 | 98.472060 | 0.061942 | 296.100585 | ... | 0.439145 | 25.476484 | 0.009321 | 0.922246 | 0.198598 | 3.725051 | 0.984225 | 0.276582 | 1.736587 | 12.812804 |
1997 | 1651104000000 | 39237.949317 | 2889.592223 | 0.999849 | 391.285543 | 0.652835 | 0.140246 | 100.514884 | 0.063180 | 307.473039 | ... | 0.546993 | 25.110015 | 0.009547 | 0.943347 | 0.198518 | 3.859121 | 1.013390 | 0.283103 | 1.784559 | 13.165825 |
1998 | 1651190400000 | 39741.766646 | 2932.455084 | 0.999654 | 406.326688 | 0.644127 | 0.137214 | 103.105912 | 0.063653 | 306.538849 | ... | 0.477371 | 25.115812 | 0.009321 | 0.948837 | 0.191595 | 3.848096 | 1.004725 | 0.289798 | 1.780006 | 13.394428 |
1999 | 1651276800000 | 38650.550138 | 2817.489882 | 1.001222 | 392.964375 | 0.612456 | 0.135080 | 100.369428 | 0.063630 | 294.633053 | ... | 0.478391 | 23.651649 | 0.008742 | 0.906787 | 0.185225 | 3.609127 | 0.975365 | 0.278096 | 1.644444 | 12.885204 |
2000 | 1651342678000 | 38326.115966 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
2001 rows × 49 columns
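To see how the timestamp matching above produces the NaN entries, here is a tiny sketch with made-up numbers. The two lists only imitate the shape of the API's "prices" field (pairs of millisecond timestamp and price); `coin_a` and `coin_b` are hypothetical.

```python
import pandas as pd

# Made-up [timestamp_ms, price] pairs in the same shape as the API's "prices" list
coin_a = [[1000, 10.0], [2000, 11.0], [3000, 12.0]]
coin_b = [[2000, 0.5], [3000, 0.6]]   # starts trading one step later

history = pd.DataFrame(coin_a, columns=["index", "coin_a"])
other = pd.DataFrame(coin_b, columns=["index", "coin_b"])

# Left join on the timestamp column: rows without a matching timestamp get NaN
history = history.join(other.set_index("index"), on="index")
print(history)
```

Because the join is a left join on bitcoin's timestamps, a coin that started trading later simply shows NaN in the earlier rows, and timestamps that exist only for the other coin are dropped.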
# Removing stable coins
#----------------------
stable_coins = (price_history.mean() < 1.02) & (price_history.mean() > 0.98)
price_history = price_history.loc[:, ~stable_coins.values]
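The stable coin filter above is a simple heuristic: any column whose average price sits within 2% of 1 USD is dropped. A toy sketch with made-up columns shows the mechanics. Note that a non-pegged coin that happens to average close to 1 USD would be removed as well.

```python
import pandas as pd

toy = pd.DataFrame({
    "index": [1000, 2000, 3000],       # timestamp column, mean far from 1
    "pegged": [1.000, 0.999, 1.001],   # hovers around 1 USD
    "volatile": [700.0, 720.0, 710.0],
})

means = toy.mean()
is_stable = (means > 0.98) & (means < 1.02)   # boolean Series indexed by column name
filtered = toy.loc[:, ~is_stable]
print(filtered.columns.tolist())
```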
Finally we get to see what we were after. The correlation matrix below reveals that there are indeed some tokens that seem to behave quite differently from the rest. The lighter, close-to-white squares correspond to pairs of cryptocurrencies whose prices are almost uncorrelated over the sample, and which are thus potentially useful for diversifying one’s crypto portfolio. To find the actual pairs, in the next code chunk I re-format the correlation matrix into a so-called long format and only look at the pairs of coins whose correlation is less than 0.05 in absolute value.
# Correlation matrix
#-------------------
corr = price_history.iloc[1:, 1:].corr() # ignore the API's time index column (and row)
sb.heatmap(corr, cmap = 'PuOr', center = 0, annot = False, yticklabels = False, xticklabels = False)
plt.show()
# Create a long format correlation table
#---------------------------------------
corr_long = corr.reset_index().melt(id_vars="index") # transform to long format
corr_long.columns = ["coin1", "coin2", "correlation"] # name the columns as in the table below
corr_long = corr_long.loc[corr_long["coin1"] != corr_long["coin2"]].reset_index(drop = True) # drop self-correlations
corr_long["abs_val"] = np.abs(corr_long["correlation"]) # create absolute value column
corr_long = corr_long.sort_values("abs_val") # sort by absolute value
corr_long = corr_long.iloc[::2] # drop duplicates: each pair appears twice (a-b and b-a), adjacent after sorting
corr_long.loc[corr_long["abs_val"] < 0.05] # display uncorrelated tokens
| | coin1 | coin2 | correlation | abs_val |
---|---|---|---|---|
2037 | decentraland | augur | 0.001086 | 0.001086 |
920 | dash | enjincoin | 0.001412 | 0.001412 |
1092 | decentraland | nem | 0.002478 | 0.002478 |
630 | bitcoin | eos | 0.002628 | 0.002628 |
1229 | eos | rocket-pool | 0.003400 | 0.003400 |
367 | bitcoin-cash | chainlink | 0.004519 | 0.004519 |
80 | steem | ethereum | 0.012647 | 0.012647 |
398 | ark | chainlink | 0.014588 | 0.014588 |
1910 | enjincoin | stratis | -0.015294 | 0.015294 |
547 | bitcoin-cash | decentraland | -0.015555 | 0.015555 |
1444 | dogecoin | lisk | 0.020131 | 0.020131 |
1460 | enjincoin | lisk | -0.020163 | 0.020163 |
1894 | dogecoin | stratis | -0.026796 | 0.026796 |
647 | quant-network | eos | 0.027587 | 0.027587 |
1890 | bitcoin | stratis | -0.033122 | 0.033122 |
103 | eos | binancecoin | -0.039213 | 0.039213 |
1722 | decentraland | verge | 0.041503 | 0.041503 |
317 | binancecoin | bitcoin-cash | 0.041590 | 0.041590 |
581 | stratis | decentraland | -0.042639 | 0.042639 |
1440 | bitcoin | lisk | 0.043619 | 0.043619 |
949 | dogecoin | dash | 0.043845 | 0.043845 |
734 | eos | nexo | 0.044147 | 0.044147 |
947 | binancecoin | dash | -0.047622 | 0.047622 |
914 | eos | enjincoin | -0.047646 | 0.047646 |
825 | bitcoin-cash-sv | quant-network | -0.048451 | 0.048451 |
As we can see, there are quite a few pairs of tokens whose correlation is below 0.05 in absolute value, and 6 of them even have an absolute correlation below 0.01. One such pair is Bitcoin and EOS, whose correlation is only $\approx 0.003$. This seems to make EOS a very good candidate to pair with bitcoin in order to achieve a higher degree of diversification. This is just one such pair that I mention though, and by no means am I giving any sort of investment advice here. In fact, I know about investing probably as much as I know about making ketchup. That is, I know what the ingredients are, but I lack the experience in carrying out the recipe, and there is a very high chance my ketchup wouldn’t taste very good. So feel free to play around with this code, explore the correlations on your own, and if you come up with some sensible strategy for your crypto investments thanks to it, then I am happy I could help :)
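A side note on the long-format step: dropping every second row works because the two mirrored entries of each pair end up next to each other after sorting. An alternative that does not depend on the sort order is to read off only the entries above the diagonal of the correlation matrix. The sketch below does this on synthetic data; the column names and random numbers are made up, and this is not the code used for the table above.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
toy_prices = pd.DataFrame(rng.normal(size=(200, 4)),
                          columns=["a", "b", "c", "d"])   # synthetic data
toy_corr = toy_prices.corr()

# Row/column positions strictly above the diagonal:
# each unordered pair exactly once, and no self-correlations
i, j = np.triu_indices(len(toy_corr), k=1)
pairs = pd.DataFrame({
    "coin1": toy_corr.index[i],
    "coin2": toy_corr.columns[j],
    "correlation": toy_corr.to_numpy()[i, j],
})
pairs["abs_val"] = pairs["correlation"].abs()
pairs = pairs.sort_values("abs_val").reset_index(drop=True)
print(pairs)
```

For 4 columns this yields the 6 distinct pairs, sorted from the least to the most correlated in absolute value.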
Finally, below I make a quick time series plot of Bitcoin, EOS and Nexo to see for myself how correlated or uncorrelated those tokens look.
# Create dates for the x-axis
#----------------------------
from datetime import datetime, timedelta
#---------------------------------------
def datetime_range(start = None, end = None, interval = 1):
    span = end - start
    for i in range(0, span.days + 1, interval):
        yield start + timedelta(days = i)
today = dt.datetime.today()
start = today - dt.timedelta(days = 2000)
xaxis = list(datetime_range(start, today, 200))
xaxis = [date.strftime("%d/%m/%Y") for date in xaxis]
# Plot the log-price evolution of Nexo, EOS and Bitcoin
#------------------------------------------------------
np.log(price_history["nexo"]).plot(color = "dodgerblue")
np.log(price_history["eos"]).plot(color = "darkorange")
np.log(price_history["bitcoin"]).plot(color = "gold")
import matplotlib.dates as mdates
plt.ylabel("Log Price")
plt.legend(["nexo", "eos", "bitcoin"])
plt.xticks(range(0, 2001, 200), xaxis, rotation = 45)
plt.show()
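The tick labels above are rebuilt from today's date, so they drift if the notebook is re-run later than the data was downloaded. Since the API's index column already holds Unix timestamps in milliseconds, the dates can also be derived directly from it. A minimal sketch, using two timestamps copied from the price table above:

```python
import pandas as pd

# First and last full-day timestamps from the price table (milliseconds since the Unix epoch)
ts = pd.Series([1478563200000, 1651276800000])
dates = pd.to_datetime(ts, unit="ms")
labels = dates.dt.strftime("%d/%m/%Y").tolist()
print(labels)   # → ['08/11/2016', '30/04/2022']
```

Applied to the whole index column, this would give a proper date axis that always matches the downloaded data.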
Final thoughts
As with every “little” coding exercise, this one also took longer than expected. But I have learned how to use the CoinGecko API and got to read a bit about the importance of diversifying one’s investment portfolio with uncorrelated investments. Moreover, I learned about new crypto tokens I wasn’t aware of and found out that not the whole crypto market moves in unison, which I suppose is a nice finding.