On-chain Analytics for Sentiment-driven Statistical Causality in Cryptocurrencies

Ioannis Chalkiadakis, Anna Zaremba, Gareth Peters, Michael John Chantler

Research output: Working paper › Preprint


This paper presents an efficient algorithm for multimodal statistical causality analysis based on Multiple-Output Gaussian Processes. Signals from different information sources (modalities) are jointly modelled as a Multiple-Output Gaussian Process, and then, using a novel approach to statistical causality based on Gaussian Processes (GP), we study linear and non-linear causal effects between the different modalities. We demonstrate the effectiveness of our approach in a novel machine learning application that studies the relationship between electronic cryptocurrency spot price dynamics and natural language data specific to the crypto sector, which we conjecture influences retail investor behaviour. Investor sentiment is extracted from the natural language data using methods from Natural Language Processing (NLP), an area of statistical machine learning, and we develop novel sentiment index models that add to existing approaches. To capture sentiment, we present a novel framework for text-to-time-series embedding, which we then use to construct a sentiment index from publicly available news articles. We compare our statistical sentiment index model to alternative methods in the NLP literature. With regard to multimodal causality, investor sentiment is our primary modality of exploration, alongside price and a technology-related indicator (hash rate). The analysis shows that our approach is effective in modelling causal structures of varying degrees of complexity between heterogeneous data sources, and illustrates the impact that certain modelling choices for the different modalities can have on detecting causality.
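As a rough illustration of the general idea behind GP-based statistical causality, the sketch below compares two Gaussian Process regressions of a target series: one conditioned on the target's own past only, and one that also sees the past of a candidate source series. It is not the paper's Multiple-Output GP algorithm. The RBF kernel, the single lag, the small hyperparameter grid, the synthetic data, and all function names (`rbf_kernel`, `log_marginal_likelihood`, `best_lml`, `gp_causality_score`) are assumptions made purely for illustration.

```python
import numpy as np

def rbf_kernel(X, lengthscale):
    # Squared-exponential kernel with unit signal variance (an assumed choice).
    sq_norms = np.sum(X**2, axis=1)
    sq_dist = sq_norms[:, None] + sq_norms[None, :] - 2.0 * X @ X.T
    return np.exp(-0.5 * np.maximum(sq_dist, 0.0) / lengthscale**2)

def log_marginal_likelihood(X, y, lengthscale, noise_var):
    # Standard zero-mean GP regression log marginal likelihood via Cholesky.
    n = len(y)
    K = rbf_kernel(X, lengthscale) + noise_var * np.eye(n)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    return -0.5 * y @ alpha - np.sum(np.log(np.diag(L))) - 0.5 * n * np.log(2.0 * np.pi)

def best_lml(X, y):
    # Crude stand-in for hyperparameter optimisation: maximise over a tiny grid.
    grid = [(ls, nv) for ls in (1.0, 3.0) for nv in (0.01, 0.1, 1.0)]
    return max(log_marginal_likelihood(X, y, ls, nv) for ls, nv in grid)

def gp_causality_score(source, target):
    # Positive values suggest that the lag-1 past of `source` helps explain
    # `target` beyond the target's own lag-1 past (a Granger-style comparison).
    y = target[1:]
    own_past = target[:-1, None]
    both_past = np.column_stack([target[:-1], source[:-1]])
    return best_lml(both_past, y) - best_lml(own_past, y)

# Synthetic example: x drives y with a one-step delay, but not the reverse.
rng = np.random.default_rng(0)
x = rng.standard_normal(201)
y = np.zeros(201)
y[1:] = 0.8 * x[:-1] + 0.1 * rng.standard_normal(200)

forward = gp_causality_score(x, y)  # does x help predict y?
reverse = gp_causality_score(y, x)  # does y help predict x?
print("x -> y score:", forward)
print("y -> x score:", reverse)
```

A real analysis would optimise the kernel hyperparameters properly, consider several lags, and calibrate the score against a null distribution before drawing any conclusion about causality.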
Original language: English
Publication status: Published, 11 Feb 2021


  • Multiple-Output Gaussian Process
  • Granger causality
  • sentiment index
  • sentiment analysis
  • text mining
  • multimodal systems
  • heterogeneous data
  • cryptocurrencies
  • cryptocoin markets
  • natural language processing

