We are working on a project for a client that functions as a stock analysis tool built on LLMs: it ingests 10-Ks, presentations, news, etc. and produces comparative analyses and other reports. It works great, but one of the things we have learned (and it makes sense) is that traceability matters enormously to financial professionals - where did the facts and figures in the AI's output actually come from? That's a hard problem to solve completely.
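One pattern that helps is keeping provenance attached to every retrieved chunk and forcing the model to cite it. A rough sketch of the idea (the Chunk structure, field names, and prompt wording here are illustrative, not our actual implementation):

```python
from dataclasses import dataclass

@dataclass
class Chunk:
    text: str       # passage extracted from a filing or presentation
    source: str     # e.g. "AAPL 10-K 2023"
    page: int       # page the passage came from

def build_prompt(question: str, chunks: list[Chunk]) -> str:
    """Assemble a prompt where every passage is tagged with its provenance,
    so the model can (and is told to) cite sources for each claim."""
    context = "\n\n".join(
        f"[{i}] ({c.source}, p.{c.page}) {c.text}" for i, c in enumerate(chunks)
    )
    return (
        "Answer using ONLY the passages below. After every factual claim, "
        "cite the passage index in brackets, e.g. [2].\n\n"
        f"{context}\n\nQuestion: {question}"
    )
```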
The labor savings from LLMs will only help financial market participants if they can deliver them without hallucinations / while maintaining ground truth.
Sure, it's great if your analysts save 10 hours because they don't need to read 10-Ks / earnings / management call transcripts... but not if the model spits out incorrect or made-up numbers.
With code you can run it and see if it works, rinse & repeat.
When you're combing financial documents to make decisions, you only realize it made up some financial stat after you've lost money. So the iteration loop is quite different.
There were some developments using LLMs in the timeseries domain which caught my attention.
I toyed with the Chronos forecasting toolkit [1], and the results were predictably off by wild margins [2].
What really caught my eye though was the "feel" of the predicted timeseries -- this is the first time I've seen synthetic timeseries that look like the real thing. Stock charts have a certain quality to them; once you've been looking at them long enough, you can tell more often than not whether some unlabeled data is a stock price timeseries or not. It seems the Chronos model was able to pick up on that "nature" of the price movement and replicate it in its forecasts. Impressive!
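For anyone curious, the Chronos pipeline API is simple enough to try at home. Roughly the shape of the experiment, sketched out (the model size and the random-walk stand-in series are just placeholders, not my actual data):

```python
import numpy as np
import torch
from chronos import ChronosPipeline  # pip install chronos-forecasting

pipeline = ChronosPipeline.from_pretrained(
    "amazon/chronos-t5-small",   # smallest model; fine for a toy experiment
    device_map="cpu",
    torch_dtype=torch.float32,
)

# Stand-in "price" history: a random walk, purely illustrative.
prices = 100 + np.cumsum(np.random.randn(512))
context = torch.tensor(prices, dtype=torch.float32)

# Sample forecast trajectories (20 by default), 64 steps ahead.
forecast = pipeline.predict(context, prediction_length=64)  # [1, num_samples, 64]
median_path = forecast[0].quantile(0.5, dim=0)
```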
I think some of the financial applications around LLMs right now are better suited for things like summarization, aggregation, etc.
We at Tradytics recently built two tools on top of LLMs and they've been super popular with our user base.
Earnings transcript summary: Users want a simple and easy to understand summary of what happened in an earnings call and report. LLMs are a nice fit for that - https://tradytics.com/earnings
News aggregation & summarization: Given how many articles get written every day in financial markets, there is a need for better ingestion pipelines. Users want to understand what's going on but don't want to spend several hours reading through news - https://tradytics.com/news
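Under the hood, this kind of feature is mostly retrieval plus a summarization prompt. A rough sketch of how a transcript summarizer could be wired up (illustrative only, not our production pipeline; the model name, prompt, and helper are arbitrary):

```python
from openai import OpenAI  # pip install openai; any chat-completion LLM works

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def summarize_earnings_call(transcript: str, ticker: str) -> str:
    """Condense an earnings-call transcript into a short, plain-English summary."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative choice; any capable model works
        messages=[
            {"role": "system",
             "content": "You summarize earnings calls for retail investors. "
                        "Report only figures stated in the transcript."},
            {"role": "user",
             "content": f"Summarize {ticker}'s earnings call in 5 bullet points:\n\n{transcript}"},
        ],
        temperature=0,  # keep the summary as deterministic as possible
    )
    return response.choices[0].message.content
```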
> there is much more noise than signal in financial data.
Spot on. Very few can consistently find small signals, match them with huge amounts of capital, and be successful for a long period. Of course Renaissance Technologies comes to mind.
Recommended reading if you're interested (it was an enjoyable read): The Man Who Solved the Market: How Jim Simons Launched the Quant Revolution.
HFTs exploit price inefficiencies that last only milliseconds. The time-series data mentioned in the article is on the scale of seconds. I wonder if it's possible to get time-series data on the scale of milliseconds, and how that would affect the training objective of an LLM.
If I learned anything from a conference by Benoit Mandelbrot back in my college days, it's that gaming financial markets is the only real application of anything scientific. But I only vaguely remember what he was actually talking about; I never quite made it as a mathematician.
There is no understanding. It is extremely annoying that interpolation is passed off as intelligence.
So far, the biggest contribution to financial markets has been hype and promises. I expect this will eventually dissipate into disappointment for most.
I'm surprised people don't talk more about sentiment analysis -- or is that mostly solved?
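For what it's worth, the basic version is pretty accessible now; finance-tuned models run off the shelf. A minimal sketch with a FinBERT-style model via Hugging Face transformers (the specific model is just an example, not a recommendation):

```python
from transformers import pipeline  # pip install transformers

# ProsusAI/finbert is a commonly used finance-domain sentiment model.
sentiment = pipeline("sentiment-analysis", model="ProsusAI/finbert")

headlines = [
    "Company beats earnings expectations and raises full-year guidance",
    "Regulator opens investigation into accounting practices",
]
for h in headlines:
    result = sentiment(h)[0]  # {"label": "positive"/"negative"/"neutral", "score": ...}
    print(f"{result['label']:>8}  {result['score']:.2f}  {h}")
```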
Would also be interesting to see more treatises on transformer(-like) forecasting. Some discussion here: https://www.reddit.com/r/MachineLearning/comments/102mf6v/d_...
Is it really fair to say that 177B is not far from 500B?
Quant has been about finding secrets/patterns that no one knows. Secrets because once they are known, the benefits go away or are greatly reduced.
Rather than finding patterns in historical numbers, LLMs can help quantify the current world in ways not possible before. This opens up a new world of finding new secrets.
The synthetic data creation and meta-learning scenario is the only use case that sounds remotely plausible.
Financial market applications of "transformers", not LLMs
The problem with attempting to use a timeseries of historical prices to predict future ones is that price is an output, not an input. It would be better to gather embedding data for everything else and then run a sensitivity analysis to see what is actually correlated with price.
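A crude first pass at that kind of sensitivity check is just ranking candidate features by their correlation with forward returns. The feature names and toy data below are made up purely for illustration:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n = 500

# Hypothetical candidate drivers (stand-ins for whatever embeddings/features you gather).
features = pd.DataFrame({
    "news_sentiment": rng.normal(size=n),
    "rates_move":     rng.normal(size=n),
    "sector_flow":    rng.normal(size=n),
})

# Toy next-period return driven by one of the features, plus noise.
forward_returns = 0.3 * features["news_sentiment"] + rng.normal(scale=1.0, size=n)

# Rank features by absolute correlation with the forward return.
sensitivity = features.corrwith(forward_returns).abs().sort_values(ascending=False)
print(sensitivity)
```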
The art here for a human would be finding the sweet spot of how LITTLE data to feed the LLM, and getting the weights and other goodies just right, so that running it is realistic for a single non-billionaire.
No philosophical discussion about what we're even doing if we're just operating on the predictions of computers to guess equity pricing? Or operating on the predictions of the predictions of computers to guess equity pricing? This isn't based on any real evaluation, just pattern matching.
What the hell is this even for? What the hell are we even doing here? If computers can successfully guess the market, what the hell is it even?
Wouldn't this be "transformer models" rather than LLMs?
It's all text, one diagram, and no data showing anything. I'm like, wtf.
> So while the case for GPT-4 like models taking over quantitative trading is currently unlikely...

No shit, Sherlock.
A lot of words for not bringing much new content to the discussion. I think the most interesting applications of LLMs in finance are
(1) synthetic data models for data cleansing, (2) journal management, (3) anomaly tracking, (4) critiquing investments
All of this should be done by professionals and nothing is "retail" ready.