Alt Data: A Work in Progress

December 13, 2018 | By Ivy Schmerken, Editorial Director

Banks and investment firms have increased their consumption of alternative data sets, but there are still challenges around selecting the most relevant data for trading purposes, according to panelists at a recent industry fintech conference.

In particular, the quality of time stamping and backtesting from third-party alternative data platforms is not always up to the standards of traders and investors, according to panelists at Tabb Group’s FinTech Festival conference in November.

To navigate the space, sell-side firms have hired big data experts to work with their fundamental research analysts who lack the skills to analyze alternative data.

“The success comes from knowing where the data is, knowing what it means and knowing how to process it,” said a big data expert with a global bank on the panel “The Automated Analyst: Leveraging Alt-Data to Trade.”  The expert in predictive analytics said his role is to find alternative data that is sensitive to companies covered by analysts and to help them understand how alt data works.

For instance, when the bank’s forestry and paper analyst learned that one of the companies he covers had a factory under construction, the data expert obtained access to satellite photos of the building’s roof, which provided visibility from the roof to the ground. Armed with this data, the analyst was able to determine that the factory was partially constructed.

“While this was not deep quantitative analysis, the data was meaningful to someone who understands it,” said the bank’s big data expert.

Banks and brokers are also trying to augment the traditional analyst’s product to create data solutions for institutional clients, which includes helping fundamental investment managers make use of data mining.

In one case, a large bank developed an artificial intelligence service that processes an enormous amount of unstructured data (mainly news, Twitter and sentiment analysis) to surface and rank connections between breaking news and investors’ portfolio holdings.

But even in this instance, finding the most relevant alternative data is still a challenge.


Data surveillance is part of the process. A data expert with a global bank said it collaborates with its innovation lab, which visits Silicon Valley to scout for new alt data companies.

Although there are hundreds of alternative data sources available, the panelists differentiated between primary research and secondary sources of data.

For instance, instead of relying on vendor-aggregated feeds, the global bank creates its own data by surveying companies on the list it received from the bank’s innovation lab. “The integration of different skill sets is what generates exploitable results,” said the speaker.

Rather than blindly diving into data mining, firms need an economic basis for why they are searching the data. “I have an agenda for what I want to predict,” said the big data expert.

Instead of going for the most novel data sets, banks are also combining alternative data with familiar data sets. Some have found value in existing data sets, which can be supplemented with unstructured data. As part of that process, they are integrating internal data with other data.

Noise vs. Signal

Clearly, the holy grail of sifting through alternative data sets is the search for alpha, or beating the market by capturing an inefficiency before anyone else gets to trade on it. In general, while there are exceptions to the rule, most commercially available data sets tend to have more noise than signal, or what’s known as a low signal-to-noise ratio, said several speakers.

So even though hedge funds are crunching credit card data, geolocation data from shopping malls, and other esoteric data sets, some quants are looking elsewhere. One quantitative trading panelist expressed an interest in unconventional data sets that have been underutilized. These data sets are familiar to markets but are difficult to access because of the size and structure of the data.

Two data sets of interest are options data applied to equities and equity micro-market structure data applied to longer-term investment processes, said the panelist who runs a quantitative research and trading firm.

“Options are extraordinarily rich and high-dimensional with well-known non-linear interactions,” said the head of a quantitative research firm on the panel.  “It’s a fascinating data set with so many applications for transaction cost analysis and for alpha generation,” said the panelist, noting that there isn’t anyone talking about it.

“In terms of micro-market structure data, which is typically used for low-latency applications, the much bigger application is alpha generation and TCA,” said the speaker. “When capital concentration and fragmentation collide, resulting in information leakage, it’s possible to identify large block trades based on their signature in the markets, which is evidence of different kinds of order routers.”

Challenges with Backtesting & Time Stamps

But some professionals admit there are significant challenges to transforming alternative data into actionable insights for investing, such as low historical frequency, which can make backtests difficult.

Several panelists criticized the accuracy of time stamping and backtesting associated with alternative data. Third-party alternative data vendors will often share their backtesting results with firms to prove the data is worthwhile, said the quant trading panelist. When running backtests or simulations of data, quants need to be very careful about biases that arise in the data. “A lot of the data sets that are commercially available don’t have accurate time stamps available or have misstatements that lead to implicit biases that can’t be removed,” said the speaker. When quants go back over the data and apply their own time stamps and liquidity constraints, they discover that the highest proportion of alpha comes from illiquid, untradeable securities. And when they restrict the universe to the liquid stocks that most asset managers and traders care about, there’s very little alpha. “That is a common set of challenges that very few alternative data vendors have been able to solve,” said the speaker.
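To make the panelist’s point concrete, here is a minimal Python sketch of the two checks described above: using the date a record was actually received rather than the vendor’s time stamp, and restricting the universe to liquid names. Everything in it is an assumption for illustration, not something presented at the panel: the `Observation` fields, the tickers, the $1 million liquidity floor and the return figures are all made up.

```python
from dataclasses import dataclass
from datetime import date
from statistics import mean


@dataclass(frozen=True)
class Observation:
    ticker: str             # made-up tickers, illustration only
    vendor_date: date       # time stamp supplied by the hypothetical vendor
    received_date: date     # when the firm could actually have seen the record
    adv_usd: float          # average daily dollar volume, a liquidity proxy
    next_day_return: float  # hypothetical return from acting on the signal


def usable(observations, as_of, min_adv_usd=0.0, point_in_time=True):
    """Keep records that were known on `as_of` and meet the liquidity floor."""
    kept = []
    for obs in observations:
        # Point-in-time mode trusts the firm's own receipt date,
        # not the date the vendor stamped on the record.
        known_on = obs.received_date if point_in_time else obs.vendor_date
        if known_on <= as_of and obs.adv_usd >= min_adv_usd:
            kept.append(obs)
    return kept


def mean_return(observations):
    """Average next-day return, or 0.0 when nothing survives the filters."""
    return mean(o.next_day_return for o in observations) if observations else 0.0


# Invented sample: one clean record, one backfilled record, one illiquid name.
SAMPLE = [
    Observation("AAA", date(2018, 11, 1), date(2018, 11, 1), 50_000_000, 0.001),
    Observation("BBB", date(2018, 11, 1), date(2018, 11, 5), 40_000_000, 0.020),
    Observation("CCC", date(2018, 11, 1), date(2018, 11, 1), 200_000, 0.030),
]

as_of = date(2018, 11, 1)
naive = usable(SAMPLE, as_of, point_in_time=False)       # vendor stamps, no floor
careful = usable(SAMPLE, as_of, min_adv_usd=1_000_000)   # own stamps, liquid only

print(f"naive:   {len(naive)} names, mean return {mean_return(naive):.4f}")
print(f"careful: {len(careful)} names, mean return {mean_return(careful):.4f}")
```

In this invented sample, the naive run keeps all three names, while the careful run drops the backfilled record and the illiquid one and is left with the single name that has the smallest return. That mirrors, in toy form, the pattern the speaker described.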

Cost of Alt Data and Alpha Decay

In addition, financial firms exploring alternative data sets on their own will incur the costs of setting up a pipeline and hiring engineers to procure, clean and maintain data.

“There is a huge cost to exploring any data source, alternative or otherwise,” cautioned the head of an online robo-advisory firm and fintech research firm.

Though alternative data is not cheap, some buy-side firms want to consume alternative data to reduce their research budgets and commissions paid for traditional research, said the speaker.

Take the case of a discretionary macro hedge fund whose edge is in long-term predictive trading and that has been accumulating and paying for a lot of sell-side research, said the CEO of the robo-advisor.

“After running the cost-benefit analysis, the hedge fund could conclude that developing the alt data expertise in-house reduces its costs and therefore is a worthwhile investment.”

Meanwhile, with so many people using satellite images and credit card data, “people are using alt data to find alpha or beat the market, but alpha decays,” said the robo-advisor CEO.

On Nov. 30, Bloomberg reported: “Finding an edge in satellite images or social media posts is getting harder and riskier.” In “Parking Lots Don’t Tell the Whole Story: The Trouble with Alternative Data,” investment managers told Bloomberg that some data curated for hedge funds is overused and has deteriorated in its predictive power. Others said the data is cool but lacks the historical frequency, breadth and precision to be related to financial variables, such as connecting cars in parking lots to predicting sales figures.

For this reason, investment professionals are hunting for their own undiscovered sources of data and running their own backtests to judge how useful the data is against the criteria that matter to investors.

Despite these challenges, the market for alternative data and analytics is only going to expand. Nasdaq’s acquisition of Quandl, an aggregation platform dubbed the Amazon of alternative data, could trigger other mergers and acquisitions in the space.

It remains to be seen if fundamental investors can earn the quant-like returns they aspire to, but clearly alternative data is the new frontier for investment analysis and trading.
