“Funds can either embrace alternative data and become the Netflix of their industry, or they can ignore it, and risk becoming the equivalent of Blockbuster.”
--YipitData CEO Vinicius Vacanti
The Internet is not only for shopping, searching for recipes and checking sports scores. Savvy alternative data services are scouring company, e-commerce and social websites for digital tells and key performance indicators to give you a huge jump on other investors and analysts.
The technology, known as “web scraping,” collects and analyzes online data by monitoring company websites.
Analysts in the consumer sector rely heavily on this data for near real-time measures of product pricing, competition and other dynamics driving consumer behavior. These key indicators are then used to track sales, average transaction size and product trend changes, and ultimately to predict revenue. Alternative data providers are expanding the companies and sectors they cover as more predictive data becomes available.
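As a rough illustration of how tracked indicators could roll up into a revenue figure, the sketch below multiplies observed order counts by average transaction size and scales the result by the share of total sales the observed sample is assumed to cover. The names (`WeeklyObservation`, `estimate_revenue`, `coverage`) and all figures are hypothetical and do not represent any provider’s actual methodology.

```python
from dataclasses import dataclass


@dataclass
class WeeklyObservation:
    """One week of hypothetical scraped indicators for a single company."""
    orders: int        # number of orders observed in the sample
    avg_ticket: float  # average transaction size, in dollars


def estimate_revenue(observations, coverage):
    """Scale observed sales up to a total revenue estimate.

    `coverage` is the assumed fraction (0 < coverage <= 1) of the company's
    total sales that the observed sample captures. It is an assumption the
    analyst must supply; nothing here derives it from the data.
    """
    if not 0 < coverage <= 1:
        raise ValueError("coverage must be in the interval (0, 1]")
    observed_sales = sum(o.orders * o.avg_ticket for o in observations)
    return observed_sales / coverage


if __name__ == "__main__":
    # Made-up numbers, for illustration only.
    quarter = [WeeklyObservation(1000, 50.0), WeeklyObservation(1200, 48.0)]
    print(estimate_revenue(quarter, coverage=0.25))
```

In practice the coverage assumption is the fragile part: if the sample’s share of total sales drifts over time, the estimate drifts with it.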
Some of the notable companies in the web data collection space are YipitData, Earnest Research, Thinknum, Wiser, Savvr and Vertical Knowledge (see “Growth market,” below).
YipitData provides web data intelligence primarily to buy-side firms, including hedge funds and other institutional investors. It specializes in developing systems and methodologies to collect the gobs of digital data needed for granular analysis of real-time company performance metrics.
The company launched in 2010 as a daily deal aggregator that recommended the best deals to more than one million active users. It was aggregating all the deals from Groupon (GRPN) and learned that it could use that data to estimate Groupon’s financial performance to within 2% of what Groupon reported each quarter.
YipitData started selling that data to hedge funds in 2011, and by 2013 it realized it could use its web data collection capabilities to build systems for many other companies. It now provides key performance metrics on 55 companies in seven sectors and works with more than 80 of the top funds and asset managers in the world.
YipitData is based in New York and has more than 85 employees, including data analysts, research analysts and data engineers from MIT, Bloomberg, The Blackstone Group and Goldman Sachs, among others. We talked with CEO Vinicius Vacanti about the effect and uses of alternative data.
MODERN TRADER: What’s the best use case for your data?
Vinicius Vacanti: We consider every product (company dataset) we produce to be a must-have for investors covering the respective company. It takes hundreds of thousands of dollars to build and maintain web data systems for each company. That means we don’t build a product unless we think it’s going to enable us to provide a game-changing understanding of the company. The advantage to that is that the datasets we do build tend to be highly accurate and highly granular. Our customers can understand exactly what’s happening this quarter as well as understand what’s happening inside the business to better understand long-term trends (see “Expedia case study,” bottom).
MT: What specific markets does your data target?
Vacanti: Our data addresses individual stocks and sectors. We collect data at the most fundamental level possible, which enables us to look at trends within and across businesses.
MT: If you had to invest $100 million in an alternative data trading strategy in the next six months, what would you do?
Vacanti: We are not directional in our analysis nor are we financial advisors. Our metrics and trends help understand the performance and competitive market dynamics of businesses within very short periods of time. Our recommendation to make the best of alternative data is to understand the best applications of different data sources in your analysis. Most times, the best insights are gained from combining different alternative datasets. For example, you can use web data to understand the supply dynamics of delivery companies, but not the demand. You can complement your analysis with e-mail receipt data to understand order trends and metrics. At YipitData, we specialize in combining different datasets to extract the most valuable insights.
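The idea of pairing a supply-side web dataset with a demand-side e-mail receipt dataset can be sketched as a simple join on a shared time key. This is a minimal sketch with hypothetical names (`combine_supply_and_demand`, week labels) and made-up figures; it is not a description of YipitData’s actual process.

```python
def combine_supply_and_demand(listings_by_week, orders_by_week):
    """Join two weekly series and compute orders per listing.

    listings_by_week: hypothetical web-scraped supply metric,
                      e.g. number of listings observed each week.
    orders_by_week:   hypothetical e-mail receipt demand metric,
                      e.g. number of orders observed each week.

    Only weeks present in both datasets with a nonzero listing count
    are kept (an inner join on the week label).
    """
    combined = {}
    for week in sorted(listings_by_week.keys() & orders_by_week.keys()):
        listings = listings_by_week[week]
        if listings == 0:
            continue  # avoid dividing by zero when no supply was observed
        combined[week] = orders_by_week[week] / listings
    return combined


if __name__ == "__main__":
    # Made-up numbers, for illustration only.
    supply = {"2018-W01": 200, "2018-W02": 220}
    demand = {"2018-W02": 1100, "2018-W03": 900}
    print(combine_supply_and_demand(supply, demand))
```

Neither series alone shows how heavily each unit of supply is used; the ratio only becomes available once the two datasets are aligned on the same periods.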
MT: What makes you stand apart from your competitors?
Vacanti: There are three clear distinguishing factors from all other alternative data companies:
We own our entire process, so we have full visibility into the inflections, factors and biases that go into our analysis. Most companies do not bother with cleaning or analyzing the data, let alone interpreting it. We do it all. Others focus on data analysis and distribution but suffer from third-party biases by not owning the collection and cleaning of the raw data.
Also, our main data source is web data, which is public information published by companies themselves, which means it is subject to very little bias. It is very hard to maintain reliability in web data collection systems across millions of pages. We have eight years of experience and a lot of proprietary technology to ensure accurate and consistent collection.
We have built a culture that is dedicated to deep focus and expertise on the products and companies that we track. We have engineers, data analysts and research analysts for each individual product that guarantee the consistency and accuracy we have delivered over years. We track fewer companies than other data providers, but our level of depth, accuracy and understanding is unparalleled.
MT: What is the biggest myth about alternative data?
Vacanti: That funds require a ton of investment to start using and understanding alternative data. There are not many, but some companies, such as YipitData, invest a lot of time and resources to create products that are easy for investors to incorporate into their analyses. You do not need a data scientist to use our data outputs and read our comprehensive reports.
MT: What are the demands and trends in future alternative data products?
Vacanti: Differentiated data and consistent accuracy are the main drivers of a valuable alternative data product. There are several key demands/trends we have seen that have distinctly differentiated our product.
Research and analytics: There is simply too much data out there and a ton of new data providers. The best alternative data products help their clients understand the data; they digest the information into valuable takeaways and analytics.
Transparency: Being very clear around methodology and performance. Owning up and explaining when data is wrong and communicating often with clients.
Compliance: This has been part of our process since day one, but many data owners, particularly those that do not provide analysis or look to productize their data, are finding out how critical a strong compliance framework is.
MT: How can users get a bit of extra performance out of new alternative datasets?
Vacanti: Look for cross-industry applications between different datasets. This might be one of the hardest parts of alternative data, but if done correctly, it’s where the best insights are found. Great datasets are often just one input on one company/industry; however, two great datasets together can create synergistic insights on companies or industries that are not typically associated. Take, for example, measuring new auto insurance policies as a proxy for new auto sales.
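One simple way to check whether one series is a usable proxy for another is to measure how closely the two move together. The sketch below computes a Pearson correlation between two equally long series; the function name `pearson` and the sample figures are hypothetical, and a high correlation on a short history is only a starting point, not proof that the proxy will hold.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("series must have the same length (at least 2)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


if __name__ == "__main__":
    # Made-up monthly figures, for illustration only.
    new_policies = [120, 135, 150, 160]      # hypothetical new auto insurance policies
    new_auto_sales = [1000, 1100, 1180, 1300]  # hypothetical new auto sales
    print(round(pearson(new_policies, new_auto_sales), 3))
```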
MT: What’s the economic model of your datasets?
Vacanti: All our products are sold as six-month to one-year subscriptions and are priced à la carte.
MT: How do clients consume data?
Vacanti: Our clients receive weekly, monthly and quarterly e-mail reports for each product. The e-mails contain summary takeaways, analyses, deep-dive research and the cleaned datasets themselves. We also enable clients to do some customized deep dives on existing datasets, as well as call in to speak with our data or research analysts. It’s a flexible offering designed to meet the needs of hedge funds, long-only investors and analysts.
MT: What are the biggest challenges selling alternative data?
Vacanti: The biggest challenges are cleaning and ingesting different datasets to make sense of them and extract valuable insights. There is no way around this one. You really need to understand the dataset very well and have seen it over a long period of time to understand the challenges, seasonality and biases associated with it. Only then can you get to consistent accuracy. Then it becomes much easier to interpret and gain confidence around the trends and inflection points you see.