Numbers Station, a startup that uses large language models (LLMs) to power its data analytics platform, is launching its first cloud-based product today: the aptly named Numbers Station Cloud. Now in early access, the service lets virtually anyone in a company analyze internal data through Numbers Station's chat interface.
Several similar tools focus on converting natural language queries into database languages such as SQL. However, the Numbers Station team argues that this approach has its limitations. One reason for this is that a general-purpose LLM does not understand how a specific company operates, how its data is structured, or how people within the company refer to specific data objects.
Numbers Station co-founder and CEO Chris Aberger told me he is getting a little tired of talking about how the service allows users to “chat with their data,” because there is so much noise around that idea. What really drives things, he said, sits at a higher level: non-technical users have questions they want to ask and get answered from these classic structured data sources. “It takes a lot of data modeling, data plumbing to make these foundation models and large language models work,” he told me.
For Numbers Station, this means devoting more engineering resources to building what the company calls a semantic catalog. The catalog is essentially an automatically curated source of a company's metrics and definitions, and it is unique to each company (it is not shared between companies). Aberger said the catalog is what ensures, for example, that a model's definition of “recurring revenue” matches the company's use of that term.
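To make the idea of a semantic catalog concrete, here is a minimal Python sketch of a per-company lookup from the terms people use to a single agreed definition. It is an illustration only, not Numbers Station's implementation: the `Metric` and `SemanticCatalog` names, the synonyms, and the SQL expression are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Metric:
    """One catalog entry: a canonical metric plus the names people use for it."""
    name: str
    sql: str  # hypothetical SQL expression, invented for this example
    synonyms: tuple = ()


class SemanticCatalog:
    """Hypothetical per-company catalog mapping business terms to definitions."""

    def __init__(self, metrics):
        self._by_term = {}
        for metric in metrics:
            # Index the canonical name and every synonym, case-insensitively.
            for term in (metric.name, *metric.synonyms):
                self._by_term[term.lower()] = metric

    def resolve(self, term):
        """Return the Metric a term refers to, or None if it is not cataloged."""
        return self._by_term.get(term.strip().lower())


# Assumed example data: one company's definition of "recurring revenue".
catalog = SemanticCatalog([
    Metric(
        name="recurring revenue",
        sql="SUM(monthly_subscription_amount)",
        synonyms=("ARR", "subscription revenue"),
    ),
])

metric = catalog.resolve("arr")
print(metric.name, "->", metric.sql)
```

In a setup like this, a model would consult the catalog before generating a query, so that a term like “ARR” always expands to the same company-specific definition rather than to whatever a general-purpose LLM guesses.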
Numbers Station's platform is built on a set of highly specialized LLMs and machine learning models, but it's this catalog that brings it all together. Ines Chami, co-founder and chief scientist at Numbers Station, told me that the team initially underestimated the challenge of building that part of the platform.
“It goes back to classic [machine learning] and classic data engineering: How do we create representations of knowledge that models can actually use to answer these questions?” she told me. “Because it's impossible for a model to understand all these metrics or to understand all the things business users ask.” After all, even humans can't immediately understand every question, so the model needs to turn a vague question into a very specific query. According to Numbers Station's research, this approach significantly improves accuracy compared to traditional text-to-SQL pipelines.
The company is launching the chat service today, but its overall vision is much bigger.
“What we're doing is basically building an AI platform for analytics,” Aberger said. “This is one of the applications. […] There are larger, broader initiatives that we're still working on as a company, and we're working on solving various data problems on top of that. For example: How can I enrich my data with third-party data sources? How can I run classical algorithms such as fuzzy matching? This platform allows you to build an almost infinite number of spokes.”
The company already has contracts with several Fortune 500 clients, including global real estate services firm Jones Lang LaSalle. “Numbers Station is at the cutting edge of enterprise AI for structured data,” said Sharad Rastogi, CEO of Jones Lang LaSalle's Work Dynamics Technology. “We are impressed with Numbers Station's reliable and engaging platform. It continuously learns as it is used, allowing data teams to discover and test hypotheses to drive impactful business outcomes.”