AI tends to make things up. This tendency is a problem for almost everyone who uses it regularly, but it is especially problematic for businesses, where erroneous results can hurt the bottom line. Half of employees responding to a recent Salesforce survey said they were concerned about inaccurate answers from their company's generative AI systems.
No technique can fully cure these "hallucinations," but some can help. For example, retrieval-augmented generation (RAG) pairs an AI model with a knowledge base, retrieving supplementary information before the model responds, acting as a kind of fact-checking mechanism.
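The retrieve-then-generate loop at the heart of RAG can be sketched in a few lines. This is a toy illustration with an invented knowledge base and keyword-overlap scoring standing in for real similarity search; it is not Voyage's (or anyone's) actual implementation:

```python
# Minimal RAG sketch: retrieve supporting facts from a small knowledge
# base, then prepend them to the prompt the model would see.
# All names and data here are illustrative.

KNOWLEDGE_BASE = [
    "Voyage AI was founded in 2023 by Tengyu Ma.",
    "RAG pairs a language model with an external knowledge base.",
    "Vector embeddings represent text as lists of numbers.",
]

def retrieve(query: str, docs: list[str], k: int = 1) -> list[str]:
    """Rank documents by word overlap with the query (a crude stand-in
    for embedding-based similarity search)."""
    q_words = set(query.lower().split())
    scored = sorted(
        docs,
        key=lambda d: len(q_words & set(d.lower().split())),
        reverse=True,
    )
    return scored[:k]

def build_prompt(query: str, docs: list[str]) -> str:
    """Assemble the augmented prompt: retrieved context, then the question."""
    context = "\n".join(retrieve(query, docs))
    return f"Context:\n{context}\n\nQuestion: {query}"

prompt = build_prompt("Who founded Voyage AI?", KNOWLEDGE_BASE)
print(prompt)
```

In a production system, the keyword overlap above would be replaced by a nearest-neighbor search over vector embeddings, which is exactly the piece Voyage sells.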
Entire businesses are being built on RAG, thanks to the huge demand for more reliable AI. Voyage AI is one of them. Founded in 2023 by Stanford University professor Tengyu Ma, Voyage powers RAG systems for companies including Harvey, Vanta, Replit, and SK Telecom.
"Voyage is on a mission to improve the accuracy and efficiency of search and retrieval in enterprise AI," Ma told TechCrunch in an interview. "[Our solutions are] tailored to specific domains such as coding, finance, legal, and multilingual applications, as well as to each company's data."
To power RAG systems, Voyage trains AI models that convert text, documents, PDFs, and other formats into numerical representations called vector embeddings. Embeddings are useful for retrieval applications such as RAG because they capture the meaning of, and relationships between, different data points in a compact format.
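Once text is a vector, "semantically similar" becomes a geometric question: nearby vectors mean related content. A toy sketch of embedding-based retrieval using cosine similarity (the 3-D vectors below are invented for illustration; real embedding models emit hundreds or thousands of dimensions):

```python
import math

# Invented pre-computed embeddings, keyed by document text.
EMBEDDINGS = {
    "invoice from vendor": [0.9, 0.1, 0.0],
    "quarterly revenue report": [0.8, 0.3, 0.1],
    "hiking trail map": [0.0, 0.2, 0.9],
}

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity: 1.0 means same direction, 0.0 means orthogonal."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

def nearest(query_vec: list[float]) -> str:
    """Return the stored document whose embedding is closest to the query."""
    return max(EMBEDDINGS, key=lambda doc: cosine(query_vec, EMBEDDINGS[doc]))

# A query vector pointing in the "finance" direction retrieves a finance doc.
print(nearest([0.9, 0.15, 0.0]))  # → invoice from vendor
```

The quality of the embedding model determines whether the nearest vector is actually the most relevant document, which is the axis on which Voyage competes.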
Image credit: Voyage AI
Voyage uses a specific type of embedding called a contextual embedding, which captures not only the semantic meaning of the data but also the context in which it appears. For example, given the sentences "I was sitting on the bank of the river" and "I deposited the money in the bank," Voyage's embedding model generates a different vector for each instance of "bank," reflecting the meaning implied by its context.
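The "bank" example can be made concrete with a toy model. Here a static lookup gives every word one fixed vector, and a crude context-aware function blends in the neighboring words' vectors, so the same word lands in different places depending on its sentence. The vectors are invented and the averaging is a stand-in for what real contextual models learn; this is not Voyage's method:

```python
# Static word vectors (invented, 2-D): x-axis leans "finance",
# y-axis leans "nature".
STATIC = {
    "bank": [0.5, 0.5],
    "river": [0.0, 1.0],
    "money": [1.0, 0.0],
    "deposited": [0.9, 0.1],
    "sitting": [0.1, 0.6],
}

def contextual_embed(word: str, neighbors: list[str]) -> list[float]:
    """Average the word's static vector with its neighbors' vectors,
    so context shifts where the word ends up."""
    vecs = [STATIC[word]] + [STATIC[w] for w in neighbors if w in STATIC]
    n = len(vecs)
    return [sum(v[i] for v in vecs) / n for i in range(2)]

river_bank = contextual_embed("bank", ["sitting", "river"])
money_bank = contextual_embed("bank", ["deposited", "money"])
print(river_bank, money_bank)  # two different vectors for the same word
```

A static model would return `STATIC["bank"]` in both sentences; the contextual version pulls the river-bank vector toward "nature" and the money-bank vector toward "finance," which is the property that improves retrieval accuracy.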
Voyage hosts its models and licenses them for use on-premises, in a private cloud, or in the public cloud, and fine-tunes them for clients who pay for that service. The company isn't unique in this regard (OpenAI also offers customizable embedding services), but Ma claims Voyage's models deliver superior performance at lower cost.
"With RAG, given a question or query, we first retrieve relevant information from an unstructured knowledge base, much like a librarian searching for a book in a library," he explained. "Traditional RAG techniques often lose context while encoding information and fail to retrieve relevant information. Voyage's embedding models have best-in-class retrieval accuracy, which leads to higher end-to-end response quality from the RAG system."
Underscoring these bold claims is an endorsement from OpenAI's biggest rival, Anthropic, whose support documentation describes Voyage's models as "state-of-the-art."
"Voyage's approach uses vector embeddings trained on enterprise data to provide context-aware search, which significantly improves search accuracy," Ma said.
Palo Alto-based Voyage has just over 250 customers, Ma said. He declined to answer questions about the company's revenue.
In September, Voyage, which has about a dozen employees, closed a $20 million Series A round led by CRV with participation from Wing VC, Conviction, Snowflake, and Databricks. Ma said the infusion brings Voyage's total raised to $28 million and will help launch new embedding models and allow the company to double in size.