It's a universal truth of human nature that the developers who build the code shouldn't be the ones to test it. First of all, most of them genuinely hate the task. Second, as with any good auditing protocol, the people who do the work shouldn't be the ones who verify it.
So it's no surprise that code testing in all its forms (usability testing, language- or task-specific testing, end-to-end testing) is a focus for a growing crop of generative AI startups. Every week, TechCrunch covers another one, like Antithesis (raised $47 million), CodiumAI (raised $11 million) and QA Wolf (raised $20 million). And new ones keep emerging, like Momentic, which just graduated from Y Combinator.
Then there's Nova AI, a year-old startup out of the Unusual Academy accelerator that has raised a $1 million pre-seed round. Founder and CEO Zach Smith told TechCrunch the company is trying to beat its competitors on end-to-end testing tools by breaking many of Silicon Valley's rules about how startups should operate.
While the standard Y Combinator approach is to start small, Nova AI is going after medium-to-large companies with complex codebases and a pressing need right now. Smith declined to name any customers using or testing his product, describing them only as mostly late-stage (Series C or beyond) venture-backed startups in e-commerce, fintech, or consumer products with “heavy user experiences. Downtime for these features is costly,” he said.
Nova AI's technology scrutinizes customers' code and uses generative AI to automatically build tests. It's aimed specifically at continuous integration and continuous delivery/deployment (CI/CD) environments, where engineers are constantly shipping pieces of code into production.
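Nova AI hasn't published the details of how its generation step works, but the general shape of LLM-driven test generation in CI is easy to sketch. The Python below is a minimal illustration under assumptions of mine, not Nova AI's code: it loads a small open-source code model (a StarCoder checkpoint via Hugging Face's transformers library, an assumed stand-in) and prompts it to draft a pytest module for a changed source file. The file path and prompt format are hypothetical.

```python
# Minimal sketch of LLM-driven test generation in a CI job.
# Illustrative only: the model choice, prompt, and file path are assumptions.
from pathlib import Path

from transformers import pipeline

# A small open-source code model; any capable code LLM could stand in here.
generator = pipeline("text-generation", model="bigcode/starcoderbase-1b")

def draft_test(source_path: str) -> str:
    """Ask the model to draft a pytest module for one changed source file."""
    source = Path(source_path).read_text()
    prompt = (
        "# Source file under test:\n"
        f"{source}\n"
        "# pytest module covering the functions above:\n"
        "import pytest\n"
    )
    out = generator(prompt, max_new_tokens=256, do_sample=False)
    # generated_text echoes the prompt, so strip it to keep only the draft.
    return "import pytest\n" + out[0]["generated_text"][len(prompt):]

if __name__ == "__main__":
    # In CI this would iterate over the files touched by the current commit.
    print(draft_test("app/checkout.py"))  # hypothetical path
```

In a real CI/CD loop, the runner would then execute the drafted tests and surface any failures on the pull request; generation is only useful if the results feed straight back into the merge gate.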
The idea for Nova AI comes from Smith and co-founder Jeffrey Shih's engineering days at big tech companies. Smith is a former Googler who worked on cloud teams that helped customers use a lot of automation technology. Shih previously worked at Meta (and before that at Unity and Microsoft), with an unusual AI specialty: synthetic data. They've since added a third co-founder, AI data scientist Henry Lee.
Another rule Nova AI doesn't follow: While many AI startups are built on top of OpenAI's industry-leading GPT models, Nova AI uses OpenAI's GPT-4 as little as possible, tapping it only to help generate some code and to handle some labeling tasks. No customer data is fed to OpenAI.
Although OpenAI has promised not to use the data of those on paid business plans to train its models, enterprises still don't trust OpenAI, Smith said. “When we talk to big companies, they tell us, ‘We don't want our data going into OpenAI,’” he said.
It's not just the engineering teams at big companies that feel this way. OpenAI is fending off lawsuits from many who don't want their work used to train its models, or who believe their work ended up in those models without permission or compensation.
Nova AI instead relies heavily on open-source models such as Llama, developed by Meta, and StarCoder (from the BigCode community, developed by ServiceNow and Hugging Face), as well as building its own models. The team hasn't used Google's Gemma with customers yet, but has tested it and “got some good results,” Smith says.
For example, he explains, one common use of OpenAI's GPT-4 is to produce vector embeddings of data so that LLMs can use the vectors for semantic search. Vector embeddings translate chunks of text into numbers, which lets an LLM perform various operations, such as clustering a chunk with other chunks of similar text. Nova AI needs embeddings of its customers' source code to do this, but doesn't want to send any of that code to OpenAI.
“In this case, instead of using OpenAI's embedding models, we deploy our own open-source embedding models, so that when we need to run through every file, we aren't just sending it to OpenAI,” Smith explained.
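That swap is mechanically simple. The sketch below is my illustration of the pattern Smith describes, not Nova AI's code: an open-source embedding model (here sentence-transformers' all-MiniLM-L6-v2, an assumed stand-in) turns code chunks into vectors locally and answers semantic-search queries, with nothing leaving the machine.

```python
# Minimal sketch of self-hosted embeddings plus semantic search over code.
# Illustrative only: the model and the code chunks are assumptions.
import numpy as np
from sentence_transformers import SentenceTransformer

# An open-source embedding model that runs entirely on local hardware.
model = SentenceTransformer("all-MiniLM-L6-v2")

chunks = [
    "def add_to_cart(user, item): ...",
    "def apply_discount(order, code): ...",
    "def send_receipt_email(order): ...",
]

# Each chunk of text becomes a vector of numbers; similar text lands nearby.
vectors = model.encode(chunks, normalize_embeddings=True)

def search(query: str, top_k: int = 2) -> list[str]:
    """Return the chunks most semantically similar to the query."""
    q = model.encode([query], normalize_embeddings=True)[0]
    scores = vectors @ q  # cosine similarity, since the vectors are normalized
    return [chunks[i] for i in np.argsort(-scores)[:top_k]]

print(search("where do we compute order discounts?"))
```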
Beyond keeping customer data away from OpenAI, which reassures nervous enterprises, Smith has found that open-source models are cheaper and more than adequate for specific, targeted tasks. In this case, that task is writing tests.
“The open LLM industry has really proven that when you go really narrow, you can beat GPT-4 and these big domain providers,” he said. “We don't need some huge model that can tell your grandma what she wants for her birthday, right? We need to write a test. That's it. So our models are fine-tuned specifically for that.”
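There's a well-worn recipe behind “fine-tuned specifically for that.” The sketch below shows its generic shape under assumptions of mine (the model, a single toy example, default hyperparameters), not Nova AI's training setup: continue training a small open code model on pairs of source code and known-good tests until that's the one thing it does well.

```python
# Generic sketch of narrow fine-tuning on (source, test) pairs.
# Not Nova AI's setup: model, data, and hyperparameters are all assumptions.
from datasets import Dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

model_name = "bigcode/starcoderbase-1b"  # a small open code model
tok = AutoTokenizer.from_pretrained(model_name)
tok.pad_token = tok.eos_token  # code tokenizers often lack a pad token
model = AutoModelForCausalLM.from_pretrained(model_name)

# Each example pairs source code with a known-good test: the one narrow skill.
pairs = [{
    "text": "# source\ndef add(a, b):\n    return a + b\n"
            "# test\ndef test_add():\n    assert add(2, 3) == 5\n",
}]
ds = Dataset.from_list(pairs).map(
    lambda ex: tok(ex["text"], truncation=True, max_length=512),
    remove_columns=["text"],
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="test-writer", num_train_epochs=1),
    train_dataset=ds,
    data_collator=DataCollatorForLanguageModeling(tok, mlm=False),
)
trainer.train()  # result: a model tuned to do one thing, write tests
```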
Open-source models are also evolving quickly. Meta, for example, recently introduced a new version of Llama that has earned praise across the tech world and may push more AI startups to consider OpenAI alternatives.