Always keen to encourage the purchase of the latest GPUs, Nvidia is releasing a tool that allows owners of GeForce RTX 30-series and 40-series cards to run AI-powered chatbots offline on their Windows PCs.
The tool, called Chat with RTX, allows users to customize a GenAI model along the lines of OpenAI's ChatGPT by connecting it to documents, files, and notes that it can then query.
“Rather than searching through notes or saved content, users can simply type queries,” Nvidia writes in a blog post. “For example, one could ask, 'What was the restaurant my partner recommended while in Las Vegas?' and Chat with RTX will scan local files the user points it to and provide the answer with context.”
Chat with RTX defaults to AI startup Mistral's open source model, but it also supports other text-based models, including Meta's Llama 2. Nvidia warns that downloading all the necessary files will consume a significant amount of storage (50GB to 100GB, depending on the model selected).
Currently, Chat with RTX works with text, PDF, .doc, .docx, and .xml formats. Pointing the app at a folder containing any supported files loads them into the model's fine-tuning dataset. In addition, Chat with RTX can take the URL of a YouTube playlist and load transcriptions of the videos in the playlist, allowing the selected model to query their contents.
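Nvidia hasn't detailed Chat with RTX's internals, but the workflow described (point the app at a folder of local files, then ask questions grounded in them) rests on a retrieval step common to local document-chat tools: rank the user's files against the query and hand the best matches to the model as context. Below is a minimal, illustrative sketch of that step in Python using TF-IDF similarity; the folder name, helper functions, and ranking approach are assumptions for illustration, not Nvidia's code.

```python
# Illustrative sketch of the retrieval step behind document-grounded chat:
# index local files, then surface the ones most relevant to a query so a
# local LLM can answer from them. Not Nvidia's implementation.
from pathlib import Path

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity


def load_documents(folder: str) -> dict[str, str]:
    """Read every plain-text file in a user-specified folder."""
    return {p.name: p.read_text(encoding="utf-8", errors="ignore")
            for p in Path(folder).glob("*.txt")}


def top_matches(docs: dict[str, str], query: str, k: int = 3) -> list[str]:
    """Rank documents against the query by TF-IDF cosine similarity."""
    names = list(docs)
    vectorizer = TfidfVectorizer(stop_words="english")
    doc_vectors = vectorizer.fit_transform(docs[n] for n in names)
    query_vector = vectorizer.transform([query])
    scores = cosine_similarity(query_vector, doc_vectors)[0]
    ranked = sorted(zip(scores, names), reverse=True)
    return [name for score, name in ranked[:k] if score > 0]


if __name__ == "__main__":
    notes = load_documents("my_notes")  # hypothetical folder of .txt notes
    for name in top_matches(notes, "restaurant recommended in Las Vegas"):
        print(name)  # these files would be passed to the model as context
```

A production tool would go further, chunking long documents and using learned embeddings rather than TF-IDF, but the shape of the pipeline (index locally, retrieve per query, answer from the retrieved text) is the same.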
There are some limitations to keep in mind here, which to Nvidia's credit are outlined in the how-to guide.
Chat with RTX can't remember context, so the app won't take previous questions into account when answering follow-up questions. For example, if you ask “What is a common bird in North America?” and follow up with “What are its colors?”, Chat with RTX won't know that you're talking about birds.
Nvidia also acknowledges that the relevance of the app's responses can be affected by a range of factors, some easier to control for than others: the phrasing of the question, the performance of the selected model, and the size of the fine-tuning dataset. Asking for facts covered in a couple of documents is likely to yield better results than asking for a summary of a document or set of documents. And response quality generally improves with larger datasets, as does pointing Chat with RTX at more content about a specific subject, Nvidia says.
Chat with RTX, then, is more of a toy than something for production use. Still, there's something to be said for apps that make it easier to run AI models locally, which is something of a growing trend.
In a recent report, the World Economic Forum predicted a “dramatic” increase in affordable devices that can run GenAI models offline, including PCs, smartphones, Internet of Things devices, and networking equipment. The reasons, the WEF said, are the clear benefits: offline models are not only inherently more private, since the data they process never leaves the device they run on, but also lower latency and more cost-effective than cloud-hosted models.
Of course, democratizing the tools for running and training models opens the door to malicious actors; a cursory Google search yields plenty of listings for models fine-tuned on toxic content from unsavory corners of the web. But proponents of apps like Chat with RTX argue that the benefits outweigh the harms. We'll have to wait and see.