Skip to main content

Digital Trends may earn a commission when you buy through links on our site. Why trust us?

All RTX GPUs now come with a local AI chatbot. Is it any good?

A window showing Nvidia's Chat with RTX.
Nvidia

It’s been difficult to justify packing dedicated AI hardware in a PC. Nvidia is trying to change that with Chat with RTX, which is a local AI chatbot that leverages the hardware on your Nvidia GPU to run an AI model.

It provides a few unique advantages over something like ChatGPT, but the tool still has some strange problems. There are the typical quirks you get with any AI chatbot here, but also larger issues that prove Chat with RTX needs some work.

Meet Chat with RTX

Here’s the most obvious question about Chat with RTX: How is this different from ChatGPT? Chat with RTX is a local large language model (LLM). It’s using TensorRT-LLM compatible models — Mistral and Llama 2 are included by default — and applying them to your local data. In addition, the actual computation is happening locally on your graphics card, rather than in the cloud. Chat with RTX requires an Nvidia RTX 30-series or 40-series GPU and at least 8GB of VRAM.

Get your weekly teardown of the tech behind PC gaming
Check your inbox!

A local model unlocks a few unique features. For starters, you load your own data into Chat with RTX. You can put together a folder full of documents, point Chat with RTX to it, and interact with the model based on that data. It offers a closer level of specificity, allowing the model to provide information on detailed documents rather than the more generic answers you see with something like Bing Chat or ChatGPT.

Chat with RTX answering a question about frame interpolation.
Jacob Roach / Digital Trends

And it works. I loaded up a folder with a range of research papers detailing Nvidia’s DLSS 3, AMD’s FSR 2, and Intel’s XeSS and asked some specific questions about how they’re different. Rather than scraping the internet and rewording an article explaining the differences — a common tactic for something like Bing Chat — Chat with RTX was able to provide detailed responses based on the actual research papers.

I wasn’t shocked that Chat with RTX was able to pull information out of some research papers, but I was shocked that it was able to distill that information so well. The documents I provided were, well, research papers, filled with academic speak, equations that will make your head spin, and references to details that aren’t explained in the paper itself. Despite that, Chat with RTX broke down the papers into information that was easy to understand.

You can also point Chat with RTX toward a YouTube video or playlist, and it will take down information from the transcripts. The pointed nature of the tool is what really shines, aa it allows you to focus the session in a single direction rather than ask questions about anything like you would with ChatGPT.

The other upside is that everything happens locally. You don’t have to send your queries to a server, or upload your documents and fear that they’ll be used to train the model further. It’s a streamlined approach to interacting with an AI model — you use your data, on your PC, and ask the questions you need to without any concerns about what’s happening on the other side of the model.

There are some downsides to the local approach of Chat with RTX, however. Most obviously, you’ll need a powerful PC packing a recent Nvidia GPU and at least 8GB of VRAM. In addition, you’ll need around 100GB of free space. Chat with RTX actually downloads the models it uses, so it takes up quite a bit of disk space.

Hallucinations

You didn’t think Chat with RTX would be free of issues, did you? As we’ve come to see with just about every AI tool, there’s a certain tolerance for flat-out incorrect responses from the AI, and Chat with RTX isn’t above that. Nvidia provides a sampling of recent Nvidia news articles with a new installation, and even then, the AI wasn’t always on the money.

Chat with RTX answering a question about Counter-Strike 2.
Jacob Roach / Digital Trends

For example, above you can see that the model said Counter-Strike 2 supports DLSS 3. It does not. I can only assume the model made some sort of connection between the DLSS 3.5 article it references and another article in the included dataset that mentions Counter-Strike 2. 

Chat with RTX answering a question about latency.
Jacob Roach / Digital Trends

The more pressing limitation is that Chat with RTX only has the sample data to go on. This leads to some weird situations where bias within the small dataset would lead to incorrect answers. For example, you can see above how the model says in one response that DLSS Frame Generation doesn’t introduce additional latency to gameplay, while in the next response, it says frame interpolation introduces additional latency to gameplay. DLSS Frame Generation uses frame interpolation.

Chat with RTX answering a question about Nvidia Reflex.
Jacob Roach / Digital Trends

In another response (above), Chat with RTX said that DLSS 3 doesn’t require Nvidia Reflex to work, and that’s not true. Once again, the model is going off of the data I provided, and it’s not perfect. It’s a reminder that an AI model can be incorrect with a straight face, even when it has a narrow focus like Chat with RTX allows.

I expected some of these oddities, but Chat with RTX still managed to surprise me. At various points in different sessions, I would ask a random question completely unrelated to the data I provided. In most situations, I would get a response noting that there’s not enough information for the model to go on to provide an answer. Makes sense.

Chat with RTX answering a question about tying shoelaces.
Jacob Roach / Digital Trends

Except in one situation, the model provided an answer. Using the default data, I asked it how to tie a shoelace, and the model provided step-by-step instructions and referenced an Nvidia blog post about ACE (Nvidia notes this prerelease version occasionally gets the reference files incorrect). When I asked again immediately after, it provided the same standard response about lacking context information.

I’m not sure what’s going on here. It could be that there’s something in the model that allows it to answer this question, or it might be pulling the details from somewhere else. Regardless, it’s clear Chat with RTX isn’t just using the data you provide to it. It has the capability, at least, to get information elsewhere. That became even more clear once I started asking about YouTube videos.

The YouTube incident

One of the interesting aspects of Chat with RTX is that it can read transcripts from YouTube videos. There are some limitations to this approach. The key is that the model only ever sees the transcript, not the actual video. If something happens within the video that’s not included in the transcript, the model never sees it. Even with that limitation, it’s a pretty unique feature.

I had a problem with it, though. Even when starting a completely new session with Chat with RTX, it would remember videos I had linked previously. That shouldn’t happen, as Chat with RTX isn’t supposed to remember the context of your current or previous conversations.

Chat with RTX answering a question about a YouTube video.
Jacob Roach / Digital Trends

I’ll walk through what happened because it can get a little hairy. In my first session, I linked to a video from the YouTube channel Commander at Home. It’s a Magic: the Gathering channel, and I wanted to see how Chat with RTX would respond to a complex topic that’s not explained in the video. It unsurprisingly didn’t do well, but that’s not what’s important.

I removed the old video and linked to an hourlong interview with Nvidia’s CEO Jensen Huang. After entering the link, I clicked the dedicated button to rebuild the database, basically telling Chat with RTX that we’re chatting about new data. I started this conversation out the same as I did in the previous one by asking, “what is this video about?” Instead of answering based on the Nvidia video I linked, it answered based on the previous Commander at Home video.

Chat with RTX answering a question about a YouTube video.
Note that the URL in this screenshot is different from the previous screenshot. Jacob Roach / Digital Trends

I tried rebuilding the database three more times, always with the same result. Eventually, I started a brand new session, completely exiting out of Chat with RTX and starting fresh. Once again, I linked the Nvidia video and downloaded the transcript, starting off with asking what the video was about. It again answered about the Commander at Home video.

Chat with RTX answering a question about a YouTube video.
Jacob Roach / Digital Trends

I was only able to get Chat with RTX to answer about the Nvidia video when I asked a specific question about that video. Even after chatting for a bit, any time I asked what the video was about, it’s answer would relate to the Commander at Home video. Remember, in this session, Chat with RTX never saw that video link.

Nvidia says this is because Chat with RTX reads all of the transcripts you’ve downloaded. They’re stored locally in a folder, and it will continue answering questions about all the videos you’ve entered, even when you start a new session. You have to delete the transcripts manually.

In addition, Chat with RTX will struggle with general questions if you have multiple video transcripts. When I asked what the video was about, Chat with RTX decided that I was asking about the Commander at Home video, so that was the video it referenced. It’s a little confusing, but you’ll need to manually choose the transcripts you want to chat about, especially if you’ve entered YouTube links previously.

You find the usefulness

If nothing else, Chat with RTX is a demonstration of how you can leverage local hardware to use an AI model, which is something that PCs have sorely lacked over the past year. It doesn’t require a complex setup, and you don’t need to have deep knowledge of AI models to get started. You install it, and as long as you’re using a recent Nvidia GPU, it works.

It’s hard to pin down how useful Chat with RTX is, though. In a lot of cases, a cloud-based tool like ChatGPT is strictly better due to the wide swath of information it can access. You have to find the usefulness with it. If you have a long list of documents to parse, or a stream of YouTube videos you don’t have the time to watch, Chat with RTX provides something you won’t find with a cloud-based tool — assuming you respect the quirks inherent to any AI chatbot.

This is just a demo, however. Through Chat with RTX, Nvidia is demonstrating what a local AI model can do, and hopefully it’s enough to gather interest from developers to explore local AI apps on their own.

Editors' Recommendations

Jacob Roach
Lead Reporter, PC Hardware
Jacob Roach is the lead reporter for PC hardware at Digital Trends. In addition to covering the latest PC components, from…
The RTX 50-series may fix the biggest weakness of its predecessor
Nvidia Blackwell chips.

Nvidia's RTX 50-series is still largely a mystery, but the occasional leak keeps the hype alive. And this time, it's gaming laptop enthusiasts who have reason to be happy. According to Moore's Law Is Dead, Nvidia's Blackwell laptop GPUs are receiving a much-needed boost to the memory interface, implying that Nvidia may have learned from its past mistakes.

Although rumor has it that Nvidia may be aiming to launch the first RTX 50-series desktop GPUs in the final quarter of 2024, we've heard very little about these graphics cards that didn't come from leaks. We've heard even less about their laptop counterparts, but it's possible that those will come later, as NotebookCheck reports that Nvidia may be delaying those models.

Read more
The RTX 4090 is more popular on Steam than any AMD GPU
Nvidia GeForce RTX 4090 GPU.

Despite being easily the fastest graphics card you can buy right now, the RTX 4090 is a niche product. At $1,600, it's out of the conversation for the vast majority of gamers. Still, that hasn't stopped the GPU from reaching a high ranking in Steam's hardware survey. According to the latest survey, the RTX 4090 is in 0.96% of gaming PCs running Steam -- more than any individual AMD GPU.

Although it's no surprise that Nvidia tops the charts in the Steam hardware survey -- the most recent report says Nvidia is represented in 76.59% of PCs compared to AMD's 15.79% -- it's shocking to see such an expensive GPU rank so highly. Compared to last month, the RTX 4090 even gained 0.11%, despite only being available above list price.

Read more
All of the exciting new GPUs still coming in 2024
Intel Arc A770 GPU installed in a test bench.

It's already been an exciting year for graphics cards, but things are really about to start heating up. A series of leaks and rumors over the past year suggest that Nvidia, AMD, and Intel will all launch new GPU generations in the back half of 2024, with the three graphics behemoths' offerings duking it out for a slot among the best graphics cards.

We've already seen graphics cards like the RX 7600 XT and RTX 4070 Super this year, but there could be many more in the tank. Although each of the three major graphics brands have confirmed that next-gen GPUs are on the way, none of them have laid out a specific timeline. So, with the understanding that these release windows are still speculative, these are the GPUs launching this year that we'll all be talking about heading into 2025.
Nvidia -- RTX 50-series (Blackwell)

Read more