NVIDIA ChatRTX: Building a Local AI Assistant

Introduction

ChatRTX is a demo app from NVIDIA that lets you personalize a GPT large language model (LLM) chatbot with your own documents, notes, and other data.

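It combines RAG (retrieval-augmented generation), TensorRT-LLM, and RTX GPU acceleration, so you can query a customized chatbot and get answers quickly. Because it runs locally on a Windows PC or workstation, it is both faster and more secure.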
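To make the retrieval step of RAG concrete, here is a minimal sketch in plain Python. It is not how ChatRTX is implemented: the bag-of-words `embed` function, the sample notes, and the names `cosine` and `retrieve` are all assumptions made for illustration. A real setup would use a learned embedding model for retrieval and an LLM to write the final answer.

```python
import math
from collections import Counter


def embed(text):
    """Toy embedding (assumption): a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(question, documents, top_k=1):
    """Return the top_k documents most similar to the question."""
    q = embed(question)
    ranked = sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:top_k]


# Hypothetical personal notes standing in for a user's documents.
notes = [
    "the wifi password is stored in the blue notebook",
    "the project deadline is next friday",
    "the gpu driver was updated in march",
]

question = "when is the project deadline"
context = retrieve(question, notes, top_k=1)

# The retrieved text is placed into the prompt that the LLM would answer.
prompt = "Context: " + context[0] + "\nQuestion: " + question
```

The point of the design is that the model only sees the few passages most relevant to the question, which is what lets a local assistant answer from personal files without retraining.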

It currently supports Windows only.

Official product page: https://www.nvidia.com/en-us/ai-on-rtx/chatrtx/
Installer download: https://us.download.nvidia.com/RTX/ChatWithRTX_installer_3_5.zip

System requirements

Item	Requirement
Platform	Windows
GPU	NVIDIA GeForce™ RTX 30 or 40 Series GPU or NVIDIA RTX™ Ampere or Ada Generation GPU with at least 8GB of VRAM
RAM	16GB or greater
OS	Windows 11
Driver	535.11 or later
File Size	35GB

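In short: the download is 35GB, the PC needs at least 16GB of RAM, and the GPU must be an RTX 30 Series card or newer with at least 8GB of VRAM.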
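The VRAM and driver requirements can be checked with a short script. This is a sketch under assumptions: it expects `nvidia-smi` to be on the PATH and parses one line of its CSV query output. The function names, the sample line, and the 8000 MiB cut-off (a loose stand-in for "8GB", since cards report slightly different totals) are illustrative choices, and the script does not check the GPU series.

```python
import subprocess

MIN_VRAM_MIB = 8000      # assumption: loose stand-in for "at least 8GB of VRAM"
MIN_DRIVER = (535, 11)   # "535.11 or later" from the requirements table

QUERY = [
    "nvidia-smi",
    "--query-gpu=name,memory.total,driver_version",
    "--format=csv,noheader,nounits",
]


def check_gpu_line(line):
    """Parse one CSV line from nvidia-smi and check VRAM and driver version."""
    name, vram_mib, driver = [field.strip() for field in line.split(",")]
    driver_tuple = tuple(int(part) for part in driver.split("."))
    return {
        "name": name,
        "vram_ok": int(vram_mib) >= MIN_VRAM_MIB,
        "driver_ok": driver_tuple >= MIN_DRIVER,
    }


def check_local_gpus():
    """Run nvidia-smi (must be installed) and check every GPU it reports."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return [check_gpu_line(l) for l in result.stdout.splitlines() if l.strip()]


# Hypothetical sample line, so the parser can be tried without a GPU.
sample = check_gpu_line("NVIDIA GeForce RTX 3070, 8192, 537.58")
```

On a machine with an NVIDIA driver installed, calling `check_local_gpus()` returns one result dictionary per GPU.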

Installation

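The one-click installer ran without problems, but the app still connects to the internet at startup, and it failed with an error because it could not reach: https://huggingface.co/WhereIsAI/UAE-Large-V1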

Hugging Face (https://huggingface.co) is a model-sharing platform. It cannot be accessed normally from mainland China, but a mirror site is available: https://hf-mirror.com

For ways to download the model files, see the following resources:

A third look at Nvidia Chat With RTX: https://zhuanlan.zhihu.com/p/683316290
How to download Hugging Face models quickly, a summary of all methods: https://zhuanlan.zhihu.com/p/663712983
How to download large Hugging Face models quickly: https://padeoe.com/huggingface-large-models-downloader/

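The fix used here is to download the model files ahead of time with the officially recommended huggingface-cli command-line tool, so the app can load them offline and no longer fails because of the network.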

HuggingFace CLI tool

huggingface-cli is the tool used to download models. Documentation: https://huggingface.co/docs/huggingface_hub/main/en/guides/cli

$ pip install -U huggingface_hub -i https://pypi.org/simple

$ huggingface-cli --help
usage: huggingface-cli <command> [<args>]

positional arguments:
  {env,login,whoami,logout,repo,upload,download,lfs-enable-largefiles,lfs-multipart-upload,scan-cache,delete-cache}
                        huggingface-cli command helpers
    env                 Print information about the environment.
    login               Log in using a token from huggingface.co/settings/tokens
    whoami              Find out which huggingface.co account you are logged in as.
    logout              Log out
    repo                {create} Commands to interact with your huggingface.co repos.
    upload              Upload a file or a folder to a repo on the Hub
    download            Download files from the Hub
    lfs-enable-largefiles
                        Configure your repository to enable upload of files > 5GB.
    scan-cache          Scan cache directory.
    delete-cache        Delete revisions from the cache directory.

optional arguments:
  -h, --help            show this help message and exit

Downloading the model files for offline use

# Use the mirror site
$ export HF_ENDPOINT=https://hf-mirror.com

# Download the full model directory
$ huggingface-cli download --resume-download WhereIsAI/UAE-Large-V1 \
                           --local-dir-use-symlinks False \
                           --local-dir ~/models/UAE-Large-V1

$ ls -l ~/models/UAE-Large-V1/

onnx/
onnx/model.onnx
onnx/model_fp16.onnx
onnx/model_quantized.onnx
.gitattributes
config.json
model.safetensors
README.md
special_tokens_map.json
tokenizer.json
tokenizer_config.json
vocab.txt
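The same download can also be scripted from Python with the `huggingface_hub` library, followed by a quick check that the directory looks complete. This is a sketch under assumptions: the helper names `download_model` and `missing_files` are hypothetical, and the list of expected files is simply taken from the directory listing above rather than from any documented requirement of ChatRTX.

```python
import os
from pathlib import Path

# The mirror endpoint is read when huggingface_hub is imported, so it is set
# first and the import is deferred into download_model().
os.environ.setdefault("HF_ENDPOINT", "https://hf-mirror.com")

# Assumption: top-level files from the listing above that a local load needs.
EXPECTED_FILES = [
    "config.json",
    "model.safetensors",
    "tokenizer.json",
    "tokenizer_config.json",
    "special_tokens_map.json",
    "vocab.txt",
]


def download_model(repo_id, target_dir):
    """Download a full model repository into target_dir (needs network access)."""
    from huggingface_hub import snapshot_download
    return snapshot_download(repo_id=repo_id, local_dir=str(target_dir))


def missing_files(model_dir, expected=EXPECTED_FILES):
    """Return the expected files that are absent from model_dir."""
    model_dir = Path(model_dir)
    return [name for name in expected if not (model_dir / name).is_file()]


# Example (not run here, it needs network access):
# path = download_model("WhereIsAI/UAE-Large-V1",
#                       Path.home() / "models" / "UAE-Large-V1")
# print(missing_files(path))  # an empty list means nothing is missing
```

Running the check after the download is a cheap way to catch an interrupted transfer before pointing ChatRTX at the directory.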

Using the offline model files

$ cd C:\path\to\NVIDIA\ChatWithRTX\

$ type RAG\trt-llm-rag-windows-main\config\app_config.json

{
    "streaming": true,
    "similarity_top_k": 4,
    "is_chat_engine": false,
    "embedded_model": "WhereIsAI/UAE-Large-V1",
    "embedded_dimension": 1024
}

Change "embedded_model" to the absolute path of the downloaded model directory so that it is loaded from disk:

{
    "streaming": true,
    "similarity_top_k": 4,
    "is_chat_engine": false,
    "embedded_model": "F:\\models\\UAE-Large-V1",
    "embedded_dimension": 1024
}
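The edit can also be scripted, which avoids mistakes with the doubled backslashes that JSON requires in Windows paths. This is a minimal sketch: the function name `point_to_local_model` and the `.bak` backup file are assumptions for illustration, not part of ChatRTX.

```python
import json
from pathlib import Path


def point_to_local_model(config_path, model_dir):
    """Rewrite "embedded_model" in app_config.json to a local directory."""
    config_path = Path(config_path)
    original = config_path.read_text(encoding="utf-8")

    # Keep a copy of the untouched file next to it (hypothetical .bak name).
    backup = config_path.with_name(config_path.name + ".bak")
    backup.write_text(original, encoding="utf-8")

    config = json.loads(original)
    config["embedded_model"] = str(model_dir)

    # json.dumps escapes each backslash, producing F:\\models\\... in the file.
    config_path.write_text(json.dumps(config, indent=4), encoding="utf-8")
    return config


# Example (paths follow the ones used in this post):
# point_to_local_model(
#     r"RAG\trt-llm-rag-windows-main\config\app_config.json",
#     r"F:\models\UAE-Large-V1",
# )
```

All other keys, such as "similarity_top_k" and "embedded_dimension", are written back unchanged.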

Hands-on experience

After a successful installation, the interface looks like this:

Once it was running, I asked questions in both English and Chinese and found that Chinese is not well supported:

When I supplied one of my own PDF files and asked questions about it, the Chinese text in the answers came out garbled:

Overall, it still counts as a decent AI chat assistant.

And with that, an AI chat assistant is running on my local machine. Just for fun!

Author: 吴羽舒 (LeslieZhu), written on Wednesday, March 20, 2024
Permalink: https://www.lesliezhu.com/blog/2024/03/20/what_is_ChatRTX/
Copyright: this is an original article. If you republish it, please credit the author and cite the source. Thank you!