I am keeping this blog post as a running record of new AI terms I come across, since their meanings are not always easy to look up.

GPTQ

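GPTQ is a post-training quantization method for Generative Pre-trained Transformer models; the name is often expanded as "Generative Pre-trained Transformer Quantization". A model labeled GPTQ has been quantized this way and is intended to run on a GPU.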

GGML

GGML is a tensor library for machine learning, written in C/C++, that lets you run LLMs on the CPU alone or on CPU plus GPU. It also defines a binary format for distributing large language models (LLMs). GGML relies on a technique called quantization, which allows large language models to run on consumer hardware.

The name is commonly expanded as "GPT-Generated Model Language". A model in GGML format can run on the CPU. The format is now obsolete.

GGUF

GPT-Generated Unified Format. This is the newest format for models that can run on the CPU, and it replaces GGML.
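One practical way to tell whether a file is GGUF is to look at its first bytes: a GGUF file begins with the 4-byte magic "GGUF" followed by a little-endian 32-bit version number. The snippet below is only a minimal sketch of that check. The function name read_gguf_header and the synthetic 8-byte header are made up for illustration; a real file continues with counts, metadata, and tensor data that this sketch does not parse.

```python
import io
import struct


def read_gguf_header(stream):
    """Return the GGUF version number, or raise ValueError if the magic is wrong."""
    magic = stream.read(4)
    if magic != b"GGUF":
        raise ValueError(f"not a GGUF file, magic was {magic!r}")
    # The version is stored as an unsigned 32-bit little-endian integer.
    (version,) = struct.unpack("<I", stream.read(4))
    return version


# Synthetic header for demonstration only (hypothetical data, not a real model).
fake_file = io.BytesIO(b"GGUF" + struct.pack("<I", 3))
version = read_gguf_header(fake_file)
print("GGUF version:", version)
```

With a real model you would pass open(path, "rb") instead of the in-memory stream.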

quantization

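Quantization stores a model's weights at lower numeric precision (fewer bits per weight), so the model needs less memory and less compute to run.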

GGML supports a number of different quantization strategies (e.g. 4-bit, 5-bit, and 8-bit quantization), each of which offers a different trade-off between efficiency (model size and speed) and output quality.
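To make the idea concrete, here is a minimal sketch of symmetric quantization of one small block of weights to 4-bit integers with a single shared scale. This is not GGML's actual on-disk layout or algorithm; the function names, the block of example weights, and the choice of a symmetric scheme are all assumptions made for illustration.

```python
def quantize_block(values, bits=4):
    """Map floats to signed integers in [-qmax, qmax] using one shared scale."""
    qmax = 2 ** (bits - 1) - 1  # 7 for 4-bit
    amax = max(abs(v) for v in values)
    scale = amax / qmax if amax > 0 else 1.0
    # Round to the nearest integer level and clamp to the representable range.
    q = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return scale, q


def dequantize_block(scale, q):
    """Recover approximate floats from the integers and the shared scale."""
    return [scale * x for x in q]


# Hypothetical weights, chosen only for the demonstration.
weights = [0.12, -0.50, 0.33, 0.07, -0.21, 0.44, -0.02, 0.29]
scale, q = quantize_block(weights, bits=4)
restored = dequantize_block(scale, q)
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print("quantized:", q)
print("max error:", max_error)
```

Each weight is now one small integer plus a share of the block's scale, and the rounding error per weight is at most half of the scale. Using fewer bits makes the scale coarser, which is where the loss of quality comes from.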

Retrieval-augmented generation (RAG)

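Retrieval-augmented generation: before generating an answer, the system retrieves relevant documents from an external knowledge source and passes them to the model as context. It is commonly used for generative question answering.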
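The retrieve-then-generate flow can be sketched in a few lines. This is a toy under stated assumptions: retrieval here is plain word overlap rather than the embedding search real systems usually use, the generation step is reduced to building the prompt that would be sent to an LLM, and the function names and documents are made up for illustration.

```python
import re


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question, documents, k=1):
    """Return the k documents that share the most words with the question."""
    q = tokenize(question)
    ranked = sorted(documents, key=lambda d: len(q & tokenize(d)), reverse=True)
    return ranked[:k]


def build_prompt(question, documents, k=1):
    """Put the retrieved documents in front of the question as context."""
    context = "\n".join(retrieve(question, documents, k))
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"


# Hypothetical knowledge base for the demonstration.
docs = [
    "GGUF is a file format for storing quantized models.",
    "Fine-tuning adapts a pretrained model to a specific task.",
    "Quantization stores weights with fewer bits.",
]
prompt = build_prompt("What is the GGUF file format?", docs)
print(prompt)
```

In a real setup the prompt would then be passed to a language model, which answers using the retrieved context instead of relying only on what it memorized during training.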

pretrained models and fine-tuned models

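A pretrained model has been trained on a large, general corpus. A fine-tuned model is a pretrained model that has been trained further on data for a specific task or domain.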

AI-Terminology

Specialized terminology of the AI field.