1. Introduction
Project repository:
https://github.com/mudler/LocalAI
LocalAI is a free, open-source alternative to OpenAI. It acts as a drop-in replacement REST API compatible with the OpenAI (and Elevenlabs, Anthropic, ...) API specifications for local AI inferencing. It lets you run LLMs and generate images, audio (and more) locally or on-prem on consumer-grade hardware, supporting multiple model families. No GPU is required. It is created and maintained by Ettore Di Giacinto.
2. Deployment
Run the installer script:
curl https://localai.io/install.sh | sh
Or run it with Docker:
# CPU only image:
docker run -ti --name local-ai -p 8080:8080 localai/localai:latest-cpu
# Nvidia GPU:
docker run -ti --name local-ai -p 8080:8080 --gpus all localai/localai:latest-gpu-nvidia-cuda-12
# CPU and GPU image (bigger size):
docker run -ti --name local-ai -p 8080:8080 localai/localai:latest
# AIO images (it will pre-download a set of models ready for use, see https://localai.io/basics/container/)
docker run -ti --name local-ai -p 8080:8080 localai/localai:latest-aio-cpu
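Once the container is up, the API listens on http://localhost:8080. A quick sanity check is to query the OpenAI-compatible model listing endpoint (a minimal example, assuming the default port mapping used above):

# List the models the server currently knows about
curl http://localhost:8080/v1/models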
To load a model:
# From the model gallery (see available models with `local-ai models list`, in the WebUI from the model tab, or visiting https://models.localai.io)
local-ai run llama-3.2-1b-instruct:q4_k_m
# Start LocalAI with the phi-2 model directly from huggingface
local-ai run huggingface://TheBloke/phi-2-GGUF/phi-2.Q8_0.gguf
# Install and run a model from the Ollama OCI registry
local-ai run ollama://gemma:2b
# Run a model from a configuration file
local-ai run https://gist.githubusercontent.com/.../phi-2.yaml
# Install and run a model from a standard OCI registry (e.g., Docker Hub)
local-ai run oci://localai/phi-2:latest
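With a model loaded, any OpenAI-compatible client can talk to it. A minimal sketch using curl against the chat completions endpoint, assuming the default port 8080 and that the model is addressed by the same gallery id used above:

# Send a chat completion request to the local server
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama-3.2-1b-instruct:q4_k_m",
    "messages": [{"role": "user", "content": "Hello, who are you?"}]
  }'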
3. Usage
See the official documentation:
https://localai.io/
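Because LocalAI mirrors the OpenAI API, most existing OpenAI integrations can be pointed at it by changing only the base URL. A minimal sketch using the environment variables that the official OpenAI SDKs read (the dummy key is a placeholder; LocalAI does not require an API key by default):

# Point OpenAI SDK-based tools at the local server instead of api.openai.com
export OPENAI_BASE_URL=http://localhost:8080/v1
export OPENAI_API_KEY=sk-local   # placeholder value; ignored unless API keys are configured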