Running DeepSeek-R1 671B on Linux with Ollama, Docker, and WebUI requires a high-end system with multiple GPUs, a very large amount of RAM, and fast NVMe storage. This guide walks you through the setup step by step.
1. Hardware Requirements
Before proceeding, ensure your Linux system meets the following minimum requirements for DeepSeek-R1 671B:
- CPU: Multi-core AMD Threadripper / Intel Xeon
- RAM: 1TB+ DDR5 ECC
- GPU: Multi-GPU setup with at least 480GB of total VRAM, such as:
  - 20x Nvidia RTX 3090 (24GB each)
  - 10x Nvidia RTX A6000 (48GB each)
- Storage: 10TB+ NVMe SSD
- Power & Cooling: Sufficient power supply and cooling for multi-GPU usage
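Before buying or provisioning, you can sanity-check a machine from the shell. A quick sketch using standard tools (nvidia-smi becomes available once the driver from the next section is installed):
# List GPUs with their total VRAM
nvidia-smi --query-gpu=name,memory.total --format=csv
# Show total system RAM
free -h
# Show free space on the volume that will hold the model
df -h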
2. Install Required Software
Step 1: Install NVIDIA Drivers & CUDA
Since DeepSeek-R1 671B runs best with GPU acceleration, install the latest NVIDIA drivers and CUDA:
- Update System:
sudo apt update && sudo apt upgrade -y
- Install NVIDIA Driver:
sudo apt install -y nvidia-driver-535
- Install CUDA Toolkit:
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-keyring_1.1-1_all.deb
sudo dpkg -i cuda-keyring_1.1-1_all.deb
sudo apt update
sudo apt install -y cuda-toolkit
- Verify Installation:
nvidia-smi
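If nvcc is not found after installation, the toolkit lives under /usr/local/cuda and needs to be added to your environment. A typical setup (adjust to your CUDA version, and append the exports to ~/.bashrc to persist them):
# Put the CUDA compiler and libraries on the search paths
export PATH=/usr/local/cuda/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH
# Confirm the toolkit version
nvcc --version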
Step 2: Install Docker & NVIDIA Container Toolkit
The WebUI runs inside a Docker container, so install Docker along with NVIDIA's container toolkit for GPU access.
- Install Docker:
sudo apt install -y docker.io
sudo systemctl enable docker
sudo systemctl start docker
- Install NVIDIA Container Toolkit (the package is served from NVIDIA's own apt repository; add that repository per NVIDIA's install docs if apt cannot find it):
sudo apt install -y nvidia-container-toolkit
- Register the NVIDIA runtime with Docker and restart the daemon:
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker
- Test GPU Integration:
docker run --rm --gpus all nvidia/cuda:12.3.0-base-ubuntu22.04 nvidia-smi
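If the test fails with a permission error rather than a GPU error, your user is probably not in the docker group; adding it avoids needing sudo for every Docker command:
# Let the current user talk to the Docker daemon (log out and back in to apply)
sudo usermod -aG docker $USER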
Step 3: Install Ollama
Ollama downloads, manages, and serves LLMs locally with inference optimizations.
- Download and Install Ollama:
curl -fsSL https://ollama.ai/install.sh | sh
- Verify Installation:
ollama --version
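On Linux the installer registers Ollama as a systemd service, which typically stores models under /usr/share/ollama/.ollama/models. Given the model size, you may want to point Ollama at your large NVMe volume instead via the OLLAMA_MODELS variable; a sketch assuming an example mount at /mnt/deepseek:
# Open an override file for the ollama service
sudo systemctl edit ollama
# In the editor, add:
#   [Service]
#   Environment="OLLAMA_MODELS=/mnt/deepseek/models"
# Then apply the change
sudo systemctl daemon-reload
sudo systemctl restart ollama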
3. Download and Run DeepSeek-R1 671B
Step 1: Pull DeepSeek-R1 671B Model
- Download DeepSeek-R1 671B using Ollama:
ollama pull deepseek-r1:671b
Even in Ollama's default 4-bit quantization this download weighs in at roughly 400GB, so make sure you have ample free NVMe storage.
- Run DeepSeek-R1 671B with Ollama:
ollama run deepseek-r1:671b
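Ollama also serves an HTTP API on port 11434 by default, which is what the WebUI will talk to in the next section. A quick smoke test with curl (the prompt is arbitrary):
# Send one prompt to the local Ollama API and stream the response
curl http://localhost:11434/api/generate -d '{
  "model": "deepseek-r1:671b",
  "prompt": "Summarize what quantization does in one sentence."
}'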
4. Set Up WebUI for DeepSeek-R1
To interact with the model through a browser, install the WebUI.
Step 1: Clone WebUI Repository
git clone https://github.com/deepseek-ai/webui.git
cd webui
Step 2: Build and Run WebUI with Docker
- Build Docker Image:
docker build -t deepseek-webui .
- Run WebUI Container:
docker run --gpus all --shm-size=512g -p 7860:7860 -v deepseek_cache:/root/.cache deepseek-webui
This command allocates a large shared-memory segment (--shm-size=512g) to optimize model execution.
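For day-to-day use you will likely want the container detached, named, and automatically restarted after reboots; a variant of the same command using standard Docker flags:
# Run the WebUI in the background with a stable name and a restart policy
docker run -d --name deepseek-webui --restart unless-stopped \
  --gpus all --shm-size=512g -p 7860:7860 \
  -v deepseek_cache:/root/.cache deepseek-webui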
Step 3: Access WebUI
- Open your browser and navigate to:
http://localhost:7860
- Start interacting with DeepSeek-R1 671B via the WebUI.
5. Performance Optimization
Enable Multi-GPU Scaling
For optimal speed, configure NVIDIA NCCL for efficient GPU communication:
export NCCL_P2P_DISABLE=0
export NCCL_IB_DISABLE=0
export NCCL_DEBUG=INFO
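Whether peer-to-peer transfers actually help depends on how your GPUs are wired together; you can inspect the interconnect before tuning:
# Print the GPU interconnect matrix: NV# entries indicate NVLink, PIX/PHB indicate PCIe paths
nvidia-smi topo -m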
Modify Docker for Better Performance
- Allocate More Memory:
docker run --gpus all --shm-size=1T -p 7860:7860 deepseek-webui
- Enable Persistent Storage:
docker run --gpus all -p 7860:7860 -v /mnt/deepseek:/root/.cache deepseek-webui
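If the host directory does not exist yet, create it and make it writable first (/mnt/deepseek is the example path from the command above):
# Create the host-side cache directory and hand ownership to your user
sudo mkdir -p /mnt/deepseek
sudo chown $USER:$USER /mnt/deepseek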
Run Model in Background with Logs
nohup ollama run deepseek-r1:671b > deepseek.log 2>&1 &
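You can follow the log from another terminal and find the background process again later:
# Stream the log as it is written
tail -f deepseek.log
# Locate the background ollama process from any shell
pgrep -af "ollama run"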
6. Conclusion
You have successfully set up DeepSeek-R1 671B on Linux using Ollama, Docker, and WebUI. This setup provides GPU acceleration, a scalable WebUI, and efficient model execution.
Next Steps:
- Monitor GPU performance using nvidia-smi
- Optimize memory and CPU usage based on your workload
- Experiment with smaller DeepSeek models for faster inference
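For ongoing monitoring, two lightweight options that ship with the NVIDIA driver:
# Refresh the standard nvidia-smi view every second
watch -n 1 nvidia-smi
# Or stream per-GPU utilization, memory, and power as a compact log
nvidia-smi dmon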