Running DeepSeek-R1 671B on Linux with Ollama, Docker, and WebUI requires a high-end system with multiple GPUs, a large amount of RAM, and fast NVMe storage. This guide walks you through the setup step by step.


1. Hardware Requirements

Before proceeding, ensure your Linux system meets the following minimum requirements for DeepSeek-R1 671B (a quick command-line check follows the list):

  • CPU: Multi-core AMD Threadripper / Intel Xeon
  • RAM: 1TB+ DDR5 ECC
  • GPU: Multi-GPU setup with at least 480GB of total VRAM, for example:
    • 20x Nvidia RTX 3090 (24GB each)
    • 10x Nvidia RTX A6000 (48GB each)
  • Storage: 10TB+ NVMe SSD
  • Power & Cooling: Sufficient power supply and cooling for multi-GPU usage
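
A quick way to confirm the machine actually meets these numbers (the VRAM query assumes the NVIDIA driver from the next section is already installed):

# CPU cores, total RAM, and free disk space
nproc
free -h
df -h /
# Name and VRAM of every detected GPU
nvidia-smi --query-gpu=name,memory.total --format=csv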

2. Install Required Software

Step 1: Install NVIDIA Drivers & CUDA

Since DeepSeek-R1 671B runs best with GPU acceleration, install the latest NVIDIA drivers and CUDA:

  1. Update System:
   sudo apt update && sudo apt upgrade -y
  2. Install NVIDIA Driver:
   sudo apt install -y nvidia-driver-535
  3. Install CUDA Toolkit (via NVIDIA's cuda-keyring network repository for Ubuntu 22.04):
   wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-keyring_1.1-1_all.deb
   sudo dpkg -i cuda-keyring_1.1-1_all.deb
   sudo apt update
   sudo apt install -y cuda-toolkit-12-3
  4. Verify Installation:
   nvidia-smi
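
nvidia-smi only confirms the driver. To verify the toolkit itself, check nvcc, adding /usr/local/cuda/bin to your PATH if the command is not found (the toolkit installs under /usr/local/cuda by default):

export PATH=/usr/local/cuda/bin:$PATH
nvcc --version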

Step 2: Install Docker & NVIDIA Container Toolkit

DeepSeek-R1 671B runs efficiently inside a Docker container.

  1. Install Docker:
   sudo apt install -y docker.io
   sudo systemctl enable docker
   sudo systemctl start docker
  2. Install NVIDIA Container Toolkit (the package lives in NVIDIA's own apt repository, which must be added first; see NVIDIA's container toolkit install guide), then register the NVIDIA runtime with Docker:
   sudo apt install -y nvidia-container-toolkit
   sudo nvidia-ctk runtime configure --runtime=docker
   sudo systemctl restart docker
  3. Test GPU Integration:
   docker run --rm --gpus all nvidia/cuda:12.3.0-base-ubuntu22.04 nvidia-smi
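
If the test fails, a useful first check is whether Docker actually registered the nvidia runtime; the output should list an "nvidia" entry alongside the default runc:

docker info --format '{{json .Runtimes}}'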

Step 3: Install Ollama

Ollama helps run LLMs efficiently with optimizations for inference.

  1. Download and Install Ollama:
   curl -fsSL https://ollama.com/install.sh | sh
  2. Verify Installation:
   ollama --version
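
On Linux, the install script also sets Ollama up as a systemd service. You can confirm it is running and list locally available models (empty until the pull in the next section):

systemctl status ollama --no-pager
ollama list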

3. Download and Run DeepSeek-R1 671B

Step 1: Pull DeepSeek-R1 671B Model

  1. Download DeepSeek-R1 671B from the Ollama library:
   ollama pull deepseek-r1:671b

The download runs to several hundred gigabytes, so make sure the NVMe volume has ample free space.

  2. Run DeepSeek-R1 671B with Ollama:
   ollama run deepseek-r1:671b
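
Besides the interactive CLI, Ollama exposes an HTTP API on port 11434, which is handy for scripting. A minimal non-streaming generation request against the model pulled above:

curl http://localhost:11434/api/generate -d '{
  "model": "deepseek-r1:671b",
  "prompt": "Summarize what you are in one sentence.",
  "stream": false
}'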

4. Set Up WebUI for DeepSeek-R1

To interact with the model via a web interface, install WebUI.

Step 1: Clone WebUI Repository

git clone https://github.com/deepseek-ai/webui.git
cd webui

Step 2: Build and Run WebUI with Docker

  1. Build Docker Image:
   docker build -t deepseek-webui .
  2. Run WebUI Container:
   docker run --gpus all --shm-size=512g -p 7860:7860 -v deepseek_cache:/root/.cache deepseek-webui

The --shm-size=512g flag gives the container 512GB of shared memory, which multi-GPU inference relies on for inter-process communication.
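
For day-to-day use you will likely want the container detached with a restart policy instead of tied to your terminal. A sketch of the same command in that form (the container name is an arbitrary choice):

docker run -d --name deepseek-webui \
  --restart unless-stopped \
  --gpus all --shm-size=512g \
  -p 7860:7860 \
  -v deepseek_cache:/root/.cache \
  deepseek-webui
docker logs -f deepseek-webui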

Step 3: Access WebUI

  • Open your browser and navigate to:
  http://localhost:7860
  • Start interacting with DeepSeek-R1 671B via the WebUI.

5. Performance Optimization

Enable Multi-GPU Scaling

For optimal speed, configure NVIDIA NCCL for efficient GPU communication:

export NCCL_P2P_DISABLE=0
export NCCL_IB_DISABLE=0
export NCCL_DEBUG=INFO
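
These exports only last for the current shell. To make them persistent across sessions, one reasonable approach is a profile.d snippet (the file name here is an arbitrary choice):

sudo tee /etc/profile.d/nccl.sh >/dev/null <<'EOF'
export NCCL_P2P_DISABLE=0
export NCCL_IB_DISABLE=0
export NCCL_DEBUG=INFO
EOF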

Modify Docker for Better Performance

  • Allocate More Memory:
  docker run --gpus all --shm-size=1T -p 7860:7860 deepseek-webui
  • Enable Persistent Storage:
  docker run --gpus all -p 7860:7860 -v /mnt/deepseek:/root/.cache deepseek-webui
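
The two flags combine in a single command, of course:

docker run --gpus all --shm-size=1T -p 7860:7860 -v /mnt/deepseek:/root/.cache deepseek-webui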

Run Model in Background with Logs

nohup ollama run deepseek-r1:671b > deepseek.log 2>&1 &
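
Standard tools let you follow the log or stop the background run later:

# Stream the model's output as it is written
tail -f deepseek.log
# Find the background process and stop it when done
pgrep -af ollama
kill <PID>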

6. Conclusion

You have successfully set up DeepSeek-R1 671B on Linux using Ollama, Docker, and WebUI. This setup provides GPU acceleration, a scalable WebUI, and efficient model execution.

Next Steps:

  • Monitor GPU performance using nvidia-smi (see the one-liner below)
  • Optimize memory and CPU usage based on your workload
  • Experiment with smaller DeepSeek models for faster inference
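
For the monitoring point above, a simple live view that refreshes every two seconds:

watch -n 2 nvidia-smi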
