Deploying DeepSeek-R1 671B on Ubuntu using Ollama, Docker, and WebUI requires a high-end multi-GPU system, proper software setup, and optimizations. Follow this step-by-step guide to set it up efficiently.
1. System Requirements
Minimum Hardware Requirements
To run DeepSeek-R1 671B, you need an extreme hardware setup:
- CPU: AMD Threadripper / Intel Xeon
- RAM: 1TB+ DDR5 ECC
- GPU: Multi-GPU setup with at least 480GB of total VRAM, for example:
  - 20x NVIDIA RTX 3090 (24GB each)
  - 10x NVIDIA RTX A6000 (48GB each)
- Storage: 10TB+ NVMe SSD
- Power & Cooling: Proper PSU and cooling for multi-GPU workloads
Note: Running DeepSeek-R1 671B requires a distributed multi-GPU system due to its extreme resource needs.
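As a rough sanity check, weight storage alone can be estimated from the parameter count and quantization level. The short Python sketch below is an approximation that ignores KV cache, activations, and runtime overhead; it shows why a ~480GB-class VRAM budget implies running 4-bit quantized weights:

```python
# Back-of-envelope weight-memory estimate for a 671B-parameter model.
# Approximation only: ignores KV cache, activations, and runtime overhead.

def weight_memory_gb(n_params: float, bits_per_param: float) -> float:
    """Approximate weight storage in GB (1 GB = 2**30 bytes)."""
    return n_params * bits_per_param / 8 / 2**30

N_PARAMS = 671e9  # 671 billion parameters

for label, bits in [("FP16", 16), ("FP8", 8), ("4-bit", 4)]:
    print(f"{label}: ~{weight_memory_gb(N_PARAMS, bits):.0f} GB")
# FP16: ~1250 GB, FP8: ~625 GB, 4-bit: ~312 GB
```

Even at 4-bit precision the weights alone approach the quoted 480GB budget once overhead is added, which is why a single-GPU setup is out of the question.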
2. Install Required Software
Step 1: Update Ubuntu & Install Essential Packages
sudo apt update && sudo apt upgrade -y
sudo apt install -y build-essential git curl wget python3 python3-pip
Step 2: Install NVIDIA Drivers, CUDA, and cuDNN
Since DeepSeek-R1 relies heavily on GPU acceleration, install the latest NVIDIA drivers.
1. Install NVIDIA Drivers
sudo apt install -y nvidia-driver-535
sudo reboot
After rebooting, verify installation:
nvidia-smi
2. Install CUDA 12.3
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-keyring_1.1-1_all.deb
sudo dpkg -i cuda-keyring_1.1-1_all.deb
sudo apt update
sudo apt install -y cuda-toolkit-12-3
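After the install completes, the CUDA binaries typically live under /usr/local/cuda-12.3 and are not on PATH by default. One common way to expose them (assuming the default install prefix):

```shell
# Assumes the default CUDA install prefix /usr/local/cuda-12.3
echo 'export PATH=/usr/local/cuda-12.3/bin:$PATH' >> ~/.bashrc
echo 'export LD_LIBRARY_PATH=/usr/local/cuda-12.3/lib64:$LD_LIBRARY_PATH' >> ~/.bashrc
source ~/.bashrc
nvcc --version   # should report release 12.3
```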
3. Install cuDNN
sudo apt install -y libcudnn8 libcudnn8-dev
Step 3: Install Docker & NVIDIA Container Toolkit
In this setup, the WebUI runs inside a Docker container with GPU access, so you need Docker plus the NVIDIA Container Toolkit.
1. Install Docker
sudo apt install -y docker.io
sudo systemctl enable docker
sudo systemctl start docker
2. Install NVIDIA Container Toolkit
The toolkit is not in Ubuntu's default repositories, so add NVIDIA's repository first, then install and register the runtime with Docker:
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg
curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
sudo apt update
sudo apt install -y nvidia-container-toolkit
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker
3. Verify NVIDIA Docker Support
docker run --rm --gpus all nvidia/cuda:12.3.0-base-ubuntu22.04 nvidia-smi
If successful, you should see your NVIDIA GPUs listed.
Step 4: Install Ollama
Ollama handles downloading, quantizing, and serving LLMs locally.
curl -fsSL https://ollama.ai/install.sh | sh
Verify Installation
ollama --version
3. Download and Run DeepSeek-R1 671B
Step 1: Pull the DeepSeek-R1 Model
ollama pull deepseek-r1:671b
This is a massive download (roughly 400GB for the default 4-bit quantized weights), so ensure you have enough free storage.
Step 2: Run DeepSeek-R1 671B with Ollama
ollama run deepseek-r1:671b
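Besides the interactive CLI, Ollama serves a local REST API on port 11434. The sketch below queries the documented /api/generate endpoint using only the standard library; the model tag is an assumption and must match whatever you pulled:

```python
# Minimal sketch of querying Ollama's local REST API (default port 11434).
# Assumption: the model tag passed to ask() matches what `ollama pull` downloaded.
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"

def build_request(model: str, prompt: str) -> dict:
    """Build a non-streaming request body for Ollama's /api/generate endpoint."""
    return {"model": model, "prompt": prompt, "stream": False}

def ask(model: str, prompt: str) -> str:
    """Send the prompt to the local Ollama server and return the response text."""
    body = json.dumps(build_request(model, prompt)).encode()
    req = urllib.request.Request(
        OLLAMA_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

# Example (requires the server and model to be running):
#   print(ask("deepseek-r1:671b", "Explain mixture-of-experts in one sentence."))
```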
4. Set Up WebUI for DeepSeek-R1
To interact with DeepSeek-R1 via a browser-based UI, install WebUI.
Step 1: Clone WebUI Repository
git clone https://github.com/deepseek-ai/webui.git
cd webui
Step 2: Build & Run WebUI with Docker
- Build WebUI Docker Image
docker build -t deepseek-webui .
- Run WebUI Container
docker run --gpus all --shm-size=1T -p 7860:7860 -v deepseek_cache:/root/.cache deepseek-webui
The --shm-size=1T flag enlarges the container's shared memory segment, which helps when loading and running very large models.
Step 3: Access WebUI
- Open your browser and go to:
http://localhost:7860
- Now, you can interact with DeepSeek-R1 671B via WebUI.
5. Performance Optimization
Enable Multi-GPU Scaling
For efficient inter-GPU communication, configure NVIDIA NCCL via environment variables:
export NCCL_P2P_DISABLE=0
export NCCL_IB_DISABLE=0
export NCCL_DEBUG=INFO
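These exports only last for the current shell session. One way to make them persist across logins (the /etc/profile.d path is a common Ubuntu convention, not a requirement):

```shell
# Persist NCCL settings for all login shells (assumes standard /etc/profile.d)
sudo tee /etc/profile.d/nccl.sh > /dev/null <<'EOF'
export NCCL_P2P_DISABLE=0
export NCCL_IB_DISABLE=0
export NCCL_DEBUG=INFO
EOF
```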
Allocate More Memory to Docker
Modify the Docker run command for better performance:
docker run --gpus all --shm-size=2T -p 7860:7860 deepseek-webui
Run in Background with Logs
nohup ollama run deepseek-r1:671b > deepseek.log 2>&1 &
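Note that Ollama's install script typically registers a systemd service for the server, so its logs can also be followed there; the log filename below matches the nohup command above:

```shell
sudo systemctl status ollama   # server service created by the install script
journalctl -u ollama -f        # follow server logs (if the service exists)
tail -f deepseek.log           # follow the backgrounded run's output
```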
6. Conclusion
You have successfully set up DeepSeek-R1 671B on Ubuntu using Ollama, Docker, and WebUI. This setup provides GPU acceleration, a scalable WebUI, and optimized inference performance.
Next Steps
- Monitor GPU performance using nvidia-smi
- Optimize memory allocation for better efficiency
- Experiment with smaller DeepSeek models for faster inference
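As a starting point for that experimentation, the hypothetical helper below picks the largest DeepSeek-R1 variant that fits a given VRAM budget. The tags are Ollama's published distills; the size figures are rough 4-bit estimates, not exact:

```python
# Hypothetical helper: choose the largest DeepSeek-R1 variant for a VRAM budget.
# Sizes are rough 4-bit-quantized estimates in GB; adjust for your setup.

MODELS = [  # (Ollama tag, approximate VRAM needed in GB)
    ("deepseek-r1:671b", 404),
    ("deepseek-r1:70b", 43),
    ("deepseek-r1:32b", 20),
    ("deepseek-r1:14b", 9),
    ("deepseek-r1:7b", 5),
]

def pick_model(vram_gb: float) -> str:
    """Return the largest model tag whose rough footprint fits in vram_gb."""
    for tag, need in MODELS:
        if need <= vram_gb:
            return tag
    return MODELS[-1][0]  # fall back to the smallest distill

print(pick_model(24))   # a single 24GB GPU fits the 32b distill
print(pick_model(500))  # a 480GB+ rig fits the full 671b model
```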