Run DeepSeek Locally on Your Mac mini M4 Pro
To run DeepSeek locally on your Mac mini M4 Pro, follow this setup guide. It uses Ollama as the local model runtime, plus Docker and Open WebUI for a ChatGPT-like browser experience.
1. Install Ollama (the AI engine)
First, install the Ollama runtime to handle local AI models.
- Install Ollama either by downloading the macOS app from https://ollama.com/download, or via Homebrew in Terminal:
brew install ollama
- Check if it’s installed by running:
ollama --version
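Once installed, Ollama serves a local API on port 11434 by default (the macOS app starts the server automatically; with a Homebrew install, run ollama serve first). A quick sanity check, assuming that default port:
curl http://localhost:11434/api/version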
2. Download DeepSeek R1 Models
Choose a model based on your hardware. The Mac mini M4 Pro ships with 24, 48, or 64 GB of unified memory: the 8B and 14B models run comfortably on any configuration, the 32B model needs roughly 20 GB free, and the 70B model (roughly 40 GB at 4-bit quantization) is only practical on a 64 GB machine.
- Pull a model:
ollama pull deepseek-r1:8b # Fast, lightweight
ollama pull deepseek-r1:14b # Balanced performance
ollama pull deepseek-r1:32b # Heavy processing
ollama pull deepseek-r1:70b # Max reasoning, slowest
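After pulling, confirm which models are available locally and how much disk space each one takes:
ollama list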
3. Run DeepSeek R1 in Basic Mode
To test the model in Terminal (without the GUI):
ollama run deepseek-r1:8b
This opens an interactive chat session directly in the terminal; type /bye to exit.
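Beyond the interactive session, you can pass a one-shot prompt directly or script against the local HTTP API. A minimal example, assuming the default port 11434:
ollama run deepseek-r1:8b "Summarize why unified memory matters for local LLMs."
curl http://localhost:11434/api/generate -d '{"model": "deepseek-r1:8b", "prompt": "Why is the sky blue?", "stream": false}'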
4. Upgrade to a ChatGPT-Like Interface Using Docker and Open WebUI
For a better user experience, install Docker and Open WebUI for a browser-based interface similar to ChatGPT.
Install Docker
- Download Docker Desktop for macOS (Apple silicon) from https://www.docker.com/products/docker-desktop/.
- Open Docker and leave it running in the background.
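Before proceeding, confirm Docker is installed and the daemon is reachable:
docker --version
docker info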
Install Open WebUI
- Run the following command in Terminal to set up Open WebUI. The container serves its UI on port 8080 internally (mapped here to 3000 on your Mac), and host.docker.internal lets it reach the Ollama server running on the host:
docker run -d --name open-webui -p 3000:8080 -e OLLAMA_BASE_URL=http://host.docker.internal:11434 -v open-webui:/app/backend/data --restart always --pull=always ghcr.io/open-webui/open-webui:main
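To verify the container started cleanly and follow its startup logs:
docker ps
docker logs -f open-webui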
Access the Chat Interface
Open your browser and visit http://localhost:3000 to start interacting with DeepSeek R1 via a modern, user-friendly chat UI.
5. Optimizing Performance
You can adjust runtime parameters to make better use of your system. Note that these are Ollama model options (set in a Modelfile or with /set parameter inside an ollama run session), not command-line flags:
- CPU threads (num_thread): how many threads the CPU uses for inference.
- GPU layers (num_gpu): how many model layers are offloaded to the Apple silicon GPU via Metal.
- Batch size (num_batch): how many tokens the model processes at once; lower it to reduce memory pressure.
- Memory swap: monitor memory usage in Activity Monitor and switch to a smaller model or a lower batch size if macOS starts swapping.
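As a sketch of how these options are set (the values below are illustrative, not tuned recommendations), you can adjust them inside an interactive ollama run session:
/set parameter num_thread 10
/set parameter num_batch 256
Or bake them into a custom model via a Modelfile:
FROM deepseek-r1:14b
PARAMETER num_thread 10
PARAMETER num_batch 256
Then build and run it with ollama create deepseek-r1-tuned -f Modelfile followed by ollama run deepseek-r1-tuned.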
6. Monitor Usage and Benchmarking
Use Activity Monitor on macOS, or htop in Terminal, to track CPU and memory usage. For live GPU power and utilization, run:
sudo powermetrics --samplers gpu_power
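For a simple throughput benchmark, Ollama can print timing statistics (load time, prompt and generation token rates) after a response:
ollama run deepseek-r1:8b --verbose "Write a haiku about autumn."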
Conclusion
By following this process, you’ll have a fully functional local setup running DeepSeek R1 on your Mac mini M4 Pro, letting you handle AI tasks while keeping your data on-device and avoiding cloud dependency.