Run DeepSeek Locally on Your Mac mini M4 Pro

This guide walks through running DeepSeek R1 locally on a Mac mini M4 Pro with Ollama, then adds Docker and Open WebUI for a ChatGPT-like browser experience. Here's the process:

1. Install Ollama (the AI engine)

First, install the Ollama runtime to handle local AI models.

  • Download the Ollama app for macOS from https://ollama.com/download and drag it into Applications, or install it from Terminal with Homebrew:
  brew install ollama
  • Check if it’s installed by running:
  ollama --version
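
If you installed the Homebrew formula rather than the desktop app, the Ollama server does not start on its own. A quick check that it is up, assuming the default port 11434:

  ollama serve   # start the server; leave this running in its own Terminal tab
  curl http://localhost:11434/api/version   # in another tab: returns the version as JSON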

2. Download DeepSeek R1 Models

Choose a model size based on your unified memory. On a Mac mini M4 Pro, the 8B and 14B models (roughly 5 GB and 9 GB to download) run comfortably; the 32B model (roughly 20 GB) needs a higher-memory configuration; and the 70B model (roughly 40 GB) will exhaust memory on most Mac mini configurations and run slowly.

  • Pull a model:
  ollama pull deepseek-r1:8b  # Fast, lightweight
  ollama pull deepseek-r1:14b # Balanced performance
  ollama pull deepseek-r1:32b # Heavy processing
  ollama pull deepseek-r1:70b # Max reasoning, slowest
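
To confirm which models are downloaded and how much disk space each one takes, list them:

  ollama list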

3. Run DeepSeek R1 in Basic Mode

To test the model in Terminal (without the GUI):

ollama run deepseek-r1:8b

This opens an interactive chat session in the terminal; type /bye to exit.
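
Ollama also serves a local REST API (on port 11434 by default), which is handy for scripting. A minimal example against the 8b model pulled above:

  curl http://localhost:11434/api/generate -d '{
    "model": "deepseek-r1:8b",
    "prompt": "Explain unified memory in one sentence.",
    "stream": false
  }'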

4. Upgrade to a ChatGPT-Like Interface Using Docker and Open WebUI

For a better user experience, install Docker and Open WebUI for a browser-based interface similar to ChatGPT.

Install Docker

  • Download Docker Desktop for macOS (Apple silicon) from https://www.docker.com/products/docker-desktop/ and install it.
  • Open Docker and leave it running in the background.

Install Open WebUI

  • Run the following command in Terminal to set up Open WebUI (the container listens on port 8080 internally, mapped here to port 3000 on your Mac):
  docker run -d --name open-webui -p 3000:8080 -v open-webui-data:/app/backend/data --pull=always ghcr.io/open-webui/open-webui:main
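
Open WebUI normally detects a local Ollama automatically, but if no models appear in the interface you can point it at the host explicitly. A variant of the command above, assuming Ollama's default port 11434 (inside the container, host.docker.internal resolves to your Mac); remove the first container with docker rm -f open-webui before re-running:

  docker run -d --name open-webui -p 3000:8080 -e OLLAMA_BASE_URL=http://host.docker.internal:11434 -v open-webui-data:/app/backend/data --pull=always ghcr.io/open-webui/open-webui:main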

Access the Chat Interface

Open your browser and visit http://localhost:3000, create a local admin account on first launch, then pick a DeepSeek R1 model from the model selector to start chatting.
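
If the model selector is empty, verify from Terminal that Ollama is reachable and the models are installed:

  curl http://localhost:11434/api/tags   # lists the models Ollama is serving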

5. Optimize Performance

You can tune a few runtime settings to fit your hardware. Note that Ollama exposes these as model parameters, set in an interactive session or a Modelfile, rather than as command-line flags (a sketch of how to set them follows the list). Key parameters include:

  • num_thread: how many CPU threads the model uses.
  • num_gpu: how many layers are offloaded to the GPU (on Apple silicon, Ollama offloads to the integrated GPU via Metal by default).
  • num_batch: how many tokens the model processes per batch.
  • num_ctx: the context window size; larger values consume more memory.

If memory pressure climbs (see the next section), reduce num_ctx or switch to a smaller model.
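
Here is a minimal sketch of applying these settings in an interactive session, assuming the 8b model (exact parameter support can vary between Ollama versions):

  ollama run deepseek-r1:8b
  >>> /set parameter num_ctx 8192    # larger context window
  >>> /set parameter num_thread 10   # cap CPU thread usage

To make settings persistent, put them in a Modelfile (a FROM deepseek-r1:8b line followed by PARAMETER lines) and build a named variant with ollama create.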

6. Monitor Usage and Benchmark Performance

Use Activity Monitor (Window > GPU History shows GPU load) or htop in Terminal (brew install htop) to track CPU and memory usage. You can also use:

sudo powermetrics --samplers gpu_power

to stream live GPU power and frequency readings.
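
macOS also ships a memory_pressure utility, a quick way to see whether a model is pushing the system toward swap:

  memory_pressure   # reports system-wide memory pressure and the free-memory percentage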

Conclusion

By following this process, you’ll have a fully functional local setup running DeepSeek R1 on your Mac mini M4 Pro. This allows you to handle various AI tasks while keeping control of your data and avoiding cloud dependency.
