🤖 Buying guide

Best laptop or PC for local AI (2026)

If you want to run AI models on your own machine — without depending on cloud servers — memory is the most critical factor. Language models like Llama, Mistral or Gemma need to fit fully in RAM to work. With the wrong machine, a 13B-parameter model simply won't start.

Which components matter — and how much

Not all components carry the same weight for running local AI. Here's what you actually need and why.

Memory (RAM)

Critical

Language models load entirely in memory. A 7B model in Q4 quantization uses ~5 GB; 13B uses ~9 GB; 34B uses ~20 GB; 70B uses ~40 GB. If your RAM doesn't have room for the model, it simply won't start. On Apple Silicon, unified memory serves as both RAM and GPU memory — this changes the equation completely.

Minimum: 16 GB
Recommended: 28–32 GB
Ideal: 64 GB+
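
The Q4 figures above follow a simple rule of thumb, sketched here in Python. The `bits_per_weight`, `overhead_gb` and `reserved_gb` values are our own illustrative assumptions (real numbers vary by quantization format, context length and operating system), so treat the results as estimates rather than measurements.

```python
def estimate_q4_memory_gb(params_billion, bits_per_weight=4.5, overhead_gb=1.0):
    """Rough memory footprint in GB for a Q4-quantized model.

    bits_per_weight and overhead_gb are illustrative assumptions,
    not measured values.
    """
    # billions of parameters * bits per weight / 8 bits per byte = GB of weights
    weights_gb = params_billion * bits_per_weight / 8
    return weights_gb + overhead_gb


def fits_in_memory(params_billion, total_ram_gb, reserved_gb=4.0):
    """True if the estimate fits after reserving RAM for the OS and other apps."""
    return estimate_q4_memory_gb(params_billion) <= total_ram_gb - reserved_gb


for size in (7, 13, 34, 70):
    print(f"{size}B: ~{estimate_q4_memory_gb(size):.1f} GB")
```

With these assumptions the estimates land within about 1 GB of the figures quoted above, and `fits_in_memory(34, 28)` returns `True` while `fits_in_memory(70, 32)` returns `False`.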

Graphics card (GPU)

Important

On Windows and Linux, a GPU with enough VRAM hugely accelerates inference via CUDA (NVIDIA) or ROCm (AMD). Without a dedicated GPU the model runs CPU-only, which is much slower. On Mac, the M chip integrates CPU, GPU and Neural Engine sharing unified memory — no extra GPU needed and performance is surprisingly good.

Minimum: No GPU (CPU only, ~3–5 tokens/s on 7B models)
Recommended: RTX 4060 8 GB VRAM (Windows/Linux)
Ideal: RTX 4090 24 GB VRAM or Mac with 48–64 GB unified
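
When a model is larger than the card's VRAM, many runtimes can keep part of the model on the GPU and leave the rest in system RAM. The sketch below estimates that split. It assumes every layer is the same size, and the layer counts and the `vram_reserved_gb` margin are hypothetical values chosen for illustration; real runtimes decide this with their own logic.

```python
def split_layers(model_gb, n_layers, vram_gb, vram_reserved_gb=1.5):
    """Return (gpu_layers, cpu_layers), assuming all layers are equally sized.

    vram_reserved_gb is an assumed margin for context cache and display use.
    """
    per_layer_gb = model_gb / n_layers
    usable_gb = max(vram_gb - vram_reserved_gb, 0.0)
    gpu_layers = min(n_layers, int(usable_gb // per_layer_gb))
    return gpu_layers, n_layers - gpu_layers


# A ~5 GB model with a hypothetical 32 layers on an 8 GB card
print(split_layers(5, 32, 8))
# A ~40 GB model with a hypothetical 80 layers on a 24 GB card
print(split_layers(40, 80, 24))
```

Under these assumptions the 5 GB model sits entirely on the GPU, while the 40 GB model leaves 35 of its 80 layers in system RAM, which is why large models run noticeably slower on a single 24 GB card.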

Processor (CPU)

Secondary

With a GPU for inference, the CPU has little impact. For CPU-only inference, more cores help but the CPU is not the main bottleneck. Apple Silicon chips (M4, M5) are especially efficient, although the tools covered here accelerate LLM inference mainly through the integrated GPU (via Metal) rather than the Neural Engine.

Minimum: Any modern 4+ core processor
Recommended: Apple M4 / M5 · Intel Core i7 · Ryzen 7 (recent generation)
Ideal: Apple M4 Pro / M5 Pro · AMD Ryzen 9

Storage

Important

Models are stored on disk and loaded into RAM when run. A 7B model takes 4–8 GB on disk; a 70B can reach 40+ GB. You need space for several models and a fast NVMe drive to reduce initial load times.

Minimum: 512 GB NVMe
Recommended: 1 TB NVMe
Ideal: 2 TB NVMe
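
To see why drive speed matters for load times, here is a best-case estimate that divides model size by sequential read speed. The throughput figures in `DRIVES` are round numbers we assume for typical drives, not benchmarks, and real load times include extra overhead.

```python
# Assumed sequential read speeds in GB/s (illustrative, not benchmarks)
DRIVES = {
    "SATA SSD": 0.5,
    "NVMe (PCIe 3.0)": 3.0,
    "NVMe (PCIe 4.0)": 6.0,
}


def load_time_seconds(model_gb, read_gb_per_s):
    """Best-case time to read a model file from disk into memory."""
    return model_gb / read_gb_per_s


for drive, speed in DRIVES.items():
    print(f"70B model (40 GB) from {drive}: ~{load_time_seconds(40, speed):.0f} s")
```

With these assumed speeds, a 40 GB model takes over a minute to load from a SATA SSD and well under 15 seconds from an NVMe drive.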

Mac vs PC for local AI

For local AI, choosing between Mac and Windows/Linux significantly affects performance and experience. Here are the real differences:

Mac (Apple Silicon)
  • Unified memory: CPU and GPU share the same RAM pool, with no slow transfers between separate memories
  • At the same price, you get more gigabytes available for models than with a dedicated Windows GPU
  • Exceptional energy efficiency: the fanless MacBook Air runs inference silently and still offers all-day battery
  • Perfect compatibility with Ollama, LM Studio and Jan.ai without configuring drivers or environments

🖥️ PC (Windows / Linux)
  • RTX 4090 with 24 GB of dedicated VRAM: superior for models that fit completely in the GPU
  • Higher ceiling for pure-inference speed with high-end GPUs
  • More budget options (from €700 to workstations)
  • Ideal if you also need to train models with CUDA or work with PyTorch/JAX

⚖️ Our verdict

For budgets up to €2,000, the MacBook Air M5 with 28 GB or the Mac mini M4 Pro is the most balanced choice: more memory for models, low power draw and zero setup friction. If you need to train models or have the budget for an RTX 4090, a Windows PC can beat the Mac in pure inference speed with large models.

Model            Parameters   RAM/VRAM (Q4)
Llama 3.2 3B     3B           2 GB
Llama 3.1 8B     8B           5 GB
Llama 3.1 70B    70B          40 GB
Mistral 7B       7B           4 GB
Gemma 2 27B      27B          16 GB
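
As a quick way to read the table, this sketch lists which of these models fit in a given amount of memory. The sizes are copied from the table; the `reserved_gb` margin for the operating system and other apps is our own assumption.

```python
# Q4 memory needs in GB, taken from the table above
MODELS_Q4_GB = {
    "Llama 3.2 3B": 2,
    "Llama 3.1 8B": 5,
    "Llama 3.1 70B": 40,
    "Mistral 7B": 4,
    "Gemma 2 27B": 16,
}


def models_that_fit(memory_gb, reserved_gb=4):
    """Names of the models whose Q4 size fits after reserving some memory.

    reserved_gb is an assumed margin for the OS and other applications.
    """
    budget_gb = memory_gb - reserved_gb
    return [name for name, gb in MODELS_Q4_GB.items() if gb <= budget_gb]


for memory in (16, 28, 48):
    print(f"{memory} GB: {', '.join(models_that_fit(memory))}")
```

With a 4 GB margin, 16 GB covers the 3B to 8B models, 28 GB adds Gemma 2 27B, and 48 GB is the first size here that also fits Llama 3.1 70B.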

How much do I need to spend?

€700–1,200
Entry-level

Comfortably run models up to 7–9B parameters (Llama 3.1 8B, Mistral 7B, Gemma 2 9B). Enough for a code assistant, summaries and conversational chat. Response times are reasonable for personal use.

MacBook Air M4 16 GB · Desktop PC with RTX 4060 8 GB

€1,200–1,800
Mid-range — sweet spot
Recommended

With 28–32 GB you run models up to 34B parameters in Q4 quantization (Code Llama 34B, Qwen 32B). Response quality jumps noticeably over 7B models. This is the range where local AI becomes genuinely comfortable for daily use.

MacBook Air M5 28 GB — the best choice in this range in 2026

€2,000+
High-end

With 48–64 GB you run quantized 70B models (Llama 3.1 70B, DeepSeek 67B) with fluid responses. Output quality gets close to premium cloud models, fully offline and with no per-query costs.

Mac mini M4 Pro 48 GB · Mac Studio · PC with RTX 4090 24 GB

Our picks

The machines we'd buy in 2026 for each profile.

⭐ Our pick

MacBook Air M5 — 28 GB

Best value

The sweet spot for local AI in 2026. With 28 GB of unified memory it runs models up to 34B in Q4 quantization smoothly, fanless and with all-day battery. Zero configuration: Ollama works in one command. It's the machine we'd pick for daily use combining programming and local AI.

  • 28 GB unified memory (RAM + GPU)
  • Apple M5 with Neural Engine
  • 512 GB – 2 TB NVMe
  • Up to 18h battery
🇪🇸 Spain From €1,499
🌎 LATAM ~1,400 USD

Mac mini M4 Pro — 48 GB

Maximum performance

For those who want maximum desktop power and to run 70B models: the Mac mini M4 Pro with 48 GB is the most efficient machine on the market for local AI below €2,500. The Pro chip adds more CPU and GPU cores than the Air, notably speeding up inference.

  • 48 GB unified memory
  • Apple M4 Pro — 14 CPU cores
  • 20-core GPU
  • Ultra-fast NVMe SSD
🇪🇸 Spain From €1,999
🌎 LATAM ~1,800 USD

Desktop PC with RTX 4070 Super — 32 GB RAM

Most affordable

If you prefer Windows or plan to train models with CUDA, a desktop with RTX 4070 Super (12 GB VRAM) and 32 GB of system RAM offers great flexibility. Models that fit in the 12 GB VRAM run at top speed; the rest use system RAM. Also the best option for Stable Diffusion and other image AI.

  • 32 GB DDR5
  • RTX 4070 Super 12 GB VRAM
  • Ryzen 7 7700 or Intel i7-14700
  • 1 TB NVMe
🇪🇸 Spain From €1,400
🌎 LATAM ~1,200 USD

FAQ

Which AI models can I run with 16 GB of RAM?
With 16 GB you comfortably run models of up to 7–9B parameters in Q4 quantization: Llama 3.1 8B, Mistral 7B, Gemma 2 9B, Phi-3.5. More than enough for a code assistant, summaries and chat. For 13B models you'd need to close almost everything else; 14B+ already requires 28–32 GB.
What tools do I use to run local AI?
Ollama is the simplest option: it installs with one command and supports virtually all popular models (Llama, Mistral, Gemma, Qwen, DeepSeek…). For a ChatGPT-like UI, LM Studio and Jan.ai are the most popular. All three run on Mac (with Metal acceleration), Windows and Linux.
Is Mac or Windows better for local AI on a mid-range budget?
At €1,200–1,800 the Mac clearly wins: 28–32 GB of unified memory lets you run 34B models that wouldn't fit in the 8–12 GB VRAM of a mid-range Windows GPU. Apple Silicon's unified memory completely changes the equation at this price range.
Do I need internet to use local AI?
Only to download the model the first time. Once downloaded, it works 100% offline. This is one of the main advantages over cloud services: full privacy of conversations, no per-query costs and availability anywhere.
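
To illustrate the offline workflow mentioned in the FAQ, here is a minimal Python sketch that sends a prompt to a locally running Ollama server through its HTTP API. It assumes Ollama is already installed and listening on its default port, and that the model tag `llama3.1:8b` has been downloaded; the tag and the helper names `build_request` and `ask` are our own choices for this example.

```python
import json
import urllib.request

# Default address of a local Ollama server (assumes a standard install)
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(prompt, model="llama3.1:8b"):
    """Build a non-streaming generate request for the local server."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt, model="llama3.1:8b", timeout=120):
    """Send the prompt and return the generated text.

    Requires the Ollama server to be running and the model to be downloaded.
    """
    with urllib.request.urlopen(build_request(prompt, model), timeout=timeout) as resp:
        return json.loads(resp.read())["response"]
```

Calling `ask("Explain unified memory in one sentence.")` returns the model's answer as a string. The request only goes to `localhost`, so it keeps working with the network disconnected.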
