
Stop Waiting, Start Shipping — the Open AI Stack Grew Up in 2026

open-source, ai-strategy, digital-sovereignty, enterprise-ai, post-training, eu-ai-act, llms
Sebastian Raschka and Alexander C. S. Hendorf in the fireside chat "Stop Waiting, Start Shipping" at PyCon DE & PyData 2026, Darmstadt
Sebastian Raschka is a respected voice in modern AI, known for making complex machine learning and LLM concepts understandable without losing technical depth. His recent appearance on the Lex Fridman Podcast underlines his role as an educator and practitioner who connects research, implementation, and responsible AI development.

At a Glance

The Wrong Reflex of "Catching Up"

In the fireside chat Stop Waiting, Start Shipping at PyCon DE & PyData 2026 — Europe's leading conference for applied AI built on open source, with more than 2,000 attendees — I spoke with Sebastian Raschka, whose books on practical LLM work have become standard references, about the pressing questions around LLMs and AI architecture. He put a sentence on the table that the European sovereignty debate rarely articulates with such clarity: 99.9 percent of companies do not need to train their own model from scratch — doing so would be a waste no strategic argument can still justify.

That isn't resignation; quite the opposite. It shifts the question. Pre-training is not the decisive lever; the layer above it is. And that, Raschka answered when I asked where Europe should put its investment, is exactly where he would put the next euro: into post-training and the harness, not into trying to build a frontier base model in-house.

For German and European companies that take their AI strategy seriously, this is the central strategic insight of 2026. It replaces the "do we need our own ChatGPT" debate with a more productive question: where do we invest so that open models turn into value we own?

The "Winner Takes All" Mantra Is Dead

Mainstream coverage still leans on the race narrative — OpenAI versus Anthropic, US versus China. PyCon DE & PyData 2026 made it visible that this story no longer holds in 2026.

The ecosystem has become broader, not narrower: the Llama, Qwen, DeepSeek, Mistral, Gemma, and GPT-OSS families exist in parallel, each with its own strengths and characteristics. The more useful question is no longer "which model wins" but "which model family fits which use case". Coding agents benefit from different models than legal research, which differs again from medical classification.

Mistral, as a European provider, is the case in point. By Raschka's reading in the fireside chat, Mistral Large 3 builds structurally on the DeepSeek-V3 architecture — an open architecture post-trained, not reinvented from scratch. A Mistral technical report officially documenting this architectural lineage is not publicly available; the claim is a practitioner's reading, not a Mistral statement. The strategic point still holds: in 2026, taking an open architecture as a starting point is not a weakness but the economically and technically sensible move. No one in 2026 should train a base model from scratch when five comparable open-weight options are on the table.

Post-Training Is the Lever — Composer-3 as Case Study

Cursor, one of the most successful coding tools of 2025/26, runs its own coding model, "Composer-3", which is significantly better than most available LLMs for coding tasks. Cursor itself has not officially confirmed the underlying base model; Raschka's practitioner reading in the chat is Kimi K2.5. Either way, the lesson is unambiguous: the production gain came from post-training, not from picking the right base model.

Anyone designing a comparable programme in 2026 should not copy a specific model selection. They should copy the investment structure — delegate pre-training, keep post-training and harness in-house.

For German and European companies working in regulated industries this logic is doubly valuable: post-training on an open model creates an inference layer that is traceable, documentable, and auditable at run time. Where the limits sit — for example around training-data auditability of the base model — is developed in the Compliance section below.

Local Models Are Productive in 2026

The argument that local models are regulatorily attractive but technically uncompetitive no longer holds in 2026. Raschka reported running a Qwen 3.5 27B model on consumer-grade hardware well enough that it covers most OpenCode use cases — free, local, no API tokens burned.
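In practice, "local, no API tokens" usually means an OpenAI-compatible HTTP endpoint on your own machine. A minimal sketch, assuming a local runtime such as Ollama or a llama.cpp server listening on localhost — the port, endpoint path, and model tag below are illustrative assumptions, not details from the talk:

```python
import json
import urllib.request

def chat_payload(model: str, prompt: str, temperature: float = 0.2) -> dict:
    """Build an OpenAI-style chat-completion request body."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": temperature,
    }

def ask_local(prompt: str,
              model: str = "qwen3.5:27b",  # illustrative tag; adjust to your runtime
              url: str = "http://localhost:11434/v1/chat/completions") -> str:
    """Send a prompt to a local OpenAI-compatible endpoint and return the answer.

    No API key, no external data flow: the request never leaves localhost.
    """
    body = json.dumps(chat_payload(model, prompt)).encode("utf-8")
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        data = json.loads(resp.read())
    return data["choices"][0]["message"]["content"]
```

Any tool that speaks the OpenAI API — including most coding assistants — can be pointed at such an endpoint by swapping the base URL, which is what makes the "free, local" path drop-in rather than a rewrite.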

Examples from the consumer side reinforce the point. Live translation on AirPods runs as a small model on the iPhone, with roughly two seconds of latency. That isn't GPT-4 quality, but it is live, private, and independent of server-side data flows. That profile — good enough for the use case, and in return sovereign and low-latency — is often the strategically more relevant configuration for enterprise applications.

Gabriela Bogk, CISO at Mobile.de and a long-time member of the Chaos Computer Club, drew the same line from a German security mandate's perspective in her keynote: for sensitive data, locally runnable models are the more obvious choice in 2026. "If you have the need to protect your data a little bit better or if you have the want to protect your data a little bit better, run local models, absolutely." She pointed to concrete hardware paths — Macs with unified memory, GPUs left over from gaming or mining setups — and to the maturity of downloadable models: not frontier-grade, but sufficient for many enterprise use cases, without data flowing to US servers. From the CISO perspective of a German enterprise, that is the operational confirmation of what Raschka frames strategically.

Concretely: internal knowledge bases, coding assistance on confidential code, classification tasks in regulated pipelines, document-grounded Q&A systems — all of this has, in 2026, a locally runnable open-weight path that did not exist twelve months ago. For anyone who treats sovereignty as an architectural decision rather than a compliance burden, that is a genuinely new option.

An Honest Reading of Model Choice

Anyone proposing open-weight models to a German bank or insurer first gets a board-level question that often goes missing in technical discussions: where does the model come from? The productively usable open-weight families in 2026 split into three geopolitical clusters: Chinese models (Qwen, DeepSeek, Kimi), US-trained models (Llama, Gemma, GPT-OSS), and European-trained models (Mistral). Ignoring this costs the argument credibility in the boardroom immediately.

The candid reading — as a personal assessment, not formal compliance counsel, which has to be settled with regulators and legal in any specific case — looks roughly like this:

This differentiation does not weaken the sovereignty argument — it sharpens it. Sovereignty means choosing the model deliberately and in context, not reflexively grabbing "the best open-weight model on the benchmark".

What This Means for German and European Strategy

If the strategically right investment is post-training and harness, two concrete consequences follow for programmes starting up across DACH and Europe in 2026:

Compliance: From Obstacle to Argument

For regulated industries, the compliance logic itself shifts in 2026. The EU AI Act and the maturing DORA and MaRisk expectations don't primarily demand the best model — they demand traceability, data provenance, auditability, and proportional risk handling.

Where exactly open-weight models help deserves precise differentiation. Open-weight means: model weights are accessible, inference runs on your own infrastructure, run-time audit logs are produced under your own control, the data flow at inference time is yours. What open-weight does not automatically solve: training-data auditability — most productive open-weight models also do not fully document their training corpora. Compliance arguments anchored on training-data provenance must separate open-weight from open-data. Both exist, but rarely in the same model.
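What "run-time audit logs produced under your own control" can look like at its simplest — a minimal sketch; the field names and hashing choices here are my own assumptions, not a regulatory template:

```python
import hashlib
import json
import time

def audit_record(model_id: str, prompt: str, response: str) -> dict:
    """One inference audit entry.

    Hashes stand in for raw text, so the log itself stays free of
    sensitive content while each exchange remains verifiable after
    the fact against the stored prompt/response pair.
    """
    def h(s: str) -> str:
        return hashlib.sha256(s.encode("utf-8")).hexdigest()
    return {
        "ts": time.time(),
        "model_id": model_id,  # e.g. weights-file digest or registry tag
        "prompt_sha256": h(prompt),
        "response_sha256": h(response),
    }

def append_audit(path: str, record: dict) -> None:
    """Append-only JSON Lines log on infrastructure you control."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")
```

Because inference runs on your own infrastructure, this log never transits a third party — which is exactly the property a proprietary API cannot give you, however good its own audit tooling is.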

The proprietary providers, in turn, are no longer pure black boxes: OpenAI Enterprise, Anthropic Claude for Work, Azure OpenAI offer data residency, zero-retention modes, BAA contracts, and inference-level audit logs. The honest difference: no model weights in your hands, no training-data inspection, no direct access to model behaviour. For many use cases that is entirely sufficient. For use cases where model inspection or in-house post-training is part of the compliance argument, it is not.

Where open-weight tips economically is a rule-of-thumb question: above a monthly token volume that, in experience from advisory work, sits roughly in the single-digit to low double-digit millions, self-hosting becomes viable for a typical coding or knowledge use case. That is an order of magnitude, not a study; the exact threshold depends on hardware, energy cost, operations and personnel overhead, and latency requirements. Anyone running the TCO calculation for their own programme should run it themselves rather than lift it from this post.
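The order-of-magnitude claim can be made transparent with a small break-even calculation. All numbers below are placeholder assumptions for illustration, to be replaced with real quotes from your own programme:

```python
def breakeven_tokens_per_month(api_price_per_mtok: float,
                               selfhost_fixed_per_month: float,
                               selfhost_marginal_per_mtok: float = 0.0) -> float:
    """Monthly token volume (in millions) above which self-hosting is cheaper.

    API cost:        api_price_per_mtok * volume
    Self-host cost:  fixed (hardware amortisation, energy, ops/personnel)
                     + selfhost_marginal_per_mtok * volume
    """
    delta = api_price_per_mtok - selfhost_marginal_per_mtok
    if delta <= 0:
        return float("inf")  # self-hosting never catches up on marginal cost
    return selfhost_fixed_per_month / delta

# Placeholder numbers, not from the talk: a premium API at 15 EUR per
# million tokens vs. a consumer-grade GPU box amortised at 150 EUR/month
# with negligible marginal cost.
print(breakeven_tokens_per_month(15.0, 150.0))  # → 10.0 million tokens/month
```

The point of writing it down is not the result but the shape: the threshold moves linearly with fixed cost and inversely with the API price gap, which is why it has to be recomputed per use case rather than quoted.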

What follows is an argument that did not carry twelve months ago: for a specific class of applications, compliance becomes a driver for open models rather than a brake on them. That shift is operationally tangible in board discussions for the first time in 2026 — not as a blanket rule, but as a use-case-specific architectural decision.

Stop Waiting, Start Shipping

Stage setup for the fireside chat "Stop Waiting, Start Shipping — Real-World Strategy for Open-Source LLMs" with Sebastian Raschka and Alexander C. S. Hendorf, PyCon DE & PyData 2026

Raschka's closing advice in the fireside chat was simple and direct: don't wait, get started. Try things out. Don't over-plan — what you plan exhaustively today is irrelevant tomorrow.

That advice falls outside the logic of the "big, deeply considered decision" in which many European programmes have been stuck. In 2026 that is exactly the point: whoever starts now — clear use case, modest investment, open-weight model, post-training on their own data — builds experience and leverage, a head start that latecomers can only close with significant effort.

Waiting is not an option. Catching up is not an option either. The right move is making open models productive now — with the seriousness this deserves.

Open-source stack as strategic investment, not catch-up. Where does the lever sit in your programme?

Let's talk

Related links

  1. Stop Waiting, Start Shipping: Real-World Strategy for Open-Source LLMs — Fireside Chat with Sebastian Raschka (2026-04)
  2. Sebastian Raschka — Personal Website
  3. Build a Large Language Model (From Scratch) — Sebastian Raschka
  4. Honey, I vibe coded some crypto — Security in the age of LLMs (Keynote) — Gabriela Bogk (CISO Mobile.de) (2026-04)
Based on the fireside chat "Stop Waiting, Start Shipping: Real-World Strategy for Open-Source LLMs" with Sebastian Raschka, PyCon DE & PyData 2026. Several claims in this piece — for example on the architectural lineage of Mistral Large 3 / DeepSeek-V3, or on the base model behind Cursor's Composer-3 — are marked as Raschka's practitioner reading; official confirmations from the respective providers are not available. Note: This piece is published before the public release of the cited conference recordings, which are expected in summer 2026.