Stop Waiting, Start Shipping — the Open AI Stack Grew Up in 2026
Sebastian Raschka is a respected voice in modern AI, known for making complex machine learning and LLM concepts understandable without losing technical depth. His recent appearance on the Lex Fridman Podcast underlines his role as an educator and practitioner who connects research, implementation, and responsible AI development.
At a Glance
- In the fireside chat, Sebastian Raschka pushes back hard on the assumption that every enterprise needs to train its own base model from scratch — his answer: not for 99.9% of them. That moves the European sovereignty debate away from any single company's catch-up project and toward the question of where to put the money instead.
- In 2026, open models pull their weight on three axes: on capability they have closed most of the gap to the proprietary frontier and are practically equivalent for many applications; above a certain usage volume they become cheaper than API consumption; and for relevant applications they run reliably on your own infrastructure.
- The strategically right investment for Europe — and for any European enterprise — is post-training and harness, not the base-model race.
The Wrong Reflex of "Catching Up"
In the fireside chat Stop Waiting, Start Shipping at PyCon DE & PyData 2026 — Europe's leading conference for applied AI built on open source, with more than 2,000 attendees — I spoke with Sebastian Raschka, whose books on practical LLM work have become standard references, about the most pressing questions around LLMs and AI architecture. He put one sentence on the table that the European sovereignty debate rarely articulates this clearly: 99.9 percent of companies do not need to train their own model from scratch — doing so would be a waste no strategic argument can still justify.
That isn't resignation; quite the opposite. It relocates the question: pre-training is not the decisive lever, the layer above it is. And that, Raschka answered when I asked where Europe should put its investment, is exactly where he would put the next euro: into post-training and harness, not into trying to build a frontier base model in-house.
For German and European companies that take their AI strategy seriously, this is the central strategic insight of 2026. It replaces the "do we need our own ChatGPT" debate with a more productive question: where do we invest so that open models turn into value we own?
The "Winner Takes All" Mantra Is Dead
Mainstream coverage still leans on the race narrative — OpenAI versus Anthropic, US versus China. PyCon DE & PyData 2026 made it visible that this story no longer holds in 2026.
The ecosystem has become broader, not narrower: Llama, Qwen, DeepSeek, Mistral, Gemma, GPT-OSS families exist in parallel, each with its own strengths and characteristics. The more useful question is no longer "which model wins" — it is "which model family fits which use case". Coding agents benefit from different models than legal research, which differs again from medical classification.
Mistral, as a European provider, is the case in point. By Raschka's reading in the fireside chat, Mistral Large 3 builds structurally on the DeepSeek-V3 architecture — an open architecture post-trained, not reinvented from scratch. A Mistral technical report officially documenting this architectural lineage is not publicly available; the claim is a practitioner's reading, not a Mistral statement. The strategic point still holds: in 2026, taking an open architecture as a starting point is not a weakness but the economically and technically sensible move. No one in 2026 should train a base model from scratch when five comparable open-weight options are on the table.
Post-Training Is the Lever — Composer-3 as Case Study
Cursor, one of the most successful coding tools of 2025/26, runs its own coding model, "Composer-3", which is significantly better than most available LLMs for coding tasks. Cursor itself has not officially confirmed the underlying base model; Raschka's practitioner reading in the chat is Kimi K2.5. Either way, the lesson is unambiguous: the production gain came from post-training, not from picking the right base model.
Anyone designing a comparable programme in 2026 should not copy a specific model selection. They should copy the investment structure — delegate pre-training, keep post-training and harness in-house.
For German and European companies working in regulated industries this logic is doubly valuable: post-training on an open model creates an inference layer that is traceable, documentable, and auditable at run time. Where the limits sit — for example around training-data auditability of the base model — is developed in the Compliance section below.
Local Models Are Productive in 2026
The argument that local models are regulatorily attractive but technically uncompetitive no longer holds in 2026. Raschka reported running a Qwen 3.5 27B model on consumer-grade hardware well enough that it covers most OpenCode use cases — free, local, no API tokens burned.
Examples from the consumer side reinforce the point. Live translation in the AirPods runs as a small model on the iPhone, with roughly two seconds of latency. That isn't GPT-4 quality, but it is live, private, and independent of server-side data flows. That profile — good enough for the use case, in exchange sovereign and low-latency — is often the more strategically relevant configuration for enterprise applications.
Gabriela Bogk, CISO at Mobile.de and a long-time member of the Chaos Computer Club, drew the same line from a German security mandate's perspective in her keynote: for sensitive data, locally runnable models are the more obvious choice in 2026. "If you have the need to protect your data a little bit better or if you have the want to protect your data a little bit better, run local models, absolutely." She pointed to concrete hardware paths — Macs with unified memory, GPUs left over from gaming or mining setups — and to the maturity of downloadable models: not frontier-grade, but sufficient for many enterprise use cases, without data flowing to US servers. From the CISO perspective of a German enterprise, that is the operational confirmation of what Raschka frames strategically.
Concretely: for internal knowledge bases, coding assistance on confidential code, classification tasks in regulated pipelines, document-grounded Q&A systems — all of this has, in 2026, a locally runnable open-weight path that did not exist twelve months ago. For anyone who treats sovereignty as an architectural decision rather than a compliance burden, that is a genuinely new option.
An Honest Reading of Model Choice
Anyone proposing open-weight models to a German bank or insurer gets a board-level question first that often goes missing in technical discussions: where does the model come from? The productively usable open-weight families in 2026 split into three geopolitical clusters — Chinese models (Qwen, DeepSeek, Kimi), US-trained models (Llama, Gemma, GPT-OSS), and European-trained models (Mistral). Ignoring this immediately costs the argument credibility in the boardroom.
The candid reading — as personal assessment, not formal compliance counsel, which has to be settled with regulators and legal in any specific case — looks roughly like this:
- For coding assistance, technical classification, document-grounded search: the political-bias question of the base model is mostly irrelevant. Raschka put it bluntly in the chat: for a coding agent, the political stance of the base model doesn't matter.
- For content-generating applications with external visibility (customer communication, contract drafts, editorial work): the training cluster of the base model becomes part of the risk assessment. Continued pre-training on a proprietary corpus can systematically correct bias — Raschka confirmed the path — but costs time and compute.
- For regulated banking and insurance use cases: Mistral, Llama, or Gemma are the natural first candidates because their training provenance sits within the EU or US frame. Chinese open-weight models remain technically interesting but have to enter procurement with the compliance question attached.
What This Means for German and European Strategy
If the strategically right investment is post-training and harness, two concrete consequences follow for programmes starting up across DACH and Europe in 2026:
- The hiring focus shifts. The bottleneck hire in 2026 is the domain expert with appetite for experimentation, not the ML PhD. Someone who can fine-tune a model is useful — someone who knows the business model and the data is indispensable.
- Open source is not the budget option, it is the structurally more mature choice. The notion that for serious enterprise applications a proprietary model is the "safe" pick reverses in 2026 across several axes — on compliance, data sovereignty, long-term TCO, and the ability to understand and control the system, the open stack now has the upper hand.
Compliance: From Obstacle to Argument
For regulated industries, the compliance logic itself shifts in 2026. The EU AI Act and the maturing DORA and MaRisk expectations don't primarily demand the best model — they demand traceability, data provenance, auditability, and proportional risk handling.
Where exactly open-weight models help deserves precise differentiation. Open-weight means: model weights are accessible, inference runs on your own infrastructure, run-time audit logs are produced under your own control, the data flow at inference time is yours. What open-weight does not automatically solve: training-data auditability — most productive open-weight models also do not fully document their training corpora. Compliance arguments anchored on training-data provenance must separate open-weight from open-data. Both exist, but rarely in the same model.
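What "run-time audit logs under your own control" can look like in practice is easy to sketch. The following is a minimal, hypothetical Python example — the backend call is a stub standing in for whatever local inference engine is used (llama.cpp, vLLM, or similar), and all names are illustrative, not an API from the article:

```python
import hashlib
import json
import time

def run_inference(prompt: str) -> str:
    # Stub: in a real deployment this would call the locally hosted
    # open-weight model. The point is that the call never leaves
    # infrastructure you control.
    return f"[model output for: {prompt[:30]}]"

def audited_completion(prompt: str, model_id: str, user: str,
                       log_path: str = "audit.log") -> str:
    """Run a completion and append a tamper-evident audit entry.

    Hashing prompt and output (rather than storing them verbatim)
    keeps the log auditable without retaining sensitive content.
    """
    started = time.time()
    output = run_inference(prompt)
    entry = {
        "ts": started,
        "model": model_id,  # pinned open-weight checkpoint, e.g. a model hash
        "user": user,
        "prompt_sha256": hashlib.sha256(prompt.encode()).hexdigest(),
        "output_sha256": hashlib.sha256(output.encode()).hexdigest(),
        "latency_s": round(time.time() - started, 3),
    }
    with open(log_path, "a") as f:
        f.write(json.dumps(entry) + "\n")
    return output
```

The design choice worth noting: because inference runs on your own stack, the log format, retention, and access control are yours to define — exactly the property the compliance argument rests on.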
The proprietary providers, in turn, are no longer pure black boxes: OpenAI Enterprise, Anthropic Claude for Work, Azure OpenAI offer data residency, zero-retention modes, BAA contracts, and inference-level audit logs. The honest difference: no model weights in your hands, no training-data inspection, no direct access to model behaviour. For many use cases that is entirely sufficient. For use cases where model inspection or in-house post-training is part of the compliance argument, it is not.
Where open-weight tips economically is a rule-of-thumb question: above a monthly token volume that, by experience from advisory work, sits roughly in the single-digit to low double-digit millions, self-hosting becomes economically viable for a typical coding or knowledge use case. That is an order of magnitude, not a study — the exact threshold depends on hardware, energy cost, operations and personnel overhead, and latency requirements. Anyone building the TCO case for their own programme should run the numbers themselves, not lift them from this post.
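The break-even arithmetic behind that rule of thumb fits in a few lines. The numbers below are purely illustrative assumptions (API price, hardware cost, amortisation period), not figures from the article — the structure, not the values, is the point:

```python
# Back-of-envelope self-hosting break-even. All inputs are
# illustrative placeholders; substitute your own pricing and costs.

def monthly_api_cost(tokens_m: float, usd_per_m_tokens: float) -> float:
    """API spend for a monthly volume given in millions of tokens."""
    return tokens_m * usd_per_m_tokens

def monthly_selfhost_cost(hw_capex: float, amort_months: int,
                          power_ops: float, personnel: float) -> float:
    """Amortised hardware plus running costs; roughly volume-independent."""
    return hw_capex / amort_months + power_ops + personnel

def breakeven_tokens_m(usd_per_m_tokens: float, hw_capex: float,
                       amort_months: int, power_ops: float,
                       personnel: float) -> float:
    """Monthly token volume (millions) above which self-hosting wins."""
    fixed = monthly_selfhost_cost(hw_capex, amort_months, power_ops, personnel)
    return fixed / usd_per_m_tokens

# Illustrative scenario: $6,000 of hardware over 36 months, $30/month
# power and ops, personnel absorbed by existing staff (a generous
# assumption), against an assumed $15 per million API tokens.
threshold = breakeven_tokens_m(15.0, 6000, 36, 30.0, 0.0)
```

With these assumed inputs the threshold lands around 13 million tokens per month — inside the "single-digit to low double-digit millions" corridor. Note how sensitive the result is to the personnel line: pricing in even a fraction of an engineer shifts the break-even by an order of magnitude, which is why the post insists you run this yourself.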
What follows is an argument that did not carry twelve months ago: for a specific class of applications, compliance becomes a driver for open models rather than a brake on them. That shift is operationally tangible in board discussions for the first time in 2026 — not as a blanket rule, but as a use-case-specific architectural decision.
Stop Waiting, Start Shipping
Raschka's closing advice in the fireside chat was simple and direct: don't wait, get started. Try things out. Don't over-plan — what you plan exhaustively today is irrelevant tomorrow.
That advice falls outside the logic of the "big, deeply considered decision" in which many European programmes have been stuck. In 2026 that is exactly the point: whoever starts — clear use case, modest investment, open-weight model, post-training on their own data — builds experience and leverage, and that head start is hard for late starters to close without significant effort.
Waiting is not an option. Catching up is not an option either. The right move is making open models productive now — with the seriousness this deserves.
Open-source stack as strategic investment, not catch-up. Where does the lever sit in your programme?
Let's talk.

Related links
- Stop Waiting, Start Shipping: Real-World Strategy for Open-Source LLMs — Fireside Chat with Sebastian Raschka (2026-04)
- Sebastian Raschka — Personal Website
- Build a Large Language Model (From Scratch) — Sebastian Raschka
- Honey, I vibe coded some crypto — Security in the age of LLMs (Keynote) — Gabriela Bogk, CISO Mobile.de (2026-04)