SAP Just Paid Over €1B for an AI Most People Have Never Heard Of
SAP just paid over €1 billion for an AI lab almost no one outside enterprise data science has heard of. The target: Prior Labs, a German startup that builds Tabular Foundation Models. The reason: their model, TabPFN, delivers in 2.8 seconds what hand-tuned ensembles need four hours to match.
The deal landed on May 4, 2026. It is the largest acquisition SAP has signed in years. And it is not aimed at chatbots or copilots. It is aimed at the boring data that actually runs your company — rows, columns, customer IDs, payment dates.
What SAP Actually Bought
The headline number: €1 billion+ committed over four years to scale Prior Labs. The deal is set to close in Q2 or Q3 of 2026, pending regulatory approval. Prior Labs will keep operating independently. SAP is positioning it as a “globally leading frontier AI lab in Europe.”
Read that phrase again. SAP is not buying a tool. SAP is buying sovereignty. Europe has watched the US and China dominate frontier model labs for three years. Prior Labs gives SAP a flag to plant.
And in the same week, SAP also closed its acquisition of Dremio, the open-source data lakehouse company. Two acquisitions in seven days. The pattern is obvious: SAP is building a unified data fabric, and Prior Labs is the brain that sits on top of it.
Why the Timing Matters
Enterprise AI spending in 2026 is shifting away from generic LLM wrappers. Boards want models that touch the structured data inside ERPs and CRMs. That is where the money lives. SAP’s customers run 77% of the world’s transaction revenue through SAP systems. None of that is text. All of it is tables.
What is a Tabular Foundation Model?
You have heard of foundation models for text. GPT, Claude, Gemini. They are trained on the open internet to predict the next token.
Tabular Foundation Models do something different. They are pre-trained to predict outcomes from rows and columns of data — the kind of data that lives in spreadsheets, SQL tables, and SAP HANA databases. Customer churn. Supplier risk. Payment delay. Fraud probability. Upsell likelihood.
For 30 years, this work has been done by tuning XGBoost, LightGBM, or random forests by hand. Every dataset gets its own model. Every model takes hours or days to tune. A senior data scientist costs $200K a year to do this.
TabPFN flips that. It is one model. You give it a fresh table it has never seen. It returns predictions in seconds. No tuning. No feature engineering. No grid search.
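That zero-tuning workflow is easiest to see in code. The sketch below is hedged: the open-source `tabpfn` package exposes a scikit-learn style `TabPFNClassifier`, but since it may not be installed, a trivial majority-class stand-in keeps the example runnable. The shape of the workflow is the point: no grid search, no feature engineering, just fit and predict.

```python
import random

# Hedged sketch of the zero-tuning workflow. If the real `tabpfn`
# package is available, use its pre-trained model; otherwise fall
# back to a toy stand-in with the same fit/predict interface.
try:
    from tabpfn import TabPFNClassifier as Model
except ImportError:
    class Model:
        """Toy stand-in: always predicts the majority class."""
        def fit(self, X, y):
            self.majority = max(set(y), key=y.count)
            return self
        def predict(self, X):
            return [self.majority] * len(X)

# A fresh table the model has never seen: two features, binary label.
random.seed(0)
X_train = [[random.random(), random.random()] for _ in range(50)]
y_train = [1 if x[0] + x[1] > 1.0 else 0 for x in X_train]
X_new = [[0.9, 0.8], [0.1, 0.2]]

# No tuning, no feature engineering: fit and predict directly.
clf = Model().fit(X_train, y_train)
preds = clf.predict(X_new)
print(list(preds))
```

With the real model, the `fit` call does no gradient training at all; the pre-trained network conditions on the training rows in a single forward pass.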
The Numbers That Made SAP Sign the Check
Here is what the published research shows. TabPFN v2 was pre-trained on roughly 130 million synthetic tabular datasets. Not real data — synthetic. Generated to teach the model the structure of how tables behave.
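What does a synthetic tabular dataset even look like? The published prior is far more elaborate (the paper samples from randomized structural causal models), but a deliberately simplified sketch conveys the idea: each synthetic table is generated by a hidden random rule, and the model is pre-trained to recover such rules from rows alone.

```python
import random

def make_synthetic_table(n_rows, n_cols, rng):
    """Generate one synthetic classification dataset: random linear
    weights define a hidden rule, noise makes the labels imperfect."""
    weights = [rng.gauss(0, 1) for _ in range(n_cols)]
    X, y = [], []
    for _ in range(n_rows):
        row = [rng.gauss(0, 1) for _ in range(n_cols)]
        score = sum(w * v for w, v in zip(weights, row)) + rng.gauss(0, 0.1)
        X.append(row)
        y.append(1 if score > 0 else 0)
    return X, y

# Pre-training samples millions of such tables; three for illustration.
rng = random.Random(42)
corpus = [make_synthetic_table(n_rows=20, n_cols=4, rng=rng) for _ in range(3)]
print(len(corpus), len(corpus[0][0]), len(corpus[0][0][0]))  # → 3 20 4
```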
The latest version, TabPFN-2.5, scales to 50,000 data points and 2,000 features. That covers most enterprise use cases.
And the benchmark that closed the deal: in 2.8 seconds, TabPFN outperforms ensembles that human data scientists tuned for four hours on classification tasks. On the TabArena benchmark, TabPFN-2.5 matches AutoGluon 1.4 — the previous state of the art.
Read that one more time. Four hours of human work, beaten in under three seconds. That is the number that justifies a €1 billion price tag.
Why LLMs Cannot Do This Job
You might ask the obvious question. If GPT-5 can write code, why can’t it predict customer churn?
Because LLMs have only a rudimentary understanding of tables, numbers, and statistics. They were trained to predict tokens in text. They were not trained to find statistical structure in 50,000 rows of payment data.
Try it yourself. Paste a CSV with 10,000 rows into ChatGPT and ask it to predict which customers will churn next quarter. It will hallucinate. It will sample. It will not run a clean probabilistic inference across the full distribution.
Tabular Foundation Models are built differently. They learn the prior — the underlying probability structure of what tables look like — during pre-training. Then at inference, they perform Bayesian-style prediction on your specific table. That is what “PFN” stands for: Prior-Data Fitted Network.
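To make "learn the prior, then predict" concrete, here is the textbook version of the idea in miniature: a Beta-Binomial model, where a prior belief about a rate is updated by observed data into a posterior predictive probability. TabPFN amortizes this kind of computation inside a transformer rather than computing it in closed form; this stand-alone example only illustrates the prior-to-posterior mechanic.

```python
# Toy Bayesian prediction: a Beta(a, b) prior over an unknown rate,
# updated by observed successes/trials, yields a posterior predictive
# probability for the next outcome.

def posterior_predictive(prior_a, prior_b, successes, trials):
    """P(next outcome = 1 | data) under a Beta-Binomial model."""
    return (prior_a + successes) / (prior_a + prior_b + trials)

# Prior belief: churn rate around 20% (Beta(2, 8)).
# Observation: 30 of 100 sampled customers churned.
p_churn = posterior_predictive(2, 8, successes=30, trials=100)
print(round(p_churn, 3))  # (2+30)/(2+8+100) = 32/110 → 0.291
```

The prior pulls the estimate slightly below the raw 30% observed rate; a PFN bakes a far richer prior over whole tables into its weights during pre-training.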
The Research Pedigree
This is not vibes. The TabPFN paper was published in Nature in early 2025. Peer-reviewed. Reproducible. The Nature publication is what put Prior Labs on SAP’s radar in the first place.
Founded by researchers from the University of Freiburg, Prior Labs spun out with the explicit goal of commercializing the work. They raised quietly, shipped fast, and now they are part of SAP.
What This Means for SAP Customers
If you run SAP today, here is what changes over the next 18 months.
TabPFN will integrate with SAP AI Core. That is the model-serving layer. It will plug into SAP Business Data Cloud — the unified data fabric that absorbs Dremio. And it will sit underneath Joule, SAP’s agentic AI layer.
Picture this flow. A finance agent in Joule asks: “Which suppliers are likely to miss next quarter’s deadlines?” Joule routes the question to TabPFN. TabPFN reads the live supplier table from Business Data Cloud. It returns a ranked list with probabilities in under three seconds. The agent acts.
That loop — natural language question, structured data prediction, agent action — is the actual product. Not the chatbot. The chatbot is the interface. The Tabular Foundation Model is the engine.
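The prediction-and-rank step of that loop can be sketched in a few lines. Everything here is hypothetical illustration: the supplier fields and the `late_probability` scorer are made-up stand-ins for where the TabPFN call would sit in the Business Data Cloud flow.

```python
# Hypothetical sketch of the agent flow's prediction step: score each
# supplier row, then return a ranked list with probabilities. The
# `late_probability` function stands in for the model call.

def late_probability(row):
    # Stand-in scorer: more past delays and longer lead times -> riskier.
    score = 0.1 * row["past_delays"] + 0.01 * row["avg_lead_time_days"]
    return min(score, 1.0)

def rank_suppliers(suppliers):
    """Return (name, probability) pairs, riskiest first."""
    scored = [(s["name"], late_probability(s)) for s in suppliers]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

suppliers = [
    {"name": "Acme GmbH", "past_delays": 4, "avg_lead_time_days": 30},
    {"name": "Nord AG", "past_delays": 0, "avg_lead_time_days": 10},
    {"name": "Sud SE", "past_delays": 2, "avg_lead_time_days": 45},
]
for name, p in rank_suppliers(suppliers):
    print(f"{name}: {p:.2f}")
```

The agent consumes the ranked pairs, not raw model output; that separation is what lets the chatbot stay a thin interface over the engine.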
Concrete Use Cases SAP Is Targeting
SAP has named four initial workloads where TabPFN will ship:
- Payment delay prediction. Which invoices will be late? How late?
- Supplier risk scoring. Which vendors are about to miss SLAs or default?
- Upsell opportunity ranking. Which existing accounts are warm enough for a new product pitch?
- Customer churn prediction. Which subscribers will cancel in the next 90 days?
Every one of these is a problem that today consumes a team of data scientists. Every one of them is solved by feeding the right table to TabPFN.
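"Feeding the right table" still takes one human step: framing the problem by choosing feature columns and a label column. A minimal stdlib sketch, using a hypothetical churn table (in practice the rows would come from the ERP or CRM):

```python
import csv
import io

# Hypothetical churn table; column names are illustrative only.
raw = """customer_id,tenure_months,monthly_spend,churned
C001,24,89.5,0
C002,3,120.0,1
C003,48,45.0,0
"""

# Frame the problem: feature columns in, label column out.
features = ["tenure_months", "monthly_spend"]
label = "churned"

X, y = [], []
for row in csv.DictReader(io.StringIO(raw)):
    X.append([float(row[c]) for c in features])
    y.append(int(row[label]))

print(len(X), len(y))  # → 3 3
```

The resulting `X` and `y` are exactly what a scikit-learn style `fit` call expects; the framing, not the modeling, is where the human judgment now lives.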
What This Means for the Rest of the Market
Three knock-on effects to watch.
One: Oracle, Salesforce, and Microsoft now need a tabular model. SAP just made it the default expectation. Expect Oracle and Salesforce to either acquire or partner with a tabular AI lab inside 12 months. Microsoft has Fabric and may build internally.
Two: the AutoML market just got compressed. DataRobot, H2O, Dataiku, and even AWS SageMaker Autopilot all sold their AutoML offerings on the promise of fast tuning. TabPFN does it faster, with no tuning at all. Pricing pressure incoming.
Three: data scientist roles will change. The job stops being “tune the model.” It becomes “frame the question, validate the prediction, build the agent.” Less feature engineering. More problem definition.
Risk Factors
Worth flagging the cracks. TabPFN-2.5 still tops out around 50,000 rows and 2,000 features. Plenty of enterprise tables are bigger. Scaling beyond that is an open research problem.
Regulatory approval for the deal is not automatic. The EU and the German competition authority will review. SAP is large. Prior Labs is strategic. Expect questions.
And synthetic pre-training data is a moat — but only until someone else generates 200 million synthetic datasets and trains a bigger model.
The 60-Second Takeaway
SAP paid €1 billion+ for a model that does in 2.8 seconds what data scientists do in 4 hours. The model is called TabPFN. It is a Tabular Foundation Model — pre-trained on 130 million synthetic tables, peer-reviewed in Nature, and now wired into SAP’s agentic stack.
If you build enterprise AI, you need to know what TabPFN is by Friday. If you sell AutoML, you need a new pitch by Monday. If you run a data team at a Fortune 500, you need to ask your SAP rep when this lands in production.
The frontier of enterprise AI in 2026 is not bigger language models. It is structured data models that finally work.
Frequently Asked Questions
What are Tabular Foundation Models?
Tabular Foundation Models are pre-trained AI models that predict outcomes from rows-and-columns data, the kind that lives in databases and spreadsheets. Unlike LLMs that work on text, they learn the statistical structure of tables during pre-training. TabPFN is the leading example, pre-trained on around 130 million synthetic tabular datasets.
How does TabPFN compare to XGBoost or AutoGluon?
TabPFN-2.5 matches AutoGluon 1.4 accuracy on the TabArena benchmark while requiring no tuning. On classification tasks, it beats four-hour-tuned ensembles in 2.8 seconds. The trade-off: it currently caps at 50,000 rows and 2,000 features, while XGBoost scales further.
How much did SAP pay for Prior Labs?
SAP committed over €1 billion across four years to scale Prior Labs as an independently operating frontier AI lab. The deal was announced on May 4, 2026, and is expected to close in Q2 or Q3 2026 pending regulatory approval. SAP also acquired Dremio in the same week to build a unified data fabric.
Why can’t LLMs replace Tabular Foundation Models?
LLMs only have a rudimentary understanding of tables, numbers, and statistics because they were trained to predict text tokens, not statistical distributions over structured data. Feeding a 50,000-row CSV into a chatbot leads to sampling, hallucination, and unreliable predictions. Tabular Foundation Models are purpose-built to do Bayesian-style inference on full tables.
Is TabPFN worth using in 2026?
Yes, if your problem fits within 50,000 rows and 2,000 features and is a classification or regression task on tabular data. The peer-reviewed Nature paper, the SAP acquisition, and the TabArena benchmarks all confirm production-grade performance. For larger datasets, traditional gradient-boosted trees still hold ground until the next TabPFN release.