AI is getting cheaper. Over the past 12–18 months, more operators and investors have begun saying the same thing: AI is breaking SaaS economics. Marginal cost is back. Seat pricing is under pressure. Inference is not free. All of that is directionally correct. But it is incomplete. The more important question is not whether AI introduces marginal cost. The question is who captures the efficiency gains as inference gets cheaper. Some products expand margin. Some expand revenue. Some quietly subsidize heavy users. Others compress under competition. The difference is not model quality; rather, it is cost exposure and pricing architecture. I have been developing a framework to think about this, which is what I call the Jevons Capture 2×2. It maps products (not companies) across two structural dimensions: who bears inference cost and volatility, and who retains efficiency gains as AI becomes more efficient. From that lens, four economic states emerge: Structural Winners, Defensive Beneficiaries, Margin Tension, and Consumer Surplus Trap. More importantly, products migrate between them. Bundled seats can move to credits or metering. Bounded optimization surfaces can expand into open-ended agentic execution. Structural Winners can compress if competition erodes pricing power. Much of the public discourse stops at “move to usage-based pricing.” This piece connects Jevons effects, unit economics and product-level P&L mechanics...AND introduces a migration framework to think about how these states evolve. The theme may now be common. The framework is not.
E-Commerce Technology Platforms
-
When you query AI, it gathers relevant information to answer you. But how much information does the model need? Conversations with practitioners revealed their intuition: the input was ~20x larger than the output. But my experiments with the Gemini command-line interface, which outputs detailed token statistics, revealed it's much higher: 300x on average & up to 4000x. Here’s why this high input-to-output ratio matters for anyone building with AI: Cost Management is All About the Input. With API calls priced per token, a 300:1 ratio means costs are dictated by the context, not the answer. This pricing dynamic holds true across all major models. On OpenAI’s pricing page, output tokens for GPT-4.1 are 4x as expensive as input tokens. But when the input is 300x more voluminous, the input costs are still about 98.7% of the total bill (300 / 304). Latency is a Function of Context Size. An important factor determining how long a user waits for an answer is the time it takes the model to process the input. It Redefines the Engineering Challenge. This observation shows that the core challenge of building with LLMs isn’t just prompting. It’s context engineering. The critical task is building efficient data-retrieval & context-crafting pipelines that can find the best information and distill it into the smallest possible token footprint. Caching Becomes Mission-Critical. If 99% of tokens are in the input, building a robust caching layer for frequently retrieved documents or common query contexts moves from a “nice-to-have” to a core architectural requirement for building a cost-effective & scalable product. For developers, this means focusing on input optimization is a critical lever for controlling costs, reducing latency, and ultimately, building a successful AI-powered product.
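The input-share figure in the post follows from simple arithmetic, sketched here. The function name and the relative price units are illustrative assumptions; only the token ratios (300:1 and 20:1) and the 4x output price multiple come from the post.

```python
def input_cost_share(input_tokens: float, output_tokens: float,
                     input_price: float, output_price: float) -> float:
    """Fraction of the total token bill attributable to input tokens."""
    input_cost = input_tokens * input_price
    output_cost = output_tokens * output_price
    return input_cost / (input_cost + output_cost)


# Prices are relative units (assumption): input = 1, output = 4x input.
measured = input_cost_share(input_tokens=300, output_tokens=1,
                            input_price=1.0, output_price=4.0)
intuition = input_cost_share(input_tokens=20, output_tokens=1,
                             input_price=1.0, output_price=4.0)

print(f"300:1 ratio -> {measured:.1%}")   # 98.7%
print(f"20:1 ratio  -> {intuition:.1%}")  # 83.3%
```

Even at the 20:1 ratio practitioners guessed, input dominates the bill; at 300:1 the output price multiple barely matters.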
-
Snowflake 𝗷𝘂𝘀𝘁 𝗻𝗲𝘂𝘁𝗿𝗮𝗹𝗶𝘇𝗲𝗱 𝘁𝗵𝗲 𝗯𝗶𝗴𝗴𝗲𝘀𝘁 𝗮𝗿𝗴𝘂𝗺𝗲𝗻𝘁 𝗰𝗼𝗺𝗽𝗲𝘁𝗶𝘁𝗼𝗿𝘀 𝘂𝘀𝗲𝗱 𝗮𝗴𝗮𝗶𝗻𝘀𝘁 𝗶𝘁 For years, one objection kept derailing customer workshops and RFP scoring. Databricks repeated it with full confidence. They claimed Snowflake streaming becomes expensive when data arrives in small, constant pieces because the XS warehouse needs to stay alive. Unfortunately, this held up in practice. The pricing model was tied to compute uptime, which turned low-latency ingestion into a budgeting headache. That chapter is closed. Snowflake moved Snowpipe to a simple model based on GB ingested. The cost now tracks the actual workload instead of the arrival pattern. Bursts or trickles behave the same. Early numbers show savings of roughly fifty percent for many teams. 𝗪𝗵𝘆 𝘁𝗵𝗶𝘀 𝗺𝗮𝘁𝘁𝗲𝗿𝘀 This change hits the three pain points that always show up in platform selection. Forecasting becomes easier because the pricing logic is not tied to warehouse behavior. Predictability improves because the ingestion pattern cannot inflate the bill anymore. Justification becomes simpler because the billing unit is consistent from end to end. 𝗧𝗵𝗲 𝗰𝗼𝗺𝗽𝗲𝘁𝗶𝘁𝗶𝘃𝗲 𝗽𝗿𝗲𝘀𝘀𝘂𝗿𝗲 Databricks still brings multiple pricing units to a single pipeline. DBUs for compute, a different unit for SQL, another for apps, and a Serverless layer that feels opaque to many customers. Teams regularly ask why a basic workflow requires several cost models and an Excel sheet to keep track. Snowflake removed the weakest part of its story. Ingestion now behaves the way enterprises expect a cloud platform to behave. Clear. Measurable. Directly tied to value. The platform rivalry will continue, but ingestion economics are no longer a valid reason to hesitate. How long until competitors simplify their own pricing to keep the comparison fair? #Snowflake #DataEngineering #ModernDataStack #Snowpipe #CloudArchitecture #DataPlatforms #CostOptimization #StreamingData #RFP #Databricks
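The difference between the two billing bases can be sketched with a toy comparison. Everything here is an assumption for illustration: the function names, the rates, and the workload are hypothetical and are not Snowflake's actual prices or billing formula. The sketch only shows why a bill tied to uptime depends on the arrival pattern while a bill tied to volume does not.

```python
def uptime_based_cost(hours_running: float, rate_per_hour: float) -> float:
    """Cost when billing follows how long compute stays up."""
    return hours_running * rate_per_hour


def volume_based_cost(gb_ingested: float, rate_per_gb: float) -> float:
    """Cost when billing follows how much data is ingested."""
    return gb_ingested * rate_per_gb


# Hypothetical rates, chosen only to make the comparison visible.
RATE_PER_HOUR = 2.0
RATE_PER_GB = 0.5

# The same 10 GB per day, arriving as a constant trickle (compute up 24 h)
# or as a single burst (compute up 1 h).
for label, hours in [("trickle", 24.0), ("burst", 1.0)]:
    print(label,
          uptime_based_cost(hours, RATE_PER_HOUR),
          volume_based_cost(10.0, RATE_PER_GB))
```

Under the uptime basis the trickle costs far more than the burst for identical data; under the volume basis the two are billed the same.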
-
If you plan to run a pricing experiment, but it takes your team 6 months to implement, you are already behind. Most SaaS companies have the same problem. They blame their billing tool. Or Salesforce. Or engineering bandwidth. But when I ask where their monetization bottleneck actually is, the answer is always the same: everywhere. 6 different systems. Each one claiming to be the source of truth. Hard-coded entitlements scattered across the codebase. Product teams can't ship without touching hard-coded billing logic. GTM can't test without filing engineering tickets. It's not a billing problem. If it were, you'd be out of business - you wouldn't be collecting cash. It's a pricing architecture problem. At Vercel, they average 5-6 pricing changes per month. They built pricing like they build product. Here are 4 pillars of pricing architecture that Fynn Glover, Benjamin Papillon, Shar Dara defined at the Schematic Monetizing AI Summit: 1/ 𝐔𝐧𝐢𝐟𝐢𝐞𝐝 𝐏𝐫𝐨𝐝𝐮𝐜𝐭 𝐂𝐚𝐭𝐚𝐥𝐨𝐠 One schema. One source of truth. Every plan, feature, entitlement, and price defined once and inherited everywhere. When product and billing share the same catalog, new SKUs appear across all systems instantly. 2/ 𝐃𝐞𝐜𝐨𝐮𝐩𝐥𝐞𝐝 𝐄𝐧𝐭𝐢𝐭𝐥𝐞𝐦𝐞𝐧𝐭𝐬 Stop hard-coding plan logic into your product. Every time you write "if planId == Enterprise, enable feature X," you're cementing your feet. Let business teams change packaging without touching code. 3/ 𝐑𝐞𝐚𝐥-𝐓𝐢𝐦𝐞 𝐌𝐞𝐭𝐞𝐫𝐢𝐧𝐠 AI features are usually priced per usage. Customers hate buying if they can't predict how much they will actually pay. They fear the huge, unpredictable bill. Enable tools to estimate and monitor usage. 4/ 𝐂𝐨𝐧𝐭𝐫𝐨𝐥 𝐏𝐥𝐚𝐧𝐞 𝐟𝐨𝐫 𝐆𝐓𝐌 Your growth team should be able to spin up a Black Friday promo or a custom Enterprise plan without bothering a single engineer. Your growth and product teams should create new plans, adjust limits, trigger experiments, and launch promotions without waiting on engineering. 
Here's what happens when you get this right: → Pricing changes go from quarters to hours. → Developers ship product code, not billing code. → GTM iterates on monetization continuously instead of waiting in a queue. Pricing is what makes money. And the winners are the ones who treat pricing like product. Your pricing architecture: Can your team launch a new tier today without engineering? Can you unbundle a feature into an add-on? #SchematicPartner
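The "decoupled entitlements" pillar can be sketched as a catalog lookup that replaces hard-coded plan checks. This is a minimal illustration, not Schematic's or Vercel's API: the catalog shape, plan names, feature names, and both helper functions are hypothetical.

```python
# Hypothetical catalog: plan and feature names are illustrative only.
CATALOG = {
    "free": {"features": {"dashboards"}, "limits": {"ai_requests": 100}},
    "enterprise": {
        "features": {"dashboards", "sso", "audit_log"},
        "limits": {"ai_requests": 100_000},
    },
}


def has_feature(catalog: dict, plan_id: str, feature: str) -> bool:
    """True if the plan's catalog entry grants the feature."""
    plan = catalog.get(plan_id)
    return plan is not None and feature in plan["features"]


def within_limit(catalog: dict, plan_id: str, meter: str, used: int) -> bool:
    """True if usage is below the plan's limit; a missing limit means unlimited."""
    limit = catalog[plan_id]["limits"].get(meter)
    return limit is None or used < limit


# A packaging change becomes a data edit instead of a code change:
CATALOG["team"] = {"features": {"dashboards", "sso"},
                   "limits": {"ai_requests": 5_000}}

print(has_feature(CATALOG, "team", "sso"))                 # True
print(within_limit(CATALOG, "free", "ai_requests", 100))   # False
```

Product code asks "does this account have feature X?" and never mentions a plan name, so a new tier or add-on does not require touching the call sites.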
-
37% of AI companies will change their pricing in the next 12 months. The latest report from ICONIQ shows the market converging on hybrid models—light platform fees plus usage, with safeguards like annual commitments and tiered overages. We have seen similar trends at the Subscribed Institute. Hybrid models balance real tensions: your unit economics need consumption-based pricing, but your customers (and you) need enough predictability to defend budgets to their boards. That said, this creates an operational burden. You're asking customers to understand three pricing dimensions simultaneously: a platform fee (subscription), variable usage costs (consumption), and guardrails (commitments, tiered overages). Then, according to this report, you're planning to change this model within 12 months as the market "settles." The data shows outcome-based pricing most often ties to cost savings (36%) or revenue generated (18%). But ask yourself: Can we measure these outcomes in a way customers trust, or will we spend the first year arguing about attribution? In other words, do you have the Trust Architecture to make that complexity navigable for customers? Three things to build before you change your pricing: -> Economic Clarity: Can customers forecast costs confidently? If you're adding usage-based components, give them tools to model costs. If you're adding commitment tiers, make the value of predictability explicit. -> Value Alignment: Are your pricing units tied to customer outcomes or your costs? The shift toward outcome-based pricing (cost savings, revenue generated) is directionally sound, but only if measurement is transparent and outcomes are defined together. -> Transparent Navigation: Hybrid models have more decision points. Can customers understand their journey? When should they move tiers? What triggers overages? Make the architecture visible. 
The report notes that the companies that plan pricing changes are reacting to customer demand, competitive pressure, and margin concerns. The companies that navigate these changes successfully are the ones that proactively build Trust Architecture first.
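To make the three pricing dimensions concrete, here is a minimal sketch of a hybrid bill: a platform fee, an included usage allowance, and tiered overages. The function, the tier structure, and all amounts are hypothetical and are not taken from the ICONIQ report. Amounts are in integer cents to avoid floating-point rounding.

```python
def hybrid_bill_cents(units_used: int, platform_fee_cents: int,
                      included_units: int, overage_tiers: list) -> int:
    """Platform fee plus tiered overage on usage beyond the included units.

    overage_tiers is an ordered list of (tier_size, cents_per_unit);
    a tier_size of None means the tier has no upper bound.
    """
    remaining = max(0, units_used - included_units)
    overage = 0
    for tier_size, cents_per_unit in overage_tiers:
        if remaining <= 0:
            break
        billed = remaining if tier_size is None else min(remaining, tier_size)
        overage += billed * cents_per_unit
        remaining -= billed
    return platform_fee_cents + overage


# Hypothetical plan: $500 platform fee, 10,000 units included,
# next 5,000 units at 4 cents each, anything beyond at 3 cents each.
TIERS = [(5_000, 4), (None, 3)]

print(hybrid_bill_cents(8_000, 50_000, 10_000, TIERS))   # 50000
print(hybrid_bill_cents(18_000, 50_000, 10_000, TIERS))  # 79000
```

A function like this is also the kind of cost-modeling tool the "Economic Clarity" point asks for: customers can plug in forecast usage and see where overages begin.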
-
The Curious Case of Missing AI Tokens Model quality gets the headlines in enterprise AI. Capacity is shaping the reality. Compute is pushing limits, scope, and entitlements into the product. Hat tip to Stuart Miller for the Substack perspective. Over the last few weeks, Anthropic made that visible. March 13: Anthropic introduced a limited-time boost to usage limits during off-peak hours. It reads like a promotion. Operationally, it’s a classic lever: shift demand before you touch price. March 26: Session limits tightened during peak hours across plans. That’s the moment “plan limits” stop being background detail and start shaping when work can happen. March 31: Anthropic acknowledged that Claude Code users were hitting limits much sooner than expected and flagged it as a top priority. Demand outran the assumptions. April 4: The real boundary moved. Subscription limits no longer carried over into certain third-party tools (like OpenClaw). Same subscription. Narrower scope. If your workflow depended on that tooling, you either pay separately or redesign the workflow. This is prosumer pain today. It becomes enterprise reality tomorrow. OpenAI has also been explicit about compute constraints and tough trade-offs. Different company, same physics: when demand outruns infrastructure, pressure shows up in limits, scope, and entitlements. This ties directly to the pricing point I’ve written about before. Enterprise AI pricing risk isn’t only on the invoice. It’s in the architecture. When scope shifts under the same plan, your unit economics change without a new price tag. The bill looks “normal.” The workflow cost quietly drifts. So the operator move is straightforward. First, treat tokens like an entitlements model, not a utility bill. Track where usage is allowed, which workflows count, and what happens when boundaries move. Then build resilience. ▪️Keep prompts and evals outside any single platform. ▪️Put budget envelopes on critical workflows. 
▪️Add circuit breakers for retries and agent loops. ▪️Maintain a second execution path for workflows that matter. The boundary shifts before the invoice does.
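A budget envelope with a breaker on agent loops might look like the sketch below. It is a simplified hard cap (no half-open or reset state, unlike a full circuit breaker), and every name in it is hypothetical: `BudgetEnvelope`, `BudgetExceeded`, `run_agent_loop`, and the `step` callable, which is assumed to return `(tokens_used, done)` for one model call.

```python
class BudgetExceeded(RuntimeError):
    """Raised when a workflow would exceed its token or call envelope."""


class BudgetEnvelope:
    """Per-workflow cap on total tokens and number of model calls."""

    def __init__(self, max_tokens: int, max_calls: int) -> None:
        self.max_tokens = max_tokens
        self.max_calls = max_calls
        self.tokens_used = 0
        self.calls = 0

    def charge(self, tokens: int) -> None:
        """Record one call; trip if either cap would be exceeded."""
        if self.calls + 1 > self.max_calls:
            raise BudgetExceeded("call cap reached")
        if self.tokens_used + tokens > self.max_tokens:
            raise BudgetExceeded("token cap reached")
        self.calls += 1
        self.tokens_used += tokens


def run_agent_loop(step, envelope: BudgetEnvelope) -> int:
    """Run step() until it reports done or the envelope trips.

    step is a hypothetical callable returning (tokens_used, done).
    """
    while True:
        tokens, done = step()
        envelope.charge(tokens)
        if done:
            return envelope.tokens_used


# Usage: a two-step workflow that finishes inside its envelope.
steps = iter([(1200, False), (900, True)])
envelope = BudgetEnvelope(max_tokens=5_000, max_calls=10)
print(run_agent_loop(lambda: next(steps), envelope))  # 2100
```

A loop that never reports done raises `BudgetExceeded` instead of running indefinitely, which is the point of the breaker: the failure is visible at the boundary, not later on the invoice.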
-
We’re seeing a clear shift in SaaS pricing: away from seat-based models and toward value-based models. For years, “$X per user per month” worked because software was largely static and usage was easy to approximate. That world is starting to change. Today, pricing is increasingly tied to what customers actually get out of the product. Sometimes that’s straightforward: → AI platforms charging per token or per inference → Data platforms charging per query or compute usage But the more interesting shift is toward outcome-based pricing: → Customer support platforms pricing per ticket resolved → Sales tools pricing per qualified meeting booked → Fraud platforms pricing per transaction protected → Marketing tools pricing per conversion or revenue influenced This is where things get real. Pricing is no longer a proxy for value; it is the value. Here’s the problem: our go-to-market infrastructure wasn’t built for this. Legacy CRMs, CPQ systems, and billing stacks are optimized for static SKUs, predictable pricing tiers, seat counts, and simple usage metrics. They struggle when pricing depends on dynamic usage signals, probabilistic outcomes, and cross-product value attribution. The instinct is often to rip and replace. I think that’s usually a mistake. These systems are actually still very good at what they were designed for: systems of record and core workflow orchestration. The real gap is translation. How do you take a complex, evolving value model and make it usable for sales reps in the field, sales ops designing deals, and the deal desk approving non-standard pricing? We believe the answer is augmentation, not replacement. Enter agents. 
Agents can: interact directly with sales teams in natural language; understand nuanced pricing constructs (e.g., “price this based on expected tickets resolved with a 20% uplift”); simulate deal outcomes and margins in real time; and translate that into structured inputs for CRM, CPQ, and billing systems. In other words: they let humans speak in terms of value, while systems continue to operate in terms of records and workflows. The impact is significant: no multi-million dollar SI projects to replatform; faster iteration on pricing models; higher sales productivity (less wrestling with tools, more selling); and better alignment between pricing and actual customer value. Seat-based pricing was simple, but blunt. Value-based pricing is precise, but complex. The winners won’t be the companies that rebuild everything from scratch; they’ll be the ones that bridge the gap between value and execution. Agents are that bridge.
-
The most expensive meeting in a startup is not the fundraising pitch or the roadmap review. It is the meeting where someone says, “Let’s tweak the pricing.” On the surface, it sounds simple. Add a new tier. Introduce usage. Offer a discount for annual plans. It feels like a business adjustment that can be handled with a few changes in the dashboard. But pricing is never just a number. Behind every “small tweak” sits architecture. Data models define what can be charged. State transitions determine what happens during upgrades or downgrades. Contracts lock in assumptions that must hold true months later. Historical invoices need to remain reproducible even after rules change. What feels like a 30-minute strategic decision often translates into months of engineering work because pricing changes ripple through the entire system. - Can existing customers be grandfathered without breaking new logic? - Can mid-cycle changes be handled without corrupting usage calculations? - Can finance explain every invoice after the rules evolve? These are not edge cases. They are the natural consequences of growth. The real cost of a pricing meeting is not the debate in the room. It is the assumption that implementation will be straightforward. Mature companies understand that pricing is architecture. It defines how value is measured, how revenue is recognized, and how contracts behave over time. When you change pricing, you are not adjusting a slide; you are modifying the backbone of your business. That is why the most expensive meeting in a startup is often the one that sounds the simplest. And we are here to solve it if it's getting too expensive for you to do in-house :D
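One common way to keep historical invoices reproducible is to version prices by effective date instead of overwriting them. The sketch below assumes that approach; the price book, the dates, the amounts, and both function names are hypothetical and do not describe any particular billing system.

```python
from datetime import date

# Hypothetical price book: (effective_from, cents_per_unit).
# New rules are appended; existing entries are never edited.
PRICE_VERSIONS = [
    (date(2024, 1, 1), 10),
    (date(2025, 1, 1), 12),
]


def price_on(day: date, versions: list) -> int:
    """Unit price from the latest version effective on or before the given day."""
    applicable = [v for v in versions if v[0] <= day]
    if not applicable:
        raise ValueError(f"no price version effective on {day}")
    return max(applicable, key=lambda v: v[0])[1]


def invoice_total_cents(units: int, pricing_date: date, versions: list) -> int:
    """Invoice amount using the price in force on pricing_date."""
    return units * price_on(pricing_date, versions)


# Recomputing an old invoice gives the same answer after a later price change.
before = invoice_total_cents(100, date(2024, 6, 1), PRICE_VERSIONS)
PRICE_VERSIONS.append((date(2026, 1, 1), 15))
after = invoice_total_cents(100, date(2024, 6, 1), PRICE_VERSIONS)
print(before, after)  # 1000 1000
```

Grandfathering fits the same shape: pin a customer's `pricing_date` to their contract start and they keep the old rate while new customers get the current one.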
-
Licensing shapes architecture more than patterns do. Many platform failures start in the contract, not the design review. The enterprise pays for constraints it never chose. It is rarely about choosing the wrong product. It is usually about commercial terms becoming hidden decision rights. Minimum commits, audit language, usage meters, and renewal leverage add structural load. Teams then design around exposure, not around boundaries and intent. Signals you can observe in any large platform bet: * A standard is selected before anyone can explain the pricing unit and the commit basis * Architecture reviews debate patterns while contract terms quietly set scale limits and penalties * Adoption accelerates in the least governed areas because convenience beats control * Audit readiness is deferred until it becomes an emergency program * Renewal dates start setting the roadmap faster than risk and business value do * Engineers optimize consumption to avoid overages even when it degrades operability and resiliency One practical test: Name the three clauses most likely to constrain design choices in year two. One about commits. One about audit. One about metering. Then name the single owner accountable for each. Architecture is what the enterprise can enforce. #EnterpriseArchitecture #DecisionArchitecture #OperatingModel #ITGovernance #FinOps
-
Clay just published a masterclass in building in public with the announcement of their new pricing model, but the most fascinating part is their internal memo detailing the grueling cross-functional alignment it took to get there. If you’ve ever been part of a major SaaS pricing overhaul, you know exactly how painful this process can be. You get Product, Sales, Marketing, and Finance in a room for a "pricing committee" meeting. Sales wants simple, discountable tiers to close deals faster, Product wants to monetize new features, and Finance demands predictable revenue margins. The executive team agrees on a brilliant new hybrid model, only for Engineering to deliver the bad news that untangling the hard-coded billing logic will take quarters of development work. To support the pricing change, developers are forced to stop shipping core product features and start shipping billing code. If a team leverages Schematic, that bottleneck disappears. Schematic acts as a "Monetization OS" that decouples pricing and entitlements from your codebase. Instead of waiting on engineering deployment cycles, GTM and Product teams can execute new limits, tiers, and usage-based models directly in a central control plane—zero engineering tickets required. "Price is a number. Pricing is an architecture." When you remove the technical debt of hard-coded billing, cross-functional teams stop arguing over what is technically possible and start iterating on what actually drives revenue. #SaaS #PricingStrategy #ProductLedGrowth #SchematicHQ #GTM
