Conor O'Sullivan’s Post

3w Edited

Chinese DeepSeek AI released V4 Preview as an open-source model family with a claimed 1M-token context window. Long-context models are becoming dramatically cheaper and more widely distributed. OpenAI, Anthropic, and Google now face even more pricing pressure at the lower and mid tiers as they compete with a parallel AI stack emerging outside U.S. control. DeepSeek AI is not just a model competitor. It is a pressure mechanism on the entire Western AI margin structure. If good-enough long-context reasoning becomes cheap, application defensibility must come from workflow, data, and distribution. So anyone building the new rule of thumb is to use frontier models where judgement matters and use cheaper/open models for intake, extraction, classification, summarisation, and retrieval. https://lnkd.in/dDCyZ-Tb

DeepSeek V4 Preview Release | DeepSeek API Docs api-docs.deepseek.com

1 Comment

Conor O'Sullivan 3w

DeepSeek V4 Preview costs about 85% less than GPT-5.5https://mashable.com/article/deepseek-v4-preview-comparison-chatgpt-claude-gemini

To view or add a comment, sign in

More Relevant Posts

Douglas José Pereira dos Santos
3w Edited
Report this post
DeepSeek just dropped V4. Open-sourced. 1.6T parameters. 1M context window as the default. Benchmarks rivaling the best closed models in the world. Six months ago, that was a moat. Today it is a Hugging Face download. This is not a surprise if you have been paying attention. It is a pattern. Capabilities that cost hundreds of millions to develop reach open availability within months of their closed-source counterparts. The compression is relentless and it is accelerating. Intelligence, the narrow computational kind, is becoming infrastructure. What TCP/IP did to communication, commoditized AI is doing to cognition. The question worth asking is not which model wins. It is what remains scarce once the model stops being scarce. The answer: judgment. The ability to decide what to trust, where the system breaks, what to build on top of it, and how to adapt when the capability layer shifts again next quarter, because it will. https://lnkd.in/gxnmnn6m #AI #DeepSeek #LLM #Agentic

DeepSeek V4 Preview Release | DeepSeek API Docs api-docs.deepseek.com
Like Comment
To view or add a comment, sign in
Cristian Civera
4w
Report this post
DeepSeek just dropped its V4 Preview with 1M context as the new default, plus open weights and API availability today. The interesting bit is how aggressively they’re pushing long-context efficiency into something practical. Worth a look. #AI #LLM #DeepSeek #GenAI

DeepSeek V4 Preview Release | DeepSeek API Docs api-docs.deepseek.com

1 Comment
Like Comment
To view or add a comment, sign in
mena-ai.org

372 followers
3w
Report this post
The AI race is shifting from raw intelligence to measurable enterprise performance. Led by OpenAI and DeepSeek, this is not just another benchmark battle, but a clash between premium frontier models and cost efficient challengers. GPT 5.5 is positioning itself with strong real world scores, including 82.7% on Terminal Bench 2.0 for complex coding workflows and 58.6% on SWE Bench Pro for GitHub issue resolution. DeepSeek V4, meanwhile, continues gaining traction through lower cost deployment, open availability, and fast developer adoption. For enterprises, the real decision is performance, trust, or efficiency. Expect MENA firms to increasingly adopt multi model AI strategies going forward. #AI #OpenAI #DeepSeek #MENA #Innovation Sources :https://lnkd.in/dwRJZnJw https://lnkd.in/dhK-aPbg

deepseek-ai/DeepSeek-V4-Pro · Hugging Face huggingface.co
Like Comment
To view or add a comment, sign in
Jessica Saini
3w
Report this post
GPT-5.5 just dropped. Here's what the launch coverage isn't telling you. OpenAI shipped a full base-model retrain this week — not a post-training tweak, a ground-up rebuild optimized for agentic, long-context work. It now leads the Artificial Analysis Intelligence Index at 60 points (vs. Claude Opus 4.7 and Gemini 3.1 Pro Preview at 57). The benchmark gap is real and so is the fine print. On AA-Omniscience — a 6,000-question expert benchmark that penalizes confident wrong answers — GPT-5.5 posts an 86% hallucination rate. Claude Opus 4.7: 36%. Gemini 3.1 Pro: 50%. The model knows more. It's also wrong more often, and less likely to say "I don't know." For agentic coding, terminal automation, and long-horizon tasks? Genuinely impressive gains — Terminal-Bench 2.0 and OSWorld-Verified show a clear lead over competitors. For compliance docs, legal drafts, citation work, or anything where a confident wrong answer has real cost? That hallucination number is a workflow decision, not a footnote. A few things I'm watching: → SWE-Bench Pro: OpenAI scores 58.6% vs. Opus 4.7 at 64.3%. OpenAI didn't publish a SWE-Bench Verified score at all. → Pricing doubled vs. GPT-5.4. Token efficiency partially offsets this (~40% fewer output tokens), but GPT-5.5 is still not the cost leader at equivalent quality. The broader takeaway: all three frontier labs are now racing on the same axis — agentic, multi-step, autonomous work. The spread between them is narrowing. Your stack should be able to swap models as easily as bumping a version number. What workloads are you evaluating this on? https://lnkd.in/eWr-xcPw #AI #LLM #MachineLearning #EnterpriseAI #AIStrategy

Introducing GPT-5.5 openai.com
Like Comment
To view or add a comment, sign in
Jure Leskovec
2w
Report this post
We made a short video: Claude Code for predictive AI. It shows how coding agents like Claude Code and OpenAI Codex can work with the Kumo SDK and KumoRFM to help teams go from raw multi-table data to production-ready predictions. The bigger idea is simple: predictive modeling should not be gated by ML expertise, complex pipelines, or hundreds of lines of fragile code. With the right abstractions, including Graphs, Predictive Query Language, Explainability, and KumoRFM, coding agents can become much more useful for real-world enterprise AI. Watch the video, then try it for yourself: KumoRFM-2: https://lnkd.in/gCFZ_yxk Kumo Coding Agent Skills: https://lnkd.in/gcy9bW9f Blog: https://lnkd.in/gMeRX7kH

1 Comment
Like Comment
To view or add a comment, sign in
Amit Yadav
3w
Report this post
🚀 Most AI systems don’t fail because of the model — they fail because they don’t remember. While building conversational workflows, one limitation became immediately clear: LLMs are inherently stateless. Each invocation is processed independently, with no awareness of prior interactions. In practice, this leads to loss of continuity even in simple multi-turn conversations. For Day 19, I focused on implementing short-term memory using LangGraph, with the goal of moving from isolated responses to stateful, context-aware systems. 🧠 System-Level Memory Design Memory is not a property of the model — it is an architectural responsibility. LangGraph addresses this through: • Checkpointers Persist the graph’s state at each execution step • Thread Identifiers Maintain isolated state per user/session This creates a structured flow: User Input → Thread → State → Checkpoint → Resume Instead of reinitializing on every call, the system continues from the last known state. 💾 From Volatile to Persistent State Initial implementation used in-memory storage, which is suitable for prototyping but insufficient for production: • State loss on restart • No cross-session continuity To address this, I integrated a PostgreSQL-backed checkpointer (via Docker). This enables: ✔ Persistent conversation state ✔ Recovery across sessions ✔ Reliable multi-user handling ⚠️ Managing Context Window Constraints As conversation length increases, systems inevitably face: • Token limits • Increased latency • Degradation in response quality At this stage, memory becomes a management problem, not just storage. 🛠️ Strategies Implemented Three approaches were used to control context growth: • Trimming Automatically removes older messages beyond token limits • Deletion Explicit removal of stale or irrelevant data • Summarization Condenses historical interactions into a compact representation Among these, summarization proved most effective, as it preserves semantic meaning while reducing token usage. 🔁 Resulting System Behavior The system now evolves as follows: Raw conversation → Context growth → Summarization → Efficient continuation This ensures that: • Context is retained • Performance remains stable • Token limits are respected 🧠 Key Insight Effective memory design is not about retaining all data. It is about selective retention and controlled compression. This is a critical factor in building scalable, production-grade AI systems. Transitioning from stateless interactions to managed memory is a foundational step toward building reliable AI systems. How are you currently handling memory in your AI systems? #LangGraph #AgenticAI #AIEngineering #LLM #MemorySystems #SoftwareEngineering #Python
Like Comment
To view or add a comment, sign in
Bok Mykola
3d
Report this post
I like how this tutorial breaks the agent into three distinct roles: planner, executor, and critic. Most agentic AI demos blur those lines and end up brittle. Separating strategy from execution and adding a self-critique loop makes the system more robust and easier to debug. Practical breakdown worth checking out. Full write-up at MarkTechPost.com #AI #AgenticAI #OpenAI #LLM 🔗 https://lnkd.in/dJyjCnuz

How to Build an Advanced Agentic AI System with Planning, Tool Calling, Memory, and Self-Critique Using OpenAI API marktechpost.com
Like Comment
To view or add a comment, sign in
Syed Asad
3w
Report this post
OpenAI GPT 5.5 First Impressions! Mind-blowing upgrades that redefine what's possible: >> Unmatched Reasoning: Solves complex, multi-step problems like a PhD-level expert – think advanced math, code, and science in one shot. >> 10x Context Window: Handles massive datasets (up to 10M tokens) for deeper analysis without losing the plot. >> Multimodal Mastery: Seamlessly processes text, images, audio, and video – perfect for creative and enterprise workflows. >> Agentic Superpowers: Autonomous agents that plan, execute, and self-correct tasks across tools and APIs. >> Safety-First Design: Built-in safeguards with 2x better alignment, reducing hallucinations by 40%. >> Enterprise Ready: Faster inference, lower costs, and integrations for businesses scaling AI. >> Developer Dream: Enhanced APIs, fine-tuning, and open weights for custom models. Read More: https://lnkd.in/gtiUxGi6 https://lnkd.in/gRYZTHBr

Introducing GPT-5.5 openai.com
Like Comment
To view or add a comment, sign in
UNDERCODE TESTING

2,121 followers
1w
Report this post
Mastering the AI Revolution: From ANI to ASI – A Technical Deep Dive with RAG, Agents, and Vibe Coding + Video Introduction: Artificial Intelligence is no longer a futuristic concept—it’s embedded in our daily tools, from Siri to ChatGPT. Dr. Shlomi Boutnaru’s Artificial Intelligence Journey v4.0 (June 2025) breaks down the AI ecosystem into actionable technical concepts, covering everything from the Machine Learning lifecycle to the rise of AI Agents and Vibe Coding. This article extracts the core technical insights, adds hands-on tutorials, commands, and security considerations to help you master the AI stack....

Mastering the AI Revolution: From ANI to ASI – A Technical Deep Dive with RAG, Agents, and Vibe Coding + Video undercodetesting.com
Like Comment
To view or add a comment, sign in