Last week, I described four design patterns for AI agentic workflows that I believe will drive significant progress: Reflection, Tool use, Planning and Multi-agent collaboration. Instead of having an LLM generate its final output directly, an agentic workflow prompts the LLM multiple times, giving it opportunities to build step by step to higher-quality output. Here, I'd like to discuss Reflection. It's relatively quick to implement, and I've seen it lead to surprising performance gains.

You may have had the experience of prompting ChatGPT/Claude/Gemini, receiving unsatisfactory output, delivering critical feedback to help the LLM improve its response, and then getting a better response. What if you automate the step of delivering critical feedback, so the model automatically criticizes its own output and improves its response? This is the crux of Reflection.

Take the task of asking an LLM to write code. We can prompt it to generate the desired code directly to carry out some task X. Then, we can prompt it to reflect on its own output, perhaps as follows:

Here’s code intended for task X: [previously generated code]
Check the code carefully for correctness, style, and efficiency, and give constructive criticism for how to improve it.

Sometimes this causes the LLM to spot problems and come up with constructive suggestions. Next, we can prompt the LLM with context including (i) the previously generated code and (ii) the constructive feedback, and ask it to use the feedback to rewrite the code. This can lead to a better response. Repeating the criticism/rewrite process might yield further improvements. This self-reflection process allows the LLM to spot gaps and improve its output on a variety of tasks including producing code, writing text, and answering questions.

And we can go beyond self-reflection by giving the LLM tools that help evaluate its output; for example, running its code through a few unit tests to check whether it generates correct results on test cases or searching the web to double-check text output. Then it can reflect on any errors it found and come up with ideas for improvement.

Further, we can implement Reflection using a multi-agent framework. I've found it convenient to create two agents, one prompted to generate good outputs and the other prompted to give constructive criticism of the first agent's output. The resulting discussion between the two agents leads to improved responses.

Reflection is a relatively basic type of agentic workflow, but I've been delighted by how much it improved my applications’ results. If you’re interested in learning more about reflection, I recommend:
- Self-Refine: Iterative Refinement with Self-Feedback, by Madaan et al. (2023)
- Reflexion: Language Agents with Verbal Reinforcement Learning, by Shinn et al. (2023)
- CRITIC: Large Language Models Can Self-Correct with Tool-Interactive Critiquing, by Gou et al. (2024)

[Original text: https://lnkd.in/g4bTuWtU ]
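
The generate, critique, rewrite loop described in this post can be sketched in a few lines. This is a minimal illustration, not the author's implementation: `LLM` is a hypothetical stand-in for whatever function sends a prompt to your model and returns its reply, and the prompt wording beyond the critique prompt quoted above is an assumption.

```python
from typing import Callable

# Hypothetical type: any function that takes a prompt and returns the model's reply.
LLM = Callable[[str], str]

# Critique prompt modeled on the one quoted in the post.
CRITIQUE_PROMPT = (
    "Here's code intended for task {task}:\n{code}\n"
    "Check the code carefully for correctness, style, and efficiency, "
    "and give constructive criticism for how to improve it."
)

# Rewrite prompt: assumed wording, carrying (i) the code and (ii) the feedback.
REWRITE_PROMPT = (
    "Here's code intended for task {task}:\n{code}\n"
    "Here is feedback on that code:\n{feedback}\n"
    "Use the feedback to rewrite the code."
)


def reflect(llm: LLM, task: str, rounds: int = 2) -> str:
    """Generate code for `task`, then alternate critique and rewrite steps."""
    code = llm(f"Write code to carry out this task: {task}")
    for _ in range(rounds):
        feedback = llm(CRITIQUE_PROMPT.format(task=task, code=code))
        code = llm(REWRITE_PROMPT.format(task=task, code=code, feedback=feedback))
    return code
```

With `rounds=2` this makes five model calls: one to generate, then two critique/rewrite pairs. A tool-assisted variant would append unit test results to the critique prompt before the rewrite step.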
Best Programming Practices for Clean Code
-
If it works, don't touch it! 🤔 We've all heard this phrase in software engineering: "If it works, it works, so don't touch it!" But where does this fit in with the need for maintenance, refactoring, and continuous improvement? It's a balancing act. For small projects or microservices, it's often wise to avoid unnecessary changes—if the code is stable and meeting requirements, why risk introducing new issues? However, when it comes to larger projects with complex logic and multiple teams involved, the story changes. In these environments, it's crucial to regularly refactor and maintain the codebase. Clean, understandable code isn't just a luxury—it's essential for scalability, collaboration, and long-term success. So, the real question isn't whether to touch the code or not, but when and why. In small, contained contexts, stability might be your best friend. But in larger, more complex projects, investing in refactoring and maintenance pays off by reducing technical debt and ensuring the code remains a solid foundation for future development. Keep your projects thriving, not just surviving! #SoftwareEngineering #Refactoring #TechDebt #CodeQuality #DevelopmentStrategy #developer #code
-
Don't Let DRY Make Your Code Too Thirsty

The DRY (Don't Repeat Yourself) principle means that every piece of knowledge should only exist once in your codebase. Sounds great, right? But when does DRY become TOO DRY?

Sometimes, in an effort to eliminate all repetition, we can end up over-abstracting our code. This can lead to code that is hard to understand, maintain, or extend. For example, if you find yourself creating overly generic methods or classes that try to handle too many scenarios, you might be taking DRY too far. This makes the code confusing for others (or even your future self) and increases the chance of introducing bugs when changes are needed.

DRY isn't always the best choice. In cases like DTOs or database schemas, repetition can be more readable and clear. Reusing too much can make your design rigid and harder to change when requirements evolve.

Pros of DRY:
• Reduces repetition, making your code easier to maintain.
• Less copy-pasting means fewer chances for mistakes and errors.
• Changes in logic require fewer edits, which reduces the risk of bugs.

Cons of DRY:
• Too much abstraction can make your code hard to understand.
• Reusing too much logic across different parts can make changes risky and cause unexpected problems.

Where have you found DRY to be more trouble than it's worth? How do you balance avoiding repetition without over-complicating your code?

#DRY #SoftwareEngineering #ProgrammingPrinciples #CleanCode #CSharp
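
To make the "too DRY" case concrete, here is a small sketch in Python (the post is tagged C#, so treat the language as a swap). All function names and record fields are invented for illustration. The first function removes every bit of repetition by growing flags; the two after it repeat a little formatting but each one reads on its own.

```python
# Over-abstracted: one generic helper covers every case through flags,
# so each new requirement adds another parameter and another branch.
def format_record(record: dict, *, kind: str, include_email: bool = False,
                  uppercase_name: bool = False, prefix: str = "") -> str:
    name = record["name"].upper() if uppercase_name else record["name"]
    parts = [f"{prefix}{name}"]
    if include_email:
        parts.append(record["email"])
    if kind == "invoice":
        parts.append(f"total={record['total']}")
    return " | ".join(parts)


# A little repetition, but each function is obvious and can change independently.
def format_user(user: dict) -> str:
    return f"{user['name']} | {user['email']}"


def format_invoice(invoice: dict) -> str:
    return f"INV-{invoice['name'].upper()} | total={invoice['total']}"
```

Both styles produce the same strings today. The difference shows up when the invoice format changes: with the generic helper, every caller's flag combination has to be rechecked.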
-
I am an Engineering Manager working at Google with almost 20 years of experience. If I could sit down with a Jr. Software Engineer, here are 11 good pieces of advice I would tell them that I learned through my experiences…

1// If your app only serves around 10 users, a single server and a basic REST API will do the job. But if you’re handling 10 million requests a day, you need to start thinking about load balancers, autoscaling, and rate limiting.

2// If only one developer is building features, you can skip the ceremonies and just ship and test manually. But if you have 10 developers pushing code daily, it’s time to invest in CI/CD pipelines, multiple testing layers, and feature flags.

3// If a bit of downtime just breaks a single page, adding a banner and moving on is usually enough. But if downtime kills a key business flow, redundancy, health checks, and graceful fallbacks are absolutely necessary.

4// If you’re just consuming APIs, make sure you know how to handle errors like 400s and 500s. If you’re building APIs for others, you need to version them, document everything, test thoroughly, and set up proper monitoring.

5// If your product can tolerate a few seconds of lag, always pick code clarity over squeezing out a little more performance. But if users are waiting on every click, profiling, caching, and edge delivery need to become a part of your daily work.

6// If your data easily fits in RAM, keep things simple and store it in memory using maps. But if your data spans terabytes, you have to start thinking about indexing, partitioning, and optimizing for disk access patterns.

7// If you’re coding alone, poor naming might just annoy you. But in a growing team, bad names become a ticking time bomb for everyone.

8// If you’re only fixing bugs once a week, basic logs and console prints are probably enough. But when you’re running production systems, you need structured logs, tracing, real-time alerts, and dashboards.

9// If you’re up against tight deadlines, write the simplest code that gets things working. But if the code is meant to last, focus on readability, thorough testing, and making it easy to change in the future.

10// If you’re working alone, “it works on my machine” might be good enough. But in a real team, reproducible builds and shared development setups are the bare minimum.

11// If your app is new, move fast and don’t worry too much about cleaning up right away. But once your app is stuck in maintenance hell, you’ll pay the price for every rushed decision you made in the past.

People think software engineering is just about building things. It’s really about:
- Knowing when not to build
- Being okay with deleting good code
- Balancing tradeoffs without always having all the data

The best engineers don’t just ship fast. They build systems that are safe to move fast on top of.
-
When working with multiple LLM providers, managing prompts, and handling complex data flows, structure isn't a luxury, it's a necessity.

A well-organized architecture enables:
→ Collaboration between ML engineers and developers
→ Rapid experimentation with reproducibility
→ Consistent error handling, rate limiting, and logging
→ Clear separation of configuration (YAML) and logic (code)

Key Components That Drive Success

It’s not just about folder layout; it’s how components interact and scale together:
→ Centralized configuration using YAML files
→ A dedicated prompt engineering module with templates and few-shot examples
→ Properly sandboxed model clients with standardized interfaces
→ Utilities for caching, observability, and structured logging
→ Modular handlers for managing API calls and workflows

This setup can save teams countless hours in debugging, onboarding, and scaling real-world GenAI systems, whether you're building RAG pipelines, fine-tuning models, or developing agent-based architectures.

→ What’s your go-to project structure when working with LLMs or Generative AI systems? Let’s share ideas and learn from each other.
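
Two of the components listed above, a standardized model client interface and a prompt template module, can be sketched with the standard library alone. Everything named here (`ModelClient`, `EchoClient`, `SUMMARIZE`, `run`) is hypothetical and only illustrates the shape; a real client would wrap a provider SDK, and the YAML configuration layer is left out.

```python
from dataclasses import dataclass
from string import Template
from typing import Protocol


class ModelClient(Protocol):
    """The one interface every provider wrapper is expected to satisfy."""

    def complete(self, prompt: str) -> str: ...


@dataclass
class EchoClient:
    """Stand-in provider for local testing; a real client would call a provider API here."""

    name: str

    def complete(self, prompt: str) -> str:
        return f"[{self.name}] {prompt}"


# Prompt templates live apart from the calling logic so they can be versioned and reviewed.
SUMMARIZE = Template("Summarize the following text in $n sentences:\n$text")


def run(client: ModelClient, template: Template, **values: object) -> str:
    """Fill a template and send it through whichever client was passed in."""
    return client.complete(template.substitute(**values))
```

Because `run` depends only on the `ModelClient` protocol, swapping providers means adding another class with a `complete` method; the handlers and templates stay unchanged.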
-
Last night, I was chatting in the hotel bar with a bunch of conference speakers at Goto-CPH about how evil PR-driven code reviews are (we were all in agreement), and Martin Fowler brought up an interesting point. The best time to review your code is when you use it. That is, continuous review is better than what amounts to a waterfall review phase. For one thing, the reviewer has a vested interest in assuring that the code they're about to use is high quality. Furthermore, you are reviewing the code in a real-world context, not in isolation, so you are better able to see if the code is suitable for its intended purpose. Continuous review, of course, also leads to a culture of continuous refactoring. You review everything you look at, and when you find issues, you fix them. My experience is that PR-driven reviews rarely find real bugs. They don't improve quality in ways that matter. They DO create bottlenecks, dependencies, and context-swap overhead, however, and all that pushes out delivery time and increases the cost of development with no balancing benefit. I will grant that two or more sets of eyes on the code leads to better code, but in my experience, the best time to do that is when the code is being written, not after the fact. Work in a pair, or better yet, a mob/ensemble. One of the teams at Hunter Industries, which mob/ensemble programs 100% of the time on 100% of the code, went a year and a half with no bugs reported against their code, with zero productivity hit. (Quite the contrary—they work very fast.) Bugs are so rare across all the teams, in fact, that they don't bother to track them. When a bug comes up, they fix it. Right then and there. If you're working in a regulatory environment, the Driver signs the code, and then any Navigator can sign off on the review, all as part of the commit/push process, so that's a non-issue. There's also a myth that it's best if the reviewer is not familiar with the code. I *really* don't buy that. 
An isolated reviewer doesn't understand the context. They don't know why design decisions were made. They have to waste a vast amount of time coming up to speed. They are also often not in a position to know whether the code will actually work. Consequently, they usually focus on trivia like formatting. That benefits nobody.
-
Most of us review code in the wrong order. We spot a missing test or a style inconsistency before even asking whether the code is correct. We should think about it differently. The first question should always be: Does this code do what it is supposed to do? If the answer is no, nothing else matters. Style, structure, tests: all secondary to correctness. Once you are confident it is correct, ask if it is clear. Can someone else (or you, six months from now) understand what is happening and why? Clarity in code helps ensure it does not become a liability. Then check whether it matches the style and conventions, because inconsistencies add cognitive load for everyone who reads the codebase afterward. After that, look for duplication. Is this solving a problem that is already solved somewhere else? Could this be a shared utility? Finally, ask whether it is well tested. Not just "are there tests?" (nonsensical ones don't count), but do the tests actually cover the meaningful cases? Correctness. Clarity. Style. Deduplication. Tests. In that order, every time. Hope this helps.
-
Rewrites feel clean at the start. Then reality shows up. Missed edge cases. Broken behavior. Delayed releases. A second system nobody fully trusts. A safer option is to migrate incrementally. That’s where the Strangler Fig Pattern shines. Instead of replacing the whole legacy API at once, you put a reverse proxy in front of it and start routing traffic endpoint by endpoint. Old system keeps running. New system takes over gradually. Risk stays contained. In my example, I start with a Node.js API, add YARP as a reverse proxy, and then migrate individual endpoints into a modern .NET 10 API. The nice part is that this works just as well for old .NET Framework apps. You don’t need a giant rewrite to modernize a legacy system. You need a controlled migration path. I break down the full implementation here: https://lnkd.in/dg_zf-MV
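
The routing decision at the heart of this pattern is small enough to sketch. The snippet below is not YARP and not the author's implementation; it is a plain Python illustration of what the reverse proxy decides per request, with made-up upstream URLs and path prefixes.

```python
# Hypothetical upstream addresses; a real proxy would read these from configuration.
LEGACY_UPSTREAM = "http://legacy.internal"
MODERN_UPSTREAM = "http://modern.internal"

# Endpoints already moved to the new system. Migrating another endpoint
# means appending its prefix here; everything else still goes to legacy.
MIGRATED_PREFIXES = ["/api/orders", "/api/users"]


def pick_upstream(path: str) -> str:
    """Return the upstream that should serve this request path."""
    for prefix in MIGRATED_PREFIXES:
        # Match the prefix itself or anything beneath it, but not look-alikes
        # such as /api/ordersx.
        if path == prefix or path.startswith(prefix + "/"):
            return MODERN_UPSTREAM
    return LEGACY_UPSTREAM
```

Rolling back a migrated endpoint is the same one-line change in reverse, which is what keeps the risk contained.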
-
A Few Lessons from Deploying and Using LLMs in Production

Deploying LLMs can feel like hiring a hyperactive genius intern: they dazzle users while potentially draining your API budget. Here are some insights I’ve gathered:

1. “Cheap” is a Lie You Tell Yourself: Cloud costs per call may seem low, but the overall expense of an LLM-based system can skyrocket. Fixes:
- Cache repetitive queries: Users ask the same thing at least 100x/day.
- Gatekeep: Use cheap classifiers (BERT) to filter “easy” requests. Let LLMs handle only the complex 10% and your current systems handle the remaining 90%.
- Quantize your models: Shrink LLMs to run on cheaper hardware without massive accuracy drops.
- Asynchronously build your caches: Pre-generate common responses before they’re requested, or gracefully fail the first time a query comes in and cache the answer for the next time.

2. Guard Against Model Hallucinations: Sometimes, models express answers with such confidence that distinguishing fact from fiction becomes challenging, even for human reviewers. Fixes:
- Use RAG: Just a fancy way of saying to provide your model the knowledge it requires in the prompt itself by querying some database based on semantic matches with the query.
- Guardrails: Validate outputs using regex or cross-encoders to establish a clear decision boundary between the query and the LLM’s response.

3. The best LLM is often a discriminative model: You don’t always need a full LLM. Consider knowledge distillation: use a large LLM to label your data and then train a smaller, discriminative model that performs similarly at a much lower cost.

4. It's not about the model, it is about the data on which it is trained: A smaller LLM might struggle with specialized domain data; that’s normal. Fine-tune your model on your specific data set, starting with parameter-efficient methods (like LoRA or Adapters) and using synthetic data generation to bootstrap training.

5. Prompts are the new Features: Treat prompts like any other feature in your system. Version them, run A/B tests, and continuously refine using online experiments. Consider bandit algorithms to automatically promote the best-performing variants.

What do you think? Have I missed anything? I’d love to hear your “I survived LLM prod” stories in the comments!
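
The first two cost fixes in this post, caching repeated queries and gatekeeping easy requests, combine naturally. The sketch below is only an illustration under stated assumptions: `is_easy`, `cheap_answer`, and `llm_answer` are hypothetical callables you would supply (for example a small classifier, an existing rules system, and an LLM client), and the cache is a plain in-memory dict with naive whitespace and case normalization.

```python
from typing import Callable


def make_answerer(
    is_easy: Callable[[str], bool],      # hypothetical cheap classifier
    cheap_answer: Callable[[str], str],  # hypothetical existing (non-LLM) system
    llm_answer: Callable[[str], str],    # hypothetical LLM call
) -> Callable[[str], str]:
    """Build an answer function that caches results and routes easy queries away from the LLM."""
    cache: dict[str, str] = {}

    def answer(query: str) -> str:
        # Normalize so trivially different phrasings share one cache entry.
        key = " ".join(query.lower().split())
        if key not in cache:
            handler = cheap_answer if is_easy(query) else llm_answer
            cache[key] = handler(query)
        return cache[key]

    return answer
```

A production version would bound the cache size and expire entries; semantic (embedding-based) matching would catch more repeats than string normalization does.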
-
Boss, it's not no-code VS custom code. It's knowing when to switch between them.

I've built dozens of AI workflows in n8n. Here's the framework that actually works.

✳️ Start with no-code when you need:
Speed → Something running today, not next month
Standard patterns → Email routing, data syncing, basic AI responses
Team collaboration → Non-technical folks will modify it later

n8n's 300+ integrations get you from zero to working in under an hour.

✳️ Switch to custom code when you hit:
Complex logic → Nested conditionals taking 10+ visual nodes to build
Performance walls → Processing thousands of records where JavaScript runs 10x faster
Unique AI behavior → Fine-grained prompt control that built-in nodes can't handle

💡 The hybrid approach wins most often. Use n8n's visual builder for workflow structure. Drop in Code Nodes only where you need custom logic.

A good analogy would be LEGO vs clay. Standardized blocks snap together fast. Custom molding gives you precision. Smart builders know when to use each.

The mistake isn't picking the wrong tool. It's not knowing when to switch.

What's your experience? Do you fight with no-code when code would be faster, or over-engineer with custom scripts when simple integrations would work?

Follow me, Bhavishya Pandit, for practical AI automation insights 🔥
