Monthly Archives: March 2025

Good engineering practices have become essential in the age of GenAI-assisted development

Since Cursor launched their Composer Agent “Yolo” mode there’s been a lot of excitement about fully AI-driven development, where developers sketch out the intent and the agent generates the code. It’s interesting stuff, and I get why people are excited.

But what’s becoming increasingly clear to me is that modern engineering best practices aren’t just nice to have in this world – they’re absolutely essential. Without them, you’re basically handing the GenAI a loaded gun and pointing it at your own foot.

For one thing, if this is the case, the value of people with my type of skills and experience has just shot up significantly, which is why I keep coming back to this Kent Beck tweet from a couple of years ago:

Kent Beck, on Twitter: “I’ve been reluctant to try ChatGPT. Today I got over that reluctance. Now I understand why I was reluctant. The value of 90% of my skills just dropped to $0. The leverage for the remaining 10% went up 1000x. I need to recalibrate.”

Why good engineering practices matter more than ever

I see a lot of guides and people comparing using GenAI in software development to pairing with a junior developer, but that’s the wrong analogy.

It’s much more like a toddler.

Just like a young child, GenAI has no real-world context. It grabs random things, unexpectedly veers off in bizarre directions, and insists it’s right even when it’s covered in spaghetti. It’ll confidently hand you a “masterpiece” that’s just glue and glitter stuck to your phone. Anyone who’s had young children will know this very well – you can’t leave them alone for a second.

GenAI can produce plausible-looking code at incredible speed, but it will happily generate code that’s subtly, or spectacularly, wrong.

Without good practices and strong guardrails around it all, you’re not accelerating delivery – you’re accelerating chaos.

The engineering practices that matter

None of these are new. They’re well-established, widely recognised best practices for building and maintaining software. They’ve always mattered – but now they’re essential. If you’re not in good shape on these fronts, I’d strongly suggest staying away from AI-driven development until you are. Otherwise, you’re just putting your foot down on the road to disaster.

  • Clear requirements and expected outcomes – Knowing what you’re building and why, with clear, well understood, outcome-based requirements and definitions of success.
  • Clean, consistent, loosely coupled code – Code that’s easy to understand, maintain, and extend, with clear separation of concerns, high cohesion and minimal unnecessary dependencies.
  • High and effective automated testing – Unit, integration and E2E tests, all running as part of your deployment pipeline.
  • Frequent code check-ins – Regularly checking in code, keeping branches short-lived.
  • Continuous Delivery – Highly automated build and deployment, releasing frequently (not just every two weeks).
  • Static analysis – Automated checks for code quality, vulnerabilities, and other issues, baked into your pipelines.
  • Effective logging and monitoring – Clear visibility into what’s happening in all environments, so issues can be identified and understood quickly.
  • Infrastructure as code – Consistent, repeatable infrastructure and environments, easy to maintain and keep secure.
  • Effective documentation – Lightweight, useful documentation that explains why something was done, not just what was done.
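To make the automated-testing point concrete, here’s a deliberately small sketch of the kind of unit test that should gate AI-generated code before it reaches your pipeline. The `slugify` helper and its tests are hypothetical examples invented for illustration, not from any particular codebase.

```python
import re

# Imagine this helper was AI-generated: plausible-looking, but the edge
# cases below are exactly where such code tends to go subtly wrong.
def slugify(title: str) -> str:
    """Convert a title to a URL-safe slug (lowercase, hyphen-separated)."""
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower())
    return slug.strip("-")

# Tests like these, run on every check-in, catch the edge cases GenAI
# tends to skip (punctuation-only input, mixed symbols, trailing hyphens).
def test_slugify_basic():
    assert slugify("Good Engineering Practices") == "good-engineering-practices"

def test_slugify_edge_cases():
    assert slugify("  ---  ") == ""  # punctuation-only input
    assert slugify("GenAI & CI/CD!") == "genai-ci-cd"
```

The tests themselves are trivial; the point is that they run automatically, so a confidently wrong AI suggestion fails fast instead of reaching production.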

Common knowledge, but not common practice

I’ve long opined that whilst these are the most well-established and widely recognised best practices, they are still far from commonplace. Only a relatively small proportion of organisations and teams actually follow them well, and the number of people in the industry with these skills is still comparatively small.

The reality is, most of the industry is still stuck in bad practices – messy code, limited automated testing, poor automation and visibility, and a general lack of solid engineering discipline.

If those teams want to lean heavily into GenAI, they’ll need to seriously improve their fundamentals first. For many, that’s a long, difficult journey – one I suspect most won’t take.

While already high-performing teams will likely see some benefit, I predict most of the rest will charge in headfirst, blow up their systems, and create plenty of work for consultants to come in to clean up the mess.

Final Thought

GenAI isn’t a shortcut past the hard work of good engineering – it shines a spotlight on why the established good practices were already so important. The teams who’ve already invested in strong engineering discipline will be the ones who may see real value from AI-assisted development.

For everyone else, GenAI won’t fix your problems – it’ll amplify them.

Whether it leads to acceleration or chaos depends entirely on how strong your foundations are.

If you’re serious about using GenAI well, start by getting your engineering house in order.

What CTOs and tech leaders are observing about GenAI in software development

It’s helpful to get a view of what’s actually happening on the ground rather than the broader industry hype. I’m in quite a few CTO and Tech Leader forums, so I thought I’d do something GenAI is quite good at – collate and summarise conversation threads and identify common themes and patterns.

Here’s a consolidated view of the observations, patterns and experiences shared by CTOs and tech leaders across various CTO forums over the last couple of months.

Disclaimer: This article is largely the output from my conversation with ChatGPT, but reviewed and edited by me (and as usual with GenAI, it took quite a lot of editing!)

Adoption across organisations is mixed and depends on context

  • GenAI adoption varies significantly across companies, industries and teams.
  • In the UK, adoption appears lower than in the US and parts of Europe, with some surveys showing over a third of UK developers are not using GenAI at all and have no plans to (see “UK developers slow to adopt AI tools says new survey”).
    • The slower adoption is often linked to:
      • A more senior-heavy developer population.
      • Conservative sectors like financial services, where risk appetite is lower.
  • Teams working in React, TypeScript, Python, Bash, SQL and CRUD-heavy systems tend to report the best results.
  • Teams working in Java and .NET often find GenAI suggestions less reliable.
  • Teams working in modern languages and well-structured systems are adopting GenAI more successfully.
  • Teams working in complex domains, messy code and large legacy systems often find the suggestions more distracting than helpful.
  • Feedback on GitHub Copilot is mixed. Some developers find autocomplete intrusive or low-value.
  • Many developers prefer working directly with ChatGPT or Claude, rather than relying on inline completions.

How teams are using GenAI today

  • Generating boilerplate code (models, migrations, handlers, test scaffolding).
  • Writing initial tests, particularly in test-driven development flows.
  • Debugging support, especially for error traces or unfamiliar code.
  • Generating documentation.
  • Supporting documentation-driven development (docdd), where structured documentation and diagrams live directly in the codebase.
  • Some teams are experimenting with embedding GenAI into CI/CD pipelines, generating:
    • Documentation.
    • Release notes.
    • Automated risk assessments.
    • Early impact analysis.
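As a sketch of what embedding GenAI in a pipeline step might look like, here’s a minimal example that bundles recent commit messages into a release-notes prompt. The function name, the prompt wording and the eventual LLM call are all assumptions for illustration – the actual API integration is deliberately out of scope.

```python
# Hypothetical CI step: gather recent commit messages and build a prompt
# for an LLM to turn into user-facing release notes. The downstream API
# call (OpenAI, Claude, etc.) would consume this prompt; it is omitted here.

def build_release_notes_prompt(commit_messages: list[str]) -> str:
    """Bundle commit messages into a single release-notes summarisation prompt."""
    bullet_list = "\n".join(f"- {msg}" for msg in commit_messages)
    return (
        "Summarise the following commits as user-facing release notes, "
        "grouping related changes and omitting internal refactors:\n"
        f"{bullet_list}"
    )

prompt = build_release_notes_prompt([
    "Add CSV export to reports page",
    "Fix off-by-one in pagination",
    "Refactor internal logging helper",
])
print(prompt)
```

Keeping the prompt-building step as plain, testable code (separate from the model call) means the pipeline stays deterministic everywhere except the one step that genuinely needs the LLM.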

GenAI is impacting more than just code writing

Some teams are seeing value beyond code generation, such as:

  • Converting meeting transcripts into initial requirements.
  • Auto-generating architecture diagrams, design documentation and process flows.
  • Enriching documentation by combining analysis of the codebase with historical context and user flows.
  • Mapping existing systems to knowledge graphs to give GenAI a better understanding of complex environments.

Some teams are embedding GenAI directly into their processes to:

  • Summarise changes into release notes.
  • Capture design rationale directly into the codebase.
  • Generate automated impact assessments during pull requests.

Where GenAI struggles

  • Brownfield projects, especially those with:
    • Deep, embedded domain logic.
    • Little or inconsistent documentation.
    • Highly bespoke patterns.
    • Inconsistent and poorly structured code.
  • Languages with smaller training data sets, like Rust.
  • Multi-file or cross-service changes where keeping context across files is critical.
  • GenAI-generated code often follows happy paths, skipping:
    • Error handling.
    • Security controls (e.g., authorisation, auditing).
    • Performance considerations.
  • Several CTOs reported that overly aggressive GenAI use led to:
    • Higher defect rates.
    • Increased support burden after release.
  • Large, inconsistent legacy codebases are particularly challenging, where even human developers struggle to build context.

Teams are applying guardrails to manage risks

Many teams apply structured oversight processes to balance GenAI use with quality control. Common guardrails include:

  • Senior developers reviewing all AI-generated code.
  • Limiting GenAI to lower-risk work (boilerplate, tests, internal tooling).
  • Applying stricter human oversight for:
    • Security-critical features.
    • Regulatory or compliance-related work.
    • Any changes requiring deep domain expertise.

The emerging hybrid model

The most common emerging pattern is a hybrid approach, where:

  • GenAI is used to generate initial code, documentation and change summaries, with final validation and approval by experienced developers.
  • Developers focus on design, validation and higher-risk tasks.
  • Structured documentation and design rules live directly in the codebase.
  • AI handles repetitive, well-scoped work.

Reported productivity gains vary depending on context

  • The largest gains are reported in smaller, well-scoped greenfield projects.
  • Moderate gains are more typical in medium-to-large, established or more complex systems.
  • Neutral to negative benefit is reported in very large, messy or legacy codebases.
  • Across the full delivery lifecycle, a ~25% uplift is seen as a realistic upper bound.
  • The biggest time savings tend to come from:
    • Eliminating repetitive or boilerplate work.
    • Speeding up research and discovery (e.g., understanding unfamiliar code or exploring new APIs).
  • Teams that invest in clear documentation, consistent patterns and cleaner codebases generally see better results.

Measurement challenges

  • Most productivity gains reported so far are self-reported or anecdotal.
  • Team-level metrics (cycle time, throughput, defect rates) rarely show clear and consistent improvements.
  • Several CTOs point out that:
    • Simple adoption metrics (e.g., number of Copilot completions accepted) are misleading.
    • Much of the real value comes from reduced research time, which is difficult to measure directly.
  • Some CTOs also cautioned that both individuals and organisations are prone to overstating GenAI benefits to align with investor or leadership expectations.

Summary

Across all these conversations, a consistent picture emerges – GenAI is changing how teams work, but the impact varies heavily depending on the team, the technology and the wider processes in place.

  • The biggest gains are in lower-risk, less complex, well-scoped work.
  • Teams with clear documentation, consistent patterns and clean codebases see greater benefits.
  • GenAI is a productivity multiplier, not a team replacement.
  • The teams seeing the most value are those treating GenAI as part of a broader process shift, not just a new tool.
  • Long-term benefits depend on strong documentation, robust automated testing and clear processes and guardrails, ensuring GenAI accelerates the right work without introducing unnecessary risks.

The overall sentiment is that GenAI is a useful assistant, but not a transformational force on its own. The teams making meaningful progress are those actively adapting their processes, rather than expecting the technology to fix underlying delivery issues.