Thought Leadership

AI Can Code. But Can It Ship?

The Challenges of AI Code Generation

Haider Al-Seaidy

Chief Customer Officer, Customer Success

Every few months, it seems, the tech industry collectively gasps at the next leap forward in large language models (LLMs) such as GPT-5. These models can now generate working code with a fluency that would have seemed like science fiction only a few years ago.

On the surface, this looks like the holy grail: tell the AI what you want, and voilà, out comes a program. The dream of “everyone can code” is suddenly within reach. A marketing executive, a biology undergrad, or even my neighbour who still struggles with Excel formulas can now dabble in the art of software creation. That’s no small advancement.

But here’s the catch: generating code has never been the hard part. Beneath the hype lie very real AI code generation challenges that can’t be ignored. The true difficulty is controlling the code—ensuring it fits into a coherent architecture, remains maintainable, resilient, and performant, and, above all, aligns with a product vision that solves real business problems. Writing lines of code is easy. Shipping software that enterprises can depend on? That’s where the battle is truly won or lost.


The Hidden Challenges of AI Code Generation

Having worked with enterprise clients, I know how cautious they are with new technology. “Due diligence” isn’t just a phrase; it’s a ritual. Layers of testing, review, and validation are baked into the software delivery lifecycle (SDLC). And for good reason.

Now imagine dropping AI-generated code into that environment. Suddenly, the AI code generation challenges aren’t just theoretical; they strike at the heart of enterprise software delivery. What if the code contains vulnerabilities the prompter doesn’t understand? What if it lacks error handling, or behaves unpredictably under stress? It’s hard enough to debug our own logic, let alone logic conjured up by a statistical model.

And then there are the questions nobody has clear answers to yet:

  • What does the SDLC look like when code is generated by AI?
  • Can prompts ever capture the level of detail needed for robust software?
  • If your prompt is longer than the code it produces… have you really saved time?
  • When bugs crop up, do you fix them with more prompting, or roll up your sleeves and get into the code yourself?

We’re in uncharted waters. And while the ship is moving faster, I can’t help but notice the extra time we’re now spending inspecting the cargo.


LLMs in Software Development: A Double-Edged Sword

Take a friend of mine as an example. He doesn’t have a background in coding, but thanks to today’s LLMs, he managed to create a trading bot. He won’t stop telling people that it actually made him money three days in a row. Now he’s convinced he’s cracked the markets and is walking around with the swagger of a hedge fund quant.

The reality, of course, is that the market is just being polite to him for now. We all know what usually happens to “can’t miss” trading bots after day four. But his story does illustrate the point: the barriers to entry have been dramatically lowered. LLMs in software development make it easier than ever for non-experts to build something that works, at least temporarily. Whether that thing is robust, reliable, secure, and sustainable… well, that’s another matter entirely.


Speeding Up in One Place, Slowing Down in Another

Yes, AI accelerates the creation of code. But deployment in an enterprise context involves more than speed. It requires security reviews, performance testing, integration planning, architecture alignment, documentation, and ongoing maintenance.

Ironically, introducing AI-generated code may increase time spent validating, auditing, and rewriting. You gain speed in development, but risk losing it in governance and quality assurance.

It reminds me of the old truth: it’s always harder to understand someone else’s logic than your own. Well, what happens when that “someone else” is an AI?


The Human Side

There’s another wrinkle: how do we measure developer competency in this new world? Hiring was already a challenge, but what happens when a candidate generates elegant solutions with a single prompt? Do they really understand the underlying systems? Or have we entered the era of “prompt jockeys” passing as engineers?

I worry this could dumb us down, making it harder to tell who’s a true builder and who’s just very good at nudging a model in the right direction. It may become harder to see the wood for the trees.


AI in Enterprise Software Delivery: What Still Matters

For my company, Cyferd, the lesson is to stay aware of what’s happening in the industry while remaining grounded in reality. The world is too complex to be solved purely through automated code generation. Yes, this is a leap forward. Yes, it lowers barriers and empowers more people to experiment. But when it comes to AI in enterprise software delivery, the reality is clear: AI-generated code is, at least for now, just one small piece of a much larger puzzle.

The future may prove me wrong. But today, I’d say: AI can code. Impressive. But can it ship? That’s another question entirely.
