Human in the Loop: Engineering with Vibe Coding
Vibe coding accelerates innovation. HPC workloads demand discipline.
Recently, I was asked to help with a customer case involving Nextflow pipelines on OCI. The request seemed simple on the surface: could we run real Nextflow pipelines efficiently on OCI? The challenge was that, unlike on other clouds, there was no production-ready Nextflow plugin/executor integration for OCI. So the task was not just “can OCI run Nextflow,” but whether we could make it run predictably, end-to-end, and in a way customers could adopt.
The context and the constraint
I had to build nf-iac in a few weeks, starting from a point where I was not yet comfortable with Nextflow internals. It became a real test of where AI helps and where engineering discipline still has to lead.
I did have an advantage:
- solid infrastructure and IaC experience
- distributed systems instincts
- tolerance for operational edge cases (resources, locality, retries, and data movement)
Time was short, so I did not start by trying to become a Nextflow expert. I used a practical method:
Deconstruct → observe → understand → reassemble.
Deconstruction: split the problem first
I split the system into two layers:
- a thin Nextflow-side executor/plugin that extracts task context and sends JSON to a REST gateway
- an nf-server service that receives that context and performs execution with IaC logic
This design gave immediate observability while I was still learning:
- JSON payloads could be logged, diffed, and replayed
- each transition became explicit
- integration behavior became testable under time pressure
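To make the split concrete, here is a hedged sketch of what a task-context payload could look like. The field names (`taskId`, `resources`, `workDir`, and so on) are illustrative, not the actual nf-iac REST contract:

```python
import json

# Hypothetical shape of the task context a thin Nextflow-side plugin
# could extract and POST to a REST gateway. Field names are illustrative.
def build_task_payload(task_id, script, cpus, memory_gb, inputs, workdir):
    """Serialize one task's context as a self-contained JSON document."""
    return json.dumps({
        "taskId": task_id,
        "script": script,            # the bash block Nextflow would run
        "resources": {"cpus": cpus, "memoryGb": memory_gb},
        "inputs": inputs,            # staged file paths or object URIs
        "workDir": workdir,
    }, sort_keys=True)               # stable key order keeps payloads diffable

payload = build_task_payload(
    "methylseq:fastqc-1", "fastqc sample_1.fq.gz", 4, 8,
    ["oci://bucket/sample_1.fq.gz"], "/work/ab/12cd",
)
# Because the payload is plain JSON, it can be logged, diffed, and replayed.
replayed = json.loads(payload)
```

The `sort_keys=True` detail is what makes the "logged, diffed, and replayed" workflow practical: two payloads for the same task serialize identically.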
With AI, a working version came together surprisingly fast.
The inputs were narrow and explicit:
- architecture split
- REST contract
- IaC starting template
- a few real sample pipelines
That first phase felt like a real “vibe coding” moment: mechanical tasks disappeared quickly and scaffolding appeared.
Why AI moved us to “it runs” quickly but not “it’s ready”
AI is very good at producing something that runs.
It is less effective at enforcing maintainability, minimalism, and coherence on its own.
So I kept boundaries tight:
- clear contracts
- deterministic behavior
- deliberate ownership of taste and structure
The first lesson was simple:
AI can take you to “it runs.”
Engineering is what takes it from “it runs” to “it is trustworthy.”
Reality check: from demos to production-shaped tests
The first version worked on my own pipeline, then on small baselines.
Then came the real workload: nf-core/methylseq on a large dataset.
That moved the problem from “demo success” to “systems reliability.”
The break happened quietly: early-stage failures, intermittent missing outputs, and behavior that passed simple checks but broke under load.
The failure pattern pointed to assumptions, not syntax:
- timing
- consistency
- filesystem semantics
- metadata behavior under pressure
Object storage semantics became the blocker.
Object storage mismatch and the storage bridge
This shifted the project into a design problem.
REST and API-style decomposition helped for speed, but HPC workloads exposed mismatched assumptions between:
- POSIX-oriented workflow expectations
- object storage behavior
I built BucketFS to make the gap explicit:
- work directories on local NVMe for normal filesystem behavior
- bucket as durable namespace through a FUSE mount
- explicit data movement points
- observable upload/commit flow
The prototype came quickly, but systems bugs were immediate:
- path handling edge cases
- directory vs file semantics
- mount behavior nuances
- hidden workflow assumptions around symlinks and metadata
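These semantics gaps can be probed directly rather than discovered mid-pipeline. A minimal sketch, assuming a mounted path to test; the checks and names here are illustrative, not nf-iac code:

```python
import os

# Before trusting a FUSE mount with POSIX-shaped workflow assumptions,
# verify the semantics that Nextflow-style pipelines rely on.
def probe_mount(root: str) -> dict:
    results = {}
    target = os.path.join(root, "probe-target")
    link = os.path.join(root, "probe-link")
    with open(target, "w") as fh:
        fh.write("x")
    try:
        os.symlink(target, link)   # many object-storage mounts reject this
        results["symlinks"] = os.path.islink(link)
        os.unlink(link)
    except OSError:
        results["symlinks"] = False
    # Directory-vs-file semantics: object stores have no true directories,
    # so mkdir on a mount may be emulated and behave inconsistently.
    sub = os.path.join(root, "probe-dir")
    os.mkdir(sub)
    results["directories"] = os.path.isdir(sub)
    os.rmdir(sub)
    os.unlink(target)
    return results
```

Running a probe like this against the mount point at startup turns "hidden workflow assumptions" into an explicit pass/fail report.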
AI accelerated scaffolding and surfaced likely traps, but it did not replace the design decisions.
I simplified further:
- explicit stage-in / stage-out
- deterministic local-first execution
- scripts over daemons where possible
- minimal dependency surface
- predictable cloud CLI path for operational simplicity
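The "scripts over daemons" shape reduces to one deterministic stage-in → run → stage-out sequence per task. A sketch under stated assumptions: the bucket is modeled as a plain directory here, whereas in practice the copy steps would shell out to a cloud CLI; none of this is the actual nf-iac implementation:

```python
import shutil
import subprocess
import tempfile
from pathlib import Path

# One deterministic stage-in -> run -> stage-out sequence per task.
# The bucket is modeled as a local directory for illustration only.
def run_task(bucket_dir, task_inputs, command, outputs):
    work = Path(tempfile.mkdtemp(prefix="task-"))    # local NVMe work dir
    for name in task_inputs:                         # explicit stage-in
        shutil.copy2(bucket_dir / name, work / name)
    subprocess.run(command, cwd=work, check=True)    # normal POSIX execution
    for name in outputs:                             # explicit stage-out:
        shutil.copy2(work / name, bucket_dir / name) # the commit point
    shutil.rmtree(work)
```

Because every data movement point is a visible copy, failures localize to one of three phases instead of hiding inside a background sync daemon.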
That move was boring in the best sense: stable, readable, and reliable.
Reassembly: back to a single operational shape
With enough understanding, version 1 had done its job.
I had learned what Nextflow really does, and where failures propagate.
So I collapsed to nf-iac:
- one coherent wrapper around the execution model
- clearer operational surface
- stronger support for IaC-driven lifecycle
The merge took about three days of code work, largely because the uncertainty was already gone.
AI worked best when the target behavior was already clear:
- high-volume refactors
- interface translation across layers
- wrapper-style consolidation
It still could not replace:
- contract ownership
- invariants
- tradeoff decisions
Why tool-shaped architecture worked better than service-shaped API design
Version 1 worked as a RESTful API split: thin plugin + execution service.
For learning, that was perfect.
For scale, that became extra lifecycle complexity.
nf-iac as a tool was easier to reason about because it kept:
- narrow input/output boundaries
- deterministic behavior
- fewer moving operational surfaces
This project reinforced what I now prefer in many AI-assisted systems: tools with explicit contracts rather than service-like flexibility that drifts into ambiguity.
Vibe Coding to Value
Only after stability did I run the real cost comparison versus an alternative cloud execution path with:
- same pipeline
- same dataset size
- comparable performance expectations
For this measurement, nf-iac showed a clear cost advantage without sacrificing performance.
The result is not a universal benchmark; it is one workload-specific measurement. But it validated one practical point:
closing the execution/contract gap can make OCI cost advantages visible and real.
Why it did not stop at completion
Once the execution became declarative, extending to heterogeneous compute (Arm + GPU stages in one framework) became natural.
The architecture was now a shape:
- one framework
- multiple compute targets
- lower cost of experimentation
That extension took about two days and validated the same pattern:
AI can build quickly. Humans still decide what should be built, how far to extend it, and why.
Working with AI: what works, what does not
What I use AI for
- quick orientation in unfamiliar systems
- language and layer translation
- low-level scaffolding where constraints are clear
- fast iteration on scoped refactors
What I do not use AI for
- owning architecture tradeoffs
- guaranteeing end-to-end correctness
- solving previously unseen contract mismatches on its own
AI can enumerate options fast. It can shorten feedback loops.
Humans still verify, decide, and own the consequences.
Final reflection
Before this work, I already trusted OCI's strengths.
This project proved the difference is not AI speed alone, but:
- clear method
- workload-driven testing
- explicit contracts
- disciplined ownership of correctness
The lesson is simple:
AI can compress mechanical work.
Humans still own the hard parts.