Agentify Cloud: When the Agent Becomes the Cloud Runtime

For many years, software architecture has been built around one simple assumption. A request comes in. A handler receives it. A deterministic function processes it. A response goes out.

This is the foundation of most cloud services. REST APIs, RPC, FastAPI routes, serverless functions, microservices, controllers, route handlers: they are all variations of the same idea. The developer defines the logic first. The runtime only executes it.

But the agent-native era is changing this assumption.

TL;DR

An LLM is a stateless prediction runtime. An agent is a stateful runtime that builds suitable context for the LLM through tools, memory, and execution.
The Codex MCP server is an important signal: a coding agent can become a standalone, callable MCP service.
Agentify Cloud asks the next question: if Codex can be an agent backend for engineering, why can't a cloud endpoint be backed by an agent runtime?
The right model is not an agent without rules. It is deterministic gating in front of the agent server, and runtime gating inside skill files.
In this model, traditional cloud service logic moves from fixed code handlers into skills, tools, markdown policy, and service contracts.

Part 1: Software architecture shift

The real question is no longer whether an API can call an LLM. That is already the old style.

The deeper question is: what is the runtime now?

Is the agent only a responsive HTTP API that receives a query and returns an answer? Or is the agent a standalone application that can run with its own logic, its own memory, its own tools, and its own decision loop?

In the first wave of LLM applications, many people treated the model as a smarter endpoint.

HTTP request -> fixed handler -> LLM call -> response

This is useful, but it is not yet agent-native architecture. It is still old software architecture with an LLM attached to it.

A real agent is different. An agent is not just a response generator. An agent is a runtime that can observe the request, decide what context is missing, call tools, inspect state, recover from bad input, and then return something useful.

HTTP request -> agent runtime -> tool/context/action loop -> suitable response

This looks small, but architecturally it is a big change.

In traditional software, the developer must predefine the logic path. The code must know what endpoint is called, what schema is expected, what error may happen, and what return format should be produced.

In agent-native software, the developer defines the intention, tools, boundary, and skills. The agent decides how to make the request useful.

This is the agent as the runtime. The old runtime is a deterministic process that executes predefined logic. The agent runtime turns predefined logic into redefined intent: the path is worked out at request time instead of being fixed in code.

Part 2: LLM versus agent

An LLM and an agent are not the same thing.

An LLM is a stateless runtime. It predicts the next result based on the input context. If the context is good, the result may be good. If the context is missing, wrong, or too narrow, the result will be limited.

The LLM itself does not naturally know where to find missing context. It does not naturally decide which file to read, which API to call, which database to query, which command to run, or which previous state matters.

That is why the agent matters.

An agent is a stateful runtime around the LLM. It does not replace the LLM. It makes the LLM operational.

LLM: input context -> prediction -> output
Agent: input intent -> state/tool/context loop -> LLM reasoning -> action -> suitable output

The agent's job is to make suitable context for the LLM. It can use tools, read files, query memory, call APIs, inspect errors, retry, transform messy input into a better internal task, and turn vague human intent into executable steps.

The LLM is the reasoning engine. The agent is the runtime that prepares the world for reasoning and applies reasoning back to the world.
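The split between a stateless LLM and a stateful agent can be sketched in a few lines. This is only an illustration, not how Codex or Agentify Cloud is implemented: `stub_llm`, the `CALL`/`ANSWER:` reply convention, and the `read_file` tool are all hypothetical stand-ins. The point is that the model function keeps no state, while the agent keeps the memory, runs the tool, and feeds the result back as context.

```python
from dataclasses import dataclass, field
from typing import Callable


def stub_llm(context: str) -> str:
    """Stateless stand-in for a model call: the reply depends only on the context."""
    if "README contents:" in context:
        return "ANSWER: the project is described in the README"
    # Without the file in context, the "model" asks for a tool call (made-up protocol).
    return "CALL read_file README.md"


def read_file(path: str) -> str:
    # Hypothetical tool; a real one would read from disk.
    return f"README contents: demo project ({path})"


@dataclass
class Agent:
    llm: Callable[[str], str]
    tools: dict[str, Callable[[str], str]]
    memory: list[str] = field(default_factory=list)  # state lives here, not in the LLM

    def run(self, intent: str, max_steps: int = 5) -> str:
        self.memory.append(f"intent: {intent}")
        for _ in range(max_steps):
            reply = self.llm("\n".join(self.memory))
            if reply.startswith("CALL "):
                _, tool_name, arg = reply.split(" ", 2)
                # Run the tool and add its result to the context for the next step.
                self.memory.append(self.tools[tool_name](arg))
                continue
            return reply.removeprefix("ANSWER: ")
        return "gave up: step budget exhausted"


agent = Agent(llm=stub_llm, tools={"read_file": read_file})
answer = agent.run("what is this project?")
```

Calling `stub_llm` directly with the bare intent only ever yields the tool request. The useful answer appears once the agent loop has gathered the missing context.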

Part 3: Running the agent as a standalone service

Once we understand this, the next step is natural.

Why should the agent only live inside a chat window? Why should the agent only be called as a function inside an existing application? Why not run the agent itself as a standalone service?

Codex is a very good signal. Codex started as a coding agent, but the important architectural point is not only that it can write code. The important point is that Codex can run as a standalone MCP server.

That means another agent, another client, or another workflow can call Codex as a capability. Codex is no longer only a human-facing CLI. It becomes an agent backend.

Parent agent / MCP client -> Codex MCP server -> Codex agent runtime -> files / shell / repo / tests / patches

This is important because coding is becoming a callable agent service.

Old style: Call model -> get code
Agent service style: Call agent -> inspect repo, edit files, run commands, test, and return progress

A deterministic API server exposes functions. An agent server exposes capability. That is the difference.

Part 4: Agent is more than an API server

But here is the part we should be careful about.

If we run an agent as a server, we should not immediately constrain it back into old API thinking. The old habit is to define strict JSON input and strict JSON output, then force the agent to behave like a deterministic function.

Of course, schema is necessary. Production systems need contracts. Security needs boundaries. Services need stable output.

But if we only treat the agent as a JSON function, we lose the most important part of the agent. The agent is useful because it can understand imperfect input.

It can decide whether the input is good or bad. It can repair missing structure. It can ask tools for missing context. It can route vague intent to the right action. It can turn error into recovery. It can make the return suitable for the service, even when the original input is not suitable.

This is the core idea behind Agentify Cloud.

Agentify Cloud is an experiment in agent-handled cloud services. It accepts FastAPI-style input. It accepts MCP-style input. It lets the agent decide what the input means. It lets the agent determine whether the request is complete, broken, ambiguous, or executable. Then it produces a response suitable for the service.

Old cloud: If input matches schema, run handler. If not, return error.
Agentify Cloud: Receive input. Understand intent. Repair or enrich context if possible. Select skill or tool. Execute through agent runtime. Return a useful service response.

In the old model, every route needs a handler. In Agentify Cloud, the agent becomes the universal handler.

Unknown endpoint? Let the agent understand it.
Bad payload? Let the agent repair it or return a meaningful explanation.
Missing context? Let the agent search memory or call tools.
New service behavior? Add a skill, not necessarily new runtime code.

This is why Agentify Cloud is not simply an LLM behind FastAPI. It is a different cloud architecture. The endpoint is no longer only a deterministic function. The endpoint becomes an intent boundary. The backend becomes an agent runtime.
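To make the "universal handler" idea concrete, here is a rough sketch under heavy assumptions. It is not the Agentify Cloud implementation: `interpret` stands in for the agent's understanding step (a real runtime would use the LLM there), and the `SKILLS` table, `ServiceResponse`, and the key=value repair rule are hypothetical. It shows one entry point that accepts any path, repairs a payload that is not valid JSON, and explains itself when no skill fits.

```python
import json
from dataclasses import dataclass
from typing import Callable


@dataclass
class ServiceResponse:
    status: int
    body: dict


# Hypothetical skills; each takes a payload dict and returns a result dict.
SKILLS: dict[str, Callable[[dict], dict]] = {
    "memory": lambda payload: {"stored_keys": sorted(payload)},
    "echo": lambda payload: {"echo": payload},
}


def interpret(path: str, raw_body: str) -> dict:
    """Stand-in for the agent deciding what the request means."""
    repaired = False
    try:
        payload = json.loads(raw_body) if raw_body else {}
    except json.JSONDecodeError:
        # Repair attempt: read "a=1&b=2" style text as a flat payload.
        payload = dict(p.split("=", 1) for p in raw_body.split("&") if "=" in p)
        repaired = True
    if not isinstance(payload, dict):
        payload = {"value": payload}
    segments = [s for s in path.split("/") if s]
    return {"skill": segments[0] if segments else "", "payload": payload, "repaired": repaired}


def universal_handler(path: str, raw_body: str = "") -> ServiceResponse:
    """One handler for every route: understand, repair, select a skill, respond."""
    intent = interpret(path, raw_body)
    skill = SKILLS.get(intent["skill"])
    if skill is None:
        # A meaningful explanation instead of a bare error.
        return ServiceResponse(404, {"error": f"no skill for '{intent['skill']}'",
                                     "known_skills": sorted(SKILLS)})
    return ServiceResponse(200, {"result": skill(intent["payload"]),
                                 "repaired": intent["repaired"]})
```

With this sketch, `universal_handler("/memory/save", "note=hi&tag=demo")` still succeeds and reports that the payload was repaired, while `universal_handler("/billing")` returns an explanation that lists the known skills.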

Part 5: Deterministic gating and runtime gating

One important clarification: agent-native cloud does not mean removing control. Actually, it needs stronger control.

The difference is where the control lives.

In Agentify Cloud, I separate gating into two layers.

The first layer is deterministic gating before the agent server. This is the traditional infrastructure boundary. Authentication, authorization, rate limiting, tenant isolation, request size limits, network policy, endpoint exposure, and basic schema validation should all happen before the request reaches the agent.

This layer should not be agentic. It should be deterministic.

Request
-> deterministic gateway
     auth
     permission
     rate limit
     tenant boundary
     payload size
     basic validation
-> agent runtime

This is the hard boundary of the service.
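A deterministic gateway of this kind needs no model at all. The sketch below is an assumption-laden illustration rather than Agentify Cloud's actual gateway: the `API_KEYS` mapping, the 4096-byte limit, and the sliding-window `RateLimiter` are hypothetical choices. It covers three of the checks listed above (authentication with tenant lookup, payload size, rate limit) and returns an HTTP-style status for each outcome.

```python
import time
from dataclasses import dataclass, field
from typing import Optional

MAX_BODY_BYTES = 4096                 # hypothetical request size limit
API_KEYS = {"key-alpha": "tenant-a"}  # hypothetical API key -> tenant mapping


@dataclass
class RateLimiter:
    """Sliding-window limiter: at most `limit` requests per tenant per window."""
    limit: int
    window_seconds: float
    hits: dict[str, list[float]] = field(default_factory=dict)

    def allow(self, tenant: str, now: float) -> bool:
        # Keep only the timestamps that are still inside the window.
        recent = [t for t in self.hits.get(tenant, []) if now - t < self.window_seconds]
        allowed = len(recent) < self.limit
        if allowed:
            recent.append(now)
        self.hits[tenant] = recent
        return allowed


def gateway(api_key: str, body: bytes, limiter: RateLimiter,
            now: Optional[float] = None) -> tuple[int, str]:
    """Return (status, detail). Only a 200 may be forwarded to the agent runtime."""
    now = time.monotonic() if now is None else now
    tenant = API_KEYS.get(api_key)
    if tenant is None:
        return 401, "unknown API key"
    if len(body) > MAX_BODY_BYTES:
        return 413, "payload too large"
    if not limiter.allow(tenant, now):
        return 429, "rate limit exceeded"
    return 200, tenant
```

Every branch here is a plain comparison, so the same request always gets the same decision. That is the property this layer should keep, whatever the agent behind it does.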

The second layer is runtime gating inside the agent. This is where Agentify Cloud becomes different.

The agent may have many skills: memory sharing, database adaptor, code execution, file operation, MCP tools, workflow automation, service diagnosis, and so on. Each skill can define its own runtime gating in markdown files.

A memory-sharing skill may say: only share memory from this project, do not expose private system instructions, summarize instead of returning raw files, and ask for confirmation before overwriting persistent memory.

A database adaptor skill may say: read-only by default, no destructive SQL unless explicitly approved, always explain the generated query before execution, reject broad table dumps, and limit result size.

Agent runtime -> skill selection -> runtime gating from markdown skill file -> tool execution / memory access / database access -> service response

The gateway controls who can enter. The skill file controls what the agent is allowed to do after it enters.

The deterministic gateway protects the service boundary. Runtime gating protects the capability boundary. The agent runtime connects intent to capability.
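How a skill file's gating is enforced is left open here. One possible approach, sketched below purely as an assumption, is to keep a few machine-readable lines in the markdown and check them in code before a tool runs, next to whatever the agent reads as instruction. The `SKILL_MD` layout, the `mode`/`max_rows`/`reject` keys, and the keyword check are all hypothetical, and the SQL check is deliberately naive.

```python
import re

# Hypothetical skill file for a database adaptor; the key names are invented.
SKILL_MD = """\
# Skill: database adaptor
## Gating
- mode: read-only
- max_rows: 100
- reject: DROP, DELETE, TRUNCATE, UPDATE, INSERT, ALTER
"""


def parse_gating(markdown: str) -> dict[str, str]:
    """Collect '- key: value' bullet lines from the skill file."""
    rules = {}
    for line in markdown.splitlines():
        match = re.match(r"^-\s*(\w+):\s*(.+)$", line.strip())
        if match:
            rules[match.group(1)] = match.group(2).strip()
    return rules


def check_query(sql: str, rules: dict[str, str]) -> tuple[bool, str]:
    """Return (allowed, detail); detail is the query to run or the refusal reason."""
    blocked = {w.strip().upper() for w in rules.get("reject", "").split(",") if w.strip()}
    # Naive keyword scan: it would also flag a column that happens to be named "update".
    words = set(re.findall(r"[A-Za-z_]+", sql.upper()))
    hit = sorted(blocked & words)
    if hit:
        return False, f"blocked keyword: {hit[0]}"
    if rules.get("mode") == "read-only" and not sql.lstrip().upper().startswith("SELECT"):
        return False, "only SELECT is allowed in read-only mode"
    if " LIMIT " not in f" {sql.upper()} ":
        # Bound the result size when the caller did not.
        sql = f"{sql.rstrip(';')} LIMIT {rules.get('max_rows', '100')}"
    return True, sql


rules = parse_gating(SKILL_MD)
```

In this sketch a query such as `SELECT name FROM users` comes back with a `LIMIT 100` appended, while `DROP TABLE users` is refused with the reason attached, so the agent can return a meaningful explanation to the caller.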

Part 6: Cloud service becomes a skill

There is one more important idea.

If an agent is running as a standalone service, then traditional cloud services can become skills inside the agent.

Suppose I want this standalone agent to become a memory-sharing service. In the old way, I need to design a memory API, implement storage logic, define endpoints, handle schema, deploy, and maintain the service.

In Agentify Cloud, I can simply write a markdown file to describe the memory behavior, including its runtime gating. The agent reads it as instruction and skill. Now the service has memory behavior.

Suppose I want the agent to work as a database adaptor. In the old way, I write a database service, implement query translation, define data access layers, expose endpoints, and maintain a new backend.

In Agentify Cloud, I give the agent database skills. Now database access becomes part of the agent runtime.

Old cloud: new capability -> new code -> new endpoint -> new deployment
Agentify Cloud: new capability -> new skill/tool/context/gating -> same agent runtime
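The "same runtime, new skill" flow can be illustrated with a toy registry. Everything in it is hypothetical (`Skill`, `load_skill`, `AgentRuntime`, and the convention that a skill file is a heading, a description line, and bullet rules). It does not describe how Agentify Cloud loads skills. It only shows that adding a capability can mean adding a markdown document while the runtime code stays unchanged.

```python
from dataclasses import dataclass, field


@dataclass
class Skill:
    name: str
    description: str
    rules: list[str]


def load_skill(markdown: str) -> Skill:
    """Parse a tiny made-up skill format: '# name', free text, then '- rule' lines."""
    lines = [ln.strip() for ln in markdown.splitlines() if ln.strip()]
    name = lines[0].lstrip("# ").strip()
    rules = [ln[2:].strip() for ln in lines[1:] if ln.startswith("- ")]
    description = " ".join(ln for ln in lines[1:] if not ln.startswith("- "))
    return Skill(name, description, rules)


@dataclass
class AgentRuntime:
    skills: dict[str, Skill] = field(default_factory=dict)

    def add_skill(self, markdown: str) -> None:
        skill = load_skill(markdown)
        self.skills[skill.name] = skill

    def system_prompt(self) -> str:
        """Context the runtime would hand to the LLM, rebuilt from the current skills."""
        parts = []
        for skill in self.skills.values():
            parts.append(f"[{skill.name}] {skill.description}")
            parts.extend(f"  rule: {rule}" for rule in skill.rules)
        return "\n".join(parts)


runtime = AgentRuntime()
runtime.add_skill("# memory-sharing\nShare project memory with callers.\n"
                  "- only share memory from this project\n"
                  "- summarize instead of returning raw files\n")
# Later: a new capability arrives as one more markdown document, not a new deployment.
runtime.add_skill("# database-adaptor\nAnswer questions from the project database.\n"
                  "- read-only by default\n")
```

After the second `add_skill` call, `runtime.system_prompt()` contains both skills and their rules, and no new endpoint or handler was written.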

This is plug-and-play cloud architecture. Not plug-and-play in the old UI extension sense, but plug-and-play at the runtime capability level.

The cloud will not disappear. APIs will not disappear. Schemas will not disappear. Gates will not disappear. But their role will change.

The API becomes the boundary. The schema becomes the contract. The gateway becomes the deterministic guard. The tool becomes the execution surface. The skill becomes the service logic. The markdown file becomes the runtime capability description. The agent becomes the runtime.

Try Agentify Cloud

Agentify Cloud is a small project, but it points to a larger architecture shift.

uv pip install agentify-cloud

Project page: Agentify Cloud. PyPI package: agentify-cloud.

Create a simple AGENTS.md gate:

# Agent Gate

- /dashboard/* returns complete HTML for GET.
- POST and MCP calls return structured JSON.
- Never expose secrets or local environment variables.
- Read-only by default unless the user explicitly asks for write behavior.
- Keep service responses bounded and suitable for the caller.

Then start the runtime:

agentify server --port 8000

A normal HTTP route becomes intent. A normal MCP call becomes governed action. The client still sees ordinary HTML or JSON. But behind the boundary, the agent runtime produces the service behavior.

Closing thought

The Codex MCP server shows one important direction: an agent can become a standalone callable service.

But coding is only one domain.

Agentify Cloud asks the next question: if Codex can become an agent backend for software engineering, why can't a cloud endpoint become an agent backend for general services?

This is the architecture shift I am exploring.

From API as handler, to agent as runtime.
From fixed logic, to redefined intent.
From cloud service as code, to cloud service as skill.
From static validation only, to deterministic gating plus runtime gating.

This may change not only how we build software, but also what we mean by software.