Programming Agents
To develop an agent using AgentOS, the most important concepts to understand are Agents, Policies, and Environments (or Envs).
An Agent is an entity that can take actions over time. It must have an environment. It can also have one or more Policies that it uses to make decisions.
Environments
Environments can be either simulators (e.g., CartPole) or connectors to the real world (e.g., an environment for a chatbot that passes messages back and forth between the agent and other agents, humans, etc.).
AgentOS does not define its own Environment API; instead, we reuse gym.Env. Environments must:
- Descend from gym.Env.
- Define a function step(action) -> observation, reward, done, context that takes an action and returns an observation, reward, etc.
- Define action and observation spaces.
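The requirements above can be illustrated with a minimal sketch of a toy environment. To stay dependency-free, the sketch does not import gym: the class CountdownEnv and its attributes are hypothetical, and the spaces are plain Python stand-ins. A real AgentOS environment would descend from gym.Env and declare its spaces with gym.spaces.

```python
class CountdownEnv:
    """Toy environment (hypothetical): count down from `start` to zero.

    Mirrors the classic gym.Env interface (reset/step) without importing
    gym. In real code this class would descend from gym.Env.
    """

    # Stand-in for a gym action space: 0 = wait, 1 = decrement.
    action_space = (0, 1)

    def __init__(self, start=3):
        self.start = start
        # Stand-in for a gym observation space: integers 0..start.
        self.observation_space = range(start + 1)
        self.remaining = start

    def reset(self):
        """Begin a new episode and return the initial observation."""
        self.remaining = self.start
        return self.remaining

    def step(self, action):
        """Apply `action` and return (observation, reward, done, context)."""
        if action not in self.action_space:
            raise ValueError(f"invalid action: {action!r}")
        self.remaining -= action
        done = self.remaining <= 0
        reward = 1.0 if done else 0.0
        return self.remaining, reward, done, {}


env = CountdownEnv(start=2)
obs = env.reset()
obs, reward, done, context = env.step(1)
```

The reward scheme here (1.0 on the final step, 0.0 otherwise) is arbitrary and chosen only to make the return values easy to follow.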
Policies
In AgentOS, a Policy is a function that takes an observation as input and returns an action. Policies must:
- Descend from agentos.Policy.
- Define the compute_action(observation) -> action function.
- Define action and observation spaces.
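As a rough illustration of the policy requirements above, the sketch below defines a simple hand-written policy for CartPole-style observations. The Policy base class here is a hypothetical stand-in for the AgentOS base class so that the example runs on its own, and LeanFollowingPolicy, its attributes, and the plain-tuple action space are assumptions made for illustration.

```python
class Policy:
    """Hypothetical stand-in for the AgentOS policy base class."""

    def compute_action(self, observation):
        raise NotImplementedError


class LeanFollowingPolicy(Policy):
    """Push the cart toward the side the pole is leaning.

    Assumes CartPole-style observations
    (cart position, cart velocity, pole angle, pole angular velocity)
    and actions 0 = push left, 1 = push right.
    """

    # Plain-Python stand-ins for the action and observation spaces.
    action_space = (0, 1)
    observation_size = 4

    def compute_action(self, observation):
        pole_angle = observation[2]
        return 1 if pole_angle > 0 else 0


policy = LeanFollowingPolicy()
action = policy.compute_action([0.0, 0.0, 0.05, 0.0])  # pole leans right
```

A learned policy would expose the same compute_action() entry point, with the decision logic replaced by, for example, a trained model.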
Agents: putting it all together
The architecture and API that AgentOS provides for Agents are deliberately
minimal, because different agents should be able to perform very different
types of tasks. Agents are nonetheless expected to be highly sophisticated,
so most of an agent's complexity will live outside the core AgentOS
abstraction (e.g., the agentos.Agent class).
To be compatible with AgentOS, an agent class must:
- Descend from agentos.Agent.
- Take an environment class as the first argument to its __init__() function.
- Define an instance function called advance() that returns a boolean.
That’s it. It is up to each agent developer how they want to structure the
internals of their agent, but from AgentOS’s perspective, the only way that an
agent can do anything is via its advance() function.
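The agent contract described above can be sketched as follows. The Agent base class below is a hypothetical stand-in for agentos.Agent so that the example is self-contained, and TwoStepEnv and ConstantAgent are invented for illustration. The text does not say what the boolean returned by advance() means; this sketch assumes that True signals the agent is done.

```python
class Agent:
    """Hypothetical stand-in for agentos.Agent."""

    def __init__(self, env_class):
        # The environment class is the first argument; instantiate it here.
        self.env = env_class()
        self.obs = self.env.reset()

    def advance(self):
        raise NotImplementedError


class TwoStepEnv:
    """Toy environment with a gym-style reset/step interface."""

    def reset(self):
        self.steps = 0
        return self.steps

    def step(self, action):
        self.steps += 1
        done = self.steps >= 2
        return self.steps, 1.0, done, {}


class ConstantAgent(Agent):
    """Always takes action 0 and tracks the reward it has collected."""

    def __init__(self, env_class):
        super().__init__(env_class)
        self.total_reward = 0.0

    def advance(self):
        # Assumption: returning True means the agent has finished.
        self.obs, reward, done, _ = self.env.step(0)
        self.total_reward += reward
        return done


agent = ConstantAgent(TwoStepEnv)
while not agent.advance():
    pass
```

Driving the agent is then just a matter of calling advance() repeatedly, which is the only entry point the runtime needs to know about.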
Guidelines for structuring advance()
We recommend that Agents keep the advance() function as clean and minimal as possible, with code living in other functions that are called within advance() or, even better, in other modules. Agents are intended to be minimal and easy to read, and to be used mostly to import and compose functionality contained in “agent libraries” (see Architecture and Design).
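One way to follow this guideline is sketched below: advance() only wires together helper functions that hold the actual logic. All names here (choose_action, record_step, CounterEnv, ComposedAgent) are hypothetical, the helpers are inlined only so the example runs on its own, and the sketch assumes that advance() returns True when the agent is done.

```python
# Helpers like these would normally live in a separate module
# (an "agent library"); they are inlined here for self-containment.
def choose_action(observation):
    """Pick 1 while the observed counter is positive, otherwise 0."""
    return 1 if observation > 0 else 0


def record_step(history, action, reward):
    """Append one (action, reward) pair to the agent's history."""
    history.append((action, reward))


class CounterEnv:
    """Toy environment with a gym-style reset/step interface."""

    def reset(self):
        self.counter = 2
        return self.counter

    def step(self, action):
        self.counter -= action
        done = self.counter <= 0
        return self.counter, float(done), done, {}


class ComposedAgent:
    """In real code this class would descend from agentos.Agent."""

    def __init__(self, env_class):
        self.env = env_class()
        self.obs = self.env.reset()
        self.history = []

    def advance(self):
        # advance() contains no logic of its own beyond composition.
        action = choose_action(self.obs)
        self.obs, reward, done, _ = self.env.step(action)
        record_step(self.history, action, reward)
        return done
```

Because the decision logic and bookkeeping sit in standalone functions, they can be tested or swapped out without touching the agent class.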
Background on agent design
This design is inspired by operating systems in which the core kernel code is kept minimal and most functionality is implemented in libraries (e.g., microkernels and exokernels).
Rollouts
A rollout, also called an episode, is a concept that comes from Reinforcement Learning. Conceptually, you can think of a rollout as a simulation of an agent advance()-ing through time in order to learn.
Technically, a rollout is a process involving an instance of a Policy and an instance of an Env that proceeds as demonstrated by the following pseudocode:
    def rollout(Env_class, Policy_class):
        """Pseudocode implementation of simplified rollout function.
        See agentos/core.py for the actual implementation."""
        env = get new instance of Env
        obs = initial observation from env
        policy = initialize a new Policy
        trajectory = []
        done = False
        until done:
            action = policy.compute_action(obs)
            obs, reward, done, _ = env.step(action)
            trajectory += [action, obs, reward]
        return trajectory
As you can see, performing a rollout generates a trajectory, which you can
think of as a simulation of how an agent might advance through the given
environment, and what rewards it might receive along the way, if it were
to use the given policy.
Different types of agents and algorithms might use rollouts for different purposes, but rollouts always consist of the same basic structure.
Since rollouts are used frequently and have a standard structure, AgentOS
includes the agentos.core.rollout() utility function. Note that the
pseudocode above is a simplified version of agentos.core.rollout().
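For readers who want to run the rollout loop, here is a minimal runnable sketch of the same structure. It is not the implementation in agentos/core.py: the function name simple_rollout, its signature, and the toy CountdownEnv and AlwaysDecrementPolicy classes are assumptions made for illustration. It also assumes that both classes can be constructed without arguments and that the environment follows the gym-style reset()/step() convention. Each step is recorded as an (action, observation, reward) tuple.

```python
def simple_rollout(env_class, policy_class):
    """Run one episode and return a list of (action, obs, reward) tuples."""
    env = env_class()
    obs = env.reset()
    policy = policy_class()
    trajectory = []
    done = False
    while not done:
        action = policy.compute_action(obs)
        obs, reward, done, _ = env.step(action)
        trajectory.append((action, obs, reward))
    return trajectory


class CountdownEnv:
    """Toy environment: the episode ends after three steps."""

    def reset(self):
        self.remaining = 3
        return self.remaining

    def step(self, action):
        self.remaining -= 1
        done = self.remaining == 0
        reward = 1.0 if done else 0.0
        return self.remaining, reward, done, {}


class AlwaysDecrementPolicy:
    """Toy policy: returns action 1 regardless of the observation."""

    def compute_action(self, observation):
        return 1


trajectory = simple_rollout(CountdownEnv, AlwaysDecrementPolicy)
print(trajectory)  # [(1, 2, 0.0), (1, 1, 0.0), (1, 0, 1.0)]
```

Storing one tuple per step keeps each action paired with the observation and reward it produced, which is usually easier to consume than a flat list.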