Evaluation
The evaluation module records what happened during an experiment and measures the outcome. It provides three complementary tools: trajectory logging, custom metrics, and rule-based task evaluation.
Trajectory
A Trajectory is an ordered list of TrajectoryStep objects — the
complete record of an experiment run. Each step captures a single agent
turn: what the agent observed, the raw message it produced, the parsed
action, and any auxiliary metadata. Trajectories are the primary input
to both risk detectors and metrics.
from risklab.evaluation.trajectory import TrajectoryStep
# Each TrajectoryStep records:
step.round # int — interaction round number
step.speaker # str — which agent acted
step.observation # Any — what the agent observed
step.message # Any — the raw LLM output
step.action # Any — the parsed action
step.local_utility # float | None — per-agent reward
step.system_state # dict — snapshot of the global state
step.metadata # dict — additional key-value data
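Once recorded, steps can be consumed directly in plain Python. The sketch below totals per-agent reward from a list of steps; it uses SimpleNamespace as a stand-in for TrajectoryStep so it runs standalone, with attribute names taken from the listing above.

```python
from types import SimpleNamespace

# SimpleNamespace stands in for TrajectoryStep so the sketch is self-contained;
# the attribute names match the listing above.
steps = [
    SimpleNamespace(speaker="seller_0", local_utility=1.0),
    SimpleNamespace(speaker="seller_1", local_utility=0.5),
    SimpleNamespace(speaker="seller_0", local_utility=0.25),
    SimpleNamespace(speaker="seller_1", local_utility=None),  # no reward this turn
]

# Sum local_utility per speaker, skipping turns without a reward.
totals = {}
for step in steps:
    if step.local_utility is not None:
        totals[step.speaker] = totals.get(step.speaker, 0.0) + step.local_utility

# totals == {"seller_0": 1.25, "seller_1": 0.5}
```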
Trajectory Logger
TrajectoryLogger builds a Trajectory in memory and can flush it
to disk as JSON. It is typically used inside ExperimentRunner but
can also be used standalone for custom experiment loops.
from risklab.evaluation.logger import TrajectoryLogger
logger = TrajectoryLogger(experiment_id="exp_001", output_dir="results/")
logger.log_step(
    round=0,
    speaker="seller_0",
    observation={"prices": [50, 55]},
    action={"price": 48},
)
# Persist to JSON
path = logger.save("exp_001_seed0.json")
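The saved file can be reloaded later for offline analysis with the standard library. The on-disk schema below (a JSON list of step dicts) is an assumption for illustration, not the library's documented format; the sketch writes its own sample file so it runs standalone.

```python
import json
import os
import tempfile

# Write a sample file so the sketch is self-contained; the exact schema that
# logger.save() produces is an assumption here (a JSON list of step dicts).
sample = [
    {"round": 0, "speaker": "seller_0", "action": {"price": 48}},
    {"round": 1, "speaker": "seller_1", "action": {"price": 47}},
]
path = os.path.join(tempfile.mkdtemp(), "exp_001_seed0.json")
with open(path, "w") as f:
    json.dump(sample, f)

# Offline analysis: reload the steps and extract the posted prices.
with open(path) as f:
    steps = json.load(f)
prices = [s["action"]["price"] for s in steps]
# prices == [48, 47]
```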
Metrics
Metrics quantify trajectory properties that are not necessarily risks — for example, price convergence speed or message diversity. The framework defines three metric families:
Outcome — task success, efficiency
Interaction — agreement rate, entropy collapse, repetition
Risk indicator — collusion score, drift distance
Subclass Metric to implement your own, then group them into a
MetricSuite for batch evaluation:
from risklab.evaluation.metrics import (
    Metric, MetricResult, MetricType, MetricSuite,
)

class PriceConvergence(Metric):
    def __init__(self):
        super().__init__("price_convergence", MetricType.OUTCOME)

    def compute(self, trajectory) -> MetricResult:
        # Calculate convergence ...
        return MetricResult(
            name=self.name,
            metric_type=self.metric_type,
            value=0.85,
        )
suite = MetricSuite()
suite.add(PriceConvergence())
results = suite.evaluate(trajectory) # list[MetricResult]
flat = suite.evaluate_as_dict(trajectory) # {"price_convergence": 0.85}
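For the interaction family, the core computation is often a simple statistic over the messages. The sketch below shows one plausible repetition measure — the fraction of messages that exactly repeat an earlier one — as a standalone function; the name and definition are illustrative, not the framework's built-in metric.

```python
from collections import Counter

def repetition_rate(messages):
    """Fraction of messages that are exact repeats of an earlier message."""
    counts = Counter(messages)
    repeats = sum(c - 1 for c in counts.values())
    return repeats / len(messages) if messages else 0.0

repetition_rate(["hi", "offer 50", "offer 50", "deal"])  # 0.25
```

Inside a Metric subclass, the same function would run over the `step.message` values of the trajectory and its result would be returned as the MetricResult value.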
Task Evaluator
RuleBasedTaskEvaluator checks whether agents achieved the task
objective defined in TaskConfig.success_criteria. Supported criteria
types include task_completed, round_budget, output_match, and
numeric_threshold.
from risklab.evaluation.task_evaluator import RuleBasedTaskEvaluator
evaluator = RuleBasedTaskEvaluator()
result = evaluator.evaluate(task_config, trajectory)
# result.success → bool
# result.score → float in [0, 1]
# result.details → dict with per-criterion breakdown
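As a rough illustration of how a numeric_threshold criterion might be checked, the sketch below compares an observed value against a threshold in a given direction. The function name, parameters, and direction keywords are hypothetical, not the library's actual criterion schema.

```python
def check_numeric_threshold(value, threshold, direction="gte"):
    # Pass when the observed value clears the threshold in the given direction.
    # "gte"/"lte" direction keywords are illustrative, not the library's schema.
    return value >= threshold if direction == "gte" else value <= threshold

check_numeric_threshold(0.85, 0.8)                 # True
check_numeric_threshold(12, 10, direction="lte")   # False
```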