# Selection Algorithms
AgentOpt provides 8 selection algorithms. Choose based on your search space size and evaluation budget.
## At a Glance
| Algorithm | Strategy | Evaluations | Best For |
|---|---|---|---|
| Brute Force | Exhaustive | All | Small spaces (< 50 combos) |
| Random Search | Sampling | Configurable fraction | Quick baselines |
| Hill Climbing | Greedy + restarts | Guided neighbors | Medium spaces |
| Arm Elimination | Progressive pruning | Adaptive | Statistical early stopping |
| Epsilon LUCB | ε-optimal LUCB | Adaptive | Cost savings when ε-optimal is enough |
| Threshold SE | Threshold classification | Adaptive | Filtering above/below a performance target |
| LM Proposal | LLM-guided | Shortlist | Leveraging model knowledge |
| Bayesian Optimization | GP surrogate | Sequential | Expensive evaluations |
**Common interface:** All selectors share the same constructor and `select_best()` method. Switching algorithms is a one-line change.
## Brute Force
Evaluates every combination in the Cartesian product.
```python
from agentopt import BruteForceModelSelector

selector = BruteForceModelSelector(
    agent=MyAgent,
    models=models,
    eval_fn=eval_fn,
    dataset=dataset,
)
```
**When to use:** Small search spaces where you can afford to evaluate everything. Guarantees finding the true optimum.
**Complexity:** Evaluations grow as the product of the per-node model list sizes: 5 candidate models for each of 3 nodes gives 5³ = 125 combinations.
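The combination count is easy to check up front. A minimal sketch using the standard library (the node names and model lists here are hypothetical, invented for illustration):

```python
from itertools import product
from math import prod

# Hypothetical setup: 3 agent nodes, each with the same 5 candidate models.
models = {
    "planner":  ["m1", "m2", "m3", "m4", "m5"],
    "executor": ["m1", "m2", "m3", "m4", "m5"],
    "critic":   ["m1", "m2", "m3", "m4", "m5"],
}

# The Cartesian product enumerates every per-node model assignment.
combos = list(product(*models.values()))
assert len(combos) == prod(len(v) for v in models.values())  # 5 * 5 * 5 = 125
```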
## Random Search
Samples a random fraction of all combinations.
```python
from agentopt import RandomSearchModelSelector

selector = RandomSearchModelSelector(
    agent=MyAgent,
    models=models,
    eval_fn=eval_fn,
    dataset=dataset,
    sample_fraction=0.25,  # evaluate 25% of combinations
    seed=42,
)
```
| Parameter | Default | Description |
|---|---|---|
| sample_fraction | 0.25 | Fraction of combinations to evaluate |
| seed | None | Random seed for reproducibility |
**When to use:** Quick exploration to establish a baseline before committing to a thorough search.
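The sampling idea can be sketched in a few lines; the candidate lists below are hypothetical, and the selector performs this step internally:

```python
import random
from itertools import product

# Hypothetical search space: 3 nodes with 2 candidate models each -> 8 combos.
candidates = list(product(["a", "b"], ["x", "y"], ["p", "q"]))

sample_fraction = 0.25
rng = random.Random(42)  # fixed seed for reproducibility
k = max(1, round(sample_fraction * len(candidates)))
sampled = rng.sample(candidates, k)  # evaluate only these combinations
```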
## Hill Climbing
Greedy local search with random restarts. Defines "neighbors" using model quality and speed rankings, so each step is an informed single-model swap.
```python
from agentopt import HillClimbingModelSelector

selector = HillClimbingModelSelector(
    agent=MyAgent,
    models=models,
    eval_fn=eval_fn,
    dataset=dataset,
    max_iterations=20,
    num_restarts=3,
    patience=3,
)
```
| Parameter | Default | Description |
|---|---|---|
| max_iterations | 20 | Max steps per restart |
| num_restarts | 3 | Number of random restarts |
| patience | 3 | Steps without improvement before restart |
**When to use:** Medium-sized spaces where you want to exploit model topology: cheaper models are neighbors of expensive ones.
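The "informed single-model swap" neighbor notion can be pictured as follows. This is an illustrative sketch, not the library's internals; the ranking and node names are invented:

```python
# Models ordered by a hypothetical quality ranking; a neighbor of a
# configuration swaps exactly one node's model for an adjacent model
# in that ranking.
ranking = ["small", "medium", "large"]

def neighbors(config):
    """All configs reachable by one adjacent-model swap on one node."""
    result = []
    for node, model in config.items():
        i = ranking.index(model)
        for j in (i - 1, i + 1):
            if 0 <= j < len(ranking):
                nxt = dict(config)
                nxt[node] = ranking[j]
                result.append(nxt)
    return result

start = {"planner": "medium", "executor": "small"}
# planner can step to small or large, executor to medium -> 3 neighbors
ns = neighbors(start)
```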
## Arm Elimination
Progressively eliminates statistically dominated combinations. Starts with a small batch of datapoints, then grows the batch while eliminating underperformers.
```python
from agentopt import ArmEliminationModelSelector

selector = ArmEliminationModelSelector(
    agent=MyAgent,
    models=models,
    eval_fn=eval_fn,
    dataset=dataset,
    growth_factor=2.0,
    confidence=1.0,
)
```
| Parameter | Default | Description |
|---|---|---|
| n_initial | None | Initial batch size. Default: 10% of dataset (max(1, len(dataset)//10)) |
| growth_factor | 2.0 | Batch size multiplier per round |
| confidence | 1.0 | Elimination confidence threshold |
**When to use:** When bad combinations should be eliminated early to save budget. Particularly effective when there are clearly weak options.
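A toy, deterministic sketch of the grow-and-eliminate loop described above (illustrative only, not the library's code; the score streams stand in for per-datapoint evaluation results, and the fixed 0.3 elimination margin stands in for a proper confidence bound):

```python
# Three "arms" with fixed per-datapoint scores (90%, 70%, 40% accurate).
streams = {
    "A": [1, 1, 1, 1, 1, 1, 1, 1, 1, 0] * 20,
    "B": [1, 1, 1, 1, 1, 1, 1, 0, 0, 0] * 20,
    "C": [1, 1, 1, 1, 0, 0, 0, 0, 0, 0] * 20,
}

active = set(streams)
seen = {a: 0 for a in streams}   # datapoints consumed per arm
wins = {a: 0 for a in streams}   # successes per arm
batch, growth_factor, margin = 4, 2.0, 0.3

while len(active) > 1 and batch <= 64:
    # Evaluate every surviving arm on one more batch of datapoints.
    for arm in active:
        wins[arm] += sum(streams[arm][seen[arm]:seen[arm] + batch])
        seen[arm] += batch
    # Drop arms whose running mean trails the leader by more than the margin.
    means = {a: wins[a] / seen[a] for a in active}
    best = max(means.values())
    active = {a for a in active if means[a] >= best - margin}
    batch = int(batch * growth_factor)

winner = max(active, key=lambda a: wins[a] / seen[a])
```

The weak arm is eliminated after only a few datapoints, while the contenders run the full schedule, which is where the budget savings come from.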
## Epsilon LUCB
Identifies an ε-optimal best arm using Lower and Upper Confidence Bounds. Each round, it compares the current leader's lower confidence bound against the best challenger's upper bound. When the gap closes below epsilon, the algorithm stops with statistical confidence that the selected arm is within epsilon of optimal.
```python
from agentopt import EpsilonLUCBModelSelector

selector = EpsilonLUCBModelSelector(
    agent=MyAgent,
    models=models,
    eval_fn=eval_fn,
    dataset=dataset,
    epsilon=0.01,
    confidence=1.0,
)
```
| Parameter | Default | Description |
|---|---|---|
| epsilon | 0.01 | Acceptable gap from the true best |
| n_initial | 1 | Initial datapoints per combination |
| confidence | 1.0 | Confidence level for bound computation |
**When to use:** When finding the exact best combo isn't necessary and you can tolerate a small accuracy gap (epsilon) in exchange for significant cost savings. Particularly effective when many combos are close in performance.
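The stopping check can be sketched as follows. The Hoeffding-style half-width and the running statistics here are illustrative assumptions, not the library's exact bound:

```python
import math

def half_width(n, confidence=1.0):
    """Hoeffding-style confidence half-width after n observations."""
    return confidence * math.sqrt(1.0 / (2 * n))

# Hypothetical running (mean, sample count) per combination.
stats = {"combo1": (0.82, 50), "combo2": (0.78, 50), "combo3": (0.60, 50)}
epsilon = 0.01

means = {k: m for k, (m, _) in stats.items()}
leader = max(means, key=means.get)
challenger = max((k for k in means if k != leader), key=means.get)

leader_lcb = means[leader] - half_width(stats[leader][1])
challenger_ucb = means[challenger] + half_width(stats[challenger][1])

# Stop once the challenger's UCB exceeds the leader's LCB by less than epsilon.
gap = challenger_ucb - leader_lcb
should_stop = gap < epsilon
```

With 50 samples each, the intervals still overlap (gap 0.16), so sampling continues; as counts grow, the half-widths shrink and the gap closes.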
## Threshold Successive Elimination
Instead of finding the single best combination, Threshold SE classifies each combination as above or below a user-defined performance threshold. Each round, it evaluates all surviving combos on one more datapoint and checks their confidence intervals. Once a combo's interval no longer straddles the threshold (entirely above or entirely below), it's classified and removed from the active set.
```python
from agentopt import ThresholdBanditSEModelSelector

selector = ThresholdBanditSEModelSelector(
    agent=MyAgent,
    models=models,
    eval_fn=eval_fn,
    dataset=dataset,
    threshold=0.75,
    confidence=1.0,
)
```
| Parameter | Default | Description |
|---|---|---|
| threshold | 0.75 | Performance threshold to classify against |
| confidence | 1.0 | Confidence level for bound computation |
**When to use:** When you have a minimum acceptable accuracy in mind (e.g., "I need at least 75%") and want to quickly identify which combinations meet it. Useful for filtering rather than ranking.
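One classification round can be sketched like this. The half-width formula and the running statistics are illustrative assumptions, not the library's exact bound:

```python
import math

def half_width(n, confidence=1.0):
    """Hoeffding-style confidence half-width after n observations."""
    return confidence * math.sqrt(1.0 / (2 * n))

threshold = 0.75
# Hypothetical (mean, sample count) per combination after some rounds.
stats = {"combo1": (0.90, 32), "combo2": (0.50, 32), "combo3": (0.76, 32)}

above, below, undecided = [], [], []
for combo, (mean, n) in stats.items():
    lo, hi = mean - half_width(n), mean + half_width(n)
    if lo > threshold:        # interval entirely above: classified, removed
        above.append(combo)
    elif hi < threshold:      # interval entirely below: classified, removed
        below.append(combo)
    else:                     # interval still straddles: keep evaluating
        undecided.append(combo)
```

Here combo3's interval (0.635 to 0.885) still straddles 0.75, so only it stays in the active set for the next round.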
## LM Proposal
Uses a proposer LLM to shortlist promising combinations before evaluation. The proposer sees the candidate models and a dataset preview, then suggests which combinations to try.
```python
from agentopt import LMProposalModelSelector

selector = LMProposalModelSelector(
    agent=MyAgent,
    models=models,
    eval_fn=eval_fn,
    dataset=dataset,
    proposer_model="gpt-4.1",
    objective="maximize accuracy and then minimize latency and cost",
    dataset_preview_size=10,
)
```
| Parameter | Default | Description |
|---|---|---|
| proposer_model | "gpt-4.1" | Model used for proposal generation |
| proposer_client | None | Custom OpenAI-compatible client; auto-creates OpenAI() if omitted |
| objective | "maximize accuracy and then minimize latency and cost" | Natural-language objective passed to the proposer |
| dataset_preview_size | 10 | Number of dataset examples shown to the proposer |
**When to use:** When you want to leverage an LLM's knowledge about model capabilities to skip obviously bad combinations.
## Bayesian Optimization
Uses a Gaussian Process surrogate to predict accuracy for unevaluated combinations, then selects the most promising one via Expected Improvement.
```python
from agentopt import BayesianOptimizationModelSelector

selector = BayesianOptimizationModelSelector(
    agent=MyAgent,
    models=models,
    eval_fn=eval_fn,
    dataset=dataset,
    batch_size=1,
    sample_fraction=0.25,
)
```
| Parameter | Default | Description |
|---|---|---|
| batch_size | 1 | Combinations to evaluate per GP iteration |
| sample_fraction | 0.25 | Fraction of dataset to use per evaluation |
**When to use:** When each evaluation is expensive (large dataset, slow models) and you want to minimize total evaluations. The GP learns from past results to pick the most informative next combination.
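The Expected Improvement acquisition step can be sketched as follows. The posterior means and standard deviations below are invented numbers; in the real selector, the fitted GP produces them:

```python
import math

def normal_pdf(z):
    return math.exp(-z * z / 2) / math.sqrt(2 * math.pi)

def normal_cdf(z):
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def expected_improvement(mu, sigma, best_so_far):
    """EI of a candidate with surrogate mean mu and std sigma (maximization)."""
    if sigma == 0:
        return max(0.0, mu - best_so_far)
    z = (mu - best_so_far) / sigma
    return (mu - best_so_far) * normal_cdf(z) + sigma * normal_pdf(z)

# Hypothetical GP posterior (mean, std) for three unevaluated combinations.
posterior = {"combo1": (0.70, 0.02), "combo2": (0.68, 0.15), "combo3": (0.50, 0.05)}
best_so_far = 0.72  # best accuracy observed so far

ei = {c: expected_improvement(m, s, best_so_far) for c, (m, s) in posterior.items()}
next_combo = max(ei, key=ei.get)
```

Note that combo2 wins despite a lower posterior mean than combo1: its higher uncertainty gives it more upside, which is how EI trades off exploration against exploitation.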