Class AgentBenchmarkSuite

java.lang.Object
neqsim.util.agentic.AgentBenchmarkSuite
All Implemented Interfaces:
Serializable

public class AgentBenchmarkSuite extends Object implements Serializable
Defines and evaluates standardized engineering benchmark problems for agent performance measurement.

Inspired by the Simona dataset used by Tian et al. (2026) for evaluating multi-agent chemical process design workflows, this class provides a curated set of engineering problems with known reference solutions. Each benchmark problem specifies inputs, expected outputs with tolerances, and pass/fail criteria. Agent systems can run the full suite to measure convergence rate, accuracy, and completeness across diverse engineering tasks.

Problem Categories:

  • THERMO — Pure component and mixture thermodynamic properties
  • FLASH — Phase equilibrium calculations (TP, PH, PS flash)
  • PROCESS — Process equipment and flowsheet simulations
  • PIPELINE — Multiphase pipe flow and pressure drop
  • ECONOMICS — Field development NPV and cost estimation
  • SAFETY — Depressurization, relief valve sizing, safety envelopes

Usage:


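// Build the pre-populated suite and submit the agent's computed value for one problem.
AgentBenchmarkSuite suite = AgentBenchmarkSuite.createStandardSuite();
suite.addResult("methane_density_300K_50bar", 34.05);

// Compare the submitted results against the reference data and inspect the report.
BenchmarkReport report = suite.evaluate();
System.out.println("Pass rate: " + report.getPassRate());
System.out.println("Failed: " + report.getFailedProblems());
String json = report.toJson();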
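
The class description says each problem carries an expected output with a tolerance and a pass/fail criterion. The following is a minimal, self-contained sketch of how such a check and the resulting pass rate could be computed. It does not use NeqSim and is not the library's implementation: the class `ToleranceCheckSketch`, the relative-tolerance rule, the rule that missing results count as failures, and all numbers are assumptions chosen for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for illustration; not part of neqsim.util.agentic.
class ToleranceCheckSketch {

    /** Returns true when value lies within the relative tolerance of the reference. */
    static boolean passes(double reference, double relTol, double value) {
        return Math.abs(value - reference) <= relTol * Math.abs(reference);
    }

    /**
     * Fraction of problems whose submitted value passes.
     * Each problem maps to {reference, relative tolerance}.
     * Assumption: a problem with no submitted result counts as a failure.
     */
    static double passRate(Map<String, double[]> problems, Map<String, Double> results) {
        if (problems.isEmpty()) {
            return 0.0;
        }
        int passed = 0;
        for (Map.Entry<String, double[]> entry : problems.entrySet()) {
            Double submitted = results.get(entry.getKey());
            double[] ref = entry.getValue();
            if (submitted != null && passes(ref[0], ref[1], submitted)) {
                passed++;
            }
        }
        return (double) passed / problems.size();
    }

    public static void main(String[] args) {
        // Illustrative numbers only; these are not reference data from the suite.
        Map<String, double[]> problems = new LinkedHashMap<>();
        problems.put("problem_a", new double[] {100.0, 0.05});
        problems.put("problem_b", new double[] {2.0, 0.01});

        Map<String, Double> results = new LinkedHashMap<>();
        results.put("problem_a", 103.0); // within 5 % of 100.0; problem_b has no result

        System.out.println("Pass rate: " + passRate(problems, results));
    }
}
```

Whether the real suite uses relative or absolute tolerances, and how it treats problems without a submitted result, should be checked against the `BenchmarkProblem` and `BenchmarkReport` documentation.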

Version:
1.0
Author:
Even Solbraa
  • Field Details

  • Constructor Details

    • AgentBenchmarkSuite

      public AgentBenchmarkSuite(String suiteName)
      Creates a new benchmark suite with the given name.
      Parameters:
      suiteName - descriptive name for the benchmark suite
  • Method Details

    • addProblem

      public void addProblem(AgentBenchmarkSuite.BenchmarkProblem problem)
      Adds a benchmark problem to the suite.
      Parameters:
      problem - the benchmark problem to add
    • addResult

      public void addResult(String problemId, double value)
      Submits an agent result for a specific problem.
      Parameters:
      problemId - the unique identifier of the benchmark problem
      value - the computed result value
    • addConvergenceResult

      public void addConvergenceResult(String problemId, boolean converged)
      Records whether a simulation converged for a specific problem.
      Parameters:
      problemId - the unique identifier of the benchmark problem
      converged - true if the simulation converged, false otherwise
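
The class description lists convergence rate among the metrics an agent system can measure. As a rough illustration of how per-problem convergence flags could be aggregated into such a rate, here is a standalone sketch. The class `ConvergenceTrackerSketch` and its methods are hypothetical, and the choice to divide by the number of recorded problems (rather than all problems in the suite) is an assumption, not documented NeqSim behavior.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for illustration; not part of neqsim.util.agentic.
class ConvergenceTrackerSketch {
    // One flag per problem ID; recording the same ID again overwrites the earlier flag.
    private final Map<String, Boolean> flags = new LinkedHashMap<>();

    void record(String problemId, boolean converged) {
        flags.put(problemId, converged);
    }

    /** Fraction of recorded problems that converged, or 0.0 when nothing was recorded. */
    double convergenceRate() {
        if (flags.isEmpty()) {
            return 0.0;
        }
        long converged = flags.values().stream().filter(Boolean::booleanValue).count();
        return (double) converged / flags.size();
    }

    public static void main(String[] args) {
        ConvergenceTrackerSketch tracker = new ConvergenceTrackerSketch();
        // Problem IDs below are made up for this example.
        tracker.record("flash_case_1", true);
        tracker.record("flash_case_2", true);
        tracker.record("pipeline_case_1", false);
        tracker.record("process_case_1", true);
        System.out.println("Convergence rate: " + tracker.convergenceRate());
    }
}
```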
    • evaluate

      public AgentBenchmarkSuite.BenchmarkReport evaluate()
      Evaluates all submitted results against the benchmark reference data.
      Returns:
      a BenchmarkReport with pass/fail verdicts and aggregate metrics
    • getProblems

      public List<AgentBenchmarkSuite.BenchmarkProblem> getProblems()
      Returns the list of benchmark problems in this suite.
      Returns:
      unmodifiable list of benchmark problems
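
For readers unfamiliar with unmodifiable lists in Java, the sketch below shows what callers can expect from such a return value: reads succeed, while mutation attempts throw `UnsupportedOperationException`. It uses `Collections.unmodifiableList` from the standard library. Whether `getProblems` is built on that call is an assumption, and the class `UnmodifiableViewSketch` with its string IDs is a hypothetical stand-in for the suite and its problems.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for illustration; not part of neqsim.util.agentic.
class UnmodifiableViewSketch {
    private final List<String> problemIds = new ArrayList<>();

    void add(String id) {
        problemIds.add(id);
    }

    /** Read-only view; in this sketch it reflects later calls to add(). */
    List<String> getProblemIds() {
        return Collections.unmodifiableList(problemIds);
    }

    /** Returns true when the given list refuses modification. */
    static boolean rejectsMutation(List<String> view) {
        try {
            view.add("should_not_be_added");
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        UnmodifiableViewSketch suite = new UnmodifiableViewSketch();
        suite.add("problem_a"); // made-up ID
        List<String> view = suite.getProblemIds();
        System.out.println("Size: " + view.size());
        System.out.println("Rejects mutation: " + rejectsMutation(view));
    }
}
```

In practice this means new problems go through `addProblem` rather than through the returned list.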
    • getSuiteName

      public String getSuiteName()
      Returns the name of this benchmark suite.
      Returns:
      the suite name
    • createStandardSuite

      public static AgentBenchmarkSuite createStandardSuite()
      Creates a standard benchmark suite with representative problems across all categories.

      Reference data are drawn from the NIST Chemistry WebBook, published experimental data, and validated simulation results.

      Returns:
      a pre-populated benchmark suite
    • toJson

      public String toJson()
      Serializes the benchmark suite definition to JSON.
      Returns:
      JSON string representation of the suite