Class TrainingDataCollector
java.lang.Object
neqsim.process.ml.TrainingDataCollector
- All Implemented Interfaces:
Serializable
Training data collector for surrogate model development.
Collects input-output pairs from NeqSim simulations for training neural network surrogates. Supports:
- CSV export for scikit-learn, PyTorch, TensorFlow
- JSON export for flexible data handling
- Feature normalization statistics
- Train/validation/test split suggestions
Usage Example:
TrainingDataCollector collector = new TrainingDataCollector("flash_surrogate");
collector.defineInput("temperature", "K", 200.0, 500.0);
collector.defineInput("pressure", "bar", 1.0, 100.0);
collector.defineOutput("vapor_fraction", "mole_frac", 0.0, 1.0);
// Run many simulations
for (...) {
  collector.startSample();
  collector.recordInput("temperature", T);
  collector.recordInput("pressure", P);
  // Run flash calculation
  collector.recordOutput("vapor_fraction", result);
  collector.endSample();
}
collector.exportCSV("training_data.csv");
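The class description lists train/validation/test split suggestions but does not show how the split is produced. As an illustration only, the following is a minimal sketch of a seeded, shuffled index split. The class `DatasetSplit`, its method `of`, and the 70/15/15 fractions are assumptions made for this example; they are not part of the NeqSim API.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

/** Hypothetical helper (not part of NeqSim): shuffled train/validation/test index split. */
final class DatasetSplit {
  final List<Integer> train;
  final List<Integer> validation;
  final List<Integer> test;

  private DatasetSplit(List<Integer> train, List<Integer> validation, List<Integer> test) {
    this.train = train;
    this.validation = validation;
    this.test = test;
  }

  /** Splits the indices 0..sampleCount-1; whatever is left after train and validation is test. */
  static DatasetSplit of(int sampleCount, double trainFraction, double validationFraction,
      long seed) {
    if (sampleCount < 0 || trainFraction < 0.0 || validationFraction < 0.0
        || trainFraction + validationFraction > 1.0) {
      throw new IllegalArgumentException("invalid sample count or fractions");
    }
    List<Integer> indices = new ArrayList<>(sampleCount);
    for (int i = 0; i < sampleCount; i++) {
      indices.add(i);
    }
    // A fixed seed makes the shuffle, and therefore the split, reproducible.
    Collections.shuffle(indices, new Random(seed));
    int nTrain = (int) Math.round(sampleCount * trainFraction);
    int nVal = Math.min((int) Math.round(sampleCount * validationFraction),
        sampleCount - nTrain);
    return new DatasetSplit(
        new ArrayList<>(indices.subList(0, nTrain)),
        new ArrayList<>(indices.subList(nTrain, nTrain + nVal)),
        new ArrayList<>(indices.subList(nTrain + nVal, sampleCount)));
  }
}
```

Calling `DatasetSplit.of(collector.getSampleCount(), 0.7, 0.15, 42L)` would give three disjoint index lists that can be used to select rows from the exported CSV.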
- Version:
- 1.0
- Author:
- ESOL
Nested Class Summary
Nested Classes
- private static class TrainingDataCollector.FeatureDefinition: Feature definition for inputs/outputs.
- private static class TrainingDataCollector.RunningStats: Running statistics calculator.
Field Summary
Fields
- private final Map<String, TrainingDataCollector.FeatureDefinition> inputDefs
- private final Map<String, TrainingDataCollector.RunningStats> inputStats
- private final String name
- private final Map<String, TrainingDataCollector.FeatureDefinition> outputDefs
- private final Map<String, TrainingDataCollector.RunningStats> outputStats
- private static final long serialVersionUID
Constructor Summary
Constructors
- TrainingDataCollector(String name): Create a training data collector.
Method Summary
Methods
- void clear(): Clear all collected samples.
- TrainingDataCollector defineInput(String name, String unit, double minBound, double maxBound): Define an input feature.
- TrainingDataCollector defineOutput(String name, String unit, double minBound, double maxBound): Define an output feature.
- void endSample(): End current sample and add to dataset.
- void exportCSV(String filePath): Export to CSV format.
- getInputStatistics(): Get normalization statistics for inputs.
- String getName(): Get dataset name.
- getOutputStatistics(): Get normalization statistics for outputs.
- int getSampleCount(): Get number of samples collected.
- String getSummary(): Get summary statistics as formatted string.
- void recordInput(String name, double value): Record an input value for current sample.
- void recordOutput(String name, double value): Record an output value for current sample.
- void recordStateAsInputs(StateVector state): Record state vector as inputs.
- void recordStateAsOutputs(StateVector state): Record state vector as outputs.
- void startSample(): Start recording a new sample.
- String toCSV(): Export to CSV string.
- String toString()
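Field Details
serialVersionUID
private static final long serialVersionUID
name
private final String name
inputDefs
private final Map<String, TrainingDataCollector.FeatureDefinition> inputDefs
outputDefs
private final Map<String, TrainingDataCollector.FeatureDefinition> outputDefs
samples
currentSample
inputStats
private final Map<String, TrainingDataCollector.RunningStats> inputStats
outputStats
private final Map<String, TrainingDataCollector.RunningStats> outputStats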
Constructor Details
TrainingDataCollector
public TrainingDataCollector(String name)
Create a training data collector.
- Parameters:
name - identifier for this dataset
Method Details
defineInput
public TrainingDataCollector defineInput(String name, String unit, double minBound, double maxBound)
Define an input feature.
- Parameters:
name - feature name
unit - physical unit
minBound - expected minimum value
maxBound - expected maximum value
- Returns:
- this collector for chaining
defineOutput
public TrainingDataCollector defineOutput(String name, String unit, double minBound, double maxBound)
Define an output feature.
- Parameters:
name - feature name
unit - physical unit
minBound - expected minimum value
maxBound - expected maximum value
- Returns:
- this collector for chaining
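The documentation does not say how the declared `minBound` and `maxBound` values are used downstream. One common use for such bounds when training a surrogate is min-max scaling to the range [0, 1]. The sketch below shows that technique under this assumption; `MinMaxScaler` is a hypothetical class for illustration, not a NeqSim type.

```java
/** Hypothetical helper (not part of NeqSim): min-max scaling with declared feature bounds. */
final class MinMaxScaler {
  private final double min;
  private final double max;

  MinMaxScaler(double minBound, double maxBound) {
    // Also rejects NaN bounds, because any comparison with NaN is false.
    if (!(maxBound > minBound)) {
      throw new IllegalArgumentException("maxBound must be greater than minBound");
    }
    this.min = minBound;
    this.max = maxBound;
  }

  /** Maps minBound to 0 and maxBound to 1; values outside the bounds fall outside [0, 1]. */
  double scale(double value) {
    return (value - min) / (max - min);
  }

  /** Inverse of scale: maps a scaled value back to physical units. */
  double unscale(double scaled) {
    return min + scaled * (max - min);
  }
}
```

With the temperature bounds from the usage example (200.0 K to 500.0 K), `new MinMaxScaler(200.0, 500.0).scale(350.0)` gives 0.5.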
startSample
public void startSample()
Start recording a new sample.
recordInput
public void recordInput(String name, double value)
Record an input value for current sample.
- Parameters:
name - input feature name
value - value to record
recordOutput
public void recordOutput(String name, double value)
Record an output value for current sample.
- Parameters:
name - output feature name
value - value to record
recordStateAsInputs
public void recordStateAsInputs(StateVector state)
Record state vector as inputs.
- Parameters:
state - state vector
recordStateAsOutputs
public void recordStateAsOutputs(StateVector state)
Record state vector as outputs.
- Parameters:
state - state vector
endSample
public void endSample()
End current sample and add to dataset.
getSampleCount
public int getSampleCount()
Get number of samples collected.
- Returns:
- sample count
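getName
public String getName()
Get dataset name.
exportCSV
public void exportCSV(String filePath) throws IOException
Export to CSV format.
- Parameters:
filePath - path to output file
- Throws:
IOException - if writing fails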
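The exact column layout written by the CSV export is not documented here. As a rough sketch of what a header-plus-rows export can look like, and of how a write failure surfaces as an `IOException`, consider the following. `CsvSketch`, the column order, and the use of `Double.toString` for number formatting are all assumptions for illustration, not the NeqSim implementation.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

/** Hypothetical helper (not part of NeqSim): writes numeric samples as CSV. */
final class CsvSketch {
  /** Builds one header line followed by one line per sample row. */
  static String toCsv(List<String> header, List<double[]> rows) {
    StringBuilder sb = new StringBuilder(String.join(",", header)).append('\n');
    for (double[] row : rows) {
      if (row.length != header.size()) {
        throw new IllegalArgumentException("row length does not match header");
      }
      for (int i = 0; i < row.length; i++) {
        if (i > 0) {
          sb.append(',');
        }
        sb.append(Double.toString(row[i]));
      }
      sb.append('\n');
    }
    return sb.toString();
  }

  /** Writes the CSV text to a file; try-with-resources closes the writer even on failure. */
  static void write(Path file, List<String> header, List<double[]> rows) throws IOException {
    try (BufferedWriter writer = Files.newBufferedWriter(file, StandardCharsets.UTF_8)) {
      writer.write(toCsv(header, rows));
    }
  }
}
```

Feature names that contain commas or quotes would need escaping, which this sketch leaves out.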
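toCSV
public String toCSV()
Export to CSV string.
getInputStatistics
Get normalization statistics for inputs.
getOutputStatistics
Get normalization statistics for outputs.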
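The statistics getters rely on the nested running statistics calculator, whose internals are not shown on this page. A common way to keep such statistics without storing every sample is Welford's online algorithm. The sketch below illustrates that technique together with z-score normalization; `RunningStatsSketch` is a hypothetical class and may differ from the actual `RunningStats` implementation.

```java
/** Hypothetical helper (not part of NeqSim): Welford running mean and variance. */
final class RunningStatsSketch {
  private long count;
  private double mean;
  private double m2; // running sum of squared deviations from the current mean

  /** Updates the statistics with one new value in O(1) time and memory. */
  void add(double x) {
    count++;
    double delta = x - mean;
    mean += delta / count;
    m2 += delta * (x - mean);
  }

  long count() {
    return count;
  }

  double mean() {
    return mean;
  }

  /** Sample variance (divides by n - 1); 0 when fewer than two values were added. */
  double variance() {
    return count > 1 ? m2 / (count - 1) : 0.0;
  }

  double std() {
    return Math.sqrt(variance());
  }

  /** Z-score normalization; returns 0 when the standard deviation is 0. */
  double normalize(double x) {
    double s = std();
    return s > 0.0 ? (x - mean) / s : 0.0;
  }
}
```

Keeping one such accumulator per feature name would be enough to report mean and standard deviation for every input and output after a single pass over the samples.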
clear
public void clear()
Clear all collected samples.
getSummary
public String getSummary()
Get summary statistics as formatted string.
toString
public String toString()