Training and Evaluating the System
MeshGraphNets.train_network
— Function

    train_network(noise_stddevs, opt, ds_path, cp_path; kws...)
Starts the training process with the given configuration.
Arguments

- `noise_stddevs`: Array containing the standard deviations of the noise that is added to the specified node types. Its length is either one (broadcast over all features) or equal to the number of features.
- `opt`: Optimiser that is used for training.
- `ds_path`: Path to the dataset folder.
- `cp_path`: Path where checkpoints are saved.
- `kws`: Keyword arguments that customize the training process.
Keyword Arguments

- `mps = 15`: Number of message passing steps.
- `layer_size = 128`: Latent size of the hidden layers inside MLPs.
- `hidden_layers = 2`: Number of hidden layers inside MLPs.
- `batchsize = 1`: Size per batch (not implemented yet).
- `epochs = 1`: Number of epochs.
- `steps = 10e6`: Number of training steps.
- `checkpoint = 10000`: Number of steps after which checkpoints are created.
- `norm_steps = 1000`: Number of steps before training during which normalization statistics are accumulated.
- `max_norm_steps = 10f6`: Number of steps after which no more normalization statistics are collected.
- `types_updated = [0, 5]`: Array containing the node types that are updated after each step.
- `types_noisy = [0]`: Array containing the node types to which noise is added.
- `training_strategy = DerivativeTraining()`: Method used for training. See the training strategies below.
- `use_cuda = true`: Whether a GPU is used for training (if available). Currently only CUDA GPUs are supported.
- `gpu_device = CUDA.device()`: Current CUDA device (i.e. GPU). See `nvidia-smi` for reference.
- `cell_idxs = [0]`: Indices of cells that are plotted during validation (if enabled).
- `solver_valid = Tsit5()`: Solver that is used for validation during training.
- `solver_valid_dt = nothing`: If set, the validation solver uses fixed time steps.
- `wandb_logger = nothing`: If set, the given Wandb `WandbLogger` is used for logging the training.
- `reset_valid = false`: If set, the previous minimal validation loss is overwritten.
Training Strategies

- `DerivativeTraining`
- `SolverTraining`
- `MultipleShooting`

See the CylinderFlow example for reference.
Returns

- Trained network as a `GraphNetwork` struct.
- Minimum of the validation loss (for hyperparameter tuning).
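A minimal training call might look as follows. This is a sketch, not a definitive invocation: the dataset and checkpoint paths, the noise level, and the choice of `Optimisers.Adam` with that learning rate are all placeholder assumptions, and the two return values are assumed to be destructurable as a tuple per the Returns section above.

```julia
using MeshGraphNets
using Optimisers  # assumption: any Optimisers.jl-compatible optimiser works here

# One standard deviation, broadcast over all features of the noisy node types.
noise_stddevs = [0.02f0]
opt = Optimisers.Adam(1.0f-4)

net, min_valid_loss = train_network(
    noise_stddevs, opt, "data/CylinderFlow", "checkpoints/";
    mps = 15,                                  # message passing steps
    layer_size = 128,                          # latent size of the MLP hidden layers
    hidden_layers = 2,
    epochs = 1,
    checkpoint = 10000,                        # save a checkpoint every 10000 steps
    training_strategy = DerivativeTraining(),
    use_cuda = true,
)
```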
MeshGraphNets.eval_network
— Function

    eval_network(ds_path, cp_path, out_path, solver; start, stop, dt, saves, mse_steps, kws...)
Starts the evaluation process with the given configuration.
Arguments

- `ds_path`: Path to the dataset folder.
- `cp_path`: Path where checkpoints are saved.
- `out_path`: Path where the result is saved.
- `solver`: Solver that is used for evaluating the system.
- `start`: Start time of the simulation.
- `stop`: Stop time of the simulation.
- `dt = nothing`: If provided, the solver uses fixed step sizes.
- `saves`: Time steps at which the solution is saved.
- `mse_steps`: Time steps at which the relative error is printed.
- `kws`: Keyword arguments that customize the evaluation process. The configuration of the system has to be the same as during training.
Keyword Arguments

- `mps = 15`: Number of message passing steps.
- `layer_size = 128`: Latent size of the hidden layers inside MLPs.
- `hidden_layers = 2`: Number of hidden layers inside MLPs.
- `types_updated = [0, 5]`: Array containing the node types that are updated after each step.
- `use_cuda = true`: Whether a GPU is used for evaluation (if available). Currently only CUDA GPUs are supported.
- `gpu_device = CUDA.device()`: Current CUDA device (i.e. GPU). See `nvidia-smi` for reference.
- `num_rollouts = 10`: Number of trajectories that are simulated (from the test dataset).
- `use_valid = true`: Whether the last checkpoint with the minimal validation loss should be used.
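A matching evaluation call could then look like the sketch below. The paths, time span, and save points are illustrative; `Tsit5` is assumed to be imported from OrdinaryDiffEq.jl, and the network hyperparameters are repeated because the configuration must match the one used during training.

```julia
using MeshGraphNets
using OrdinaryDiffEq: Tsit5  # assumption: solver supplied by OrdinaryDiffEq.jl

eval_network(
    "data/CylinderFlow", "checkpoints/", "results/", Tsit5();
    start = 0.0f0, stop = 6.0f0,
    saves = 0.0f0:0.01f0:6.0f0,          # time steps at which the solution is saved
    mse_steps = [0.0f0, 3.0f0, 6.0f0],   # time steps at which the error is printed
    mps = 15, layer_size = 128, hidden_layers = 2,  # must match training
    num_rollouts = 10,
    use_valid = true,                    # use the checkpoint with minimal validation loss
)
```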