Training and Evaluating the System

MeshGraphNets.train_network — Function
train_network(noise_stddevs, opt, ds_path, cp_path; kws...)

Starts the training process with the given configuration.

Arguments

  • noise_stddevs: Array of standard deviations of the noise added to the specified node types; its length is either one (broadcast across all features) or equal to the number of features.
  • opt: Optimiser that is used for training.
  • ds_path: Path to the dataset folder.
  • cp_path: Path where checkpoints are being saved to.
  • kws: Keyword arguments that customize the training process.

Keyword Arguments

  • mps = 15: Number of message passing steps.
  • layer_size = 128: Latent size of the hidden layers inside MLPs.
  • hidden_layers = 2: Number of hidden layers inside MLPs.
  • batchsize = 1: Batch size (not implemented yet).
  • epochs = 1: Number of epochs.
  • steps = 10e6: Number of training steps.
  • checkpoint = 10000: Number of steps after which checkpoints are created.
  • norm_steps = 1000: Number of steps before training during which normalization statistics are accumulated.
  • max_norm_steps = 10f6: Number of steps after which no further normalization statistics are collected.
  • types_updated = [0, 5]: Array containing the node types that are updated after each step.
  • types_noisy = [0]: Array containing the node types to which noise is added.
  • training_strategy = DerivativeTraining(): Strategy used for training. See the documentation on training strategies below.
  • use_cuda = true: Whether a GPU is used for training (if available). Currently only CUDA GPUs are supported.
  • gpu_device = CUDA.device(): Current CUDA device (aka GPU). See nvidia-smi for reference.
  • cell_idxs = [0]: Indices of cells that are plotted during validation (if enabled).
  • solver_valid = Tsit5(): Which solver should be used for validation during training.
  • solver_valid_dt = nothing: If set, the solver for validation will use fixed timesteps.
  • wandb_logger = nothing: If set, a WandbLogger from Wandb.jl will be used for logging the training.
  • reset_valid = false: If set, the previous minimal validation loss will be overwritten.

Training Strategies

  • DerivativeTraining
  • SolverTraining
  • MultipleShooting

See CylinderFlow Example for reference.

Returns

  • Trained network as a GraphNetwork struct.
  • Minimum of validation loss (for hyperparameter tuning).
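As a sketch, a training run might be started as follows. The paths, noise level, and optimiser settings are placeholders for illustration, not values prescribed by this documentation; the example assumes an optimiser from Optimisers.jl is accepted for the `opt` argument.

```julia
using MeshGraphNets
using Optimisers

# Hypothetical paths and hyperparameters for illustration only.
gn, min_valid_loss = train_network(
    [1.0f-2],                  # noise_stddevs: one value, broadcast over features
    Optimisers.Adam(1.0f-4),   # opt: optimiser used for training
    "data/cylinder_flow/",     # ds_path: dataset folder
    "checkpoints/";            # cp_path: checkpoint folder
    mps = 15,
    layer_size = 128,
    hidden_layers = 2,
    epochs = 1,
    training_strategy = DerivativeTraining()
)
```

The returned `gn` is the trained GraphNetwork struct, and `min_valid_loss` can be fed into a hyperparameter search.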
MeshGraphNets.eval_network — Function
eval_network(ds_path, cp_path, out_path, solver; start, stop, dt, saves, mse_steps, kws...)

Starts the evaluation process with the given configuration.

Arguments

  • ds_path: Path to the dataset folder.
  • cp_path: Path where checkpoints are being saved to.
  • out_path: Path where the result is being saved to.
  • solver: Solver that is used for evaluating the system.
  • start: Start time of the simulation.
  • stop: Stop time of the simulation.
  • dt = nothing: If provided, changes the solver to use fixed step sizes.
  • saves: Time steps where the solution is saved at.
  • mse_steps: Time steps where the relative error is printed at.
  • kws: Keyword arguments that customize the evaluation process. The configuration of the system has to be the same as during training.

Keyword Arguments

  • mps = 15: Number of message passing steps.
  • layer_size = 128: Latent size of the hidden layers inside MLPs.
  • hidden_layers = 2: Number of hidden layers inside MLPs.
  • types_updated = [0, 5]: Array containing the node types that are updated after each step.
  • use_cuda = true: Whether a GPU is used for evaluation (if available). Currently only CUDA GPUs are supported.
  • gpu_device = CUDA.device(): Current CUDA device (aka GPU). See nvidia-smi for reference.
  • num_rollouts = 10: Number of trajectories that are simulated (from the test dataset).
  • use_valid = true: Whether the last checkpoint with the minimal validation loss should be used.
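A corresponding evaluation call might look like this. The paths, time span, and save points are illustrative placeholders; Tsit5 is the default validation solver named above and is assumed to be provided by OrdinaryDiffEq.jl. The network hyperparameters must match the ones used during training.

```julia
using MeshGraphNets
using OrdinaryDiffEq: Tsit5

# Hypothetical configuration for illustration only.
eval_network(
    "data/cylinder_flow/",         # ds_path: dataset folder
    "checkpoints/",                # cp_path: checkpoint folder
    "results/",                    # out_path: where results are saved
    Tsit5();                       # solver used for the rollout
    start = 0.0f0,
    stop = 6.0f0,
    saves = 0.0f0:0.01f0:6.0f0,    # time steps at which the solution is saved
    mse_steps = [0.0f0, 1.0f0],    # time steps at which the relative error is printed
    mps = 15,                      # must match training
    layer_size = 128,              # must match training
    hidden_layers = 2,             # must match training
    num_rollouts = 10
)
```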