Train supervised classifiers

As depicted in Fig. 3, once the parameter template has been completed and initialized, you have the option to train the models. Training applies the chosen statistical technique (e.g., an SVM), together with hyperparameter optimization (e.g., optimizing the C parameter) and any wrappers that have been activated. The menu is similar to the one for Preprocess features, so we do not describe all of its settings here. One setting you can change is whether to use precomputed data (e.g., if you have run the preprocessing before). The following menu will be displayed:

1 | Create analysis from scratch

2 | Create analysis using precomputed preprocdata-MATs

3 | Create analysis using precomputed CVdatamats

4 | Create analysis using precomputed CVresult-MATs

1 | Create analysis from scratch

This is when the analysis has only been initialized and no data matrices have been stored. This option preprocesses the data and then trains the classifiers. Please note that if your matrices are large (e.g., with neuroimaging data), this will take up a lot of RAM; in that case it is recommended to preprocess the data first.
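To get a rough sense of why large matrices are a concern, the size of a dense data matrix can be estimated from its dimensions. The sketch below is illustrative only: it assumes every value is stored as an 8-byte double-precision number, and the function name and the example dimensions are hypothetical rather than taken from the software. Memory use during training is likely to be higher than this estimate, because intermediate copies may also be held in RAM.

```python
def estimate_matrix_gib(n_cases: int, n_features: int, bytes_per_value: int = 8) -> float:
    """Approximate size in GiB of a dense n_cases x n_features matrix.

    Assumption: each value occupies `bytes_per_value` bytes
    (8 bytes corresponds to double precision).
    """
    if n_cases < 0 or n_features < 0 or bytes_per_value <= 0:
        raise ValueError("dimensions must be non-negative and bytes_per_value positive")
    total_bytes = n_cases * n_features * bytes_per_value
    return total_bytes / 2**30  # 1 GiB = 2**30 bytes


if __name__ == "__main__":
    # Hypothetical example: 500 cases with 1,000,000 features each.
    print(f"{estimate_matrix_gib(500, 1_000_000):.2f} GiB")
```

If the estimate approaches the RAM available on your machine, that is a sign to run the preprocessing as a separate step first.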

2 | Create analysis using precomputed preprocdata-MATs

This is when the user has preprocessed their data and stored the preprocessed files in a folder; i.e., they have TRAIN files.

3 | Create analysis using precomputed CVdatamats

This is when the user has completed an analysis (preprocessing and training) and has the data matrix files stored in a folder, i.e., “…CVdatamat oCV…” files. When this option is selected, the analysis results from the CVdatamat files are collated into a result file. This is especially useful when the program has crashed and/or the NM structure has not been saved, and the user wants to obtain the results without re-running all of the training.

Note

This is especially useful when the CVdatamats have been computed using job submission!
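Before choosing this option, it can help to check which CVdatamat files actually exist in the folder, for example after a crash or a batch of submitted jobs. The following is a minimal sketch, not part of the software: the function name is hypothetical, and it assumes that the files are stored with a `.mat` extension and that their names contain the substring `CVdatamat`. Adjust both to match your own files.

```python
from pathlib import Path


def find_cvdatamat_files(folder: str, token: str = "CVdatamat", suffix: str = ".mat") -> list[str]:
    """List file names directly inside `folder` that contain `token` and end with `suffix`.

    Assumptions: the `.mat` suffix and the `CVdatamat` token are guesses
    about the naming scheme; subfolders are not searched.
    """
    root = Path(folder)
    if not root.is_dir():
        raise NotADirectoryError(folder)
    return sorted(
        p.name
        for p in root.iterdir()
        if p.is_file() and token in p.name and p.name.endswith(suffix)
    )
```

Calling `find_cvdatamat_files("path/to/analysis_folder")` returns a sorted list of matching file names. An empty list suggests that the folder is wrong or that training never wrote any files there.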

4 | Create analysis using precomputed CVresult-MATs

This is when the user already has the results file and wants to integrate the results into the NM structure. This is useful if there has been a crash and the data have been lost from the NM structure.
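The four options can be read as a simple decision rule based on which outputs are already stored on disk. The sketch below is one possible reading, not logic taken from the software: the function name and its arguments are hypothetical, and the precedence (prefer the option that reuses the most completed work) is an assumption.

```python
def suggest_menu_option(has_preprocessed: bool, has_cvdatamats: bool, has_cvresults: bool) -> int:
    """Return the menu number (1-4) that reuses the most advanced stored output.

    Assumption: later-stage outputs take precedence over earlier ones.
    """
    if has_cvresults:
        return 4  # results file exists: integrate it into the NM structure
    if has_cvdatamats:
        return 3  # CVdatamat files exist: collate them into a result file
    if has_preprocessed:
        return 2  # preprocessed data exist: train without re-preprocessing
    return 1      # nothing stored: preprocess and train from scratch
```

For example, after a crash during training with job submission, only the CVdatamat files may exist, in which case the rule points to option 3.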