Suggested Prerequisites#

NeuroMiner works exclusively from the MATLAB command line. There are some situations in which the variables need to be in a specific format in order for NeuroMiner to recognize them, as described later in this manual. You may also want to understand the NeuroMiner outputs stored in output variables. For these reasons, we recommend a minimal level of knowledge about command-line operation of MATLAB (see below).

MATLAB:

  • Know the differences between vectors/matrices/cell-arrays/structures

  • Know how to address certain cells/rows/columns of the variables

  • Know how to delete/create/address items in a MATLAB structure

  • Know the differences between numerical/string/binary variables

  • Know how to add stuff to the MATLAB path and why this is needed

  • Know how to import files into MATLAB

Tip

For a basic MATLAB tutorial see http://antoniahamilton.com/matlab.html !

NeuroMiner is a program that uses a number of advanced machine learning tools. We have provided some description of the most important concepts and techniques, but the manual often assumes machine learning knowledge. For this reason, we recommend a basic understanding of machine learning concepts (see below).

Machine learning:

  • What is the difference between training and test samples?

  • What is cross-validation/nested cross-validation?

  • What is preprocessing (for machine learning, not just MRI data)?

  • What are filters and wrappers?

  • What are features and classification targets?

  • What is Principal Component Analysis (PCA)?

  • What is a Support Vector Machine (SVM)?

Tip

For a basic tutorial see Andrew Ng’s tutorial!

Nomenclature#

Machine learning uses different nomenclature compared to classical statistics. The following terms and their definitions are used throughout the manual (Rob Tibshirani, Statistics 315a, Stanford):

Machine Learning

Statistics

Features

Predictor variables

Target variable/labels

Outcome variables

Weights

Parameters

Hyperparameters

Parameters

Learning

Fitting

Generalization

Test set performance

Supervised learning

Regression/ Classification

Unsupervised learning

Clustering, density estimation

Large grant = $1,000,000

Large grant = $50,000