Multi-group analyses#

Multi-group analysis is a more advanced feature of NeuroMiner in which classification is used to distinguish between subjects from more than two groups (e.g., schizophrenia, bipolar, and depression subjects). During data input, these groups are simply defined within the labels, e.g., with ‘SCZ’, ‘BP’, and ‘DEP’ identifiers. Once this is done, an option called “Multi-class settings” will appear in the parameter template. Selecting this option and setting “Train multi-class predictor” to ‘yes’ will reveal another option, “3 : Specify multi-class decision mechanism [ ]”. This option determines how NeuroMiner deals with the multiple groups (i.e., how it combines the pairwise predictions into a single decision) and is critical for the interpretation of results. For all the methods included here, it is important to note that NeuroMiner works by conducting multiple binary analyses between each pair of groups (see Aly, 2005 for an overview of the main types introduced below). Selecting this option will reveal the following menu:

1 | Simple Pairwise Decoding (Maximum-Wins method)

2 | Error-correcting output codes

3 | Directed Acyclic Graph

Simple pairwise decoding This is the simplest and most intuitive method, often called “Maximum-Wins.” For each subject, it effectively tallies the evidence for every possible group. To calculate a final score for a specific group, say ‘schizophrenia’, the method gathers the decision scores from all the pairwise classifiers that included schizophrenia (e.g., SCZ vs. Bipolar, SCZ vs. Depression). These scores are then aggregated into a single total score for the schizophrenia group. This process is repeated for all other groups. The final prediction is simply the group with the highest total score—the “maximum wins.”

You can choose how the pairwise decision scores are aggregated. For instance, NeuroMiner allows you to sum the scores, calculate their mean, or use a majority vote where each binary classifier casts one vote for its preferred group.
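The logic above can be sketched in a few lines of Python. This is a hypothetical illustration, not NeuroMiner code: the pairwise scores and the `maximum_wins` function are assumptions made for the example, with each classifier returning a signed decision score whose sign favours the first group of the pair.

```python
# Hypothetical sketch of maximum-wins decoding (not actual NeuroMiner code).
# A positive score favours the first group of the pair, a negative score the second.
pairwise_scores = {
    ("SCZ", "BP"): 0.8,    # evidence for SCZ over BP
    ("SCZ", "DEP"): -0.2,  # evidence for DEP over SCZ
    ("BP", "DEP"): 0.5,    # evidence for BP over DEP
}

def maximum_wins(scores, groups, rule="sum"):
    """Aggregate pairwise decision scores and pick the highest-scoring group."""
    totals = {g: 0.0 for g in groups}
    for (a, b), s in scores.items():
        if rule == "vote":            # each classifier casts one vote
            totals[a if s > 0 else b] += 1
        else:                         # sum the signed evidence for each group
            totals[a] += s
            totals[b] -= s
    return max(totals, key=totals.get)

print(maximum_wins(pairwise_scores, ["SCZ", "BP", "DEP"]))  # -> SCZ
```

Here SCZ accumulates 0.8 − 0.2 = 0.6 in total evidence, more than BP (−0.3) or DEP (−0.3), so the maximum wins.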

Error-correcting output codes Error-Correcting Output Codes (ECOC) is a powerful framework that transforms a multi-group problem into a coding problem. The core idea is to assign each group a unique binary “codeword” or signature. This codeword is a string of values (e.g., +1, -1, 0) that represents how that group is treated by the entire set of binary classifiers. For example, in a one-vs-one classifier comparing schizophrenia and bipolar disorder, the schizophrenia codeword might have a +1 at that position while the bipolar codeword has a -1.

When making a prediction for a new subject, an “output code” is generated by running them through all the binary classifiers. This output code is then compared to the ideal codeword for each group. The subject is assigned to the group whose ideal codeword is the closest to their output code. This “closeness” is measured using a distance metric, and NeuroMiner provides several options, including the classic Hamming distance (which counts mismatched bits) and Euclidean distance. The “error-correcting” property of this method makes it robust; even if a few of the binary classifiers are wrong, the overall pattern of the output code can still be closest to the correct signature, allowing the system to overcome individual errors (see Dietterich et al., 1995; Dietterich and Bakiri, 1995; and, for an overview, Aly, 2005). Optimal results are obtained when the codewords are maximally separated from one another under the chosen distance metric, such as the Hamming distance.
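A minimal sketch of ECOC decoding may make this concrete. The codeword matrix below is an assumption for a one-vs-one design with three groups (it is not taken from NeuroMiner): each column corresponds to one pairwise classifier, with +1 for the first group of the pair, −1 for the second, and 0 for groups not involved in that comparison.

```python
# Hypothetical ECOC sketch (not actual NeuroMiner code).
# Columns: SCZ-vs-BP, SCZ-vs-DEP, BP-vs-DEP.
codewords = {
    "SCZ": [+1, +1,  0],
    "BP":  [-1,  0, +1],
    "DEP": [ 0, -1, -1],
}

def hamming(a, b):
    """Classic Hamming distance: count mismatched positions."""
    return sum(x != y for x, y in zip(a, b))

def decode(output_code, codewords):
    """Assign the group whose ideal codeword is closest to the output code."""
    return min(codewords, key=lambda g: hamming(output_code, codewords[g]))

# Classifiers 1 and 2 favour SCZ; classifier 3 (BP vs DEP) contributes +1,
# which matches no group's ideal third entry for SCZ -- yet SCZ still wins.
print(decode([+1, +1, +1], codewords))  # -> SCZ
```

The output code [+1, +1, +1] is one mismatch away from SCZ's codeword but two from BP's and three from DEP's, so the subject is assigned to SCZ despite the imperfect code: this is the error-correcting property in action.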

Directed Acyclic Graph This method, often referred to as DAGSVM (Directed Acyclic Graph Support Vector Machine), arranges the pairwise classifiers into an efficient, tree-like elimination structure. You can think of it as a tournament bracket. A new subject starts at the root node of the graph, where a single binary classifier (e.g., comparing schizophrenia vs. bipolar) is evaluated. Based on the result, one of the groups is eliminated, and the subject is passed down the path to the next node. This process is repeated at each node, eliminating one competing group at a time, until only a single group remains. This final group becomes the prediction. A key advantage of this method is its computational speed, as it only needs to evaluate N-1 classifiers to make a decision for an N-group problem, unlike other methods that may need to evaluate all possible pairs.
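The elimination tournament can be sketched as follows. This is a hypothetical illustration, not NeuroMiner code: the `decide` function stands in for a trained pairwise classifier, and the toy win table is invented for the example.

```python
# Hypothetical DAG elimination sketch (not actual NeuroMiner code).
# Each comparison eliminates one group, so an N-group problem needs
# only N-1 classifier evaluations.
def dag_predict(groups, decide):
    """decide(a, b) returns the winner of the pairwise classifier a-vs-b."""
    remaining = list(groups)
    while len(remaining) > 1:
        a, b = remaining[0], remaining[-1]  # compare the two "outer" groups
        loser = b if decide(a, b) == a else a
        remaining.remove(loser)             # the losing group is eliminated
    return remaining[0]

# Toy pairwise winners, invented for illustration.
wins = {("SCZ", "BP"): "SCZ", ("SCZ", "DEP"): "SCZ", ("BP", "DEP"): "BP"}

def toy_decide(a, b):
    return wins[(a, b)] if (a, b) in wins else wins[(b, a)]

print(dag_predict(["SCZ", "BP", "DEP"], toy_decide))  # -> SCZ
```

With three groups, only two classifiers are evaluated: SCZ-vs-DEP at the root (eliminating DEP), then SCZ-vs-BP (eliminating BP), leaving SCZ as the prediction.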

Majority Voting

NeuroMiner provides an ensemble method for combining the predictions from multiple classifiers to produce a single, final prediction. This is done using a strategy called majority voting (or plurality voting). The principle is simple: each individual classifier gets to “vote” on a prediction, and the class that receives the most votes is declared the final result.

This approach can lead to more robust and accurate predictions than any single model because the prediction errors of individual models can often cancel each other out. For example, if you have trained 10 different models and 7 of them classify a subject as ‘Schizophrenia’ while 3 classify them as ‘Bipolar’, the majority vote result will be ‘Schizophrenia’.
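The worked example above can be reproduced in a short sketch. This is a generic illustration of plurality voting, not NeuroMiner code; the 7-vs-3 vote split is taken from the example in the text.

```python
# Hypothetical majority (plurality) voting sketch (not actual NeuroMiner code).
from collections import Counter

def majority_vote(predictions):
    """Return the class label that receives the most votes."""
    return Counter(predictions).most_common(1)[0][0]

# The example from the text: 7 models say SCZ, 3 say BP.
votes = ["Schizophrenia"] * 7 + ["Bipolar"] * 3
print(majority_vote(votes))  # -> Schizophrenia
```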