SPARTA: Interpretable functional classification of microbiomes and detection of hidden cumulative effects
Fig 2
Classification algorithm implemented in SPARTA.
For a given run k, a test subset is randomly selected from the initial dataset and set aside. A given iteration j consists of training X random forests (20 by default), each with a dedicated validation subset. These X forests are used to compute a median classification performance P(j, k) and a shortlist of important features. This list is used to train the X random forests of iteration j + 1. By default, SPARTA launches 10 runs of 5 iterations each.
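The run and iteration structure described in this caption can be sketched as the nested loop below. This is a minimal illustration of the control flow only, not SPARTA's actual code. The function name `run_iterative_selection`, the callback `train_and_score` (a stand-in for training one random forest and returning its validation score and per-feature importances), the split fractions, and the shortlist rule (keep features whose summed importance is at least the average) are all assumptions made for this sketch; the caption does not specify how the shortlist is chosen.

```python
import random
from statistics import median


def run_iterative_selection(n_samples, features, train_and_score,
                            n_runs=10, n_iterations=5, n_forests=20,
                            test_fraction=0.2, validation_fraction=0.2, seed=0):
    """Return {(j, k): (median_performance, shortlist)} for iteration j of run k."""
    rng = random.Random(seed)
    results = {}
    for k in range(n_runs):
        # Run k: a random test subset is set aside and never used for training here.
        indices = list(range(n_samples))
        rng.shuffle(indices)
        n_test = max(1, int(test_fraction * n_samples))
        test_idx, pool = indices[:n_test], indices[n_test:]
        current = list(features)
        for j in range(n_iterations):
            scores = []
            totals = {f: 0.0 for f in current}
            for _ in range(n_forests):
                # Each forest gets its own validation subset drawn from the pool.
                shuffled = rng.sample(pool, len(pool))
                n_val = max(1, int(validation_fraction * len(pool)))
                val_idx, train_idx = shuffled[:n_val], shuffled[n_val:]
                # Hypothetical callback: returns (score, {feature: importance}).
                score, importances = train_and_score(train_idx, val_idx, current)
                scores.append(score)
                for f in current:
                    totals[f] += importances.get(f, 0.0)
            # Assumed shortlist rule: summed importance at least the average;
            # keep every feature if none passes.
            mean_total = sum(totals.values()) / len(totals)
            shortlist = [f for f in current if totals[f] >= mean_total] or current
            results[(j, k)] = (median(scores), shortlist)  # P(j, k) and shortlist
            current = shortlist  # features used at iteration j + 1
    return results
```

In practice, `train_and_score` would wrap a random forest implementation of the reader's choice; the held-out `test_idx` is kept separate so that a final evaluation could use it.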