Advancing computational biology and bioinformatics research through open innovation competitions
(A): Evaluation metric used for Antibody Clustering as a function of time (t measured in units of t0 = 100 ms) and accuracy (ACC), as determined during the competition. Solutions lying on the same isocurve are considered equivalent in quality, as defined by the evaluation metric. Data points correspond to average performance based on up to four 10K test sets (failed test cases excluded). Also shown are the benchmark algorithm implementations in Python (A1) and C++ (A2); note that A1 and A2 both have perfect accuracy (ACC equal to unity). (B): Effective computational cost (excluding I/O) on a single core as a function of test set size N for Antibody Clustering (t measured in units of t0 = 100 ms). Dashed lines indicate linear and quadratic scaling behavior. Results are shown for the benchmark algorithms (A1 and A2) and the top four performing solutions (top solution indicated by a solid line and darker data points). The benchmark algorithms scale quadratically with test set size, whereas the top algorithm scales better than quadratically within the regime considered. The best performance over the entire range of N can be achieved using an ensemble of solutions, selected according to whether the test set size lies below or above the crossover value of approximately N ∼ O(10^4).
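
The scaling comparison and the size-based ensemble described in panel (B) can be illustrated with a minimal sketch. It assumes that each solution's cost is well approximated by a power law t ≈ c·N^k over the range of interest. The function names (`fit_power_law`, `crossover_size`, `pick_solution`) and the timing values are hypothetical and synthetic, chosen only so that the crossover falls at 10^4; they are not measurements from the competition.

```python
import math

def fit_power_law(sizes, times):
    """Least-squares fit of log(t) = log(c) + k*log(N). Returns (c, k)."""
    xs = [math.log(n) for n in sizes]
    ys = [math.log(t) for t in times]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    k = sxy / sxx                       # slope on the log-log plot = scaling exponent
    c = math.exp(mean_y - k * mean_x)   # prefactor recovered from the intercept
    return c, k

def crossover_size(fit_a, fit_b):
    """Test set size N at which the two fitted power laws predict equal cost."""
    (c_a, k_a), (c_b, k_b) = fit_a, fit_b
    if k_a == k_b:
        raise ValueError("equal exponents: the fitted curves never cross")
    return (c_b / c_a) ** (1.0 / (k_a - k_b))

def pick_solution(n, fits):
    """Name of the solution with the lowest predicted cost at size n."""
    return min(fits, key=lambda name: fits[name][0] * n ** fits[name][1])

# Synthetic timings (assumption, in units of t0): one quadratic solution with a
# small prefactor, one linear solution with a larger prefactor.
sizes = [1_000, 10_000, 100_000]
fits = {
    "quadratic": fit_power_law(sizes, [1e-6 * n ** 2 for n in sizes]),
    "linear": fit_power_law(sizes, [1e-2 * n for n in sizes]),
}
n_cross = crossover_size(fits["quadratic"], fits["linear"])
print(f"fitted exponents: {fits['quadratic'][1]:.2f}, {fits['linear'][1]:.2f}")
print(f"crossover near N = {n_cross:.0f}")
print(pick_solution(1_000, fits), pick_solution(100_000, fits))
```

In this sketch the ensemble is simply a dispatch on N: below the fitted crossover the solution with the smaller prefactor is used, and above it the solution with the smaller exponent. With real timings the power-law assumption should be checked against the measured points before the fitted crossover is trusted.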