Fig 1.
Interpretation of two instances for class 27 from the SEC-seq dataset.
Both instances are correctly classified as class 27. The three types of super features are shown in green, yellow, and purple, respectively. Note that cell-level super features, especially super feature 13, influence the predictions the most, while super feature 0 has a negative impact in both instances. In other words, INFO correctly predicts class 27 for these instances because they share a distinctive pattern of accessing a specific memory region related to super feature 13.
Table 1.
Description of symbols.
Fig 2.
Overall process of generating super features.
We define three types of super features for workload classification. Super features are designed to capture the similarity between the original features, and the same super features are applied to every workload subsequence.
Fig 3.
Illustration of generating CMD super features.
The first super feature includes sequences s1 and s2, and the second super feature includes sequences s3 and s4.
Fig 4.
An example of constructing bank-level and cell-level super features.
(a) shows a bank-level super feature extracted from the rank, bank group, and bank fields. (b) shows a cell-level super feature that includes the address field. Features and super features are shown in blue and red, respectively.
Fig 5.
An example of constructing vectors for an interpretable model.
Each entry of the feature vector x is the number of occurrences of the corresponding n-gram sequence. For example, 11, 0, 30, 2, 0, and 10 are the numbers of occurrences of sequences 113, 131, 111, 555, 535, and 553 in an instance, respectively. x, z1, and z2 are defined in the original feature space, while x′, z′1, and z′2 are represented using super features.
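The counting described in this caption can be sketched in Python. This is a minimal illustration and not the authors' implementation. It assumes that each command is encoded as a single character, that a super-feature entry is the sum of the counts of its member n-grams, and that the six sequences split into two groups of three. The function names `ngram_counts` and `to_super_features` and the `groups` layout are hypothetical.

```python
from collections import Counter


def ngram_counts(sequence, n, vocabulary):
    """Return the number of occurrences of each n-gram in `vocabulary`.

    `sequence` is a string in which each character is one command code
    (an assumption made for this sketch).
    """
    counts = Counter(sequence[i:i + n] for i in range(len(sequence) - n + 1))
    return [counts[gram] for gram in vocabulary]


def to_super_features(x, groups):
    """Aggregate a count vector into super features.

    Each entry of `groups` lists the indices of the original features that
    belong to one super feature. Summation is an assumption; the paper may
    define the super-feature value differently.
    """
    return [sum(x[i] for i in group) for group in groups]


vocabulary = ["113", "131", "111", "555", "535", "553"]

# Count vector from the caption's example.
x = [11, 0, 30, 2, 0, 10]

# Hypothetical grouping: the first three n-grams form one super feature,
# the last three form another.
groups = [[0, 1, 2], [3, 4, 5]]
x_prime = to_super_features(x, groups)

# Counting 3-grams in a short toy sequence.
toy = ngram_counts("11131", 3, vocabulary)
```

Under these assumptions, the perturbed samples z1 and z2 would be mapped to z′1 and z′2 with the same `groups`, which is what keeps the super-feature representation identical across instances.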
Table 2.
Datasets for workload classification.
Fig 6.
Running time and accuracy for INFO and LIME.
(a-c) and (d-f) show performance on the SEC-seq and Memtest86-seq datasets, respectively. INFO is up to 2.0× faster than LIME on SEC-seq and up to 1.5× faster on Memtest86-seq, while achieving similar Top-1 and Top-3 accuracies on both datasets.
Fig 7.
Interpretation of two instances for class 9 from the SEC-seq dataset.
Instance 3 is correctly classified, but instance 4 is misclassified as class 8. Super feature 2 helps classify instances into class 9, while super features 1, 11, and 3 keep instance 4 from being classified into class 9.
Fig 8.
Interpretation of two instances for class 23 from the Memtest86-seq dataset.
Both instances are correctly classified as class 23. Cell-level super feature 12 and the CMD super features have a positive impact on the results for both instances.
Fig 9.
Interpretation of two instances for class 10 from the Memtest86-seq dataset.
Instance 7 is correctly classified as class 10, while instance 8 is misclassified as class 9. CMD super features 1, 0, and 3 affect the predictions the most. Note that instance 8 contains more negative weights than instance 7, which leads to the misclassification.
Fig 10.
Comparison of interpreting two instances from the SEC-seq dataset using LIME and INFO.
Both instances are correctly classified as class 27. Note that LIME generates different super features for the two instances, resulting in inconsistent interpretations, while INFO uses the same super features for all instances.
Table 3.
Example of CMD super features generated by LIME and INFO.
LIME (27) and LIME (9) show the clustering results of two instances from class 27 and class 9 in the SEC-seq dataset, respectively. INFO produces the same super features for all instances, while LIME produces inconsistent super features.
Fig 11.
Visualization of CMD super features.
We cluster n-gram sequences based on a variant of Jaccard similarity.
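The caption does not specify which variant of Jaccard similarity is used, so the following Python sketch substitutes the plain Jaccard index on the sets of symbols in each sequence. The greedy threshold grouping, the threshold value of 0.5, and the names `jaccard` and `greedy_cluster` are all assumptions made for illustration. The example sequences are reused from Fig 5.

```python
def jaccard(a, b):
    """Plain Jaccard index between the symbol sets of two sequences."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    # Two empty sequences are treated as identical.
    return len(set_a & set_b) / len(union) if union else 1.0


def greedy_cluster(sequences, threshold=0.5):
    """Assign each sequence to the first cluster whose representative
    (its first member) is at least `threshold` similar; otherwise start
    a new cluster. This is a simple stand-in, not the paper's method.
    """
    clusters = []
    for seq in sequences:
        for cluster in clusters:
            if jaccard(seq, cluster[0]) >= threshold:
                cluster.append(seq)
                break
        else:
            clusters.append([seq])
    return clusters


sequences = ["113", "131", "111", "555", "535", "553"]
clusters = greedy_cluster(sequences)
```

Because the grouping depends only on the sequences themselves and not on a particular instance, each resulting cluster can serve as one CMD super feature shared by every instance.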
Table 4.
Example of CMD super features.
The bold text represents the repetitive patterns within each cluster.