Memory hierarchy characterization of SPEC CPU2006 and SPEC CPU2017 on the Intel Xeon Skylake-SP

doi:10.1371/journal.pone.0220135

Fig 1.

Outline of the methodology.

Table 1.

System specifications.

Table 2.

Benchmarks tested, divided between integer (int column) and floating point (fp column).

A filled cell in the “2006” or “2017” column means the benchmark appears in the corresponding suite. The columns labeled “_r” and “_s” refer to the application versions that produce the SPECrate and SPECspeed metrics, respectively. Columns “Inst.” and “#” show the instruction count (×10¹²) and the input identifier, respectively.

Fig 2.

MPKI1, MPKI2 and MPKI3 for all SPEC CPU2006 benchmarks, sorted by benchmark number.
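
As a reading aid for the MPKI metrics plotted here, the following minimal sketch computes misses per kilo-instruction from raw counts. It assumes that MPKI1, MPKI2 and MPKI3 refer to misses at the first, second and third cache level; the function name `mpki` and all counter values are hypothetical and are not measurements from the paper.

```python
def mpki(misses: int, instructions: int) -> float:
    """Misses per kilo-instruction: misses divided by (instructions / 1000)."""
    if instructions <= 0:
        raise ValueError("instructions must be positive")
    return 1000.0 * misses / instructions


# Hypothetical counter readings for a single run (illustrative only).
instructions = 2_000_000
l1_misses, l2_misses, l3_misses = 50_000, 20_000, 4_000

# Assumption: MPKIn is the miss count at cache level n, normalized per 1000 instructions.
mpki1 = mpki(l1_misses, instructions)
mpki2 = mpki(l2_misses, instructions)
mpki3 = mpki(l3_misses, instructions)
```

Normalizing by instruction count rather than by time makes the values comparable across benchmarks of very different lengths.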

Fig 3.

MPKI1, MPKI2 and MPKI3 for all SPEC CPU2017 single-threaded benchmarks, sorted by benchmark number.

Table 3.

Selected benchmarks and their performance metrics for minimum LLC size and no prefetching.

Fig 4.

MPKI3 vs. LLC size for the selected CPU2006 benchmarks, with and without prefetching.

Fig 5.

MPKI3 vs. LLC size for the selected CPU2017 benchmarks, with and without prefetching.

Fig 6.

Speed-ups obtained either by enabling hardware prefetching with the minimum LLC size (X axis) or by using the maximum LLC size without prefetching (Y axis), relative to a baseline configuration with the minimum LLC size and no prefetching, for the selected SPEC CPU2006 and CPU2017 benchmarks.

Integer and floating point benchmarks are represented by gray circles and black squares, respectively.
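
The two coordinates of each point in this figure can be illustrated with a minimal sketch. It assumes that every configuration executes the same instruction count at the same clock frequency, so that the speed-up reduces to a ratio of CPI values; the function name `speedup` and the CPI values are hypothetical, not data from the paper.

```python
def speedup(baseline_cpi: float, config_cpi: float) -> float:
    """Speed-up of a configuration over the baseline.

    Assumes equal instruction count and clock frequency, so the
    execution-time ratio equals the CPI ratio.
    """
    if baseline_cpi <= 0 or config_cpi <= 0:
        raise ValueError("CPI values must be positive")
    return baseline_cpi / config_cpi


# Hypothetical CPI values for one benchmark (illustrative only).
cpi_baseline = 2.0          # minimum LLC size, prefetching off
cpi_prefetch_min_llc = 1.6  # minimum LLC size, prefetching on
cpi_noprefetch_max_llc = 1.25  # maximum LLC size, prefetching off

x = speedup(cpi_baseline, cpi_prefetch_min_llc)    # X axis: gain from prefetching
y = speedup(cpi_baseline, cpi_noprefetch_max_llc)  # Y axis: gain from a larger LLC

# A point above the diagonal (y > x) benefits more from LLC capacity
# than from prefetching; a point below it benefits more from prefetching.
more_sensitive_to = "LLC size" if y > x else "prefetching"
```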

Fig 7.

CPI vs. MPKI3 for the selected CPU2006 benchmarks for varying LLC sizes, with prefetching (square marks) and without prefetching (x marks).

Slope units are cycles/miss. Slopes are comparable in all graphs because the ratio between X and Y scales is constant (10:1).
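
The unit of the slope follows from the axes: CPI is cycles per instruction and MPKI3 is misses per 1000 instructions, so a change in CPI divided by a change in MPKI3 must be multiplied by 1000 to give cycles per miss. The sketch below fits a least-squares line to hypothetical (MPKI3, CPI) points and converts the slope accordingly; the function name and data are illustrative assumptions, and the fitting procedure is not claimed to be the one used for the figure.

```python
def slope_cycles_per_miss(mpki3: list[float], cpi: list[float]) -> float:
    """Least-squares slope of CPI vs. MPKI3, converted to cycles per miss."""
    n = len(mpki3)
    if n < 2 or n != len(cpi):
        raise ValueError("need two or more points, with matching lengths")
    mean_x = sum(mpki3) / n
    mean_y = sum(cpi) / n
    sxx = sum((x - mean_x) ** 2 for x in mpki3)
    if sxx == 0:
        raise ValueError("MPKI3 values must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(mpki3, cpi))
    # sxy / sxx has units of CPI per MPKI; multiplying by 1000
    # converts kilo-instructions to instructions, giving cycles/miss.
    return 1000.0 * sxy / sxx


# Hypothetical points for one benchmark across LLC sizes (illustrative only).
mpki3_points = [2.0, 4.0, 6.0, 8.0]
cpi_points = [1.0, 1.2, 1.4, 1.6]

penalty = slope_cycles_per_miss(mpki3_points, cpi_points)
```

With these made-up points, each additional miss per kilo-instruction adds 0.1 to the CPI, which corresponds to an average cost of about 100 cycles per LLC miss.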

Fig 8.

CPI vs. MPKI3 for the selected CPU2017 benchmarks for varying LLC sizes, with prefetching (square marks) and without prefetching (x marks).

Slope units are cycles/miss. Slopes are comparable in all graphs because the ratio between X and Y scales is constant (10:1).

Fig 9.

Impact of the different hardware prefetchers on performance (CPI, bars) and bandwidth consumption (BPKI, line) for the selected CPU2006 benchmarks.

Fig 10.

Impact of the different hardware prefetchers on performance (CPI, bars) and bandwidth consumption (BPKI, line) for the selected CPU2017 benchmarks.

Fig 11.

Temporal evolution of MPKI3 and SimPoint selection for the selected CPU2006 benchmarks, with minimum LLC size and hardware prefetching.

Fig 12.

Temporal evolution of MPKI3 and SimPoint selection for the selected CPU2017 benchmarks, with minimum LLC size and hardware prefetching.
