Targeted Sampling - RSIML Documentation


3.6 Targeted Sampling

Under targeted sampling the locations of samples are predetermined, usually by analyzing a benchmark run. Also predetermined for each sample is a weight, which indicates how much of the program the sample characterizes. For example, one might predetermine that samples should start at instructions 500000 and 1800000. Further, one might predetermine that the first sample characterizes one third of the program and the second characterizes two thirds. The weight of the second should then be twice the weight of the first. The simulator uses weights to scale data collected during samples: if the weight of a sample is set to w, then at the end of the sample the numeric data collected in that sample is multiplied by w. Suppose the weight of the first sample is 1 and the weight of the second sample is 2. Then, because of scaling, it is as though the second sample occurred twice. Targeted sampling realizes the largest gains over periodic sampling when programs have large uniform phases: each uniform phase gets the same number of samples regardless of size, and the larger phases have larger weights.

Targeted sampling is specified using three RTIs. The warmup and sample sizes are specified by ss_warmup_size and ss_sample_size. RTI ss_target_spec specifies the start of each sample, its weight, and the size of the benchmark. Its value is a string that consists of items separated by colons, for example, w1:m1000:s500:w2:s1800:m1:e1874500. Each item starts with a letter, which is followed by a floating-point number. See Sampling and Trace Mode RTIs for a detailed description.
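To make the item format concrete, here is a minimal Python sketch of how such a string could be read. It is not the simulator's parser. The meanings assigned to the letters (m as a multiplier for later s and e values, w as the weight for later samples, s as a sample start, e as the benchmark end) are assumptions inferred from the example in this section; the function name parse_target_spec is hypothetical. The authoritative description is in Sampling and Trace Mode RTIs.

```python
def parse_target_spec(spec):
    """Parse a target spec string into (samples, end).

    Assumed semantics, inferred from the example in this section:
      m<x>  set the multiplier applied to later s and e values
      w<x>  set the weight given to later samples (default 1)
      s<x>  start a sample at instruction x * multiplier
      e<x>  the benchmark ends at instruction x * multiplier
    """
    multiplier = 1.0
    weight = 1.0
    samples = []  # list of (start_instruction, weight) pairs
    end = None
    for item in spec.split(":"):
        if not item:
            raise ValueError("empty item in target spec")
        # Each item is one letter followed by a floating-point number.
        letter, value = item[0], float(item[1:])
        if letter == "m":
            multiplier = value
        elif letter == "w":
            weight = value
        elif letter == "s":
            samples.append((value * multiplier, weight))
        elif letter == "e":
            end = value * multiplier
        else:
            raise ValueError(f"unknown item letter: {letter!r}")
    return samples, end


samples, end = parse_target_spec("m1000:s500:w2:s1800:m1:e1874500")
print(samples)  # [(500000.0, 1.0), (1800000.0, 2.0)]
print(end)      # 1874500.0
```

Under these assumptions the string yields two samples, one at instruction 500000 with weight 1 and one at 1800000 with weight 2, and a benchmark end at 1874500.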

The weights specified are used to scale data collected during the sample. Any weights can be used, though non-positive values should be avoided. Weights are applied to variables registered using the sim_data object; all of these must be numbers for which scaling makes sense. The scaling is done at the end of each sample but the results are not written to registered variables until the main processor exits. (See the code in file sim_data.cc.)

For example, consider Proc member variable bpb_good_predicts. It is registered using macro SIM_DATA_REG when simulator code initializes. As the simulator runs it is incremented whenever a branch resolves and the outcome matches the prediction. The sim_data object will examine its value at the beginning and end of each sample and multiply the difference by the respective sample's weight. It will then add this value to its own internal sum. (If the weights of all samples are 1 and there is no warmup then the internal sum will match the variable's value.) When the main processor exits, internal sums are copied to the registered variables.
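The bookkeeping described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the code in sim_data.cc; the class ScaledCounter, its method names, and the counter values are hypothetical.

```python
class ScaledCounter:
    """Illustrative stand-in for the per-variable bookkeeping."""

    def __init__(self):
        self.internal_sum = 0.0
        self._start_value = None

    def sample_begin(self, current_value):
        # Remember the registered variable's value when the sample starts.
        self._start_value = current_value

    def sample_end(self, current_value, weight):
        # Scale the change seen during the sample and accumulate it.
        self.internal_sum += (current_value - self._start_value) * weight
        self._start_value = None


acc = ScaledCounter()

# First sample, weight 1: the counter rises from 120 to 160 (made-up values).
acc.sample_begin(120)
acc.sample_end(160, weight=1)

# Second sample, weight 2: the counter rises from 400 to 430 (made-up values).
acc.sample_begin(400)
acc.sample_end(430, weight=2)

print(acc.internal_sum)  # 40 * 1 + 30 * 2 = 100.0
```

Increments that happen outside a sample, such as those during warmup, never reach the internal sum, which is why the sum matches the raw variable only when all weights are 1 and there is no warmup.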

Targeted sampling is strict about specifying the size of the benchmark. Let e denote the indicated number of instructions in the benchmark and define the fuzz to be max(1000, e * 0.00004). If the benchmark finishes more than fuzz instructions earlier or later than the indicated end there will be a fatal error, cruelly refusing to write simulation results which may be just fine. This draconian measure is to detect benchmark changes (recompilation or changes to input data) for which the target specification would be inappropriate.
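The tolerance rule can be written out as a short Python check. This is a sketch of the rule as stated above, not the simulator's code; the function names fuzz and end_within_fuzz are hypothetical.

```python
def fuzz(indicated_end):
    """Allowed deviation, in instructions, from the indicated end."""
    return max(1000, indicated_end * 0.00004)


def end_within_fuzz(indicated_end, actual_end):
    """True if the benchmark finished close enough to the indicated end."""
    return abs(actual_end - indicated_end) <= fuzz(indicated_end)


# For e = 1874500 the proportional term is about 75, so the floor of 1000 applies.
print(fuzz(1874500))                      # 1000
print(end_within_fuzz(1874500, 1875400))  # True: off by 900
print(end_within_fuzz(1874500, 1876000))  # False: off by 1500
```

The proportional term exceeds the floor of 1000 only when e is greater than 25 million instructions.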

The RTI settings below specify targeted sampling of two targets, one at 500000 and one at 1800000 instructions. Each has size 50000 instructions and a warmup of 10000 instructions. The first sample has weight 1 (the default), the second has weight 2. The benchmark is expected to finish at most 1000 instructions before or after 1874500.

     ss_target_spec m1000:s500:w2:s1800:m1:e1874500
     ss_sample_size    50000
     ss_warmup_size    10000