Value-based cost-cognizant test case prioritization for regression testing

Software Test Case Prioritization (TCP) is an effective approach in regression testing for tackling time and budget constraints. Its major benefit is saving time by executing the most important test cases first. Existing TCP techniques can be categorized as value-neutral or value-based. In a value-based fashion, the cost of test cases and the severity of faults are considered, whereas a value-neutral fashion ignores them, assuming instead that all test cases have equal cost and all software faults have equal severity. The value-neutral fashion is dominant over the value-based fashion, but its underlying assumption rarely holds in practice, so value-neutral TCP techniques are prone to produce unsatisfactory results. To overcome this research gap, a paradigm shift is required from value-neutral to value-based TCP techniques. Currently, very limited work has been done in a value-based fashion and, to the best of the authors' knowledge, no comprehensive review of value-based cost-cognizant TCP techniques is available in the literature. To address this problem, a systematic literature review (SLR) of value-based cost-cognizant TCP techniques is presented in this paper. The core objective of this study is to consolidate the knowledge related to value-based cost-cognizant TCP techniques and to highlight open research problems in this domain. Initially, 165 papers were retrieved from prominent research repositories. Among these, 21 papers were selected using defined inclusion/exclusion criteria and quality assessment procedures. The established research questions are answered through a thorough analysis of the selected papers, comparing their contributions in terms of the algorithm used, the performance evaluation metric, and the results validation method. Twelve of the selected papers used an algorithm in their technique, while nine did not; the Particle Swarm Optimization (PSO) algorithm is the most widely used.
For results validation, four methods are used: empirical study, experiment, case study, and industrial case study, with the experiment method being the most common. Six performance evaluation metrics are used, with the APFDc metric dominating. This SLR shows that value-orientation and cost-cognition are vital for the TCP process to achieve its intended goals and that there is great research potential in this domain.


Introduction
Making great decisions in Software Engineering (SE) requires a thorough understanding of the business consequences of those decisions [1]. SE research is primarily based on value-neutral settings in which all software artifacts have equal importance, and this fashion has many limitations [2]. Value-based SE (VBSE) caters to these limitations by considering value in software development principles and practices [2]. According to Barry Boehm, VBSE is "the explicit concern with value concerns in the application of science and mathematics by which the properties of computer software are made useful to people" [3]. Early research in the currently popular agile approach focused on extreme programming, pair programming, and lean software development; the trend has since shifted toward the value of developed features, continuous value delivery, and close coordination between business units and technical teams [4]. According to [5], there is a value-based view of software product quality. Software customers usually take this value-based view: they are concerned about the value added by software products to their organization and perform a cost-benefit analysis. This requires a good definition of customer expectations regarding software quality in terms of value. To meet these expectations, software testing becomes even more critical. Software testing is an expensive and critical phase of the software development life cycle, often consuming 40-50% of the overall budget of a software project [6]. It has become a critical part of software development due to its complexity, size, and support for real-time businesses [7].
Just like other software development phases, software testing research is also primarily based on a value-neutral fashion in which all the code statements, requirements, use cases, conditions, methods, and scenarios are treated as equally important [8]. In value-neutral software testing, resources are allocated to the activities that are inefficient in the context of Return on Investment (ROI). Software testing costs around $300 billion a year worldwide [3]. Value-neutral testing is not directly linked to the business objectives of the product and is considered agnostic to value considerations [9]. To address these challenges, value-based testing was introduced.
Value-based testing aligns testing resources with the value objectives of the project [8]. The key idea is to integrate internal testing objectives with the client's business objectives and expectations [8]; the focus is on value delivery instead of merely verifying code against a set of requirements. According to [2], value-neutral testing generated an ROI of 1.22 with 100% test execution, whereas value-based testing produced a higher ROI of 1.74 while executing only about 40% of the most valuable test cases. This indicates that testing resources should be utilized in a way that adds value to the client's business success: tests should be aligned with the written requirements as well as with the client's expectations. Therefore, testing activities should adopt a business value-based perspective. Value-based verification and validation are also included in the agenda of VBSE [10].
According to Boehm, worldwide software cost is estimated at $1 trillion per year, with testing activities accounting for half of this total. He stated that 60% of the testing cost can be eliminated, yielding a cost-saving potential of $300 billion worldwide per year through value-based testing investment [3]. Regression testing is among the most expensive testing activities and is a big challenge in rapidly growing and changing systems. It consumes a large amount of time and effort and typically accounts for around half of software maintenance costs [11]. Regression testing is normally associated with system testing after any code change, but it can be carried out at the system, integration, or unit level [11]. Complete test coverage is normally not possible during regression testing due to limited time and resources, which raises the perennial question for testing teams: how much regression testing is enough? There are four types of techniques used in regression testing for test case coverage: TCP, test case selection, test case reduction, and retest-all [12,13]. Fig 1 illustrates the different types of regression testing approaches.
The test selection approach is widely applied in the industry, but it is not risk-free because it is based on selection. Similarly, the test case reduction cannot guarantee that only unrelated test cases are eliminated from the test case pool. On the other hand, TCP does not reduce or remove test cases from the test suite. That is why it is more secure, reliable, and popular in practice and a lot of research work is being done in this field.
Test case prioritization is one way to optimize regression testing [14]. Rothermel et al. defined the TCP problem as follows [15]. Suppose T is a test suite, PT is the set of permutations of T, and f is a function from PT to the real numbers, f: PT → R.
Prioritization goal: find T′ ∈ PT such that f(T′) ≥ f(T″) for all T″ ∈ PT. The factors considered in TCP techniques include the size of the test case set, cost, time, effort, efficiency, number of defects detected, and repetitiveness [16]. Different TCP techniques have primarily been proposed to increase the Average Percentage of Fault Detection (APFD) by prioritizing the test cases to save time and cost. TCP techniques have been proposed in two different fashions: value-based and value-neutral. In the value-based fashion, the cost of test cases and the severity of bugs are considered while prioritizing test cases for regression testing, whereas the value-neutral fashion assumes that all faults have equal severity and all test cases have equal cost. In practice this assumption rarely holds, yet the value-neutral fashion is dominant over the value-based fashion.
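The definition above can be made concrete with a short sketch. The award function and the fault data below are hypothetical illustrations, not taken from the reviewed studies; for a handful of test cases, the maximizing permutation T′ can be found exhaustively:

```python
from itertools import permutations

def earliness_award(order, detects):
    """A simple award function f: PT -> R that rewards orders which
    expose every fault as early as possible. detects[t] is the set of
    faults test case t reveals (illustrative data)."""
    n = len(order)
    first_pos = {}
    for pos, t in enumerate(order, start=1):
        for fault in detects[t]:
            first_pos.setdefault(fault, pos)  # first position revealing the fault
    # a fault contributes more the earlier it is first revealed
    return sum(n - p + 1 for p in first_pos.values())

def prioritize(tests, detects, f=earliness_award):
    """Find a T' in PT that maximizes f by exhaustive search
    (feasible only for very small test suites)."""
    return max(permutations(tests), key=lambda p: f(p, detects))

detects = {"t1": {"f1"}, "t2": {"f1", "f2"}, "t3": set()}
best = prioritize(["t1", "t2", "t3"], detects)
# "t2" reveals both faults, so every maximizing order runs it first
```

Because |PT| = n!, exhaustive search is infeasible for realistic suites, which is why the selected studies resort to greedy and metaheuristic algorithms instead.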
The value-neutral TCP techniques consider that all faults have the same severity and all test cases have the same cost. Similarly, the metrics used for performance evaluation of TCP techniques, such as the Average Percentage of Statement Coverage (APSC), the Average Percentage of Fault Detection (APFD), the Total Percentage of Fault Detection (TPFD), the Average Percentage of Branch Coverage (APBC), the Average Percentage of Function Coverage (APFC), the Average Percentage of Condition Coverage (APCC), and the Average Percentage of X-elements Coverage (APXC), have also been proposed in a value-neutral fashion. All these metrics assume that all faults have the same severity, all requirements have the same worth, and all code statements have the same value, but in practice this is very rare. Different bugs have different severity, and different requirements have different values; likewise, the value of individual functions, statements, conditions, branches, and methods may differ. Most of the existing TCP techniques are

PLOS ONE
coverage-based and are effective at unit-level testing, but they are time-consuming and assume that all bugs are equally severe and all test cases have equal cost [5]. This assumption rarely holds in practice [17]. A summary of the existing value-neutral TCP techniques along with their core objectives is presented in Table 1.
The focus is on the number of faults detected instead of their impact on the client's business. Ignoring value in the TCP process is prone to produce unsatisfactory results. Another problem with this fashion is that the use of value-neutral metrics for performance evaluation of TCP techniques will produce unreliable results. To address these problems, value-based cost-cognizant TCP techniques have been introduced.
The value-based cost-cognizant TCP techniques deal with the severity of faults and the cost of test cases in the prioritization process [58]. Value-based TCP takes on the challenge of integrating value considerations into the prioritization process; value-orientation in TCP ensures that prioritization satisfies its value objectives. In practice, 80% of the value resides in 20% of the software [3,8], a fact that supports the need for value-orientation in software testing. Unfortunately, a limited number of value-based cost-cognizant TCP techniques are available in the literature, and to the best of the authors' knowledge, no comprehensive review of value-based TCP techniques exists. In this paper, we have performed a systematic literature review of value-based cost-cognizant TCP techniques to determine the current state of research in this domain. An enhanced taxonomy of TCP techniques is proposed so that value considerations can be taken into account in the TCP process. A generic cost-cognizant TCP process [59-61] with its objectives, and an analysis of the proportional differences between value-neutral and value-based TCP techniques, are also given. This study emphasizes the need for value-orientation in TCP and highlights that a paradigm shift is required from value-neutral to value-based TCP. The results show that far less work has been done in value-based TCP than in value-neutral TCP. The study also highlights a few open research problems and concludes that there is great potential for further research in value-based TCP. The study follows the standard SLR guidelines recommended by Kitchenham [62].

Method
This study has been undertaken as an SLR following the standard guidelines proposed by Kitchenham and Charters [62,63]. An SLR is an effective means of establishing the state of research on a specific phenomenon or domain; hence, the goal of this SLR is to summarize the knowledge related to value-based cost-cognizant TCP techniques. The review protocol for this study contains four phases, each with two steps. In the first phase, the research motivation and research questions are described. The second phase covers the selection of search repositories and the search process. The third phase describes two study selection criteria: inclusion/exclusion criteria and quality assessment criteria. The last phase covers data extraction and data synthesis. An external reviewer evaluated and validated the review protocol and provided feedback; all suggestions were incorporated to refine and improve the overall quality of the protocol. The review protocol is shown in Fig 2.

Research motivation
Different SLRs have been published on different aspects of TCP. In [64], Khatibsyarbini et al. performed an SLR of TCP approaches for regression testing. Its goal was to comprehend the current trend of TCP approaches and to provide their empirical evaluation. It presents the taxonomic distribution of TCP approaches and covers their strengths and weaknesses in terms of results. It also reports metric usage: according to [64], the Average Percentage of Fault Detection (APFD) was used in 51% of studies, coverage effectiveness (CE) in 10%, APFDc in 9%, execution time (ET) in 7%, and other metrics in 23% for the performance evaluation of TCP techniques. A mapping study highlighted the major categories of TCP techniques, including coverage-based, distribution-based, requirements-based, model-based, human-based, probabilistic-based, history-based, cost-aware-based, and others [58].
According to a study [66], the usage of performance metrics is 58% (APFD), 8% (APFDc), 8% (APSC), 6% (NAPFD), and 2% (others). In [17], mapping study results are presented on test prioritization in a continuous integration environment. In [67], a review of TCP techniques using genetic algorithms (GAs) covers methodologies, adequacy criteria, algorithms, dataset specifications, performance evaluation metrics, and validation criteria; according to this study, metrics such as execution time (ET) 48%, APFDc 18%, Expense 15%, fault detection 33%, APFD 24%, and NAPFD 9% have been used. Another mapping study indicates that APFD is the most used metric, APFDc the second, and APSC the least used [68].

The literature describes cost-aware/cost-cognizant as an explicit category of TCP techniques, but no SLR is available on it. The mainly used TCP evaluation metrics are APFD and APFDc. For value-based cost-cognizant TCP techniques, the APFD metric is not appropriate because of two limiting assumptions: a) all test cases have equal cost, and b) all faults have equal severity [69]. Using the APFD metric for performance evaluation of value-based TCP techniques is therefore prone to produce unsatisfactory results. These research gaps in existing studies are the major motivation for producing a technically informative document in the domain of value-based cost-cognizant TCP techniques. To fill this gap, an SLR of value-based cost-cognizant TCP techniques is performed in this study. Its objective is to summarize the knowledge related to value-based cost-cognizant TCP techniques and to highlight the open research problems of this domain.
To execute the SLR, a review protocol was developed to control researcher bias. It consists of research questions, selection of literature sources, the search process, the study selection procedure, the quality assessment score, and data extraction and synthesis. The protocol was assessed and validated by an external reviewer; a few suggestions were received and incorporated to improve its quality. Each step of the review protocol is described below.

Research questions
Six research questions have been articulated that are required to be answered through this research. These questions are listed in Table 2. The motivation behind each research question is also presented.
To define the scope of selected studies and goals of this SLR we used the Population, Intervention, Comparison, Outcomes, and Context (PICOC) method [67] as mentioned below. It helped us to reduce the risk of bias.

Population: Literature on value-based cost-cognizant TCP.
Intervention: Taxonomic classification of TCP.

Outcomes: Recommendations for further research on value-based TCP for a paradigm shift with evidence.
Context: An SLR to combine the current body of knowledge.


Selection of literature sources
The selection of literature sources is an important step for any SLR. We selected popular research repositories that are known to publish research papers related to TCP. A further justification is that existing reviews of TCP utilized the same sources [17,64,67]. The following database repositories were used for this SLR.

Search process
The search strings were formulated considering the research questions and study goals. They were composed of the terms "Test Case Prioritization", "Value-Based TCP", "Cost-Cognizant TCP", and "Evaluation Metrics for TCP". The keywords used in the search process are listed below.

Study selection procedure
The study selection procedure consisted of a set of steps, presented in Fig 3, following the PRISMA (Preferred Reporting Items for Systematic reviews and Meta-Analyses) statement [70]. A proper study selection procedure was adopted to select relevant studies and remove all irrelevant ones. Inclusion and exclusion criteria, presented in Table 3, were defined to ensure that only relevant studies were selected; such criteria facilitate study selection for an SLR and have been utilized by existing reviews of TCP [64,67]. The primary author selected the primary studies, and a test/retest approach was adopted to verify the selection process: the co-author (Ph.D. research supervisor) compared the results using random sampling. The appropriateness of the inclusion and exclusion criteria was tested and verified per the guidelines of Brereton et al. [71].

Quality assessment score
The Quality Assessment Score (QAS) helps evaluate the relevance and significance of a study [63]; a study is selected or rejected based on its QAS. For study selection, we formulated a three-point QAS checklist following the guidelines of Kitchenham et al. [63]. The checklist is given in Table 4.
Based on the defined search process, we retrieved 365 papers from the selected literature sources. First, 170 duplicate papers were removed. On the remaining 195 papers, a title and abstract review was performed, and 75 papers were screened out. Through a detailed review of the full text and application of the inclusion and exclusion criteria, 14 more studies were removed. Afterward, we performed a quality assessment, and as a result, 14 more studies were removed. Finally, 21 papers were selected for the study. The search process and selection procedure are depicted in the PRISMA flow diagram in Fig 3. Table 5 presents the list of selected studies along with their QAS; the minimum acceptance threshold for the QAS was 4, and all 21 papers met it. Regarding the range of literature years covered, we also reviewed the latest papers of 2021, and Table 1 contains multiple papers from 2020 and 2021; however, the selected studies in Tables 5 and 7 include no papers from 2020 or 2021 because none met the established inclusion criteria.

Data extraction and data synthesis
In any review protocol, the process of extracting and synthesizing data from the selected studies is what distinguishes an SLR from a traditional literature review. In the extraction process, data relevant to the SLR questions is extracted from the selected studies, whereas data synthesis combines the results derived from those studies [62]. In the data extraction process, we collected bibliographic information (unique ID, title, authors, year of publication, citations, paper type, publisher), common steps in the TCP process, the algorithm used, the evaluation metrics used, the results validation method, dataset availability, contribution, category of TCP technique, and open research problems. A data extraction form, given in Table 6, was designed to collect data from the primary studies. In the data synthesis process, the extracted data is combined and organized so that it can answer the defined research questions. Table 6 presents the outcome of the data synthesis process, and Table 7 presents the data extraction results from the selected studies.

Assessment and findings
After a comprehensive analysis of the selected studies and synthesized data, the assessment was performed and the findings were concluded. In this section, all the defined research questions are answered. The first paper related to value-based TCP, published in 2001, proposed the cost-cognizant performance evaluation metric APFDc [87]. Later, several other authors used this metric for the performance evaluation of their proposed TCP techniques [19, 59-61, 72, 74, 76, 78, 80, 83]. We found that only 21 published papers follow a value-based fashion, using test case cost and fault severity in the test case prioritization process. Out of the 21 selected papers, 4 are conference papers and the remaining 17 were published in different journals. Table 8 presents the authors, year of publication, reference, and publisher information. The current trend of value-based cost-cognizant TCP techniques shows that limited work is available in a value-based fashion; therefore, more TCP studies in a value-based fashion are required to obtain better and more reliable results, and there is great potential for further research to fill the gaps. The leading researchers in the domain of value-based cost-cognizant TCP techniques are Gregg Rothermel, Sebastian Elbaum, and Alexey Malishevsky [60,87].

Algorithms used in value-based cost-cognizant TCP (RQ1)
The synthesized data extracted from the selected studies show that different algorithms have been used in value-based cost-cognizant TCP techniques, including the TERMINATOR algorithm, Particle Swarm Optimization (PSO), Additional Greedy, Sorting Algorithm, and others.
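This excerpt does not detail how the selected studies encode prioritization for PSO, but one common generic scheme is a random-key encoding: each particle carries one continuous priority key per test case, and sorting the keys decodes a permutation. The sketch below is a minimal illustration of that idea under this assumption, not a reproduction of any reviewed technique; the fitness function is supplied by the caller (for instance, APFDc computed on historical fault data).

```python
import random

def decode(keys):
    # random-key decoding: tests with larger keys run earlier
    return sorted(range(len(keys)), key=lambda i: -keys[i])

def pso_prioritize(n_tests, fitness, swarm=30, iters=50,
                   w=0.7, c1=1.4, c2=1.4, seed=1):
    """Minimal random-key PSO for TCP (illustrative sketch)."""
    rng = random.Random(seed)
    pos = [[rng.random() for _ in range(n_tests)] for _ in range(swarm)]
    vel = [[0.0] * n_tests for _ in range(swarm)]
    pbest = [p[:] for p in pos]                    # personal best positions
    pbest_fit = [fitness(decode(p)) for p in pos]
    g = max(range(swarm), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]   # global best
    for _ in range(iters):
        for i in range(swarm):
            for d in range(n_tests):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            fit = fitness(decode(pos[i]))
            if fit > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], fit
                if fit > gbest_fit:
                    gbest, gbest_fit = pos[i][:], fit
    return decode(gbest)

# toy fitness: reward orders that run test case 2 as early as possible
order = pso_prioritize(4, lambda perm: -perm.index(2))
```

The random-key trick sidesteps the fact that standard PSO operates on continuous vectors while TCP needs a permutation; published PSO-based TCP techniques may use other discrete encodings.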


Validation methods used in value-based cost-cognizant TCP (RQ2)
The collective results extracted from the selected studies indicate that four results validation methods have been used: empirical study, experiment, case study, and industrial case study. Papers P1, P5, and P14 used the empirical study method to validate their results; empirical evaluation is usually based on the researcher's observations and investigation of the phenomenon. Papers P2, P3, P4, P6, P7, P8, P11, P15, P16, and P20 used the experiment method to compare results with existing state-of-the-art techniques; the experiment method is usually applied to small projects, and most of these studies have a small scope. Papers P13, P17, P18, P19, and P21 used the case study method, which validates results against a specific case. Papers P9 and P10 used the industrial case study method, which usually covers real industrial projects. Paper P12 did not use any validation method. Fig 5 shows the distribution of selected studies by validation method.

Value-based cost-cognizant TCP process and its objectives (RQ3)
The use of standard processes and clarity of objectives are vital for the success of software projects. In the TCP process, test cases are not eliminated or removed; rather, each test case is assigned a priority. The test cases in the test suite are sorted by priority. The testing team then executes the test cases with the highest priority first and stops when the regression window ends or all test cases are covered. A variety of TCP techniques are available, including search-based, requirements-based, coverage-based, history-based, fault-based, risk-based, and others [64]. This sub-section outlines a TCP process that depicts some common steps involved in value-based cost-cognizant TCP techniques. These steps are taken from the selected studies [61,73,81] and are depicted in Fig 6.

1. Prepare the test case data set that needs to be prioritized.
2. Define the parameters to be used for prioritization, assuming that different faults may have different severity and different test cases may have different costs.
3. Apply a formula to calculate the prioritization score for each test case.
4. Prepare the test data set with the prioritization scores.
5. Run the algorithm to prioritize the test cases based on the prioritization criteria/score.
6. Prepare the test data set in prioritized order.
7. Repeat step 3.
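The scoring and ordering steps of this generic process can be sketched in a few lines. The score formula below — fault severity exposed per unit of test case cost — is only one plausible choice made for illustration; the selected studies each define their own formulas and parameters.

```python
def prioritization_score(test, severities):
    """Step 3: one plausible formula -- units of fault severity the
    test case exposes per unit of its execution cost."""
    return sum(severities[f] for f in test["faults"]) / test["cost"]

def value_based_prioritize(tests, severities):
    """Steps 4-6: attach a score to each test case, then emit the
    test data set in prioritized (descending-score) order."""
    scored = [(prioritization_score(t, severities), t["id"]) for t in tests]
    return [tid for _, tid in sorted(scored, reverse=True)]

# hypothetical data: t1 is costlier but exposes a far more severe fault
tests = [
    {"id": "t1", "cost": 2.0, "faults": ["f1"]},
    {"id": "t2", "cost": 1.0, "faults": ["f2"]},
]
severities = {"f1": 9, "f2": 3}
order = value_based_prioritize(tests, severities)  # t1 scores 4.5, t2 scores 3.0
```

Note how varying cost and severity drive the order: under equal-cost, equal-severity (value-neutral) assumptions the two test cases would be indistinguishable.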
A few common objectives of value-based cost-cognizant TCP techniques given in the selected studies are depicted in Fig 7.

• Early detection of severe faults is the major objective of value-based TCP. Late detection of bugs is more costly; therefore, this TCP objective is directly related to cost.

• To provide quick product maturity by detecting critical faults first. When the maximum number of bugs is detected and fixed early, the product matures sooner, which builds confidence in meeting the deadline.
• Efficient utilization of testing resources during regression testing.
• Early business value coverage.
• Saving time and budget.

An enhanced taxonomy of TCP techniques (RQ4)
This section presents the categorization of TCP techniques given in the selected studies. According to one study, there are seven categories of TCP techniques: search-based, coverage-based, requirements-based, fault-based, risk-based, history-based, and others [64]; search-based is the most widely used approach, followed by coverage-based and fault-based techniques. Another study presented a categorization comprising probabilistic, cost-based, history-based, human-based, distribution-based, coverage-based, model-based, and others [17]. Table 9 shows the categorization of the value-based cost-cognizant TCP techniques selected for this study. Existing categorizations and taxonomies of TCP techniques have recognized "cost-based" as merely one category among others. We argue, however, that cost-based (cost-cognizant) prioritization is a value-based fashion, and cost-cognizant test prioritization must be recognized explicitly to segregate it from value-neutral TCP techniques. To address this need, we propose two abstract classes of TCP techniques, value-neutral and value-based, and present an enhanced taxonomy of TCP techniques in Fig 8. The enhanced taxonomy comprises two major classes: value-neutral test prioritization and value-based test prioritization.

Performance evaluation metrics used in cost-cognizant TCP (RQ5)
For the validation of any proposed technique, its performance is evaluated. This section presents the performance evaluation metrics used for TCP techniques given in the selected studies.

There are many metrics used for the performance evaluation of TCP techniques. Table 10 lists the metrics used in the value-based cost-cognizant TCP techniques covered by the selected studies.

The average percentage of fault detection (APFD). APFD is the most popular metric used for the performance evaluation of TCP techniques; it was created by Elbaum et al. in 2000 [88]. This is a value-neutral metric based on the assumption that all faults have the same severity and all test cases have the same cost. APFD is given by Eq 1:

APFD = 1 − (TF1 + TF2 + ... + TFm) / (n × m) + 1 / (2n)   (Eq 1)

In this formula, m is the total number of faults detected, n is the total number of test cases, and TFi is the position of the first test case that reveals fault Fi.
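As a sanity check of Eq 1, a direct translation into code (an illustrative sketch, not taken from the reviewed studies):

```python
def apfd(tf, n):
    """APFD per Eq 1. tf[i] is TF_i, the 1-based position of the first
    test case that reveals fault i; n is the number of test cases."""
    m = len(tf)
    return 1 - sum(tf) / (n * m) + 1 / (2 * n)

# 5 test cases; the two faults are first revealed by tests 1 and 2
value = apfd([1, 2], 5)   # 1 - 3/10 + 1/10 = 0.8
```

An order that reveals the same faults later, say at positions 4 and 5, scores only 1 − 9/10 + 1/10 = 0.2, which is how the metric rewards early fault detection.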

The average percentage of fault detection per cost (APFDc). The APFDc metric was proposed to overcome the shortcomings of the APFD metric [87]. APFDc considers varying test case costs and fault severities. This cost-cognizant metric was proposed in a value-based fashion: it accounts for the units of fault severity exposed per unit of test case cost. The x-axis shows the total units of test case cost instead of simply the percentage of test cases executed, and the y-axis shows the total units of fault severity instead of simply the percentage of faults detected. APFDc is given by Eq 2. In Eq 2, T is the test suite and n is the number of test cases with costs t1, t2, ..., tn; F is a set of m faults detected by T, with severities f1, f2, ..., fm; and TFi is the first test case in the order that detects fault i:

APFDc = ( Σ_{i=1..m} fi × ( Σ_{j=TFi..n} tj − ½ t_TFi ) ) / ( Σ_{j=1..n} tj × Σ_{i=1..m} fi )   (Eq 2)

A further metric [89] best describes the physical explanation of the testing process; it is presented by Eq 3.
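The APFDc computation can likewise be sketched in code. This is a reconstruction from the definition in [87] rather than code from the reviewed studies; a useful property for checking it is that with unit costs and unit severities it reduces to APFD.

```python
def apfdc(costs, severities, tf):
    """Cost-cognizant APFDc (Eq 2). costs: t_1..t_n in execution order;
    severities: f_1..f_m; tf[i]: 1-based position TF_i of the first
    test case revealing fault i."""
    num = sum(sev * (sum(costs[pos - 1:]) - 0.5 * costs[pos - 1])
              for sev, pos in zip(severities, tf))
    return num / (sum(costs) * sum(severities))

# with unit costs and severities, APFDc reduces to APFD: here 0.8
value = apfdc([1] * 5, [1, 1], [1, 2])
```

With varying inputs the metric shifts accordingly: for costs [2, 1], a single fault of severity 5 first revealed by the second test yields 5 × (1 − 0.5) / (3 × 5) = 1/6.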

Metric based on varying requirements' priority and test cases' cost (MRP_TC). This metric was introduced by Zhang et al. and is based on varying requirements' priority and test case cost [84]. It is represented by Eq 4. MRP_TC ranges from 0 to 100%, and a higher value implies better performance.

The average severity of faults detected (ASFD). The ASFD metric was introduced by Hema Srikanth and Laurie Williams [86]. The ASFD value is the ratio of the sum of the severities of faults detected for a specific requirement to the Total Severity of the Faults Detected (TSFD). It is represented by Eq 5.
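A direct reading of the ASFD definition in code (illustrative; the fault-to-requirement mapping shown is hypothetical):

```python
def asfd(req_fault_severities, all_fault_severities):
    """ASFD for one requirement: sum of severities of the faults traced
    to that requirement, divided by TSFD (the total severity of all
    faults detected)."""
    tsfd = sum(all_fault_severities)
    return sum(req_fault_severities) / tsfd

# faults in the product carry severities 5, 3, and 2 (TSFD = 10);
# requirement R1 is traced to the faults with severities 5 and 3
share = asfd([5, 3], [5, 3, 2])   # -> 0.8
```

Requirements with a larger ASFD account for more of the total fault severity, which is what makes the metric useful for requirements-based, severity-aware prioritization.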
The average percentage of business importance earned (APBIE). The APBIE metric was introduced by Qi Li and Barry Boehm [79]. It is proposed to cover the business importance of the system under test and is represented by Eq 6.
• If the above assumptions rarely hold, then why does the APFD metric remain the most popular in TCP research?
• Is APFDc a good alternative to APFD? If yes, then why is APFDc so far behind APFD in adoption?
• Does the research community need a new standard measure for performance evaluation of TCP techniques?
A limitation of the existing literature is that, when using the APFD metric, researchers did not explicitly state whether the basic assumptions of APFD hold, nor did they give a reason for selecting APFD as an evaluation metric. Most researchers justified the choice simply by APFD's popularity, which is not a strong or valid justification for metric selection. Consequently, results produced using APFD are mostly unreliable in cases where the assumption "all faults have equal severity and all test cases have equal cost" does not hold [87]. The selection of performance evaluation metrics is still an open research problem.

The coverage-based techniques relate to a specific element such as a statement, condition, branch, method, or requirement. The metrics used for their performance evaluation are the Average Percentage of Statement Coverage (APSC), the Average Percentage of Condition Coverage (APCC), the Average Percentage of Branch Coverage (APBC), the Average Percentage of Method Coverage (APMC), and the Average Percentage of Requirement Coverage (APRC); collectively, these can be described as the Average Percentage of Element Coverage (APEC) [67]. These metrics are similar to APFD and rest on the same assumption: that all statements, conditions, branches, methods, or requirements are of the same worth, and that the test cases covering these elements have the same cost. This is a major limitation of coverage-based techniques and their metrics, and they might not produce the intended results. We suggest that value considerations be incorporated into coverage-based techniques and that traditional coverage-based metrics be replaced with value-based coverage metrics. Introducing value into the TCP process can make TCP techniques more efficient and effective.
The existing coverage-based TCP techniques are not aligned with the client's priorities. A client may care about some features and not others; some features drive profitability and productivity for the client's business, and some do not. There is a slogan in software engineering that "the client is always right", yet the client's perspective is missing from existing TCP techniques. Most existing TCP techniques have been proposed in a value-neutral fashion and do not consider the client's business value expectations, even though business value orientation has great potential in the TCP process. Even when proposed techniques detect a large number of faults, releases still slip because high-severity bugs are detected late in the regression cycle. Debugging and fixing such critical faults at the eleventh hour puts stress on development teams, and the resulting fixes are prone to further errors. Therefore, detecting critical faults early in the regression testing life cycle is vital. The value-based TCP branch depicted in the taxonomic classification of TCP in Fig 4 is still a gray area, and there is great potential for further research on value orientation in all categories of TCP. Value orientation should be considered in research methods, study contexts, prioritization approaches, and performance evaluation metrics. It is surprising that TCP research continues to appear in a value-neutral fashion despite the known fact that software elements do not all have equal worth. Here are a few recommendations to fill this research gap.
• A prompt paradigm shift from value-neutral to value-based test prioritization is needed. Value-neutral TCP techniques are unlikely to produce satisfactory and reliable results, so TCP remains unable to achieve its intended goals. Further research in the TCP domain should therefore be conducted in a value-based fashion, with performance evaluated through value-based cost-cognizant metrics.
• The study results show limited application of machine learning in value-based cost-cognizant TCP. Further research should apply machine learning techniques to the TCP problem to improve the efficiency of the prioritization process.
• The dataset used for results validation is publicly available for only one study (P1); the remaining studies did not use public datasets. Public datasets should be used so that future researchers can reproduce the results in empirical studies and make further improvements.
• Recent literature reports that simple statement and traditional coverage cannot guarantee 100% fault detection [67], and coverage-based techniques are producing suboptimal outcomes. Utilizing value coverage can overcome this limitation.
• Most of the existing work covers only the functional aspects of applications [14]. Security, usability, privacy, and performance are very important, yet they are generally not addressed by traditional code coverage metrics. Non-functional aspects also need coverage in the TCP process.

Threats to validity
In this section, we identify a few known threats to the validity of this study's results. Our research questions may not cover all aspects of value-based cost-cognizant TCP techniques. This is a construct validity threat, which we addressed through discussions; we believe our research questions are well designed and mapped to the goals of the study. Ensuring a perfect data collection process is difficult, and we cannot guarantee that our data collection is complete; imperfect data collection can threaten the validity of this SLR. We carefully selected our search keywords to fetch the most relevant studies from the research repositories. Our paper search was limited to a few prominent research repositories, and more relevant publications may be available elsewhere; to minimize this problem, we used the research repositories employed by previous reviews of TCP techniques. Validating the study relevancy evaluation process is also a major threat for any SLR. To address this issue, an independent reviewer, the second author (supervisor), also evaluated the relevance of the selected studies. An unsystematic data extraction process may be imprecise and affect the validity of this research; to reduce this risk, we performed manual data extraction through expert judgment. The drafting of the review protocol can also affect study selection, so we followed Barbara Kitchenham's guidelines in preparing our study protocol to avoid any associated threat.

Conclusion
TCP is a vital approach for regression testing to meet time and budget constraints. There are two major classes of TCP techniques: (1) value-neutral and (2) value-based. Both classes have further categories, such as coverage-based, history-based, and risk-based. Value-neutral TCP techniques assume that all elements, such as statements, requirements, test cases, use cases, methods, and bugs, are equally important. This assumption rarely holds; therefore, value-neutral TCP techniques are prone to produce unsatisfactory results. Due to this major limitation of the TCP process, value-based cost-cognizant TCP techniques are gaining popularity. To the best of our knowledge, no review of value-based cost-cognizant TCP techniques is currently available.
• In this paper, an SLR of value-based cost-cognizant TCP techniques is performed. The objective of this study is to assess the current state of research in this field.
• This review makes it evident that there is very limited work on value-based test prioritization. It must be realized that without value considerations in the TCP process, its intended results cannot be achieved.
• Selecting the right metric for the performance evaluation of TCP techniques is essential to obtain reliable results. Popularity alone is not a valid justification for metric selection and cannot produce reliable results; this is a major area for further improvement. The efficiency and effectiveness of TCP approaches depend strongly on the correct evaluation metric, because a researcher typically targets an improvement in a metric value when proposing a TCP technique.
An enhanced taxonomy of TCP techniques has been devised in this SLR to support further advancements in the value-based cost-cognizant TCP process. This SLR shows that there is great potential in value-based cost-cognizant TCP, and future research should cover this important dimension.