Quantity bias in comparison-shopping of multi-item baskets

Comparison-shopping applications are widespread and have been the subject of considerable research and development. It is also widely recognized that people are predictably irrational when making shopping decisions. In this work, we combine these two facts to propose a new type of predictably irrational behavior that has important implications for comparison-shopping applications, which now utilize crowdsourcing to increase the information provided about sellers in electronic marketplaces. In a series of three studies we demonstrate that, even after controlling for relative and absolute savings, the number of items in a shopping trip is an important consideration in the decision to make a trip to more than one store. This is true both of actual trips in physical shopping and of virtual trips to other vendors in online shopping. We term this effect quantity bias.


Introduction
The growth of the Internet and mobile services has led to increased opportunity for consumers to rely on information technology (IT) enabled shopping [1][2][3][4][5]. Due to the ubiquity of mobile devices, crowdsourcing [6,7] of information is now possible across a variety of tasks. A particularly interesting new use of technology-enabled shopping is to crowdsource information about prices of goods in physical stores. This has resulted in recommendation agent applications such as Basket, Favado, Flipp, and GasBuddy, which allow shoppers to do real-time comparison-shopping in physical stores. These applications have significant economic implications for both shoppers and the physical stores at which they shop, as better information should drive down prices [8,9].
However, research shows that people may violate rational economic principles when evaluating multi-attribute situations [10,11]. To explain how people violate these principles when analyzing a multi-attribute situation, Thaler [12, p. 183] describes mental accounting as "the set of cognitive operations used by individuals and households to organize, evaluate, and keep track of financial activities" including the formation of topical accounts.
In this work we use the framework of behavioral economics, and in particular the idea of hedonic editing [13][14][15], which is a type of mental accounting, to examine how multi-item comparison shopping applications are used by consumers. Behavioral

Absolute savings
Assume a situation where a user has a multi-item basket of products that they want to purchase. For simplicity we focus on commodity items that are identical at different stores. While other biases are probably at work when consumers use these price-comparison applications, the number of items at each store is a basic piece of information. Where we could, we designed away other biases that might be in play, as we wanted to isolate the effect of the number of items compared to savings. Using a representative application (see Fig 1), consumers specify the list of items they desire, and the application then calculates the lowest cost for the entire list based upon the number of stores that the shopper is willing to travel to. Current applications have options for a single store, two stores, or three or more stores, but for simplicity we limit our enquiry to two stores. Thus, the user must decide between two alternatives: go to the single store that has the overall lowest cost, or make a second trip, buying at store 1 all of the items that are less expensive there and at store 2 all of the items that are less expensive there. In Fig 1, the left panel shows that the cost at the least expensive store is $28.01. The right panel shows that the total cost is reduced to $26.33 if two stores are utilized for those same items (the bread is less expensive at Red Circle than Blue Diamond).
Fig 1. Screenshots from a representative application illustrating the effect on total cost of going to one store (left-hand side) or two stores (right-hand side) to complete the purchase of a basket of items. Total cost is lowered from $28.01 to $26.33 if the user chooses the cheapest two-store option. https://doi.org/10.1371/journal.pone.0263406.g001
In principle, a rational decision maker will only consider the cost/benefit of making a trip to a second store.
Thus, if it takes 10 extra minutes to travel to the store, the rational question is "how much is 10 minutes of my time worth?" However, people have been shown to be irrational when making the decision to drive to a second store. Moreover, they have been found to be predictably irrational (Thaler 1999, Ariely 2010). That is to say, people do care about things other than the value of their time, but they do so in a predictable way, so that their behavior can still be modeled and predicted. It is the job of researchers to ascertain what other things people care about. Of course, users are not completely irrational; the first thing they care about is the absolute amount they will save. Thus, we propose: H1: To complete the purchase of a multi-item list, absolute savings will be positively correlated with the decision to go to another store. (Absolute Hypothesis) Despite this, we propose that other, predictably irrational factors play a role in the decision to separate a basket of groceries into two trips.
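As a worked illustration of the cost/benefit framing, the following Python sketch takes the two Fig 1 totals ($28.01 and $26.33) and the 10 extra minutes from the example, and computes the absolute savings, the relative savings, and the hourly value of time at which a purely rational shopper would be indifferent. The function names are hypothetical, and treating the 10 minutes as the entire cost of the second trip is a simplifying assumption.

```python
from fractions import Fraction

def savings_in_cents(one_store_cents: int, two_store_cents: int) -> int:
    """Absolute savings from splitting the basket across two stores."""
    return one_store_cents - two_store_cents

def break_even_hourly_rate(saved_cents: int, extra_minutes: int) -> Fraction:
    """Dollars per hour at which the savings exactly pay for the extra time."""
    return Fraction(saved_cents, 100) * Fraction(60, extra_minutes)

# Totals from the Fig 1 example, expressed in cents to avoid rounding error.
saved = savings_in_cents(2801, 2633)
relative = Fraction(saved, 2801)            # savings as a share of the one-store total
rate = break_even_hourly_rate(saved, 10)    # 10 extra minutes, as in the text

print(f"absolute savings: ${saved / 100:.2f}")
print(f"relative savings: {float(relative):.1%}")
print(f"break-even value of time: ${float(rate):.2f} per hour")
```

Under these assumptions, a shopper who values their time above the break-even rate should skip the second store, regardless of how many items the savings are spread across.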

Relative value bias
Another finding of behavioral economics is relative value bias [23]. This bias is normally activated when deciding to go to a second vendor to realize savings on a particular product, thus trading time for money. Decision makers consider not only the absolute value that is to be saved, but also the relative savings in relation to the price of the item [23][24][25][26]. It has been shown that relative price discounts are significant in getting people to decide to purchase the item at the further location, even though the absolute savings are the same [23,26]. Thus, we propose that relative value bias is present when making a multiple-item purchase: H2: To complete the purchase of a multi-item list, the relative value of savings for going to a second store will be positively correlated with the decision to go to another store. (Relative Hypothesis)

Quantity bias
The key theoretical addition we provide is to propose that, in addition to the well-known biases discussed above, in multi-item basket shopping there will also be a quantity bias such that shoppers will prefer to go to a second store more if there are more items to purchase at the second store, even when the relative and absolute savings are the same. This follows directly from Thaler's work on hedonic editing [13,14]. Following Kahneman and Tversky (1981), Thaler (1985) posits a value function which is concave in gains, where consumers judge losses and gains based on some reference point and, most importantly, people can choose to aggregate or separate joint outcomes. Under these assumptions several results concerning joint outcomes can be shown. Of specific interest to this work is the idea that multiple gains should be segregated.
To see this, we start with the two-item case from Thaler (1985). First, we note that in the context of shopping, savings are a gain relative to a base price. If a customer can get item 1 for price $X_1^{*}$ in one trip, but could get the same item for price $X_1$ by making two trips, where $X_1^{*} > X_1$, then the customer enjoys a gain because the actual price paid is less than the established reference price. We label this gain $G_1$. If consumers have a value function that is concave over gains, then $v(G_1) + v(G_2) > v(G_1 + G_2)$. By induction it is straightforward to show more generally that $\sum_{i=1}^{n} v(G_i) > v\left(\sum_{i=1}^{n} G_i\right)$. In other words, the value of saving a smaller amount on each of n separate items is higher than the value of saving the sum of those amounts on a single item. For hedonic editing the question is whether an individual should aggregate gains; in our case the question is whether an individual would prefer several small gains to a single equivalent large gain.
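The inequality can be checked numerically. The sketch below is illustrative only: it assumes a power value function v(G) = G^alpha with alpha = 0.88, a common choice in the prospect theory literature that is not taken from this paper, and compares $5 saved on one item with the same $5 spread evenly over ten items. Any function that is strictly concave over gains with v(0) = 0 would show the same ordering.

```python
def value(gain: float, alpha: float = 0.88) -> float:
    """Assumed power value function over gains; concave for 0 < alpha < 1."""
    if gain < 0:
        raise ValueError("this sketch covers gains only")
    return gain ** alpha

def segregated(gains) -> float:
    """Each gain is valued separately, then the values are summed."""
    return sum(value(g) for g in gains)

def integrated(gains) -> float:
    """The gains are summed first, then valued as one outcome."""
    return value(sum(gains))

one_item = [5.00]          # all savings on a single item
ten_items = [0.50] * 10    # the same total savings over ten items

print(f"one item:  {segregated(one_item):.3f}")
print(f"ten items: {segregated(ten_items):.3f}")
print("segregation preferred:", segregated(ten_items) > integrated(ten_items))
```

Because both baskets save the same $5, any difference in perceived value comes purely from how many items the savings are divided across.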
Thaler tested this with the following question. "Mr. A was given tickets to two lotteries involving a World Series. He won $50 in one lottery and $25 in the other. Mr. B was given a ticket to a single, larger World Series lottery. He won $75." [13, p. 203]. When subjects were asked who would be happier, 56% said A would be happier, while only 16% said B would be happier and 15% said no difference. This result has been validated across many contexts including sales of stocks [27], manuscript acceptances [28], and weight loss during dieting [29]. Our contribution is to extend it to the case of N items and to use it not to predict how people should aggregate gains and losses, but rather to predict whether people choose to avail themselves of the savings made possible by new IT-enabled comparison-shopping applications.
The notion of hedonic editing suggests that consumers would prefer savings to be spread across multiple items rather than to gain the same amount of savings on one or a few items. This suggests that consumers would be more willing to travel to a second store for savings if there are more items that will need to be purchased at the second store to achieve those savings. Thus, we predict: H3: To complete the purchase of a multi-item list, the number of items that are to be purchased from a second store will be positively correlated with the decision to go to the second store. (Quantity Hypothesis)

Methodology
We conducted three studies that confirm the existence of quantity bias. The first study involved a within-subjects design, while studies 2 and 3 utilized between-subjects designs. Our first study confirmed that quantity bias exists, while studies 2 and 3 were used to explore properties of quantity bias. All studies were approved under IRB2018-763 by the Texas Tech University Human Research Protection Program, with participants' informed consent obtained prior to starting each study.

Methodology and data collection study 1
For our first study, we presented eight shopping situations, manipulating three variables between low and high values for a two-by-two-by-two presentation (see Table 1): total cart cost, potential total savings by going to a second store, and total number of items to be purchased at the second store. Total items in the cart was centered on 20, with a randomized presentation value drawn from {19, 20, 21}. Total cart cost was centered on $50 or $100, with a randomized presentation value of +/-50 cents. Potential total savings by going to the second store was centered on $5 or $10, with a randomized presentation value of +/-50 cents. The total number of items to purchase at the second store was randomized within one of two sets: {1, 2} or {9, 10, 11}. All random values were drawn uniformly from the relevant range or set. Stimuli were randomized to reduce order effects from the repeated-measures presentations, and the order of the eight shopping situations was also random.
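The stimulus design described above can be sketched in a few lines of Python. This is an illustration, not the survey software the authors used: the function and field names are hypothetical, and drawing the +/-50 cent jitter in whole cents is an assumption, since the text does not state the granularity.

```python
import itertools
import random

def jitter_dollars(center_dollars: int, rng: random.Random, spread_cents: int = 50) -> float:
    """Center value plus a uniform offset in whole cents (granularity assumed)."""
    return (center_dollars * 100 + rng.randint(-spread_cents, spread_cents)) / 100

def make_cart(cost_center, savings_center, second_store_set, rng):
    """One shopping situation for a given cell of the 2 x 2 x 2 design."""
    return {
        "total_items": rng.choice((19, 20, 21)),
        "total_cost": jitter_dollars(cost_center, rng),
        "savings": jitter_dollars(savings_center, rng),
        "items_at_second_store": rng.choice(second_store_set),
    }

def make_session(seed=None):
    """All eight situations for one participant, in random order."""
    rng = random.Random(seed)
    cells = itertools.product((50, 100), (5, 10), ((1, 2), (9, 10, 11)))
    carts = [make_cart(cost, savings, items, rng) for cost, savings, items in cells]
    rng.shuffle(carts)
    return carts

for cart in make_session(seed=7):
    print(cart)
```

Each call to `make_session` yields one cart per cell, so every participant sees all eight combinations of low and high cost, savings, and second-store item count.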
Subjects were students at a large Southern university, who were asked to complete a survey about a new grocery shopping application. They received course credit for participating in the survey. A preliminary question that described the setup for the eight questions was presented which controlled for time by assuming that the second store would always be 10 minutes away. Then the eight different presentations were presented in random order. Likelihood for going to a second store was collected on a seven-level Likert scale.
A demographic summary of subjects is presented in Table 2.
After obtaining consent, demographic information was collected from each participant, and then a setup example question was presented (Fig 2). Note that a math error exists in the setup question: the cheaper item was incorrectly multiplied by three at the second store instead of five. Thus, the example question shows a $.15 savings rather than a $.25 savings from going to the second store (no data was collected on the example question). We had purposely chosen a minimal amount close to zero to avoid priming subjects for the follow-on presentations. Because the manipulated stimuli contained no errors in any of the presentations, we do not believe the error in the example affected participant responses in the experimental portion, and it therefore does not invalidate the results. Eight questions were then presented in random order in the following format (Fig 3). All stimuli presentations were product neutral so as to avoid product-specific biases from participants.

Results study 1
The mean values for each cart type are presented in Table 3 (n = 101). The comparison values between each hypothesis are presented in Table 4. For H1 (absolute hypothesis) and H3 (quantity hypothesis), a paired t-test (n = 404) was performed to test whether the true difference in the means is not equal to 0. For H2 (relative hypothesis), a one-way ANOVA test was used to test whether there is a difference in the means between the three different conditions.
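For readers unfamiliar with the paired t-test, the following Python sketch shows the computation on Likert ratings coded 1 to 7. The ratings are made-up illustrative numbers, not data from the study, and the function name is hypothetical; the authors ran their analysis in R.

```python
import math
import statistics

def paired_t(low_condition, high_condition):
    """Return the paired t statistic and its degrees of freedom.

    Each position holds one respondent's rating under the two conditions,
    so the test is run on the within-respondent differences.
    """
    diffs = [high - low for low, high in zip(low_condition, high_condition)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)      # sample standard deviation
    t_stat = mean_diff / (sd_diff / math.sqrt(n))
    return t_stat, n - 1

# Hypothetical ratings for six respondents (NOT study data).
few_items = [3, 4, 2, 5, 4, 3]    # likelihood with few items at the second store
many_items = [4, 5, 2, 6, 6, 4]   # likelihood with many items at the second store

t_stat, df = paired_t(few_items, many_items)
print(f"t({df}) = {t_stat:.3f}")
```

The resulting statistic would then be compared against a t distribution with the reported degrees of freedom to obtain a p-value.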
All three hypotheses are confirmed using paired t-tests and ANOVA. To test the robustness of the data to outliers, the results were re-analyzed by Winsorizing the data to discard the bottom and top five percent of respondents with regards to time completing the survey. All results remained significant.

PLOS ONE
In addition to the group-wise tests, we conducted regression analysis to determine coefficient size for each independent variable, including controls for age and sex. Regression analysis includes results for Ordered Logistic Regression (Table 5) and Linear Regression (Table 6) with a step-wise addition of each variable in our final model: Absolute Savings (by going to the 2nd store), Relative Savings (by going to the 2nd store), Number of Items to purchase at 2nd Store, Sex, and Age. R base version 3.4.2 was utilized, with ordered logistic regression performed using the polr function of the MASS package version 7.3-47 and linear regression performed using the stats package of the base R version.
Again, all three hypotheses are confirmed and are significant in all models. Results were similar when the regressions were re-run on the Winsorized data, that is, after removing participants in the top and bottom five percent with regard to time to complete the survey. The effect size is small for quantity bias (r = .08), small for relative value bias (r = .11), and medium to large for absolute savings (r = .39) [30]. Of note, for both types of regression analysis in the full model, age and sex are also significant: the likelihood of going to another store to complete a multi-item purchase decreases as age increases or if the respondent is male.

Relative vs. absolute quantity bias (Study 2)
After verifying the existence of quantity bias, we wanted to see whether the bias is relative or absolute, i.e., does it matter how many items are at a second store when compared to the total number of items? Since relative value bias has been shown to influence the perception of the value of absolute savings, it is worth testing whether quantity bias also has a relative component. It is possible that, if we fix the number of items at the second store and increase the total number of items systematically, there could be a systematic change in the willingness to go to the second store. Thus, we propose: H4: Willingness to travel to the second store will increase as the relative number of items that are less expensive at the second store decreases. (Relative-Quantity Hypothesis)
Table 3 note (start of scale truncated in source): ... (2), Slightly unlikely (3), Neither likely nor unlikely (4), Slightly likely (5), Moderately likely (6), Extremely likely (7). https://doi.org/10.1371/journal.pone.0263406.t003

Methodology and data collection study 2
We also test again for quantity bias (H3 from above).
In addition to testing for a relative effect of quantity bias, we also wanted to expand our sample beyond college students to increase generalizability. Therefore, we collected a sample using Amazon Mechanical Turk by advertising a 20-cent Human Intelligence Task (HIT) asking for 1-2 minutes to answer questions about grocery shopping decisions and two demographic questions (gender and age). A demographic summary of the final subject pool is presented in Table 7. The sample is much more varied in age than the college student sample.
We presented two scenarios to 398 participants. In one scenario, the total number of items varied uniformly from 2 to 21, with a single item cheaper at the second store. In the other, the total number of items varied uniformly from 6 to 25, with 5 items cheaper at the second store. Held constant were the time to the second store (5 minutes), the cost of all items at the first store ($50), and the savings from buying the cheaper items at the second store ($5). To make sure subjects read the question, we required them to repeat back each of the variables first, and then we asked them how likely they were to go to the second store (see Fig 4). Each participant received one scenario presentation. After removing responses in which respondents did not repeat back at least four of the five variables correctly, 371 valid responses were analyzed.
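The screening rule above (keep a response only if at least four of the five variables were repeated back correctly) can be expressed as a short Python sketch. The field names are hypothetical labels for the five variables, and exact-match comparison is an assumption about how correctness was scored.

```python
def count_correct(shown: dict, repeated: dict) -> int:
    """Number of scenario variables the respondent repeated back exactly."""
    return sum(1 for name, shown_value in shown.items()
               if repeated.get(name) == shown_value)

def is_valid_response(shown: dict, repeated: dict, min_correct: int = 4) -> bool:
    """Apply the 'at least four of five' attention screen."""
    return count_correct(shown, repeated) >= min_correct

# One hypothetical scenario presentation (field names are illustrative).
shown = {
    "total_items": 14,
    "items_cheaper_at_second_store": 5,
    "minutes_to_second_store": 5,
    "first_store_cost": 50,
    "savings": 5,
}
one_slip = dict(shown, total_items=41)                # one variable misreported
two_slips = dict(shown, total_items=41, savings=50)   # two variables misreported

print(is_valid_response(shown, one_slip))    # kept
print(is_valid_response(shown, two_slips))   # dropped
```

Raising or lowering `min_correct` changes how strict the screen is without altering the rest of the analysis.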

Results study 2
An overview of the presentation values is shown in Table 8.

We conducted regression analysis of the likelihood of shopping at a second store against the total number of items in the basket (the relative-quantity effect), with a dummy variable indicating whether the total number of items purchased at the second store was 1 or 5 (the absolute-quantity effect), and we include controls for age and sex. Regression analysis included Ordered Logistic Regression and Linear Regression with a step-wise addition of each variable in our final model: Total Items at Second Store, Gender, and Age. R base version 3.4.2 was utilized, with ordered logistic regression performed using the polr function of the MASS package version 7.3-47 and linear regression performed using the stats package of the base R version.
Regression against the total number of items in the basket was not significant for any model, while the number of items at the second store was significant. Thus, H3 (quantity bias) was supported and H4 (relative-quantity) was not supported. (Note: tables are not included for space, but are available if desired.)
We then took a subset of each presentation value such that the total number of items was limited to those between 5 and 22 (see Table 9). This restricts the analysis to totals that appear under both values of the control condition, so as to determine whether H4 (relative-quantity) remained insignificant on the overlap. We then regressed the likelihood of shopping at a second store on the number of cheaper items at the second store and the total number of items, including controls for age and gender. Regression analysis included Ordered Logistic Regression (Table 10) and Linear Regression (Table 11) with a step-wise addition of each variable in our final model: Total Items at Second Store, Gender, and Age. R base version 3.4.2 was utilized, with ordered logistic regression performed using the polr function of the MASS package version 7.3-47 and linear regression performed using the stats package of the base R version.

H3 (Quantity Bias) is supported in all models. However, H4 (Relative-Quantity) is not supported in this analysis either. Effect size for quantity bias is medium (r = .28) [30]. Within this study age and gender are not significant.

Online vs. offline effects (Study 3)
Thus far our focus has been on examining crowdsourced price comparison for physical trips to the grocery store. However, quantity bias may also exist for online shopping. Though there is not a physical cost to travel to a second online retailer, there is a cost in terms of time and effort to visit two different online shopping channels. Thus, in study 3 we test to see if the quantity bias exists for online shopping as well. As we argue above, quantity bias is based on the general rule that more items lead to a greater total perceived value for a shopping trip. We believe this inherent bias does not simply vanish in an online setting because the same logic applies, and the same general rule can be derived. Currently, the authors are unaware of any multi-item price comparison websites that are as in-depth as the applications presented for physical store shopping. However, it is not difficult to understand that it is technologically feasible to apply these same design characteristics to online-comparisons of multi-item baskets, especially as online-grocery shopping becomes more commonplace. The decision to make the trip then is a comparison of the value of the trip to the cost of the trip. As online shopping is generally less time consuming than physical store shopping, we expect that the threshold that the value of the trip must overcome is lower for online shopping.
If we are correct, this would result in a main effect of both quantity and shopping method, but a non-significant interaction. Thus, we hypothesize:

Methodology and data collection study 3
We presented a two-by-two matrix, manipulating the number of items at the second store between one and 10, and the purchase location between online and offline. We kept the total number of items, total cost, and savings constant at 20 items, $50, and $5 respectively. We also kept the time to the second store for the offline presentation and the time to complete the transaction on the other website the same at five minutes (see Table 12). Subjects were recruited using Amazon Mechanical Turk by advertising a 20-cent Human Intelligence Task (HIT) asking for 1-2 minutes of their time about their shopping habits. Two demographic questions (gender and age), five check questions, and one of the four scenario situations were presented (see Fig 5), with likelihood of going to a second store or online retailer collected on a seven-level Likert scale. Subjects from the previous study were excluded as participants in this study, and subjects in this study were only allowed to participate once.
A demographic summary of subjects is presented in Table 13.

Results study 3
In this study, we collected responses from 496 participants on Amazon Mechanical Turk. Again, we only use participants who got at least four of five check questions correct. This left n = 464. Results were re-run using thresholds of at least three of five and five of five check questions, with similar results. We conducted regression analysis to determine the coefficient size for each presentation, including controls for age and sex. Regression analysis included Ordered Logistic Regression (Table 14) and Linear Regression (Table 15) with a step-wise addition of each variable in our final model: number of items at the second store, a dummy variable for Online vs. Offline presentation (Online), Gender, and Age. R base version 3.4.2 was utilized, with ordered logistic regression performed using the polr function of the MASS package version 7.3-47 and linear regression performed using the stats package of the base R version.
Both types of regression analysis support H3 (Quantity Bias), H5 (Quantity-Online), and H6 (Quantity-Online Offline-Items), with the controls Age and Gender not becoming significant.

Discussion and conclusion
Prior research has demonstrated that people are predictably irrational in that they attend to the relative cost savings rather than only attending to the absolute cost savings when making a decision to travel to a store. We add to the literature on predictable irrationality in the decision to travel to another store by proposing that the number of items will also be a significant decision variable for people making a decision about traveling to a store. We term this quantity bias. We test this theory in a series of three studies and consistently produce the result across populations and methods. First, using college students and a within-subjects design, we demonstrate that the bias exists even after controlling for absolute and relative savings. Second, using a sample from Mechanical Turk and a between-subjects design, we confirm quantity bias. Third, using a different sample from Mechanical Turk, we demonstrate that quantity bias exists both in online and physical shopping. Thus, we find quantity bias in three different samples under three different conditions. Effect size across the three studies ranges from small to medium. This indicates that, while quantity bias is not as powerful an effect as absolute savings, which has a medium to large effect size in study 1, it does have an impact on user decisions to go to a store farther away or to make a purchase online at a different retailer.
We also note that there are direct consequences for the designers of comparison-shopping applications that directly influence utility, i.e. the intensity of pleasure or pain that is gained from actions [31]. Including information about the number of items seems to influence consumers, so designers need to proceed carefully when choosing how to display such information. In principle, consumers could be nudged into making better decisions if the information is excluded. However, this strategy could backfire if consumers actually feel bad about buying a single item in one trip. It would be useful to do further experiments to determine if experienced and remembered utility were compromised by having subjects only purchase one item. Thus far we have only demonstrated that anticipated utility is compromised. It is easy to imagine a situation where a comparison-shopping app failed because people complained that it made them take an extra trip for only a few items. However, it is also easy to imagine the same app succeeding because people saved more by not being presented with information that activated predictably irrational behavior. More research is warranted.
There could also be implications that sellers need to consider as these sorts of physical world comparison-shopping applications become more widely adopted. The first is that it does not seem that customers will completely price optimize even after considering travel costs. Thus, strategies like loss leaders [32,33] where a retailer offers an item at a very low cost to get customers into a store to buy other items at regular prices could endure, but the effect of price comparison apps may induce more negative results.
Another interesting thing that we discovered was that while people still experienced quantity bias to a similar degree online, there was a much higher baseline willingness to go to a second store online. This occurs in spite of the fact that in both cases we suggested that the time involved was identical in the online and offline contexts. It might be the case that in this work we unintentionally discovered a second economic decision bias that makes people discount time spent online at a different rate than time spent in the physical world. More research would be required to assess this. However, the immediate implication of our inquiry is that online retailers might be much more susceptible to comparison-shopping even for one or few items.
Research supporting offline time-denominated mental accounts has been reported [34] while other research has not found the same [25,35]. Additionally, some research has shown that people are more price sensitive online than offline, especially when cross-store comparison is made easy [36], while other research has shown lower price sensitivity online [36,37] when differentiated product information is highlighted. The predictable irrationality of both price sensitivities and time-denominated mental accounts is worthy of more research when considering quantity bias. We should also note that there is an alternative rational explanation for quantity bias. If we assume that a store could be out of an item, then we would have to discount the savings from items by the probability of the item being out of stock. In other words, going to a second store would be a gamble between getting a savings and getting nothing if the store is out of stock. For a single item, this would be a particularly risky gamble. Of course, this becomes very complex for multi-item shopping, so more research should be done. However, it does suggest that in addition to costs, applications might want to offer information on quantity or availability of items.
In conclusion, we note that quantity bias seems to be a robust effect, which we were able to duplicate in multiple settings. It is a heretofore unexamined bias that has many important consequences for the design of comparison-shopping systems. It seems to have differential consequences for online vs. offline sellers, so there is an important technology component that needs to be studied. It also has implications for how researchers can model and predict both shopper behavior and higher-level market trends in the presence of comparison-shopping applications. In short, it seems to be an important consideration for both researchers and practitioners, and it demands additional study.