
The new rank-based concentration index: Further analysis and properties

  • Tarald O. Kvålseth

    Roles Conceptualization, Data curation, Formal analysis, Funding acquisition, Investigation, Methodology, Project administration, Resources, Software, Supervision, Validation, Visualization, Writing – original draft, Writing – review & editing

    kvals001@umn.edu

    Affiliation Department of Mechanical Engineering and Department of Industrial & Systems Engineering, University of Minnesota, Minneapolis, Minnesota, United States of America

Abstract

Additional properties and generalizations are explored for a recently introduced concentration index CK. The CK is based on both the distribution of a set of proportions (probabilities) and their ranks. The CK is closely related to, and proposed as a preferred alternative to, the widely used index Q that equals the sum of the squared proportions. Besides the use of CK and Q as measures of market or industry concentration, with the proportions being market shares, CK or its potential transformations can be used as alternative measures in a variety of real measurement situations for which Q has been applied. The extended analysis of CK includes the proof that CK is a convex function, which makes it capable of decomposition analysis. The sensitivity and transfer effect of CK due to changes in the distribution of the proportions are studied. Derivations are given for the so-called numbers equivalent of CK and for its probability interpretation. Generalizations of CK are considered for changing the relative emphasis of the component proportions. Randomly generated distributions exemplify the limited effect on CK of excluding the smallest proportions, which are often unavailable in real situations. Numerical comparisons between CK and other concentration indices are presented for a wide variety of firms or industries. A statistical inference procedure is presented for appropriate situations.

1 Introduction

The importance of measuring concentration, especially market or industry concentration, is evidenced by the number and variety of measures or indices that have been proposed over the years (e.g., [1]). Those measures have all been defined in terms of proportions or probabilities p1,..., pn, or market shares in the case of market (industry) concentration, with pi ≥ 0 for i = 1,..., n and Σ_{i=1}^{n} pi = 1 (or 100%). The most popular one is simply the following sum of squared proportions:

Q(Pn) = Σ_{i=1}^{n} pi²,  Pn = (p1,..., pn)   (1)

As a measure of market concentration, for example, it is known as the Herfindahl-Hirschman index after Herfindahl [2] and Hirschman [3].

However, since those various proposed indices, including (1), lack an important property, the value-validity property, an alternative index based on the ranked components of the distribution has recently been introduced by Kvålseth [1] as follows:

CK(Pn) = Σ_{i=1}^{n} p_[i]/i,  p_[1] ≥ p_[2] ≥ ... ≥ p_[n]   (2)

That is, with the pi's arranged in descending order (tied or equal pi's may be arranged in any order), CK is simply defined as the largest pi divided by 1, plus the second largest divided by 2, and so on. Thus, CK is seen to be simply the weighted mean reciprocal rank, i.e., the mean of the reciprocals of the ranks 1,..., n weighted with the respective ranked pi's.
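As a computational illustration (a minimal sketch, not part of the original development), the definition in (2) amounts to sorting the proportions in descending order and dividing the i-th largest by i:

```python
def ck(props):
    # C_K of (2): sort the proportions in descending order (ties may be
    # broken in any order) and weight the i-th largest by the reciprocal rank 1/i.
    ranked = sorted(props, reverse=True)
    return sum(p / i for i, p in enumerate(ranked, start=1))

# Extremes from (3)-(4): a uniform distribution gives (1/n) times the n-th
# harmonic number, while a one-point distribution gives 1.
print(ck([0.25, 0.25, 0.25, 0.25]))  # (1/4)(1 + 1/2 + 1/3 + 1/4) ≈ 0.5208
print(ck([1.0, 0.0, 0.0, 0.0]))      # 1.0
```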

This index in (2) was first briefly introduced by Kvålseth [4] as a general measure of homogeneity for categorical data. A particular form of the expression in (2), referred to as the mean reciprocal rank, is also used for information retrieval and ranking systems (e.g., [5,6]). Besides being a weighted mean of the reciprocal ranks, CK in (2) can also be interpreted in terms of a statistical expectation as the expected reciprocal rank of a randomly chosen observation. Thus, in terms of a random variable X that takes on the values 1, 1/2,..., 1/n with the respective ordered probabilities p_[1], p_[2],..., p_[n], CK becomes the expected value of X.

While CK in (2) was primarily introduced as a measure of market concentration, the versatility of Q in (1) also extends to CK because of a close approximate functional relationship between the two indices [1]. The variety of applications of Q includes measures of biological species concentration [7], coincidence in cryptology [8], political consensus (e.g., [9]), and accounting harmonization and standardization (e.g., [10]). The complement 1 − Q(Pn) has been used as a measure of biological diversity (e.g., [11, Ch. 4]), qualitative variation ([12, pp. 70–71]), linguistic diversity ([13,14]), ethnic fractionalization ([15]), political fractionalization ([16, Ch. 2]), and quadratic entropy ([17]; [18, pp. 174–176]; [19,20]). Other functions of Q(Pn) include 1/Q(Pn) as a measure of diversity (e.g., [11, Ch. 4]) and logQ(Pn) as both the collision entropy [21] and a measure of biological diversity ([22, p. 311]). Thus, being an approximate function of Q, but having the advantage of the value-validity property, CK in (2) may have other potential applications besides market concentration.

With the recent introduction of CK, its various properties were defined and discussed [1]. Those properties included such generally well-known characteristics as continuity, symmetry, zero-indifference (adding one or more zero-probability components does not affect CK), Schur-convexity, and value validity. With emphasis on market (industry) concentration, CK was compared with other frequently used concentration indices using both randomly generated data and real market-share data. CK was also considered in terms of economic theory and market competition, leading to merger implications equivalent to those based on the Herfindahl-Hirschman index [2,3]. Since these properties, discussions, and results were given in a readily available open-access publication [1], there is no need to repeat them here.

Rather, the purpose of the present paper is to identify and prove additional properties with relevant and important implications. One such property is the convexity of CK, which permits subsystem decompositions. A simple probability interpretation of CK is derived from its functional relationship with Q. Expressions will be derived for computing the lower bound on CK and also its so-called numbers equivalent. The sensitivity and transfer properties of CK will be considered, as will its potential parameterized generalizations. Randomly generated distributions will be used to demonstrate the effect of ignoring the smallest pi's, which are often excluded from real data. Real market-share data for a variety of firms (industries) are used to compare the types of numerical values taken on by CK and other concentration indices. A statistical inference procedure will also be derived and exemplified.

2 Properties of CK

Before discussing the properties of CK, a point should be made about the notation used throughout this paper. With any concentration index C being a function of the distribution Pn = (p1,..., pn), it would be mathematically most correct to use C(Pn) to denote the value of the index (function) C. However, as a matter of simplicity and convenience, and when there is no chance of ambiguity, C will be used to denote both the index and its value for a given Pn.

In the introductory paper on CK in (2), the various properties discussed can be concisely outlined as follows [1]:

(P1) CK is simple, comprehensible, and meaningfully interpretable;

(P2) CK takes on its extreme values for the distributions

Pn^0 = (1/n, 1/n,..., 1/n),  Pn^1 = (1, 0,..., 0)   (3)

with

CK(Pn^0) = (1/n) Σ_{i=1}^{n} 1/i,  CK(Pn^1) = 1   (4)

(P3) CK is (permutation) symmetric with respect to p1,..., pn;

(P4) CK is zero indifferent (expansible), i.e., CK(p1,..., pn, 0) = CK(p1,..., pn);

(P5) CK is strictly Schur-convex;

(P6) CK has the value-validity property.

2.1 Convexity of CK

The fact that CK is strictly Schur-convex (Property (P5)) does not imply that CK is convex, which is a stronger requirement. In order to prove the convexity of CK, a more general formulation will be used, of which CK is a particular member.

Thus, consider a general class of concentration measures defined as

C(Pn) = Σ_{i=1}^{n} w_i p_[i],  w_1 ≥ w_2 ≥ ... ≥ w_n ≥ 0   (5)

where the weights w_1,..., w_n and the components p_[1] ≥ p_[2] ≥ ... ≥ p_[n] are both arranged in descending order. Then, for any distributions Pn and Qn and for any constant λ ∈ [0, 1], it follows immediately from the definition in (5) that

C(λPn + (1 − λ)Qn) = C([λPn + (1 − λ)Qn]↓)   (6)

where [·]↓ denotes the rearrangement of a vector's components in descending order and the equality follows from the (permutation) symmetry of C, i.e., the value of C is invariant with respect to any permutation of the unordered pi's since the argument of C in (5) is always ordered (ranked) as p_[1] ≥ ... ≥ p_[n]. Next, as an immediate consequence of the definition of majorization ([23, p. 8]),

[λPn + (1 − λ)Qn]↓ ≺ λ[Pn]↓ + (1 − λ)[Qn]↓   (7)

where the symbol ≺ means that the left side of (7) is majorized by the right side. Then, from (7) and the fact that the type of expression in (5) is (strictly) Schur-convex [23, pp. 160, 639],

C([λPn + (1 − λ)Qn]↓) ≤ C(λ[Pn]↓ + (1 − λ)[Qn]↓) = λC(Pn) + (1 − λ)C(Qn)   (8)

where the last equality holds because λ[Pn]↓ + (1 − λ)[Qn]↓ is itself arranged in descending order, so that (5) applies term by term. Finally, from (6) and (8),

C(λPn + (1 − λ)Qn) ≤ λC(Pn) + (1 − λ)C(Qn)   (9)

which completes the proof that C in (5) is a convex function of Pn. With CK in (2) being a particular member of C in (5), with w_i = 1/i for all i, this result proves that CK is convex.

Note that C in (5) is convex, but not strictly convex. Had C been strictly convex, then the inequality in (9) would have been strict, and C in (5), and hence CK in (2), could not have complied with the value-validity requirement [1]. With Pn and Qn in (6)-(9) replaced with Pn^1 and Pn^0 in (3), respectively, the value-validity property requires that

C(λPn^1 + (1 − λ)Pn^0) = λC(Pn^1) + (1 − λ)C(Pn^0) for all λ ∈ [0, 1]   (10)

which is clearly satisfied by C in (5).

2.2 Comment on CK(Pn^0)

The sum Σ_{i=1}^{n} 1/i in the expression for CK(Pn^0) in (4) is recognized as the n-th harmonic number and is of considerable mathematical interest. It is also well known that the logarithmic expression ln(n) + 0.5772, with 0.5772 being Euler's constant (to 4 decimal places), converges quite rapidly to the n-th harmonic number with increasing n. Therefore, an approximate expression for CK(Pn^0) can be defined as

CK(Pn^0) ≈ [ln(n) + 0.5772]/n   (11)

This approximation is adequate for all practical purposes. In fact, it is found to be correct to at least 3 decimal places unless n is quite small.
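The quality of the approximation in (11) is easy to verify numerically; the following sketch compares the exact lower bound from (4) with the logarithmic approximation:

```python
import math

def ck_min_exact(n):
    # Exact lower bound from (4): (1/n) times the n-th harmonic number
    return sum(1.0 / i for i in range(1, n + 1)) / n

def ck_min_approx(n):
    # Approximation (11), with Euler's constant to 4 decimal places
    return (math.log(n) + 0.5772) / n

# The absolute error behaves roughly like 1/(2n^2), so agreement to
# 3 decimal places sets in quickly as n grows.
for n in (10, 30, 50):
    print(n, round(ck_min_exact(n), 4), round(ck_min_approx(n), 4))
```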

It could, of course, be argued that although CK(Pn^0) is mathematically interesting and easily computable from (11), a more convenient and intuitively reasonable lower bound on a concentration index would be 1/n (as in the case of Q in (1)). In order for the index to be defined over the interval [1/n, 1], CK could be transformed as follows:

(12)

Note, however, that the transformed index in (12) lacks the zero-indifference property (P4). Also, the true n may not necessarily be known in all real situations.

2.3 Numbers equivalent of CK

Some prefer that a concentration measure should be of a so-called numbers equivalent or effective number form (e.g., [24]). For any distribution Pn, the numbers equivalent n_e of CK in (2) can most concisely be defined by the approximate expression

CK(P_{n_e}^0) = (1/n_e) Σ_{i=1}^{n_e} 1/i ≈ CK(Pn)   (13)

where n_e is the nearest integer that makes this approximation as accurate as possible. For any given CK(Pn), the value of n_e in (13) could be determined by means of a search procedure or by trial and error based on the expression in (4) or (11).

An alternative approach to obtaining n_e would be to explore some approximate functional relationship between CK(Pn^0) in (4) and n. Based on exploratory graphical analysis and statistical regression analysis, with parameter estimates rounded off to convenient fractions, the following fitted model has been derived:

(14)

For the fitted model in (14), with the 25 data points for n = 2, 4, 6,..., 50, the coefficient of determination, when properly computed [25], is found to be very high. When the predicted n is rounded off to the nearest integer, the fitted model reproduces the exact n for n = 1, 2,..., 50.

The expressions in (13)-(14) can then be used to determine the numbers equivalent as

(15)

for any given distribution Pn. It needs to be emphasized, however, that n_e, being a non-linear function of CK, does not meet the value-validity condition in (10). Nevertheless, n_e does provide an alternative interesting interpretation of CK. For example, consider a market with 30 firms and CK = 0.45, for which n_e ≈ 5 from (15), which means that this 0.45 concentration is equivalent to that of a market with 5 firms of equal size (market share).
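The search procedure mentioned above can be sketched directly (assuming the "nearest integer" is the n minimizing the absolute difference in (13)):

```python
def numbers_equivalent(ck_value, n_max=1000):
    # Find the integer n whose equal-shares value (1/n)*H_n is closest
    # to the given CK value; a simple linear search over n = 1..n_max.
    best_n, best_err, h = 1, abs(1.0 - ck_value), 0.0
    for n in range(1, n_max + 1):
        h += 1.0 / n                    # running harmonic number H_n
        err = abs(h / n - ck_value)
        if err < best_err:
            best_n, best_err = n, err
    return best_n

print(numbers_equivalent(0.45))  # 5, matching the 30-firm example in the text
```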

2.4 Probability interpretation of CK

An important result from the original paper on CK in (2) is the close functional relationship between CK and the quadratic measure Q in (1) [1]. Specifically, in terms of natural (base-e) logarithms and an exponential term, it was established that

(16)

with a high degree of accuracy. Since Q lacks the value-validity property, (16) can be used as a transformation of Q into CK, which does have this property.

The approximate relationship in (16) can be inverted into

(17)

as a good approximation. This expression provides CK with another intuitively appealing interpretation, since Q is the probability that two randomly chosen observations belong to the same category. In the preceding market concentration example with CK = 0.45, (17) gives Q ≈ 0.15. This result means that if two products are chosen at random from within the market, the probability is about 0.15 that they were both produced by the same firm.

2.5 Sensitivity and transfer of CK

A potentially interesting characteristic of CK is its sensitivity to the individual components and to the form of the distribution Pn = (p1,..., pn) or of its rank-ordered form (p_[1],..., p_[n]). This can be studied by taking partial derivatives of CK for i = 1,..., n, treating each p_[i] as a continuous variable for mathematical purposes. Thus, the sensitivity of CK to a small change in p_[i], with all other components kept fixed, can be defined as

(18)

With the interest being the change in CK, irrespective of whether it is an increase or a decrease, the absolute value from (18) provides some clear indications of the sensitivity of CK to small changes in p_[i]. A most striking overall observation would seem to be that CK is most sensitive to changes in the distribution towards its upper and lower ends: the sensitivity increases with decreasing i near the top of the ranking and with increasing i near the bottom. For specific components, CK is particularly sensitive to changes in the extreme components p_[1] and p_[n].

A related characteristic of CK is the so-called transfer effect, i.e., the effect on CK of transferring a small amount from a smaller p_[j] to a larger p_[i] (i < j). Such a transfer will cause the value of CK to increase as a consequence of the Schur-convexity of CK (Property (P5)) [23, Ch. 1]. With the restriction that the transfer is small enough to preserve the rank order of the components, the transfer effect on CK may be defined as

(19)

which is a similar form of definition to that used by Cowell [26, pp. 57, 154–156] for measures of inequality, except for the relative (versus absolute) difference used in (19). This expression shows that the relative effect on CK of a small transfer from p_[j] to p_[i] depends on their rank difference, but not explicitly on their values. The extreme effect occurs with a transfer from p_[n] to p_[1].
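Because CK in (2) weights the i-th largest proportion by 1/i, a small transfer δ from the component of rank j to that of rank i (i < j) changes CK by exactly δ(1/i − 1/j), provided the rank order is preserved; a quick numerical check of this consequence of (2):

```python
def ck(props):
    ranked = sorted(props, reverse=True)
    return sum(p / i for i, p in enumerate(ranked, start=1))

p = [0.40, 0.30, 0.15, 0.10, 0.05]
delta = 0.01
# Transfer delta from the 4th-largest share to the 2nd-largest share;
# the rank order is unchanged since 0.31 and 0.09 keep their positions.
q = [0.40, 0.30 + delta, 0.15, 0.10 - delta, 0.05]
print(round(ck(q) - ck(p), 6))  # delta*(1/2 - 1/4) = 0.0025
```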

It may be of interest to compare these sensitivity and transfer effects of CK with those corresponding to Q in (1). Therefore, when the equivalent expressions to (18) and (19) are applied to Q, the following formulations are obtained:

(20)

The general results from (18)-(20) show that the forms of the sensitivity to changes in the components and of the transfer effects are comparable for the two measures, with the difference that those of CK are determined by the ranks i while those of Q depend on the ranked pi's.

3 m-Category truncation of CK

It is clear from the definition of CK in (2) that when a number of the pi's are very small, ignoring them in the computation of CK only marginally affects its value. In practice, this characteristic of CK is in effect an advantage, since reported data often omit very small pi's or group them into an "all others" category. It is therefore worth determining more closely the effect of such exclusion on the value of CK.

Therefore, for some m < n, expressing CK as

CK(Pn) = Σ_{i=1}^{m} p_[i]/i + ε,  ε = Σ_{i=m+1}^{n} p_[i]/i   (21)

the concern is basically with the size of the "error" term ε. From majorization theory [23, Ch. 1], with T = Σ_{i=m+1}^{n} p_[i] denoting the total excluded proportion, the following majorizations apply:

(T/(n − m),..., T/(n − m)) ≺ (p_[m+1],..., p_[n]) ≺ (T, 0,..., 0)   (22)

so that, from (22) and the Schur-convexity of the term ε in (21) as a function of (p_[m+1],..., p_[n]), the following inequality is obtained:

[T/(n − m)] Σ_{i=m+1}^{n} 1/i ≤ ε ≤ T/(m + 1)   (23)

It is clear from (23), especially from the upper bound, that little information is lost by disregarding the smallest pi's if m is not small.

The extent of ε in (23) can also be examined empirically by using various distributions Pn. Thus, a random sample of such distributions was generated using the computer algorithm described in [1], in which n and each pi were generated as random numbers within specified intervals, with the pi's arranged in decreasing order.

A total of 30 such distributions were randomly generated and, for each distribution, computations were made of the values of CK in (2) and of its truncated form in (21) for the chosen m = 5 and m = 10, as given in Table 1.

Table 1. Values of CK in (2) and of its truncated form in (21) for m = 5 and m = 10 from randomly generated distributions Pn.

https://doi.org/10.1371/journal.pone.0343034.t001

As could reasonably be expected, it is apparent from Table 1 that a substantial amount of information may be lost when using a value of m as small as m = 5. For example, for the error term ε in (21), the largest errors occur for Data Sets 3, 8, 14, 16, and 21, with a mean error value of 0.030 for the 30 data sets. By comparison, for m = 10, Table 1 shows that the errors are substantially lower and that the values of CK and its truncated form are generally quite comparable (except for, say, Data Sets 3 and 16). The linear regression of the truncated values on the full CK values confirms the considerable agreement between the two indices.

A conservative conclusion from these results would be as follows: utilize all pi when computing the value of CK; but if some of the smaller pi's are ignored, the effect on CK is likely to be rather negligible.
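The truncation error and its bounds in (23) can be illustrated with a short sketch (using an arbitrary long-tailed distribution as an assumed example, not one of the paper's randomly generated data sets):

```python
def ck(props):
    ranked = sorted(props, reverse=True)
    return sum(p / i for i, p in enumerate(ranked, start=1))

# An arbitrary long-tailed example: proportions decaying like 1/i^2, n = 30.
n = 30
raw = [1.0 / i**2 for i in range(1, n + 1)]
total = sum(raw)
p = [r / total for r in raw]        # already in descending order

m = 10
truncated = sum(p[i] / (i + 1) for i in range(m))   # top-m terms of (2)
eps = ck(p) - truncated                              # error term in (21)
tail = sum(p[m:])                                    # total excluded proportion T
# Bounds as in (23): (T/(n-m)) * sum_{i=m+1}^{n} 1/i  <=  eps  <=  T/(m+1)
lower = tail / (n - m) * sum(1.0 / i for i in range(m + 1, n + 1))
upper = tail / (m + 1)
print(eps, lower, upper)
```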

4 Generalizations of CK

The CK in (2) could potentially be generalized in a number of different ways by introducing some additional parameter α > 0. One such generalization would be the following α-order weighted mean of the reciprocal ranks:

CK(α)(Pn) = [Σ_{i=1}^{n} p_[i] (1/i)^α]^{1/α}   (24)

of which CK is the particular member for α = 1. Another rather obvious generalization would be

Cα(Pn) = Σ_{i=1}^{n} p_[i]/i^α   (25)

with CK being the member for α = 1. Note that Cα is also a member of the concentration class C in (5), with w_i = 1/i^α for i = 1,..., n.

From a property of generalized means (e.g., [27, Ch. III]), the index (family) in (24) is strictly increasing in α, whereas the parameterized index in (25) is seen to be decreasing in α for any given Pn. An example of these two generalized indices as functions of the parameter α for the distribution P5 = (0.40, 0.30, 0.15, 0.10, 0.05) is given in Fig 1. As α → ∞, the index in (24) approaches 1 while that in (25) approaches p_[1]. The two curves cross at α = 1, where both equal CK in (2).

Fig 1. Example of CK(α) in (24) (Curve A) and Cα in (25) (Curve B) as functions of the parameter α for the distribution P5 = (0.40, 0.30, 0.15, 0.10, 0.05).

https://doi.org/10.1371/journal.pone.0343034.g001

The effect on these two families of indices of changing α is basically one of changing the weights or emphasis given to the different p_[i]'s. While increasing α places increasing weight on the larger p_[i]'s for CK(α) in (24), the effect on Cα in (25) is the reverse. The value-validity condition in (10) can be seen to be satisfied by Cα in (25) for all α, but only for α = 1 in the case of CK(α) in (24), when it reduces to CK in (2).

In spite of the flexibility offered by these two families of potential concentration indices, there seems to be no compelling reason to prefer any particular alternative over CK in (2) as a single measure. Using such curves as in Fig 1 to compare the concentration of two distributions Pn and Qn is restricted by the fact that the two curves may cross, so that one distribution shows the greater concentration for some α and the lesser concentration for other α. Nevertheless, such generalized formulations may provide some useful information in real applications. In the case of Fig 1, for example, with CK being a compromise single concentration measure, both curves show precisely how changing the emphasis on the larger or smaller component p_[i]'s affects the concentration measurement.

5 Statistical inferences about CK

In situations where p1,..., pn are multinomial random sample proportions with Σ_{i=1}^{n} pi = 1 for sample size N, it may be of interest to make statistical inferences about CK in (2), especially confidence-interval construction. That is, if Pn = (p1,..., pn) is the sample probability distribution and Πn = (π1,..., πn) is the corresponding population distribution, one may want to make inferences about the population index CK(Πn). Besides resampling methods such as the bootstrap and the jackknife, such statistical inferences can be made by means of the delta method, a useful and powerful result of statistical limit theory that is widely discussed in textbooks on categorical data analysis (e.g., [28, Ch. 16], [29, Ch. 14]).

Concisely stated, it follows from the delta method applied to CK(Pn) that the following convergence in distribution holds:

√N [CK(Pn) − CK(Πn)] →d N(0, σ²)   (26)

so that, for a large multinomial sample of size N, the estimator CK(Pn) is approximately normal with mean CK(Πn) and variance σ²/N. The accuracy of this asymptotic result depends, of course, on the sample size N. By taking the partial derivatives of CK(Πn) with respect to the π_[i] and then substituting the corresponding sample estimates p_[i] for the π_[i], the estimated variance in (26) becomes

σ̂² = Σ_{i=1}^{n} p_[i] (∂CK/∂p_[i])² − [Σ_{i=1}^{n} p_[i] (∂CK/∂p_[i])]²   (27)

From the definition of CK in (2), with ∂CK/∂p_[i] = 1/i, the expression in (27) becomes

σ̂² = Σ_{i=1}^{n} p_[i]/i² − [CK(Pn)]²   (28)

Instead of performing the statistical inferences directly on CK(Πn), it is preferable to use the following logarithmic (logit) transformation and its inverse:

L = ln[CK/(1 − CK)],  CK = e^L/(1 + e^L)   (29)

since this transformation provides a more rapid convergence to normality and ensures that a confidence interval will always fall inside the [0, 1]-interval (e.g., [28, pp. 70, 618]; [30, p. 106]). The estimated variance of L(Pn) in (29) becomes

σ̂_L² = σ̂²/[CK(Pn)(1 − CK(Pn))]²   (30)

An approximate confidence interval for L(Πn) then becomes

L(Pn) ± z σ̂_L/√N   (31)

where z is the standard normal quantile (e.g., z = 1.645 for 90% and z = 1.96 for 95% confidence). The corresponding confidence interval for CK(Πn) is then obtained by applying the inverse transformation in (29) to each endpoint of the interval in (31).

As a numerical example, consider a sample distribution Pn based on a sample of size N. From the sample value of CK(Pn), together with (28) and (30), and with L(Pn) computed from (29), a 95% confidence interval for L(Πn) from (31) becomes [0.6501, 1.2063]. Then, by applying the inverse transformation in (29), a 95% confidence interval for CK(Πn) becomes [0.66, 0.77].
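The whole procedure can be sketched in code. The variance below assumes the standard multinomial delta-method form with gradient 1/i for the i-th ranked proportion, consistent with the definition in (2); the sample distribution and size used here are hypothetical, not those of the paper's example:

```python
import math

def ck_ci(props, n_samples, conf_z=1.96):
    # Point estimate CK(Pn) from (2)
    p = sorted(props, reverse=True)
    ck = sum(pi / i for i, pi in enumerate(p, start=1))
    # Delta-method variance as in (28): sum p_[i]/i^2 - CK^2
    var = sum(pi / i**2 for i, pi in enumerate(p, start=1)) - ck**2
    # Logit transform (29) and its estimated variance (30)
    L = math.log(ck / (1.0 - ck))
    var_L = var / (ck * (1.0 - ck))**2
    half = conf_z * math.sqrt(var_L / n_samples)
    # Back-transform the interval endpoints with the inverse of (29)
    inv = lambda x: math.exp(x) / (1.0 + math.exp(x))
    return ck, (inv(L - half), inv(L + half))

# Hypothetical sample: 4 categories, N = 200 observations.
est, (lo, hi) = ck_ci([0.45, 0.30, 0.15, 0.10], 200)
print(round(est, 3), round(lo, 3), round(hi, 3))
```

By construction, the back-transformed interval always lies inside (0, 1), which is the motivation for working on the logit scale.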

6 Discussion

6.1 Illustrative example

As a real example of the computation of the new index CK defined in (2) and of its interpretations, consider the results of the national elections in Norway in 2025. By including only the parties that received at least one percent of the votes (i.e., excluding 13 parties), the following percentage votes were obtained: 28.0, 23.8, 14.6, 5.6, 5.6, 5.3, 4.7, 4.2, 3.7 (for a total of 95.5%). Since these results are given in descending order with p_[1] = 28.0, p_[2] = 23.8, etc., the value of CK becomes

CK = 28.0/1 + 23.8/2 + 14.6/3 + 5.6/4 + 5.6/5 + 5.3/6 + 4.7/7 + 4.2/8 + 3.7/9 ≈ 49.8%

or, in terms of proportions instead of percentages, CK ≈ 0.50. Note that the two tied results of 5.6% are divided by consecutive ranks (rather than their mean). As a limiting case, had all 9 parties' votes been tied, then CK = (1/9) Σ_{i=1}^{9} 1/i ≈ 0.31, or 31%. By comparison, a slightly larger concentration value than that of Norway's election is obtained for the 2025 general election in the Czech Republic, with CK = 0.56 (56%) for 7 parties.
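As a check, the election computation can be reproduced directly from the definition in (2):

```python
# Norwegian 2025 election results (percent of votes, descending order)
votes = [28.0, 23.8, 14.6, 5.6, 5.6, 5.3, 4.7, 4.2, 3.7]
# CK from (2): divide the i-th largest proportion by its rank i and sum
ck = sum(v / 100.0 / rank for rank, v in enumerate(votes, start=1))
print(round(ck, 2))  # 0.5
```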

As a measure of political dominance (vote concentration), how can the CK = 0.50 (50%) be interpreted in terms of the extent of such dominance? Since CK belongs to the interval (0, 1], a value of 0.50 would seem to imply a political dominance that is neither high nor low. However, had the votes of the top three parties been combined, the resulting value of CK (74.2%) could arguably be interpreted as a high degree of political dominance.

In terms of a meaningful interpretation, what does the election result CK = 0.50 actually mean? The answer lies in (17). That is, with CK = 0.50 in (17), Q ≈ 0.18. This means that the probability is about 0.18 that two randomly chosen individuals voted for the same party. By comparison, the corresponding value of Q in (1) applied directly to the above voting results gives Q ≈ 0.17.

As another meaningful interpretation of the election result CK = 0.50, consider the numbers equivalent n_e defined in (15). Then, with CK = 0.50 in (15), it is found that n_e ≈ 4. This result means that the political dominance of CK = 0.50 would be the same as if Norway had had about 4 different political parties with equal party support.

6.2 Empirical comparison with other indices

The numerical values of the various alternative concentration indices with similar properties that have been proposed over the years [1] may differ greatly for the same data sets. Consequently, the results and conclusions from any data analysis can depend strongly on the index being used. What sets the new index CK apart from other concentration indices is the so-called value-validity property (Property (P6) stated above). This property imposes a condition specifically on the numerical values taken on by a concentration index to ensure that those values can be justified as providing realistic, true, or valid representations of the concentration characteristic or attribute [1].

Therefore, it is of interest to compare values of CK with those of other indices for the same data sets. Consider, for example, the following indices:

CR4 = Σ_{i=1}^{4} p_[i],  HHI = Σ_{i=1}^{n} pi²,  RHT = 1/(2 Σ_{i=1}^{n} i p_[i] − 1)   (32)

where CR4 is the well-known 4-firm concentration ratio, HHI is the popular Herfindahl-Hirschman index [2, 3], and RHT is the index by Rosenbluth [31] and Hall and Tideman [32]. A fourth index used for comparison is a member of a parameterized family of indices by Davies [33]. For this comparative analysis, real market-share data were used as in [1] for a variety of different types of firms or industries. The results are summarized in Table 2.
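Under the standard textbook definitions (with RHT taken as the reciprocal of twice the rank-weighted sum of shares minus one; the Davies index is omitted here since its parameterization is not spelled out above), the comparison indices can be sketched as:

```python
def cr4(props):
    # 4-firm concentration ratio: sum of the four largest shares
    return sum(sorted(props, reverse=True)[:4])

def hhi(props):
    # Herfindahl-Hirschman index, identical to Q in (1)
    return sum(p * p for p in props)

def rht(props):
    # Rosenbluth / Hall-Tideman index: 1 / (2 * sum(i * p_[i]) - 1)
    ranked = sorted(props, reverse=True)
    return 1.0 / (2.0 * sum(i * p for i, p in enumerate(ranked, start=1)) - 1.0)

p = [0.40, 0.30, 0.15, 0.10, 0.05]   # an assumed example distribution
print(cr4(p), hhi(p), rht(p))
```

For a uniform distribution over n categories all three reduce to their lower extremes (CR4 = 4/n for n ≥ 4, HHI = RHT = 1/n), and all equal 1 for a one-point distribution.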

Table 2. Values of CK in (2) and of the other concentration indices in (32) for a sample of market-share data from various types of markets or industries.

https://doi.org/10.1371/journal.pone.0343034.t002

These results show clearly how the values of the various indices can differ substantially for the same sets of data. Between the two extreme distributions in (3), these indices range in potential value from 0 to 1, except for the Davies index, which can range from 0 to n. Values of CK are seen to be consistently larger than those of HHI and RHT and frequently larger than those of the Davies index (except for Examples 12, 14, and 20), even though the potential range of values is greatest for the latter. As expected from its definition in (32), CR4 values are seen to be consistently greater than those of HHI and RHT.

For each index, the results in Table 2 show that the market (industry) with the highest concentration was that of the leading search engines in Norway (in 2020) (Example 12), while the lowest concentration occurred for the best-selling cars in Britain (in 2018) (Example 8). Although each of these indices provides the same rank order for these two extreme cases, other order ("larger than") comparisons vary considerably between the indices. In the case of CK and RHT, for example, CK shows greater concentration for airline travel (number of flights per weekday between London and New York by 30 different carriers in 2000) (Example 2) than for global pharmaceutical products (Example 5), while this order is reversed when using RHT. Note that these two indices are the only rank-based ones. Similarly, when comparing CK and HHI, for instance, reverse order occurs between Examples 5 and 6 and between Examples 10 and 15.

Even though CK is highly correlated with each of the other indices (CR4, HHI, RHT, and the Davies index) in terms of Pearson's correlation coefficient for the data in Table 2, it is clear from these exemplary data that different indices can provide substantially different and contradictory assessments of concentration. Although these data are based on the market shares of a wide variety of markets or industries, similar results can be expected in any other real situations or applications involving some distribution Pn. In spite of the fact that many proposed concentration indices share several of the above properties (P1)-(P6), as well as the convexity property, they lack one important property: the value-validity property (P6), which only CK has.

7 Conclusion

When considering the results derived in this paper together with those previously reported [1], one conclusion would seem to be clear: CK has the various types of properties required of an appropriate concentration measure. One of the interesting features of CK is its close functional relationship to the quadratic index Q in (1), implying that the various applications of Q or its functions can also be considered for CK.

There is, however, an important difference between CK and Q: CK has the value-validity property (Property (P6)), but Q does not, since it cannot satisfy the equivalent of the equality part of (10) (with C = Q and the extreme distributions in (3)) because its convexity is strict. The value-validity property is considered necessary in order to make true and reliable difference comparisons for the concentration characteristic. The validity of such comparisons is essential for determining trend information, such as changes in concentration over time. Without the value-validity property, different indices can produce widely differing results and conclusions, as demonstrated above using real market-share data for a variety of markets (industries). In that analysis, the popular Herfindahl-Hirschman index HHI in (32) equals Q in (1).

The parameterized generalizations in (24)-(25) do provide for some potentially interesting assessments of the effect on the concentration values of varying the relative weight or emphasis assigned to the ordered distribution components p_[1],..., p_[n]. However, as a choice for a single concentration index, there would seem to be no particular reason to prefer any alternative other than CK.

Supporting information

References

  1. Kvålseth TO. Measurement of market (industry) concentration based on value validity. PLOS One. 2022;17(7):1–24.
  2. Herfindahl OC. Concentration in the steel industry. New York, NY: Columbia University. 1950.
  3. Hirschman AC. National power and the structure of foreign trade. Berkeley, CA: University of California Press. 1945.
  4. Kvålseth TO. A measure of homogeneity for nominal categorical data. Percept Mot Skills. 1993;76:1129–30.
  5. Craswell N. Mean Reciprocal Rank. In: Encyclopedia of Database Systems. Springer US. 2009. 1703. https://doi.org/10.1007/978-0-387-39940-9_488
  6. Olaosebikan R, Akinwonmi AE, Ojokoh BA, Daramola OA, Adeola OS. Development of a Best Answer Recommendation Model in a Community Question Answering (CQA) System. IIM. 2021;13(03):180–98.
  7. Simpson EH. Measurement of Diversity. Nature. 1949;163(4148):688.
  8. Friedman WF. The index of coincidence and its applications in cryptography. Geneva, IL: Riverbank Laboratories. 1922.
  9. Alcantud JCR, Torrecillas MJM. Consensus measures for various informational bases. Three new proposals and two case studies from political science. Qual Quant. 2017;51(1):285–306.
  10. Mijoč I, Starčević DP. Measurement of accounting harmonization and standardization. MEST J. 2013;1(1):126–36.
  11. Magurran AE. Measurement of biological diversity. Malden, MA: Blackwell Science. 2004.
  12. Weisberg HF. Central tendency and variability. Newbury Park, CA: Sage. 1992.
  13. Greenberg JH. The measurement of linguistic diversity. Language. 1956;32(1):109–15.
  14. Gazzola M, Templin T, McEntee-Atalianis LJ. Measuring diversity in multilingual communication. Soc Indic Res. 2020;147(2):545–66.
  15. Drazanova L. Introducing the Historical Index of Ethnic Fractionalization (HIEF) Dataset: Accounting for Longitudinal Changes in Ethnic Diversity. Journal of Open Humanities Data. 2020;6.
  16. Rae DW, Taylor M. The analysis of political cleavages. New Haven, CT: Yale University Press. 1970.
  17. Vajda I. Bounds on the minimal error probability in checking a finite or countable number of hypotheses. Problemy Peredachi Informatsii. 1968;4(1):9–17.
  18. Arndt C. Information measures: information and its description in science and engineering. Berlin: Springer. 2004.
  19. Ellerman D. New foundations for information theory: logical entropy and Shannon entropy. Cham, Switzerland: Springer. 2021.
  20. Kvålseth TO. Entropies and Their Concavity and Schur-Concavity Conditions. IEEE Access. 2022;10:96006–15.
  21. Bosyk GM, Portesi M, Plastino A. Collision entropy and optimal uncertainty. Physical Review A: Atomic, Molecular, and Optical Physics. 2012;85(1):012108.
  22. Pielou EC. Mathematical ecology. New York: Wiley. 1977.
  23. Marshall AW, Olkin I, Arnold BC. Inequalities: theory of majorization and its applications. 2nd ed. New York: Springer. 2011.
  24. Hannah L, Kay JA. Concentration in modern industry. London, UK: Macmillan. 1977.
  25. Kvålseth TO. Cautionary note about R2. Am Statistician. 1985;39(4):279–85.
  26. Cowell FA. Measuring inequality. 3rd ed. Oxford, UK: Oxford University Press. 2011.
  27. Bullen PS. Handbook of means and their inequalities. Dordrecht, The Netherlands: Kluwer. 2003.
  28. Agresti A. Categorical data analysis. 3rd ed. Hoboken, NJ: Wiley. 2013.
  29. Bishop YMM, Fienberg SE, Holland PW. Discrete multivariate analysis: theory and practice. Cambridge, MA: MIT Press. 1975.
  30. Lloyd CJ. Analysis of categorical data. New York: Wiley. 1999.
  31. Rosenbluth G. Round table-Gespräch über Messung der industriellen Konzentration. In: Die Konzentration in der Wirtschaft. Berlin, Germany: Duncker & Humblot. 1961. 391.
  32. Hall M, Tideman N. Measures of concentration. J Am Stat Assoc. 1967;62(317):162–8.
  33. Davies S. Measuring industrial concentration: an alternative approach. Rev Econ Stat. 1980;62:306–9.
  34. Lijesen MG, Nijkamp P, Rietveld P. Measuring competition in civil aviation. Journal of Air Transport Management. 2002;8(3):189–97.
  35. Bain JS. Industrial organization. New York: Wiley. 1959.
  36. Market Share Reporter. 27th ed. Gale. 2017.
  37. Lindsley CW. New 2016 Data and Statistics for Global Pharmaceutical Products and Projections through 2017. ACS Chem Neurosci. 2017;8(8):1635–6. pmid:28810746
  38. Statista. Market share of the leading insurance companies in Belgium as of 2016. 2018. https://www.statista.com/statistics/780454/market-share-leading-insurance-companies-belgium/
  39. Market share of the leading exporters of major weapons between 2013-2017, by country. 2018. https://www.statista.com/statistics/267131/market-share-of-the-leading-exporters-of-conventional-weapons/
  40. SMMT. Best-selling car marques in Britain in 2018 (Q1). 2018. https://www.best-selling-cars.com/britain-uk/2018-q1-britain-best-selling-car-brands-and-models/
  41. Statista. U.S. market share of selected automobile manufacturers 2013. 2014. https://www.statista.com/statistics/249375/us-market-share-of-selected-automobile-manufacturers/
  42. Statista. Market share held by the leading search engines in Norway as of July 2020. https://www-statista-com.ezp2.lib.umn.edu/statistics/621417/most-popular-search-engines-in-Norway/
  43. Statista. Commercial water heater market share in the United States 2017-2019, by company. https://www-statista-co.epz2.lib.umn.edu/statistics/700299/us-commercial-gas-water-heater-market-share/
  44. Statista. Global microprocessor market share from 1st quarter 2009 to 3rd quarter 2011, by vendor. https://www.statista.com/statistics/270560/global-microprocessor-market-share-since-2009-by-vendor/
  45. Statista. Global cigarette market share as of 2019, by company. https://www.statista.com/statistics279873/global-cigarette-market-share-by-group/
  46. Carsalesbase.com. Global car sales analysis 2017-Q1. 2017. http://carsalesbase.com/global-car-sales-2017-q1/
  47. Statista. Share of the world’s leading consumer products companies 2013, by product sector. https://www-statista-co.epz2.lib.umn.edu/statistics/225768/share-of-the-leading-consumer-products-companies-worldwide-by-product-sector/