Analyzed the data: SP WHW GSH. Contributed reagents/materials/analysis tools: SP WHW. Wrote the paper: SP WHW GSH. Conceived the original idea and designed the study: SP.

The authors have declared that no competing interests exist.

Transitive inference, class inclusion and a variety of other inferential abilities have strikingly similar developmental profiles—all are acquired around the age of five. Yet, little is known about the reasons for this correspondence. Category theory was invented as a formal means of establishing commonalities between various mathematical structures. We use category theory to show that transitive inference and class inclusion involve

Children acquire various reasoning skills during a remarkably similar period of development. Yet, the reasons for these similarities are a mystery. Two examples are Transitive Inference and Class Inclusion, which develop around five years of age. Older children understand that if John is taller than Mary, and Mary is taller than Sue, then John is also taller than Sue. This form of reasoning is called transitive inference. Older children also understand that there are more fruits than apples. This inference is called class inclusion. We explain why these and a variety of other abilities show the same developmental profile using a branch of mathematics called category theory. Category theory reveals that they have related underlying structure. So, despite their superficial differences, these reasoning abilities have similar profiles of development because they involve related sorts of processes.

Children acquire various reasoning skills over remarkably similar periods of development. Transitive Inference and Class Inclusion are two behaviours among a suite of inferential abilities that have strikingly similar developmental profiles—all are acquired around the age of five years

Since Piaget, decades of research have revealed important clues regarding the development of inference, yet little is known about the reasons underlying these correspondences (see

This theoretical difficulty is symptomatic of the general problem in cognitive science where the basic components of cognition are unknown. In the absence of such detailed knowledge, cognitive modelers have been forced to assume a particular representational format (e.g., symbolic

In this section, we provide the basic category theory definitions and constructs used in our subsequent analysis of various inferential abilities. Detailed introductions to category theory are found in

A

a class

a set

a morphism

a composition operation, denoted “

One immediately recognizable example is the category

Categories exist for a diverse range of structures, with objects more complex than sets of elements, and structure-preserving morphisms more complex than associations. For example, the following morphism

We need to introduce the notion of the

Some examples of duals involve certain types of morphisms, called epimorphisms, monomorphisms and isomorphisms. A morphism

Cognitive behavior generally involves some means of integrating information. A general notion of integration is the categorical product. In any category

In

One can think of tasks involving stimuli that vary along two task-relevant dimensions as examples involving categorical products. For example, a classification task where the rule is based on, say, stimulus colour and size involves a product, with the set of task stimuli as the product object and the determination of colour and size features as the projection morphisms. Conservation tasks (for example, predicting whether the amount of liquid in one container is the same as in another, where the containers vary in, say, height and width) also involve products. In this case, the product object is a set of volumes and the projection maps recover the associated heights and widths. We will see further examples of tasks involving products in the next section.
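The colour-and-size classification example can be made concrete with a minimal sketch in Python, assuming we work in the category of sets and functions. The feature values, the card names, and the helper names (`colour_of`, `size_of`, `pair`) are hypothetical and chosen only for illustration; they are not taken from the studies discussed here.

```python
from itertools import product as cartesian

# Hypothetical feature sets (illustrative values only).
COLOURS = {"red", "blue"}
SIZES = {"small", "large"}

# Product object in the category of sets: the Cartesian product.
STIMULI = set(cartesian(COLOURS, SIZES))


def colour_of(stimulus):
    """Projection morphism onto the colour component."""
    return stimulus[0]


def size_of(stimulus):
    """Projection morphism onto the size component."""
    return stimulus[1]


def pair(f, g):
    """Mediating map z -> (f(z), g(z)) into the product.

    It is the unique map whose composites with the two
    projections give back f and g.
    """
    return lambda z: (f(z), g(z))


# A hypothetical set of cards, each with a known colour and size.
CARDS = {"card1": ("red", "small"), "card2": ("blue", "large")}


def card_colour(card):
    return CARDS[card][0]


def card_size(card):
    return CARDS[card][1]


locate = pair(card_colour, card_size)

# The product diagram commutes: projecting after `locate`
# recovers the original colour and size maps.
commutes = all(
    colour_of(locate(c)) == card_colour(c) and size_of(locate(c)) == card_size(c)
    for c in CARDS
)
print(commutes)
```

The point of the sketch is the universal property: any pair of maps into the two feature sets factors through the product object via a single mediating map.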

For our purposes, the categorical product

A related notion of information integration is the categorical coproduct. In any category

In

If we reverse all the arrows in the definition of a coproduct we get a product. A product in a category

One way to think about coproducts in terms of cognitive tasks is to regard the label as the context or condition under which a stimulus is associated with a particular action. Experimental paradigms designed to assess cognitive flexibility, such as the Wisconsin Card Sorting Task, are examples. For instance, in one context, say, a reward schedule based on colour, a red triangle may require one type of response, but for a reward schedule based on shape, the red triangle requires a different type of response. In this case, the coproduct object is the disjoint union of the stimulus set with itself with colour and shape as labels, and the response is determined by a map from the coproduct object to a set of actions.
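The card-sorting reading of the coproduct can be sketched in the same spirit. The following is a minimal illustration in Python, assuming the category of sets, with the disjoint union encoded as label-tagged pairs. The stimuli, the response names, and the helper `copair` are hypothetical and serve only to illustrate the construction.

```python
# Hypothetical stimulus set (illustrative values only).
STIMULI = {"red triangle", "blue circle"}

# Coproduct object: the disjoint union of the stimulus set with itself,
# with each copy tagged by its context label.
COPRODUCT = {(label, s) for label in ("colour", "shape") for s in STIMULI}


def inject_colour(stimulus):
    """Injection of a stimulus into the colour-game copy."""
    return ("colour", stimulus)


def inject_shape(stimulus):
    """Injection of a stimulus into the shape-game copy."""
    return ("shape", stimulus)


def copair(f, g):
    """Mediating map out of the coproduct.

    It applies f on the colour-tagged copy and g on the shape-tagged copy,
    so that composing with each injection gives back f and g.
    """
    def h(tagged):
        label, stimulus = tagged
        return f(stimulus) if label == "colour" else g(stimulus)
    return h


# Hypothetical response rules for the two games.
def by_colour(stimulus):
    return "left" if stimulus.startswith("red") else "right"


def by_shape(stimulus):
    return "left" if stimulus.endswith("circle") else "right"


respond = copair(by_colour, by_shape)

# The same card requires different responses in different contexts.
print(respond(inject_colour("red triangle")))
print(respond(inject_shape("red triangle")))
```

Here the response to a red triangle depends on which copy of the stimulus set it is injected into, which is the sense in which the label acts as the context.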

More generally, information integration is often subject to satisfying some constraint. Hence, product and coproduct are instances of more general constructs known as pullbacks and pushouts, respectively. A

Intersection is an example of pullback in

Pushout is dual to pullback. A

Given the duality, union is an example of pushout in

For our purposes, the commutative squares in the pullback and pushout diagrams pertain to statements about cognitive (sub)systems, and

An

These “special” cases are important for determining whether a system that apparently involves a (co)product is in fact isomorphic to one that does not. We will see an example of this situation in the next section. In these situations, we say that task difficulty is related to the simpler, non-(co)product form.
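One standard instance of such a reduction, sketched here under the assumption that we work in the category of sets, is that the product of a set with a one-element (terminal) set, and the coproduct of a set with the empty (initial) set, are each in bijection with the original set. The set contents below are hypothetical.

```python
# Hypothetical feature set (illustrative values only).
A = {"red", "blue", "green"}
ONE = {"*"}    # a one-element set: terminal object in the category of sets
ZERO = set()   # the empty set: initial object in the category of sets

# Product with the terminal object: pairs (a, "*").
product_with_one = {(a, u) for a in A for u in ONE}

# Coproduct with the initial object: tagged elements, where the second
# copy contributes nothing because ZERO is empty.
coproduct_with_zero = {(0, a) for a in A} | {(1, z) for z in ZERO}

# Each construction maps bijectively back onto A by forgetting
# the trivial component or the tag.
from_product = {p: p[0] for p in product_with_one}
from_coproduct = {c: c[1] for c in coproduct_with_zero}

print(set(from_product.values()) == A)
print(set(from_coproduct.values()) == A)
```

In both cases the apparent (co)product carries no information beyond the single component, so nothing is gained by computing it.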

Notice that we could have explained all this just in terms of the particular product, coproduct, pullback, pushout, initial and terminal object that prevail in

In this section, we apply category theory concepts to the analysis of results from several studies that have provided empirical evidence of within group similarities and between group differences in behavioural performance across multiple tasks. The objective is to identify a formal basis for an equivalence class of tasks that accounts for these similarities and differences. Our seed paradigms are Transitive Inference and Class Inclusion, which were tested on age groups ranging from three to eight years

The data of primary concern here are the correlations in achievement across paradigms and the significant differences between age groups within paradigms. Age five is regarded as a “nominal” timepoint in that some children exhibit success at a younger or older age. For example, 11% of the three- and four-year-olds, and 71% of six-year-olds succeeded on Transitive Inference, and respectively 15% and 67% succeeded on Class Inclusion

A transitive inference has the general form that given

Transitivity is a property of relations, so a transitive inference is just a particular operation in relational algebra. In relational algebra, an

To contrast younger versus older children's performance, children were presented with difficult and simple versions of this paradigm

In a Class Inclusion task, participants are given examples of a superclass, and two complementary subclasses and asked about their relative sizes. For example, given the superclass,

Class inclusion is a property of sets, so a class inclusion inference involves a particular set operation—disjoint union. As we have seen, the disjoint union of two objects in the category of sets is the coproduct. Suppose, for example, the set of apple referents, or indices

The same groups of children who were tested on Transitive Inference were also tested on Class Inclusion

There is a subtle difference between the diagrams for Transitive Inference and Class Inclusion. Transitive Inference involves a

Transitive Inference and Class Inclusion are both difficult for children below about the age of five years. Our analysis indicates that underlying this common difficulty is a lack of capacity to compute categorical (co)products. In the remainder of this section, we analyze other tasks used to compare performance within and contrast performance between groups of younger and older children.
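The relational reading of Transitive Inference can be illustrated with a minimal sketch in Python. It assumes that premises are encoded as a set of ordered pairs (taller, shorter), and that the equijoin is computed in the category of sets as the set of premise pairs that agree on the shared middle term. The function names `equijoin` and `infer` are hypothetical, and the names in the premises follow the John, Mary and Sue example.

```python
# Premises as a binary relation: (taller, shorter) pairs.
PREMISES = {("John", "Mary"), ("Mary", "Sue")}


def equijoin(r, s):
    """Join r and s on the shared middle term.

    The result contains (a, b, c) whenever (a, b) is in r and (b, c) is in s,
    i.e., the pairs of premises that agree on b.
    """
    return {(a, b, c) for (a, b) in r for (b2, c) in s if b == b2}


def infer(premises):
    """Project each joined triple onto its end terms to get the inference."""
    return {(a, c) for (a, _, c) in equijoin(premises, premises)}


print(equijoin(PREMISES, PREMISES))
print(infer(PREMISES))
```

The inference step needs both premises at once: neither premise alone contains the pair relating John and Sue.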

In a modified version of Matrix Completion

Participants are presented with rows of items (e.g., four ducks, five frogs, or seven balls), and are asked three types of questions: (1)

In this task, participants are presented cards identifiable by coloured shapes on the visible side. Two target cards are placed on a table. Children play two sorting games by placing additional (sort) cards under one of the target cards based on the same colour (colour game), or same shape (shape game). Suppose, for example, the target cards were labeled

In a modified form of the balance-scale task, called weight-distance integration, participants were shown a one-arm balance and asked to predict the degree of tilt given a weight placed at a distance from the pivot

For this paradigm two sorts of tasks were employed: appearance-reality; and false-belief

The distinguishing characteristic at the heart of the behavioural difference between younger (less than five years old) and older (more than five years old) children is the categorical (co)product. In the case of Transitive Inference, Matrix Completion, and Card Sorting, this difference was realized by task design (e.g., one versus two relevant feature dimensions). In the case of Class Inclusion and Cardinality, this difference was realized by questions probing, for example, one versus two feature dimensions. And, in the case of Balance-scale and Theory of Mind, this difference was realized by alternative task strategies as inferred from the types of response errors. In each paradigm, the more difficult situation observed in the older children required access to a (co)product. By contrast, the less difficult situation observed in younger and older children involved directly accessing the component objects without computing or accessing a (co)product. These correspondences have been confirmed directly with the same participants performing multiple paradigms that included: Transitive Inference, Class Inclusion, and Cardinality

So far, our analysis has been confined to early development around the age of five, where the capacity to compute (co)products was identified as crucial. The more interesting statistic for our purposes is the correlation across paradigms, rather than a specific age of attainment. That is, for example, whether or not a four(six)-year-old who succeeds (fails) at Transitive Inference also succeeds (fails) at Class Inclusion. However, the simpler versions of these tasks often form baselines that are within the capacity of all children. In these situations,

A number of studies point to higher complexity levels, at least in adult cognition. For example, adults were tested on their ability to identify the number of interactions underlying fictitious data sets reported as bar graphs

In category theory, the (co)product extends naturally to any finite number of objects. Moreover, the degenerate case where the number of objects is one corresponds to the

In any category

The finite coproduct is defined similarly. In any category

There are four ways of constructing a product of three objects, and the product objects, though not equal, are isomorphic, i.e.,

There are also four ways of constructing a coproduct of three objects, and the coproduct objects are also isomorphic, i.e.,

All seven paradigms analyzed in the previous section can be extended in terms of (co)products of more than two objects. Only ternary (co)products are considered here, but extensions to more objects are also possible. We focus on Transitive Inference and Class Inclusion, and sketch extensions to the other paradigms. Transitive Inference can be extended to include an additional premise EF, and an additional nonadjacent test pair BE that requires two equijoins, for example, BC and CD to infer BD, and BD and DE to infer BE. In category theory terms, this inference involves three pullbacks, indicated by the diagram

Class Inclusion can be extended dually by supposing an additional subclass (e.g., squares). For example, participants are presented with small blue triangles (T), small red circles (C) and large red squares (S). They are asked: (1)

Notice that although this diagram involves a ternary coproduct, which is dual to a ternary product, the diagram itself is not dual to Diagram 31 for extended Transitive Inference. The reason is that the two initial objects in the extended Class Inclusion diagram are the same (and so too are the two morphisms with

An alternative version of extended Class Inclusion that uses constrained coproducts involves subclasses containing common elements. For example, suppose that instead we have a collection of small and large rectangular bars of various colours and orientations. Within this collection, three subclasses are relevant: small (

Matrix Completion, Dimensional Change Card Sorting, and Balance-scale involve similar extensions to ternary products, though the latter two are redesigned to accommodate all three levels of products (i.e., unary, binary, and ternary) within the one paradigm. For Matrix Completion, the figures vary along a third feature dimension, such as size. In this case, the task involves a ternary product of colour, shape and size, i.e.,

Theory of Mind and Cardinality involve more substantial changes, so we address these two tasks separately. Theory of Mind can be extended by including an additional transformation condition that involves mixing powdered chocolate, which changes the colour of milk to brown. In this case, there are two binary coproducts for separately combining the reality and filtered glass contexts, and reality and mixing contexts, and one ternary coproduct for combining all three contexts, as indicated by the following diagram

Cardinality in its current form, though, does not appear to have an extension to ternary coproducts. A possible alternative form, similar to extended Class Inclusion, requires participants to count various combinations of subclasses/superclasses (e.g., triangles, circles and squares). The diagram for this case involves a ternary coproduct like the one for unconstrained or constrained Class Inclusion (see Diagrams 32 and 33), where the objects are sets of indices and the coproduct is disjoint union (see Diagram 16).

For some tasks, there may exist alternative task strategies for achieving the same goals without exceeding capacity limits. In the context of Relational Complexity Theory, two general strategies were identified as segmenting and chunking

From the definitions, we saw four ways of constructing ternary (co)product objects. Although these objects are isomorphic, Diagrams 27 and 28 for constructing

A related situation also arises for Class Inclusion, which is shown in the following diagram

Using category theory constructs, we have revealed a formal connection between Transitive Inference and Class Inclusion. Transitive Inference involves a categorical product of premise relations. Class Inclusion involves a coproduct between two complementary subclasses. In category theory, product and coproduct are dual. Thus, the formal connection between Transitive Inference and Class Inclusion is that they involve the “same” (isomorphic) processes in the categorical sense. This connection extends to other tasks, establishing an equivalence class of inferential abilities formally based on the need to compute (co)products. In the simpler, one-dimensional version of Matrix Completion, the apparent product is isomorphic to a structure that does not involve a (co)product. Note that children are not required to first compute the (co)product to realize that it is reducible: they use a (co)product-free strategy, which works because of the simpler nature of the task. These results point to a fundamental capacity that develops during childhood: the capacity to compute (co)products.

The implication that computing (co)products is fundamental to cognitive development raises two general questions: (1) Is the connection between these inferential abilities real, or just a coincidence? (2) If the connection is real, what does computing a (co)product mean in terms of possible neurocognitive processes? To the first question, as with any theory, one cannot rule out the possibility of its being discounted by new data. The best one can hope for is to account for a wide variety of cases that are within the intended scope of the theory. In this regard, the empirical evidence now available and the variety of cases analyzed, both positive and negative conditions in each of seven paradigms, give us cause for confidence that the connection is indeed real.

There are several caveats, however, in regard to establishing correspondences between paradigms and age groups. First, as already mentioned, an important consideration is the correlation across paradigms, not a specific age of achievement. Second, task knowledge and familiarity with materials will obviously be modulating factors. Third, in some cases, there may exist alternative task strategies that circumvent a particular level of complexity, as shown in the extended versions of Transitive Inference and Class Inclusion. These sorts of considerations have been discussed elsewhere in the context of Relational Complexity theory

Category theory offers a potentially powerful approach to theorizing about cognition by not having to presuppose an, as yet, unknown internal structure for cognitive states representing task elements. Notice that the definition of a functor, and therefore duality (see

While the abstractness afforded by category theory is generally seen as a strength, it leaves open the question of what exactly is being computed in these situations. To the second question, then, we look to neuroscience. One of the major attractions of category theory for mathematicians and computer scientists is that it offers abstraction (hence, generalization)

Research on the neural basis of reasoning has focussed on localizing functionality to specific cortical regions, particularly within the prefrontal cortex. Yet, the commutative diagrams clearly show the importance of transformations between objects. One intriguing possibility is that the morphisms correspond to functional connectivity realized in part by long-distance cortical connections. An area where the neural basis of cognitive function has been studied in detail is visual attention (see

A recurring theme in our analysis of these tasks is the integration (either multiplicatively, or additively) of multiple sources of information. Regions within the prefrontal cortex are often assigned this role, both anatomically and functionally (see

More generally, we have used category theory to propose new experiments that directly test comparisons and contrasts for all levels. The basis for determining whether tasks belong to the same level is isomorphism, either between objects or the diagrams (categories) to which they belong. In regard to the latter, we identified a subtle difference between diagrams containing constrained versus unconstrained (co)products. This difference speaks to the potential power of category theory in that it affords a finer grained analysis within the major levels defined by (co)product arity (i.e., unary, binary, ternary, etc). Although further work is needed to ascertain the empirical implications of these differences both within and across higher levels, the examples provided show how this work may proceed.

There are two main types of predictions for these extended paradigms that follow naturally from the arities of computed (co)products. They are: (1) tasks involving (co)products of arity

One may wonder whether other category theory-based models could account for the same developmental data.

These two categorical bases for Transitive Inference (

Category theory affords a view of the forest despite the trees. It helps reveal unseen connections between (cognitive) structures. And, in doing so, the methods and results from one field become applicable to another. That was the original motivation for having a science of cognition.

Duality

(0.09 MB PDF)

We thank the reviewers for comments that have helped improve the presentation of this work, and the suggested link to the notion of cognitive flexibility.