Co-Housing Rodents with Different Coat Colours as a Simple, Non-Invasive Means of Individual Identification: Validating Mixed-Strain Housing for C57BL/6 and DBA/2 Mice

Standard practice typically requires the marking of laboratory mice so that they can be individually identified. However, many of the common methods compromise the welfare of the individuals being marked (as well as requiring time, effort, and/or resources on the part of researchers and technicians). Mixing strains of different colour within a cage would allow them to be readily visually identifiable, negating the need for more invasive marking techniques. Here we assess the impact that mixed strain housing has on the phenotypes of female C57BL/6 (black) and DBA/2 (brown) mice, and on the variability in the data obtained from them. Mice were housed in either mixed strain or single strain pairs for 19 weeks, and their phenotypes then assessed using 23 different behavioural, morphological, haematological and physiological measures widely used in research and/or important for assessing mouse welfare. No negative effects of mixed strain housing could be found on the phenotypes of either strain, including variables relevant to welfare. Differences and similarities between the two strains were almost all as expected from previously published studies, and none were affected by whether mice were housed in mixed- or single-strain pairs. Only one significant main effect of housing type was detected: mixed strain pairs had smaller red blood cell distribution widths, a measure suggesting better health (findings that now need replicating in case they were Type 1 errors resulting from our multiplicity of tests). Furthermore, mixed strain housing did not increase the variation in data obtained from the mice: the standard errors for all variables were essentially identical between the two housing conditions. Mixed strain housing also made animals very easy to distinguish while in the home cage. Female DBA/2 and C57BL/6 mice can thus be housed in mixed strain pairs for identification purposes, with no apparent negative effects on their welfare or the data they generate. 
This suggests that there is much value in exploring other combinations of strains.


Introduction
Individual identification provides the only link between a subject and the data collected from it. Many research paradigms and experiments therefore require the individual marking of laboratory rodents. Three broad methods are common: temporary markings (e.g. tail marking with a marker pen [1] or shaving a patch of hair [2]), permanent mutilations (e.g. ear notching [3] or toe clipping [4]), or the addition of permanent identification tags (e.g. tattooing [5] or micro-chipping [6]). Methods are constantly being refined and improved (e.g. [7]). Nevertheless, as we review below, all common marking methods have the potential to negatively impact animal welfare or influence the results obtained from the animals; they may also be laborious and/or costly for researchers. Temporary markings, for example, often need to be reapplied at regular intervals (e.g. [1]), especially in mice [8], which is time-consuming. Human handling and restraint are also aversive and stressful to mice [9][10][11], as is the scent of marker pen to rats [1]. Furthermore, rats tail-marked with ink show altered behaviour in standardized tests (being more likely to enter, and spend more time in, the open arms of an elevated plus maze [1]). Turning to mutilations, ear notching without analgesia causes acute pain, as evidenced by a short-term sympathetic stress response (assessed via increases in blood pressure [7]) and an increased number of audible vocalizations compared with sham-treated control mice [12] (audible vocalizations are an established indicator of pain in rodents [13]). In contrast, the toe clipping of neonatal mice (~5-7 days old) does not appear to induce a stress response any greater than regular handling does, nor to have any negative long-term consequences for health or performance [4,14]. 
However, some caution is needed here: there is a current lack of knowledge about the perception of pain in young rodents [15], and objectively assessing low to moderate pain in mice is also recognised as difficult [4,16]. Furthermore, evidence from rats indicates that toe-clipping can impair later performance in certain behavioural tasks, such as the grip suspension test or a swimming task [5]. Toe-clips and ear notches may also be hard for researchers to detect without very close proximity or handling, especially in animals within their home cages and/or under red light, in turn raising dangers of observer effects and making these marks inappropriate for identification in video recordings [8]. The last set of techniques is similarly invasive, but involves permanent identification tags such as tattoos and microchips. These methods require specialized equipment and some technical skill to administer. Traditional ear tattoo methods caused a significant acute increase in heart rate and blood pressure in rats (comparable to ear notching) [7], although apparently no long-term effects on growth, behaviour, or sensory-motor function [5,14,17]. Microchips are generally injected into the subcutaneous region of the dorsal surface of the rodent, sometimes with anaesthesia (e.g. [6,18]), sometimes without (e.g. [19]). Microchips can be extremely valuable when used with technologies allowing automatic collection of behavioural and physiological data (e.g. [6,20]), although they are obviously not appropriate when continuous visual/video monitoring is needed, because they are not detectable without a chip reader. In terms of animal welfare, injection of the device is likely to be painful if conducted without anaesthesia (e.g. [19]). Furthermore, microchips have been implicated in tumour development [21][22][23]. 
These have only been found in older animals in long-term studies, and typically the incidence rate is low (1-4%); still, because the prognosis for animals with foreign body tumours is typically poor [24], this raises welfare concerns for these older subjects, as well as suggesting that microchips may be inappropriate for long-term or oncological studies. Finally, the Federation of European Laboratory Animal Science Associations (FELASA) has recently published a comprehensive overview of the protocols and procedures associated with all of the above identification methods [25]. In it, they identify all permanent marking techniques, from mutilations to implants and tattoos, as painful upon application (unless analgesics are used), and thus potentially a welfare concern.
Here we propose a new approach that would eliminate the need to mark individual animals: mixing visually distinctive strains within cages. In mice, for example, there are hundreds of strains, many of which can be readily visually distinguished. Coat pigmentation, for instance, varies greatly as a function of genetic mutation [26]. Therefore, if mice from differentially pigmented strains were housed together, they could be distinguished as individuals. This would obviate the need for technical help in marking or for specialized equipment; eliminate concerns about pain or stress resulting from marking practices; and allow great ease of identification from a distance, within the home cage, under red light, in video recordings, and even by many video tracking systems (e.g. Noldus EthoVision® XT) if appropriately contrasting backgrounds are used. In addition, using multiple strains of mice increases systematic variation within animal experiments (compared to experiments that only use a single strain), which will in turn lead to greater reproducibility and external validity of results [27]. However, our proposed novel mixed strain approach would only be ethically acceptable if it can be shown not to cause new welfare concerns; and only scientifically acceptable if it does not alter animals' previously well-characterized phenotypes (e.g. [28]) or increase the variance of measured variables (so making it harder to detect significant effects) [29]. Therefore, in this preliminary study of two common strains we tested two hypotheses: that mixed strain housing affects the phenotypes of mice (including states related to welfare), and that mixed strain housing increases the variance in data obtained from the animals. We housed C57BL/6 (black) and DBA/2 (brown) females in either single or mixed strain pairs between 3 weeks (weaning age) and 22 weeks (when mice are well into adulthood), and took a total of 23 behavioural, physiological, morphological, and haematological measures.

Ethical Note
All procedures listed here were approved by the University of Guelph Animal Care Committee (Animal Utilization Protocol number: 1398) and comply with the Canadian Council on Animal Care guidelines.

Animals & Housing
31 female, non-related DBA/2 and 31 female, non-related C57BL/6 mice were purchased from Charles River Laboratories at three weeks of age. We chose these inbred strains, not just for their different coat colours, but also because they are both widely used, comparable in body weight [30], and similarly sociable [28]. We used females because they are commonly group-housed [31], necessitating individual identification, and because females make up a large proportion (approximately 70%) of the inbred mice sold by Charles River Laboratories (personal communication).
Upon arrival, mice were randomly divided into either same strain or mixed strain pairs. The day after arrival, all mice were given carprofen in their water supply, and the next day, once analgesia had been established [32], one mouse in each single strain cage was ear notched. Carprofen was continued for a day afterwards. Due to a few malocclusion cases, the final experimental setup comprised: 9 DBA/2 pairs, 8 C57BL/6 pairs, and 11 mixed strain pairs (total n = 56). Mice were all housed in conventional polysulfone plastic 'shoebox' cages (12 H × 27 L × 16 W cm; Allentown, Inc.) on shelves in a room kept at 21 ± 1°C and 48% relative humidity, with a 12-hour reverse light schedule (lights out at 10 am). The cages were arranged systematically along the shelves in a rotating pattern between the three different cage setups, so that all cage types were evenly represented on each of three shelves. The cages were furnished with corncob bedding, Shepherd Shack Envirodry® nesting material, a UDEL polysulfone plastic mouse house shelter and ad lib. food and water. The cages were completely cleaned once a week.

Preliminary Behavioural Data Collection
After six weeks of differential housing, preliminary home cage observations and behavioural tests were conducted for two weeks in order to ensure behavioural compatibility between cage mates, and also validate and finalize all testing protocols. During this time, it was determined that some mice were more active in the early part of the day and others during the later part, shaping our final test schedule (see below). Behavioural observers (MW & CF) were trained, and their independently-collected data were then compared for intra- and inter-observer reliability for all behavioural observations (p always < 0.05 for all variables by the end of training). For home cage data, 16 hours of observation over two days were also ascertained to be sufficient to produce reliable, consistent results. No aggression was observed between cage mates, and so they were left in their current pairs for an additional seven weeks before the final data collection phase. Data were collected in the order below, and no data were ever collected on a cage-cleaning day.

Home Cage Time Budgets During the Active Phase
Home cage observations were conducted in two four-hour blocks per day (12 pm-4 pm; 5 pm-9 pm) during the dark period, for two days. On Day 1, MW observed mice in the early block and CF observed them in the late block, this being reversed on Day 2. The observer, remaining silent, recorded the mice every 12 minutes during the block, using a mixture of focal and scan sampling [33], and following a previously determined, well-validated ethogram (see [34] for details). For analysis, behaviour types were pooled into three categories: normal activity (e.g. locomotion, grooming, eating/drinking), inactivity (e.g. standing still, sleeping), and stereotypic behaviour (e.g. repetitive route tracing, patterned climbing, involving elements repeated three or more times). Behaviours that did not fall within these categories, such as borderline stereotypies (i.e. only two repetitions of a behavioural pattern), were scored as 'ambiguous'. These behavioural variables were selected to allow comparison with published strain-typical values ([35,36]) and for their use in assessing mouse welfare [37].
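As a rough illustration of how scan records could be pooled into the three analysis categories, the Python sketch below tallies made-up scans for one mouse in one block. The behaviour labels, the mapping, and the function name are hypothetical and are not taken from the ethogram in [34]. Whether ambiguous scans remain in the denominator is not stated above; keeping them there is an assumption of this sketch.

```python
from collections import Counter

# Hypothetical mapping from ethogram labels to the pooled categories
# (labels are illustrative only, not the ethogram in [34]).
CATEGORY = {
    "locomotion": "normal activity",
    "grooming": "normal activity",
    "eating/drinking": "normal activity",
    "standing still": "inactivity",
    "sleeping": "inactivity",
    "route tracing": "stereotypic behaviour",
    "patterned climbing": "stereotypic behaviour",
}


def time_budget(scans):
    """Return the proportion of scans falling in each pooled category.

    Labels missing from CATEGORY are counted as 'ambiguous'.
    Assumption: ambiguous scans stay in the denominator.
    """
    counts = Counter(CATEGORY.get(label, "ambiguous") for label in scans)
    total = len(scans)
    return {category: n / total for category, n in counts.items()}


# One scan every 12 minutes in a 4-hour block gives 240 / 12 = 20 scans,
# assuming the block is divided evenly.
SCANS_PER_BLOCK = (4 * 60) // 12

# Made-up records for one mouse in one block.
example_scans = (["sleeping"] * 8 + ["locomotion"] * 6
                 + ["route tracing"] * 5 + ["two-repeat climb"] * 1)
budget = time_budget(example_scans)
print(SCANS_PER_BLOCK, budget)
```

Averaging such per-block proportions across the four blocks would give one time budget per mouse; that aggregation step is likewise an assumption rather than a description of the authors' procedure.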
Behavioural tests. For all tests, any test that required more than one trial was conducted at an early time one day (12 pm), and a later time on the next day (5 pm), so that all subjects would be assessed during one of their active times (see Preliminary behavioural data collection). Behavioural tests began after 13 weeks of differential housing and continued for three weeks, with no more than one test/trial being performed per day. Each test was selected to allow comparison with published strain typical values (e.g. [38]) and for their potential value in mouse welfare assessment (e.g. [39]).

Sucrose Consumption Test
Lower levels of sucrose consumption indicate increased anhedonia (e.g. [40]). This is usually assessed via ingestion of sucrose solution, but the use of solid sucrose is a validated alternative [41]. To collect individual data on sucrose consumption, mice were placed individually for 30 minutes in wire mesh compartments (0.64 cm × 0.64 cm mesh) that fitted inside their home cage, and contained a sugar lump, along with a normal food pellet (as an experimental control for feeding motivation). These compartments were designed to separate the mice physically while still allowing them visual and olfactory contact with each other. This was conducted for five consecutive days pre-test, to habituate mice so that stress and hyponeophagia responses would be minimized. Two test trials were then conducted, one on each of two consecutive days. The sugar lump was weighed before and after each of these tests, and an average taken to quantify sugar consumption per mouse. Mice were weighed at the end of the second trial so that body weight could be added to the statistical model as a blocking factor for the analysis. Finally, to check that the mesh compartments did not affect sucrose consumption, two pre-weighed sugar cubes were placed in the home cage for 30 minutes, on two consecutive days, with both mice thus allowed equal access (cf. e.g. [2]). These consumption values were regressed against the average values for both cage mates in the trials with the mesh compartments. Sugar consumption correlated strongly between the two types of test (R² = 0.43, F(1,22) = 14.09, p = 0.001), thus validating our new technique.

Novel Object Test
Long latencies to make contact with a novel object are typically interpreted as reflecting higher levels of anxiety or neophobia (e.g. [42]). To assess this, we used a previously determined protocol [2], involving exposing mice to a novel object in their home cage by inserting it through the cage lid. Two trials were conducted, one at 12 pm (using a standard wooden popsicle stick) and one the following day at 5 pm (using a white plastic fork). After an object had been used once, it was discarded, each cage always being tested with a new item, so that no odour cues were left on the object between cages. The maximum allowed duration was five minutes; any mouse making no contact at all was given the maximum score (300 seconds).

Startle Response Test
Large responses to sudden auditory tones reflect more anxious phenotypes [43]. Acoustic startle responses were assessed using four Kinder Scientific startle boxes and Startle Monitor software for analysis. The four startle boxes were calibrated prior to use using the protocol provided by the manufacturer. In batches of four, mice were each placed individually in one box such that they could move around but could not rear up, and were allowed to habituate to the box for 6 minutes (50 dB white background noise). At the 6-minute mark, a loud (115 dB for 40 ms) auditory tone was played in all four boxes simultaneously. The force generated by each mouse immediately prior to the tone was recorded (to account for the body weight of the mouse), as was the force generated by the mouse over the duration of the tone. The startle response was calculated as the peak force minus the initial force.
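The final calculation can be written out in a few lines. The Python sketch below is only an illustration of "peak force minus initial force"; the function name, the sample values, and the arbitrary force units are hypothetical, and it does not reproduce how the Startle Monitor software processes its recordings.

```python
def startle_response(baseline_force, forces_during_tone):
    """Peak force recorded during the tone minus the force just before it."""
    if not forces_during_tone:
        raise ValueError("no force samples recorded during the tone")
    return max(forces_during_tone) - baseline_force


# Made-up readings (arbitrary force units) for one mouse.
baseline = 0.25                       # force immediately before the tone
during_tone = [0.25, 0.75, 1.5, 1.0]  # samples over the 40 ms tone
response = startle_response(baseline, during_tone)
print(response)
```

Subtracting the pre-tone force means a heavier mouse resting on the sensor does not register as a larger startle simply because of its weight.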

Physiological, Haematological and Morphological Data
Baseline levels of faecal corticosterone metabolites. Faeces were collected from each mouse during the startle response test and then during a half hour period of isolation three days later. Rodents tend to defecate in response to stressors [44], and because corticosterone metabolites gradually accumulate in the faeces after a delay of several hours (reviewed [45]), this method is a good way to collect samples that reflect baseline levels of circulating corticosterone. The two samples were pooled per mouse and then frozen at -20°C until processed as follows: each sample was homogenized and an aliquot of 0.05 g was shaken with 1 ml of 80% methanol; after centrifugation, an aliquot of the supernatant was diluted (1:20) with assay buffer and frozen at -20°C until analysis. A 5α-pregnane-3β,11β,21-triol-20-one EIA, which has proven well suited to assess corticosterone metabolites in mouse faeces, was used for analysis (for details see [46]; for validation for mice, see [47]). Nine mice did not produce enough faeces for a complete assay, so were not counted in the analysis.

Body Condition
Mice were weighed immediately prior to euthanasia so that we could use body weight as a dependent variable, and also include it as a blocking factor in the model for spleen weight. All mice were euthanized three weeks following the end of behavioural testing, and a gross examination of body condition was done, specifically looking for bite marks/wounds and evidence of barbering (an abnormal behaviour where a mouse will pluck the whiskers or body hair from itself or a cage mate [48]).

Post Mortem Measures
Euthanasia was conducted by cervical dislocation after 19 weeks of differential housing, and was performed by a trained technician. Immediately following death, a blood sample was taken via cardiac puncture. A small sample of blood was used to determine blood glucose, using a Contour® blood glucose meter; the rest of the sample was put into a heparinized tube (~50 µL). After this, the mouse was dissected and the spleen was removed and weighed. Spleen mass is likely to reflect immune status in mammals (larger spleens suggest higher immune-competence) [49], and also possibly inherently differs between C57BL/6 and DBA/2 mice [50]. Heparinized blood samples were sent to the University of Guelph Animal Health Laboratory for a Complete Blood Count analysis. Ten samples were lost due to clotting prior to analysis (six 'single' DBA/2; two 'mixed' DBA/2; two 'single' C57BL/6).

Statistical Analysis
All analyses were conducted in JMP® 10. General linear models (GLMs) were used to test all hypotheses (except where otherwise indicated), and to run the behavioural consistency checks mentioned in the Methods. Originally, ear notching (Y/N) was included in all models, but this was never a significant effect (p always > 0.10) and so was removed. The GLM used for each dependent variable was similar. Cage was included as a blocking factor in order to avoid pseudoreplication, because mice housed in the same cage are non-independent (see [51,52]), and was set as a random effect [53]. Strain and Cage Type were both nested within Cage (Cage Type being either single or mixed strain). In certain cases, additional terms were added to reflect other variables considered necessary as controls in specific analyses (e.g. body weight in the sucrose consumption analysis). Type 3 sums of squares were used except when there was a continuous variable in the model (causing non-orthogonality), in which case Type 1 sums of squares were used, with each term of interest being placed last in the model in turn [29]. Data were transformed where necessary to fit the parametric assumptions of GLMs. If mixed strain housing altered phenotypes, Cage Type would have significant effects; and if mixed strain housing altered the magnitude of strain differences (a more important concern), Cage Type*Strain would be significant. Although a total of 69 p-values were generated during hypothesis testing, we did not control for multiple testing; this was to increase our ability to detect any effects of mixing strains, although it potentially made us vulnerable to Type 1 errors (see Discussion).
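To give a sense of the scale of the Type 1 error risk, the Python sketch below works through the arithmetic for 69 uncorrected tests. Two things here are assumptions rather than statements from the text: that the 69 p-values arise as 23 variables times three model terms (Strain, Cage Type, and their interaction), and that the significance threshold is 0.05. The familywise figure additionally assumes independent tests, which measures taken from the same mice will not satisfy, so it is illustrative only.

```python
N_VARIABLES = 23     # dependent variables measured
TERMS_PER_MODEL = 3  # assumed: Strain, Cage Type, Cage Type*Strain
ALPHA = 0.05         # assumed per-test significance threshold

n_tests = N_VARIABLES * TERMS_PER_MODEL

# Expected number of false positives if every null hypothesis were true.
expected_false_positives = n_tests * ALPHA

# Chance of at least one false positive, assuming independent tests.
familywise_error = 1 - (1 - ALPHA) ** n_tests

# Per-test threshold a Bonferroni correction would have imposed.
bonferroni_alpha = ALPHA / n_tests

print(n_tests, expected_false_positives)
print(round(familywise_error, 3), bonferroni_alpha)
```

Under these assumptions a handful of nominally significant results would be expected by chance alone, which is why isolated significant effects and trends are treated as needing replication.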
To investigate the impact of mixed strain housing on the variability of measures, we ran three additional tests on the standard errors of the dependent variables. The 23 standard error values for each housing type were used in a GLM to test for differences between Cage Types, blocking by strain; and also to assess their co-variance. Since the slope of the relationship between the two sets of values did not vary with Strain (see Results), both strains were pooled to enable a linear regression in which we tested the null hypothesis that the slope of the line was 1.
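The slope test described above can be sketched with ordinary least squares. The Python example below is a stand-in written with the standard library only, not the JMP procedure used for the analysis; the function name and the five data points are made up. It returns the F statistic for the null hypothesis that the slope equals 1, which has (1, n - 2) degrees of freedom, so 23 pairs of standard errors would give (1, 21). Converting F to a p-value needs the F distribution and is left out.

```python
def slope_test(x, y, null_slope=1.0):
    """Least-squares fit of y on x, plus an F test of H0: slope == null_slope.

    Returns (slope, intercept, f_statistic, residual_df).
    """
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need at least three paired observations")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    residual_df = n - 2
    slope_variance = rss / residual_df / sxx  # squared standard error
    f_statistic = (slope - null_slope) ** 2 / slope_variance
    return slope, intercept, f_statistic, residual_df


# Made-up standard errors: single strain housing (x) vs mixed strain (y).
se_single = [0.5, 1.0, 2.0, 4.0, 8.0]
se_mixed = [0.6, 0.9, 2.1, 4.2, 7.8]
slope, intercept, f_stat, df = slope_test(se_single, se_mixed)
print(slope, intercept, f_stat, df)
```

A slope close to 1 with a small F statistic would indicate that standard errors scale the same way in both housing conditions.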

Home Cage Time Budgets
Behavioural consistency between days proved to be very high (inactivity: F(1,52) = 33.12, p < 0.001; normal activity: F(1,52) = 21.59, p < 0.001; stereotypic behaviour: F(1,52) = 44.57, p < 0.001). Because ambiguous behaviours were rare (<5% of observations), they were not included in any analyses. The two strains differed in time budgets, with DBA/2s being more stereotypic, and hence less inactive as well as spending less time in normal activity (Table 1). However, the magnitude and direction of strain differences were unaffected by mixed strain housing: Cage Type*Strain never approached significance (Table 2). The one possible main effect of Cage Type on both strains was a trend for mice in mixed strain cages to be less stereotypic than their same-strain peers in single strain cages (Table 3). Aggressive interactions were never observed (nor did the animal care technician ever report any behavioural issues over the duration of the experiment).

Behavioural Tests
Again, marked strain differences were evident, at least in the two tests related to fear and anxiety (novel object test and startle response test); DBA/2 mice had shorter latencies to touch the novel objects and were less reactive in the startle response test (Table 1). However, mixed strain housing had no influence on results (Table 2). Anhedonia was unaffected by Strain, Cage Type, or their interaction. This result was consistent whether or not 'body weight' was included in the model (not in practice a predictor of

Physiological, Haematological, and Morphological Variables
Strain affected haematocrit, haemoglobin, and mean corpuscular volume (Table 4), and levels of faecal corticosterone metabolites (FCM; Table 1); strain also showed a trend to affect blood glucose (Table 1). However, like the behavioural measures, these strain effects did not interact with Cage Type (Tables 2 & 5). Cage Type had one significant main effect: red blood cell distribution width was significantly higher in single strain pairs (Table 6). Cage Type showed weak trends to affect basophil counts, single-strain mice having lower levels, and to affect body weight, with mice in single strain pairs being slightly heavier (Table 3); however, there were no interactions between these measures and Strain. The blood glucose result was unchanged by the inclusion of 'time since food removal' (a significant influence on glucose [F(1,35) = 6.98, p = 0.012]), and spleen weight was unchanged by the inclusion of 'body weight' (a significant predictor of spleen weight [F(1,34) = 48.84, p < 0.001]), so we left them in the model to best test our hypotheses by taking biological confounds into account. No evidence of bite marks, wounds, or barbering was found post mortem.

Effects of Mixed Strain Housing on Variance
There were no significant differences in the variables' standard errors between the two Cage Types (F(1,88) = 0.11, p = 0.738). The standard errors co-varied closely between the two Cage Types (F(1,42) = 641.6, p < 0.001) and were not affected by Strain (F(1,42) = 0.40, p = 0.53). Furthermore, the linear regression of one Cage Type against the other (Fig. 1) revealed that the slope of the relationship did not differ from one (F(1,21) = 2.88, p = 0.104).

Discussion
Several indicators were used to determine the impact of mixed strain housing on mouse welfare, namely stereotypic behaviours and barbering, anhedonia and anxiety/fear under test, faecal corticosterone metabolites (FCM), and body condition (including weight). None was significantly affected by mixed strain housing. Two trend effects suggested mixed strain mice to be less stereotypic but have smaller body weights than their single strain peers (although because we did not correct for multiple comparisons these may be Type 1 errors, and so these results need replicating). Notably, there was a complete lack of aggressive interactions, barbering, and wounds, indicating good behavioural compatibility between all cage mates, regardless of whether housed with a like strain. Thus overall, being in mixed strain C57BL/6-DBA/2 pairs did not compromise the welfare of our subjects.
Our second concern was that mixed strain housing might affect normal strain effects on phenotype: thus expected differences between DBA/2 and C57BL/6 mice could be altered in magnitude or even direction by mixed strain housing. There was no evidence of this. Looking first at the indicators that were used to evaluate welfare, stereotypic behaviours (e.g. route tracing) were performed more frequently by DBA/2 mice than C57BL/6 mice, as expected from previous studies [35]. DBA/2 mice were also bolder in the novel object tests and less reactive in the startle response tests, indicating lower levels of 'trait' anxiety (cf. 'state' anxiety) [54,55], consistent with known strain differences in startle responses [56] as well as with data from open field tests measuring the same trait [57,58]. Again this strain difference was similarly expressed in single- and mixed-strain pairs, as was a strain difference in FCM: DBA/2 mice had higher baseline FCM levels than C57BL/6s, regardless of housing type, a result consistent with known strain differences in endocrine response to stressors such as restraint [59,60]. Body weights in contrast did not differ between strains, regardless of how housed: this lack of strain effect was, again, an expected finding [30]. Finally, no effect of strain or its interaction with cage type was found on anhedonia either. Other studies had found significant strain differences between C57BL/6 and DBA/2 mice (e.g. [38]), but only in animals subjected to unpredictable chronic mild stress; in our housing conditions the lack of strain difference in this variable was therefore again an expected finding.
A further 17 variables were quantified, including blood glucose, spleen weight, home cage activity and inactivity levels, and numerous haematological measures. Once again, no strain-by-cage type interactions were found: any strain differences detected were thus as expected, and all were stable across mixed- and single-strain housing. One such effect was a strong trend for C57BL/6 mice to have higher blood glucose (regardless of housing type): a strain difference consistent with published literature [61]. C57BL/6 mice also had higher haemoglobin and haematocrit levels, and higher mean corpuscular volume, but lower levels of eosinophils than DBA/2 mice, again regardless of housing type, and all consistent with the strain differences reported in The Jackson Laboratory's mouse phenome database [62]. One surprising finding was that spleen weight did not differ between these two strains (cf. [50]), although the non-significant effect was in the predicted direction (with DBA/2s having higher values). This could reflect low power, or instead that the previous findings from males [50] do not apply to females. A second surprise was the emergence of one main effect of cage type: mice housed in single strain pairs had significantly higher red blood cell distribution widths (RDW) compared to peers in mixed strain cages. RDW, a measure of the variation in red blood cell size, was found to be a significant predictor of all-cause mortality in a long-term study on humans [63]. Like our stereotypy finding, this suggests that mixed-strain housing may have some benefits, although likewise it should be treated with caution until replicated (as a potential Type 1 error). 
Overall, the fact that there were no strain-by-cage type interactions for any of the 23 variables measured, and that many well-established strain differences were maintained in our mixed strain pairs, indicates that the mixed strain housing used here has no readily detectable effects on mouse phenotype.
Our third research question was whether this form of mixed-strain housing would adversely affect inter-individual variation, so potentially increasing the numbers of subjects needed to detect significant effects. We found no evidence that mixed-strain housing increases data variability: for all variables, the standard errors of data from mixed-strain-housed mice proved extremely similar to those from same-strain-housed animals. If data variance had been increased by mixed-strain housing, then using this paradigm would mean more animals would be needed in order to obtain the same degree of statistical power as single-strain housing: not cost-effective and a clear violation of the 3Rs [64]. However, that this was not the case suggests that researchers can utilize mixed C57BL/6 and DBA/2 females without increased variability compromising the statistical power of their experiments.
One other finding was of note. We found that ear notching did not affect any variable measured. This suggests there are no long-term consequences of ear notching, at least when applied with concurrent analgesia (although without analgesia this method still causes acute pain and thus constitutes a welfare issue [7,12,25]).
Of course, that our experiment failed to find any adverse effects of mixed strain housing does not mean that none are possible. It is possible that effects were very subtle (only detectable with larger sample sizes) or that other traits, ones we did not measure, were altered by our mixed strain paradigm. It is also possible that welfare would have been compromised, strain-typical phenotypes altered, or data rendered more variable, had the experiment gone on longer, or started at an earlier age (perhaps via cross-fostering dependent pups, cf. [65]) or had we used male subjects (cf. [66]). Finally, it is likely that not all mouse strains would cohabit in such a problem-free way, especially strains with large differences in body weight and/or temperament (see [67] for strain-typical differences in male aggression). For example, in a similar experiment [66], mixing C57BL/6 with 129S mice did cause significant changes in the 129S animals' home-cage social and feeding behaviour, and anxiety-like responses in open field tests (with anxiety-like behaviours in C57BL/6 mice also potentially modified by the mixed-strain housing, in a manner determined by a subject's weaning weight). Thus, it would be rash to generalize from our results to all strains/sexes/ages/etc., and more research is now needed on a range of other strains and housing/rearing conditions, as well as on male mice.
Mixed strain housing may not be appropriate for all research programs, and we do not advocate that it be adopted without further study by researchers interested in other models or variables beyond those used here. It is obviously unusable for all research involving single-housed animals (e.g. aggressive males). It is useless, like other simple marking schemes (e.g. tail marking; shaving; simple ear-notching), to anyone who needs colony level unique IDs (cf. cage level unique IDs); and like these methods requires extra care that animals' cage identities are always known. Gastrointestinal microflora typically differ between strains [68,69], and cross-contamination would be possible in mixed strain cages: a potential confound in certain areas of research (e.g. immunology; gastroenterology). Mixed strain housing may also affect, but perhaps even render more normal, the social behaviour of mice: inbred mice have trouble distinguishing their own scent marks from those of genetically identical cage mates [70], and so mixed strain housing may facilitate more natural social behaviour and less aggression. This requires investigating, partly for its positive welfare implications, but also because it may alter results of tests reliant on social interactions. As a final caution, because individual mice are so conspicuous when housed like this, data collectors may need to be blind to the hypothesis (rather than to the treatment, which may now be challenging) to keep data collection unbiased.
Nevertheless, as a proof of principle and a first step in validating a refinement in laboratory mouse husbandry, this study shows that co-housing mouse strains with different coat colours can potentially be practical and safe. Specifically, researchers using female C57BL/6 and DBA/2 mice can house them together from weaning into young adulthood and still expect to replicate strain-typical results without compromising welfare. For mice housed in pairs, this practice then obviates the need for other marking techniques, with all their potential drawbacks (see Introduction), and subjectively we also found that distinguishing individuals in our mixed strain cages was far easier than relying on ear notches, as we had to for conventionally housed subjects. Therefore, in a world where group housing mice is generally both good for welfare (reviewed [71]) and sensible economically, where we still need individual-level data, and where external validity is improved by using multiple strains [27], mixed strain housing, at least for C57BL/6 and DBA/2 females, represents a new, ethically preferable, and practically and scientifically valuable way to identify individuals. There is now value in exploring other combinations of differentially-pigmented strains, especially those that are similar in aggression (see [67] for example) and body weight, so most likely to cohabit with negligible impact.