Characterization of two family AA9 LPMOs from Aspergillus tamarii with distinct activities on xyloglucan reveals structural differences linked to cleavage specificity

Aspergillus tamarii grows abundantly in naturally composting waste fibers of the textile industry and has a great potential in biomass decomposition. Amongst the key (hemi)cellulose-active enzymes in the secretomes of biomass-degrading fungi are the lytic polysaccharide monooxygenases (LPMOs). By catalyzing oxidative cleavage of glycosidic bonds, LPMOs promote the activity of other lignocellulose-degrading enzymes. Here, we analyzed the catalytic potential of two of the seven AA9-type LPMOs that were detected in recently published transcriptome data for A. tamarii, namely AtAA9A and AtAA9B. Analysis of products generated from cellulose revealed that AtAA9A is a C4-oxidizing enzyme, whereas AtAA9B yielded a mixture of C1- and C4-oxidized products. AtAA9A was also active on cellopentaose and cellohexaose. Both enzymes also cleaved the β-(1→4)-glucan backbone of tamarind xyloglucan, but with different cleavage patterns. AtAA9A cleaved the xyloglucan backbone only next to unsubstituted glucosyl units, whereas AtAA9B yielded product profiles indicating that it can cleave the xyloglucan backbone irrespective of substitutions. Building on these new results and on the expanding catalog of xyloglucan- and oligosaccharide-active AA9 LPMOs, we discuss possible structural properties that could underlie the observed functional differences. The results corroborate evidence that filamentous fungi have evolved AA9 LPMOs with distinct substrate specificities and regioselectivities, which likely have complementary functions during biomass degradation.


Introduction
For many years, enzymatic conversion of plant polysaccharides was thought to be achieved exclusively by a consortium of hydrolytic enzymes, i.e. glycoside hydrolases (GHs) such as [...]
[...] total, seven expressed AA9 genes were identified in the transcriptome datasets, with sequences of the predicted proteins listed in S1 Table in S1 Appendix. Sequences were analyzed for domains corresponding to carbohydrate-active enzymes or domains of unknown function using the dbCAN2 metaserver [40] and by multiple sequence alignment with similar domains in other proteins in the UniProt database using MUSCLE [41]. The genes encoding full-length AtAA9A, AtAA9B, and AtAA9G, excluding introns but including the native signal peptide, were codon optimized for Pichia pastoris (GenScript, Piscataway, NJ, USA). The synthetic genes were inserted into the pPink-GAP vector as previously described [42]. To generate truncated proteins containing the catalytic domain only, gene fragments encoding the AA9 domains of AtAA9A (AtAA9A-N; 702 nucleotides, encoding 234 residues) and AtAA9B (AtAA9B-N; 732 nucleotides, encoding 244 residues) were PCR amplified from the pPINK-GAP-AtAA9A and pPINK-GAP-AtAA9B constructs, respectively. PCR products were ligated into the pPINK_-GAP_TaCel5A vector [42] using restriction enzymes EcoRI and Acc65I and the In-Fusion HD cloning kit (Clontech Laboratories, Mountain View, CA). The expression vectors were transformed into P. pastoris PichiaPink™ cells (Invitrogen, Carlsbad, CA, USA) and transformants were screened for protein production in BMGY medium as previously described [42].
For the production of AtAA9A-N and AtAA9B-N, the transformants with the highest expression of each protein were grown in 25 mL of BMGY medium (containing 1% (v/v) glycerol) in 250-mL Erlenmeyer flasks at 29˚C and 200 rpm for 16 h. These pre-cultures were subsequently used to inoculate 500 mL BMGY medium (containing 1% (v/v) glycerol) in 2-L Erlenmeyer flasks, then incubated at 29˚C and 200 rpm for 48 h. After 24 h of incubation, the media were supplemented with 1% (v/v) glycerol. Cells were removed by centrifugation at 8,000 × g for 15 min at 4˚C. The supernatants were dialyzed against 20 mM Tris-HCl pH 8.0 until they reached a conductivity of 4 mS cm⁻¹, then concentrated to 50 mL using a VivaFlow 200 tangential crossflow concentrator (MWCO 10 kDa, Sartorius Stedim Biotech GmbH, Germany).

Purification and Cu(II) saturation
AtAA9A-N and AtAA9B-N were purified using a two-step purification protocol, starting with anion exchange chromatography followed by size exclusion chromatography. The concentrated broth after buffer exchange (see above) was loaded onto a 5-mL Q Sepharose FF column (GE Healthcare BioSciences AB, Sweden) equilibrated with 20 mM Tris-HCl pH 8.0. The bound proteins were eluted by applying a linear gradient from 0 to 0.5 M NaCl in the same buffer. Fractions were analyzed by SDS-PAGE and those containing AtAA9A-N or AtAA9B-N were then pooled, dialyzed against 20 mM BisTris-HCl pH 6.0 and concentrated to 2 mL using Amicon Ultra centrifugal filters (MWCO 10 kDa, Merck Millipore, Carrigtwohill, Ireland). The concentrated samples were applied to a 120-mL Superdex 75 16/600 gel filtration column (GE Healthcare BioSciences AB) in 20 mM BisTris-HCl pH 6.0 supplemented with 150 mM NaCl. Protein purity was analyzed by SDS-PAGE and the fractions containing AtAA9A-N or AtAA9B-N were pooled and concentrated to 1 mL using Amicon Ultra centrifugal filters (MWCO 10 kDa, Merck Millipore), followed by sterilization by filtration through a 0.2-μm syringe filter. Protein concentrations were determined by measuring absorbance at 280 nm, using theoretical extinction coefficients calculated with the ExPASy server [43] (AtAA9A-N, 39545 M⁻¹ cm⁻¹; AtAA9B-N, 41620 M⁻¹ cm⁻¹). AtAA9A-N and AtAA9B-N were saturated with Cu(II) by incubating the enzymes with an excess of CuSO₄ (3:1 molar ratio of copper to enzyme) for 30 min at room temperature, as described previously [44]. The solution was then loaded onto a PD MidiTrap G-25 desalting column (GE Healthcare, UK), equilibrated with 20 mM BisTris-HCl pH 6.0. Fractions containing AtAA9A-N and AtAA9B-N, eluted with 1 mL of the same buffer, were collected and stored at 4˚C before further use.
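The concentration determination and the copper dosing described above follow from the Beer-Lambert law, c = A / (ε · l). The sketch below is illustrative only: the extinction coefficients and the 3:1 copper-to-enzyme ratio are taken from the text, whereas the absorbance reading, the 1-cm path length, and the function names are assumptions made for the example.

```python
# Beer-Lambert estimate of protein concentration, c = A / (epsilon * l),
# and the Cu(II) concentration needed for a 3:1 copper-to-enzyme ratio.

# Theoretical extinction coefficients at 280 nm from the text (M^-1 cm^-1).
EXTINCTION_M1_CM1 = {"AtAA9A-N": 39545.0, "AtAA9B-N": 41620.0}


def protein_conc_uM(a280, protein, path_cm=1.0):
    """Return the protein concentration in micromolar from an A280 reading.

    path_cm defaults to 1 cm, which is an assumption (not stated in the text).
    """
    epsilon = EXTINCTION_M1_CM1[protein]
    return a280 / (epsilon * path_cm) * 1e6


def copper_conc_uM(protein_uM, molar_ratio=3.0):
    """Return the CuSO4 concentration giving the stated copper:enzyme ratio."""
    return protein_uM * molar_ratio


if __name__ == "__main__":
    a280_example = 0.50  # hypothetical reading, for illustration only
    conc = protein_conc_uM(a280_example, "AtAA9A-N")
    print(f"AtAA9A-N: {conc:.2f} uM protein, {copper_conc_uM(conc):.2f} uM CuSO4")
```

With the hypothetical reading of 0.50, the AtAA9A-N concentration comes out at roughly 12.6 μM, so about 38 μM CuSO₄ would correspond to the 3:1 ratio.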

In silico analysis of A. tamarii LPMOs
For phylogenetic analysis, a multiple sequence alignment of AA9 domains (without the C-terminal extension) was generated using MUSCLE [41] and the phylogenetic tree was generated using Interactive Tree of Life (iTOL) [45]. The sequence alignment of xyloglucan-active LPMOs was generated using T-Coffee's Expresso tool [46]. Structural models of xyloglucan-active LPMOs were generated with SWISS-MODEL [47], using the templates with PDB IDs 3ZUD (for AtAA9B-N and FgAA9A-N), 4B5Q (for

Enzyme reactions
Reaction mixtures, in 200 μL total volumes, contained 0.2% (w/v) PASC or 1% (w/v) of the other substrates, and 1 μM of AtAA9A-N or AtAA9B-N in 50 mM of BisTris-HCl pH 6.0, supplied with 1 mM ascorbic acid as indicated. Purified recombinant cellobiose dehydrogenase (MtCDH) from Myriococcum thermophilum [49], at a concentration of 1 μM, was also used as an electron donor in reactions with PASC, instead of ascorbic acid. Samples were incubated at 37˚C with shaking at 1000 rpm for 18 h. After incubation, soluble and insoluble fractions were separated using a 96-well filter plate (Merck Millipore) and a Merck Millipore vacuum manifold. C4-oxidized cello-oligosaccharide standards were produced by incubating PASC with 1 μM NcAA9C [50], using the same conditions as for AtAA9A-N and AtAA9B-N. C1-oxidized standards were generated in the same manner, using 1 μM NcAA9F [51]. Product formation was analyzed by high-performance anion-exchange chromatography (HPAEC) and MALDI-TOF mass spectrometry (MS), as described below.

Time course analysis and quantification of released oxidized products
Reaction mixtures with 0.2% (w/v) PASC, 1 μM of AtAA9 and 1 mM ascorbic acid were set up in 800 μL total volumes in 50 mM of BisTris-HCl pH 6.0 and incubated as specified above. Samples (150 μL) were collected after 20, 40, 60, 120, and 240 min of incubation and boiled at 97˚C for 10 min to stop the reaction. Soluble and insoluble fractions were separated using a 96-well filter plate (Merck Millipore) and a Merck Millipore vacuum manifold. Next, 25 μL of the soluble fractions were supplemented with 1 μL TrCel7A in 150 mM Na-acetate pH 4.75 (to a final concentration of 1 μM), followed by incubation at 37˚C for 18 h in order to convert the solubilized oxidized oligosaccharides to the corresponding oxidized dimers. After the incubation, the samples were incubated at 97˚C for 10 min to stop the reaction. For product quantification, cellobionic acid (as C1-oxidized) and C4-oxidized dimer standards were prepared as described before [5,52].
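The volumes given for the TrCel7A treatment imply a small dilution of the sample, which can be worked out with C1·V1 = C2·V2. The sketch below only illustrates that arithmetic. The stock concentration it returns is not stated in the text, and whether product concentrations were corrected for this dilution is likewise not stated; the function names are hypothetical.

```python
# Dilution arithmetic (C1 * V1 = C2 * V2) for adding a small enzyme volume
# to a sample, using the volumes given in the text (25 uL sample + 1 uL enzyme).


def stock_conc_uM(final_uM, sample_uL, added_uL):
    """Stock concentration needed so that added_uL gives final_uM in the mix."""
    total_uL = sample_uL + added_uL
    return final_uM * total_uL / added_uL


def dilution_factor(sample_uL, added_uL):
    """Factor by which analytes in the original sample are diluted."""
    return (sample_uL + added_uL) / sample_uL


if __name__ == "__main__":
    print(stock_conc_uM(1.0, 25.0, 1.0))   # implied enzyme stock, in uM
    print(dilution_factor(25.0, 1.0))      # dilution of the soluble products
```

Under these assumptions, a 1 μM final concentration in 26 μL implies a 26 μM enzyme stock, and the soluble products are diluted by a factor of 1.04.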

Analysis of enzyme products
Native and oxidized oligosaccharides were analyzed by HPAEC using a Dionex ICS-5000 system equipped with pulsed-amperometric detection (PAD) and a CarboPac PA1 analytical column with a CarboPac PA1 guard column (Dionex, Sunnyvale, CA, US). A 0.25 mL/min flow and 50-min gradient were employed as previously described [53]. Additional product analysis was performed by MALDI-TOF MS, using an Ultraflex MALDI-TOF/TOF instrument (Bruker Daltonics, Bremen, Germany) equipped with a nitrogen 337-nm laser beam, as described previously [1]. Prior to MALDI-TOF MS analysis, samples (1 μL) were spotted on an MTP 384 ground steel target plate TF (Bruker Daltonics) together with 1 μL of a saturated 2,5-dihydroxybenzoic acid solution and dried.

Amino acid sequence analysis of A. tamarii AA9s
The genome of A. tamarii CBS 117626 has only recently been published, and it contains nine predicted proteins annotated as AA9 LPMOs [54]. Previous analysis of the transcriptome of A. tamarii BLU37 during cultivation on steam-exploded sugarcane bagasse as exclusive lignocellulosic carbon source [39] revealed seven expressed genes encoding putative AA9 enzymes, which we named AtAA9A, AtAA9B, AtAA9C, AtAA9D, AtAA9E, AtAA9F, and AtAA9G (see S1 Table for the predicted sequences, S2 Table for related LPMOs, including AA9s found in the A. tamarii CBS 117626 genome, as well as the closest related characterized LPMOs from aspergilli, and S3 Table for predicted properties in S1 Appendix). All seven AA9 LPMOs are secreted, as predicted using the SignalP program [55]. Of these, AtAA9D is a fragment only, AtAA9E is a single-domain LPMO, AtAA9A and AtAA9G carry a C-terminal CBM1, whereas AtAA9B, AtAA9C, and AtAA9F carry a 129-, 78- and 61-amino acid extension, respectively, at the C-terminal end, none of which are similar to any previously described domain (S1 and S3 Tables and S1 Fig in S1 Appendix). The C-terminal extension of AtAA9F seems to be a region of low complexity, while AtAA9B and AtAA9C are likely to carry small C-terminal domains of unknown function (S3 Table in S1 Appendix). Blasting the C-terminus of AtAA9B against the UniProt database (E = 0.001) resulted in 98 hits, all of which were LPMO sequences, with 96 originating from Aspergillus and Penicillium species (S1A Fig in S1 Appendix). This C-terminus potentially encodes a novel carbohydrate-binding module (CBM) that is characteristic of these species; its sequence features are highlighted in the Discussion below. Analysis of the C-terminus of AtAA9C against the UniProt database and the recently published A. tamarii genome [54] revealed that it may be a truncated version of a domain of unknown function.
Comparison of the C-terminus of AtAA9C with the C-termini of proteins sharing >90% identity, which were all LPMOs from Aspergillus species, indicated that, in the AtAA9C sequence derived from the RNA-seq data, this domain lacks ca. 50 amino acids (S1B Fig in S1 Appendix).
Multiple sequence alignment of the AA9 domains with the AA9 LPMOs characterized to date revealed that the closest characterized relative of AtAA9A is LsAA9A, a C4-oxidizing LPMO from Lentinus similis (UniProt ID, A0A0S2GKZ1; 58% sequence identity) [56], while the closest relative of AtAA9B is TaAA9A, a well-studied C1/C4-oxidizing LPMO from Thermoascus aurantiacus (UniProt ID, G3XAP7; 71% and 69% sequence similarity, respectively) [2,57] (S2 Fig and S2 Table in S1 Appendix). The closest characterized relatives of the other predicted LPMOs are listed in S2 Table in S1 Appendix. While sequence similarities may be indicative of regioselectivity and substrate specificity, they are not 100% predictive (S2 Fig in S1 Appendix and as discussed below).

Heterologous expression of A. tamarii AA9s
Of the seven AA9 LPMOs identified in the transcriptome of A. tamarii BLU37, five were upregulated after 48 hours when growing A. tamarii on sugarcane bagasse [39] (S2 Table in S1 Appendix), indicating a role in lignocellulosic biomass degradation. As a first step to understanding the LPMO potential of A. tamarii, we attempted to clone three of the five upregulated LPMOs, namely AtAA9A, AtAA9B and AtAA9G, with and without the C-terminal extension after the AA9 domains. AtAA9D was omitted because the sequence was incomplete, and AtAA9E was omitted because of sequence ambiguities. We successfully expressed in P. pastoris the catalytic domains of two of the three other LPMOs, namely AtAA9A-N and AtAA9B-N. The catalytic domain of AtAA9G was also expressed but in low quantities so we decided to focus on AtAA9A-N and AtAA9B-N. AtAA9A-N and AtAA9B-N were purified to homogeneity using two chromatographic steps (S3 Fig in S1 Appendix), and further characterized.
Electrophoretic analysis revealed that recombinant AtAA9A-N and AtAA9B-N had a slightly higher apparent molecular mass (27 kDa and 29 kDa, respectively) than the theoretical values (23 kDa and 24 kDa, respectively). This modest difference could be due to low levels of glycosylation. According to the NetNGlyc v1.0 [58] and NetOGlyc v4.0 [59] servers of the Technical University of Denmark, AtAA9A-N has three potential O-glycosylation sites (Ser29, Thr37, and Thr42), and AtAA9B-N has one potential N-glycosylation site (Asn135) and one potential O-glycosylation site (Ser34). Based on the position of these amino acids in the predicted structure (models built with LsAA9A [PDB:5ACI] and TaAA9A [PDB:2YET], respectively; more details below), only Ser29 in AtAA9A is close to the catalytic surface, but still at a distance where an effect of a possible glycosylation on LPMO activity is unlikely.

Cellulolytic activity of AtAA9A-N and AtAA9B-N
The recombinant AtAA9A-N and AtAA9B-N were active on phosphoric acid-swollen cellulose (PASC), but with different regioselectivities (Fig 1 and S4 Fig in S1 Appendix). AtAA9A-N generated native and C4-oxidized cello-oligosaccharides only (Fig 1A), whilst AtAA9B-N generated both C1- and C4-oxidized (as well as native) cello-oligosaccharides (Fig 1B). No products were detected in control reactions without electron donor. When using MtCDH as an electron donor, MtCDH oxidized the reducing end of all solubilized cello-oligosaccharides, hence no native cello-oligosaccharides or C4-oxidized cello-oligosaccharides (or on-column degradation products thereof [60]) were detected, whereas small amounts of C1-oxidized cello-oligosaccharides were detected for both LPMOs (Fig 1A and 1B). These C1-oxidized cello-oligosaccharides, which were not observed in the control reaction without CDH and which must derive from native products generated by the LPMO, show that CDH was indeed capable of driving the reactions with both LPMOs (Fig 1A and 1B).
Product formation by AtAA9A-N and AtAA9B-N over time was also assessed using PASC as a substrate. To facilitate quantification of product formation, the soluble products generated by the LPMOs were treated with a cellobiohydrolase, which converts both C1- and C4-oxidized cello-oligosaccharides to the corresponding oxidized dimers (for details, see the Materials and methods). As expected based on previous LPMO studies using the same reaction conditions (e.g. [61,62]), both enzymes showed a linear phase of product formation followed by termination of the reaction. AtAA9A-N was faster than AtAA9B-N and reached a higher yield (AtAA9A-N, 199±9 μM in 120 min; AtAA9B-N, 38±2 μM in 60 min). In addition, product formation by AtAA9B-N leveled off sooner, after 60 min of incubation, while AtAA9A-N continued to release oxidized oligosaccharides for up to 120 min (Fig 3). The initial rates estimated from the linear parts of the progress curves in Fig 3, 3.3 min⁻¹ and 0.6 min⁻¹ for AtAA9A-N and AtAA9B-N, respectively, are in the same range as the rates of other LPMOs working under the same conditions [63].
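An apparent initial rate of this kind is commonly obtained by fitting a straight line to the early, linear part of a progress curve and dividing the slope by the enzyme concentration. The sketch below shows one way to do this. The sampling times and the 1 μM enzyme concentration are from the Methods, but the product concentrations are hypothetical placeholders and not the data behind Fig 3; the choice of how many points count as linear is also an assumption.

```python
# Estimate an apparent initial rate (min^-1) from the linear phase of a
# progress curve by ordinary least squares, then normalize by enzyme.


def least_squares_slope(x, y):
    """Slope of the least-squares line through the points (x, y)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


def initial_rate_per_min(times_min, product_uM, enzyme_uM, n_linear):
    """Slope over the first n_linear points divided by enzyme concentration."""
    slope = least_squares_slope(times_min[:n_linear], product_uM[:n_linear])
    return slope / enzyme_uM


if __name__ == "__main__":
    times = [20, 40, 60, 120, 240]            # sampling times from the Methods
    product = [12.0, 24.0, 36.0, 38.0, 38.0]  # hypothetical values, in uM
    rate = initial_rate_per_min(times, product, enzyme_uM=1.0, n_linear=3)
    print(f"apparent initial rate: {rate:.2f} min^-1")
```

Fitting only the points before the curve levels off matters here, because including later time points would underestimate the rate once the reaction has stopped.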

Hemicellulolytic activity of AtAA9A-N and AtAA9B-N
AtAA9A-N and AtAA9B-N were both able to cleave tamarind xyloglucan but yielded different product mixtures (Figs 4 and 5). While the peaks in the chromatographic profiles could not be annotated due to unavailability of xyloglucan oligosaccharide standards, it is clear that these profiles are very different (Fig 4). The nature of this difference was revealed by MALDI-TOF MS analysis of the reaction products. The xyloglucan backbone contains an unsubstituted glucose (G) every four sugars, whereas the other glucoses are substituted with a pentose, xylose (X), which in turn may be substituted with another hexose, galactose (L) [17]. AtAA9A-N produced a clustered product profile typical for enzymes that can cleave xyloglucan only next to the unsubstituted glucose units, yielding for example a cluster of (oxidized) Hex₄Pen₃ (e.g. GXXX), Hex₅Pen₃ (e.g. GXXL) and Hex₆Pen₃ (e.g. GXLL) (Fig 5; [9]). From the current data, it is not possible to say whether, for example, the Hex₄Pen₃ product is ox GXXX or ox XXXG, but previous detailed studies of NcAA9C, with a substrate-binding surface similar to that of AtAA9A-N (see below), have shown that this enzyme predominantly cleaves on the non-reducing side of a non-substituted glucose [9] and would thus, in this example, produce ox GXXX. On the other hand, AtAA9B-N produced a myriad of products indicating that xyloglucan was also cleaved in between substituted glucose units (Fig 5).

PLOS ONE

For both LPMOs, detected products were almost exclusively oxidized, as illustrated, e.g., by the product cluster for AtAA9A-N in Fig 5 showing oxidized GXXX (m/z 1083.5), GXXL (m/z 1245.6), and GXLL (m/z 1407.7). The relatively low signals for hydrated oxidized products (e.g. the signal at 1425.7 for GXLL) and the absence of signals representing the sodium salts of aldonic acids both suggest that oxidation of XG happened at C4 only for both LPMOs, although this cannot be concluded with certainty. Apart from activity on xyloglucan, we could not detect activity on the other hemicellulosic substrates tested (lichenan, ivory nut mannan, and birchwood xylan).
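The m/z values quoted above can be checked against the sugar composition of each product. The sketch below is an illustration under stated assumptions: it treats the reported signals as singly charged sodium adducts, models C4 oxidation as loss of two hydrogens (keto form) and the hydrated form as the gem-diol (one added water), and uses standard monoisotopic masses. The one-letter code follows the text (G, unsubstituted glucose; X, glucose carrying xylose; L, glucose carrying xylose and galactose); the function names are hypothetical.

```python
# Calculated [M+Na]+ values for xyloglucan oligosaccharides written in the
# one-letter code used in the text (G, X, L).

HEX = 162.05282     # anhydrohexose residue, C6H10O5 (monoisotopic)
PEN = 132.04226     # anhydropentose residue, C5H8O4 (monoisotopic)
WATER = 18.01056    # H2O
H2 = 2.01565        # two hydrogens lost on oxidation to the keto form
NA_ION = 22.98922   # Na+ (sodium minus one electron)

# (hexoses, pentoses) contributed by each backbone unit.
UNIT_COMPOSITION = {"G": (1, 0), "X": (1, 1), "L": (2, 1)}


def composition(motif):
    """Return (hexoses, pentoses) for a motif such as 'GXXX'."""
    hexoses = sum(UNIT_COMPOSITION[unit][0] for unit in motif)
    pentoses = sum(UNIT_COMPOSITION[unit][1] for unit in motif)
    return hexoses, pentoses


def sodium_adduct_mz(motif, oxidized=True, hydrated=False):
    """Calculated m/z of the sodium adduct of a native or oxidized product."""
    hexoses, pentoses = composition(motif)
    mass = hexoses * HEX + pentoses * PEN + WATER
    if oxidized:
        mass -= H2          # keto form of the oxidized product
        if hydrated:
            mass += WATER   # gem-diol form of the oxidized product
    return mass + NA_ION


if __name__ == "__main__":
    for motif in ("GXXX", "GXXL", "GXLL"):
        print(motif, composition(motif), round(sodium_adduct_mz(motif), 2))
    print("GXLL hydrated", round(sodium_adduct_mz("GXLL", hydrated=True), 2))
```

Under these assumptions the calculated values fall within about 0.3 m/z units of the signals reported for oxidized GXXX, GXXL, and GXLL and for hydrated oxidized GXLL, and the compositions match Hex₄Pen₃, Hex₅Pen₃, and Hex₆Pen₃.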

Discussion
Transcriptome analysis of A. tamarii growing on sugarcane bagasse as a carbon source revealed expression of seven AA9 LPMOs [39]. Of these, five were upregulated during the 48-h growth period on this plant biomass, namely AtAA9A, AtAA9B, AtAA9D, AtAA9E and AtAA9G [39] (see also S2 Table in S1 Appendix). The differences in domain organization and variations in amino acid sequence indicate distinct roles of these LPMOs in biomass degradation. Here, we report characteristics of the catalytic domains of two of these LPMOs, AtAA9A and AtAA9B.
AtAA9A and AtAA9B are both multi-modular enzymes. AtAA9A has a CBM1, which is commonly found attached to fungal AA9-type LPMOs. AtAA9B contains a short C-terminal domain of unknown function that seems specific to LPMOs of Aspergillus and Penicillium species. It is worth noting that the alignment of these small domains of approximately 40 residues (S1A Fig in S1 Appendix) shows a fully conserved Tyr/Trp, Trp, and His residue, i.e. residues that are often seen to contribute to the binding of carbohydrates. The differences in the C-terminal domains of the two proteins suggest that the two enzymes may target different parts of the plant cell wall.
LPMOs are prone to oxidative inactivation [70] and tend to be unstable under commonly used reaction conditions such as those employed here [71]. While in-depth assessment of LPMO stability is beyond the scope of this study, the progress curves with PASC revealed another difference between the two LPMOs: AtAA9A-N was more stable and gave higher product yields than AtAA9B-N. In both cases, the final yields (ca. 200 and 40 μM soluble oxidized products, respectively) stayed well below the theoretical maximum, which is defined by the presence of 1 mM ascorbic acid and the degree to which oxidized products become soluble [72]. It has recently been shown that efficient binding to substrate increases the redox stability of LPMOs and that removal of the CBM may lead to increased LPMO inactivation [72][73][74][75]. Thus, it is possible that the catalytic domains studied here are less stable than the full-length enzymes. Nevertheless, the differences in the progress curves of Fig 3 add to the notion that the catalytic domains of these two LPMOs have different properties.
Accumulating data for multiple LPMOs active on cello-oligosaccharides and xyloglucan, summarized in Fig 6 and S4 Table in S1 Appendix, now allow for meaningful speculation about the possible structural causes of the varying substrate specificities. In order to understand structural differences behind the distinct xyloglucan cleavage patterns and the ability to cleave soluble cello-oligosaccharides, we first aligned the sequences of xyloglucan-active AA9s with known cleavage types (S5 Fig in S1 Appendix), paying particular attention to the L2, L3, and LC loops, which all contribute to shaping the substrate-binding surface [67,69,76] and which, according to NMR and crystallographic studies, contain residues involved in substrate binding [56,64,77]. Notably, existing data for LPMO-substrate interactions show that residues putatively involved in substrate binding occur not only in these three loops but also in a region between the LS and LC loops (residues from His147 to Tyr166 for LsAA9A in S5 Fig in S1 Appendix). This region was named "Seg4" in a recent study by Laurent et al. [78], based on the observation that it is a meaningful discriminator for the phylogenetic grouping of AA9 LPMOs.
Multiple sequence alignment (S5 Fig in S1 Appendix) of xyloglucan-active AA9s showed that the substitution-tolerant XG-active AA9s have shorter L3 loop regions compared to their more restricted counterparts, with the structural effects of this difference illustrated in Fig 7 as well as in S6 and S7 Figs in S1 Appendix. Crystallographic and NMR studies have shown that the L3 loops of (substitution-intolerant) NcAA9C and LsAA9A carry multiple residues that interact with the substrate, namely His64 and Ala80 in NcAA9C [77] and His66, Asn67, Ala75 and Ser77 in LsAA9A [56]. The crystal structure of the LsAA9A-cellohexaose complex (PDB: 5ACI) reported by Frandsen et al. [56] revealed that the C6 hydroxyl of the glucose at subsite +1 is accommodated in a small pocket that is largely shaped by residues in the L3 loop, namely His1, His66, and Ala75, corresponding to His1, His64, and Ala80 in NcAA9C (Fig 7). This pocket is too small to accommodate a glucose with a xylosyl substitution at C6, as it would occur in xyloglucan [17]. In agreement with this, Agger et al. [9] concluded that NcAA9C converts the xyloglucan-oligosaccharide XG14 primarily to XXX and ox GXXXG, which implies that cleavage occurs when an unsubstituted glucose is bound to the +1 subsite. Recently, Sun et al. confirmed that NcAA9C cleaves polymeric XG predominantly on the non-reducing side of a non-substituted glucose [69]. Notably, this pocket is lacking in substitution-tolerant XG-active AA9 LPMOs that have a shorter L3 loop and in which, with one exception (McAA9H), the Ala75/80 is replaced by a proline (Fig 7 and S5 Fig in S1 Appendix). This proline (or Tyr in McAA9H), which is part of a more open substrate-binding surface, may interact with a xylosyl moiety at subsite +1.
As expected, the structural models of AtAA9A and AtAA9B (S7 Fig in S1 Appendix) correspond to their template structures LsAA9A (PDB ID, 5N05) and TaAA9A (PDB ID, 3ZUD), respectively, predicting a similar pocket formed by the corresponding residues in the L3 loop of AtAA9A and a surface-exposed proline in AtAA9B.
Next to having a sterically less restrained +1 subsite, the substitution-tolerant xyloglucan-active AA9s contain insertions in their L2 loop that are absent in substitution-intolerant xyloglucan-active AA9s. Sun et al. recently suggested that this extension in the L2 loop (referred to as Seg1) combined with a shorter L3 loop (referred to as Seg2) may correlate with substitution-tolerant cleavage of XG [69]. While there is little experimental data in support of involvement of the L2 region in substrate binding, such involvement seems obvious from looking at available LPMO structures (Fig 7C and S8 Fig in S1 Appendix). The region with insertions (residues 16-40 for TaAA9A) carries one or two aromatic (Tyr or Phe) residues (S5 Fig in S1 Appendix). The structures of TaAA9A (Fig 7C) and NcAA9M, the only substitution-tolerant xyloglucan-active AA9s with a resolved crystal structure [2,68], show that these aromatic residues may be surface exposed and thus contribute to binding of polymeric substrates. An extended substrate-binding surface could facilitate binding of a multitude of substrates, since the mere size of the interacting surface may compensate for suboptimal interactions in one or a few subsites. Of note, while the correlations above seem quite general, and are supported by a recent comparative study by Sun et al. [69], PcAA9H is a notable exception, since its activity on XG is substitution-tolerant, while its sequence and predicted structure resemble those of substitution-intolerant XG-degrading LPMOs (Fig 6 and S5 and S8 Figs in S1 Appendix).
It is worth noting that the substitution-intolerant xyloglucan-cleaving LPMOs cleave at C4, whereas the substitution-tolerant enzymes tend to show C1 and C4 oxidation (Fig 6 and S4 Table in S1 Appendix). Regioselectivity on xyloglucan has been confirmed unambiguously only for a handful of LPMOs, and the available data suggest that regioselectivity on xyloglucan corresponds to regioselectivity on cellulose [9,15,69,79]. This adds to the notion that substitution-intolerant LPMOs have more restrained subsites that lead to tight and precise binding close to the catalytic copper, whereas substitution-tolerant LPMOs bind their substrate in a manner that is not disturbed by the substitutions present, leading to mixed oxidation patterns. In this respect, one might expect that the substitution-intolerant enzymes, with their smaller but potentially tighter-binding substrate-binding surfaces, would be the only ones acting on soluble cello-oligosaccharides. Indeed, a larger fraction of the substitution-intolerant LPMOs than of the substitution-tolerant LPMOs cleaves soluble oligomers (Fig 6 and S4 Table in S1 Appendix). However, the correlation between the type of xyloglucan cleavage and the ability to cleave oligomers is far from absolute and may not even exist. As yet, it does not seem possible to make meaningful predictions regarding the structural features that determine activity on soluble substrates. Such predictions await structural information for more LPMOs and more LPMO-substrate complexes.
In summary, we show that two of the AA9 LPMOs from A. tamarii have distinct substrate and product profiles, which corroborates that filamentous fungi have evolved LPMOs with diversified substrate specificities and oxidative regioselectivities and that these LPMOs likely complement each other in natural biomass degradation. Fungi are singular microorganisms that are adapted to multiple ecological niches and different conditions, thus displaying a potential for innumerable applications. Gaining a better knowledge of their enzymatic repertoire is crucial for exploiting and (re)designing their biotechnological applications.

Supporting information
S1 Appendix. (PDF)

[...], both cleaving xyloglucan adjacent to an unsubstituted unit; panel C shows the lack of cavity and a conserved surface-exposed proline (white arrow) in TaAA9A (PDB: 2YET), being able to cleave xyloglucan between two substituted units. The cellohexaose was superposed from the LsAA9A-cellohexaose (PDB: 5ACI) structure. The same cavity is present in all other substitution-intolerant AA9 LPMOs with a known crystal structure, CvAA9A (PDB: 5NLT), NcAA9A (PDB: 5FOH) and NcAA9D (PDB: 4EIR). As to other substitution-tolerant AA9 LPMOs, to date structural data is available only for NcAA9M (PDB: 4EIS), which similarly shows the lack of cavity and a conserved surface-exposed proline. Note that panel C shows side chains of residues in the extended L2 region of substitution-tolerant TaAA9A, including aromatic Tyr24. More details of the structures are provided in S6 Fig in S1 Appendix, which shows a structural superposition.