Unraveling the structural and molecular properties of 34-residue levans with various branching degrees by replica exchange molecular dynamics simulations

Levan has various potential applications in the pharmaceutical and food industries, such as cholesterol-lowering agents and prebiotics, due to its beneficial properties, which depend on its length and branching degree. A previous study also found that the branching degree of levan affected anti-tumor activities against SNU-1 and HepG2 tumor cell lines. Despite its promising potential, the properties of levans with different branching degrees are not well understood at the molecular level. In two models of the generalized Born implicit solvent (GBHCT and GBOBC1), we employed replica-exchange molecular dynamics simulations to explore conformational spaces of 34-residue levans (L34) with branching degrees of zero (LFO34B0), one (LFO34B1), three (LFO34B3) and five (LFO34B5), as well as to elucidate their structural and molecular properties. To ensure a fair comparison of the effects of branching degree on these properties, we focused on analyzing the properties of the central 21-residue of the main chains of all systems. Our results show that all major representative conformations tend to form helix-like structures with kinks, where two-kink helix-like structures have the highest population. As branching degree increases, the population of helix-like structures with zero or one kink tends to increase slightly. As the number of kinks in the structures with the same branching degree increases, the average values of the lengths and angles among centers of masses of three consecutive turns of residue i, i+3, and i+6 tended to decrease. Due to its highest occurring frequencies, the O6 (i)—H3O (i+1) hydrogen bond could be important for helix-like structure formation. Moreover, hydrogen bonds forming among the branching residue (br), branching position (bp) and other residues of L34B1, L34B3 and L34B5 were identified. The O1(bp)—H3O(br), O1(br)—H3O(br) and O5(br)—H1O(br) hydrogen bonds were found in the first-, second- and third-highest occurrence frequencies, respectively. Our study provides novel and important insights into conformational spaces and the structural and molecular properties of 34-residue levans with various branching degrees, which tend to form helix-like structures with kinks.

Beneficial properties of levan include its unusually low intrinsic viscosity [7], high water solubility and susceptibility to acid hydrolysis [8], and these properties are very advantageous to various industries, especially the food and pharmaceutical industries. Examples of potential applications include cholesterol-and triacylglycerol-lowering agents [9], prebiotics [10], binders, controlled-release matrices [11], antiviral agents against avian influenza HPAI, H5N1 and adenovirus type 40 [12], as well as antitumor agents, whose activities depend on chain length and the degree of branching of levan [13,14]. A previous study by Yoon et al. found that the branching degree of levan affected the antitumor activity of levan on the SNU-1 and HepG2 tumor cell lines [14]. The antitumor activity of SNU-1 decreased as the branching degree of levan decreased from 12.3% to 4.2%. Additionally, the antitumor activity of HepG2 rapidly decreased as the branching degree changed from 12.3% to 9.3%. This antitumor activity then gradually increased as the branching degree decreased from 9.3% to 4.2%. Their results suggested the importance of the degree of branching on the structural properties of levan and its antitumor activities on the SNU-1 and HepG2 tumor cell lines [14]. Although levan has numerous potential applications, the structural and molecular properties of levan with various branching degrees are not well-understood at the molecular level.
To enhance the sampling accuracy and the probability of attaining the global minimum, the replica exchange molecular dynamics (REMD) technique simulates systems at various temperatures simultaneously and exchanges non-interacting systems among them [15,16]. Various studies have employed this method to explore the conformational spaces of oligosaccharides in solution, such as ε-cyclodextrin [17], high-mannose-type oligosaccharides [18], cellulose [19] and N-glycan core pentasaccharides [20]. Moreover, REMD was employed to investigate the conformational properties of a linear N-glycan [21], and a branched N-glycan found in the HIV envelop protein gp120 [21,22]. Recently, we used REMD to elucidate the structural and molecular properties of levan oligosaccharides (LFOs) with chain lengths of 5, 10 and 15 residues in two models of generalized Born implicit solvent (GB HCT and GB OBC1 ) [23]. We found that LFOs tended to form helix-like structures as chain length increased from 5 to 15 residues [23]. However, to our knowledge, REMD has not been used to explore the conformational spaces of levan with various branching degrees in solution or to elucidate their properties.
In this study, we performed REMD on models of 34-residue levan (L 34 ) with branching degrees of zero (L 34B0 ), one (L 34B1 ), three (L 34B3 ) and five (L 34B5 ) in two models of generalized Born implicit solvent (GB HCT and GB OBC1 ). We aimed to explore their conformational spaces and elucidate their structural and molecular properties, as well as the relationship between these properties and the branching degree (Fig 1b). This knowledge may be beneficial for understanding how branching degree affects the structural and molecular properties of levan.

Structure preparation and minimization
The LEaP module in AMBER14 [24] was employed to build the structures of L 34B0 , L 34B1 , L 34B3 and L 34B5 and assign their atom types and force field parameters based on GLYCAM06j-1 [25] (Fig 1b). For minimization and simulation of each system, two implicit solvent models (GB HCT and GB OBC1 ) were employed [26,27]. Minimization of all systems involved the 2,500 steepest-descent minimization cycles and 2,500 conjugate-gradient minimization cycles.

Replica exchange molecular dynamics simulations
Initially, sixteen replicas per system were equilibrated for 500 ps to reach the desired temperature range from 284.0 to 584.5 K; these temperatures were distributed exponentially. Using the SANDER module in AMBER14, the REMD of each system was performed for 100 ns, and each replica was exchanged every 2 ps. The SHAKE algorithm was used to remove all bondstretching freedoms associated with hydrogen, allowing a time step of 0.002 ps [28]. The random number generator was employed to reseed the initial velocity for all simulations [29]. A cut-off of 999 Å was employed to compute non-bonded interactions. To calculate the pairwise summation involved in the effective Born radii calculation, a maximum distance of 999 Å between atom pairs was used. To control the temperatures in all systems, Langevin dynamics with a collision frequency of 1 ps -1 were employed. The 100 ns trajectories of the replicas at 298 K were used for analysis of the structural and molecular properties of all systems.
To ensure a fair comparison of the effects of branching degree on the lengths of all systems, the lengths of the central 21-residue of the main chain (LC 21 ) were measured (Fig 1b). Helixliked structures with kinks were found with large populations in all systems, and each helical turn consisted of three fructosyl residues. To identify kinks and ensure a fair comparison of the effects of branching degree on the number of kinks in all systems, the angles among the centers of masses of three consecutive turns of residue i, i+3, and i+6 of the central 21-residue of the main chain (angle 3tc21 ) were measured (Fig 1c). A kink was defined as an angle 3tc21 less than 120˚. The number of kinks was counted using three frames of reference, starting from the first, second or third residue of the central 21-residue of the main chain, respectively (Fig 1b  and 1c). As counted by these three frames of reference, the median number of kinks was used to represent the number of kinks per structure. Free energy maps were plotted using the number of kinks and LC 21 to characterize the structures of all systems. All structures were clustered based on the number of kinks. The structure most similar to the average structure of all members of each cluster was selected as a "centroid" to represent each cluster as a major representative conformer. This "centroid" is a structure with the lowest heavy-atom root-mean-squaredeviation to the average structure.
To identify hydrogen bonds important for the formation of helix-like structures and hydrogen bonds involved with branching residues, the values of the occurrence frequencies of hydrogen bonds per structure and the occurrence frequencies of hydrogen bonds were calculated, respectively. Additionally, the occurrence frequencies of three dihedral angles between every two fructosyl residues of the main chains, ω (C4-C5-C6-O6), ψ (C5-C6-O6-C2') and ϕ (C6-O6-C2'-O5') were determined to measure conformational flexibilities of all systems.

Reliability of REMD simulations
The acceptance ratios of the replica exchange were computed to verify that the temperatures were optimally distributed, and the number of replicas was sufficient. We found that the acceptance ratios of the simulations of L 34B5 in the GB HCT model were almost constant at approximately 29% (S1a Fig), implying a free random walk in the replica (temperature) space. Furthermore, our results show a free random walk both in the replica space (S1b Fig) and the temperature space (S1c Fig). Additionally, there is sufficient overlap between the canonical probability distribution of the total potential energy at each temperature and for those of neighbors (S1d Fig). Similar results were observed for the simulations of L 34B0 , L 34B1 and L 34B3 in the GB HCT model. Their average acceptance ratios are almost constant at approximately 29%, 30% and 29% for L 34B0 , L 34B1 and L 34B3 , respectively. Additionally, the results of REMD simulations of the systems simulated in the GB OBC1 model are also similar to those simulated in the GB HCT model. Their average acceptance ratios are almost constant at approximately 29%. Our results demonstrate good reliability of the REMD simulations of all systems.

Major representative conformers of L 34B0 , L 34B1 , L 34B3 and L 34B5
To elucidate major representative conformers of L 34B0 , L 34B1 , L 34B3 and L 34B5 , their free-energy maps were determined and shown with the major representative conformers and populations in Figs 2 and 3 for systems simulated in the GB HCT and GB OBC1 model, respectively. These free-energy maps were characterized by the number of kinks in the central 21-residue portion of the main chain and LC 21 . These properties were measured at the central 21-residue of the main chains of all systems to ensure a fair comparison of the effects of branching degree on these properties (Fig 1b). All major representative conformers contain helical elements (helixlike structures), with the number of kinks ranging from zero to five. Helix-like structures adopted conformations of left-handed 3-fold helices, where each helical turn consisted of three fructosyl residues (Fig 1c). These results are consistent with previous findings that levan tends to form helix-like structures as its chain length increases from 5 to 15 residues [23]. In nature, kinks are also observed in DNA structures [30], long α-helical membrane proteins and soluble proteins (! 20 residues) [31]. Based on the number of kinks, all systems were clustered into six major representative conformers: helix-like structure, one-kink helix-like structure, two-kink helix-like structure, three-kink helix-like structure, four-kink helix-like structure and five-kink helix-like structure (Figs 2 and 3). Table 1 shows that two-kink helix-like structures were found with the highest population of 38.5%, 37.2%, 38.2% and 36.7% for L 34B0 , L 34B1 , L 34B3 and L 34B5 simulated in the GB HCT model, respectively. For structures with the same branching degree, the populations of major representative conformers tend to increase as the number of kinks increased from zero to two and subsequently decreased as the number of kinks increased from two to five. Moreover, our results show that as the branching degree increases from zero to five, the populations of helixlike structures and one-kink helix-like structures tend to increase slightly. Similar trends were also observed for those simulated in the GB OBC1 model (S1 Table). Additionally, we calculated the conformational distributions of the first and last 50 ns of the replica exchange molecular dynamics simulations. Examples of the conformational distributions of L 34B5 in GB HCT and GB OBC1 models are shown in S2 Table. These results show that the conformational distributions of the first 50 ns and last 50 ns simulations in each solvent model are reasonably similar, implying the convergence of the replica exchange molecular dynamics simulations.

Distribution of LC 21
To elucidate the effects of branching degree on the values of LC 21 , the distribution of LC 21 was calculated and is shown in Table 1. As simulated in the GB HCT model, the average values and highest frequency values of the LC 21 of structures with the same branching degree tended to decrease as the number of kinks in the structures increases. These results indicate that the  presence of kinks in a structure may cause the central 21-residue region of the main chain to be less extended, as its two ends could become closer each other. However, branching degree does not seem to significantly affect the values of LC 21 because L 34B0 , L 34B1 , L 34B3 and L 34B5 appear to have similar ranges and trends of LC 21 for all six major representative conformations, which have zero to five kinks. For all branching degrees, two-kink helix-like structures have the highest population with the highest frequency of LC 21 value of 44 A˚, and the average LC 21 values are in the range of 41.1-42.3 A˚. Similar trends were also observed in the structures simulated in the GB OBC1 model (S1 Table).

Distribution of angle 3tc21
To elucidate the effects of branching degree on the values of angle 3tc21 , the distribution of angle 3tc21 was calculated and is shown in Table 1. As simulated in the GB HCT model, the average and highest frequency values of angle 3tc21 of the structures with the same branching degree tend to decrease as the number of kinks in the structures increases, due to the fact that kinks have low angle 3tc21 (< 120˚) values according to the definition in this study. However, branching degree does not seem to significantly affect the values of angle 3tc21 because L 34B0 , L 34B1 , L 34B3 and L 34B5 appear to have similar ranges and trends for angle 3tc21 values for all six major representative conformations, which have zero to five kinks. For all branching degrees, two- Similar trends were also observed for the structures simulated in the GB OBC1 model (S1 Table).

Hydrogen bonds that are important for formation of helix-like structures
Since most structures contain helical elements, the occurrence frequencies of hydrogen bond per structure were calculated to identify hydrogen bonds important for the formation of helixlike structures. Hydrogen bonds with an occurrence frequency of hydrogen bonds per structure of at least 3% are shown in Table 2 and the S3 Table. The O6 (i) -H3O (i+1) hydrogen bond (between residue i and i+1) had the highest frequency for all systems simulated in the GB HCT or GB OBC1 model. Its glycosidic oxygen acts as an important hydrogen bond acceptor that interacts with the hydroxyl groups of C3 atoms in the furanose ring and likely helps to stabilize a helix-like structure (Fig 4a). This hydrogen bond is likely to be important for the formation of a helix-like structure, as its occurrence frequency is significantly higher than that for other hydrogen bonds. For all systems simulated in the GB HCT or GB OBC1 model, the O1 (i) -H3O (i) and O5 (i) -H1O (i) hydrogen bonds had the second and third highest frequencies, respectively;  O1 (i) -H3O (i) and O5 (i) -H1O (i) hydrogen bonds are hydrogen bonds within the same residue ( Table 2, S3 Table and Fig 4a). The trends for O6 (i) -H3O (i+1), O1 (i) -H3O (i) and O5 (i) -H1O (i) hydrogen bonds of these systems are similar to those found in the helix-like structures of LFO 10 and LFO 15 [23] probably because they all form helix-like structures, and these hydrogen bonds are likely to be important for helix formation. However, the occurrence frequencies of O6 (i) -H3O (i+1) hydrogen bond per structure of L 34B0 , L 34B1 , L 34B3 and L 34B5 are lower than those of the helix-like structures of LFO 10 and LFO 15 . These results may be caused by the fact that L 34B0 , L 34B1 , L 34B3 and L 34B5 form curved helix-like structures and contain kinks in most of the structures; therefore, it is more difficult for O6 (i) -H3O (i+1) hydrogen bonds to form.
On the other hand, LFO 10 and LFO 15 form helix-like structures that are less curved than L 34B0 , L 34B1 , L 34B3 and L 34B5 because they have less number of residues than L 34B0 , L 34B1 , L 34B3 and L 34B5 and they are less flexible and too short to form kinks. As a result, O6 (i) -H3O (i+1) hydrogen bonds are more likely to form in helix-like structures of LFO 10 and LFO 15 .
As branching degree increases, the occurrence frequency of hydrogen bonds per structure for the O5 (i) -H1O (i) hydrogen bond tends to decrease, while that of O1 (i) -H3O (i) hydrogen bonds tend to increase for systems simulated in the GB HCT or GB OBC1 model ( Table 2 and S3  Table). These trends are likely caused by the fact that H1O (i) is removed at each branching position to build an O1 glycosidic linkage with a branching residue. Therefore, less H1O (i) are available for hydrogen bond formation with O5 (i) , but more O1 (i) are available for hydrogen bond formation with H3O (i) .

Hydrogen bonds involved with branching residues
To identify hydrogen bonds involved with branching residues, the occurrence frequencies of hydrogen bonds forming among branching residues (br), branching position (bp) and other residues of L 34B1 , L 34B3 and L 34B5 were measured. The occurrence frequencies of these hydrogen bonds are drastically lower than those forming in the main chains (Table 3 and S4 Table) because the number of branching residues is significantly less than the number of residues in the main chains. For all systems simulated in the GB HCT or GB OBC1 model, the O1 (bp) -H3O (br) hydrogen bond between the glycosidic oxygen of the branching position and the hydroxyl group of the C3 atom of the furanose ring of the branching residue were found at the highest frequency (Table 3, S4 Table and Fig 4b). Moreover, the O1 (br) -H3O (br) and O5 (br) - Table 3. Occurrence frequency of hydrogen bond involved with branching residues of L 34B1 , L 34B3 and L 34B5 simulated in the GB HCT model.   4b). These O1 (br) -H3O (br) and O5 (br) -H1O (br) hydrogen bonds are hydrogen bonds within the same residue, which are similar to those found in the main chain (O1 (i) -H3O (i) and O5 (i) -H1O (i) hydrogen bonds). O5 (bp) -H6O (br), O3 (bp-2) -H6O (br) and O4 (bp-3) -H1O (br) hydrogen bonds were also observed. The formation of O1 (bp) -H3O (br) and O5 (bp) -H6O (br) hydrogen bonds are probably caused by the fact that H1O (bp) were removed at a branching position to build an O1 glycosidic linkage with a branching residue. Therefore, the O5 (bp) -H1O (bp) hydrogen bond could not be formed in the main chain. As a result, O1 (bp) was available to form a hydrogen bond with H3O (br) , while O5 (bp) was available to form a hydrogen bond with H6O (br) .

Conformational flexibilities
To measure conformational flexibilities of all systems, the occurrence frequencies of ω, ψ and ϕ of the main chains were computed (Fig 5 and S2 Fig). The value of ω with the highest frequency is around -65˚to -60˚(major peak). Moreover, there are two additional peaks centering around -170˚to -165˚and 55˚to 60˚, respectively. The major peak of ψ centers around 175˚to 180˚with two shoulders. The first shoulder starts from ψ value around -65˚, and the second shoulder starts from ψ value around 40˚or 45˚for the systems simulated in the GB HCT or GB OBC1 model, respectively. The major peak of ϕ centers around -65˚to -60˚. There are also two very small peaks centering around -170˚to -165˚and 60˚to 65˚. These results show that ω is probably more flexible than ψ and ϕ because it has more possibilities in rotating and changing levan conformations. The trends of the occurrence frequencies of ω, ψ and ϕ of the main chains of these systems are similar to those of levan oligosaccharides with the chain lengths of 5, 10 and 15 residues [23].

Conclusions
To explore the conformational spaces of 34-residue levans with various branching degrees and elucidate their structural and molecular properties and the effects of branching degree on these properties, we performed REMD on 34-residue levans with branching degrees of zero, one, three and five. To ensure a fair comparison of the effects of branching degree on the structural and molecular properties, we focused on analyzing the properties of the central 21-residue region of the main chains. Our results show that all major representative conformations of all branching degrees tend to form helix-like structures with kinks, and two-kink helix-like structures had the highest population. Moreover, our findings reveal that as the branching degree increases from zero to five, the populations of helix-like structures and one-kink helixlike structures tend to increase slightly. Furthermore, the average values and highest-frequency values of LC 21 and angle 3tc21 for structures with the same branching degree tend to decrease as the number of kinks in the structures increases. For all systems, the O6 (i) -H3O (i+1) hydrogen bond has the highest occurrence frequency per structure and is likely to be important for the formation of a helix-like structure, as its frequency is substantially higher than other hydrogen bonds. Hydrogen bonds involved with branching residues were found with substantially lower frequencies than those important for helix-like structure formation. Examples of these hydrogen bonds are the O1 (bp) -H3O (br) , O1 (br) -H3O (br) and O5 (br) -H1O (br) hydrogen bonds that were found with the first, second and third highest frequencies, respectively. It is worth mentioning that GB implicit solvent models may not be able to model highly specific solutewater interactions as accurate as explicit solvent model and may affect conformational sampling. Our study employed GB implicit solvent models because it is currently computationally prohibitive to perform REMD on 34-residue levans in explicit solvent model. Furthermore, GB implicit solvent models were successfully employed to explore conformational spaces of oligosaccharides such as ε-cyclodextrin [17] and levan oligosaccharides [23]. This work provides important and novel insights into the structural and molecular properties of levan with various branching degrees at the molecular level, as well as the effects of branching degree on these properties.