Gastrointestinal Endogenous Proteins as a Source of Bioactive Peptides - An In Silico Study

Dietary proteins are known to contain bioactive peptides that are released during digestion. Endogenous proteins secreted into the gastrointestinal tract represent a quantitatively greater supply of protein to the gut lumen than those of dietary origin. Many of these endogenous proteins are digested in the gastrointestinal tract but the possibility that these are also a source of bioactive peptides has not been considered. An in silico prediction method was used to test if bioactive peptides could be derived from the gastrointestinal digestion of gut endogenous proteins. Twenty six gut endogenous proteins and seven dietary proteins were evaluated. The peptides present after gastric and intestinal digestion were predicted based on the amino acid sequence of the proteins and the known specificities of the major gastrointestinal proteases. The predicted resultant peptides possessing amino acid sequences identical to those of known bioactive peptides were identified. After gastrointestinal digestion (based on the in silico simulation), the total number of bioactive peptides predicted to be released ranged from 1 (gliadin) to 55 (myosin) for the selected dietary proteins and from 1 (secretin) to 39 (mucin-5AC) for the selected gut endogenous proteins. Within the intact proteins and after simulated gastrointestinal digestion, angiotensin converting enzyme (ACE)-inhibitory peptide sequences were the most frequently observed in both the dietary and endogenous proteins. Among the dietary proteins, after in silico simulated gastrointestinal digestion, myosin was found to have the highest number of ACE-inhibitory peptide sequences (49 peptides), while for the gut endogenous proteins, mucin-5AC had the greatest number of ACE-inhibitory peptide sequences (38 peptides). Gut endogenous proteins may be an important source of bioactive peptides in the gut particularly since gut endogenous proteins represent a quantitatively large and consistent source of protein.


Introduction
The main role of dietary proteins is to provide amino acids for body protein synthesis [1]. However, investigations over the last two decades have shown that dietary protein can also be a source of latent bioactive peptides (from 2 to greater than 40 amino acids long) that when released during digestion in the gastrointestinal tract can act as modulators of various physiological functions [2,3,4]. These peptides are reported to possess a range of effects including antihypertensive, cholesterol-lowering, antioxidant, anticancer, immunomodulatory, antimicrobial, opioid, antiobesity and mineral binding effects [2,5,6]. The most extensively studied dietary sources of these bioactive peptides include milk, egg, meat, soya and cereal proteins [2,3,7,8]. The bioactive peptides released during the digestion of dietary proteins are believed to act either within the gastrointestinal tract or are possibly absorbed into the bloodstream where they may act systemically [3,9,10,11].
The supply of dietary proteins, and therefore the supply of gastrointestinal bioactive peptides derived from those proteins, will likely be highly variable as humans do not consume the same foods or amounts of food on a day to day basis. However, a considerable amount of endogenous (non-dietary) protein is also present in the lumen of the gastrointestinal tract during digestion, consisting of proteins such as mucins, serum albumin, digestive enzymes, protein within sloughed epithelial cells and microbial protein, and this material may be a source of bioactive peptides [12]. When compared to dietary protein, gut endogenous proteins represent a larger and more constant supply of protein in the gastrointestinal tract [13,14,15,16], with endogenous nitrogen entering the digestive tract of humans being quantitatively equal or greater than the dietary nitrogen intake [16,17,18,19,20]. In a study conducted using pigs fed a casein-based diet, it has been reported that up to 80% of endogenous proteins are digested and reabsorbed by the end of the small intestine [14]. During digestion a wide range of endogenous protein-derived peptides are likely to be generated, but the biological activity of such endogenously sourced gut peptides has not yet been considered. Potentially, gut endogenous proteins could be an important source of gut bioactive peptides given the amount of endogenous proteins present in the gastrointestinal tract. This study aimed to use an in silico approach to investigate whether known bioactive peptide sequences are present within the amino acid sequences of endogenous proteins secreted along the gastrointestinal tract and whether these bioactive peptides may potentially be released during enzymatic digestion in the human gastrointestinal tract. 
To our knowledge, the present study is the first to show that the amino acid sequences of gut endogenous proteins hold within them abundant bioactive peptide sequences and that the possibility exists that these peptides are released during gastrointestinal digestion.

Methods
Twenty six known human gut endogenous proteins with well characterised amino acid sequences were examined. Additionally, 7 dietary proteins, which have been reported to contain bioactive peptides, were also examined [2,21,22,23,24]. The proteins analysed are shown in Table 1.

Prediction of the Total Number of Bioactive Peptide Sequences Present in the Intact Gut Endogenous and Dietary Proteins
To predict the number of bioactive peptide sequences encoded within the gut endogenous and dietary proteins, the amino acid sequence of each protein was obtained from an online protein database [25]. The amino acid sequence of each protein was examined for the presence of known bioactive peptide sequences using an online bioactive peptide database [22]. The latter database contained the amino acid sequences of 2609 known bioactive peptides with 48 different bioactivities known to be bioactive based on either in vitro or in vivo studies [22,26]. The bioactive peptide database was used according to the instructions given and peptides possessing one or more of the following bioactivities were identified [antiamnestic, angiotensin converting enzyme (ACE)-inhibitor, antithrombotic, stimulating (glucose uptake-, -vasoactive substance release), regulating (ion flow-, stomach mucosal membrane activity-, phosphoinositol mechanism peptide-), antioxidative, bacterial permease ligand, inhibitor (dipeptidyl peptidase IV-, dipeptidyl-aminopeptidase IV-, dipeptidyl carboxypeptidase-, cyclic nucleotide phosphodiesterase 1 (CaMPDE)-, neuropeptide-), hypotensive, activating ubiquitin mediated proteolysis]. The total number of bioactive peptide sequences identified in the intact proteins was recorded for each gut endogenous and dietary protein.
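The sequence search described above can be pictured as a substring scan of the protein sequence against a list of known peptides. The following is a minimal sketch under stated assumptions, not the BIOPEP implementation: the function name `find_bioactive_sequences`, the toy protein and the two-entry toy database are hypothetical, and overlapping occurrences are assumed to be counted separately.

```python
def find_bioactive_sequences(protein, known_peptides):
    """Return (start, peptide, activity) for every occurrence, overlaps included.

    `known_peptides` maps a peptide sequence (one-letter code) to an activity label.
    """
    hits = []
    for peptide, activity in known_peptides.items():
        start = protein.find(peptide)
        while start != -1:
            hits.append((start, peptide, activity))
            # Restart one residue further on so overlapping matches are found.
            start = protein.find(peptide, start + 1)
    return sorted(hits)


# Hypothetical toy data, for illustration only (not taken from any database).
toy_protein = "MKVPPLIPPGAVPP"
toy_database = {"VPP": "ACE-inhibitor", "IPP": "ACE-inhibitor"}

hits = find_bioactive_sequences(toy_protein, toy_database)
for start, peptide, activity in hits:
    print(f"{peptide} at position {start + 1}: {activity}")
```

The number of entries in `hits` corresponds to the count of bioactive peptide sequences in the intact protein for this toy case.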
The bioactive peptide frequency (A) and the relative bioactive peptide frequency (Y) are often used to describe the potency of proteins as sources of bioactive peptides [22,27]. In the present study the frequency of bioactive peptide sequences within the intact protein (A_O) was calculated as follows:
$$A_O = \frac{a_O}{N} \times 1000$$
where a_O is the total number of identified bioactive peptides present in the protein or the number of bioactive peptides with a specific activity based on the BIOPEP database [22], and N is the total number of amino acid residues within the protein.
The relative frequency of occurrence of bioactive peptides with a specific activity (Y_j, %) was calculated as:
$$Y_j = \frac{A_{Oj}}{l} \times 100$$
where A_{Oj} is the number of peptides with a specific activity, l is the total number of peptide sequences across all activity categories present within the protein, and j is the specified activity.
The frequency of bioactive peptides predicted to be present after simulated digestion (A_D) was calculated as:
$$A_D = \frac{a_D}{N} \times 1000$$
where a_D is the number of identified bioactive peptides present after the simulated (in silico) digestion and N is the total number of amino acid residues within the protein.
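As a worked illustration of these frequency measures, the sketch below computes A and Y for counts quoted in this paper. It assumes that A is the peptide count divided by chain length and scaled by 1000, a scaling that is consistent with the magnitudes reported in the Results. The function names and the counts in the Y example are hypothetical.

```python
def frequency(peptide_count, chain_length, scale=1000):
    """Frequency A: bioactive peptides per residue, scaled (assumed factor 1000)."""
    return peptide_count / chain_length * scale


def relative_frequency(count_for_activity, total_count):
    """Relative frequency Y (%) of one activity among all identified peptides."""
    return 100 * count_for_activity / total_count


# Myosin: 1939 residues, 1072 sequences in the intact protein,
# 55 peptides predicted after gastric and small intestinal digestion.
a_o_myosin = frequency(1072, 1939)
a_d_myosin = frequency(55, 1939)
print(f"A_O = {a_o_myosin:.0f}, A_D = {a_d_myosin:.0f}")

# Hypothetical counts: 3 of 4 identified sequences are ACE-inhibitory.
print(f"Y = {relative_frequency(3, 4):.0f}%")
```

Normalising by chain length is what allows a short hormone and a long mucin to be compared on the same scale.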
A prediction of the number of bioactive peptides that would be released from gut endogenous proteins and dietary proteins after upper gastrointestinal tract digestion was made using an in silico simulation based on the amino acid sequence of the proteins and the reported specificity of the major proteases present in the gastrointestinal tract. The site of secretion of the gut endogenous proteins was also taken into account. For the gut endogenous proteins secreted in the mouth and stomach and for the dietary proteins, gastric digestion was simulated in silico based on the amino acid sequence of the dietary or gut endogenous protein and the specificity of pepsin as documented by Keil [28,29]. Gastric and small intestinal digestion was predicted based on the specificity of pepsin, trypsin and chymotrypsin as documented by Keil [28,29]. For endogenous proteins secreted in the small intestine, only small intestinal digestion was simulated taking into account the reported specificity of trypsin and chymotrypsin only. The amino acid sequences of the endogenous and dietary proteins were obtained from a protein sequence database as described above [25]. The in silico simulated digestion was conducted using an online Peptide Cutter tool application [28]. The amino acid sequence of each of the predicted resultant peptides for each of the gut endogenous and dietary proteins was then compared to the amino acid sequence of known bioactive peptides using an online bioactive peptide sequence database [22].
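The digestion and matching steps can be sketched as follows. This is an illustration under loud assumptions, not the Peptide Cutter algorithm: the cleavage rules in `RULES` are deliberately simplified stand-ins (cut on the C-terminal side of the listed residues unless the next residue is proline) rather than the full specificities documented by Keil, and the toy sequence and toy database are hypothetical. A released peptide counts as bioactive only if its whole sequence is identical to a known peptide.

```python
# Simplified, illustrative cleavage rules (assumption, not the Keil rules):
# each enzyme cuts after the listed residues unless the next residue is proline.
RULES = {
    "pepsin": set("FL"),
    "trypsin": set("KR"),
    "chymotrypsin": set("FWY"),
}


def digest(sequence, enzymes):
    """Return the peptides left after complete cleavage by all listed enzymes."""
    cut_after = set().union(*(RULES[e] for e in enzymes))
    peptides, start = [], 0
    for i, residue in enumerate(sequence[:-1]):
        if residue in cut_after and sequence[i + 1] != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    peptides.append(sequence[start:])
    return peptides


def released_bioactive(sequence, enzymes, known_peptides):
    """Keep only released peptides identical to a known bioactive peptide."""
    return [p for p in digest(sequence, enzymes) if p in known_peptides]


# Hypothetical toy data, for illustration only.
toy_sequence = "AVKIPPFGLRVY"
toy_database = {"IPP", "GL", "VY"}

gastric_and_intestinal = released_bioactive(
    toy_sequence, ["pepsin", "trypsin", "chymotrypsin"], toy_database)
intestinal_only = released_bioactive(
    toy_sequence, ["trypsin", "chymotrypsin"], toy_database)
print(gastric_and_intestinal, intestinal_only)
```

In this toy case IPP is present in the intact sequence but is never released as a free peptide, and the set of released peptides depends on which enzymes act, which mirrors the distinction between sequences present in the intact protein and peptides predicted after digestion.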

The Total Number and Frequency (A_O) of Bioactive Peptide Sequences within the Amino Acid Sequence of Intact Gut Endogenous Proteins and Intact Dietary Proteins
Among the dietary proteins studied, the amino acid chain length of the proteins varied from 209 (β-casein) to 1939 (myosin) amino acids, while for the gut endogenous proteins the range was from 80 (human gastrin) to 5159 (human mucin-2) amino acids (Table 1). The total number of bioactive peptide sequences identified within the amino acid sequences of the intact gut endogenous and dietary proteins, and their corresponding potential bioactivities, are shown in Table 2. In addition, the A_O values for each activity and for all the activities considered, along with Y values for each of the proteins, are also shown.
The total number of bioactive peptides present within the amino acid sequences of the intact protein molecules for the gut endogenous proteins ranged from 46 peptides for somatostatin to 2507 peptides for mucin-5AC (Table 2). When based on the subclasses of proteins presented in Table 1, the total number of identified bioactive peptide sequences present within the amino acid sequences of the intact protein molecules was 142-2507 for the mucins, 339 for serum albumin, 125-268 for the digestive enzymes, 46-86 for the hormones and 68-223 for the remaining ''other'' proteins. For the dietary proteins, the total number of identified bioactive peptide sequences present within the amino acid sequence of the intact proteins ranged from 148 for glutenin to 1072 for myosin. Among the observed bioactivity categories, angiotensin converting enzyme (ACE)-inhibitory peptide sequences were present in the largest numbers for all of the examined dietary and gut endogenous proteins, with Y ranging from 43% for gastrin to 75% for mucin-2 for the gut endogenous proteins and from 44% for gliadin to 67% for actin for the dietary proteins. For the gut endogenous proteins, the A_O for the ACE-inhibitory peptide sequences ranged from 212 for mucin-3A to 485 for secretin, while for the dietary proteins the A_O for the ACE-inhibitory peptide sequences ranged from 243 for glutenin to 608 for β-casein.
Notes to Table 1: Compiled from the UniProtKB Protein Database [25]. The given chain length excludes the signal peptide. Where indicated, the initiator methionine was not removed from the intact protein sequence (chain length inclusive of the initiator methionine). doi:10.1371/journal.pone.0098922.t001
Notes to Table 2: Activity categories are numbered 1 antiamnestic, 2 ACE-inhibitor, 3 antithrombotic, 4 stimulating (glucose uptake-, -vasoactive substance release), 5 regulating (ion flow-, stomach mucosal membrane activity-, phosphoinositol mechanism peptide-), 6 antioxidative, 7 bacterial permease ligand, 8 inhibitor (dipeptidyl peptidase IV inhibitor-, dipeptidyl-aminopeptidase IV inhibitor-, dipeptidyl carboxypeptidase-, CaMPDE-, neuropeptide-), 9 hypotensive, 10 activating ubiquitin mediated proteolysis. Overall A_O represents the total number of amino acid sequences corresponding to known bioactive peptides identified per protein molecule across all bioactivity categories, normalised for amino acid chain length. The total number of bioactive peptides represents the total number of amino acid sequences corresponding to known bioactive peptides identified per protein molecule across all bioactivity categories (not just the 10 bioactivity categories shown above). doi:10.1371/journal.pone.0098922.t002
In addition to the 10 most abundantly observed bioactive peptide categories presented in Table 2, peptide sequences reportedly possessing other bioactivities were also observed in a few select proteins. For example, opioid peptide sequences were present within the amino acid sequences of all of the dietary proteins but only a few of the endogenous proteins. Similarly, coeliac toxic peptide sequences were present within the amino acid sequences of the wheat proteins gliadin and glutenin only (data not shown).

Predicted Number and Frequency (A_D) of Bioactive Peptides Released After Gastric Digestion of Dietary Proteins and Gut Endogenous Proteins Based on an in silico Simulation
The number of bioactive peptides (and their corresponding predicted bioactivities) predicted to be released after gastric digestion of gut endogenous proteins secreted in the mouth and stomach and of dietary proteins, based on an in silico simulation of gastric digestion, are presented in Table 3. The total number of bioactive peptides predicted to be released after gastric digestion of the gut endogenous proteins ranged from 0 for lysozyme C to 12 for serum albumin per protein molecule. When grouped into the protein subclasses shown in Table 1, the total number of predicted bioactive peptides after digestion was 1-11 peptides per molecule for the mucins, 12 for serum albumin, 2-8 for the digestive enzymes, 0-2 for the hormones and 0-4 for the ''other'' proteins. For the dietary proteins, between 1 (glutenin and gliadin) and 11 (myosin) bioactive peptides were predicted to be released per protein molecule after gastric digestion. When the number of predicted bioactive peptides was presented in relation to the number of amino acids in each protein, the A_D value for the mucins, serum albumin, digestive enzymes, hormones and ''other'' proteins was 1-6, 20, 4-21, 0-22 and 0-10 respectively. For the dietary proteins, the A_D value ranged from 3 (glutenin and actin) to 14 (β-casein).
Bioactive peptides with ACE-inhibitory activity were predicted to be present after gastric digestion in higher numbers than peptides in the other activity categories, with a total of 51 ACE-inhibitory peptides predicted to be present post-digestion across all of the examined proteins as compared to 0-22 predicted peptides for all of the other activity categories. Serum albumin and myosin were predicted to yield the largest number of ACE-inhibitory peptides after peptic digestion, with 8 ACE-inhibitory peptides per protein molecule. This was closely followed by 7 ACE-inhibitory peptides for mucin-5AC. Considerably fewer ACE-inhibitory peptides were predicted (0-4 peptides per molecule) for the remaining gut endogenous and dietary proteins. The other predicted bioactivities with identified peptides were stimulating (glucose uptake-), inhibitor (dipeptidyl peptidase IV-, dipeptidyl-aminopeptidase IV-), and antioxidative activities and activation of ubiquitin mediated proteolysis.

Predicted Number and Frequency (A_D) of Bioactive Peptides Released After Gastric and Small Intestinal Digestion of Dietary Proteins and Gut Endogenous Proteins Based on an in silico Simulation
The total number of bioactive peptides predicted to be released after gastric and small intestinal digestion in silico for the gut endogenous proteins secreted into the mouth and stomach, and that therefore underwent digestion in the stomach and small intestine, ranged from 1 peptide per protein molecule for secretin to 39 peptides per protein molecule for mucin-5AC (Table 4). When the proteins were divided into subclasses based on their functions as shown in Table 1, the predicted bioactive peptides released per protein molecule were 2-39 for the mucins, 22 for serum albumin, 4-15 for the digestive enzymes, 1-5 for the hormones and 3-10 for the ''other'' proteins. For the dietary proteins, the predicted number of bioactive peptides released after digestion (in silico) ranged from 1 for gliadin to 55 for myosin. When the size of the proteins was taken into account, the predicted A_D value for the mucins, serum albumin, digestive enzymes, hormones and ''other'' proteins was 3-17, 37, 11-40, 10-56 and 23-31 respectively. For the dietary proteins, the predicted A_D value ranged from 4 for gliadin to 38 for β-casein.
After in silico simulated gastric and small intestinal digestion, the most abundant bioactive peptides predicted to be present were the ACE-inhibitory peptides, ranging from 1 peptide per protein molecule for secretin, somatostatin, gastric inhibitory peptide and gliadin to 38 for mucin-5AC. Other gut endogenous proteins from which notable numbers of ACE-inhibitory peptides were predicted to be released were mucin-6 (17 peptides per molecule), serum albumin (17 peptides per molecule) and gastric triacylglycerol lipase (10 peptides per molecule). Among the food proteins evaluated, myosin was predicted to yield the greatest number of ACE-inhibitory peptides (49 peptides per molecule). Other bioactive peptides predicted to be present after gastric and small intestinal digestion (based on an in silico simulation) across all proteins were glucose uptake- or vasoactive substance release-stimulating (0-4 per molecule), dipeptidyl peptidase IV- or dipeptidyl-aminopeptidase IV-inhibitor (0-5 peptides per molecule), antioxidative (0-5 peptides per molecule), ion flow- or stomach mucosal membrane activity-regulating (0-1 peptides per molecule), and hypotensive (0-1 peptides per molecule) peptides and peptides activating ubiquitin mediated proteolysis (0-1 peptides per molecule).

Predicted Number and Frequency (A_D) of Bioactive Peptide Sequences Released After Small Intestinal Digestion of Gut Endogenous Proteins Secreted in the Small Intestine Based on an in silico Simulation
For endogenous gut proteins that are secreted into the small intestine (for example, the pancreatic enzymes and small intestinal mucins) and therefore would not be subject to digestion in the stomach, an in silico analysis of the bioactive peptides that would be predicted to be released after intestinal digestion alone was performed (Table 5). For these proteins mucin-2, serum albumin and pancreatic amylase had the greatest predicted numbers of bioactive peptides released, with 24, 14 and 14 bioactive peptides respectively per molecule, while secretin had the least (1 peptide per molecule). Within the subclasses of proteins based on protein function and presented in Table 1, the predicted number of bioactive peptides released after digestion was 2-24 peptides per molecule for the mucins, 14 peptides per molecule for serum albumin, 4-14 peptides per molecule for the digestive enzymes, 1-2 peptides per molecule for the hormones and 2 peptides per molecule for lysozyme C. The corresponding A_D values were 5-10 for the mucins, 24 for serum albumin, 13-28 for the digestive enzymes, 10-25 for the hormones and 15 for lysozyme C.
Notes to Tables 3-5: Tables 4 and 5 give the number of potential bioactive peptides (per protein molecule) predicted to be released, together with A_D. In A_D, a_D is the number of identified bioactive peptides present after the simulated (in silico) digestion and N is the total number of amino acid residues within the protein. The activity categories listed are: Table 3, 2 ACE-inhibitor, 4 stimulating (glucose uptake-, -vasoactive substance release), 6 antioxidative, 8 inhibitor (dipeptidyl peptidase IV inhibitor-, dipeptidyl-aminopeptidase IV inhibitor-, dipeptidyl carboxypeptidase-, CaMPDE-, neuropeptide-) and 10 activating ubiquitin mediated proteolysis; Table 4, the same categories plus 5 regulating (ion flow-, stomach mucosal membrane activity-) and 9 hypotensive; Table 5, 2 ACE-inhibitor, 5 regulating (ion flow-, stomach mucosal membrane activity-, phosphoinositol mechanism peptide-), 6 antioxidative, 8 inhibitor and 9 hypotensive. The total number of peptides released is a summation of all the bioactive peptides predicted to be released after digestion of the intact proteins. Some of the predicted bioactive peptides have more than one activity; hence the total number of bioactive peptides released may be less than the summation of the number of bioactive peptides from the individual activity categories. doi:10.1371/journal.pone.0098922.t003, doi:10.1371/journal.pone.0098922.t004

Discussion
All of the protein amino acid sequences were sourced from the UniProt Protein Knowledgebase, a standard repository of protein sequence-related information [25]. BIOPEP, the database of bioactive peptides used in the present study, is a widely recognised and utilised tool for the bioinformatics-based prediction of bioactive peptides in a given amino acid sequence [22,27,30,31]. The associated bioactivity of the peptide sequences listed in the BIOPEP database is documented and continually updated based on previous and on-going in vitro and in vivo studies [22,26,32,33]. The resultant peptides generated after simulated gastrointestinal digestion were predicted using Peptide Cutter, an enzymatic cleavage prediction software [28] that is hosted by the ExPASy server, a standard tool used in bioinformatics and mass spectrometry-based studies [34].
The findings of the present study are based on an in silico gastrointestinal digestion prediction model. The model is based on the amino acid sequence (primary structure) of the intact proteins and knowledge about the specificity of proteases in the gastrointestinal tract. Being an in silico model, it cannot be concluded with certainty that the purported bioactive peptides will be generated after the actual in vivo gastrointestinal tract digestion of gut endogenous proteins. However, there are similarities between data generated in the present study and data generated in other in silico, in vitro and in vivo studies. For example, in the present study β-casein was found to be the greatest potential source of bioactive peptides, including ACE-inhibitory peptides. This finding is consistent with another in silico study that examined a range of food proteins and predicted that bovine caseins were the greatest source of ACE-inhibitory peptides [35]. In addition, Boutrou et al. [11] investigated the kinetics of the release of peptides from either casein or whey proteins in the jejunum of humans, and reported that β-casein both released larger numbers of bioactive peptide fragments and generated peptides with a diverse range of bioactivities. Moreover, and in line with our own findings, in vitro studies have shown that the antihypertensive peptides VPP and IPP present in the amino acid sequence of bovine β-casein, which are known to be released during lactobacilli-based fermentation of milk [36], are not released during enzymatic digestion using an in vitro digestion model that simulated digestion in the gastrointestinal tract [37].
Overall, the in silico technique used in the present study demonstrates that large numbers of bioactive peptide sequences exist within the amino acid sequences of endogenous proteins and may be cleavable by the digestive enzymes. It is therefore likely that bioactive peptides would be liberated from the gut endogenous proteins during digestion within the gut, particularly given that in vivo studies have shown that as much as 80% of the endogenous protein secreted into the gastrointestinal tract is digested and reabsorbed [14,15]. The present study does not include analysis of two major contributors to the non-dietary nitrogenous losses in the gut, namely bacterial proteins and sloughed epithelial cells. In addition, factors that may influence in vivo protein digestion in the gastrointestinal tract, such as the tertiary structure of the proteins, the effects of food processing on protein digestion, and the influence of bacterial enzymatic digestion, have not been taken into account. An attempt has been made, however, to analyse a range of gut endogenous proteins secreted at different sites within the gut and with known amino acid sequences.
All of the dietary and gut endogenous proteins evaluated in the present study contained large numbers of peptide sequences within the greater amino acid sequence of the intact protein that corresponded to the sequences of known bioactive peptides, at least based on the BIOPEP bioactive peptide database [22]. Furthermore, the total number of bioactive peptide sequences present in the overall amino acid sequence of the intact proteins varied across both dietary and gut endogenous proteins, although the range was much greater for the endogenous proteins. The mucin proteins generally contained the greatest number of bioactive peptide sequences while the hormone molecules contained the least. In comparison with the dietary proteins, 16 of the 26 gut endogenous proteins contained a similar or greater number of bioactive peptide sequences per molecule. This suggests that based on amino acid sequence, the gut endogenous proteins may contain quantitatively significant amounts of bioactive peptides. In general, for both the food and gut endogenous proteins, smaller proteins contained comparatively fewer bioactive peptide sequences when compared to the larger proteins. The latter observation indicated, not unexpectedly, that the longer the amino acid chain of a protein, the higher the probability of finding peptide sequences that correspond to previously studied and reported bioactive peptides documented in the BIOPEP database [22].
If the gut endogenous proteins and food proteins are considered in terms of the potential bioactive profile (the relative number of bioactive peptide sequences within each bioactivity category), both gut endogenous and dietary proteins were similar, with ACE-inhibitory peptide sequences being present in the greatest numbers. This may be attributed to the fact that ACE-inhibitory peptides have been researched more extensively in comparison to all of the other bioactivities and hence the bioactive peptide database used in the present study contains a much higher proportion of known ACE-inhibitory peptides as compared to bioactive peptides with other activities [2,38]. Both the gut endogenous proteins and the dietary proteins seem to contain remarkably similar relative numbers of bioactive peptides within each activity category, particularly given the very different amino acid sequences across the different proteins. For example, ACE-inhibitory peptides comprised 43-75% of the total number of bioactive peptides found across the proteins while inhibitor peptides comprised 10-29%, antioxidative peptides comprised 3-14%, stimulating peptides comprised 3-13% and hypotensive peptides comprised 0-2%. Overall, large numbers of bioactive peptide sequences were observed in the intact gut endogenous protein amino acid sequences. In comparison to the dietary proteins examined in the present study, gut endogenous proteins were similar in terms of being a potential source of bioactive peptides.
Bioactive peptides were predicted to be released after gastric digestion (based on an in silico digestion model) of both food and gut endogenous proteins; however, the numbers predicted were only 0-3.5% (average across all examined proteins = 1.0%) of the total number of bioactive peptide amino acid sequences identified in the intact amino acid sequences of each protein. It would appear, based on the in silico prediction used in the present study, that for both the dietary and gut endogenous proteins most of the predicted bioactive peptide sequences present in the intact proteins would not be released during gastric enzymic digestion. In terms of the bioactive peptides that were predicted to be released after gastric digestion, the gut endogenous proteins appeared to be similar to the dietary proteins both in terms of the total number of predicted bioactive peptides and the number of predicted bioactive peptides normalised for the amino acid chain length of the protein (A_D values).
The number of bioactive peptides predicted to be released after gastric and small intestinal digestion combined was considerably higher than after gastric digestion alone but was still much lower than the number of bioactive peptide amino acid sequences identified within the intact protein (3.3% of the total number of the identified bioactive peptides were predicted to be released across protein sources). It was predicted that after combined gastric and small intestinal digestion, many endogenous proteins were an equal source of bioactive peptides compared to the selected dietary proteins, with a mean A_D across all of the endogenous proteins of 23 compared to 22 for the dietary proteins. Moreover, at least two of the endogenous proteins had a greater A_D value in comparison with β-casein, a known rich source of bioactive peptides.
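The proportion of intact-sequence bioactive peptides predicted to be released can be checked with simple arithmetic. The sketch below uses only counts quoted in the Results for three proteins; the dictionary layout and the function name `percent_released` are illustrative choices, and the three proteins are examples rather than the full data set behind the reported averages.

```python
# Counts quoted in the Results (peptides per protein molecule).
intact = {"serum albumin": 339, "mucin-5AC": 2507, "myosin": 1072}
after_gastric = {"serum albumin": 12, "myosin": 11}
after_gastric_and_intestinal = {"serum albumin": 22, "mucin-5AC": 39, "myosin": 55}


def percent_released(released, total_in_intact_protein):
    """Share (%) of intact-sequence bioactive peptides predicted to be released."""
    return 100 * released / total_in_intact_protein


for protein, total in intact.items():
    combined = percent_released(after_gastric_and_intestinal[protein], total)
    line = f"{protein}: {combined:.1f}% after gastric and small intestinal digestion"
    if protein in after_gastric:
        gastric = percent_released(after_gastric[protein], total)
        line += f", {gastric:.1f}% after gastric digestion alone"
    print(line)
```

For serum albumin the gastric value works out at about 3.5%, the upper end of the 0-3.5% range quoted for gastric digestion.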
Not all gut endogenous proteins are secreted ubiquitously throughout the gastrointestinal tract [25]. For example, while serum albumin is known to be secreted into both the stomach and the small intestine [39,40], trypsin is only secreted into the duodenum and therefore is only subject to digestion in the small intestine. For proteins that are secreted in the small intestine, digestion in the gastrointestinal tract was predicted based on an in silico model for small intestinal digestion alone with the two major intestinal enzymes trypsin and chymotrypsin. The number of bioactive peptides predicted to be present after small intestinal digestion alone was much lower than that predicted after both gastric and intestinal digestion. For example, the total number of bioactive peptides predicted to be released after small intestinal digestion of serum albumin (14 bioactive peptides per protein molecule) was much lower than that predicted for gastric and intestinal digestion (22 bioactive peptides per protein molecule). Despite this, the results of the present study predict that gut endogenous proteins secreted into the small intestine are also significant sources of bioactive peptides.
After small intestinal digestion alone, the predicted released bioactive peptides possessed fewer bioactivities. For example, across all of the examined proteins (gut endogenous and dietary), the bioactive peptides predicted to be released after gastric and intestinal digestion collectively had up to 7 different bioactivities, while after small intestinal digestion alone the predicted bioactive peptides collectively had only up to 3 different bioactivities, with the exception of serum albumin and mucin-2, which were predicted to release bioactive peptides in two additional bioactivity categories. Furthermore, for proteins that are secreted in both the stomach and small intestine, the same protein was predicted to release different bioactive peptide sequences, in terms of total number and amino acid sequence, depending on the site of digestion (gastric+small intestinal vs. small intestinal alone; Table 6).

Table 6. Amino acid sequences of bioactive peptides predicted to be released after mouth to ileum digestion of selected proteins based on an in silico digestion model.
For the most abundantly predicted bioactivity, ACE-inhibition, based on the present in silico digestion model, it would appear that on a per molecule basis, gut endogenous proteins may be similar to dietary proteins in terms of the potential to release ACE-inhibitory peptides in the upper gastrointestinal tract as a result of digestion.
The majority of the bioactive peptide sequences present in the amino acid sequences of the intact gut endogenous proteins and after in silico digestion were di- or tri-peptides, while for the dietary proteins, bioactive peptides of 6 to 9 amino acids in length were also observed (Table 6). The 3 known opioid agonist peptides in β-casein (5 to 11 amino acids long, data not shown) were also longer than the average bioactive peptide chain length observed in the gut endogenous proteins. In terms of the amino acid composition of the gut endogenous proteins evaluated, it is of note that many of them contain significant amounts of glycine or proline or both, and it has been reported that a high content of glycine and proline is related to a higher probability of finding bioactive peptide fragments [32].
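The chain-length comparison above amounts to matching predicted peptides against known bioactive sequences and tallying the matches by length. The sketch below shows one way this could be done. The reference set is a tiny illustrative list of well-known ACE-inhibitory peptides and is an assumption, not the reference collection used in the study; the function name `bioactive_length_profile` and the example fragments are likewise hypothetical.

```python
from collections import Counter

# Illustrative reference set only (assumption): a few well-known
# ACE-inhibitory di- and tri-peptides, not the study's reference data.
KNOWN_BIOACTIVE = {
    "VY": "ACE inhibitor",
    "IPP": "ACE inhibitor",
    "VPP": "ACE inhibitor",
}


def bioactive_length_profile(peptides, reference=KNOWN_BIOACTIVE):
    """Count peptides that exactly match a reference sequence, keyed by length."""
    matches = [p for p in peptides if p in reference]
    return Counter(len(p) for p in matches)


if __name__ == "__main__":
    # Hypothetical digestion products, not results from the study.
    fragments = ["VY", "IPP", "GGK", "VY"]
    for length, count in sorted(bioactive_length_profile(fragments).items()):
        print(f"{length} residues: {count} bioactive peptide(s)")
```

Exact matching is used because the study counts only predicted peptides whose sequences are identical to those of known bioactive peptides.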
This study makes no attempt to investigate the efficacy of bioactive peptides but rather provides an in silico prediction of the number and types of bioactive peptides that potentially can be generated in the gastrointestinal tract during digestion.
To put the current findings into context, an attempt was made to predict the amounts of bioactive peptides that may be released into the gastrointestinal tract per day from either dietary protein or gut endogenous protein sources (Table 7). For the daily dietary protein intake, the food proteins examined in the present study were used as ingredients for a theoretical diet. This model diet was formulated to contain approximately 40 g of protein, which represents the Food and Agriculture Organization of the United Nations' (FAO) recommended daily protein intake for a healthy adult weighing 60 kg [1]. The proportion of each individual dietary protein was derived from a model diet assumed to contain 127 g of dairy products, 128 g of wheat-based products, 25 g of soya products, 44 g of egg (one medium-sized egg) and 46 g of roasted chicken (the vegetables, fruits, fats and sugars in the diet were omitted from the present estimations as they contain negligible amounts of protein). The amount of endogenous protein secreted into the gastrointestinal tract was estimated from the reported amounts of gut endogenous protein nitrogen secreted into the gastrointestinal tract, excluding protein nitrogen derived from epithelial and bacterial cells [41]. Based on the model diet, it is predicted that in a healthy adult, dietary proteins may contribute 1842 mg of bioactive peptides per day, while the gut endogenous proteins (excluding microbial protein and sloughed cells) may yield up to 2689 mg per day. Given that microbial protein and sloughed cells, which make up approximately two thirds of the total gut non-dietary protein, were not included in the latter prediction, it is likely that the amount of bioactive peptides derived from gut endogenous proteins would be much higher.
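The arithmetic behind this estimate can be sketched as follows. The intake rate (0.66 g protein per kg body weight) and the per-food protein contributions come from the Table 7 footnotes. The mass calculation follows the approach those footnotes describe (moles of protein multiplied by the predicted number of bioactive peptides per molecule and by a peptide molar mass), but the function names, the average dipeptide molar mass of 238 g/mol, the serum albumin molar mass of about 66,500 g/mol and the 1 g protein input are assumptions for illustration and are not values used in the study.

```python
# Protein contribution of each food group in the model diet (g per day),
# as listed in the Table 7 footnotes.
MODEL_DIET_PROTEIN_G = {"dairy": 4, "wheat": 14, "soya": 3, "egg": 6, "chicken": 13}


def recommended_protein_g(body_weight_kg, g_per_kg=0.66):
    """Daily protein intake implied by a per-kilogram recommendation."""
    return body_weight_kg * g_per_kg


def bioactive_peptide_mg(protein_g, protein_molar_mass,
                         peptides_per_molecule, peptide_molar_mass=238.0):
    """Mass of bioactive peptides (mg) released from a given mass of protein.

    Assumes every protein molecule releases the predicted number of peptides
    and that each peptide is a dipeptide of average molar mass
    (238 g/mol is an assumed default, not a value from the study).
    """
    moles_protein = protein_g / protein_molar_mass
    return moles_protein * peptides_per_molecule * peptide_molar_mass * 1000.0


if __name__ == "__main__":
    print(recommended_protein_g(60))           # roughly the 40 g model diet
    print(sum(MODEL_DIET_PROTEIN_G.values()))  # total protein in the model diet
    # Hypothetical example: 1 g of serum albumin (about 66,500 g/mol, assumed)
    # releasing 22 bioactive peptides per molecule.
    print(bioactive_peptide_mg(1.0, 66500.0, 22))
```

Summing this quantity over the proteins in the diet, or over the secreted endogenous proteins, would give daily totals of the kind reported in Table 7, subject to the stated assumptions.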
In conclusion, based on an in silico prediction, it would appear that gut endogenous proteins may be an important and diverse source of bioactive peptides in comparison with food proteins, particularly given that gut endogenous proteins are likely to be present in the gastrointestinal tract at a more constant concentration and composition than proteins derived from the diet. However, further in vitro and in vivo work is needed to corroborate the in silico predictions of the present study.

Table 7 footnotes:
1 Estimated based on the predicted total number of bioactive peptides released after gastric and small intestinal digestion (from Table 4), and the moles and molar masses of the respective proteins, and considering that the majority of the predicted bioactive peptides are dipeptides. All of the evaluated food proteins are used as a model for the remaining proteins in the respective food product.
2 The model diet is based on a recommended diet for a healthy adult weighing 60 kg, supplying 0.66 g/kg body weight protein per day, amounting to a protein intake of 40 g per day, designed to comply with the FAO recommendations [1], whereby dairy, wheat, soya products, chicken egg products and chicken meat contribute 4, 14, 3, 6 and 13 g of protein respectively. Protein content of food products was estimated based on the United States Department of Agriculture (USDA) Nutrient Data Laboratory database [42].
3 Calculated based on Moughan, 2011 [41], using the amount of gut endogenous protein nitrogen secreted into the gastrointestinal tract, but excluding protein nitrogen derived from epithelial and bacterial cells.
doi:10.1371/journal.pone.0098922.t007