Abstract
Serratia marcescens, a Gram-negative bacterium (Enterobacteriaceae), is a hospital-acquired opportunistic pathogen that infects the urinary and central nervous systems. Because it is now multi-drug resistant, identifying new therapeutics against S. marcescens is crucial. The current study therefore aimed to identify potential drug targets in three S. marcescens strains (WW4, SM39, and Db11) using comparative metabolic pathway analysis and a subtractive genomics approach. This bioinformatics-based method was used to identify unique metabolic pathways as a source of prioritized drug targets. The downstream analysis identified three pathways that are specifically present or absent in individual strains. Subtractive genomic analysis then identified six proteins that are non-homologous to the human proteome, one of which was found to be essential to the pathogen’s survival and unique to the WW4 strain. These features suggest it as a potential drug target. The selected protein was subjected to in-depth structural analysis, including structure modeling, structure validation, and protein-protein interaction analysis. Furthermore, a library of ~1500 approved compounds was screened against the selected drug target to identify potential drug candidates. The current work may help in repurposing approved drugs as novel medication against S. marcescens.
Citation: D’Souza SE, Khan K, Uddin R (2023) Proteogenomic analysis of Serratia marcescens using computational subtractive genomics approach. PLoS ONE 18(4): e0283993. https://doi.org/10.1371/journal.pone.0283993
Editor: Syed Hani Abidi, Nazarbayev University School of Medicine, PAKISTAN
Received: December 6, 2022; Accepted: March 21, 2023; Published: April 10, 2023
Copyright: © 2023 D’Souza et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: All relevant data are within the manuscript and its Supporting information files.
Funding: The authors received no specific funding for this work.
Competing interests: The authors have declared that no competing interests exist.
1. Introduction
Serratia marcescens is an opportunistic nosocomial pathogen and the causative agent of many infections, including those of the urinary tract and central nervous system (meningitis), the heart (endocarditis), and wounds [1, 2]. S. marcescens affects immunocompromised patients, specifically those undergoing broad-spectrum antibiotic therapy. The bacteria spread mostly through invasive instrumentation such as intubation material and intravenous and urinary catheters [3, 4]. The pathogenicity of S. marcescens can be attributed to several virulence factors such as the pore-forming toxin hemolysin, the serralysin protease, and a phospholipase [5–7]. S. marcescens outbreaks have been reported primarily in neonatal intensive care units (NICUs) and intensive care units (ICUs) [8, 9]. S. marcescens strains are now generally multidrug-resistant [10].
Antibiotic resistance is currently a major threat to antibacterial therapy [11], because it limits the options available to treat a pathogen. During the last twenty years, S. marcescens was treated empirically with an aminoglycoside, piperacillin-tazobactam, a carbapenem, or a fluoroquinolone, and the treatment was later modified according to strain susceptibility test results [1]. A recent study revealed that S. marcescens is intrinsically resistant to polymyxin, an antibiotic used as a last resort to treat carbapenem-resistant Enterobacteriaceae [12]. The WHO has classified Serratia species under the critical category because they are resistant to carbapenems [13]. Analysis of antimicrobial resistance genes in S. marcescens using public whole-genome datasets detected 100 antimicrobial resistance (AMR) genotypes. Aminoglycoside and beta-lactam resistance genes, including extended-spectrum beta-lactamase (ESBL) and carbapenemase genes, are highly prevalent in clinical isolates of S. marcescens [14]. Limited research on hospital-acquired infections during the last decade has translated into a deficit of drugs against multidrug-resistant pathogens [15, 16]. There is a dire need for innovative therapeutic interventions against resistant bacteria [17], and demand for research on S. marcescens has increased in recent times because of its antibiotic resistance [18].
Different computational methods have been used to search for new chemical entities as potential drug candidates. In this context, comparative subtractive genomics coupled with metabolic pathway analysis produces robust data for prioritizing unique essential proteins that may serve as putative targets for designing novel drugs. Numerous such applications have been reported in the literature to propose new drug targets against deadly pathogens [19]. In contrast to the traditional drug discovery process, which is tedious, time-consuming, and expensive, computational discovery of drug targets is a rational approach. It speeds up drug discovery, expands therapeutic options, and decreases the failure rate of drugs in clinical trials. Bioinformatics methods can predict drug targets based on their selectivity/specificity and essentiality, cellular function, location inside the cell, ability to be targeted by broad-spectrum agents, functional interactions with proteins of metabolic pathways, and druggability [20].
Although S. marcescens has been extensively studied to identify potential drug targets, the method applied here has not, to our knowledge, been reported for this pathogen. This work is therefore intended to help identify further potential drug targets.
2. Material and methods
In this study, all available strains of S. marcescens were included in the analysis. The strains were analyzed to find potential drug targets within the metabolic pathways of the pathogen. Metabolic pathways are networks of interconnected metabolic events that occur in a cell. Some of these pathways and their metabolites are essential for the survival of the pathogen, so it is prudent to target them to eradicate the pathogen. The metabolic data of the three available S. marcescens strains were therefore retrieved. Metabolic pathways that are unique to the pathogen when compared with Homo sapiens were selected using these data. Essential proteins of these pathways were shortlisted using BLAST against the DEG database. Protein structural studies were then conducted to model a 3D structure and locate the binding pockets of the protein. Finally, protein characterization and network topological analyses were performed. The study is divided into two phases: (1) subtractive genomics and (2) structural studies. The details are provided in the workflow (Fig 1).
Potential drug targets against S. marcescens identified using this methodology.
2.1. Phase I: Subtractive genomics
2.1.1. Data retrieval.
The NCBI [21] and KEGG [22] databases were used to retrieve data on three strains of S. marcescens. The human proteome dataset was retrieved from NCBI. The data for the three available strains (WW4, SM39, and Db11) were accessed from KEGG on 10 February 2022, and the htext files of these strains were downloaded. Bash scripting was used to extract the metabolic pathway data of each strain from the BRITE hierarchy file. The script is provided in S1 Appendix. Table 1 lists each organism’s name, its three-letter code, and its T-number.
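The extraction step can be illustrated with a minimal Python sketch. It is not the authors’ Bash script (which is in S1 Appendix). It assumes the usual KEGG htext layout, in which each line begins with a level letter (A to D) and level-C pathway entries end with a tag such as [PATH:smw00333]. The function name and the sample text are illustrative assumptions.

```python
import re

# Matches level-C lines such as:
#   "C    00333 Prodigiosin biosynthesis [PATH:smw00333]"
# (layout assumed from typical KEGG htext files)
PATHWAY_LINE = re.compile(r"^C\s+(\d{5})\s+(.+?)\s+\[PATH:([a-z]+)\d{5}\]\s*$")


def extract_pathways(htext):
    """Return {pathway_id: pathway_name} for every level-C pathway line."""
    pathways = {}
    for line in htext.splitlines():
        match = PATHWAY_LINE.match(line)
        if match:
            pathway_id, name, _organism = match.groups()
            pathways[pathway_id] = name
    return pathways


# Illustrative snippet only; this is not a real KEGG download.
SAMPLE = """\
A09100 Metabolism
B  09110 Biosynthesis of other secondary metabolites
C    00333 Prodigiosin biosynthesis [PATH:smw00333]
D      GENE_PLACEHOLDER pigC; placeholder gene line
"""

print(extract_pathways(SAMPLE))
```

Running the same function on the htext file of each strain would give one pathway dictionary per strain, which is the input needed for the comparison step.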
2.1.2. Identifying differences in the strains.
The Diffuse tool (http://diffuse.sourceforge.net/) was employed to display the differences among the three strains graphically. Metabolic pathways that were uniquely present or absent among the three pathogenic strains were visualized. Studies have reported the role of specific pathways and how their presence or absence affects the survival of the pathogen [23]. Further analysis was conducted on the sequences of proteins belonging to the unique pathways, retrieved via KEGG. Proteins found in common metabolic pathways were discarded.
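The same comparison can also be expressed with set operations. The sketch below is an illustration under stated assumptions rather than the Diffuse-based procedure used in the study. The pathway sets are toy placeholders, except that 00333 is placed only in smw (WW4), as the study reports.

```python
def unique_and_missing(pathways_by_strain):
    """For each strain, list pathways found only in that strain and
    pathways absent from it but present in every other strain."""
    report = {}
    for strain, own in pathways_by_strain.items():
        others = [p for s, p in pathways_by_strain.items() if s != strain]
        in_any_other = set().union(*others)
        in_all_others = set.intersection(*others)
        report[strain] = {
            "unique": sorted(own - in_any_other),
            "missing": sorted(in_all_others - own),
        }
    return report


# Toy pathway sets; real sets would come from the KEGG htext files.
toy = {
    "smw": {"00010", "00020", "00333"},
    "smar": {"00010"},
    "smac": {"00010", "00020"},
}
print(unique_and_missing(toy))
```

Pathways reported as "unique" or "missing" correspond to the rows of the heatmap that differ between strains. Pathways shared by all strains appear in neither list and would be discarded.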
2.1.3. Non-homologous protein identification.
The shortlisted proteins of the three S. marcescens strains were compared with the human (host) proteome using the standalone BLASTp tool [24]. BLASTp requires two inputs: (i) a sequence database and (ii) a query sequence, together with an appropriate E-value threshold, here 1e-3. The E value is a statistical parameter that gives the number of hits with an equal or better score expected by chance when searching a database of a given size. The E value decreases exponentially as the alignment score increases. Proteins with no hits against the human proteome were used for downstream analysis.
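The exponential relationship between score and E value can be made concrete with the Karlin-Altschul relation E = m · n · 2^(−S′), where S′ is the bit score, m the query length, and n the database length. The sketch below is a simplified illustration. BLAST itself uses effective (edge-corrected) lengths, and the numbers here are arbitrary placeholders rather than values from the study.

```python
def expected_hits(bit_score, query_length, database_length):
    """Karlin-Altschul E value from a bit score: E = m * n * 2**(-S')."""
    return query_length * database_length * 2.0 ** (-bit_score)


# Each additional bit of score halves the number of hits expected by chance.
e_low = expected_hits(40, query_length=300, database_length=10_000_000)
e_high = expected_hits(41, query_length=300, database_length=10_000_000)
print(e_low, e_high)
```

A stricter threshold such as 1e-5 therefore demands a higher-scoring alignment than 1e-3 for the same query and database.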
2.1.4. Druggability of non-homologous proteins.
Additionally, the resulting non-homologous proteins were searched against the DrugBank database [25] using BLASTp with an E value of 1e-5 to establish their druggability and find new drug targets. Proteins that showed strong sequence similarity to the DrugBank version 5.1.9 All Drug Target library were inferred to be potential drug targets, whereas those with no similarity were termed novel drug targets.
2.1.5. Identifying essential proteins.
The S. marcescens non-homologous proteins were annotated using BLASTp. The Database of Essential Genes (DEG) was used to identify proteins essential to the survival of bacteria [26]. BLASTp was run against the DEG prokaryotic data with an E-value cut-off of 1e-5, sequence identity above 30%, and query coverage above 70% to identify essential proteins.
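The three cut-offs can be applied to tabular BLAST output as in the following sketch. It assumes that BLAST+ was run with -outfmt "6 qseqid sseqid pident qcovs evalue", which is an assumption about the column layout and not a detail stated in the study. The rows are made up for illustration.

```python
import csv
import io


def essential_candidates(blast_tsv, max_evalue=1e-5, min_identity=30.0,
                         min_coverage=70.0):
    """Return query IDs with at least one DEG hit passing all cut-offs.

    Assumes tab-separated columns: qseqid, sseqid, pident, qcovs, evalue.
    """
    passing = set()
    reader = csv.reader(io.StringIO(blast_tsv), delimiter="\t")
    for qseqid, _sseqid, pident, qcovs, evalue in reader:
        if (float(evalue) <= max_evalue
                and float(pident) > min_identity
                and float(qcovs) > min_coverage):
            passing.add(qseqid)
    return sorted(passing)


# Made-up rows for illustration; these are not results from the study.
rows = "\n".join([
    "protA\tDEG_hit1\t45.2\t88\t1e-40",
    "protB\tDEG_hit2\t28.0\t90\t1e-12",  # identity too low
    "protC\tDEG_hit3\t52.0\t40\t1e-20",  # coverage too low
])
print(essential_candidates(rows))
```

Queries that return no passing hit would be treated as non-essential under this criterion.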
2.1.6. Protein characterization.
Characterizing a protein’s subcellular location is a crucial aspect of its functional analysis. PSORTb v3.0 [27], which covers the diverse cellular morphologies of archaea and bacteria, was used to predict the subcellular location of the identified drug targets. Furthermore, InterProScan was used to analyze and annotate the function of the protein based on predictive signatures in its primary sequence [28].
2.1.7. Physicochemical parameter analysis.
The physicochemical properties of the selected proteins were estimated using the ProtParam server [29]. ProtParam computes the molecular weight, isoelectric point, atomic and amino acid composition, estimated half-life, extinction coefficient, instability and aliphatic indices, and the protein’s Grand Average of Hydropathicity (GRAVY).
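Two of these indices are simple enough to compute by hand, which helps when interpreting the reported values. The sketch below uses the Kyte-Doolittle hydropathy scale for GRAVY and Ikai’s formula for the aliphatic index. It is an illustration of the definitions, not ProtParam’s own code, and the peptide is an arbitrary toy sequence.

```python
# Kyte-Doolittle hydropathy values (Kyte & Doolittle, 1982).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence):
    """Grand average of hydropathicity: mean hydropathy per residue."""
    return sum(KYTE_DOOLITTLE[aa] for aa in sequence) / len(sequence)


def aliphatic_index(sequence):
    """Ikai's aliphatic index: X(Ala) + 2.9*X(Val) + 3.9*(X(Ile) + X(Leu)),
    where X is the mole percent of each residue."""
    n = len(sequence)

    def mole_percent(aa):
        return 100.0 * sequence.count(aa) / n

    return (mole_percent("A") + 2.9 * mole_percent("V")
            + 3.9 * (mole_percent("I") + mole_percent("L")))


toy_peptide = "MKVLA"  # arbitrary toy sequence, not part of pigC
print(gravy(toy_peptide), aliphatic_index(toy_peptide))
```

A negative GRAVY value indicates an overall hydrophilic protein, and a higher aliphatic index is generally read as greater thermal stability.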
2.1.8. Protein interactome analysis.
The interaction network of the pigC protein, known as prodigiosin synthetase, which is involved in the synthesis of the antimicrobial compound prodigiosin, was analyzed using STRING [30]. STRING predicts a protein’s functional and physical associations. It statistically integrates data from various sources for a large number of species and transfers information between these organisms. At the time of this study, the database contained 5,214,234 proteins from 1133 species. In this study, the protein-protein interactions of the ‘essential’ cytoplasmic proteins were analyzed using this tool.
2.2. Phase II: Protein structural studies
BLAST was run online against the PDB via NCBI [31] to search for an appropriate template for the structural modeling of the selected protein. The protein structure was modeled using the RoseTTAFold method available on the Robetta web server [32]. Additionally, ProCheck [33] and Verify3D [34] were used to validate the structure of the modeled protein. PSIPRED [35] was used to predict the protein’s secondary structure.
2.2.1. Docking of ligands.
DoGSiteScorer was used to find protein binding pockets [36]. It uses statistical analysis to rank the binding sites, and the binding site with the highest score was used in the study. The ligands of the shortlisted protein were retrieved from the KEGG database, which showed that the protein binds three ligands. The structures of these ligands were retrieved from PubChem [37]. AutoDock [38] was used to dock each ligand into the selected binding pocket in a separate session. In all sessions, the number of genetic algorithm (GA) runs was 50 and the population size was 300 to ensure reasonable docking results. Other parameters, including ga_num_evals, were selected in proportion to the number of torsions in the ligand. The docking conformation with the lowest binding energy was selected, and the docked complex was visualized in Chimera [39]. Furthermore, LigPlot was used to examine the 2D protein-ligand interactions [40] in terms of hydrogen bonds and hydrophobic interactions.
2.2.2. Virtual screening.
Approximately 1500 compounds in the approved drugs library were retrieved from the DrugBank database to screen against the shortlisted drug target. The target protein structure was checked for structural errors using AutoDockTools (ADT). Grid box parameters and configuration files for the drug target were prepared using the docking parameters described above. The grid box covering the active site predicted by DoGSiteScorer was set to 126 points in each of the X, Y, and Z dimensions, centered at X = 21.294, Y = 47.682, and Z = 25.044, so that docking occurred in the selected binding pocket containing the active residues retrieved from DoGSiteScorer (LEU_182, THR_183, ALA_301, LYS_302, VAL_304). AutoDock Vina was used to perform the molecular docking.
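A configuration file of this kind can be generated per ligand, as sketched below. Two points are assumptions and not details from the study. First, AutoDock Vina takes the box size in ångström, whereas the text reports grid points, so the sketch converts 126 points using the AutoDock default spacing of 0.375 Å; the spacing actually used is not stated. Second, the file names are hypothetical.

```python
def vina_config(receptor, ligand, center, n_points, spacing=0.375):
    """Build the text of an AutoDock Vina configuration file.

    n_points and spacing follow AutoDock grid conventions; Vina itself
    expects box edge lengths in angstrom, so they are multiplied here.
    The default spacing of 0.375 angstrom is an assumption.
    """
    size = n_points * spacing
    center_x, center_y, center_z = center
    lines = [
        f"receptor = {receptor}",
        f"ligand = {ligand}",
        f"center_x = {center_x}",
        f"center_y = {center_y}",
        f"center_z = {center_z}",
        f"size_x = {size}",
        f"size_y = {size}",
        f"size_z = {size}",
    ]
    return "\n".join(lines) + "\n"


# Hypothetical file names; the grid centre is the one reported in the text.
config = vina_config("pigC_model.pdbqt", "DB00423.pdbqt",
                     center=(21.294, 47.682, 25.044), n_points=126)
print(config)
```

Writing one such file per compound and passing it to Vina with its --config option is one way to batch the screening of a library against a single pocket.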
2.2.3. ADMET profiling.
The chemical descriptors and druggability of the compounds were analyzed using the SwissADME tool. The tool was used to determine pharmacokinetic properties of the compounds such as absorption, metabolism, druggability, rule of five, intestinal absorption, Caco-2 permeability, and toxicity [41]. Ames mutagenicity, carcinogenicity, hepatotoxicity, and skin sensitization were used as toxicity endpoints to screen the chosen compounds.
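The rule of five mentioned above reduces to four threshold checks on descriptors that tools such as SwissADME report. The sketch below counts violations for a set of descriptor values. The example values are hypothetical and do not describe any compound from the study.

```python
def lipinski_violations(mol_weight, log_p, h_bond_donors, h_bond_acceptors):
    """Count violations of Lipinski's rule of five.

    Thresholds: molecular weight <= 500 Da, logP <= 5,
    hydrogen-bond donors <= 5, hydrogen-bond acceptors <= 10.
    """
    checks = [
        mol_weight > 500,
        log_p > 5,
        h_bond_donors > 5,
        h_bond_acceptors > 10,
    ]
    return sum(checks)


# Hypothetical descriptor values for illustration only.
print(lipinski_violations(mol_weight=320.4, log_p=2.1,
                          h_bond_donors=2, h_bond_acceptors=5))
print(lipinski_violations(mol_weight=720.9, log_p=6.3,
                          h_bond_donors=2, h_bond_acceptors=5))
```

Compounds with at most one violation are conventionally regarded as compliant with the rule.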
2.2.4. Conservancy analyses of predicted sequences with other strains.
The extent of the pharmacological spectrum over the whole homologous bacterial population may be inferred through comparison of the predicted sequences’ conservation pattern with other strains that are used conventionally. Therefore, the online BLAST [42] was used to conduct a conservancy analysis of the pigC protein. The BLAST was implemented using default settings with the exception of the restricted taxonomy option, where we specified S. marcescens with the taxonomic ID of 615 and UniProt KB was selected as the target database.
3. Results
3.1. Phase I: Subtractive genomics
3.1.1. Potential drug target identification.
After the metabolic pathway data were retrieved, the differences among the strains were annotated, and pathways that were unique to or absent from individual strains were identified.
SM39 showed the greatest number of missing pathways, and Db11 showed the greatest number of additional pathways, such as the Styrene and Polycyclic Aromatic Hydrocarbon Degradation pathways. The WW4 strain had one additional pathway unique to itself, the Prodigiosin Synthesis pathway.
All of these pathways play roles in the survival of the bacteria, with the Prodigiosin Synthesis pathway giving WW4 a competitive edge over other microbes. Prodigiosin was earlier reported to provide a survival advantage to bacteria against competitors (other bacteria) and predators (nematodes and protozoans), as its secretion disrupts the membranes of other organisms [43]. A heatmap of the metabolic pathways present in the three strains of S. marcescens is shown in Fig 2. S1 Table lists the names and IDs of the pathways displayed in Fig 2.
Blue represents absence and brick-red represents presence of a pathway. The IDs of the first three pathways are provided in the heatmap, with the remaining IDs provided in the key. The unique pathways are highlighted in blue in the key. The strains are denoted as smw (WW4), smar (SM39), and smac (Db11) according to the Kyoto Encyclopedia of Genes and Genomes (KEGG).
3.1.2. Identifying non-homologous and essential proteins.
The heatmap showed that the strains Db11 and WW4 had unique pathways, and these strains were therefore used for further analysis. The proteins of those specific metabolic pathways were retrieved from the KEGG database. Among twenty proteins from both strains, two proteins in the Db11 strain and four in the WW4 strain were non-homologous to the human proteome. BLASTp was then run against the DEG database using these six proteins, which identified only one protein, pigC, as essential to the survival of the bacteria. The BLASTp parameters and their respective outcomes are given in Table 2.
Details of these pathways and their proteins are given in Table 3. According to the literature, non-homologous proteins can serve as potential drug targets because they are unique to the pathogen [44].
3.1.3. Druggability of non-homologous protein.
The BLASTp search against DrugBank targets returned no hits for the pigC protein under the set parameters. Finding no hits does not necessarily mean that the protein cannot be used as a therapeutic target. Rather, the protein was classified as a novel target because of its absence from the DrugBank database, i.e., it could be a good target to explore in further detail.
3.1.4. Essential protein analysis.
The main criterion that a potential drug target must satisfy is essentiality for bacterial survival, since targeting such proteins may specifically kill the bacteria. The essentiality of the non-homologous proteins was assessed by comparing them with the DEG database using BLASTp. This identified a single protein, prodigiosin synthetase, as vital for bacterial survival. The remaining five proteins returned no hits, were regarded as non-essential, and were discarded. Hence, prodigiosin synthetase was selected as a novel drug target against S. marcescens.
3.1.5. Significance of selected protein.
The selected protein was annotated as prodigiosin synthetase, encoded by the gene pigC in the WW4 strain of S. marcescens. This protein was found to be unique to the WW4 strain and is part of the accessory genome of the organism. The metabolic pathway in which it is involved is named Prodigiosin Biosynthesis (KEGG ID = 00333). A comprehensive literature review was conducted to find the roles of prodigiosin synthetase in pathogens and in other microbes.
In brief, this enzyme is involved in the biosynthesis of prodigiosin, which has anticancer, antimalarial, antifungal, and antibacterial properties. As the pathogen invests a large amount of energy in the biosynthesis of prodigiosin, it can be inferred that prodigiosin is critical to the survival of S. marcescens, particularly during interspecies competition. Targeting such a protein would inhibit the pathway and the synthesis of this antimicrobial agent, making the bacteria susceptible to treatment.
3.1.6. Comparative protein analysis.
Sequence alignment of the selected protein with its homologs in different species was performed to analyze its phylogenetic relationships. The sequence of the selected S. marcescens protein was compared with homologous proteins in Listeria monocytogenes and Neisseria meningitidis using Clustal Omega v1.2.4 [45]. The results showed very low similarity between the S. marcescens enzyme and those of the other bacterial species. The percentage similarity of prodigiosin synthetase with the different species is given in Table 4.
3.2. Phase II: Structure-based studies
In the absence of a suitable template, de novo structural modeling with RoseTTAFold was used for the selected protein sequence. Fig 3 shows the modeled structure of the query protein, its angstrom error estimate, and the results of the structural assessment. The model had a confidence score of 0.84, where 1 is the highest possible score.
(a) Modelled protein structure of prodigiosin synthetase through Robetta. (b) Angstrom Error Estimate graphically shown. (c) Verify 3D Plot showing the quality of modelled structure having 95.83% in allowed region.
The angstrom error estimate of the model fluctuated along the sequence, ranging from a minimum of 0.7 to a maximum of 6.02 (Fig 3b).
The modeled structure of the protein was evaluated using the Verify 3D tool, which quantitatively evaluates the overall quality of a protein’s three-dimensional structure. The model obtained a quality score of 95.83%, i.e., 95.83% of the residues had a score of ≥ 0.2, which exceeds the requirement that at least 80% of the amino acids reach this score (Fig 3c).
Additionally, the protein’s secondary structure was assessed using PsiPred, which estimates the secondary structural elements from the primary structure of the query protein. Yellow bars represent strands, straight lines represent coils, and pink bars represent helices. S1 Fig shows the PsiPred results predicting the positions of strands, helices, and coils. The first strand spans residues 6–8, and the first helix spans residues 17–19. A comparison between the PsiPred results and the modeled three-dimensional structure revealed that the prediction mostly agreed with the modeled structure of the protein.
A Ramachandran plot was generated for the modeled protein to evaluate its quality. ProCheck placed 91.5% of the residues in the core allowed region, 7.5% in the allowed region, 0.4% in the generously allowed region, and only 0.6% in the disallowed region. The five residues in the disallowed region were an asparagine, a lysine, an isoleucine, and two serine residues. Overall, the plot indicates good quality of the structure, as shown in S2 Fig.
3.2.1. Protein characterization.
The selected protein was characterized by studying its subcellular location, physicochemical characteristics, superfamily and interaction network. Details are provided in the following sections:
3.2.2. Sub–cellular localization prediction and functional family classifications.
Identifying a protein’s sub-cellular localization is important for characterizing its function and chemical nature, and it aids in evaluating the molecular structure and druggability of the protein. The selected protein was classified as cytoplasmic, with a PSORTb localization score of 9.26 for the cytoplasm compared with 0.24 for the cytoplasmic membrane. The bar chart in Fig 4a shows the protein’s sub-cellular localization distribution. The identified drug target is therefore predicted to be located within the cytoplasm of the pathogen.
(a) The sub-cellular localization of protein. The results are indicating that the query protein belongs to cytoplasmic region in the cell. (b) Image showing the network topology of the query protein in red color (PigC) generated by STRING database showing the top 11 interactions.
InterProScan is an online tool that computes a protein’s functional domains and predicts its superfamily, which serve as diagnostic signatures for protein classification. InterProScan predicted an ATP-binding functional domain in the protein. This prediction serves as a diagnostic marker of the functional domain and the type of compounds with which the protein interacts. The tool also predicted that the protein is associated with the phosphorylation pathway within the cell. S3 Fig shows the InterProScan results.
3.2.3. Physicochemical properties prediction and analysis of network topology.
ProtParam was used to compute the protein’s physicochemical properties. The GRAVY index was estimated at -0.197, indicating that the drug target is hydrophilic. The theoretical pI of the protein was 6.45, implying that the protein is slightly acidic. Because the protein’s instability index is higher than 40, the protein is considered unstable. Table 5 shows the ProtParam results.
Fig 4b shows the hub interactions of the pigC protein (red node) with the top 11 nodes. The network has 199 nodes and 2721 edges, compared with 1126 expected edges, and an average node degree of 27.3. The enrichment p-value for the predicted network is less than 1.0e-16, with an average local clustering coefficient of 0.701. These results suggest that the protein is involved in various critical functions. Consequently, targeting such a protein could cause loss of function of the associated proteins and thereby inhibit numerous biological pathways. Thus, it may be proposed as a potential therapeutic target.
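The reported network statistics can be cross-checked with a short calculation. In an undirected network each edge contributes to the degree of two nodes, so the average node degree is 2E/N. The sketch below applies this to the edge and node counts given in the text; the function name is illustrative.

```python
def average_node_degree(n_edges, n_nodes):
    """Average degree of an undirected network: each edge touches two nodes."""
    return 2 * n_edges / n_nodes


observed_edges, expected_edges, nodes = 2721, 1126, 199

print(round(average_node_degree(observed_edges, nodes), 1))  # 27.3
print(round(observed_edges / expected_edges, 2))             # 2.42
```

The first value reproduces the reported average node degree of 27.3. The second shows that the network contains roughly 2.4 times as many edges as expected by chance, which is what the very small enrichment p-value reflects.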
3.2.4. Shortlisting of a binding pocket and prediction of ligand.
DoGSiteScorer was used to predict the protein’s ligand-binding pockets for further structural characterization. Of the 28 binding pockets identified, one pocket was selected based on the highest average of simple and drug scores. The selected pocket had the highest drug score (0.725) and a surface area of 1182.91 Å². The descriptive details of the binding pocket can be found in S4 Fig. Potential ligands for the binding pocket were selected so that more detailed information on the protein’s structural properties could be obtained. For this purpose, MBC (4-Methoxy-2,2’-bipyrrole-5-carbaldehyde), MAP (2-Methyl-3-n-amyl-pyrrole), and ATP (Adenosine triphosphate) were used as ligands. These molecules are reported under R11662 in the KEGG database. MBC and MAP act as substrates and ATP as a cofactor for prodigiosin synthetase. The docking analysis revealed that all ligands bind to the protein with favorable binding energies: -4.64 kcal/mol for ATP, -6.04 kcal/mol for MBC, and -4.83 kcal/mol for MAP.
Additionally, the intermolecular hydrogen-bonding interactions between the protein and the ligands were examined. The protein-ligand interactions were visualized using Chimera and LigPlot. The 2D LigPlot interactions showed that the protein forms favorable hydrogen bonds with all three ligands, as the bond lengths fall within 2.7 to 3.3 Å. ATP forms the greatest number of hydrogen bonds with the amino acid residues of the protein, whereas MAP forms only a single hydrogen bond but numerous nonpolar interactions (Fig 5a). Furthermore, hydrogen bonds with lysine and asparagine residues are common to MBC and ATP (Fig 5b and 5c, respectively). The 3D interactions of the protein and the ligands were also generated using Chimera (Fig 6).
Protein ligand interactions generated using LigPlot. (a) 2-Methyl-3-n-amyl-pyrrole (MAP), (b) 4-Methoxy-2,2’-bipyrrole-5-carbaldehyde (MBC), and (c) Adenosine triphosphate (ATP).
Table 6 shows the ligand’s intermolecular interactions with the protein’s binding site.
3.2.5. Virtual screening and ADMET profiling.
Virtual screening was performed to identify compounds with binding energies similar to those of the docked ligands, which were used as a reference. Of the 1500 compounds in the approved drug library, 854 clustered in two peaks with binding energies in the range of -5.3 to -7.1 kcal/mol (Fig 7).
The most promising docked compounds fall within the range of -6.2 to -7.1 kcal/mol.
Moreover, ADMET profiling was performed to shortlist compounds with potential binding affinity for prodigiosin synthetase. Consequently, compounds DB00865, DB00821, DB00369, DB00423, DB00775, and DB00693 were shortlisted as favorable drug candidates (Table 7). These six compounds were further narrowed to two, DB00423 and DB00775, which met additional selection criteria such as impermeability to the blood-brain barrier and high absorption through the GI tract. Compounds DB00423 and DB00775 showed estimated binding energies of -6.4 kcal/mol and -6.7 kcal/mol, respectively. Additionally, both compounds showed no inhibition of the CYP1A2, CYP2C19, CYP2C9, CYP2D6, and CYP3A4 enzymes. The SwissADME profiling showed that these compounds follow the Lipinski rule and have estimated bioavailability scores of 0.55. These compounds showed favorable binding energies compared with the reference and could therefore serve as potent drug candidates against prodigiosin synthetase.
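The shortlisting logic described above amounts to a conjunction of filters, sketched below. The two DrugBank entries carry the binding energies and ADMET outcomes reported in the text. The third record is hypothetical and is included only to show a rejection. The reference energy is that of MBC (-6.04 kcal/mol), which is an illustrative choice of reference rather than a threshold stated by the study.

```python
from dataclasses import dataclass


@dataclass
class Compound:
    compound_id: str
    binding_energy: float  # kcal/mol; more negative means stronger binding
    gi_absorption: str     # "High" or "Low"
    bbb_permeant: bool     # True if predicted to cross the blood-brain barrier
    cyp_inhibitor: bool    # True if any of the five CYP isoforms is inhibited


def shortlist(compounds, reference_energy):
    """Keep compounds that bind at least as well as the reference and
    pass the absorption, BBB, and CYP criteria."""
    return [
        c.compound_id
        for c in compounds
        if c.binding_energy <= reference_energy
        and c.gi_absorption == "High"
        and not c.bbb_permeant
        and not c.cyp_inhibitor
    ]


candidates = [
    Compound("DB00423", -6.4, "High", False, False),
    Compound("DB00775", -6.7, "High", False, False),
    Compound("HYPOTHETICAL_1", -6.9, "High", True, False),  # made-up record
]
print(shortlist(candidates, reference_energy=-6.04))
```

Keeping each criterion as an explicit field makes it straightforward to tighten or relax a single filter without touching the others.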
3.2.6. Conservancy analyses of predicted sequences with other strains.
The conservancy analysis showed that the pigC protein from the WW4 strain aligned locally with the pigC protein of several other Serratia marcescens strains. Six proteins from different S. marcescens strains aligned to the reference pigC protein. Proteins with sequence identity > 90% can be used for the pigC conservancy analysis. Details of these proteins are provided in S3 Table.
4. Discussion
S. marcescens is a Gram-negative bacillus and an opportunistic nosocomial pathogen that causes pneumonia and infections of the urinary tract, blood, and central nervous system [1]. S. marcescens harbors an inducible chromosomal AmpC beta-lactamase, which confers intrinsic resistance to numerous antibiotics [46]. Approximately 200 S. marcescens outbreaks have been documented dating back to the 1950s [1]. Because S. marcescens strains are multi-drug resistant, effective therapies such as novel medications and vaccines are needed.
Computational biology is an emerging field of the biological sciences that aids data analysis. New predictions are being made successfully using data mining techniques. The post-genomic era produced genomic annotations on a scale that could not have been analyzed manually. Within the realm of infectious diseases, computational biology helps prioritize potential drug targets to treat deadly diseases. One such methodology combines subtractive genomics and metabolic pathway analysis. Metabolic pathway analysis is among the most cited strategies in the literature for the identification of novel drug targets. Environmental stress may cause upregulation of certain genes and the activation of selective metabolic pathways in bacteria. Such conditions may cause resistant pathogens to evolve unique metabolic pathways, as shown in Fig 2. Consequently, a metabolic pathway that has evolved for bacterial survival may be considered a possible therapeutic target. Some bacterial strains may be more sensitive to drugs if they lack a particular metabolic pathway in their genome. Subtractive genomics is widely used for the prioritization of potential drug targets against various hazardous pathogens [47, 48].
Therefore, in the current study, a comparative metabolic pathway analysis along with subtractive genomics was applied to identify unique proteins that may serve as potential drug targets against S. marcescens. Using this technique, the pathogen’s unique metabolic pathways were identified, and essential proteins were prioritized as potential drug targets. Following this strategy, pathway 00333 (Prodigiosin biosynthesis) was found to be unique to a single strain (WW4), as shown in Fig 2. We propose that inhibiting the prodigiosin biosynthesis pathway by targeting any of the pathway’s essential proteins may lead to a potential treatment against this particular strain of S. marcescens. The druggability analysis retrieved no hits from the database, meaning that prodigiosin synthetase can serve as a novel drug target [49]. Furthermore, the DEG search shortlisted prodigiosin synthetase, out of the six proteins shown in Table 3, as an essential protein associated with the prodigiosin biosynthesis pathway. Prodigiosin synthetase is involved in the final condensation step that produces the antimicrobial compound. As the pathogen invests a large amount of energy in the biosynthesis of prodigiosin, it can be inferred that prodigiosin is critical for the survival of S. marcescens, particularly during interspecies competition. Targeting such a protein might inhibit the pathway and thus render the bacteria susceptible to drug treatment.
Various bacteria, including S. marcescens and members of the Streptomycetaceae and Pseudoalteromonadaceae families, produce prodiginines [50]. Awareness of the benefits of this alkaloid has led to research on its production at the industrial level. For example, fatty acids from powdered peanut broth were shown to support the growth of S. marcescens and to elevate Prodigiosin production [51].
Thus, these studies concentrated on the benefits of this alkaloid for humans. However, it is also important to understand the threats associated with S. marcescens infection. For example, because prodiginines are antimicrobial in nature, this bacterium may pose a threat to the body’s natural flora. Knowledge of Prodigiosin synthetase, which synthesizes Prodigiosin, may help reduce the prevalence of infections caused by S. marcescens. One study revealed that mutagenesis, specifically a 17 bp deletion in the phosphoenolpyruvate (PEP) domain of the protein, resulted in truncation and loss of function of the domain. As a result, the bacterium (i.e., S. marcescens) failed not only to synthesize but also to secrete Prodigiosin, since macrovesicle formation was inhibited [52]. This suggests that the protein may serve as a viable drug target, as it is essential for the synthesis of Prodigiosin.
Additionally, de novo modeling was used to model the structure of Prodigiosin synthetase, as shown in Fig 3a. The structure was validated using the Robetta server (Fig 3b), Verify 3D (Fig 3c), PSIPRED (S1 Fig), and Procheck. S2 Fig shows that the Ramachandran plot generated using Procheck placed 91.5% of the amino acid residues of the protein in the core-allowed region. Similar results were obtained in a previous study [53]. Verify 3D assigned a quality score of 95.83% to the model, as shown in Fig 3c. A model is considered reliable when the compatibility score is higher than 80%, which further supports the accuracy of the generated model [34]. The prediction of the binding site and ligands provided detailed information regarding the protein structure. PSORTb analysis predicted that the protein is cytoplasmic (Fig 4a), which supports its suitability as a drug target. The physicochemical analysis showed that the protein is unstable. This is in line with previous studies showing that hub proteins are more disordered than end proteins [54]. Such proteins are suitable targets for inhibition, as they have been associated with signaling, regulatory pathways, and cancer [55]. Network topology analysis showed that the targeted hub protein interacts with 198 proteins, as shown in Fig 4b. The protein has an aliphatic index of 88.41, which indicates that it is thermally stable.
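For readers unfamiliar with the aliphatic index, the sketch below shows how such a value is commonly computed, assuming the standard ProtParam definition (Ikai's formula): the mole percent of Ala, plus 2.9 times the mole percent of Val, plus 3.9 times the combined mole percent of Ile and Leu. The function name `aliphatic_index` and the toy peptide are illustrative assumptions; the sequence is not that of Prodigiosin synthetase, and the sketch does not reproduce the reported value of 88.41.

```python
def aliphatic_index(sequence: str) -> float:
    """Aliphatic index from mole percentages of Ala, Val, Ile and Leu."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")

    def mole_percent(residue: str) -> float:
        # Percentage of positions in the sequence occupied by this residue.
        return 100.0 * seq.count(residue) / len(seq)

    # 2.9 and 3.9 weight the side-chain volumes of Val and Ile/Leu
    # relative to Ala (assumed ProtParam/Ikai coefficients).
    return (
        mole_percent("A")
        + 2.9 * mole_percent("V")
        + 3.9 * (mole_percent("I") + mole_percent("L"))
    )


toy_sequence = "AVILGGGG"  # hypothetical peptide, used only as an example
print(round(aliphatic_index(toy_sequence), 2))
```

A higher value reflects a larger relative volume of aliphatic side chains, which is conventionally read as a sign of greater thermostability. It is a separate measure from the instability index, so a protein can be classed as unstable by one and thermally stable by the other.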
Significantly, the docking studies showed that ATP binds to the protein with a binding affinity of -4.64 kcal/mol, MBC with -6.04 kcal/mol, and MAP with -4.83 kcal/mol. The parameters used for docking these ligands were also used for the virtual screening of a library of ~1500 approved compounds. About 854 of the ~1500 compounds had binding energies in the range of -5.3 to -7.1 kcal/mol. Compounds shortlisted for further analysis are highlighted as yellow bars in Fig 7. A total of six compounds, i.e., DB00865, DB00821, DB00369, DB00423, DB00775, and DB00693, were shortlisted, with binding energies in the range of -5.0 to -8.3 kcal/mol. These six compounds were further narrowed to two, i.e., DB00423 and DB00775, based on criteria such as impermeability to the blood-brain barrier and high absorption through the GI tract, as shown in Table 7. Because they have favorable binding energies, they may serve as good drug candidates against Prodigiosin synthetase.
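The two-stage shortlisting described above (a binding-energy cutoff followed by ADMET-style criteria) can be sketched as a simple filter. Everything in the example is an assumption made for illustration: the `Hit` record, the `shortlist` function, the cutoff of -5.0 kcal/mol, and the placeholder compounds and values, none of which are results from this study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    compound_id: str
    binding_energy: float  # kcal/mol; more negative means more favorable
    bbb_permeant: bool     # predicted to cross the blood-brain barrier
    gi_absorption: str     # predicted GI absorption class: "High" or "Low"


def shortlist(hits, energy_cutoff):
    """Keep hits at or below the cutoff that are BBB-impermeant with high
    GI absorption, ordered from most to least favorable binding energy."""
    kept = [
        h for h in hits
        if h.binding_energy <= energy_cutoff
        and not h.bbb_permeant
        and h.gi_absorption == "High"
    ]
    return sorted(kept, key=lambda h: h.binding_energy)


# Placeholder screening results (hypothetical IDs and values).
hits = [
    Hit("CMPD_A", -7.9, False, "High"),
    Hit("CMPD_B", -6.2, True, "High"),   # rejected: BBB permeant
    Hit("CMPD_C", -5.4, False, "Low"),   # rejected: low GI absorption
    Hit("CMPD_D", -4.1, False, "High"),  # rejected: weaker than cutoff
    Hit("CMPD_E", -6.8, False, "High"),
]

selected = shortlist(hits, energy_cutoff=-5.0)
print([h.compound_id for h in selected])
```

Applying the energy cutoff and the pharmacokinetic criteria in one pass keeps the selection rule explicit and easy to audit; in practice the ADMET flags would come from a prediction tool rather than being entered by hand.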
The conservancy analysis showed that Prodigiosin synthetase (PigC) is highly conserved among its homologues in S. marcescens strains from different geographical locations. A predicted target protein should be conserved among the strains of a specific pathogen to ensure that it can serve as an effective drug or vaccine target across multiple strains [56]. Hence, PigC might be a potent broad-spectrum drug target against S. marcescens.
Nevertheless, the study has some limitations, as all analyses were computational. Complementary experimental studies are recommended to validate these findings, and in vitro and in vivo studies are the suggested follow-up for future collaborative research among the scientific community.
5. Conclusion
Analysis of the pathogen’s genome and proteome helped in drug target identification. In this research, subtractive genomics and metabolic pathway analysis were used to predict a non-homologous, essential, druggable protein in S. marcescens. Consequently, the protein Prodigiosin synthetase was proposed as a potential drug target. This protein is involved in a network with 198 other proteins; therefore, targeting it may help eradicate the pathogen from the host. Moreover, molecular docking against the identified drug target singled out molecules such as DB00423 and DB00775 as those with the most favorable binding affinities for the active site residues of the target. This study may enable future researchers to develop efficacious drugs and vaccines against specific strains of S. marcescens.
Supporting information
S1 Appendix. Script for subtractive genomics.
https://doi.org/10.1371/journal.pone.0283993.s001
(DOCX)
S3 Fig. InterProScan results.
(A) Accession numbers of the functional domains. (B) Names of the functional domains.
https://doi.org/10.1371/journal.pone.0283993.s004
(TIF)
S4 Fig. DoGSiteScorer results.
The ligand binding pocket in the protein, as predicted by DoGSiteScorer. A detailed description of the binding pocket and the amino acids found in the active site is provided.
https://doi.org/10.1371/journal.pone.0283993.s005
(TIF)
S1 Table. KEGG pathway IDs displayed in the heatmap.
https://doi.org/10.1371/journal.pone.0283993.s006
(DOCX)
S2 Table. STRING interaction network.
Table showing the detailed annotation of the direct interaction formed by the hub protein (PigC).
https://doi.org/10.1371/journal.pone.0283993.s007
(DOCX)
S3 Table. The table shows the conservancy analysis results of the prodigiosin synthetase (PigC) protein in other Serratia marcescens strains.
https://doi.org/10.1371/journal.pone.0283993.s008
(DOCX)
References
- 1. Mahlen SD. Serratia infections: from military experiments to current practice. Clin Microbiol Rev. 2011 Oct 01. pmid:21976608
- 2. Wu Y-M, Hsu P-C, Yang C-C, Chang H-J, Ye J-J, Huang C-T, et al. Serratia marcescens meningitis: epidemiology, prognostic factors and treatment outcomes. J Microbiol Immunol Infect. 2013 Aug 01. pmid:22926070
- 3. Sader HS, Castanheira M, Streit JM, Carvalhaes CG, Mendes RE. Frequency and antimicrobial susceptibility of bacteria causing bloodstream infections in pediatric patients from United States (US) medical centers (2014–2018): Therapeutic options for multidrug-resistant bacteria. Diagn Microbiol Infect Dis. 2020 Oct 01. pmid:32640386
- 4. Voelz A, Müller A, Gillen J, Le C, Dresbach T, Engelhart S, et al. Outbreaks of Serratia marcescens in neonatal and pediatric intensive care units: clinical aspects, risk factors and management. Int J Hyg Environ Health. 2010 Mar 01. pmid:19783209
- 5. Kurz CL, Chauvet S, Andrès E, Aurouze M, Vallet I, Michel GP, et al. Virulence factors of the human opportunistic pathogen Serratia marcescens identified by in vivo screening. EMBO J. 2003 Apr 01. pmid:12660152
- 6. Shanks RM, Stella NA, Hunt KM, Brothers KM, Zhang L, Thibodeau PH. Identification of SlpB, a cytotoxic protease from Serratia marcescens. Infect Immun. 2015 Jun 15. pmid:25939509
- 7. Shimuta K, Ohnishi M, Iyoda S, Gotoh N, Koizumi N, Watanabe H. The hemolytic and cytolytic activities of Serratia marcescens phospholipase A (PhlA) depend on lysophospholipid production by PhlA. BMC Microbiol. 2009 Dec 16. pmid:20003541
- 8. Saralegui C, Ponce-Alonso M, Pérez-Viso B, Moles Alegre L, Escribano E, Lázaro-Perona F, et al. Genomics of Serratia marcescens isolates causing outbreaks in the same pediatric unit 47 years apart: position in an updated phylogeny of the species. Front Microbiol. 2020 Mar 31. pmid:32296400
- 9. Ferreira RL, Rezende GS, Damas MSF, Oliveira-Silva M, Pitondo-Silva A, Brito MC, et al. Characterization of KPC-producing Serratia marcescens in an intensive care unit of a Brazilian tertiary hospital. Front Microbiol. 2020 May 20. pmid:32670210
- 10. Cristina ML, Sartini M, Spagnolo AM. Serratia marcescens infections in neonatal intensive care units (NICUs). Int J Environ Res Public Health. 2019 Feb 20. pmid:30791509
- 11. Čepl J, Blahůšková A, Cvrčková F, Markoš A. Ammonia produced by bacterial colonies promotes growth of ampicillin-sensitive Serratia sp. by means of antibiotic inactivation. FEMS Microbiol Lett. 2014 May 01. pmid:24716667
- 12. Leclercq R, Cantón R, Brown DF, Giske CG, Heisig P, MacGowan AP, et al. EUCAST expert rules in antimicrobial susceptibility testing. Clin Microbiol Infect. 2013 Feb 01. pmid:22117544
- 13. WHO. WHO publishes list of bacteria for which new antibiotics are urgently needed. World Health Organization. 2017 Feb 27 [cited 2023 Jan 07]. https://www.who.int/news/item/27-02-2017-who-publishes-list-of-bacteria-for-which-new-antibiotics-are-urgently-needed.
- 14. Hou J, Mao D, Zhang Y, Huang R, Li L, Wang X, et al. Long-term spatiotemporal variation of antimicrobial resistance genes within the Serratia marcescens population and transmission of S. marcescens revealed by public whole-genome datasets. J Hazard Mater. 2022 Feb 05. pmid:34844350
- 15. Lee C-R, Cho IH, Jeong BC, Lee SH. Strategies to minimize antibiotic resistance. Int J Environ Res Public Health. 2013 Sep 12. pmid:24036486
- 16. Trubiano JA, Padiglione AA. Nosocomial infections in the intensive care unit. Anaesth Intensive Care Med. 2015 Dec 01.
- 17. Dhusia K, Raja K, Thomas PPM, Yadav PK, Ramteke PW. Molecular dynamics simulation analysis of conessine against multi drug resistant Serratia marcescens. Infect Genet Evol. 2019 Jan 01. pmid:30396000
- 18. Tacconelli E, Carrara E, Savoldi A, Harbarth S, Mendelson M, Monnet DL, et al. Discovery, research, and development of new antibiotics: the WHO priority list of antibiotic-resistant bacteria and tuberculosis. Lancet Infect Dis. 2018 Mar 01. pmid:29276051
- 19. Barh D, Tiwari S, Jain N, Ali A, Santos AR, Misra AN, et al. In silico subtractive genomics for target identification in human bacterial pathogens. Drug Dev Res. 2011 Mar 16.
- 20. Shanmugham B, Pan A. Identification and characterization of potential therapeutic candidates in emerging human pathogen Mycobacterium abscessus: a novel hierarchical in silico approach. PLoS One. 2013 Mar 19. pmid:23527108
- 21. Jenuth JP. NCBI. In: Misener S, Krawetz SA, editors. Bioinformatics methods and protocols. Humana Press: Totowa; 2000.
- 22. Kanehisa M, Furumichi M, Tanabe M, Sato Y, Morishima K. KEGG: new perspectives on genomes, pathways, diseases and drugs. Nucleic Acids Res. 2017 Jan 04. pmid:27899662
- 23. Uddin R, Khalil W. A comparative proteomic approach using metabolic pathways for the identification of potential drug targets against Helicobacter pylori. Genes Genomics. 2020 Mar 19. pmid:32193857
- 24. Camacho C, Coulouris G, Avagyan V, Ma N, Papadopoulos J, Bealer K, et al. BLAST+: architecture and applications. BMC Bioinform. 2009 Dec 15. pmid:20003500
- 25. Wishart DS, Knox C, Guo AC, Shrivastava S, Hassanali M, Stothard P, et al. DrugBank: a comprehensive resource for in silico drug discovery and exploration. Nucleic Acids Res. 2006 Jan 01. pmid:16381955
- 26. Luo H, Lin Y, Gao F, Zhang C-T, Zhang R. DEG 10, an update of the database of essential genes that includes both protein-coding genes and noncoding genomic elements. Nucleic Acids Res. 2014 Jan 01. pmid:24243843
- 27. Yu NY, Wagner JR, Laird MR, Melli G, Rey S, Lo R, et al. PSORTb 3.0: improved protein subcellular localization prediction with refined localization subcategories and predictive capabilities for all prokaryotes. Bioinformatics. 2010 Jul 01. pmid:20472543
- 28. Jones P, Binns D, Chang H-Y, Fraser M, Li W, McAnulla C, et al. InterProScan 5: genome-scale protein function classification. Bioinformatics. 2014 May 01. pmid:24451626
- 29. Garg VK, Avashthi H, Tiwari A, Jain PA, Ramkete PW, Kayastha AM, et al. MFPPI–multi FASTA ProtParam interface. Bioinformation. 2016 Apr 10. pmid:28104964
- 30. Szklarczyk D, Gable AL, Lyon D, Junge A, Wyder S, Huerta-Cepas J, et al. STRING v11: protein–protein association networks with increased coverage, supporting functional discovery in genome-wide experimental datasets. Nucleic Acids Res. 2019 Jan 08. pmid:30476243
- 31. Johnson M, Zaretskaya I, Raytselis Y, Merezhuk Y, McGinnis S, Madden TL. NCBI BLAST: a better web interface. Nucleic Acids Res. 2008 Jul 01. pmid:18440982
- 32. Kim DE, Chivian D, Baker D. Protein structure prediction and analysis using the Robetta server. Nucleic Acids Res. 2004 Jul 01. pmid:15215442
- 33. Laskowski RA, MacArthur MW, Moss DS, Thornton JM. PROCHECK: a program to check the stereochemical quality of protein structures. J Appl Crystallogr. 1993 Apr 01.
- 34. Eisenberg D, Lüthy R, Bowie J. VERIFY3D: assessment of protein models with three-dimensional profiles. Methods Enzymol. 1997 Jan 01. pmid:9379925
- 35. McGuffin LJ, Bryson K, Jones DT. The PSIPRED protein structure prediction server. Bioinformatics. 2000 Apr 01. pmid:10869041
- 36. Volkamer A, Kuhn D, Rippmann F, Rarey M. DoGSiteScorer: a web server for automatic binding site prediction, analysis and druggability assessment. Bioinformatics. 2012 Aug 01. pmid:22628523
- 37. Kim S, Thiessen PA, Bolton EE, Chen J, Fu G, Gindulyte A, et al. PubChem substance and compound databases. Nucleic Acids Res. 2016 Jan 04. pmid:26400175
- 38. Morris GM, Huey R, Lindstrom W, Sanner MF, Belew RK, Goodsell DS, et al. AutoDock4 and AutoDockTools4: Automated docking with selective receptor flexibility. J Comput Chem. 2009 Apr 27. pmid:19399780
- 39. Pettersen EF, Goddard TD, Huang CC, Couch GS, Greenblatt DM, Meng EC, et al. UCSF Chimera—a visualization system for exploratory research and analysis. J Comput Chem. 2004 Jul 01. pmid:15264254
- 40. Laskowski RA, Swindells MB. LigPlot+: multiple ligand–protein interaction diagrams for drug discovery. J Chem Inf Model. 2011 Oct 24. pmid:21919503
- 41. Kar S, Leszczynski J. Open access in silico tools to predict the ADMET profiling of drug candidates. Expert Opin Drug Discov. 2020 Jul 31. pmid:32735147
- 42. Consortium U. UniProt: a worldwide hub of protein knowledge. Nucleic Acids Res. 2019 Jan 08. pmid:30395287
- 43. Choi SY, Lim S, Yoon K-h, Lee JI, Mitchell RJ. Biotechnological activities and applications of bacterial pigments violacein and prodigiosin. J Biol Eng. 2021 Mar 11. pmid:33706806
- 44. Pourhajibagher M, Bahador A. Designing and in silico analysis of PorB protein from Chlamydia trachomatis for developing a vaccine candidate. Drug Res. 2016 Sep 01. pmid:27409330
- 45. Madeira F, Park YM, Lee J, Buso N, Gur T, Madhusoodanan N, et al. The EMBL-EBI search and sequence analysis tools APIs in 2019. Nucleic Acids Res. 2019 Jul 02. pmid:30976793
- 46. Mahlen SD, Morrow SS, Abdalhamid B, Hanson ND. Analyses of ampC gene expression in Serratia marcescens reveal new regulatory properties. J Antimicrob Chemother. 2003 Apr 01. pmid:12654751
- 47. Uddin R, Saeed K. Identification and characterization of potential drug targets by subtractive genome analyses of methicillin resistant Staphylococcus aureus. Comput Biol Chem. 2014 Feb 01. pmid:24361957
- 48. Naorem RS, Pangabam BD, Bora SS, Goswami G, Barooah M, Hazarika DJ, et al. Identification of Putative Vaccine and Drug Targets against the Methicillin-Resistant Staphylococcus aureus by Reverse Vaccinology and Subtractive Genomics Approaches. Molecules. 2022 Mar 24. pmid:35408485
- 49. Kaur H, Kalia M, Taneja N. Identification of novel non-homologous drug targets against Acinetobacter baumannii using subtractive genomics and comparative metabolic pathway analysis. Microb Pathog. 2021 Mar 01. pmid:33166618
- 50. Klein AS, Domröse A, Bongen P, Brass HU, Classen T, Loeschcke A, et al. New prodigiosin derivatives obtained by mutasynthesis in Pseudomonas putida. ACS Synth Biol. 2017 Sep 15. pmid:28505410
- 51. Giri AV, Anandkumar N, Muthukumaran G, Pennathur G. A novel medium for the enhanced cell growth and production of prodigiosin from Serratia marcescens isolated from soil. BMC Microbiol. 2004 Mar 18. pmid:15113456
- 52. Tan D, Fu L, Sun X, Xu L, Zhang J. Genetic Analysis and Immunoelectron Microscopy of Wild and Mutant Strains of the Rubber Tree Endophytic Bacterium Serratia marcescens Strain ITBB B5–1 Reveal Key Roles of a Macrovesicle in Storage and Secretion of Prodigiosin. J Agric Food Chem. 2020 Mar 31. pmid:32227934
- 53. Basharat Z, Khan K, Jalal K, Ahmad D, Hayat A, Alotaibi G, et al. An in silico hierarchal approach for drug candidate mining and validation of natural product inhibitors against pyrimidine biosynthesis enzyme in the antibiotic-resistant Shigella flexneri. Infect Genet Evol. 2022 Mar 01. pmid:35104682
- 54. Haynes C, Oldfield CJ, Ji F, Klitgord N, Cusick ME, Radivojac P, et al. Intrinsic disorder is a common feature of hub proteins from four eukaryotic interactomes. PLoS Comput Biol. 2006 Aug 04. pmid:16884331
- 55. Metallo SJ. Intrinsically disordered proteins are potential drug targets. Curr Opin Chem Biol. 2010 Jul 02. pmid:20598937
- 56. Khan MT, Mahmud A, Iqbal A, Hoque SF, Hasan M. Subtractive genomics approach towards the identification of novel therapeutic targets against human Bartonella bacilliformis. Inform Med Unlocked. 2020 Jan 01.