Microstructure and in-depth proteomic analysis of Perna viridis shell

To understand the structural characteristics and the proteome of the Perna shell, the microstructure, polymorph, and protein composition of the adult Perna viridis shell were investigated. The P. viridis shell has two distinct mineral layers, myostracum and nacre, both composed of the aragonite polymorph of calcium carbonate, as determined by scanning electron microscopy, Fourier transform infrared spectroscopy, and X-ray diffraction. Using Illumina sequencing, the mantle transcriptome of P. viridis was investigated and a total of 69,859 unigenes were generated. Using a combined proteomic/transcriptomic approach, a total of 378 shell proteins were identified from the P. viridis shell, of which 132 were identified with more than two matched unique peptides. Of the 132 shell proteins, 69 are exclusive to the nacre, 12 to the myostracum, and 51 are shared by both. The Myosin-tail domain-containing proteins, Filament-like proteins, and Chitin-binding domain-containing proteins represent the most abundant molecules. In addition, shell matrix proteins (SMPs) containing biomineralization-related domains, such as Kunitz, A2M, WAP, EF-hand, PDZ, VWA, and Collagen domains, and low-complexity regions enriched in certain amino acids, were also identified from the P. viridis shell. Collagenase and chitinase digestion significantly changed the morphology of the shell, indicating important roles for collagen and chitin in shell formation and muscle-shell attachment. Our results present the proteome of the P. viridis shell for the first time and increase the knowledge of SMPs in this genus.


Introduction
The bivalve shell, consisting of calcium carbonate crystals within an organic matrix, has been investigated as a typical biomineralization model for many years [1][2][3][4]. The bivalve shell presents superior mechanical properties, such as stiffness, fracture toughness, and tensile strength, because of the complex architecture and involvement of biological macromolecules [5][6][7]. It is well known that there are three major polymorphs of calcium carbonate: calcite, aragonite, and vaterite. Of these, the two most thermodynamically stable structures, calcite and aragonite, are deposited extensively as biominerals [8]. In general, most adult bivalve shells are composed of calcite and/or aragonite and consist of various microstructures (or layers) with different
The latter was subdivided into AMS-A (located at the anterior side of the AMS) that corresponded to the whitish area that composed the majority of the inner shell surface, and AMS-P (located at the posterior side of the AMS) that was located on the margin outside of the pallial line (Fig 1A). Using a knife, the shell was finely separated along the cutting line, as denoted in Fig 1A. Fresh fractured shell samples were cleaned using MilliQ water. The attached adductor muscle was removed by soaking the shell sample in 5% NaOH solution, lightly scrubbing it, and drying at room temperature. The deproteinization process was carried out by soaking the samples in 20% NaOH solution at 65˚C for 1 h. The samples were then rinsed in MilliQ water and freeze-dried before use.
Shell samples (polished by abrasive paper) were treated with enzymes (collagenase and chitinase) to analyse the composition and distribution of organic materials in the shell. For collagenase digestion, two collagenases, type-I (No. c0130, Sigma-Aldrich) and type-II (No. c6885, Sigma-Aldrich), were used. The shell samples were submerged in the buffer (50 mM Tricine with 10 mM calcium chloride and 400 mM sodium chloride, pH 7.5) containing 1 mg/mL collagenase (type-I or type-II) for 24 h at 25˚C. For chitinase degradation, two chitinases were used including the chitinase from Streptomyces griseus (No. c6137, Sigma-Aldrich) and the Chitotriosidase (No. SAE0052, Sigma-Aldrich). The shell samples were submerged in 50 mM potassium phosphate buffer (pH 7.0) containing 1 mg/mL enzyme for 4 h at 37˚C. The control samples were prepared by soaking with the same buffer without enzymes.

SEM, XRD, and FTIR analysis
The prepared samples were sputter-coated with gold and examined with a TESCAN VEGA-3 SEM at an accelerating voltage of 10 kV. Shell layers were identified by the sharp contact between two types of shell microstructure and by the change in mineralogy. Layers were described according to their position within the shell.
Fractured samples from the AMS and AMS-A were collected, and the inner surfaces of these samples were analysed in situ by XRD using a RIGAKU Ultima IV XRD system operated at 40 kV and 44 mA. The scanning speed was 2 theta/min. Powdered samples were collected using a scalpel from the inner surface of AMS-A and AMS, corresponding to the nacre and the myostracum, respectively. The infrared spectra of the powdered samples were recorded at a resolution of 4 cm⁻¹ on a Nicolet iS50 Fourier transform infrared (FTIR) spectrometer (Thermo Scientific). The system was purged with dry N2 to reduce interference from water vapor IR absorption, and the absence of a water contribution was verified by measuring KBr pellets.

RNA preparation and high-throughput sequencing of P. viridis mantle
Total RNA was extracted from the mantle tissues of P. viridis (n = 10) using TRIzol reagent (Invitrogen) according to the manufacturer's instructions. The integrity and purity of the extracted RNA were assessed using an Agilent 2100 Bioanalyzer (Agilent RNA 6000 Nano Kit) and agarose gel electrophoresis. Further, the RNA was quantified using a NanoDrop2000 spectrophotometer (NanoDrop Technologies Inc).
The mRNA was isolated and fragmented into pieces of ~200 nt. cDNA was generated using a TruSeq™ RNA sample prep kit (Illumina) and purified with AMPure XP beads (Agencourt). Then, the purified cDNA was end-repaired, ligated with an adapter, and enriched. After quantification and qualification, the cDNA library was amplified on cBot (TruSeq PE Cluster Kit v3-cBot-HS, Illumina) to generate the clusters on the flowcell, and the amplified flowcell was paired-end sequenced on a HiSeq 4000 System (TruSeq SBS Kit v3-HS, Illumina).

Unigene assembly and functional annotations
The sequence analysis and read assembly were performed by the BGI Company (www.bgitechsolutions.com) following the standard commercial pipeline. The reads were assembled into clusters using Trinity (trinityrnaseq-r2013-02-25) in three steps, Inchworm, Chrysalis, and Butterfly, an approach that has been used successfully for full-length transcriptome assembly of RNA sequencing data from species without a reference genome [23]. The contigs that resulted from the assembly of multiple sequences were referred to as transcripts. The transcripts were further divided into two classes, the unigene and the cluster (named with a prefix "CL"), the latter containing several contigs showing more than 70% similarity. These data were archived in the NCBI Sequence Read Archive (SRA) under the accession No. SRP144773. The transcripts were functionally annotated against the NT, NR, GO, KOG, KEGG, and SwissProt databases using the BLASTX algorithm with a cut-off E-value < 1e-05.
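The E-value cut-off described above can be illustrated with a minimal Python sketch. It is not the BGI pipeline: it assumes the hits are available as standard 12-column BLAST tabular output (E-value in the eleventh column), and the function name `filter_hits`, the subject identifiers, and the example rows are hypothetical.

```python
import csv
import io

E_VALUE_CUTOFF = 1e-5  # cut-off stated in the text


def filter_hits(tabular_text, cutoff=E_VALUE_CUTOFF):
    """Return the best hit per query among hits with E-value below the cutoff.

    Assumes 12-column BLAST tabular output: query id in column 1,
    subject id in column 2, E-value in column 11.
    """
    best = {}
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank and comment lines
        query, subject, evalue = row[0], row[1], float(row[10])
        if evalue >= cutoff:
            continue  # hit is not significant at the chosen cut-off
        if query not in best or evalue < best[query][1]:
            best[query] = (subject, evalue)
    return best


# Hypothetical rows, for illustration only (not data from the study).
EXAMPLE = "\n".join([
    "Unigene1\tsubjA\t85.0\t200\t30\t0\t1\t200\t1\t200\t1e-50\t300",
    "Unigene1\tsubjB\t60.0\t150\t60\t0\t1\t150\t1\t150\t1e-10\t120",
    "Unigene2\tsubjC\t40.0\t80\t48\t0\t1\t80\t1\t80\t0.5\t30",
    "CL1.Contig1\tsubjD\t70.0\t100\t30\t0\t1\t100\t1\t100\t2e-06\t90",
])
hits = filter_hits(EXAMPLE)
```

In this toy input, Unigene2 is dropped because its only hit has an E-value of 0.5, while Unigene1 keeps the more significant of its two hits.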

Proteomic analysis of P. viridis shell
The powdered shell samples from the nacre and myostracum layers of the P. viridis (n = 120) shell were collected and pooled and then suspended in cold acetic acid solution (5%, v/v) for 12 h at 4˚C with continuous stirring. The suspension was then centrifuged at 12,000 ×g and 4˚C for 15 min. The supernatant comprising the acid-soluble matrix was filtered (0.22 μm) and dialyzed (MWCO: 1 kDa) against MilliQ water before it was freeze-dried and weighed. The acid-insoluble matrix was rinsed 6 times with MilliQ water and further freeze-dried for use.
Prior to MS analysis, the protein samples of the P. viridis shells were reduced with 10 mM dithiothreitol (DTT) in 50 mM NH4HCO3 at 57˚C for 1 h and alkylated using 20 mM iodoacetamide (IAA) in 50 mM NH4HCO3 for 45 min at room temperature in the dark. The excess reagent was removed by dialyzing (MWCO: 1 kDa) against water overnight. After lyophilization, the sample was treated with trypsin for 12 h at 37˚C. The digests were then lyophilized and re-suspended in 0.1% formic acid and 2% acetonitrile for LC-MS/MS analysis. The LC-MS/MS analysis was performed using a Q-Exactive spectrometer (Thermo Fisher Scientific, San Jose, CA) interfaced with a LC-20AD HPLC system (Shimadzu, Japan). The trypsin-digested protein sample mixture was separated on a C18 column (75 μm × 150 mm, 3.6 μm). The HPLC gradient was 8-35% buffer B (98% ACN, 0.1% formic acid) in buffer A (2% ACN, 0.1% formic acid) at a flow rate of 300 nL/min over 37 min, followed by 35-60% buffer B over 5 min.
Isolated peptides from HPLC were submitted to an ion-trap mass spectrometer operated in data-dependent acquisition (DDA) mode under an ion spray voltage of 1.6 kV. The MS data were acquired automatically, following an MS survey scan over m/z 350-1600 at a resolution of 70,000 for full scans and 17,500 for MS/MS measurements. The MS/MS spectra were acquired sequentially with a dynamic exclusion duration of 15 s for the 20 most intense peptides with intensities greater than 10,000 and positive charges from 2+ to 7+. The raw MS/MS data were converted into MGF format for bioinformatics analysis by Proteome Discoverer (Thermo Fisher Scientific Inc., MA, USA).
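The precursor selection rule described above (top 20 by intensity, intensity above 10,000, charge 2+ to 7+, 15 s dynamic exclusion) can be sketched as follows. This is a simplified model for illustration, not the instrument firmware: the `Precursor` class, the function `select_precursors`, and the m/z matching tolerance `mz_tol` are assumptions introduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Precursor:
    mz: float
    intensity: float
    charge: int


def select_precursors(survey, now_s, excluded_until, top_n=20,
                      min_intensity=10_000, charges=range(2, 8),
                      exclusion_s=15.0, mz_tol=0.01):
    """Pick up to top_n precursors from one survey scan.

    excluded_until is a mutable list of (mz, expiry_time_s) pairs; it is
    updated in place so that chosen precursors are skipped for exclusion_s.
    mz_tol (in m/z units) is an assumed matching window, not from the paper.
    """
    def is_excluded(p):
        return any(abs(p.mz - mz) <= mz_tol and now_s < expiry
                   for mz, expiry in excluded_until)

    eligible = [p for p in survey
                if p.intensity > min_intensity
                and p.charge in charges
                and not is_excluded(p)]
    chosen = sorted(eligible, key=lambda p: p.intensity, reverse=True)[:top_n]
    excluded_until.extend((p.mz, now_s + exclusion_s) for p in chosen)
    return chosen


# Hypothetical survey scan, for illustration only.
scan = [
    Precursor(400.0, 5e4, 2),   # selected
    Precursor(500.0, 9e3, 2),   # below the intensity threshold
    Precursor(600.0, 2e4, 1),   # singly charged, skipped
    Precursor(700.0, 3e4, 3),   # selected
]
exclusion_list = []
first = select_precursors(scan, now_s=0.0, excluded_until=exclusion_list)
second = select_precursors(scan, now_s=5.0, excluded_until=exclusion_list)
third = select_precursors(scan, now_s=20.0, excluded_until=exclusion_list)
```

Within the 15 s window the same ions are not re-selected (`second` is empty), which lets less intense peptides be fragmented instead; after the window expires they become eligible again.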
Protein identification was performed using Mascot database-searching software (version 2.3, Matrix Science, London, UK) against the mantle transcriptome database of P. viridis generated by Illumina sequencing. Carbamidomethyl was set as a fixed modification. Oxidation, Gln->pyro-Glu, and Deamidated were set as variable modifications. The peptide mass tolerance was set to 20 ppm, and the fragment mass tolerance was set to 0.05 Da. False discovery rate (FDR) analysis was performed using a target-decoy search strategy, a methodology for distinguishing correct from incorrect peptide identifications [24], and an FDR<0.01 was required for protein identification.
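As a rough illustration of the two numerical criteria above, the following Python sketch computes a precursor mass error in ppm and finds a score threshold at which a simple target-decoy FDR estimate (decoy matches divided by target matches) stays at or below 0.01. It is a generic sketch and not Mascot's internal procedure; the function names and the toy score list are hypothetical.

```python
def ppm_error(observed_mz, theoretical_mz):
    """Signed mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def score_threshold_at_fdr(psms, max_fdr=0.01):
    """Return the lowest score cut-off whose estimated FDR is <= max_fdr.

    psms is a list of (score, is_decoy) pairs. The FDR at a cut-off is
    estimated as (decoys at or above the cut-off) / (targets at or above it).
    Returns None if no cut-off satisfies the limit.
    """
    targets = decoys = 0
    threshold = None
    for score, is_decoy in sorted(psms, key=lambda p: p[0], reverse=True):
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        if targets > 0 and decoys / targets <= max_fdr:
            threshold = score
    return threshold


# Toy data, for illustration only: 250 target matches followed by 3 decoys.
toy_psms = [(s, False) for s in range(1000, 750, -1)]
toy_psms += [(750, True), (749, True), (748, True)]
cutoff = score_threshold_at_fdr(toy_psms)
```

With the toy data, two decoys among 250 targets give an estimated FDR of 0.008, while admitting the third decoy would raise it to 0.012, so the cut-off settles at the score of the second decoy.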
For all of the identified proteins, functional annotations were performed using the Blast2GO programme against the databases of Gene Ontology (GO), Kyoto Encyclopedia of Genes and Genomes (KEGG), and Cluster of Orthologous Groups of proteins (COG). The homologous sequence searching of identified proteins was performed against the Non-redundant protein sequences (NR) at NCBI. The signal peptides were predicted using the SignalP 4.1 online tool (http://www.cbs.dtu.dk/services/SignalP/), and conserved domains were predicted using the SMART online tool at http://smart.embl-heidelberg.de/. The amino acid composition was analysed using the ProtParam tool at http://web.expasy.org/protparam/.

Results
The microstructure of the P. viridis shell
As shown in Fig 1A, on the inner surface of adult P. viridis shells, two distinct areas with different colours and texture can be observed optically, including the whitish area accounting for the majority of the inner surface (represented by AMS-A and AMS-P) and the AMS with muscle attached. When the adductor muscle was removed, the AMS of P. viridis presented a large, shiny area of light colour located towards the posterodorsal edge of each valve and merged with the scar of the protractor muscle and retractor muscle (Fig 1A). The SEM images of sections perpendicular to the shell surfaces of the AMS, AMS-A, and AMS-P are shown in Fig 1B-1D, respectively. We describe the shell microstructures using the terminology (such as nacre and myostracum) defined by Carter [10] and Luciana et al. [25], and the relative position of these layers within the shell (Fig 1). In this study, the two microstructures observed in the P. viridis shell, the nacre and the myostracum layers, have different thicknesses and morphologies. At the AMS-A, the myostracum layer is buried between two nacreous laths, the exterior nacre and the interior nacre layer. At the AMS, the exterior nacre layer disappeared and the myostracum layer, with a thickness of approximately 5 μm, is attached by the adductor muscles (Fig 1C). The myostracum layer is elongated from the shell umbo to the scar and finally disappears at the AMS-P (Fig 1D). At the outside of the shell, we found that the tablets of the nacre layer become thicker and a prismatic-like layer can be seen at the outermost layer of the shell (Fig 1E and 1F).
Fig 2 shows the SEM images of the inner shell surface. The surface of muscle-attached AMS and non-attachment areas were selected for analysis. As shown in Fig 2A, the muscle fibres are tightly attached to the AMS. A film with many nanoscale pit structures was observed on the AMS when the muscle was detached using 5% NaOH (Fig 2B).
These pit structures are also present at the transition areas of the AMS edge (Fig 2C). However, for the non-attachment area, represented by AMS-A in this study, a translucent film without pit structures was observed (Fig 2D), exhibiting a different appearance from the AMS area (Fig 2B). On the surface of the AMS-A, some distinct polygonal shapes, corresponding to the underlying tablets of the nacre layer, were seen under the film (Fig 2D). However, the film in the AMS area was too thick to easily distinguish the shape of the underlying calcium carbonate crystals (Fig 2B). After deproteinization with 20% NaOH, the film covering the AMS was removed, and mosaic tiles with some holes and cracks were observed (Fig 2E). For the AMS-A, the tablet surface was visible after the film was removed (Fig 2F).
Fig 3 shows the FTIR and XRD spectra of the two layers from the P. viridis shell. The nacre and the myostracum layers showed similar FTIR spectra with four characteristic bands (1082-1085 cm⁻¹, 854-857 cm⁻¹, 714 cm⁻¹, and 699 cm⁻¹) corresponding to the internal vibration modes of CO₃²⁻ ions (Fig 3A), which are characteristic of the aragonite structure. Meanwhile, an amide I feature (located at 1647-1650 cm⁻¹, denoted by the double star in Fig 3A) and a C=O feature (1784-1789 cm⁻¹, denoted by the single star in Fig 3A) were also detected in both nacre and myostracum layers, indicating the presence of a small amount of organic matrix in these two layers. The polymorphs of the nacre and the myostracum layer were further determined as aragonite according to the XRD profiles (Fig 3B).

Organic matrix distribution
To identify the distribution of the organic matrix in the P. viridis shell, shell samples were digested by collagenases and chitinases, respectively. Compared to the control samples (Fig 4A and 4B), the nacre layer of AMS-A was obviously etched by type-I collagenase digestion, resulting in many cracks and holes in the tablets (Fig 4C); etched cracks can also be seen in the myostracum layer of AMS-A (Fig 4D). For type-II collagenase digestion, compared with the control samples (Fig 4E and 4F), the nacre layer of AMS-A was slightly etched (Fig 4G), and at the AMS region, the integrity of the myostracum layer was compromised, as part of the myostracum layer detached from the surface of the AMS after collagenase digestion (Fig 4H).
To test the distribution of chitin in different layers of the P. viridis shell, AMS-A and AMS were treated with chitinase and chitotriosidase, respectively. Both chitinase and chitotriosidase had similar effects on the shell samples. Compared to the control samples (Fig 5A and 5B), the microstructure of AMS-A after chitinase digestion showed some long cracks at the interlamellar of the nacre layer (Fig 5C). For the AMS region, long cracks are present at the interior nacre layer and the myostracum-nacre interface (Fig 5D). For chitotriosidase digestion, similar results were obtained compared to the control samples (Fig 5E and 5F). Long cracks can be seen at the nacre layer of the AMS-A and AMS region after chitotriosidase digestion (Fig 5G and 5H).
For the surface of AMS and AMS-A samples after enzyme digestion, we noticed that, compared to the control samples (Figs 6A and 5B), the film covering the AMS-A region was digested by collagenase, and the underlying tablets of the nacre layer were presented individually (Fig 6C). In addition, nacreous tablets on the surface of AMS-A showed a terrace pattern (Fig 6E). Interestingly, the film covering the AMS region showed strong resistance to the two enzymes, and no significant morphological changes were observed in this study (Figs 6D and 5F).

Transcriptome sequencing, assembly, and annotation
The mantle is the main organ that secretes shell proteins. For P. viridis, the RNA from 10 mantle samples was sequenced to yield a total of 83.28 Mb raw reads using an Illumina Hiseq4000 system. After further quality filtration, a total of 59.52 Mb of clean reads were obtained and used for subsequent assembly and annotation. De novo assembly of the clean reads yielded a total of 69,859 unigenes with an average length of 690 nt and an N50 of 1,093 nt (S1 Fig). A total of 4,832 unigenes were assigned to GO categories (Fig 7). In the Biological process annotation, most genes were associated with the terms Cellular process, Metabolic process, and Single-organism process. The Cellular component annotation included genes associated with the terms Cell, Cell part, and Membrane. The Molecular function annotation was dominated by genes associated with Catalytic activity, Binding, and Transporter activity. A total of 17,972 unigenes were mapped to the KOG database, and the largest KOG classes were Signal transduction mechanisms (4,306), General function prediction only (4,231), Function unknown (2,373), Posttranslational modification, protein turnover, chaperones (2,229), Cytoskeleton (1,503), Transcription (1,400), and Intracellular trafficking, secretion, and vesicular transport (1,183) (Fig 8). According to the KEGG annotation, the matched 19,682 unigenes were divided into 6 categories, including Cellular Processes, Environmental Information Processing, Genetic Information Processing, Human Diseases, Metabolism, and Organismal Systems. In the KEGG classification, Signal transduction (3,082) was the largest class, followed by Global and overview maps (2,457), Cancers overview (2,035), Endocrine system (1,752), Transport and catabolism (1,500), and Cellular community (1,366) (Fig 9).
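For readers unfamiliar with the N50 statistic reported above, the following Python sketch shows how it is commonly computed: sequences are sorted from longest to shortest, and N50 is the length at which the running total first reaches half of the total assembled length. The function name `n50` and the example lengths are illustrative and are not taken from the assembly.

```python
def n50(lengths):
    """Length L such that sequences of length >= L cover at least half
    of the total assembled bases."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("at least one sequence length is required")
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Illustrative unigene lengths in nt (not data from the study).
example_lengths = [2, 2, 2, 3, 3, 4, 8, 8]
example_n50 = n50(example_lengths)
```

Because long sequences dominate the running total, N50 is typically larger than the mean length, consistent with the reported N50 of 1,093 nt against an average of 690 nt.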

Identification of the transcripts encoding biomineralization-related proteins
To obtain an integrated view of the transcriptional events of biomineralization-related processes in the P. viridis mantle, the unique transcript library was screened for reported SMPs. Taken together, 111 transcripts were identified with significant hits (E-value<1E-05) to 14 reported
Fig 4 (caption fragment). C: the SEM image of the AMS-A section after type-I collagenase digestion; the etched cracks and holes can be seen at the tablets of the nacre layer; D: the SEM image of the AMS section after type-I collagenase digestion; the etched areas at the myostracum layer are denoted by black arrows; E: the section image of AMS-A from the control sample; F: the section image of AMS from the control sample; G: the SEM image of the AMS-A section after type-II collagenase digestion; the etched areas at the nacre layer are circled with a white dashed line and the etched cracks of the myostracum layer are denoted by black arrows; H: the SEM image of the AMS section after type-II collagenase digestion; the etched areas at the nacre layer are denoted by black arrows. The bar is 5 μm.

Proteomic profile of the P. viridis shell
The 69,859 unigenes identified from the P. viridis mantle provided transcriptomic data for shell proteomic analysis. The shell proteins were divided into four fractions: the acid-soluble and acid-insoluble fractions from the myostracum layer, and the same two fractions from the nacreous layer. The number of the total spectra, the identified spectra, the identified peptides, and the identified proteins of the four samples from MS analysis are listed in Table 2. The unique spectrum number, unique peptide number, protein mass distribution, protein coverage, and the length of the matched peptide are shown in S4 and S5 Figs. The proteomic data of the P. viridis shell were submitted to http://www.iprox.org/ and http://proteomecentral.proteomexchange.org under the accession Nos. IPX0001273000 and PXD010566, respectively.
For the myostracum layer, 110 and 147 proteins with an FDR<0.01 were identified from the acid-soluble and -insoluble fractions, respectively (S1 and S2 Tables). For the nacre layer, 204 and 277 proteins with an FDR<0.01 were identified from the acid-soluble and -insoluble fractions, respectively (S3 and S4 Tables). Considering the overlap between the four sets, we identified a total of 378 proteins from the P. viridis shell. The nacre layer contributed 330 proteins, and the myostracum layer contributed 183 proteins. In addition, 195 proteins were found to be exclusive to the nacre layer, 48 proteins were exclusive to the myostracum layer, and 135 proteins were shared by both the nacre and myostracum layers. Of the 378 proteins identified from the P. viridis shell, 317 proteins had homologues in the NR database of NCBI, and 61 proteins showed no significant similarity. Out of the 317 significant matches found in the NR database, 98 proteins were from Crassostrea. Other matches were from Mizuhopecten (66 proteins), Mytilus (63 proteins), Lingula (8 proteins), Haliotis (4 proteins), and Perna (3 proteins) (S1-S4 Tables). In addition, 65 (~20%) identified SMPs from the P. viridis shell had predicted signal peptides with a peptide length of 16-27 amino acids. Protein homologous sequence searching was further performed against the NR databases at NCBI. The COG, GO, and KEGG databases were used to functionally classify the identified P. viridis SMPs.
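The layer counts above are mutually consistent, which can be checked with simple set arithmetic. The Python sketch below uses placeholder protein identifiers (hypothetical, chosen only so that the set sizes match the reported numbers) and an illustrative helper named `layer_overlap`.

```python
def layer_overlap(nacre_ids, myostracum_ids):
    """Count layer-exclusive, shared, and total proteins for two layers."""
    nacre, myostracum = set(nacre_ids), set(myostracum_ids)
    return {
        "nacre_only": len(nacre - myostracum),
        "myostracum_only": len(myostracum - nacre),
        "shared": len(nacre & myostracum),
        "total": len(nacre | myostracum),
    }


# Placeholder identifiers: 330 nacre proteins and 183 myostracum proteins,
# arranged so that 135 identifiers occur in both sets.
nacre_ids = [f"P{i}" for i in range(0, 330)]
myostracum_ids = [f"P{i}" for i in range(195, 378)]
counts = layer_overlap(nacre_ids, myostracum_ids)
```

With these placeholders the helper yields 195 nacre-exclusive, 48 myostracum-exclusive, and 135 shared proteins, which sum to the 378 proteins reported for the whole shell.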
For the myostracum layer, a total of 59 proteins were assigned to the COG annotation with different categories (Fig 10A). General function prediction only, Energy production and conversion, and Cell wall/membrane/envelope biogenesis were the categories shared by the majority of proteins from this layer (Fig 10A). In addition, 66 proteins from the myostracum layer were assigned to GO annotation (Fig 10B). Binding, Cell part/Cell, and Cellular process were the top hits for Molecular function, Cellular component, and Biological processes, respectively (Fig 10B).
For the nacre layer, a total of 196 proteins were assigned to COG annotation (Fig 11A). General function prediction only, Energy production and conversion, and Cell wall/membrane/envelope biogenesis were the categories shared by the majority of proteins from this layer (Fig 11A). For GO annotation, 135 proteins were assigned, and Binding, Cell part/Cell, and Cellular process were the top hits for Molecular function, Cellular component, and Biological processes, respectively (Fig 11B).
According to the KEGG annotation, 123 proteins from the myostracum layer were associated with 163 pathways. For the nacre layer, 241 proteins were associated with 217 KEGG pathways. The top 10 KEGG pathways of these samples are listed in Table 3. Metabolic pathways and Focal adhesion were the pathways with the most associated proteins in all samples.

Discussion
In the present study, we observed that the P. viridis shell consists of two distinct layers, the nacre and myostracum. As shown in Fig 1, the nacre layer, representing the predominant microstructure with parallel stacks of tablets, occupies the greatest area of the P. viridis shell. The myostracum layer with columnar structures is a thin layer embedded in the nacre layer (Fig 1B). In addition, both the nacre and myostracum of the P. viridis shell were determined to be aragonite according to the FTIR and XRD profiles (Fig 3). Mollusc shells present a layered structure that can be categorized into seven kinds of generally accepted structures, including columnar and sheet, prismatic, crossed-lamellar, foliated, homogeneous, and complex crossed-lamellar structures [10,26]. Most bivalves contain at least three different microstructures and both aragonite and calcite in the shell [27,28]. Thus, P. viridis has a simple shell structure compared with the shells from other bivalves, such as Mytilus [29], Crassostrea [30], clams [31], and pteriomorphs [32]. Therefore, the P. viridis shell is a good model for exploring the proteomic map layer by layer.

Microstructure of P. viridis shell at the AMS
For bivalves, the adductor muscle is the organ that controls the closing of the valves, and the tight junction between the muscle and shell is a good model of a "bi-material interface" that can be used to develop bio-inspired strategies for connecting dissimilar advanced materials. In the case of P. viridis, the adductor muscle is fastened strongly to the shell, and it is hard to separate the valves. Galtsoff reported that in Crassostrea virginica, the adhesion withstands a pulling force up to 10 kg [33], indicating the high strength, strong adhesion and toughness, and long fatigue lifetime of the adductor muscle-shell interface. In the AMS region of the P. viridis shell, the myostracum layer is exposed to the inner shell surface and connected directly with the adductor muscles (Fig 1C), suggesting the important roles of the myostracum layer in muscle-shell attachment as well as in the strength of the shell. The inner surface of the P. viridis shell is covered by a thin organic film. This film, also named the innermost shell lamella (ISL), has also been reported in other mollusc shells, including Mercenaria mercenaria, Rangia cuneata, and Pinctada fucata [34][35][36]. The ISL tightly adheres to the calcified shell layer and plays an intermediary role in shell formation [37,38]. Additionally, the film is believed to play a key role in muscle-shell attachment as an epithelium-film-shell junction for molluscs [12,15,16]. In the AMS region, pit structures were observed on the ISL after the adductor muscle was removed by 5% NaOH (Fig 2B), as well as on the calcified shell surface after deproteinization by 20% NaOH (Fig 2D), suggesting the presence of organic fibres in both the film and the underlying calcium carbonate crystals.
Similar results were also found in the shell of the scallop Patinopecten yessoensis, for which a model was proposed in which the adductor muscle is attached to the shell by the insertion of organic fibres and fibril bundles branching from the muscle into the pits on the myostracum [39]. In addition, the ultrastructure of muscle-shell attachment in gastropods revealed a similar conclusion [12]. The mineral phase of the P. viridis shell is intimately associated with biological matrices in both the myostracum and nacre layers. Different types of matrices have been observed in bivalve shells: the interlamellar matrix sheets separate adjacent crystal layers, and the intercrystalline matrix separates adjacent CaCO3 crystals within the shell layer [40]. Considering the presence of the ISL structure on the inner shell surface, the shell matrix in this study is represented not only by the interlamellar and intercrystalline matrix inside the shell but also by the matrix from the ISL covering the shell surface.

Shell proteome of P. viridis
In the present study, we characterized the mantle transcriptome of P. viridis using an Illumina system to identify putative biomineralization-related genes and provide a dataset for the subsequent P. viridis shell proteomic analysis. We obtained over 83 Mb of raw sequence data and were able to assemble these reads into 69,859 transcripts, 29,921 (42.83%) of which were annotated by BLAST searches against five public databases (S2 Fig). The size of transcripts from the P. viridis mantle transcriptome was lower than that of M. coruscus [20], Tectus pyramis [41], and Patinopecten yessoensis [42] but higher than Pecten maximus [43], P. maximus [44], C. gigas [44], and M. truncata [45]. Similar to other molluscs, the P. viridis mantle transcriptome was heavily dominated by muscle-related genes (Myosin, Actin, etc.), reflecting the contractile nature of this organ, and by mitochondrial respiratory chain-related genes (NADH dehydrogenase, cytochrome c, arginine kinase, etc.), demonstrating that the mantle is a metabolically and transcriptionally active tissue. In addition, 111 biomineralization-related transcripts were identified from the P. viridis mantle transcriptome (Table 1), representing a highly conserved core set of genes involved in shell formation. Using a combined proteomic/transcriptomic approach, we identified a total of 378 SMPs associated with the nacre and myostracum layers in the P. viridis shell. Of this protein set, the 132 SMPs identified with more than two matched unique peptides are listed (Table 4) and discussed here. We also observed that approximately 20% of identified SMPs presented a predicted signal peptide, suggesting that these proteins are secreted by the mantle tissue through a classical cellular secretion pathway. As shown in Table 4, of the 132 SMPs, 69 are exclusive to the nacre, 12 to the myostracum, and 51 SMPs are shared by both layers.
According to the protein scores and the number of matched peptides, the Myosin-tail domain-containing proteins, Filament-like proteins, Chitin-binding domain-containing proteins, FN3 (Fibronectin-3) domain-containing proteins, Peroxidase-like proteins, Calponin-homology (CH)/Calponin domain-containing proteins, and Filamin-like proteins represent the most abundant molecules in P. viridis shell (Table 4).

Most abundant SMPs in the P. viridis shell
Myosin-tail domain-containing proteins have been identified from many bivalve shells, such as Mytilus [19,20] and clam [46], and were previously considered cellular contaminants [47,48]. However, the myosin-tail domain-containing protein is also an actin-binding protein involved in mechanochemical cross-bridges [49], and two myosin-tail domains can form a coiled-coil structure and then assemble into the macromolecular thick filament [50]. In the present study, both actin and Filament-like proteins were identified confidently from the P. viridis shell. These results indicate the possibility that myosin-tail domain-containing proteins, together with actin and filament-like proteins, may be associated with biomineralization, as suggested by Jackson et al. [51]. In addition, filament-like proteins have also been identified from the shells of Mytilus [19,20], snails [52], and Magellania venosa [51], suggesting a conserved distribution of this protein in mollusc shells. Chitin-binding domain-containing proteins, also representing a conserved protein group from the shell matrix, have been identified from Crassostrea [53,54], Mytilus [19,20], Pinctada [55], and scallops [55]. The main role of the Chitin-binding domain in the shell is to interact with chitin, which is a key component of the mollusc shell [56]. FN3 domain-containing proteins have been reported as calcite-specific (prism layer) shell proteins that are absent from aragonite layers, such as the shell of M. truncata, which consists entirely of aragonite [57], and the nacre layer of Unionoida [58]. However, FN3 domain-containing proteins were identified from the aragonite shell layers of Mytilus (nacre and myostracum layer) [19,20] and Lottia (cross-lamellar layer) [59]. These results indicate that FN3 domain-containing proteins may play different roles in different mollusc shell layers rather than being calcite-specific shell proteins.
Peroxidase-like proteins have been identified from the shells of Mytilus [19,20], Lottia [59,60], Crassostrea [54], and Pinctada [55]. As a redox enzyme, the putative function of Peroxidase-like proteins in shell biomineralization remains enigmatic. CH/calponin domain-containing proteins have been identified from the shells of Mytilus [19,20] and Crassostrea [54]. The potential actin-binding function of the calponin domain [61,62] confers these proteins with possible roles in calponin-actin interaction, which may be involved in shell biomineralization or myostracum-muscle attachment. Filamin-like proteins have also been identified from the shells of Mytilus [19,20] and Crassostrea [54], although the real function of filamin in biomineralization is still unknown. Recently, a filamin-like protein of P. fucata (pf-filamin-A) was reported as a calcium-sensing protein related to the synthesis and transportation of calcium-containing nanoparticles on the surface of the nacreous layer [63], indicating an important role for filamin in biomineralization. The presence of these abundant matrix proteins in both the nacre and myostracum layers of the P. viridis shell suggests an important function in shell formation and demonstrates the conservation of these proteins in various mollusc shells, especially bivalve shells.

SMPs with biomineralization-related domains in the P. viridis shell
In addition to the highly abundant SMPs, many P. viridis SMPs were found to contain reported biomineralization-related domains, such as Kunitz, A2M, WAP, EF-hand, PDZ, VWA, and Collagen domains, as well as low-complexity regions enriched in certain amino acids, such as R-rich, G-rich, S-rich, and T-rich regions [64][65][66]. These proteins can be broadly categorized into six groups: enzymes, structural proteins, immune-related proteins, low-complexity-region-containing proteins, other proteins, and uncharacterized proteins. Various enzymes have been identified from many mollusc shells, and some of them, such as tyrosinase, chitinase, SOD, and arginine kinase, are believed to play roles in biomineralization. For example, tyrosinase has possible functions in periostracum tanning [67], biomineral hydrogel maturation and hardening [68], and shell damage repair [69]. Chitinase is secreted outside of the cell to reconstruct the chitin network, suggesting a function in shell chitin metabolism during shell formation and growth. Chitinase has been identified from other bivalve shells and shows high conservation across the Metazoa [55]. In bivalves, chitin can form a sheet or layered structure sandwiched between two layers of protein, yielding a protein/polysaccharide network that enables complex large-scale composite shells to be fabricated [68,70,71]. Chitin also contributes to the mechanical strength and functionality of the resulting material [72]. In the case of the P. viridis shell, the distribution of the chitinase-sensitive matrix (or chitin-like material) at the nacre layer from AMS-A showed a sheet-like pattern (Fig 5G). However, this laminar structure was thinner than that in the nacre layer from the AMS region (Fig 5H).
Considering that the AMS is the most important stress distribution site on the shell, the chitin-like material, which is mainly present at the interface of the myostracum and the nacre layer at the AMS, may act as a superglue that joins these layers into a single unit able to bear the stress from the adductor muscle.
Structural proteins, such as actins, tubulins, paramyosin/myosin, and collagens, have been identified from the shells of many mollusc species [19,20,54,55]. The presence of these structural proteins in shells is much debated, and the actins, tubulins, and paramyosins/myosins identified from the shell matrix have previously been described as contaminants [73]. However, we cannot exclude possible roles for these structural proteins in shell formation because of the wide distribution and high abundance of these cytoskeletal proteins in many mollusc shells, even those that have been washed thoroughly with hypochlorite solution [51] or sodium hydroxide [55]. In addition, actin has been suggested to be an organic remnant from the migration of secreted biomineral components from outer mantle epithelial cells to the shell [51,74]. Furthermore, many shell matrix proteins with actin-binding domains, such as PDZ, THY, and calponin, have been identified from Mytilus [19,20], indicating that a protein interaction network mediated by actin may exist in the mollusc shell. Collagens have been identified from many mollusc shells, including Mytilus [19,20], Hyriopsis cumingii [75], and Pinctada fucata [76]. A collagenous matrix has been suggested to be involved in the deposition of calcium phosphate in hard tissues, such as bone, dentin, and cementum [77,78]. In these tissues, minerals exist in the basic organic frameworks, which are formed by collagen fibrils [79,80]. These results indicate the important roles of collagen in biomineralization. In addition, the myostracum layer in the AMS area is susceptible to collagenase digestion, indicating the presence of a collagen-like matrix in the interface between the myostracum and the nacre layer (Fig 5). The specific distribution of the collagenase-sensitive matrix in the AMS area suggests that collagen might play an important role in muscle-shell attachment (at the myostracum-nacre interface, for example).
The current results are in accordance with the observation of Galtsoff [33] in the study of muscle-shell attachment of C. virginica, where collagenase significantly reduced the adherence of the muscle to the shell.

Immune-related proteins in the P. viridis shell
The presence of immune-related proteins within the shell organic matrix is not new in molluscs because immune-related proteins have been broadly identified from the shells of Bivalvia [19,20,54,55], Gastropoda [51,59], and Brachiopoda [51,81]. Recently, a significant number of SMPs containing immune-related domains were identified from the shells of four highly divergent bivalves, including the Pacific oyster (C. gigas), the blue mussel (M. edulis), the clam (M. truncata), and the king scallop (P. maximus), indicating the important roles of these SMPs in defence against pathogens [82]. Of this protein group, the protease-inhibitor-like proteins (PILPs) are important not only for assisting the immune system against microbial invasion [83] but also as a part of the shell matrix framework [84]. In this study, abundant PILPs containing conserved proteinase inhibitor domains, such as WAP, KU, A2M, Serpin, and C345C, were identified in both the nacre and myostracum layers (Table 3). It has been suggested that PILPs are involved in biomineralization processes as shell matrix protection systems against proteolysis during shell formation and thus protect shell structures from proteases secreted by fouling organisms and predators [85,86]. Another immune-related protein, mucin (or mucin-like protein), was also reported as a biomineralization-related protein [87,88] and was first identified from the fan mussel as a matrix protein associated with the nacre layer [89,90]. In this study, two proteins (Unigene51584 and Unigene3051, as shown in Table 3) with homologies to the mucin family were identified from the P. viridis shell, confirming the existence of mucin-like proteins in the mollusc shell. However, the real function of mucin-like proteins in biomineralization is still unknown, although they have been suggested to be one of the constituents of a gel-like matrix that can be pushed aside when the nacre tablet grows laterally [68].
Unlike Mucoperlin (the mucin-like protein from the Pinna nobilis shell), one of the mucin-like proteins (Unigene51584) from the P. viridis shell contains a chitin-binding domain (ChtBD2) in its sequence (Table 3), indicating a possible function of this protein in the interaction with chitin.

SMPs with LCRCPs in the P. viridis shell
In our study, Gly-, Arg-, Ser-, and Thr-rich shell proteins were also identified from P. viridis. Most of these proteins lacked significant homologs and known domains, but contained low-complexity regions in their sequences. Low-complexity region-containing proteins (LCRCPs), characterized by a high abundance of certain amino acids (such as Glu, Asp, Gly, Ala, Met, Arg, Thr, and Ser) in their sequence, have been reported previously from many mollusc shells [19,20,54,55,91]. Some of these proteins, identified in previous works, are well-known shell matrix proteins, such as MSI60 and carbonic anhydrase (CA): MSI60 is a Gly- and Ala-rich protein [92,93], and CA is a ubiquitous enzyme with a high abundance of Gly, Asn, and Asp in mollusc shells [94]. Low-complexity regions have been proposed to confer on proteins the enhanced flexibility needed for transport across the cell membrane and the intrinsic plasticity that allows a single protein to recognize several biological targets without sacrificing its specificity [95], which may aid in the transportation of shell proteins via mantle epithelial cell membranes and in the interaction among the shell proteins. In addition, the low-complexity region of LCRCPs has been proposed to adopt a specific structure when bound to the calcium carbonate crystal surface [96]. In parallel, it has been proposed that the numerous Gly and Ala residues of some shell proteins, such as silk-like fibroins, are implicated in the tensile mechanical properties of these proteins [97].
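
As a rough illustration of what "a high abundance of certain amino acids" means at the sequence level, the following minimal Python sketch computes residue fractions and flags stretches enriched in a single residue. It is not the method used in this study: dedicated low-complexity predictors rely on more elaborate statistics, and the function names, the window size, the threshold, and the toy sequence below are all assumptions chosen for illustration only.

```python
from collections import Counter


def residue_fractions(seq):
    """Return the fraction of each amino acid in a protein sequence."""
    seq = seq.upper()
    if not seq:
        return {}
    counts = Counter(seq)
    return {aa: n / len(seq) for aa, n in counts.items()}


def enriched_spans(seq, residue, window=10, threshold=0.8):
    """Return merged (start, end) spans, 0-based and end-exclusive, in which
    sliding windows contain `residue` at a fraction >= `threshold`.

    `window` and `threshold` are arbitrary illustrative values, not
    parameters taken from the paper.
    """
    seq = seq.upper()
    spans = []
    for start in range(len(seq) - window + 1):
        frac = seq[start:start + window].count(residue) / window
        if frac >= threshold:
            end = start + window
            if spans and start <= spans[-1][1]:
                # Overlapping or adjacent window: extend the previous span.
                spans[-1] = (spans[-1][0], end)
            else:
                spans.append((start, end))
    return spans


# Hypothetical toy sequence (not a P. viridis protein): a Gly-rich core
# flanked by two short mixed-composition segments.
toy = "MKTLLVAACD" + "G" * 20 + "DEKLRTPQWN"
gly_fraction = residue_fractions(toy)["G"]
gly_spans = enriched_spans(toy, "G")
```

With these settings, the toy sequence has an overall Gly fraction of 0.5, and the flagged span extends slightly beyond the Gly core because windows that are only 80% Gly still pass the threshold.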

Comparison of the SMPs from P. viridis with other bivalves
Irrespective of shell morphology and microstructure, we found that many SMPs identified from the P. viridis shell are shared with other bivalves. Table 5 summarizes the protein features of identified SMPs from P. viridis and from other mollusc models, including Unionoids [58], Pinctada [55,76], Ostreidae [54], Myoida [57], and Mytilus [19,20]. As shown in this table, some SMPs, such as S-rich, G-rich, peroxidase, VWA-domain-containing, Glyco-hydro-18, and Lam-G proteins, are shared by the shells of various bivalves. Thus, we suggest that these SMPs are evolutionarily conserved and represent part of the "basic tool kit" for the construction of the CaCO3 molluscan exoskeleton, as previously described by Marie et al. [58]. Taken together, these results indicate that a conserved molecular machinery exists for shell biomineralization in bivalves. However, this assumption needs further validation via functional analyses.

Conclusions
In this paper, the microstructure and mineral polymorphs of the P. viridis shell were studied, and two shell layers, the nacre and the myostracum, both composed of the aragonite polymorph, were characterized by SEM, FTIR, and XRD. We also tested the microstructural changes of the shell samples after chitinase and collagenase digestion. In the AMS region, collagenase digestion reduced the adhesion between the myostracum layer and the nacre layer, suggesting an important role for collagens at the myostracum-nacre interface. Chitinase digestion resulted in long cracks between the lamellae of the nacre layer and at the myostracum/nacre interface, indicating the important role of chitin in shell formation. Using Illumina sequencing, the mantle transcriptome of P. viridis was investigated, and a total of 69,859 unigenes were generated, of which approximately 43% were assigned putative functions. The most highly expressed genes were those with dominant biological functions of contraction and energy production. Using a combined proteomic/transcriptomic approach, we identified a total of 378 SMPs from the P. viridis shell, of which 132 SMPs were identified with more than two matched unique peptides. Of the 132 SMPs, 69 were exclusive to the nacre, 12 to the myostracum, and 51 were shared by both. The Myosin-tail domain-containing proteins, Filament-like proteins, and Chitin-binding domain-containing proteins represented the most abundant molecules. In addition, SMPs containing biomineralization-related domains, such as Kunitz, A2M, WAP, EF-hand, PDZ, VWA, and Collagen domains, and LCRCPs enriched in certain amino acids, were also identified from the P. viridis shell. These proteins can be broadly categorized into six groups, and the domains they contain, such as immune-related domains, suggest roles in biological functions beyond biomineralization. Our results present for the first time the proteome of the P. viridis shell and increase the knowledge of SMPs in this genus.

For the acid-insoluble sample, the unique peptide number, the length of matched peptide, the protein coverage, the protein mass distribution, and the unique spectrum number are shown in A~E, respectively. For the acid-soluble sample, the results are shown in F~J, respectively. (TIF) S1