The authors have declared that no competing interests exist.
Chlorophyll degradation is an intricate process that is critical in a variety of plant tissues at different times during the plant life cycle. Many of the photoactive chlorophyll degradation intermediates are exceptionally cytotoxic, so the pathway must be carefully coordinated and regulated. The primary regulatory step in the chlorophyll degradation pathway involves the enzyme pheophorbide a oxygenase (PAO), which oxidizes the chlorophyll intermediate pheophorbide a, which is eventually converted to non-fluorescent chlorophyll catabolites. There is evidence that PAO is differentially regulated across environmental and developmental conditions, with both transcriptional and post-transcriptional components, but the regulatory elements involved are uncertain or unknown. We hypothesized that transcription factors modulate PAO expression across different environmental conditions, such as cold and drought, as well as during developmental transitions to leaf senescence and maturation of green seeds. To test these hypotheses, several sets of
Chlorophyll is a central molecule in plants, essential to photosynthesis because it absorbs light and transfers excitation energy, a portion of which is ultimately captured in plant biomass. Chlorophyll synthesis and breakdown are two metabolically important processes of higher plants that can have significant economic consequences for crop agriculture. While the intricate chlorophyll biosynthetic pathway is well understood, basic knowledge about the chlorophyll degradation machinery and its regulation remains incomplete. The chlorophyll degradation pathway consists of several enzymes that convert chlorophyll to non-fluorescent chlorophyll catabolites (NCCs), leading to loss of green color and of visible-light absorption [
The key controlling enzyme involved in the chlorophyll degradation pathway is pheophorbide
A role for transcriptional regulation of the chlorophyll degradation pathway is suggested by several findings [
The chlorophyll degradation pathway is relevant to the agricultural sector by modulating postharvest fruit and vegetable color [
Three experiments on the analysis of global gene expression under conditions that should influence PAO gene expression in
Statistical and correlation analyses were performed on each microarray dataset using scripts written specifically for this project in the Statistical Analysis System (SAS) programming language (
Functional determination of upstream regulatory candidates was accomplished using two complementary methods. The first method used the structured query language (SQL) functionality native to the SAS statistical programming language. Pre-stored gene ontology data were joined with the shortened co-expressed gene list using a custom SQL query to determine the functions of these genes and whether they belonged to any transcription factor families of interest. The second method relied on the Database for Annotation, Visualization and Integrated Discovery (DAVID) analysis tool [
First, the three microarray datasets were analyzed using ANOVA for comparison of PAO expression levels across all plant lines, environmental conditions, and biological processes, finding in each case that PAO expression levels significantly differed at P<0.05 among the various
ANOVA comparing PAO expression levels found significant differences among plant lines in the GSE55907, GSE5727, and GSE72050 microarray datasets at the P<0.05 level in normal vs. pre-freezing treatment (
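As an illustration of the ANOVA step described above, the following is a minimal Python sketch of a one-way ANOVA F statistic. It is not the authors' SAS code; the function name `one_way_anova_f`, the line labels, and the expression values are hypothetical and were chosen only to show the calculation. Converting F to a P value requires the F distribution, which is not computed here.

```python
from statistics import fmean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA.

    `groups` is a list of lists, one list of expression values per plant line.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = fmean(v for g in groups for v in g)

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (fmean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum((v - fmean(g)) ** 2 for g in groups for v in g)

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical log2 PAO expression values for three plant lines (not study data).
expression_by_line = {
    "line_A": [8.2, 8.4, 8.6],
    "line_B": [9.0, 9.1, 8.9],
    "line_C": [10.3, 10.4, 10.5],
}

f_stat, df_b, df_w = one_way_anova_f(list(expression_by_line.values()))
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

A large F relative to the F distribution with the stated degrees of freedom indicates that mean expression differs among plant lines, which is the kind of result reported here at the P<0.05 level.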
The 24 candidates of interest were characterized by function using an SQL query that merged the candidate list with the pre-loaded ontology data. One candidate, probe name _261564_at, was shown to be a NAC family transcription factor based on the SQL query results (
The 24 candidates predicting PAO levels in the forward and stepwise multiple linear regression models of the GSE55907 dataset were identified by function by performing SQL joins between tables containing gene names and functions. In particular, probe name _261564_at was found to be relevant because of its transcription factor function.
probename | genename | description |
---|---|---|
245169_at | AT2G33220 | similar to MEE4 (maternal effect embryo arrest 4) [ |
245604_at | AT4G14290 | similar to unknown protein [ |
246781_at | AT5G27350 | SFP1; carbohydrate transporter/ sugar porter |
247270_at | AT5G64220 | calmodulin-binding protein |
247438_at | AT5G62460 | zinc finger (C3HC4-type RING finger) family protein |
247792_at | AT5G58787 | zinc finger (C3HC4-type RING finger) family protein |
249409_at | AT5G40340 | PWWP domain-containing protein |
251208_at | AT3G62880 | ATOEP16-4; protein translocase |
251427_at | AT3G60130 | glycosyl hydrolase family 1 protein / beta-glucosidase, putative (YLS1) |
251893_at | AT3G54380 | SAC3/GANP family protein |
252180_at | AT3G50630 | ICK2 (KIP-RELATED PROTEIN 2) |
252570_at | AT3G45300 | IVD (ISOVALERYL-COA-DEHYDROGENASE) |
254622_at | AT4G18375 | KH domain-containing protein |
255154_at | AT4G08220 | |
255386_at | AT4G03620 | myosin heavy chain-related |
255535_at | AT4G01790 | ribosomal protein L7Ae/L30e/S12e/Gadd45 family protein / ribonuclease P-related |
256084_at | AT1G20750 | helicase-related |
256926_at | AT3G22540 | similar to unknown protein [ |
261023_at | AT1G12200 | flavin-containing monooxygenase family protein / FMO family protein |
261564_at | AT1G01720 | ATAF1 ( |
261880_at | AT1G50500 | HIT1 (HEAT-INTOLERANT 1); transporter |
262429_at | AT1G47520 | |
264335_s_at | AT1G55860;AT1G70320 | [AT1G55860, UPL1 (UBIQUITIN-PROTEIN LIGASE 1); ubiquitin-protein ligase];[AT1G70320, UPL2 (UBIQUITIN-PROTEIN LIGASE 2); ubiquitin-protein ligase] |
265792_at | AT2G01390 | pentatricopeptide (PPR) repeat-containing protein |
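The join between a candidate list and stored annotation data can be sketched as follows. This is a hedged Python illustration using the standard-library `sqlite3` module, not the authors' SAS SQL query. The table layout, the `tf_family` column, and the function name `find_tf_candidates` are assumptions made for the example; the probe names and gene IDs are taken from the table above, and the NAC label for AT1G01720 follows the text.

```python
import sqlite3


def find_tf_candidates(candidates, annotations):
    """Join candidate probes to annotation rows and keep transcription factors.

    candidates  : list of probe names
    annotations : list of (probename, genename, description, tf_family) tuples,
                  where tf_family is None for non-transcription-factor genes
                  (this layout is hypothetical)
    """
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE candidates (probename TEXT PRIMARY KEY)")
        conn.execute(
            "CREATE TABLE annotation ("
            "probename TEXT PRIMARY KEY, genename TEXT, "
            "description TEXT, tf_family TEXT)"
        )
        conn.executemany(
            "INSERT INTO candidates VALUES (?)", [(p,) for p in candidates]
        )
        conn.executemany("INSERT INTO annotation VALUES (?, ?, ?, ?)", annotations)

        # Inner join keeps only annotated candidates; the WHERE clause keeps
        # those assigned to a transcription factor family.
        rows = conn.execute(
            """
            SELECT c.probename, a.genename, a.tf_family
            FROM candidates AS c
            INNER JOIN annotation AS a ON a.probename = c.probename
            WHERE a.tf_family IS NOT NULL
            ORDER BY c.probename
            """
        ).fetchall()
    finally:
        conn.close()
    return rows


# A few rows from the candidate table; tf_family values are illustrative.
annotation_rows = [
    ("247270_at", "AT5G64220", "calmodulin-binding protein", None),
    ("251893_at", "AT3G54380", "SAC3/GANP family protein", None),
    ("261564_at", "AT1G01720", "ATAF1", "NAC"),
    ("261880_at", "AT1G50500", "HIT1 (HEAT-INTOLERANT 1); transporter", None),
]
candidate_probes = ["247270_at", "261564_at", "261880_at"]

tf_hits = find_tf_candidates(candidate_probes, annotation_rows)
print(tf_hits)
```

An inner join is used so that probes lacking an annotation row drop out of the result instead of appearing with empty fields.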
To investigate ATAF1 further, correlation analysis was performed between ATAF1 and PAO transcript expression levels, yielding positive Pearson’s correlation coefficients greater than 0.5 with P<0.05 in all three microarray datasets (
Pearson correlation analysis showed that ATAF1 expression correlates positively and significantly with PAO expression at the P<0.05 level in all three microarray datasets. Information about the genetics and physiology of the datasets is also provided.
Genetics | Physiology | Pearson’s r | P value |
---|---|---|---|
Overexpressing CBF/other TFs | Pre-freezing treatment | 0.67205 | <0.0001 |
Mutant JA/ethylene and SA pathway | Pre- vs. during senescence | 0.88708 | 0.0006 |
Overexpressing NAC26 | Drought vs. well-water | 0.84540 | 0.0082 |
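For readers who want to see how a Pearson coefficient of this kind is obtained, the following is a minimal Python sketch, not the authors' SAS procedure. The function names and the expression values are hypothetical. The sketch also computes the t statistic used to test the null hypothesis of zero correlation; the P value comes from the t distribution with n - 2 degrees of freedom and is not computed here.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 3:
        raise ValueError("x and y must have the same length (at least 3)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic for H0: no correlation, with n - 2 degrees of freedom."""
    if abs(r) >= 1.0:
        # A perfect correlation gives an unbounded statistic.
        return math.copysign(math.inf, r)
    return r * math.sqrt((n - 2) / (1.0 - r * r))


# Hypothetical log2 expression values across six samples (not study data).
ataf1_expr = [6.4, 6.5, 7.6, 6.2, 7.4, 7.7]
pao_expr = [8.4, 8.3, 9.0, 8.1, 8.9, 9.5]

r = pearson_r(ataf1_expr, pao_expr)
t = t_statistic(r, len(ataf1_expr))
print(f"r = {r:.3f}, t = {t:.3f}, df = {len(ataf1_expr) - 2}")
```

Because the P value depends on sample size as well as on r, the same coefficient can be significant in a large dataset and not in a small one.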
The fit plot shows the values of ATAF1 (x-axis) and PAO (y-axis) in each observation, visually showing a positive correlation. The model from the GLM is overlaid onto the existing data of the GSE55907, GSE5727, and GSE72050 microarray datasets and the existing data fit within the 95% prediction limits of the model in normal vs. pre-freezing treatment (
Finally, frequency analysis repeated for ATAF1 side-by-side with PAO showed that in all three microarray datasets, the two genes correlated positively among plant lines, with ATAF1 levels being lower in plant lines where PAO expression was also low, while ATAF1 levels were higher in plant lines where PAO expression was also high (
Both PAO (left-side) and ATAF1 (right-side) expression differed by plant line in a positively correlated manner in the GSE55907, GSE5727, and GSE72050 microarray datasets, where PAO and ATAF1 levels were either both high or both low in each plant line.
PAO | PAO | PAO | PAO | ATAF1 | ATAF1 | ATAF1 | ATAF1 |
---|---|---|---|---|---|---|---|
8.40 | 0.30 | 8.18 | 8.61 | 6.41 | 0.52 | 6.04 | 6.78 |
8.43 | 0.71 | 7.92 | 8.93 | 6.40 | 0.35 | 6.15 | 6.64 |
9.03 | 0.15 | 8.92 | 9.13 | 7.59 | 0.41 | 7.30 | 7.88 |
8.11 | 0.18 | 7.98 | 8.23 | 6.19 | 0.45 | 5.87 | 6.51 |
8.32 | 0.36 | 8.06 | 8.57 | 6.50 | 0.08 | 6.44 | 6.55 |
9.47 | 0.23 | 9.31 | 9.63 | 7.64 | 0.05 | 7.60 | 7.67 |
8.55 | 0.11 | 8.47 | 8.62 | 6.62 | 0.05 | 6.58 | 6.65 |
8.53 | 0.02 | 8.51 | 8.54 | 6.45 | 0.11 | 6.37 | 6.53 |
8.27 | 0.11 | 8.19 | 8.35 | 6.88 | 0.08 | 6.82 | 6.93 |
8.51 | 0.14 | 8.41 | 8.61 | 6.46 | 0.29 | 6.25 | 6.66 |
8.46 | 0.16 | 8.34 | 8.57 | 7.06 | 0.42 | 6.76 | 7.36 |
8.92 | 0.11 | 8.84 | 8.99 | 7.44 | 0.10 | 7.37 | 7.51 |
8.96 | 0.20 | 8.82 | 9.10 | 6.45 | 0.31 | 6.23 | 6.67 |
8.58 | 0.02 | 8.56 | 8.59 | 6.95 | 0.25 | 6.77 | 7.13 |
10.39 | 0.16 | 10.28 | 10.50 | 7.22 | 0.06 | 7.17 | 7.26 |
8.56 | 0.00 | 8.56 | 8.56 | 5.17 | 0.39 | 4.89 | 5.44 |
9.99 | 0.28 | 9.79 | 10.18 | 7.73 | 0.21 | 7.58 | 7.87 |
10.05 | 0.29 | 9.84 | 10.25 | 7.49 | 0.02 | 7.47 | 7.50 |
9.92 | 0.21 | 9.77 | 10.07 | 7.60 | 0.08 | 7.54 | 7.65 |
8.38 | 0.04 | 8.35 | 8.40 | 7.28 | 0.25 | 7.10 | 7.45 |
9.22 | 0.45 | 8.90 | 9.54 | 7.96 | 0.06 | 7.92 | 8.00 |
7.88 | 0.14 | 7.78 | 7.98 | 7.34 | 0.00 | 7.34 | 7.34 |
9.37 | 0.04 | 9.34 | 9.40 | 7.98 | 0.07 | 7.93 | 8.03 |
The analysis of the GSE55907, GSE5727, and GSE72050 microarray datasets revealed that PAO expression differed among the different plant lines, which were either wild type, overexpressing, transgenic, or mutant lines, as well as among the differing experimental conditions such as pre-freezing treatment vs. normal conditions, before vs. during senescence, or drought vs. well-water conditions (
To determine the potential indirect genetic factors regulating PAO expression in
Further correlation analysis and regression modeling pointed to the same conclusion that ATAF1 significantly predicts PAO and that PAO is positively correlated with ATAF1 (
The above schematic presents a summary of the findings and our current understanding of the relationship between the three different conditions, cold, senescence, and drought, with ATAF1 and PAO gene expression and chlorophyll (Chl) levels in plants. Pointed arrows represent direct relationships between the entities while blind-ended arrows represent inverse relationships between the entities.
Our model is further supported by several recent findings showing that NAC domain transcription factors are important regulators of responses to abiotic factors, including temperature, and of plant processes such as leaf senescence. For example, NAC domain family member NAM-B1 was found to accelerate senescence and increase zinc and iron content in wheat grain [
PAO transcript expression was found to be significantly up-regulated in warm conditions, during leaf senescence, and in drought conditions, and in all three conditions significantly positively correlated with expression of ATAF1, a NAC transcription factor implicated in the literature as being related to all three of these types of conditions. This analysis posits a regulatory network in which ATAF1 is triggered in response to these abiotic stresses and acts to regulate chlorophyll degradation by up-regulating PAO expression.
PAO is represented by probe name _246335_at, while the Plant class variable represents the 14 plant lines of the study: wild type under different temperatures, or transgenic lines overexpressing CBF-induced transcription factors. Note the significant effect of Plant (p = 0.0089). Box plots display the lower to upper quartiles, with the central horizontal line representing the median and the diamond representing the mean.
(TIF)
PAO is represented by probe name _246335_at, while the Plant class variable represents the 5 plant lines of the study: wild type before and during senescence, or mutant or transgenic lines in relevant senescence pathways. Note the significant effect of Plant (p = 0.0023). Box plots display the lower to upper quartiles, with the central horizontal line representing the median and the diamond representing the mean.
(TIF)
PAO is represented by probe name _246335_at, while the Plant class variable represents the 4 plant lines of the study: wild type or the NAC26 line in well-water or drought conditions. Note the significant effect of Plant (p = 0.0092). Box plots display the lower to upper quartiles, with the central horizontal line representing the median and the diamond representing the mean.
(TIF)
After performing stepwise selection with a multiple linear regression model containing all PAO-correlated genes, these 24 candidates were identified as together forming a significant model (p<0.05) predicting PAO expression and explaining all variation in PAO expression (R² = 1).
(DOCX)
After performing forward selection with a multiple linear regression model containing all PAO-correlated genes, these 24 candidates were identified as together forming a significant model (p<0.05) predicting PAO expression and explaining all variation in PAO expression (R² = 1).
(DOCX)
The 24 candidates predicting PAO in the multiple linear regression models were annotated using the DAVID tool. In particular, probe name _261564_at, here aliased by TAIR ID AT1G01720, was found to be relevant because of its transcription factor function.
(DOCX)
The written SAS code is presented below for the statistical analysis of the GSE55907, GSE5727, and GSE72050 microarray datasets.
(DOCX)
We thank the University of Illinois Department of Crop Sciences and the Institute for Genomic Biology for their support.