Skip to main content
  • Loading metrics

Discovery of photosynthesis genes through whole-genome sequencing of acetate-requiring mutants of Chlamydomonas reinhardtii

  • Setsuko Wakao ,

    Roles Formal analysis, Investigation, Writing – original draft, Writing – review & editing (SW); (KKN)

    Affiliations Division of Molecular Biophysics and Integrated Bioimaging, Lawrence Berkeley National Laboratory, Berkeley, California, United States of America, Department of Plant and Microbial Biology, University of California, Berkeley, California, United States of America

  • Patrick M. Shih,

    Roles Conceptualization, Formal analysis, Investigation, Writing – original draft, Writing – review & editing

    Affiliations Department of Plant and Microbial Biology, University of California, Berkeley, California, United States of America, Division of Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, California, United States of America, Feedstocks Division, Joint BioEnergy Institute, Emeryville, California, United States of America, Innovative Genomics Institute, University of California, Berkeley, California, United States of America

  • Katharine Guan,

    Roles Investigation

    Affiliations Department of Plant and Microbial Biology, University of California, Berkeley, California, United States of America, Howard Hughes Medical Institute, University of California, Berkeley, California, United States of America

  • Wendy Schackwitz,

    Roles Formal analysis

    Affiliation Joint Genome Institute, Berkeley, California, United States of America

  • Joshua Ye,

    Roles Investigation

    Affiliations Department of Plant and Microbial Biology, University of California, Berkeley, California, United States of America, Howard Hughes Medical Institute, University of California, Berkeley, California, United States of America

  • Dhruv Patel,

    Roles Investigation

    Affiliation Department of Plant and Microbial Biology, University of California, Berkeley, California, United States of America

  • Robert M. Shih,

    Roles Investigation

    Affiliation Division of Molecular Biophysics and Integrated Bioimaging, Lawrence Berkeley National Laboratory, Berkeley, California, United States of America

  • Rachel M. Dent,

    Roles Investigation

    Affiliation Department of Plant and Microbial Biology, University of California, Berkeley, California, United States of America

  • Mansi Chovatia,

    Roles Investigation

    Affiliation Joint Genome Institute, Berkeley, California, United States of America

  • Aditi Sharma,

    Roles Investigation

    Affiliation Joint Genome Institute, Berkeley, California, United States of America

  • Joel Martin,

    Roles Data curation

    Affiliation Joint Genome Institute, Berkeley, California, United States of America

  • Chia-Lin Wei,

    Roles Resources

    Present address: Jackson Lab, Farmington, Connecticut, United States of America

    Affiliation Joint Genome Institute, Berkeley, California, United States of America

  • Krishna K. Niyogi

    Roles Conceptualization, Funding acquisition, Writing – original draft, Writing – review & editing (SW); (KKN)

    Affiliations Division of Molecular Biophysics and Integrated Bioimaging, Lawrence Berkeley National Laboratory, Berkeley, California, United States of America, Department of Plant and Microbial Biology, University of California, Berkeley, California, United States of America, Howard Hughes Medical Institute, University of California, Berkeley, California, United States of America


Large-scale mutant libraries have been indispensable for genetic studies, and the development of next-generation genome sequencing technologies has greatly advanced efforts to analyze mutants. In this work, we sequenced the genomes of 660 Chlamydomonas reinhardtii acetate-requiring mutants, part of a larger photosynthesis mutant collection previously generated by insertional mutagenesis with a linearized plasmid. We identified 554 insertion events from 509 mutants by mapping the plasmid insertion sites through paired-end sequences, in which one end aligned to the plasmid and the other to a chromosomal location. Nearly all (96%) of the events were associated with deletions, duplications, or more complex rearrangements of genomic DNA at the sites of plasmid insertion, and together with deletions that were unassociated with a plasmid insertion, 1470 genes were identified to be affected. Functional annotations of these genes were enriched in those related to photosynthesis, signaling, and tetrapyrrole synthesis as would be expected from a library enriched for photosynthesis mutants. Systematic manual analysis of the disrupted genes for each mutant generated a list of 253 higher-confidence candidate photosynthesis genes, and we experimentally validated two genes that are essential for photoautotrophic growth, CrLPA3 and CrPSBP4. The inventory of candidate genes includes 53 genes from a phylogenomically defined set of conserved genes in green algae and plants. Altogether, 70 candidate genes encode proteins with previously characterized functions in photosynthesis in Chlamydomonas, land plants, and/or cyanobacteria; 14 genes encode proteins previously shown to have functions unrelated to photosynthesis. Among the remaining 169 uncharacterized genes, 38 genes encode proteins without any functional annotation, signifying that our results connect a function related to photosynthesis to these previously unknown proteins. This mutant library, with genome sequences that reveal the molecular extent of the chromosomal lesions and resulting higher-confidence candidate genes, will aid in advancing gene discovery and protein functional analysis in photosynthesis.

Author summary

Oxygenic photosynthesis in plants, algae, and cyanobacteria is responsible for the production of organic matter and oxygen that supports most life on Earth, yet we still do not fully understand all the genes that are necessary for this fundamental biological process. Forward genetics is an unbiased approach to finding genes that are important for a biological process of interest. To identify genes that are necessary for oxygenic photosynthesis in the unicellular green alga Chlamydomonas reinhardtii, we sequenced the genomes of 660 mutants that are defective in photosynthesis. By manual curation of each mutant and its disrupted gene(s) and cross-comparison with previous photosynthesis mutant studies, we have identified 253 genes in 328 mutants that are strong candidate genes for having roles in photosynthesis. The list of 253 genes consists of 70 known photosynthesis genes, 169 previously uncharacterized genes, and 14 genes previously characterized for functions unrelated to photosynthesis. In addition, we gained insight into the molecular events that accompany insertional mutagenesis in Chlamydomonas, such as large deletions and chromosomal rearrangements. Our mutant library with documented mutant phenotypes, sequenced genomes, and candidate genes will aid in the discovery of photosynthesis genes.


Since the dawn of modern genetics, mutagenesis has been the primary vehicle to perturb the underlying genetic code of organisms, enabling scientists to investigate the genetic determinants underpinning biological systems. In the case of photosynthesis, much has been learned through mutagenesis of the unicellular green alga, Chlamydomonas reinhardtii, which has proven to be an indispensable reference organism for investigating the molecular components, regulation, and overall processes of photosynthesis [1,2]. Chlamydomonas has a haploid genome and an ability to use acetate as a sole carbon source, which facilitates the isolation and analysis of knock-out mutants that are defective in photosynthesis [3]. Moreover, the advantage of working with a unicellular alga rather than a whole plant has facilitated the speed with which molecular and genetic studies can be carried out [4]. Thus, the development of resources and tools to increase the breadth and depth of genetic studies in Chlamydomonas has advanced our ability to understand the molecular basis of photosynthesis.

Numerous large-scale mutagenesis and screening experiments have been carried out in Chlamydomonas, with some of the earliest efforts described over half a century ago [3,5,6]. Classical mutagenesis studies have utilized chemical and physical mutagens, which induce untargeted genomic lesions and rearrangements across the genome. Identifying the causative mutations requires genetic mapping through crosses, an approach that is robust but time consuming. Insertional mutagenesis approaches, in which a selectable marker is transformed and randomly integrated into the genome, have facilitated molecular analysis, and many PCR-based techniques have been successfully employed in Chlamydomomas to rapidly identify flanking sequence tags (FSTs) from the site of marker insertion [714]. However, the efficiency of FST recovery can be low [7] because of the complexity of events accompanying plasmid insertion such as concatemerization, chromosomal deletion or rearrangement, loss of the primer annealing sites, as well as difficulties with PCR from the Chlamydomonas nuclear genome, which is GC-rich and contains a high degree of repetitive sequences [15]. High-throughput FST recovery has been achieved in Chlamydomonas [8,10] and has offered a large collection of insertional mutants for the scientific community while enabling large-scale mutant analysis of photoautotrophic growth [9].

The advent of next-generation sequencing methods has dramatically improved our ability to identify mutations by whole-genome sequencing (WGS). In Chlamydomonas, this approach was initially combined with linkage mapping to identify point mutations in flagellar mutants [11,12], and it was used subsequently for point mutations affecting the cell cycle [13,14] and light signaling [16,17]. In the case of insertional mutants, WGS has been used extensively to identify insertion sites in bacteria and some microbial eukaryotes with smaller genomes [1820] but only for a relatively small number of mutants in Chlamydomonas [21]. In maize, due to its large genome, high-throughput next-generation sequencing of Mu transposon insertion sites has been applied only after enrichment for the transposon sequence [22], whereas the large volume of insertion site information of T-DNA insertion lines in Arabidopsis was obtained from traditional PCR-based FST isolation [2325].

We have previously generated a large insertional mutant population of Chlamydomonas by transformation with a linearized plasmid conferring paromomycin or zeocin resistance, and we identified mutants with photosynthetic defects (i.e., acetate-requiring and/or light-sensitive and reactive oxygen species-sensitive mutants) [7,26]. However, we were only able to obtain FSTs for 17% of the mutants using PCR-based approaches. Here we employed low-coverage WGS of a subset of 660 mutants to identify the plasmid insertion sites and accompanying structural variants, and we found 1470 genes that are affected by the plasmid insertion in 509 mutants. We generated a list of 253 genes from 328 mutants that we refer to as higher-confidence causative genes, enabling the discovery of 183 potential photosynthesis genes: 169 genes of previously unknown function and 14 genes previously shown to have functions unrelated to photosynthesis. We experimentally validated two genes, CrLPA3 and CrPSBP4, that are required for photoautotrophic growth in Chlamydomonas. In addition, our data provide insight into the spectrum of mutations that are induced by insertional mutagenesis in Chlamydomonas.


Identification of insertion sites by mapping of discordant read pairs

We re-screened our Chlamydomonas photosynthetic mutant collection [7,26] for growth on minimal and acetate-containing media under three light conditions (dark, D; low light of 60–80 μmol photons m-2 s-1, LL; and high light of 350–400 μmol photons m-2 s-1, HL) and for maximum photochemical efficiency of photosystem (PS) II (Fv/Fm) (S1 Table). An example of the phenotyping is shown in Fig 1. A total of 660 mutants, most of them with a growth phenotype and with resistance to either zeocin or paromomycin, indicative of the presence of the linearized plasmid sequence used for insertional mutagenesis, were chosen for WGS and herein will be referred to as the Acetate-Requiring Collection (ARC).

Fig 1. Growth and chlorophyll fluorescence screen pipeline.

Mutants were scored for growth on (A) D+ac, (B) LL+ac, (C) HL+ac, (D) LL+ac+zeocin, (E) LL-min, (F) HL-min. Fv/Fm values were measured on cells grown on (G) D+ac, (H) LL-min, (I) HL-min. The false color scale is indicated below the images (G-I). FST, flanking sequence tag. A representative plate spotted from a 96-well plate is shown. D, dark; LL, low light; HL, high light; +ac, added acetate; min, minimal media.

Genomic DNA was extracted from the 660 ARC mutants and submitted for low-coverage, paired-end WGS with a target depth of sequence coverage for each mutant between 5 and 10. The average sequencing depth across samples was 7.44. Paired-end reads that showed one end mapping to the plasmid used for mutagenesis and the other to a chromosome location were used to identify the plasmid insertion site(s) in each mutant. Plasmid insertion sites were not identified for 72 mutants, because few plasmid sequence reads were detected or the other end mapped to a low complexity region of the Chlamydomonas genome. 79 mutants had insertions that were not unique within the population (33 were duplicated, three were triplicated and one was quadruplicated) and were removed from further analysis. The remaining 509 mutant sequences were further analyzed for structural variants (insertions, deletions, and rearrangements) that occurred during insertional mutagenesis.

Fig 2 illustrates the types of structural variants detected by analysis of the paired-end sequence data. Most sequence read pairs were concordant, i.e., they showed the expected orientation and distance with respect to each other when mapped to the Chlamydomonas genome (Fig 2, dark gray arrows). In contrast, discordant pairs showed the incorrect orientation or distances that were closer or further from each other than expected based on the genome fragmentation that was performed during sequencing library preparation (genomic DNA was sheared to approximately 600 bp) or on different chromosomes. In Fig 2, the discordant reads are shown as colored arrows, with each color representing a chromosome (or plasmid) to which the corresponding paired-end read was mapped. Each of these genomic sites where sequence read pairs were discordant is listed in S1 Table as a “Discordant site”. A total of 681 discordant sites were identified as being associated with plasmid insertions in the 509 mutants.

Fig 2. Examples of structural variations and the frequency of mutants with simple or complex insertions in ARC.

Boxes contain schematic examples of mapped reads as seen in IGV. Black box, mapped reads (concordant and discordant) against plasmid and chromosome. Blue box, examples of “Simple insertions”; Gray box, examples of “Complex insertions”. Gray box shows examples of different complex insertions that are intra- or interchromosomal rearrangements. Second from left in gray box shows a possible translocation between two chromosomes. Pie chart shows frequency of “Simple mutants” containing only simple insertions and “Complex mutants” containing complex insertions.

Defining insertions by whether they are simple or complex (i.e., involving rearrangements).

At most of the plasmid insertion sites, two sets of discordant read pairs were found, with their chromosomal reads oriented toward each other and their paired-end reads mapping to the plasmid sequence (Fig 2A). We refer to these 425 events as two-sided insertions, where both sides of the plasmid insertion were unambiguously mapped (S1 Table, column “Number of sides paired with plasmid at site”, 2). Another large group of discordant sites displayed only one set of discordant read pairs located on one side of the plasmid insertion (referred to as one-sided insertions Fig 2 and S1 Table, column “Number of sides paired with plasmid at site”, 1). The read-pairs on the other side of the plasmid insertion could not be mapped in 21 of these insertion sites because (i) it was at a repetitive region (14 mutants) and (ii) it had no discordant reads (7 mutants). These 21 one-sided insertions together with the 425 two-sided insertions making a total of 446 insertions were considered to be simple insertions (S1 Fig and S1 Table). In the rest of the one-sided insertions, the other side of the plasmid insertion paired with another chromosomal region indicating an occurrence of a more complex chromosomal rearrangement. An insertion that paired with another chromosomal location was considered a complex insertion. The frequencies of two-sided, one-sided, and complex insertions are shown in S1 Fig. Whether or not the discordant site paired with another site within the same mutant is listed in the column “Pairing with other discordant site(s) in the same mutant,” and such paired sites were considered to comprise a single lesion. The columns “Number of discordant sites for the mutant” and “Number of lesions for the mutant” specify these numbers for a single mutant.

Mutants that contain only simple insertions and those that contain a complex insertion.

A total of 406 out of 509 mutants (80%) contained only simple insertions accounting for 435 out of the total 446 simple insertions (Fig 2C, Mutants with only simple insertions) (the remaining 11 simple insertions existed in mutants that also contained complex insertions). Among these 406 mutants, 24 mutants had multiple two-sided insertions accounting for 50 insertions, and three mutants had one two-sided insertion and one one-sided insertion (Fig 2C and S1 Fig). In 17 mutants, the multiple simple insertions occurred on the same chromosome, and six of these had tandem two-sided insertions that disrupted the same or neighboring genes. In 10 two-sided insertions (~1.8%), there appeared to be a short random fragment of another chromosome inserted together with the plasmid (Fig 2A, Two-sided insertion with random genome fragment). The original locus of these random fragments did not show a lack of mapped sequence reads but rather showed double the abundance of reads mapping to the small region, indicating that it was an extra copy of the same sequence at the insertion site, similar to what was observed in a previous study but at a lower frequency in ARC [8].

The other group of 103 mutants (20%) contained at least one complex insertion (Fig 2C “Mutants with complex insertions” and S1 Table, “Pairing with other discordant site(s) of the same mutant”). Some of these rearrangements occurred on a single chromosome, and others involved two or more chromosomes (Fig 2B). Among interchromosomal rearrangements, 13 of them involved two one-sided insertions that were paired to each other (Fig 2 gray box). These together may represent chromosomal translocation events resulting in two chimera chromosomes. In all of these possible translocation events, the plasmid sequence was present in one junction and not in the other. The proportion of complex insertion events was similar among the three plasmids used for transformation (pSP124S, pMS188, and pBC1). Validation of these complex structural variants would require de novo assembly of sequencing reads. Most mutants contained only two-sided or only complex insertions; 387 mutants (76%) had only two-sided insertion(s) (Fig 2C, red and orange slices), 92 had only complex insertion(s) (18%) (Fig 2C, light blue slice), and only a small proportion of mutants contained a mix of two-sided, one-sided, or complex insertions. Understanding whether or not a mutant contains only simple insertions is important in the search for causative mutations as discussed in following sections.

In summary, low-coverage WGS data for 509 ARC mutants identified 406 mutants that contained only simple insertions, whereas 103 mutants contained complex insertions that were associated with chromosomal rearrangements such as inversions and translocations.

Analysis of deletions and duplications associated with insertional mutagenesis

Insertional mutagenesis in Chlamydomonas has been previously associated with deletions and duplications at the site of plasmid insertion, especially when using glass bead for transformation (e.g. cpld38, cpld49, npq4, rbd1) [2729]. Focusing on the 425 two-sided insertions, we found deletions associated with 374 insertions (88%). A wide range of deletion sizes was observed, with a bimodal distribution peaking at 11–100 bp and 10–100 kb when plotted at log10-scale, the largest deletion being 133 kb (Fig 3A). Duplications occurred less frequently (7%), in a total of 29 insertion events (Fig 3B), and all were less than 1000 bp. Perfect insertions lacking any duplications or deletions were found in only 22 events (5%). Despite the high frequency and relatively large size of many deletions, more than half (220 insertions) of the entire set of 425 two-sided insertions affected only a single gene (Fig 3C).

Fig 3. Structural variation accompanying insertions.

(A) Duplication and deletion sizes and (B) number of mutants grouped by the number of genes affected by two-sided insertions. Only two-sided insertions were included in this analysis.

Genetic linkage between acetate-requiring phenotype and antibiotic resistance

To determine if the phenotype of ARC mutants was likely caused by the plasmid insertion, we back-crossed 89 mutants to the wild type (WT) and analyzed the genetic linkage of the acetate-requiring phenotype and antibiotic (paromomycin) resistance in the respective progenies. The acetate-requiring phenotype was closely linked to the antibiotic resistance in 88% (77 out of 88 that produced viable zygospores) of mutants that were tested (S1 Table, column “Genetic Linkage”). In each cross, approximately 100 zygospores were collected and tested for recombination between the acetate-requiring phenotype and paromomycin resistance by selecting for progeny that were able to grow on minimal medium with paromomycin (S2 Fig). The lack of recombination and therefore growth indicates that the genetic distance between the mutation causing the acetate-requiring phenotype and paromomycin resistance is less than 0.5 cM, estimated to be 50 kb on average in the Chlamydomonas genome [15].

Identification of secondary mutations using WGS data

In addition to the deletions associated with plasmid insertions in the ARC mutants, we searched for and found 68 other deletions using Pindel [30] (S2 Table). The size of the deletions ranged from 20 bp to 36 kb, with a majority of them (55 deletions, 77 genes) being less than 100 bp (S2 Table). The deletions were visually confirmed on alignments as direct gaps in reads and/or the lack of reads within the region, depending on the size. This was not expected to be an exhaustive search for such deletions. For example, low-coverage regions could be difficult to distinguish from a deletion. Nevertheless, some of the deletions affected clear candidate genes that could be responsible for the mutant phenotype. For example, the CAL014_01_19 mutant was found to contain a 21-bp deletion in Cre01.g013801, a GreenCut2 gene (conserved within genomes of land plants and green algae but absent from non-photosynthetic organisms [15,31]) annotated as a tocopherol cyclase (VTE1). The deletion occurred at the junction of intron 7 and exon 8, which could affect splicing and translation of a functional protein (S3 Fig). Because tocopherols are important for photoprotection in Chlamydomonas [32] disruption in the VTE1 gene could explain this mutant’s high light-sensitive phenotype (S1 Table). In support of this hypothesis, a second mutant in the ARC, CAL033_02_19, had a 33-bp deletion in this locus. Interestingly, this mutant has a less severe phenotype (S1 Table), consistent with the plasmid insertion and deletion positioned in the 3’-UTR of the gene, which may have led to a partial loss of function (S3 Fig).

Among the 11 mutants whose acetate-requiring phenotype did not cosegregate with its paromomycin resistance, one (CAL036_02_12) had a strong acetate-requiring phenotype (S1 Table) and contained a secondary 36-kb deletion unassociated with the plasmid sequence and located 2 Mb away from the plasmid insertion on chromosome 7 (S2 Table). This resulted in a deletion of seven genes (Cre07.g346050, Cre07.g346100, Cre07.g346150, Cre07.g346200, Cre07.g346250, Cre07.g346300, and Cre07.g346317). One of these (Cre07.g346050) is COPPER RESPONSE DEFECT 1 (CRD1), and crd1 mutants have a conditional phenotype, lacking accumulation of PSI only under copper deficiency [33]. Another mutant (CAL029_03_36) has a one-sided insertion in CRD1 and was only modestly affected in growth in HL (S1 Table), suggesting that the loss of CRD1 is not the sole cause of the severe growth phenotype of CAL036_02_12. Another one of the deleted genes is annotated as phytol kinase (Cre07.g346300). Chlorophyll degradation and phytol remobilization through phytol kinase (VTE5) and phytol phosphate kinase (VTE6) are important for α-tocopherol biosynthesis and their disruption results in high light sensitivity in tomato [34] and Arabidopsis [35]. The light sensitivity observed in CAL036_02_12 is similar to that of tomato plants silenced for VTE5 [34] and strongly suggests that Cre07.g346300 is the causative gene for the mutant phenotype. The remaining 10 mutants whose acetate-requiring phenotype is unlinked to the plasmid insertion would be candidates for higher-coverage WGS to search for causative mutations.

Genes with multiple mutant alleles in the ARC

In total, 1404 genes were directly affected by the 554 plasmid insertions in 509 mutants. There are many more affected genes compared to the number of mutants from which they originate due to disruption of multiple genes by large deletions. Additionally, among the 77 genes affected by deletions that were unassociated with plasmid insertions (S2 Table), 11 overlapped with the 1404 genes, bringing the total number of genes disrupted in our library to 1470. S3 Table lists all of the 1470 genes and their available annotations.

To begin identifying causative mutations, we searched for genes that were affected in multiple ARC mutants. Fig 4A shows the number of alleles of each gene affected by plasmid-associated insertions (S1 Table). Interruption/deletion of 1053 genes only occurred once, while 212 genes have two alleles and 94 genes have three alleles. Some genes appeared on the list of affected genes more than three times (Fig 4A). However, because disruption of multiple genes occurred in approximately half of the ARC mutants, many of these genes represented by multiple alleles are likely not causative for the mutant phenotype. Some of the genes appear more frequently on the list simply because of their proximity to the causative gene. Fig 4B shows an example of such an occurrence for CPSFL1 (Cre10.g448051). Seven ARC mutants had deletions ranging from 22 to 130 kb in a region on chromosome 10 (CAL028_01_03, CAL033_04_04, CAL031_01_04, CAL039_03_10, CAL007_02_07, CAL038_02_20, and CAL028_01_06) (S1 Table). 33 genes were affected by the deletions in these mutants, including seven genes affected in all seven mutants, which makes it difficult to narrow down to a single causative gene. One additional mutant (CAL29_02_48) had a complex insertion event involving four different chromosomes, but strikingly it shared a single affected gene (CPSFL1, containing a 10-bp deletion) with the other seven mutants. All eight mutants exhibited a strict acetate-requirement and severe light-sensitivity phenotype (S1 Table), and in-depth characterization of the CAL028_01_06 mutant showed that CPSFL1 is involved in carotenoid accumulation and is essential for photoautotrophic growth in Chlamydomonas and Arabidopsis [36,37].

Fig 4. Genes represented by multiple mutant alleles are more likely to be causative genes.

(A) Number of genes affected by plasmid-associated insertions in ARC grouped by the number of mutant alleles that represent the gene. Schematic of mutant alleles disrupted in (B) cpsfl1 mutants and (C) lpa3 mutants and the allele frequencies of surrounding genes. Note that not all genes with multiple mutant alleles are causative; some genes belong to this group because of their physical proximity to the true causative genes. CAL040_01_25 (lpa3-3, Fig 5) is indicated in gray because this mutant is not included in S1 Table or the analysis for panel A but represents the ninth mutant allele of cpsfl1.

The CrLPA3 gene (Cre03.g184550, hereon LPA3) is another example of a gene that was affected in multiple mutants (Fig 4C). The CAL014_01_47, CAL028_01_27, CAL039_03_42, CAL029_03_31, and CAL036_02_09 mutants had overlapping deletions ranging from 28 bp to 32 kb in the same region on chromosome 3, and all five mutants exhibited a strict acetate-requiring phenotype in HL (S1 Table). By comparing the disruption frequencies, we identified LPA3 as the only gene that was affected in all five mutants.

LPA3 and PSBP4 are essential for photoautotrophic growth and accumulation of the photosystems

We proceeded to validate the WGS data and identify two genes as necessary for photoautotrophic growth in Chlamydomonas. In one case (LPA3), multiple alleles were present in the ARC, whereas only a single allele of the other gene CrPSBP4 (hereon PSBP4) was present. Three lpa3 mutants (CAL028_01_27, CAL039_03_42, and CAL040_01_25) were selected for further analysis (and renamed as lpa3-1, lpa3-2, and lpa3-3, respectively). The WGS data indicated that the lpa3-1 and lpa3-2 mutants had very similar deletions of 24 kb that affected the same five genes (S1 Table). The deletion was confirmed by amplifying genomic regions across the predicted deletion by PCR in both mutants (Fig 5A), although it was not possible to amplify the plasmid sequence at the site of the deletion. The lpa3-3 mutant was predicted from WGS to have a 4-bp deletion and plasmid insertion in the 5’-UTR of Cre03.g184550, which was confirmed by sequencing a PCR fragment of the region from the mutant (Fig 5A), but it was not included in S1 Table, because it was one of the 79 mutants with a non-unique insertion site (see above in section “Identification of insertion sites by mapping of discordant read pairs”). All three mutants had an acetate-requiring phenotype (Fig 5B). The gene Cre03.g184550 encodes a GreenCut2 protein (CPLD28) [31], and is annotated as an ortholog of Arabidopsis LOW PSII ACCUMULATION 3 (LPA3). Arabidopsis LPA3 has been reported to be involved in the assembly of photosystem II [38], although the publication on the function of this protein was later retracted [39]. Complementation with a genomic DNA clone of Cre03.g184550 (LPA3) including 1.2 kb upstream of the transcription start site rescued all three mutants, demonstrating that the disruption of this gene was responsible for the acetate-requiring phenotype of these mutants. Mutants lacking LPA3 exhibited very low Fv/Fm values even in the dark (Fig 5C). This suggests that Chlamydomonas LPA3 is required for the assembly of PSII even in the absence of light, resulting in a much more severe phenotype than lpa3 single mutants in Arabidopsis that accumulate PSII-LHCII supercomplexes at a slower rate than WT plants [38]. PSII subunits did not accumulate in lpa3 mutants grown in TAP under very low light (0–2 μmol photons m-2 s-1) (Fig 5D). A concomitant overaccumulation of PSI subunits and ATP synthase was observed in lpa3-2 but not in lpa3-3, in which the abundance of PSI subunits was reduced as compared to WT (Fig 5D). This may be due to allele-specific differences; lpa3-2 is a complete knock-out as compared to lpa3-3, which has an insertion in the 5’-UTR. The low Fv/Fm phenotype of the mutants was rescued in the complemented lines in all light conditions (Fig 5B and 5C).

Fig 5. Identification of CrLPA3 and CrPSBP4 required for photoautotrophic growth.

(A) Schematic of loci and deletions indicated from whole-genome sequence data in mutants lpa3-1 (CAL028_01_27), lpa3-2 (CAL039_03_42), and lpa3-3 (CAL040_01_25) that share a disruption in Cre03.g184550, gene encoding a predicted ortholog of Arabidopsis LOW PHOTOSYSTEM II ACCUMULATION 3 (LPA3) and mutant psbp4-1 (CAL032_04_48) that had a deletion encompassing Cre08.g362900, a gene encoding a protein predicted as PSBP4. Numbered arrowheads indicate the PCR probes used in testing for deletions shown in the agarose gel photos. WT and lpa3-3 sequences indicate the plasmid insertion site and associated 4 bp-deletion. (B) Growth and chlorophyll fluorescence phenotype of WT, mutants and their complemented lines. Images are representative of an experiment repeated twice. Cells were grown with acetate in the dark or without acetate under 400 μmol photons s-1 m-2 and imaged for growth and Fv/Fm measurements (HL-Ac). Fv/Fm value are represented by false colors as shown in the reference bar. (C) Fv/Fm values of each genotype under different growth conditions. Values indicate averages of three biological replicates; error bars represent standard deviations. comp, complemented line. (D) PSII and PSI subunit accumulation shown by immunoblotting against subunits of PSII (D1 and CP43) and PSI (PsaA and PsaD). Mitochondrial ATP synthase (F1β) was probed as loading control with an antibody that also detects the F1β subunit of the chloroplast ATP synthase (CF1β). Each genotype was analyzed in biological duplicates (one per lane). Dilutions of the WT samples are shown in the four left lanes.

The mutant CAL032_04_48 (renamed as psbp4-1) required acetate for growth and exhibited light sensitivity even in the presence of acetate, and its Fv/Fm was reduced compared to that of the WT when grown in the light (S1 Table and Fig 5C). Its WGS indicated two tandem simple insertions disrupting five genes. Among them, Cre08.g362900, annotated as encoding a thylakoid luminal PsbP-like protein (PSBP4), presented itself as a clear candidate to be the gene responsible for the phenotypes. The PSBP4 ortholog of Arabidopsis has been shown to be involved in the assembly of PSI [40,41]. The deletion in psbp4-1 was confirmed by PCR (Fig 5A). Immunoblotting showed that the mutant failed to accumulate PSI subunits to wild-type levels but overaccumulated PSII subunits (Fig 5D). The mutant growth and Fv/Fm phenotypes were rescued by transforming with genomic DNA including Cre08.g362900 and upstream region, demonstrating that disruption of PSBP4 was the cause of the acetate-requiring and light-sensitive phenotypes of this mutant (Fig 5B and 5C).

Curation of higher-confidence photosynthesis candidate genes

To identify candidate genes that are likely to be responsible for the ARC mutant phenotypes, we focused on the 406 mutants with only simple insertions (Fig 2C). We reasoned that when considering the causative gene of a mutant phenotype, a mutant with only simple insertion events is more likely to have a causative gene within its disrupted gene list than a mutant with a complex insertion event that is accompanied by large-scale chromosomal rearrangements, which could cause unpredictable changes in expression of neighboring genes due to alterations in promoters, enhancers, and chromatin environment. For each of the 406 mutants with simple insertions, we applied a series of criteria to generate a list of genes that are the strongest confidence candidates for being genes that are responsible for the ARC mutant phenotype. If a mutant contained a single, simple insertion that disrupts a single gene then that gene was immediately considered to be a higher-confidence candidate, based on the 88% genetic linkage observed between paromomycin resistance and acetate requirement (S2 Fig). If a mutant contained a simple insertion with multiple genes disrupted by an associated deletion, then we manually analyzed the genes and selected the best candidate, considering whether it was a GreenCut2 gene, whether it was co-expressed with photosynthesis genes [42], and whether it encoded a protein with annotation or domains indicating a possible function in photosynthesis (e.g. redox, chlorophyll a/b-binding, Fe-S cluster). 78 GreenCut2 genes that were disrupted in 509 ARC mutants (Table 1) and were considered strong candidates unless there was an even stronger candidate based on functional annotation. As was shown for cpsfl1 (Fig 4B) and lpa3 (Fig 4C), mutants with overlapping disrupted genes were also compared to find the strongest candidate (gene with highest disruption frequency). Neighboring genes that were co-disrupted with the strongest candidates were deemed non-candidates in all the mutants. As a final criterion, we searched candidate genes derived from analysis of other existing photosynthesis mutant libraries and identified overlaps with Chlamydomonas genes whose disruption affected photoautotrophic growth (Chlamydomonas Library Project, CLiP) [9], orthologous genes from the maize Photosynthetic Mutant Library (PML, [43], and orthologous genes identified from Dynamic Environmental Photosynthetic Imaging (DEPI) of Arabidopsis mutants [44]. Additionally, we surveyed the genes affected by small deletions unassociated with the plasmid insertion (S2 Table) with the same set of criteria. Two mutants were identified as additional alleles of candidate causative genes as well as two new candidates and their corresponding mutants.

Table 1. GreenCut2 proteins within genes affected in ARC.

We were able to identify 253 higher-confidence candidate genes which are shown in Table 2 (and with their corresponding mutants in S4 Table with additional details and references). This list includes genes known to be important for photosynthesis, photoprotection, and peripheral functions either in Chlamydomonas or other photosynthetic organisms (S4 Table, Column “Function reported in Cr and other photosynthetic organisms”). 103 gene products were predicted to be targeted to the chloroplast by protein targeting software Predalgo ( [45], and among those, 78 were also predicted to be targeted to plastids by ChloroP ( [46] (Table 2 and S4 Table). Proteins encoded by 31 genes were found in the chloroplast proteome of Chlamydomonas [47], 27 of which were also in silico-predicted chloroplast proteins. 53 GreenCut2 genes are within this higher-confidence list, leaving 25 GreenCut2 genes that were not chosen because there was a stronger candidate gene (see column “Comments” in Table 1). Among the 253 candidates, the photosynthetic functions of 70 genes have been previously described in Chlamydomonas, land plants, or cyanobacteria. 14 genes have been described as having functions other than photosynthesis, such as plant meristem development, sulfur response, or carbon metabolism (S4 Table, Known genes whose functions are not Photosynthesis). This leaves 183 genes whose functions remain to be studied in context of photosynthesis, 38 of which have no annotation (S4 Table).

Table 2. Higher-confidence photosynthesis candidate genes.

Higher confidence was determined by manual curation of the genes disrupted in a mutant following multiple criteria: (i) single gene disruption in a simple mutant, (ii) highest frequency disruption among multiple mutant alleles, (iii) GreenCut2 membership, (iv) protein domains associated with photosynthetic functions.


We successfully used high-throughput, low-coverage WGS for the identification of plasmid insertion sites in our Chlamydomonas photosynthesis mutant collection (ARC). This approach has a much higher efficiency than PCR-based FST isolation. From the larger collection of 2800 mutants [7] from which ARC was derived, we recovered FSTs from only 17% of the mutants, whereas our WGS identified insertions in 509 out of 581 non-redundant ARC mutants (88% success among the population). We attribute this improvement to the fact that insertion site identification by WGS is not dependent on the intactness or sequence continuity of the inserted plasmid sequence, and therefore WGS overcomes complications such as plasmid concatemerization and loss of plasmid ends to which PCR primers need to anneal. Most importantly, it completely bypasses the need for PCR from the GC- and repeat-rich genome of Chlamydomonas. Even with relatively low average WGS coverage (~7x), we also identified 68 deletions that were not associated with plasmid insertions, some of which may be causative mutations for photosynthesis-related phenotypes that are unlinked to the plasmid insertion in specific mutants.

A previous study using WGS to identify DNA insertion events in Chlamydomonas [21] provides the most direct comparison with our results. Lin et al. (2018) analyzed paromomycin-resistant insertional mutants derived from electroporation instead of the glass bead transformation method that we used to generate either paromomycin- or zeocin-resistant mutants [9]. They sequenced 20 transformants in 10 pools of two strains and verified 38 insertions, obtaining an average of 1.9 insertions per strain. In contrast, we found a total of 554 insertions in 509 mutants, resulting in a lower average of ~1.1 insertions per mutant. Lin et al. (2018) found that more than half (11 of 20) of their strains had more than one insertion event, and a larger collection of 1935 mutants derived from electroporation exhibited multiple insertions in 26% of strains [10]. We found multiple insertions in 8% (43 out of 509) of the ARC mutants, suggesting that glass bead transformation of Chlamydomonas results in a higher frequency of single-copy insertions. Lin et al. (2018) identified one-sided insertions in ~40% of their mutants, whereas we observed only ~4% (21 out of 554 insertion events), despite the lower average WGS coverage in our study (~7x vs. ~15x). The frequency of complex rearrangements in our study (19%) was comparable to that observed by Lin et al. (25%), however, as previously noted by us and others [7,10,21,48], glass bead transformation seems to be frequently associated with larger deletions of genomic DNA at the sites of DNA insertion than electroporation, a finding that was clearly evident in our WGS data (Fig 3A).

In part because of the occurrence of larger deletions, 1470 genes were disrupted in 509 ARC mutants. As expected, this list is enriched for genes that encode proteins with annotated functions in photosynthesis and tetrapyrrole synthesis, and it includes 78 GreenCut2 genes [31]. We examined the affected genes in each mutant to identify possible causative genes using several criteria, including GreenCut2 membership, existence of protein domains suggestive of a function in photosynthesis, and occurrence of multiple mutant alleles in the ARC. We also searched for overlaps with available photosynthesis mutant datasets, namely CLiP (Chlamydomonas), PML (maize), DEPI (Arabidopsis), and those found co-expressed with photosynthesis genes (Chlamydomonas). The CLiP collection has been used to identify mutants that are defective in photosynthetic growth in pooled cultures [9]. This study identified 303 candidate photosynthesis genes. We identified 43 of those 303 genes in our list of 253 higher-confidence genes (Table 2 and S4 Table). The maize PML consists of approximately 2100 photosynthesis mutants that contain 50 to 100 Mu transposable elements per individual. It is estimated to be a saturated collection with 3–4 mutant alleles for ~600 genes [43]. The FSTs of this library were obtained with Illumina sequencing of fragmented gDNA that was enriched for the Mu element [22]. Our higher-confidence candidate gene list overlapped with 17 genes identified from the maize PML ( DEPI screening of 300 Arabidopsis mutants affecting genes that encode chloroplast-targeted proteins (Chloroplast 2010 project, identified 12 mutants with altered photosynthetic response [44]. These mutants likely represent disruption in genes that are conditionally important in acclimation to changing light environments. Two of the 12 genes found through DEPI overlapped with our higher-confidence photosynthesis candidate gene list. The largest overlap (83 genes) was observed between our higher-confidence list and the group of photosynthesis-related genes defined based on co-expression analysis [42].

For two of the higher-confidence photosynthesis genes, LPA3 and PSBP4, we validated the insertion-associated lesions for four of the ARC mutants and demonstrated their requirement for photoautotrophic growth (Fig 5). LPA3 is a GreenCut2 protein (CPLD28) that contains a DUF1995 domain. Insertion mutants containing large or small deletions in LPA3 (Cre03.g184550) were acetate-requiring and exhibited a severe defect in PSII function even in the dark, as evidenced by Fv/Fm values near zero (Fig 5). A mutant (CAL014_01_01) affecting Cre02.g105650, which was recently identified as the Chlamydomonas ortholog of Arabidopsis LPA2 [49], was also found to require acetate and was highly light-sensitive. The phenotypes of these Chlamydomonas mutants are much more severe than those of Arabidopsis lpa2 and lpa3, indicating that Chlamydomonas is more dependent on these proteins for PSII assembly. There are two additional genes encoding DUF1995 proteins in the Chlamydomonas genome, Cre06.g281800 and Cre08.g369000. The mutant CAL038_02_36 is disrupted in Cre06.g281800. It does not grow photoautotrophically but is able to grow in LL and HL in the presence of acetate. Interestingly, this mutant also has an Fv/Fm of nearly zero in the dark (S1 Table), similar to the lpa3-1 (Fig 5C) and lpa3-2 (S1 Table) mutant alleles. The severe phenotypes of these mutants in Chlamydomonas indicate non-overlapping functions in PSII assembly of the gene products of LPA2, LPA3, and Cre06.g281800.

PSBP (encoded by PSBP1/OEE2 in Chlamydomonas) together with PSBO and PSBQ constitute the oxygen-evolving complex (OEC) of PSII [50,51]. In green algae and plants, PSBP appears to have expanded into a large family of proteins sharing similar domains beyond the canonical PSBP of the OEC. The Chlamydomonas genome contains 13 additional genes encoding proteins with PsbP-like domains whose individual functions are unknown. We showed that PSBP4 is required for photoautotrophic growth in Chlamydomonas, ruling out redundancy in its function with other PSBP-like domain-containing proteins. An Arabidopsis ortholog of CrPSBP4 (AT4G15510, PPD1) has been shown to play a role in PSI assembly [40,41], which is consistent with the light-sensitivity and the lack of PSI subunits accumulation of our psbp4-1 mutant (Fig 5). Two other members of the PSBP family, PSBP3, and PSBP9, were found to be disrupted in the ARC. The large family of PSBP-like domain-containing proteins is speculated to have resulted in divergence of their functions [52], and the availability of mutants in these genes should help to reveal their functions.

Of the 253 higher-confidence candidate photosynthesis genes that we curated based on WGS analysis of the ARC, only 70 have a previously demonstrated function in photosynthesis. This is similar to the results of pooled growth analysis of ~60,000 Chlamydomonas insertional mutants by Li et al. (2019), which revealed 303 candidate photosynthesis genes, of which only 65 have previously known roles in photosynthesis [9]. Thus, 238 genes in the study of Li et al. (2019) and 183 genes in our study remain to be analyzed experimentally to determine their specific functions in photosynthesis. Only 43 genes are shared by these two sets of candidate photosynthesis genes, and yet this is a statistically significant overlap (Fisher’s exact test rejects the null hypothesis with a P-value of 0.05). This overlap is lower than might be expected but is likely due to the fact that both the CLiP and ARC mutant collections are based on a total of ~62,000 and ~49,000 insertional mutants, respectively, which is not sufficient to saturate the Chlamydomonas genome for mutations affecting photosynthesis (assuming a Poisson distribution, an average exon length per gene of 1,583 bp [15], and a total genome size of 111 Mb (!info?alias=Org_Creinhardtii), the likelihood of a library of 60,000 insertional mutants hitting a gene is 58%). In addition, the presence of incomplete knock-outs in the libraries may be contributing, which is a challenge for all studies of this type in eukaryotic genomes containing non-coding regions (~50% of CLiP library insertions were in exons, Fig 2A in Li et al). Another possible reason is the difference in the experimental designs of the studies. The CLiP mutant pool was selected in minimal medium at 500 μmol photons m-2 s-1 in 20-L liquid cultures that would have experienced self-shading [9]. In contrast, ARC mutants were screened for even relatively small differences in photoautotrophic growth under more severe HL conditions on plates rather than in liquid medium, which might have led to identification of a different set of genes. Nonetheless, these studies suggest that there are still many more photosynthesis genes that remain to be identified, which highlights the enormous potential for future validation and discovery of new proteins involved in oxygenic photosynthesis.

Material and methods

Strains and culture conditions

Mutants described in this work were generated from wild-type strain 4A+ (CC-4051) in the 137c background. Cells were grown mixotrophically (ac) on Tris-acetate-phosphate (TAP) medium and photoautotrophically (min) on minimal high-salt medium (HS) medium [53] in low light (LL) of 60–80 μmol photons m-2 s-1 and high light (HL) of 350–400 μmol photons m-2 s-1. LL and HL conditions were obtained using GE F25T8/SPX41/ECO and Sylvania F72T12/CW/VHO fluorescent bulbs, respectively.

Generation of mutant library by insertional mutagenesis

Detailed methods used to generate the mutants analyzed in this work were previously described [7,26]. In brief, cells growing in log phase were harvested and transformed with linearized plasmid DNA following the glass beads method [54]. Either 600 ng (pMS188 and pBC1) or 1 μg (pSP124S) of DNA was used for 51x106 cells in 300 μL, under which conditions ~70% of the mutants generated contained a single insertion [26].

Genomic DNA preparation and whole-genome sequencing

Chlamydomonas cultures were grown in 20 mL TAP to stationary phase, and genomic DNA was extracted using an alkaline lysis buffer (50 mM Tris-HCl (pH 8), 200 mM NaCl, 20 mM EDTA, 2% SDS, 1% PVP 40,000, 1 mg/mL Proteinase K) followed by phenol-chloroform extraction. DNA was collected, washed and eluted using DNeasy Plant mini-columns (QIAGEN). The resulting quality of the DNA was confirmed to be A260/A280 of approximately 1.8 and A260/A230 of >2. Plate-based DNA library preparation for Illumina sequencing was performed on the PerkinElmer Sciclone NGS robotic liquid handling system using Kapa Biosystems library preparation kit. 200 ng of sample DNA was sheared to 600 bp using a Covaris LE220 focused ultrasonicator. The sheared DNA fragments were size selected by double-SPRI, and then the selected fragments were end-repaired, A-tailed, and ligated with Illumina-compatible sequencing adaptors from IDT containing a unique molecular index barcode for each sample library. The prepared libraries were quantified using Kapa Biosystem’s next-generation sequencing library qPCR kit and run on a Roche LightCycler 480 real-time PCR instrument. The quantified libraries were then multiplexed with other libraries, and the pool of libraries was then prepared for sequencing on the Illumina HiSeq sequencing platform utilizing a TruSeq paired-end cluster kit, v4, and Illumina’s cBot instrument to generate a clustered flow cell for sequencing. Sequencing of the flow cell was performed on the Illumina HiSeq2500 sequencer using HiSeq TruSeq SBS sequencing kits, v4, following a 2x150 indexed run recipe. The reads were aligned to the reference genome using BWA-mem. To identify plasmid insertion sites, discordant paired-end reads with one end mapping to the plasmid used for mutagenesis and the other to a chromosome location were mapped and manually validated for each mutant using Integrated Genome Viewer (IGV 2.3.94) ( Putative structural variations unpaired to the plasmid sequence were called using a combination of BreakDancer (filtered to quality 90+) and Pindel and manually validated using IGV. Resulting genome sequences of 79 mutants were not unique (33 were duplicated, three were triplicated and one was quadruplicated). In all cases the mutants sharing similar sequences came from the same agar plate and sequencing plate, suggesting that it could be due to an error at the genome extraction step or in maintenance of the mutant strains; these mutants were not included in further analysis. All sequence files are available from the NCBI SRA database ( The SRA accession numbers for each of the mutants are listed in S1 Appendix.

Molecular analyses of mutants by PCR and mutant complementation

Deletions predicted from genome sequences were confirmed by using PCR primers that anneal proximal to the borders and within the deletions. The insertion of the plasmid sequence accompanied by a 4 bp-deletion in lpa3-3 was sequenced from the PCR product from the predicted region. Primers used for PCRs indicated in Fig 4 are listed in S5 Table. For complementation of lpa3-1, lpa3-2, and lpa3-3, a 3531 bp genomic fragment containing the full length CrLPA3 gene (Cre03.g184550) with 1209 bp upstream of the start codon and 719 bp downstream of the stop codon was amplified using primers Comp11F and Comp11R. This fragment was subsequently Gibson cloned into the vector pSP124S using primers PS1362 and PS1363 to inverse PCR around pSP124S. For complementation of mutant psbp4-1, a 3246 bp genomic fragment containing the full length CrPSBP4 gene (Cre08.g362900), including 1209 bp upstream of the start codon and 719 bp downstream of the stop codon, was amplified using primers Comp12F and Comp12R and similarly cloned into vector pSP124S. Primer sequences are listed in S4 Table. Constructs for complementation were transformed into the respective mutants using the glass bead method [54]. Colonies were selected on 10 μM zeocin TAP agar plates and screened for rescued individuals by measuring Fv/Fm as described below.

Fv/Fm measurement

Chlamydomonas strains were grown on agar plates in Dark+ac, LL-min, or HL-min, and Fv/Fm (Fm-Fo/Fm) was measured using a chlorophyll fluorescence video imager (IMAG-MAX/L, WALZ). Plates with the streaks of strains were dark-acclimated for 30 min and exposed to a pulse of saturating light (4000 μmol photons m-2 s-1). Fluorescence images of Fm and Fo were captured during saturating pulses, and false-color images of Fv/Fm were generated.

Immunoblot analysis of photosystem subunits

WT and mutants were grown in TAP to early log phase under very low light (0~2 μmol photons m-2 s-1). Total protein was solubilized as previously described [29] and extracted using the methanol/chloroform method [55]. Each sample was quantified for protein concentration using BCA protein assay (Pierce) to load 5 μg protein in each lane of pre-cast SDS-PAGE Any kD Mini-PROTEAN gels (Bio-Rad), except for the dilutions of the WT samples. Proteins were transferred from the gel to a polyvinylidene difluoride membrane (Immobilon-FL 0.45 μm, Millipore) via wet-transfer for immunodetection. The primary polyclonal antibodies raised against AtpB (AS16 3976), D1 DE-loop (AS10 704), CP43 (AS11 1787), PsaA (AS06 172), and PsaD (AS09 461) were obtained from Agrisera. Donkey anti-rabbit IgG antibody with horseradish peroxidase (1:10,000; GE Healthcare) was used as a secondary antibody and visualized using SuperSignal West Femto Maximum Sensitivity Substrate (Thermo Scientific) chemiluminescent system on a ChemiDoc MP imaging system (Bio-Rad).

Supporting information

S1 Fig. Proportion of different types of insertions observed in ARC.

The frequency of the different types of insertions compared to the total number of insertions observed in the mutant library. Some insertions coexist with another insertion in a mutant. The number of mutants grouped by the types of insertions it contains is listed along with the number of insertions accounted for in that group.


S2 Fig. Genetic linkage test of parR and ac- phenotypes.

Mutants (ac- parR) were crossed with WT (AC+ parS) cells. For a mutant with a single insertion that causes ac- parR, the two phenotypes segregate 2:2 in the resulting progeny. Each colony in the images arose from a single zygospore that was a mix of four genotypes (tetrad progenies). In the left column, half of the progenies from a zygospore grew since a cross between a single allele of ac- and WT will result in half of the progeny being AC+. In the right column, progenies were selected on paromomycin and on minimal media. If the insertion of parR at a given locus results in the ac- phenotype (i.e., genetically linked), none of the progenies grow on minimal media as in the upper four mutants. In the lower two mutants, growth was observed because the ac- and parR genotypes are separate mutations at distinct loci that segregated in some of the progenies.


S3 Fig. Two mutant alleles in tocopherol cyclase (Cre01.g013801, VTE1) in ARC.

Schematic representation of the disruption sites in CAL014_01_19, a strictly acetate-requiring mutant and CAL032_02_19, a mutant with comparatively moderate phenotype.


S1 Table. Plasmid-paired and unpaired discordant sites detected in ARC by WGS and mutant phenotypes.


S2 Table. Mutants with deletions unassociated with plasmid insertion.


S3 Table. Total genes affected in ARC and their description.


S4 Table. Higher-confidence candidate genes and corresponding mutants.


S5 Table. List of PCR primers used in this study.


S2 Appendix. Numerical data for histograms and graphs.



We thank Alice Barkan for sharing the data for PML to compare with ARC higher-confidence candidate genes and Sabeeha Merchant and Masakazu Iwai for critical reading of the manuscript.

This manuscript has been authored by authors at Lawrence Berkeley National Laboratory under Contract No. DE-AC02-05CH11231 with the U.S. Department of Energy. The U.S. Government retains, and the publisher, by accepting the article for publication, acknowledges, that the U.S. Government retains a non-exclusive, paid-up, irrevocable, world-wide license to publish or reproduce the published form of this manuscript, or allow others to do so, for U.S. Government purposes.


  1. 1. Harris EH. Chlamydomonas as a model organism. Annu Rev Plant Biol. 2001;52:363–406. pmid:11337403
  2. 2. Salomé PA, Merchant SS. A series of fortunate events: Introducing Chlamydomonas as a reference organism. Vol. 31, Plant Cell. 2019. p. 1682–707. pmid:31189738
  3. 3. Levine RP. A screening technique for photosynthetic mutants in unicellular algæ. Nature. 1960;188(4747):339–40. pmid:13761501
  4. 4. Dent RM, Han M, Niyogi KK. Functional genomics of plant photosynthesis in the fast lane using Chlamydomonas reinhardtii. Vol. 6, Trends in Plant Science. 2001. p. 364–71. pmid:11495790
  5. 5. Goodenough UW, Armstrong JJ, Levine RP. Photosynthetic Properties of ac-31, a Mutant Strain of Chlamydomonas reinhardi Devoid of Chloroplast Membrane Stacking. Plant Physiol. 1969;44(7):1001–12. pmid:16657147
  6. 6. Sager R, Granick S. Nutritional studies with Chlamydomonas reinhardi. Ann N Y Acad Sci. 1953;56(5):831–8. pmid:13139273
  7. 7. Dent RM, Sharifi MN, Malnoë A, Haglund C, Calderon RH, Wakao S, et al. Large-scale insertional mutagenesis of Chlamydomonas supports phylogenomic functional prediction of photosynthetic genes and analysis of classical acetate-requiring mutants. Plant J. 2015;82(2):337–51. pmid:25711437
  8. 8. Zhang R, Patena W, Armbruster U, Gang SS, Blum SR, Jonikas MC. High-throughput genotyping of green algal mutants reveals random distribution of mutagenic insertion sites and endonucleolytic cleavage of transforming DNA. Plant Cell. 2014;26(4):1398–409. pmid:24706510
  9. 9. Li X, Patena W, Fauser F, Jinkerson RE, Saroussi S, Meyer MT, et al. A genome-wide algal mutant library and functional screen identifies genes required for eukaryotic photosynthesis. Nat Genet. 2019;51(4):627–35. pmid:30886426
  10. 10. Li X, Zhang R, Patena W, Gang SS, Blum SR, Ivanova N, et al. An indexed, mapped mutant library enables reverse genetics studies of biological processes in Chlamydomonas reinhardtii. Plant Cell. 2016;28(2):367–87. pmid:26764374
  11. 11. Lin H, Miller ML, Granas DM, Dutcher SK. Whole Genome Sequencing Identifies a Deletion in Protein Phosphatase 2A That Affects Its Stability and Localization in Chlamydomonas reinhardtii. PLoS Genet. 2013;9(9). pmid:24086163
  12. 12. Dutcher SK, Li L, Lin H, Meyer L, Giddings TH, Kwan AL, et al. Whole-genome sequencing to identify mutants and polymorphisms in Chlamydomonas reinhardtii. G3 Genes, Genomes, Genet. 2012;2(1):15–22. pmid:22384377
  13. 13. Tulin F, Cross FR. Patching holes in the Chlamydomonas genome. G3 Genes, Genomes, Genet. 2016;6(7):1899–910. pmid:27175017
  14. 14. Breker M, Lieberman K, Cross FR. Comprehensive discovery of cell-cycle-essential pathways in Chlamydomonas reinhardtii. Plant Cell. 2018;30(6):1178–98. pmid:29743196
  15. 15. Merchant SS, Prochnik SE, Vallon O, Harris EH, Karpowicz SJ, Witman GB, et al. The Chlamydomonas genome reveals the evolution of key animal and plant functions. Science. 2007;318(5848):245–51. pmid:17932292
  16. 16. Schierenbeck L, Ries D, Rogge K, Grewe S, Weisshaar B, Kruse O. Fast forward genetics to identify mutations causing a high light tolerant phenotype in Chlamydomonas reinhardtii by whole-genome-sequencing. BMC Genomics. 2015;16(1). pmid:25730202
  17. 17. Gabilly ST, Baker CR, Wakao S, Crisanto T, Guan K, Bi K, et al. Regulation of photoprotection gene expression in Chlamydomonas by a putative E3 ubiquitin ligase complex and a homolog of CONSTANS. Proc Natl Acad Sci U S A. 2019;116(35):17556–62. pmid:31405963
  18. 18. Smith HE. Identifying insertion mutations by whole-genome sequencing. Biotechniques. 2011;50(2):96–7. pmid:21486250
  19. 19. Cao Y, Rui B, Wellems DL, Li M, Chen B, Zhang D, et al. Identification of piggyBac-mediated insertions in Plasmodium berghei by next generation sequencing. Malar J. 2013;12(1). pmid:23961915
  20. 20. Urban M, King R, Hassani-Pak K, Hammond-Kosack KE. Whole-genome analysis of Fusarium graminearum insertional mutants identifies virulence associated genes and unmasks untagged chromosomal deletions. BMC Genomics. 2015;16(1). pmid:25881124
  21. 21. Lin H, Cliften PF, Dutcher SK. MAPINS, a highly efficient detection method that identifies insertional mutations and complex DNA rearrangements. Plant Physiol. 2018;178(4):1436–47. pmid:30206105
  22. 22. Williams-Carrier R, Stiffler N, Belcher S, Kroeger T, Stern DB, Monde RA, et al. Use of Illumina sequencing to identify transposon insertions underlying mutant phenotypes in high-copy Mutator lines of maize. Plant J. 2010;63(1):167–77. pmid:20409008
  23. 23. Alonso JM, Stepanova AN, Leisse TJ, Kim CJ, Chen H, Shinn P, et al. Genome-wide insertional mutagenesis of Arabidopsis thaliana. Science. 2003;301(5633):653–7. pmid:12893945
  24. 24. Strizhov N, Li Y, Rosso MG, Viehoever P, Dekker KA, Weisshaar B. High-throughput generation of sequence indexes from T-DNA mutagenized Arabidopsis thaliana lines. Biotechniques. 2003;35(6):1164–8. pmid:14682050
  25. 25. Sessions A, Burke E, Presting G, Aux G, McElver J, Patton D, et al. A high-throughput Arabidopsis reverse genetics system. Plant Cell. 2002;14(12):2985–94. pmid:12468722
  26. 26. Dent RM. Functional Genomics of Eukaryotic Photosynthesis Using Insertional Mutagenesis of Chlamydomonas reinhardtii. Plant Physiol. 2005;137(2):545–56. pmid:15653810
  27. 27. Heinnickel ML, Alric J, Wittkopp T, Yang W, Catalanotti C, Dent R, et al. Novel thylakoid membrane GreenCut protein CPLD38 impacts accumulation of the cytochrome b6f complex and associated regulatory processes. J Biol Chem. 2013;288(10):7024–36. pmid:23303190
  28. 28. Peers G, Truong TB, Ostendorf E, Busch A, Elrad D, Grossman AR, et al. An ancient light-harvesting protein is critical for the regulation of algal photosynthesis. Nature. 2009;462(7272):518–21. pmid:19940928
  29. 29. Calderon RH, García-Cerdán JG, Malnoë A, Cook R, Russell JJ, Gaw C, et al. A conserved rubredoxin is necessary for photosystem II accumulation in diverse oxygenic photoautotrophs. J Biol Chem. 2013;288(37):26688–96. pmid:23900844
  30. 30. Ye K, Schulz MH, Long Q, Apweiler R, Ning Z. Pindel: A pattern growth approach to detect break points of large deletions and medium sized insertions from paired-end short reads. Bioinformatics. 2009;25(21):2865–71. pmid:19561018
  31. 31. Karpowicz SJ, Prochnik SE, Grossman AR, Merchant SS. The GreenCut2 resource, a phylogenomically derived inventory of proteins specific to the plant lineage. J Biol Chem. 2011;286(24):21427–39. pmid:21515685
  32. 32. Li Z, Keasling JD, Niyogi KK. Overlapping photoprotective function of vitamin E and carotenoids in Chlamydomonas. Plant Physiol. 2011/11/15. 2012;158(1):313–23. pmid:22080601
  33. 33. Moseley J, Quinn J, Eriksson M, Merchant S. The Crd1 gene encodes a putative di-iron enzyme required for photosystem I accumulation in copper deficiency and hypoxia in Chlamydomonas reinhardtii. EMBO J. 2000;19(10):2139–51. pmid:10811605
  34. 34. Spicher L, Almeida J, Gutbrod K, Pipitone R, Dörmann P, Glauser G, et al. Essential role for phytol kinase and tocopherol in tolerance to combined light and temperature stress in tomato. J Exp Bot. 2017;68(21–22):5845–56. pmid:29186558
  35. 35. Vom Dorp K, Hölzl G, Plohmann C, Eisenhut M, Abraham M, Weber APM, et al. Remobilization of Phytol from Chlorophyll Degradation is Essential for Tocopherol Synthesis and Growth of Arabidopsis. Plant Cell. 2015;27(10):2846–59. pmid:26452599
  36. 36. García-Cerdán JG, Schmid EM, Takeuchi T, McRae I, McDonald KL, Yordduangjun N, et al. Chloroplast Sec14-like 1 (CPSFL1) is essential for normal chloroplast development and affects carotenoid accumulation in Chlamydomonas. Proc Natl Acad Sci U S A. 2020;117(22):12452–63. pmid:32404426
  37. 37. Hertle AP, García-Cerdán JG, Armbruster U, Shih R, Lee JJ, Wong W, et al. A Sec14 domain protein is required for photoautotrophic growth and chloroplast vesicle formation in Arabidopsis thaliana. Proc Natl Acad Sci U S A. 2020;117(16):9101–11. pmid:32245810
  38. 38. Lu Y. Identification and Roles of Photosystem II Assembly, Stability, and Repair Factors in Arabidopsis. Front Plant Sci. 2016;7:168. pmid:26909098
  39. 39. Retraction to Cooperation of LPA3 and LPA2 Is essential for photosystem II assembly in Arabidopsis (Plant Physiol, (2010) 154, (109–120), 10.1104/pp.110.159558). Vol. 173, Plant Physiology. 2017. p. 1526.
  40. 40. Roose JL, Frankel LK, Bricker TM. The PsbP domain protein 1 functions in the assembly of lumenal domains in photosystem I. J Biol Chem. 2014;289(34):23776–85. pmid:25008325
  41. 41. Liu J, Yang H, Lu Q, Wen X, Chen F, Peng L, et al. PSBP-DOMAIN PROTEIN1, a Nuclear-Encoded thylakoid lumenal protein, is essential for photosystem I assembly in Arabidopsis. Plant Cell. 2013;24(12):4992–5006.
  42. 42. Salomé PA, Merchant SS. Co-expression networks in Chlamydomonas reveal significant rhythmicity in batch cultures and empower gene function discovery. Plant Cell. 2021;33:1058–82. pmid:33793846
  43. 43. Belcher S, Williams-Carrier R, Stiffler N, Barkan A. Large-scale genetic analysis of chloroplast biogenesis in maize. Vol. 1847, Biochimica et Biophysica Acta—Bioenergetics. 2015. p. 1004–16. pmid:25725436
  44. 44. Cruz JA, Savage LJ, Zegarac R, Hall CC, Satoh-Cruz M, Davis GA, et al. Dynamic Environmental Photosynthetic Imaging Reveals Emergent Phenotypes. Cell Syst. 2016;2(6):365–77. pmid:27336966
  45. 45. Tardif M, Atteia A, Specht M, Cogne G, Rolland N, Brugière S, et al. Predalgo: A new subcellular localization prediction tool dedicated to green algae. In: Molecular Biology and Evolution. 2012. p. 3625–39. pmid:22826458
  46. 46. Emanuelsson O, Nielsen H, Heijne G Von. ChloroP, a neural network-based method for predicting chloroplast transit peptides and their cleavage sites. Protein Sci. 1999;8(5):978–84. pmid:10338008
  47. 47. Terashima M, Specht M, Hippler M. The chloroplast proteome: a survey from the Chlamydomonas reinhardtii perspective with a focus on distinctive features. Curr Genet. 2011;57(3):151–68. pmid:21533645
  48. 48. Pollock S V., Mukherjee B, Bajsa-Hirschel J, Machingura MC, Mukherjee A, Grossman AR, et al. A robust protocol for efficient generation, and genomic characterization of insertional mutants of Chlamydomonas reinhardtii. Plant Methods. 2017;13(1).
  49. 49. Spaniol B, Lang J, Venn B, Schake L, Sommer F, Mustas M, et al. Complexome profiling on the lpa2 mutant reveals insights into PSII biogenesis and new PSII associated proteins. bioRvix. 2021;
  50. 50. Rova M, Franzén LG, Fredriksson PO, Styring S. Photosystem II in a mutant of Chlamydomonas reinhardtii lacking the 23 kDa psbP protein shows increased sensitivity to photoinhibition in the absence of chloride. Photosynth Res. 1994;39(1):75–83. pmid:24311003
  51. 51. de Vitry C, Olive J, Drapier D, Recouvreur M, Wollman FA. Posttranslational events leading to the assembly of photosystem II protein complex: a study using photosynthesis mutants from Chlamydomonas reinhardtii. J Cell Biol. 1989;109(3):991–1006. pmid:2670960
  52. 52. Ifuku K, Ishihara S, Shimamoto R, Ido K, Sato F. Structure, function, and evolution of the PsbP protein family in higher plants. Vol. 98, Photosynthesis Research. 2008. p. 427–37.
  53. 53. Harris EH. The Chlamydomonas Sourcebook Volume1: Introduction to Chlamydomonas and Its Laboratory Use. Journal of Chemical Information and Modeling. 2013.
  54. 54. Kindle KL. High-frequency nuclear transformation of Chlamydomonas reinhardtii. Methods Enzymol. 1998;297:27–38.
  55. 55. Wessel D, Flügge UI. A method for the quantitative recovery of protein in dilute solution in the presence of detergents and lipids. Anal Biochem. 1984;138(1):141–3. pmid:6731838