Transcription Regulation of Plastid Genes Involved in Sulfate Transport in Viridiplantae

This study considers transcription regulation of plastid genes involved in sulfate transport in parasites of invertebrates (Helicosporidium sp.) and other species of the Viridiplantae. A one-box conserved motif with the consensus TAAWATGATT is found near promoters upstream of the cysT and cysA genes in many species. In certain cases, the motif is repeated two or three times.


Introduction
This study focuses on selected species of the Viridiplantae, particularly the genus Helicosporidium (class Trebouxiophyceae), which comprises green algae parasitizing flies of the species Simulium jonesi [1][2][3]. Plastids of these parasites are a good target for antibiotic treatment, as was shown earlier for apicomplexan parasites of vertebrates (Toxoplasma gondii and Plasmodium spp. [4]).
The plastome of Helicosporidium sp. is relatively small, about 37 kb. Most of the plastome genes encode tRNA, rRNA, ribosomal proteins, and subunits of the bacterial-type RNA polymerase. One of two nonhousekeeping proteins is the CysT subunit of a sulfate ABC transporter.
Sulfate ABC transporters in cyanobacteria and proteobacteria consist of two identical ATP-binding CysA proteins, two transmembrane proteins (CysT and CysW), and a sulfate-binding protein SbpA. In the cyanobacterium Synechocystis sp. PCC 6803 [5], genes encoding the sulfate transporter subunits are arranged in a single operon sbpA-ssr2439-cysT-cysW-cysA. In cyanobacteria, no expression data are available for this operon; however, in Escherichia coli and some other proteobacteria, genes of the sulfate transporter subunits are known to be regulated within the single operon cysPTWAM (further details are given in Discussion).
Plastomes of vascular plants lack genes of the sulfate transport system except for rare instances of cysT and cysA. However, the green alga Helicosporidium sp. retains cysT. Plastomes of the rhodophytes Cyanidium caldarium and Cyanidioschyzon merolae and the cyanelle genome of Cyanophora paradoxa lack cysT homologues but possess distant homologues of cysA presumably involved in the transport of zinc or manganese (further details are given in Results). Similar proteins are involved in the transport of molybdenum, zinc, and manganese and belong to a large family of transporters of ions, sugars, peptides, and more complex organic molecules. For example, the transcription regulation of the ziaA gene (encoding a polypeptide similar to P-type ATPases involved in transporting heavy metals) is described in the cyanobacterium Synechocystis PCC 6803 [6].
Sulfate transport in plastids is necessary for the synthesis of many sulfur-containing compounds. For example, in Spinacia oleracea, the lack of sulfates leads to considerable changes in the expression of cysteine synthesis genes [7]. Plastids of many algae also synthesize thiamine and other sulfur-containing compounds, and lipoic acid is synthesized in plastids of apicomplexan parasites [8].
In this paper, we consider the expression regulation of cysT and cysA in Viridiplantae, in particular, Helicosporidium sp. and Pycnococcus provasolii, where cysT is present and cysA is absent.
In proteobacteria, the mechanism regulating transcription initiation of cysA and cysT is known. The CysB protein is a transcription factor of the LysR family and acts as a tetramer. This protein binds DNA upstream of the −35 box of a promoter and activates transcription initiation. In several proteobacteria [9], including Klebsiella aerogenes [10], the CysB activation of cysPTWAM, cysK, cysJIH, cysDNC, sbp, and L-cysteine transport genes, as well as CysB autorepression, is described in detail. Binding of CysB to DNA is not directly dependent on sulfate concentration but requires high concentrations of acetylserine. Also, proteobacteria lack a distinct binding motif for CysB.

Materials and Methods
The list of species is given in Table 1. Genomes were obtained from GenBank, NCBI. Clustering of proteins was performed using the method described in [11,12]. An original algorithm from [13,14] was employed to search for bacterial-type promoters. Relevant promoter sequences and the evolutionary impact of DNA point mutations on polymerase binding affinity are described in [15,16]; experimental evidence was obtained using the psbA promoter of Sinapis alba [17].
A novel method based on clique search in a multipartite graph [18] was used to identify conserved motifs. In the current modification of the method, nucleotide similarity was estimated accounting for the GC content of plastid DNA. Namely, if the average GC-rate was p, the additive contribution of a mismatch at any position to the distance between two words of equal length was (1 − p) for an A-T pair, p for a C-G pair, and 1/2 for an S-W pair, where S = {C,G} and W = {A,T}. A large-scale search for binding sites was performed using formulas from [19]. Protein alignments were constructed with ClustalW v. 2.0.3 [20]. [21].
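The GC-aware word distance described above can be sketched in a few lines of Python. This is only an illustration, not the clique-search method of [18]: the function names are hypothetical, and because the source text is damaged at this point, the weights 1 − p for an A-T mismatch and p for a C-G mismatch are an assumption.

```python
def gc_rate(seq: str) -> float:
    """Fraction of G and C nucleotides in a sequence (the value p)."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def mismatch_weight(a: str, b: str, p: float) -> float:
    """Additive contribution of one aligned position to the distance."""
    if a == b:
        return 0.0
    pair = {a, b}
    if pair == {"A", "T"}:
        return 1.0 - p   # ASSUMED weight for an A-T mismatch
    if pair == {"C", "G"}:
        return p         # ASSUMED weight for a C-G mismatch
    return 0.5           # S-W mismatch: one of C/G against one of A/T


def word_distance(u: str, v: str, p: float) -> float:
    """Distance between two words of equal length, given GC-rate p."""
    if len(u) != len(v):
        raise ValueError("words must have equal length")
    return sum(mismatch_weight(a, b, p) for a, b in zip(u.upper(), v.upper()))
```

For instance, word_distance("TAAAATGATT", "TAATATGATT", p) contains a single A-T mismatch and therefore equals 1 − p under these assumed weights.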

Analysis of the Domain
CysT proteins are conserved across green algae, cyanobacteria, and proteobacteria and function as the transmembrane domain of the ABC transporter (Figure 1). Short N-terminal regions of these proteins can vary. Another exception is CysT in Bryopsis hypnoides and Leptosira terrestris, which has a truncated C-terminus.
Cyanobacteria possess strong potential orthologs of cysT in the Viridiplantae. Cyanobacterial CysT is considered to be ancestral.
As CysA and CysT functions are linked, cysA and cysT normally either coexist or are absent in plastids of all Viridiplantae with the exception of Helicosporidium sp. and Pycnococcus provasolii. (Note that the cysA ortholog in Marchantia polymorpha is named mbpX.) Viridiplantae species lacking cysA and cysT are closely related to the species that contain both genes [11,22]. Unexpectedly, cysA and cysT are present in Bryophyta while they are absent in many highly organized algae close to land plants (Chaetosphaeridium globosum, Chara vulgaris, and Staurastrum punctulatum). They are also absent in plastomes of Physcomitrella patens and all vascular plants. In green algae, these genes are mainly present in the class Trebouxiophyceae (genera Chlorella, Coccomyxa, Helicosporidium, Leptosira, and Parachlorella).
Sequences of plastid CysA and their orthologs from cyanobacteria align well (Figure 2). CysA in all Viridiplantae has a highly conserved N-terminal domain, which is typical for the ATP-binding cassette of ABC transporters. In all studied Chlorophyta, except for Nephroselmis olivacea, this protein possesses a short C-terminus. Conversely, in the Streptophyta, Nephroselmis olivacea, and cyanobacteria, C-termini are long and conserved. In Mesostigma viride and Chlorokybus atmophyticus, this domain is homologous to the TOBE domain involved in sulfate binding [23]. According to the Pfam 26.0 database [24], the e-value for this domain is 0.0017 in M. viride and 0.00007 in Ch. atmophyticus. Other plastid-encoded proteins, although also well conserved, have a lower score for this domain. There is no sulfate-binding CysP (SbpA) subunit in plastids, which could indicate that cysP is located in the nucleus.

Analysis of the Genomic Context
Genes upstream and downstream of cysT and cysA are listed in Table 2. The rpl32 gene located upstream of cysT encodes the ribosomal protein L32 and in most cases belongs to the same DNA strand as cysT; refer to Table 2. Pycnococcus provasolii, Bryopsis hypnoides, [...] Table 2. Only in Chlorokybus atmophyticus is rpl32 located upstream of cysT but on a different strand. In most cases, the intergenic region upstream of the cysT gene is quite long. The ycf1 gene is present downstream of cysT in most algae. A few other genes are found downstream of cysT: tRNA in Bryophyta and the alga Leptosira terrestris; rpl21 (L21 protein) in the alga Zygnema circumcarinatum; rpoA (alpha subunit of bacterial-type RNA polymerase) in Bryopsis hypnoides; refer to Table 2.
Table 2: The genomic context of the cysA and cysT genes in plastids of the Viridiplantae. The symbol "&" designates the lack of a bacterial-type promoter, "!" means that the intergenic region is very short, "P" designates the presence of a bacterial-type promoter, "*" designates a pseudogene, "()" marks an opposite direction, and "#" designates prediction of a conserved site.
The accD gene is located upstream of cysA and belongs to the same DNA strand in Trebouxiophyceae algae, except for Leptosira terrestris. In Nephroselmis olivacea, however, a tRNA gene upstream of cysA is on the complementary strand. In Bryopsis hypnoides, ccsA is upstream of cysA, and the intergenic region is very short. In Streptophyta, cysA is surrounded by tRNA genes, and they often belong to the complementary strand, which suggests the presence of a promoter directly in the upstream region of cysA.

Searching for Bacterial-Type Promoters
Only two candidate bacterial-type promoters are found in 5′-leader regions of the considered genes; refer to Table 2. The exception is the cysA gene in Anthoceros formosae, for which we detect three potential promoters of similar quality. In Chlorella vulgaris, the single promoter candidate is located upstream of cysA and has an unusual −35 box, AAGAAA, which was the reason for its rejection. However, in Chlorella variabilis, a good potential promoter with a TG-extension of the −10 box is detected in the upstream region of this gene. Promoters were not found in the upstream regions of either cysA or cysT in Nephroselmis olivacea, Pycnococcus provasolii, Bryopsis hypnoides, Leptosira terrestris, Aneura mirabilis, and Ptilidium pulcherrimum. Promoters were not found in the upstream region of cysA in Chlorokybus atmophyticus and in the upstream region of cysT in Zygnema circumcarinatum. We speculate that in these cases the genes are transcribed as part of an operon or by a phage-type RNA polymerase.
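To make the notions of −35 box, −10 box, and TG-extension concrete, the following Python sketch scans a sequence for candidates resembling the canonical bacterial σ70-type consensus (TTGACA, a spacer of 15 to 19 nucleotides, TATAAT). It is not the algorithm of [13,14]; the function names, the mismatch threshold, and the spacer range are assumptions chosen for illustration.

```python
MINUS35 = "TTGACA"   # canonical -35 consensus
MINUS10 = "TATAAT"   # canonical -10 consensus


def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length words."""
    return sum(x != y for x, y in zip(a, b))


def find_promoters(seq, max_mismatch=2, spacers=range(15, 20)):
    """Return (pos35, pos10, spacer, mismatches, tg_extended) tuples.

    max_mismatch applies to each box separately; mismatches is the total
    over both boxes. Positions are 0-based; only the given strand is scanned.
    """
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - 6 + 1):
        m35 = hamming(seq[i:i + 6], MINUS35)
        if m35 > max_mismatch:
            continue
        for s in spacers:
            j = i + 6 + s                 # start of the -10 box
            if j + 6 > len(seq):
                break
            m10 = hamming(seq[j:j + 6], MINUS10)
            if m10 > max_mismatch:
                continue
            # Extended -10 element: TG, then one nucleotide, then the box.
            tg = seq[j - 3:j - 1] == "TG"
            hits.append((i, j, s, m35 + m10, tg))
    return hits
```

Under this simple scoring, hamming("AAGAAA", "TTGACA") is 3, so the −35 box reported for Chlorella vulgaris differs from the canonical consensus at half of its positions.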

Searching for the Conserved Motif
Transcription regulation of plastid genes involved in sulfate transport in parasites of invertebrates (Helicosporidium sp.) and in other species of the Viridiplantae is considered. A one-box conserved motif with the consensus TAAWATGATT is found near the promoters in the upstream regions of the cysT and cysA genes in many species. In some cases, the motif is repeated two or three times. In the upstream region of the cysA promoter in the alga C. subellipsoidea C-169, however, the entire motif is repeated twice and is supplemented with a partial repeat at the 5′-terminus. The motif is not present near the promoters in Chlorokybus atmophyticus and Marchantia polymorpha. Deviations from the motif consensus are often the same within a taxonomic lineage, which may reflect variability of the transcription factor between lineages. The consensus was obtained from multiple alignments of 28 regions upstream of the two genes in 9 species (Coccomyxa subellipsoidea, Chlorella variabilis, Chlorella vulgaris, Helicosporidium, Parachlorella kessleri, Mesostigma viride, Chlorokybus atmophyticus, Zygnema circumcarinatum, and Anthoceros formosae). The LOGO profile of this motif is shown in Figure 3.
In most species, the motif is found upstream of the −35 box or overlaps the promoter. In the cysA upstream region in Zygnema circumcarinatum and Anthoceros formosae, the motif is detected between the −35 and −10 boxes or overlaps the −10 box of the promoter.
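As an illustration of how occurrences of the consensus TAAWATGATT could be located in an upstream region, the sketch below scans one strand and counts mismatches against the IUPAC consensus (W stands for A or T). It is not the clique-based procedure of [18]; the function names and the default mismatch threshold are hypothetical.

```python
# Subset of IUPAC nucleotide codes sufficient for this consensus.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "W": "AT", "S": "CG", "N": "ACGT",
}


def motif_mismatches(word: str, consensus: str = "TAAWATGATT") -> int:
    """Count positions where the word is not allowed by the consensus."""
    return sum(base not in IUPAC[sym] for base, sym in zip(word, consensus))


def scan_motif(seq, consensus="TAAWATGATT", max_mismatch=1):
    """Return (position, word, mismatches) for each hit on the given strand."""
    seq = seq.upper()
    k = len(consensus)
    hits = []
    for i in range(len(seq) - k + 1):
        word = seq[i:i + k]
        mm = motif_mismatches(word, consensus)
        if mm <= max_mismatch:
            hits.append((i, word, mm))
    return hits
```

Two or three hits at nearby positions in one upstream region would correspond to the repeated motifs described above; comparing hit positions with predicted −35 and −10 box coordinates shows whether a site lies upstream of the promoter or overlaps it.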

Discussion
We believe that the motif found represents binding sites of a transcription factor because of its positional linkage with the promoter. The variable distance between the motif and the promoters suggests a repressor role for the putative transcription factor. Repetition of the motif is typically associated with cooperative factor binding. This cooperativity can compensate for motif variability, as is the case in Coccomyxa subellipsoidea.
The motif sequence confirms that Helicosporidium sp. belongs to the class Trebouxiophyceae. Its conservation emphasizes the importance of cysT in plastids of parasites and suggests its key importance in understanding the role of plastids in virulence. Indeed, plastids often synthesize many chemicals that are usually provided by mitochondria [7,25].
In Leptosira terrestris, cysA and cysT are not predicted to have the regulatory sites, unlike in other Trebouxiophyceae and their close relatives, which suggests a shift in the specificity of the transporter (consisting of CysA, CysT, and CysP subunits) in Leptosira. This observation is consistent with considerable changes in the CysA sequence in Leptosira. The lack of regulatory sites in Nephroselmis olivacea, Pycnococcus provasolii, Bryopsis hypnoides, and Marchantia polymorpha may suggest a diminishing role of the protein, which is consistent with the loss or pseudogene nature of cysA and cysT in Aneura mirabilis and Ptilidium pulcherrimum. The absence of bacterial-type promoters upstream of cysA and cysT is often associated with changes in gene order on the chromosome. This effect may be explained by the de novo formation of phage-type promoters (possibly activated by another factor) or by the inclusion of cysA or cysT in other operons. In general, sulfate transport can be regulated by changing the expression level of a nuclear-encoded sulfate-binding domain CysP (SbpA).
CysB binding sites in proteobacteria [9, 10] differ considerably from the binding sites of the putative factor that we predict for cysT and cysA. However, the two most conserved motif positions in plastids coincide with the two conserved positions in experimentally characterized sites upstream of cysPTWAM, cysK, cysJIH, cysDNC, sbp, and cysB in proteobacteria. This evidence is, however, insufficient to establish the identity of our predicted motif and the CysB binding sites in proteobacteria.
In E. coli, CysT and CysW are transmembrane domains with similar structure. Their genes belong to the sulfate transport operon cysPTWAM; however, cysW is absent in plastids. We hypothesize that in plastids the CysW subunit is functionally replaced by a second copy of CysT, so that CysT is twice as abundant as CysA and the cysT mRNA concentration is twice as high as that of cysA mRNA. This hypothesis is indirectly supported by the fact that cysT and cysA do not form one operon in plastids, and thus their mRNA expression levels may differ considerably.