Show me your secret(ed) weapons: a multifaceted approach reveals novel type III-secreted effectors of a plant pathogenic bacterium

Many Gram-negative plant and animal pathogenic bacteria employ a type III secretion system (T3SS) to secrete protein effectors into the cells of their hosts and promote disease. The plant pathogen Acidovorax citrulli requires a functional T3SS for pathogenicity. As with Xanthomonas and Ralstonia spp., an AraC-type transcriptional regulator, HrpX, regulates expression of genes encoding T3SS components and type III-secreted effectors (T3Es) in A. citrulli. A previous study reported eleven T3E genes in this pathogen, based on the annotation of a sequenced strain. We hypothesized that this was an underestimation. Guided by this hypothesis, we aimed at uncovering the T3E arsenal of the A. citrulli model strain, M6. We carried out a thorough sequence analysis searching for similarity to known T3Es from other bacteria. This analysis revealed 51 A. citrulli genes whose products are similar to known T3Es. Further, we combined machine learning and transcriptomics to identify novel T3Es. The machine learning approach ranked all A. citrulli M6 genes according to their propensity to encode T3Es. RNA-Seq revealed differential gene expression between wild-type M6 and a mutant defective in HrpX. Data combined from these approaches led to the identification of seven novel T3E candidates, that were further validated using a T3SS-dependent translocation assay. These T3E genes encode hypothetical proteins, do not show any similarity to known effectors from other bacteria, and seem to be restricted to plant pathogenic Acidovorax species. Transient expression in Nicotiana benthamiana revealed that two of these T3Es localize to the cell nucleus and one interacts with the endoplasmic reticulum. This study not only uncovered the arsenal of T3Es of an important pathogen, but it also places A. citrulli among the “richest” bacterial pathogens in terms of T3E cargo. It also revealed novel T3Es that appear to be involved in the pathoadaptive evolution of plant pathogenic Acidovorax species. Author summary Acidovorax citrulli is a Gram-negative bacterium that causes bacterial fruit blotch (BFB) disease of cucurbits. This disease represents a serious threat to cucurbit crop production worldwide. Despite the agricultural importance of BFB, the knowledge about basic aspects of A. citrulli-plant interactions is rather limited. As many Gram-negative plant and animal pathogenic bacteria, A. citrulli employs a complex secretion system, named type III secretion system, to deliver protein virulence effectors into the host cells. In this work we aimed at uncovering the arsenal of type III-secreted effectors (T3Es) of this pathogen by combination of bioinformatics and experimental approaches. We found that this bacterium possesses at least 51 genes that are similar to T3E genes from other pathogenic bacteria. In addition, our study revealed seven novel T3Es that seem to occur only in A. citrulli strains and in other plant pathogenic Acidovorax species. We found that two of these T3Es localize to the plant cell nucleus while one partially interacts with the endoplasmic reticulum. Further characterization of the novel T3Es identified in this study may uncover new host targets of pathogen effectors and new mechanisms by which pathogenic bacteria manipulate their hosts.


Abstract 26
Many Gram-negative plant and animal pathogenic bacteria employ a type III 27 secretion system (T3SS) to secrete protein effectors into the cells of their hosts and 28 promote disease. The plant pathogen Acidovorax citrulli requires a functional T3SS for 29 pathogenicity. As with Xanthomonas and Ralstonia spp., an AraC-type transcriptional 30 regulator, HrpX, regulates expression of genes encoding T3SS components and type III-31 secreted effectors (T3Es) in A. citrulli. A previous study reported eleven T3E genes in 32 this pathogen, based on the annotation of a sequenced strain. We hypothesized that this 33 was an underestimation. Guided by this hypothesis, we aimed at uncovering the T3E 34 arsenal of the A. citrulli model strain, M6. We carried out a thorough sequence analysis 35 searching for similarity to known T3Es from other bacteria. This analysis revealed 51 A. 36 citrulli genes whose products are similar to known T3Es. Further, we combined 37 machine learning and transcriptomics to identify novel T3Es. The machine learning 38 approach ranked all A. citrulli M6 genes according to their propensity to encode T3Es. 39 RNA-Seq revealed differential gene expression between wild-type M6 and a mutant 40 defective in HrpX. Data combined from these approaches led to the identification of 41 seven novel T3E candidates, that were further validated using a T3SS-dependent 42 translocation assay. These T3E genes encode hypothetical proteins, do not show any 43 similarity to known effectors from other bacteria, and seem to be restricted to plant 44 pathogenic Acidovorax species. Transient expression in Nicotiana benthamiana 45 revealed that two of these T3Es localize to the cell nucleus and one interacts with the 46 endoplasmic reticulum. This study not only uncovered the arsenal of T3Es of an 47 important pathogen, but it also places A. citrulli among the "richest" bacterial pathogens 48 in terms of T3E cargo. It also revealed novel T3Es that appear to be involved in the 49 pathoadaptive evolution of plant pathogenic Acidovorax species. The genus Acidovorax (class Betaproteobacteria) contains a variety of species 68 with different lifestyles. While some species are well adapted to soil and water 69 environments, others have developed intimate relationships with eukaryotic organisms, 70 including as plant pathogens [1]. Among the latter, Acidovorax citrulli is one of the 71 most important plant pathogenic species [2]. This bacterium infects all aerial parts of 72 cucurbit plants, causing bacterial fruit blotch (BFB) disease. The unavailability of 73 effective tools for managing BFB, including the lack of resistance sources, and the 74 disease's high destructive potential, exacerbate the threat BFB poses to cucurbit (mainly 75 melon and watermelon) production [3,4]. Despite the economic importance of BFB, 76 little is known about basic aspects of A. citrulli-plant interactions. 77 On the basis of genetic and biochemical features, A. citrulli strains are divided 78 into two main groups: group I strains have been generally isolated from melon and other 79 non-watermelon cucurbits, whereas group II strains have been mainly isolated from 80 watermelon [5][6][7]. Acidovorax citrulli M6 is a group I strain that was isolated in 2002 81 from a BFB outbreak of melons in Israel [5], and subsequently became a model group I 82 strain for investigation of basic aspects of BFB. The A. citrulli M6 genome has been 83 sequenced, first by Illumina MiSeq [8] and recently, by PacBio [9], which allowed its 84 complete closure. 85 As many Gram-negative plant and animal pathogenic bacteria, A. citrulli relies 86 on a functional type III secretion system (T3SS) to promote disease [10]. This complex 87 secretion system is employed by these pathogens to deliver protein effectors into target other plant pathogenic bacteria, we hypothesized that this is an underestimation of the 129 actual number of T3Es in A. citrulli. We also hypothesized that A. citrulli may carry 130 novel T3E genes that were not previously described in other bacteria. Guided by these 131 hypotheses, we carried out a detailed sequence analysis of A. citrulli M6 open reading 132 frames (ORFs) to identify genes with similarity to known T3E genes from other 133 bacteria. We also combined machine-learning (ML) and RNA-Seq approaches to 134 identify putative, novel A. citrulli T3Es. Further, we adapted a T3E translocation assay 135 to verify T3S-dependent translocation of candidate effectors. Combining these 136 approaches allowed identification of seven new T3Es that appear to be unique to plant 137 pathogenic Acidovorax species. Subcellular localization of three of these T3Es in N. 138 benthamiana leaves was also determined by Agrobacterium-mediated transient    An initial ML run was used to classify all ORFs of strain AAC00-1 according to 161 their probability to encode T3Es. This strain, rather than M6, was used for learning and 162 prediction, because at the time this ML was conducted, the AAC00-1 genome was fully 163 assembled with better annotation. For training, the positive set included 12 AAC00-1 164 genes that encoded T3E homologs: the eleven genes described by Eckshtain-Levi et al. 165 [27] and one additional gene, Aave_2938 that is identical to Aave_2708. The negative 166 set included genes that showed high sequence similarity to ORFs of a non-pathogenic 167 Escherichia coli strain. The output of this ML run was a list of all annotated genes of A. 168 citrulli AAC00-1 ranked by their propensity to encode T3Es (S1 Table). For each ORF, 169 we searched for the homolog in A. citrulli M6. Among the top predictions from 170 AAC00-1, many genes did not have homologs in M6. As expected, the aforementioned 171 12 positive T3E genes of AAC00-1 were ranked high in this list (among the 36 highest 172 scoring predictions, with eight being ranked among the top 10, and eleven among the 173 top 15; S1 Table) were known prior to this study, based on the annotation of the A. citrulli group II strain AAC00-202 1 [27], in addition of gene Aave_2708, which is not present in strain M6.  Table 1). A smaller number of genes, 31, 213 shared similarity with T3E genes of P. syringae strains. We also assessed the 214 occurrence of these T3Es in other plant pathogenic Acidovorax species (S2 Table).  Interestingly, of the 51 putative T3E genes of A. citrulli M6, ten were not present 227 in the genome of the group II strain AAC00-1 (Table 1). Besides M6 and AAC00-1, the 228 NCBI database includes draft genomes of one additional group II strain, KAAC17055, 229 and four group I strains (pslb65, tw6, DSM 17060 and ZJU1106). BlastN analyses 230 revealed that these ten genes are also absent in strain KAAC17055, but present in most 231 of the group I strains. The only exceptions were APS58_0506 that was not detected in 232 strains tw6 and DSM 17060, APS58_1209 that was not detected in tw6, and 233 APS58_2767 that was not detected in DSM 17060. The inability to detect these T3E 234 genes in the genomes of strains tw6 and DSM 17060 could reflect true absence in these 235 strains but also could be due to the draft nature of these genomes. In any case, these 236 results strongly suggest that the ten M6 T3E genes that are absent in the group II strains 237 AAC00-1 and KAAC17055 could be specific to group I strains of A. citrulli. Yet, this 238 assumption should be verified on a larger collection of strains. Interestingly, among 239 these ten T3E genes, APS58_0506, APS58_2228 and APS58_3303, were detected in 240 strains of all other plant pathogenic Acidovorax species (S2 Table). In the case of 241 APS58_2228, it should be mentioned that the group II strains AAC00-1 and 242 KAAC17055 possess genes (Aave_0378 in AAC00-1) that encode short products (140 243 a.a.) and partially align with the C-terminal region of the group I product (with 244 predicted length of 538 a.a.). In our analysis we did not consider them as ortholog  In Xanthomonas spp. and R. solanacearum, the transcriptional regulator HrpX 250 (HrpB in R. solanacearum) plays a key role in regulation of hrp and T3E genes. We 251 hypothesized that this is also the case in A. citrulli M6. To assess this hypothesis, we 252 first generated an A. citrulli M6 strain mutated in APS58_2298, the hrpX orthologous 253 gene. This mutant lost the ability to cause disease in melon ( Fig 1A) and induce HR in 254 pepper leaves (Fig 1B), as previously observed for a strain carrying a mutation in the    transcription. Nine hits belonged to protein secretion/protein secretion by the T3SS and 320 these corresponded to genes encoding Hrp/Hrc components. Notably, most T3S and 321 T3E genes could not be assigned to any specific GO biological process; this was the 322 case for 11 hrp/hrc/hpa genes and for 24 T3E genes (S3C Table). Overall, RNA-Seq 323 revealed 20 hrp/hrc/hpa genes and 27 genes encoding putative T3Es (including the 324 seven new effectors identified in this study; see below) that had significantly reduced 325 expression in the hrpX mutant relative to wild-type M6 (S3B and S3C Tables). 326 Importantly, almost 60 genes that showed reduced expression in the hrpX mutant are 327 annotated as hypothetical proteins and did not show similarity to known T3E genes. It is 328 possible that some of these genes encode novel T3Es. Gene ontology (GO) biological process category (blue columns). HrpX-regulated genes 334 encoding T3S structural and accessory proteins (red column) and putative T3Es (green 335 column) were manually assigned to these categories. 336 Interestingly, the hrpX mutant also showed reduced expression of several genes 338 encoding proteins that are putatively secreted by the type II secretion system (T2SS). 339 We used SignalP, Pred-Tat and Phobius tools to detect putative Tat or Sec type II 340 secretion (T2S) signals in the ORFs of all genes that showed significantly lower 341 expression in the hrpX mutant relative to the wild-type strain. While T2S signals were 342 predicted in 39 genes by at least one of the tools (not shown), 14 genes were predicted 343 to encode products with T2S signals by the three different tools (S3E Table). Among 344 these genes were APS58_0633 (xynB) encoding 1-4--xylanase, APS58_2599 (pelA_2), 345 encoding pectate lyase, and APS58_3722, encoding a family S1 extracellular serine 346 protease. These three genes were also shown to contain PIP boxes in their promoter 347 region (S3B Table).

348
Of the 28 genes showing increased expression in the hrpX mutant relative to 349 wild-type M6, only ten could be assigned to GO categories, most of which belonged to 350 regulatory genes (regulation of transcription, phosphorelay signal transduction system, 351 signal transduction; S3D Table).  Table), of which 41 correlated with significant regulation by HrpX 359 (Table 2 and S4 Table). We used the PIP boxes of the aforementioned 41 genes/operons 360 to determine the consensus PIP box of A. citrulli using the MEME suite (Fig 4). 361 Importantly, some of the PIP boxes are upstream of operons, thus probably regulating 362 the expression of more than one gene. We detected additional 25 genes [marked as (+) 363 in the PIP box column of S3B Table] that are likely in PIP box-containing operons and 364 showing higher expression in the wild-type strain relative to the hrpX mutant. It is also 365 worth mentioning that eleven additional genes (some of which encoding T3Es) carrying 366 PIP boxes showed higher expression values in the wild-type relative to the hrpX mutant 367 in the RNA-Seq experiment, but were slightly below the level of statistical significance 368 (S3A and S4 Tables).    Table 2).   the Bs2 gene), and when a X. euvesicatoria hrpF mutant (impaired in T3S) was used in 409 these assays (Fig 5A). Overall, these results demonstrated the suitability of the avrBs2- Four genes that were ranked relatively low in the ML were also included in these 433 experiments to evaluate the quality of the ML prediction (Table 3 and S1 Table). All 434 seven CT3E genes, but not the low-ranked ML genes, were translocated (Fig 5B). The 435 validated genes were annotated as hypothetical proteins, had a predicted PIP box, were 436 shown to be positively regulated by HrpX, and ranked high in the ML run (Table 3 and 437 S1 Table). Importantly, the gene APS58_1340, which contains a PIP box in its promoter 438 region and its expression is regulated by HrpX (Table 3) was not translocated, 439 indicating that these two parameters alone are not sufficient for accurate prediction of     Table). Searches for conserved domains in these T3Es did 471 not provide any insight.  to the nucleus, including in clearly visible nuclear foci (Fig 6).    (Table 1 and S1 Table), 537 thus supporting the higher reliability of the new list relative to the first prediction.  Xanthomonas spp., R. solanacearum and/or P. syringae strains (Table 1). Homologs for 575 most of these T3E genes and for those identified in the present study were also detected 576 in other plant pathogenic Acidovorax species (S2 Table).

577
To identify new T3E genes of A. citrulli, we also used RNA-Seq to identify 578 HrpX-regulated genes. Based on the knowledge accumulated with Xanthomonas spp. 579 and R. solanacearum [22,32,47, 48], we expected that most genes encoding T3SS 580 components and some T3Es of A. citrulli would be under the direct regulation of HrpX.

581
This assumption was strengthened in preliminary experiments comparing gene 582 expression between a wild-type and a hrpX mutant strain ( Fig 1C). As previously  Table). These numbers are similar to those reported in gene expression regulators. Among the HrpX up-regulated genes we also detected several genes whose 606 products are putatively secreted by type II secretion (T2S). These included genes 607 encoding 1-4--xylanase (xynB), pectate lyase and a protein with similarity to a family 608 of S1 extracellular serine proteases (S3E Table). HrpX regulation of genes encoding 609 type II-secreted enzymes was also demonstrated in Xanthomonas spp. and in R. HrpX-down-regulated (S2B Table).

623
After demonstrating the suitability of the avrBs2-Bs2 T3E translocation assay 624 with eight known T3Es, we used the data obtained from the first ML run and the RNA-625 Seq analysis to select seven A. citrulli M6 ORFs for experimental validation (Fig 5). We 626 validated translocation of the seven candidates, thus demonstrating the strength of 627 combining ML and RNA-Seq for identifying T3E genes. Importantly, the lack of 628 translocation of the four ORFs that received relatively low scores in the first ML run 629 strengthened the suitability of our combined computational/experimental approach.

630
Remarkably, the seven effectors identified in this study were up-regulated by 631 HrpX and carried PIP boxes in their promoter regions, while among the four non-632 validated genes, only one had these traits (Table 3). An interesting trait of the seven new 633 T3Es was that they share significant similarity only with hypothetical proteins of other 634 plant pathogenic Acidovorax strains (Table 3 and S2 Table). This strongly supports that 635 these effectors are unique to plant pathogenic Acidovorax. Importantly, a second ML  Table). These represent high priority CT3Es for future experimental validation assays.

640
This emphasizes one benefit of the ML approach: its ability to integrate novel 641 knowledge in the prediction algorithm.  (Table 1 and S1 Table). Thus, it is logical to assume that the variability in T3E content 686 between group I and II strains plays a critical role in shaping the differences in host-687 preferential association between the groups. Despite this, more research is needed to test  In order to predict T3Es, we applied ML classification algorithms, which are 723 similar to the ones we have previously described [28,29,58,59]. The first ML run was 724 used to search for T3Es in the AAC00-1 genome (GenBank accession CP000512.1).

725
The training data included 12 ORFs that were known as T3Es (see in Results). The 726 negative set included 2,680 ORFs that had high similarity (E-value less than 0.001) to 727 ORFs in the non-pathogenic E. coli K12 genome (accession number NC_000913.3).

728
The positive and negative ORFs are marked in S1 Table. For this ML, 71 features were 729 used, including homology (to known effectors or to bacteria without T3SS),

743
The second ML run was similar to the first, with the following modifications.

744
First, the classifiers were run on the M6 genome (GenBank accession CP029373).

745
Second, the positive set included ORFs that were validated as T3Es in this study and 746 ORFs with high sequence similarity to known effectors as described in Table 1. The 747 negative set included 2,570 ORFs (S5 Table). The four ORFs that were experimentally 748 shown not to be T3Es were also included in this negative set. Third, the expression data 749 from the RNA-Seq analysis (HrpX regulation) were added as a feature. Fourth, the PIP 750 box feature was updated to reflect the PIP box as inferred from promoter regions of 751 Acidovorax T3Es (see additional bioinformatics tools below). The second ML run was 752 based on Voting classifier, which included all the classifiers specified above, as it gave 753 the highest AUC among all the classifiers. The AUC for this second ML run was 0.999.   genes with a fold-change higher than 2, and a P value lower than 0.05.

836
Validation of RNA-Seq results by quantitative real-time PCR (qRT-PCR) 837 RNA-seq data were verified by qRT-PCR using specific primers of selected 838 genes (S7 Table). Bacterial growth, RNA isolation and cDNA synthesis were as   The ORFs without the stop codon of candidate genes were amplified using 872 specific primers (S7 Table) and cloned into the SalI/XbaI sites of pBBR1MCS- The ORFs of genes APS58_0500, APS58_1448 and APS58_4116 were amplified 894 with specific primers (S7 Table) and cloned into pEarlyGate101 binary vector [84], 895 upstream of a Yellow Fluorescence Protein (YFP) encoding gene and an HA tag using 896 the Gateway cloning system (ThermoFisher Scientific). The resulting plasmids were 897 verified by sequencing and mobilized into A. tumefaciens GV3101 as indicated above.

898
Transient expression experiments were performed following the protocol described by