Proteinaceous effector discovery and characterization in filamentous plant pathogens

Abstract The complicated interplay of plant–pathogen interactions occurs on multiple levels as pathogens evolve to constantly evade the immune responses of their hosts. Many economically important crops fall victim to filamentous pathogens that produce small proteins called effectors to manipulate the host and aid infection/colonization. Understanding the effector repertoires of pathogens is facilitating an increased understanding of the molecular mechanisms underlying virulence as well as guiding the development of disease control strategies. The purpose of this review is to give a chronological perspective on the evolution of the methodologies used in effector discovery from physical isolation and in silico predictions, to functional characterization of the effectors of filamentous plant pathogens and identification of their host targets.


| INTRODUCTION
If people think Nature is their friend, then they sure don't need an enemy.
Kurt Vonnegut, Letter in Time magazine

| The threats from filamentous phytopathogens
Our expanding global population forces us to intensify our crop production as we prepare to feed 2.2 billion more people by 2050. Filamentous plant pathogens are one of the main biotic challenges to meeting these ever-growing demands. Oomycetes and fungi are the causal agents of some of the most notorious plant diseases and are a true threat to our global food security and community structures. Plant disease outbreaks have occurred throughout human history; some of the most infamous include the Irish potato famine caused by the oomycete Phytophthora infestans (Turner, 2005), Panama disease caused by Fusarium oxysporum f. sp. cubense (Gordon, 2017), and wheat stem rust caused by Puccinia graminis f. sp. tritici (Roelfs, 1985; Singh et al., 2011).

| Effectors and the plant immune response
The elegantly described "zig-zag" model by Jones and Dangl (2006) reveals a two-tier immune response where pathogen-associated molecular patterns (PAMPs) are first detected on host cell surfaces

| The importance of effector research
Hundreds of small proteins, predicted to be effectors, are secreted by filamentous phytopathogens during host colonization (Dean et al., 2005;Kämper et al., 2006;Yoshida et al., 2009;Duplessis et al., 2011). We have little understanding of the function of most of these putative effectors and each typically shares minimal or no sequence homology to proteins with previously defined functions. However, the effector repertoire of a pathogen is a major determinant of host specialization and can greatly impact whether the plant-pathogen interaction is successful or not based on the genotype of the host (Raffaele et al., 2010;Sánchez-Vallet et al., 2018a). Molecular studies have characterized over 60 fungal effectors across multiple species; however, this barely makes a dent in the candidate effector repertoire for each pathogenic species (Sperschneider et al., 2015). For example, the barley powdery mildew fungus Blumeria graminis f. sp. hordei alone is suspected to have roughly 7% of its genome encoding candidate secreted effector proteins (CSEPs) (Pedersen et al., 2012).
Identifying and characterizing the function of effector proteins will improve our understanding of their role in disease formation and influence our future strategies to combat pathogen infections.
Fundamental effector research is a key part of devising new plant disease control strategies and this is detailed further in Sections 3.2 and 6 of this review. Effectors play an important role in crop breeding where, as well as being used to detect resistance genes in new cultivars, characterized effectors can be used to locate susceptibility loci in vulnerable crops (Vleeshouwers and Oliver, 2014). The development of mobile sequencing technology means that genes encoding effectors can also be used to detect the emergence of new strains of crop pathogens in the field and limit the severity of future disease outbreaks (Radhakrishnan et al., 2019). Effectors function in multiple ways, including inhibiting host enzymes, modulating plant immune responses, and targeting host gene-silencing mechanisms. All features of effectors described in this article are summarized in Table 1.

TABLE 1 (Continued) (fragment; only parts of the table survive and column headings are not recoverable)
... Sohn et al. (2007)
Avr1b-1; 204; G5A9E5; Uncharacterized; Phytophthora sojae; Stem and root rot of soybean; Soybean (Glycine max); Shan et al. (2004), Dou et al. (2008); Has been shown to reduce heterologously induced plant cell death
Avr1-CO39; 89; NA; Uncharacterized protein that is recognized in the host by direct binding of the NB-LRR protein RGA5, which together with RGA4 induces ETI
a Number of amino acids including signal peptide.
that become important in many subsequent effector identification stories (van Kan et al., 1991).

| Homology searches
Once an effector has been cloned, the sequence can be used to identify homologous candidates in closely related species. Three elicitins were isolated from Phytophthora spp. using proteomics techniques: cryptogein (P. cryptogea), cinnamomin (P. cinnamomi), and capsicein (P. capsici) (Pernollet, 1989; Ricci et al., 1989). Primers were designed based on conserved regions of the elicitin amino acid sequences and used to probe cDNA libraries from P. parasitica, leading to the discovery of the host-specific elicitor protein PARA1 (Kamoun et al., 1993).

| Genetic mapping
Prior to the genomics era, the isolation of Avr proteins from intracellular colonizing fungal pathogens such as Magnaporthe oryzae and haustoria-producing pathogens was unsuccessful using the proteomics approach. Instead, in the case of the rice blast fungus M. oryzae, map-based cloning techniques were used to clone Avrs such as Avr1-CO39 (Farman and Leong, 1998). Avr1-CO39 was mapped to a region on chromosome 1 by a series of backcrosses of the progeny of the virulent isolate Guy11 and the avirulent isolate 2539 (Smith and Leong, 1994). Later, a chromosome-walking strategy led to the physical mapping and identification of Avr1-CO39. The identity of the Avr1-CO39 locus was confirmed by transforming the virulent Guy11 strain with cosmids from the Avr1-CO39 genetic interval. This resulted in a loss of pathogenicity on rice cultivars containing the corresponding functional CO39 resistance gene (Farman and Leong, 1998).

| Always lagging behind
By the end of the 20th century, over 30 bacterial Avr genes had been cloned and characterized by screening cosmid libraries, with almost all of these coming from two host-specific species of Pseudomonas and Xanthomonas (Leach and White, 1996; De Wit, 1997). In comparison, using proteomics and genetic mapping, only eight fungal phytopathogen Avr genes had been successfully identified and confirmed to be effectors (Laugé and De Wit, 1998). But all this was about to change.

| Sanger and next-generation sequencing of pathogen genomes
In the early 2000s, the Fungal Genome Initiative (FGI) was established following the publication of a white paper (Birren et al., 2003) to promote the sequencing in the public domain of fungal genomes belonging to species important to human health, agriculture, and industry. By 2017 a total of 191 genomes of fungal plant pathogens had been sequenced, including the economically important M. oryzae, Fusarium graminearum, and Botrytis cinerea (Dean et al., 2005, 2012; Cuomo et al., 2007; Amselem et al., 2011; Aylward et al., 2017). This, together with the publication of numerous oomycete genomes, including the potato late blight pathogen Phytophthora infestans (Haas et al., 2009), as well as extensive in planta and in vitro transcriptome data sets, has led to an explosion in effector discovery. These techniques for effector discovery are summarized in Table 2.

FIGURE 1 A timeline showing the progression of filamentous plant pathogen effector prediction and identification from the pregenomic era to the present day. The first effectors identified using these methods are included as well as the elicitins used for homology-based searches. Increasingly, pangenome data are used to predict core and novel candidates but as yet none have been characterized using this technique. For a recent review of pangenomics see Golicz et al. (2019). Details on individual effectors named are given in Table 1.

| REFINING EFFECTOR PREDICTION
Truth, like gold, is to be obtained not by its growth, but by washing away from it all that is not gold.

| Secretion
As the de Wit et al. studies demonstrated, a key feature of effectors is secretion by the pathogen into the host (De Wit et al., 1985;Asai and Shirasu, 2015). Therefore, early studies in effector discovery using sequencing data focused on the predicted secretome.
In a bid to identify extracellular effector proteins, Torto et al. (2003) used their PEX-finder algorithm to mine transcript datasets of the potato pathogen P. infestans. The algorithm searched for a specific amino acid sequence known as a signal peptide followed by a cleavage site commonly found at the N-terminus of secreted proteins (Nielsen and Krogh, 1998;Torto et al., 2003). Of the 261 cDNAs predicted to code for secreted proteins, 78 had no matches to those found in the public databases, a feature common to candidate effectors. Using high-throughput functional expression assays this study led to the discovery of a large complex family of effectors called crinklers (CRNs), which are found throughout the pathogenic oomycetes (Schornack et al., 2010;Amaro et al., 2017).
However, some characterized secreted effectors lack a signal peptide. For example, the effectors, PsIsc1 and VdIsc1, produced by Phytophthora sojae and Verticillium dahliae, respectively, have been shown to be unconventionally secreted into the respective host to suppress salicylate (SA)-mediated defences in planta (Liu et al., 2014).
Another difficulty is that such broad criteria leave a large pool of possible effector candidates that are demanding in both time and resources to characterize functionally, and studies often have low discovery rates. The Magnaporthe grisea effector MC69, essential for appressoria formation (Motaung et al., 2017), was the only candidate from 1,306 putative secreted proteins that was found to be required for pathogenicity following large-scale gene disruptions (Yoshida et al., 2009; Saitoh et al., 2012).

| Domains
The C. fulvum effector Ecp6 sequesters the fungal cell wall polysaccharide chitin, preventing chitin fragment detection by the host PRRs, and thereby evades a host immune response (De Jonge et al., 2010). Ecp6 contains LysM domains that bind to chitin with ultrahigh affinity, therefore outcompeting host immune receptors (Sánchez-Vallet et al., 2013).
The LysM domain found in Ecp6 has now been identified in over 302 putative effectors from 62 published fungal genomes, and is conserved among effectors targeting the chitin detection aspect of plant immunity (De Jonge and Thomma, 2009;Lee et al., 2014).
On the other hand, the Avr2 effector from C. fulvum and the EPIC1 and EPIC2 effectors from P. infestans all target the tomato defence protease Rcr3 (Song et al., 2009), yet are unrelated and share no sequence similarity. Relying on the presence of conserved domains could therefore cause many possible candidates to be overlooked.

| Motifs
The first four oomycete Avr effectors cloned, ATR13 and ATR1NdWsB from the downy mildew Hyaloperonospora parasitica (Allen et al., 2004; Rehmany et al., 2005), Avr3a from P. infestans, and Avr1b-1 from P. sojae (Shan et al., 2004), showed no sequence similarity except for two conserved motifs at the N-terminus. These RxLR and DEER motifs have since been identified as N-terminal host targeting domains and, in P. infestans, the RxLR motif in the Avr3a effector is required for translocation into potato cells (Whisson et al., 2007; Bos et al., 2010).

RxLR effectors have been identified in multiple Phytophthora, Albugo, and Hyaloperonospora species, with 568 RxLR genes being found in P. infestans alone, making this the largest oomycete effector family to date (Anderson et al., 2015). Rapid variation and host specialization are attributed to the general lack of sequence similarity in filamentous pathogen effectors, yet this mostly contributes to the variation in the C-terminus of oomycete effector sequences, leaving the N-terminal motifs largely conserved. Conserved motifs such as RxLR and the more downstream DEER are used as powerful bioinformatic tools to isolate putative effector repertoires from genomic sequences (Jiang et al., 2008; Raffaele and Kamoun, 2012).
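As a minimal sketch of how conserved motifs can be used to mine predicted proteins, the Python snippet below scans an amino acid sequence for an RxLR motif followed downstream by an EER motif. The function name, the search windows (`rxlr_window`, `eer_max_gap`), and the toy sequence are illustrative assumptions rather than parameters taken from the studies cited above; in practice such a scan would be combined with signal peptide prediction.

```python
def find_rxlr_eer(seq, rxlr_window=(20, 100), eer_max_gap=40):
    """Locate the first RxLR motif and a downstream EER motif.

    Hypothetical helper: both window sizes are illustrative assumptions.
    Returns None if no RxLR motif starts inside rxlr_window, otherwise a
    tuple (rxlr_pos, eer_pos) of 0-based indices, where eer_pos is None
    if no EER is found within eer_max_gap residues after the RxLR motif.
    """
    seq = seq.upper()
    lo, hi = rxlr_window
    for i in range(lo, min(hi, len(seq) - 3)):
        # RxLR: arginine, any residue, leucine, arginine.
        if seq[i] == "R" and seq[i + 2:i + 4] == "LR":
            region_start = i + 4
            j = seq.find("EER", region_start, region_start + eer_max_gap + 3)
            return i, (j if j != -1 else None)
    return None


# Toy sequence (not a real effector): 25 filler residues, RSLR, a spacer, then EER.
toy = "M" + "A" * 24 + "RSLR" + "G" * 10 + "EER" + "K" * 20
hit = find_rxlr_eer(toy)
```

Only the first RxLR occurrence in the window is reported, which keeps the sketch short; a real pipeline might evaluate every occurrence.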
Within pathogenic fungi there is limited evidence for conserved translocation motifs. One possible exception is the [YFW]xC motif found in Blumeria graminis f. sp. hordei and Puccinia spp., members of the phyla Ascomycota and Basidiomycota, respectively (Godfrey et al., 2010; Duplessis et al., 2011). The evolutionary distance between these two fungi suggests a deep homology in the conservation of this motif, linked to a biotrophic lifestyle that uses haustoria-based feeding.
However, the general lack of sequence similarity or conserved domains means that bioinformatic approaches to effector prediction need to go beyond sequence homology.

| Structure
The structural properties of proteins are more highly conserved than amino acid sequences (Illergård et al., 2009) and therefore could be used as a tool for effector prediction. The structural similarities between the two sequenced M. oryzae effectors Avr1-CO39 and Avr-Pia were found using two- and three-dimensional nuclear magnetic resonance (NMR) experiments and led to the discovery of the Magnaporthe Avr and ToxB-like effector family (MAX), which contains half of all cloned M. oryzae Avrs even though its members share less than 25% sequence identity (de Guillen et al., 2015).
The structural analysis of four RxLR oomycete effectors showed the presence of a conserved C-terminus 3-α-helix fold (Boutemy et al., 2011; Yaeno et al., 2011). This WY domain, named after the interacting tryptophan and tyrosine residues, hints at a core, stable protein scaffold as a source of protein function (Wirthmueller et al., 2013).
Resolving the structure of known effector proteins provides a useful tool for supporting the candidacy of putative effectors. One of the early effectors to be structurally resolved was ToxA produced by the tan spot fungus, Pyrenophora tritici-repentis. The ToxA crystal structure was resolved using X-ray crystallography (1.65 Å) and revealed a novel β-sandwich fold (Sarma et al., 2005). Later, the resolution of the flax rust, Melampsora lini, effectors AvrL567-A and -D showed a similar β-sandwich fold hinting at the structural homology of unrelated effector proteins (Wang et al., 2007).
Recently the structures of two candidate effectors in the poplar rust fungus, Melampsora larici-populina, were resolved using NMR.
One, MLP124266, is the first fungal protein to present a knottin-like structure (Postic et al., 2017)

| Rich in cysteines but not in size
The additional criteria for candidate effector selection often require secreted proteins to be small and cysteine-rich (Sperschneider et al., 2015). The presence of multiple cysteines enables the formation of stabilizing disulphide bridges (De Wit et al., 1986;Doehlemann et al., 2009).
Relying on such broad criteria can be problematic as, despite many known effectors sharing these features, these are not universal requirements. NIS1, first described in the cucumber anthracnose fungus Colletotrichum orbiculare (Yoshino et al., 2012), is conserved across both Basidiomycota and Ascomycota (Irieda et al., 2019), but contains no cysteines.
Relying on the size of mature peptides as a parameter for effector identification can also be problematic. The maximum size of a small protein in effector discovery can be anything from 150 to 400 amino acids (Bowen et al., 2009;Saunders et al., 2012b). However, even the larger size limits would exclude the P. graminis f. sp. tritici effector AvrSr35 with a mature length of 578 amino acids (Salcedo et al., 2017).
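The limitation of the "small and cysteine-rich" criteria can be illustrated with a short Python sketch. The function name and the default thresholds (`max_length`, `min_cysteines`) are hypothetical choices for illustration only; the text above notes that published size cut-offs range from 150 to 400 amino acids. The sequences are toy strings, not real effector sequences.

```python
def small_cys_rich_report(mature_seq, max_length=300, min_cysteines=4):
    """Report whether a mature protein passes simple size and cysteine filters.

    Hypothetical helper: both thresholds are illustrative assumptions.
    """
    seq = mature_seq.upper()
    n_cys = seq.count("C")
    return {
        "length": len(seq),
        "cysteines": n_cys,
        "passes_size": len(seq) <= max_length,
        "passes_cysteine": n_cys >= min_cysteines,
    }


# A toy 578-residue sequence fails even the most permissive size limit (400).
toy_long = "ACDE" * 144 + "AC"
long_report = small_cys_rich_report(toy_long, max_length=400)

# A toy cysteine-free sequence fails the cysteine filter regardless of size.
toy_no_cys = "AGS" * 30
no_cys_report = small_cys_rich_report(toy_no_cys)
```

Any fixed threshold of this kind would discard proteins with the length reported for AvrSr35 or the cysteine-free composition reported for NIS1, which is the point made in the preceding paragraphs.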

| Bespoke bioinformatic pipelines
With these issues in mind, bioinformatic pipelines have been developed to encompass multiple criteria to refine effector prediction. Saunders et al. developed an in silico analysis pipeline that moved away from reliance on sequence similarity-based methods for effector identification and included physiological functions such as expression profiles, taxonomic information, and genomic features of potential candidates (Saunders et al., 2012b). To identify the repertoire of potential effectors within two rust fungus genomes, a clustering algorithm grouped candidates into families and ranked their likelihood of being effectors based on the knowledge that filamentous pathogen effectors have at least one of eight specific properties. These properties included the absence of recognized Pfam domains, similarities to haustorial proteins, and the presence of internal repeats. The number of candidates taken forward to functional analysis was greatly reduced using this pipeline (Saunders et al., 2012b). This approach has limitations as it is dependent on thresholds based on a priori assumptions about effector properties; the number of missed effectors remains to be seen.
(Pierleoni et al., 2008; Armenteros et al., 2019). The subcellular localization of candidate effectors can also be predicted by searching for chloroplast or mitochondrial transit peptides or nuclear localization signals using tools such as WoLF-PSORT (wolfpsort.hgc.jp/) or LOCALIZER (localizer.csiro.au/) (Horton et al., 2007; Sperschneider et al., 2017). Machine learning has also resulted in the development of web-based tools that can predict with 89% accuracy whether proteins in the predicted secretome are effectors or not. EffectorP 2.0 (effectorp.csiro.au/) takes into account the net charge and serine/cysteine content of proteins to prioritize candidate effectors for further functional validation (Sperschneider et al., 2018).
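To make the kind of sequence-derived features mentioned above concrete, the following Python sketch computes a crude net charge and the serine and cysteine fractions of a protein. It is not the EffectorP implementation: the function name is hypothetical, and the net charge is a simplification that counts lysine and arginine as +1 and aspartate and glutamate as -1 while ignoring histidine, the termini, and pH.

```python
def simple_sequence_features(seq):
    """Compute crude features from an amino acid sequence.

    Hypothetical helper for illustration; the charge model is a deliberate
    simplification (K/R = +1, D/E = -1, everything else neutral).
    """
    seq = seq.upper()
    n = len(seq)
    if n == 0:
        raise ValueError("sequence must not be empty")
    positive = seq.count("K") + seq.count("R")
    negative = seq.count("D") + seq.count("E")
    return {
        "length": n,
        "net_charge": positive - negative,
        "serine_fraction": seq.count("S") / n,
        "cysteine_fraction": seq.count("C") / n,
    }


# Toy peptide (not a real protein).
features = simple_sequence_features("KKRDESSCCA")
```

Features like these would normally feed a trained classifier rather than be thresholded by hand; the sketch only shows how they can be derived from sequence alone.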

| Genomic landscape and transposable elements
Many fungal plant pathogens exhibit a two-speed genome, with distinct genomic compartments evolving at different rates. Alongside core stable regions, which are slow to evolve and often contain genes involved in metabolism, are hypervariable areas with high recombination and richness in repetitive sequences, including transposable elements (TEs). This genomic landscape and the presence of TEs serve to drive adaptive evolution (Faino et al., 2016) and these hypervariable regions often are the location of genes associated with pathogenicity, including effectors (Fouché et al., 2018;Jones et al., 2018).
In M. oryzae and Zymoseptoria tritici, TEs are associated with pathogenicity clusters and are seen to flank the first characterized Z. tritici effector, AvrStb6 (Bao et al., 2017; Zhong et al., 2017). TEs have also been shown to interfere with effector gene expression via epigenetic control. For example, AvrLm1 in Leptosphaeria maculans, located in a TE-rich genomic region, showed distinct histone methylation that acts to temporarily suppress expression during colonization to evade host recognition (Soyer et al., 2014; Fouché et al., 2018). This suggests that the variability of the genomic region or the proximity to TEs may be useful factors in refining the search for candidate effectors.
Following the sequencing, genome assembly, and annotation of the tumour-forming maize smut fungus Ustilago maydis, c. 18% of genes encoding secreted proteins were found to be arranged into 12 discrete clusters within the genome (Kämper et al., 2006). These clusters were co-regulated by a central pathogen-development regulator and expression induced in tumour tissue. Deletions of five clusters caused clear changes in virulence, including the largest cluster, 19A, whose deletion strongly attenuated virulence and reduced tumour formation (Kämper et al., 2006; Brefort et al., 2014). Subsequent subdeletions of 19A members led to the identification of the effector Tin2, required for anthocyanin production (Tanaka et al., 2014).

| Comparative genomics
By comparing the genomes of U. maydis and Sporisorium reilianum, Schirawski et al. (2010) found that effector clusters and pathogenicity-related regions were more highly diverged between the close relatives than the rest of the genome. This comparison led to the identification of the pit gene cluster involved in tumour formation in U. maydis. Within this cluster the secreted effector Pit2, involved in plant defence suppression and cysteine protease inhibition, was found (Mueller et al., 2013). This same comparison was used to locate gene clusters and candidate effectors in S. reilianum, and whilst genes that have a partial impact on disease severity have been identified, as yet no candidates strongly attenuate virulence (Ghareeb et al., 2019).

| Lineage-specific elements
Novel effectors were identified in the asexual fungus V. dahliae, where chromosome reshuffling has led to the formation of lineage-specific (LS) regions of plasticity in the genome (de Jonge et al., 2013). These LS regions are enriched with retrotransposon and repetitive sequence elements, as well as being the location of many candidate effectors. Contrary to the two-speed genome hypothesis, these LS regions show strong levels of conservation with little to no single nucleotide polymorphisms (SNPs) being identified, even within the intergenic regions. In one such LS region, four putative effectors were identified, including the LysM domain-containing effector Vd2LysM, which was only found in the VdLs17 strain (de Jonge et al., 2013).

| Sequence divergence
Molecular variation in filamentous phytopathogen genes is known to be essential for altering pathogen-host interaction outcome and can provide insight into the evolution of virulence (Allen et al., 2008).
Polymorphisms in effector sequences among isolates can impact on virulence and are involved in host adaptation; this makes them promising targets for disease control strategies.
The genomes of four isolates of the wheat yellow stripe rust fungus Puccinia striiformis f. sp. tritici were resequenced and assessed for SNPs. Proteins that displayed nonsynonymous substitutions between isolates that differed in virulence on specific wheat cultivars were identified (Cantu et al., 2013). This led to five secreted polymorphic candidate effectors being nominated for further characterization from a predicted secretome of 2,999 proteins.
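The idea of flagging nonsynonymous substitutions between isolates can be sketched in Python as below. This is a simplified illustration, not the analysis used by Cantu et al. (2013): it assumes two already aligned, gap-free coding sequences of equal length containing only unambiguous bases, it uses the standard genetic code, and the function name and toy alleles are hypothetical.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_substitutions(cds_a, cds_b):
    """Compare two aligned coding sequences codon by codon.

    Returns a list of (codon_number, aa_in_a, aa_in_b, kind) for every codon
    that differs, where kind is "synonymous" or "nonsynonymous".
    """
    if len(cds_a) != len(cds_b) or len(cds_a) % 3 != 0:
        raise ValueError("sequences must be aligned, equal length, and in frame")
    differences = []
    for i in range(0, len(cds_a), 3):
        codon_a = cds_a[i:i + 3].upper()
        codon_b = cds_b[i:i + 3].upper()
        if codon_a == codon_b:
            continue
        aa_a, aa_b = CODON_TABLE[codon_a], CODON_TABLE[codon_b]
        kind = "synonymous" if aa_a == aa_b else "nonsynonymous"
        differences.append((i // 3 + 1, aa_a, aa_b, kind))
    return differences


# Toy alleles from two hypothetical isolates.
diffs = classify_substitutions("ATGGCTAAA", "ATGGCCGAA")
```

In a real resequencing study the comparison would start from variant calls against a reference and would need to handle indels, introns, and ambiguous bases, none of which this sketch attempts.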
This sequence divergence has also proved useful in identifying pathogens in the field. Using the Oxford Nanopore MinION sequencer, 242 highly variable genes were used to collect real-time population dynamics data of P. striiformis f. sp. tritici isolates in Ethiopia (Radhakrishnan et al., 2019). This Mobile And Real-time PLant disEase (MARPLE) diagnostic system can be used to monitor the emergence of plant pathogen strains, but can also be adapted to include newly characterized effectors within the panel of genes.
Going forward, MARPLE will allow for the monitoring of mutations and the detection of effector evolution that may be linked to gain of virulence of phytopathogens, all within the confines of the field.

| Association mapping in the sequencing era
In silico predictions of effectors, whilst allowing us to rapidly screen whole genomes for candidates, lack discriminatory power and often result in candidate effectors having no clear impact on pathogen virulence. Genome-wide association studies (GWAS) and quantitative trait locus (QTL) mapping can identify loci associated with heritable phenotypic variation, such as virulence, thereby complementing techniques to identify and clone Avr effectors recognized by known host resistance proteins. The Zymoseptoria tritici effector AvrStb6 was isolated in this way.
Using crosses between two Swiss strains of Z. tritici, QTL mapping found a confidence interval containing nine candidates for AvrStb6.
Combining this with a GWAS of over 100 different natural isolates led to one candidate, a small cysteine-rich secreted protein that was not present in the original Z. tritici genome annotation.
An additional benefit of using GWAS in effector discovery is that the natural variation in SNP calling identified in wild populations can be used to quantify how each SNP contributes to pathogen virulence (Sánchez-Vallet et al., 2018b). Integrating GWAS with transcriptome datasets, referred to as transcriptome-wide association studies (TWAS) (Wainberg et al., 2019), identifies links between genes and traits across populations and has been used to discover Blumeria graminis f. sp. hordei AVRa effectors, including AVRa9 (Saur et al., 2019a).

| FUNCTIONAL CHARACTERIZATION
Make your work to be in keeping with your purpose.

| Knock out or knock down: let's be disruptive
One of the simplest ways to determine whether a candidate effector contributes to pathogenicity is to disrupt the encoding gene and determine whether the virulence on a susceptible host or the Avr phenotype on a resistant genotype is compromised. Early transformation studies of the C. fulvum effectors relied on double homologous recombination to insert a selectable marker into the target gene encoding a known effector such as ecp1 and ecp2, thus disrupting them (Laugé et al., 1997).
Later sequencing technology allowed transformations without the need for cloning. Mutants of the corn smut fungus Ustilago maydis were made using PCR-based protocols combined with protoplast transformation to generate candidate effector knockout mutants (Schulz et al., 1990;Kämper, 2004). This method is widely used and has successfully facilitated the functional characterization of U. maydis effectors, including Rsp3 and Cce1 (Ma et al., 2018a;Seitner et al., 2018).

Agrobacterium tumefaciens-mediated transformation (ATMT) is another method to disrupt genes and is widely used in plant transformations. ATMT was first used in fungi in budding yeast in 1995 and then the technique was adapted for use in filamentous fungi, including M. oryzae (Bundock et al., 1995; Rho et al., 2001). This method relies on the targeted insertion of a selectable marker into the fungal genome from a disarmed Ti plasmid of transformed Agrobacterium to disrupt the gene of interest. The selectable marker is incorporated into the fungal genome via homologous recombination, a process that occurs easily in yeast. This mechanism, however, is highly variable in filamentous fungi, where nonhomologous end-joining (NHEJ) appears to be the dominant DNA repair pathway over homologous recombination (Meyer et al., 2007; Villalba et al., 2008). The Ku70 protein is part of a complex that regulates the NHEJ pathway (Ninomiya et al., 2004), and its deletion has led to the increase of homologous recombination in M. oryzae from <25% to 80% (Kershaw and Talbot, 2009). Combining ATMT with the generation of ∆Ku70 mutants led to the characterization of the Z. tritici Avr effector AvrStb6.
Another, more recent, method of gene disruption is using the genome-editing system CRISPR-Cas9. Originally identified as an immune mechanism in bacteria and archaea, the CRISPR-Cas9 system is used as a genome-editing tool in plants and animals, and was adapted by Nødvig et al. (2015) for use in filamentous fungi (Mali et al., 2013;Fauser et al., 2014;Nødvig et al., 2015). This technique has led to targeted gene disruption and consequent characterization of effectors in the oomycete P. sojae and the fungal pathogen U. maydis (Fang and Tyler, 2016;Schuster et al., 2018).
There are, however, difficulties in producing stable transformants in phytopathogens that are obligate biotrophs (Thomas et al., 2001;Lorrain et al., 2019). In these cases, knockdown technologies

| In planta expression
When a candidate effector is heterologously expressed in planta, various functional assays can be used to determine the virulence activities of the protein.
Necrosis assays monitor for the induction of HR-like cell death, which can be a result of Avr/R protein/guardee protein interactions or be directly induced by the candidate effector. These assays were first carried out using the model plant Nicotiana tabacum (tobacco), which is infiltrated with transformed Agrobacterium that delivers the effector gene expressed from an inducible promoter into the plant cell for transient protein production (Kamoun et al., 1999;Qutob et al., 2002;Ma et al., 2012).
In 1999 the P. infestans and C. fulvum effectors INF1 and Avr9, respectively, were transformed into either wild-type or Cf-9 transgenic N. tabacum using this method. The assay showed that INF1 was capable of inducing necrosis in wild-type tobacco whilst Avr9 could only do so in transgenic tobacco expressing the corresponding R gene Cf-9 (Kamoun et al., 1999). Later Avr9 and Cf-9 were transiently coexpressed in N. tabacum using agroinfiltration to confirm the induction of HR in the nonhost plant following expression of the Avr/R gene pairs (Van der Hoorn et al., 2000).
Effector characterization in nonhost dicotyledonous model plants may be more suited to high-throughput screening than in cereal hosts. However, these highly artificial scenarios do have several limitations. A negative screen with no visible phenotype upon recombinant expression may indicate either the candidate is not an effector or the effector target/receptor is lacking in the model species. On the other hand, HR-induced necrosis in an effector screen may not be caused by a specific effector/target interaction but by nonhost resistance (NHR) triggered by detection of the candidate (Kettles et al., 2017). Although of interest, by definition the latter scenario would not occur in native host interactions. Therefore, expression assays in the native host may be more useful for functional characterization.

FIGURE 2 The host-induced gene silencing (HIGS) construct encodes an inverted sequence that forms a hairpin double-stranded (ds) RNA following transcription and is introduced into the host plant either by transient or stable transformation. The dsRNA is processed to form small interfering RNA (siRNA), either before or after delivery to the pathogen cell using the plant's innate RNAi machinery. Once inside the fungal cells the siRNA silences the target effector genes by interfering with the target mRNA transcripts (Koch et al., 2018). The movement of small RNA between host and pathogen is detailed by Wang and Dean (2020).

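The inverted-repeat design described in the Figure 2 caption can be sketched in a few lines of Python. This is an illustrative assumption about how such an insert might be laid out (sense arm, spacer, antisense arm), not a published construct: the function names, the spacer, and the toy sequences are hypothetical, and a real design would also consider fragment length, off-target matches, and the choice of vector.

```python
# Translation table for complementing unambiguous DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of an unambiguous DNA sequence."""
    return dna.upper().translate(_COMPLEMENT)[::-1]


def hairpin_insert(target_fragment, spacer):
    """Build sense arm + spacer + antisense arm.

    When transcribed, the two arms are complementary and can pair to form
    a hairpin with a double-stranded stem; the spacer forms the loop.
    """
    sense = target_fragment.upper()
    return sense + spacer.upper() + reverse_complement(sense)


# Toy fragment and spacer (not real effector or intron sequences).
insert = hairpin_insert("ATGC", "TTTT")
```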
Candidate effectors can be transiently expressed in protoplast cells and cell death monitored via the reduction in expression of a co-transfected reporter gene such as β-glucuronidase (GUS) or luciferase (Chen et al., 2006; Lu et al., 2016). This approach was used to identify the cell death-inducing properties of five M. oryzae effectors, including MoCDIP4 (M. oryzae cell death inducing protein 4), in rice protoplasts (Chen et al., 2012) and the NLR-mediated recognition of four newly identified barley powdery mildew avirulence effectors, including AVRa9, in barley (Saur et al., 2019a).
Cell-death suppression assays are used to detect the alteration of the plant immune response induced by a known cell death elicitor.
The overexpression of the stripe rust candidate effector PSTha5a23 in Nicotiana benthamiana suppresses P. infestans INF1-triggered cell death, indicating that PSTha5a23 plays a role in controlling plant defence responses (Cheng et al., 2017).

An alternative method of expressing effectors in plant cells uses the bacterial type III secretion system (T3SS) derived from the tomato bacterial speck pathogen Pseudomonas syringae pv. tomato DC3000 (He et al., 2004). This system was first adapted for filamentous plant pathogens by Sohn et al. (2007) (Whisson et al., 2007; Fabro et al., 2011). Despite T3SS being used to screen effector candidates of stem rust (P. graminis f. sp. tritici) and bean rust (Uromyces appendiculatus), this system is rarely used for fungal effector characterization and has limited success on cereals (Upadhyaya et al., 2014; Qi et al., 2019; Saur et al., 2019b). These problems are linked to the required unfolding and refolding of effectors prior to insertion, especially those rich in cysteine-cysteine bridges.
As well as monitoring for necrosis, or lack thereof, the in planta growth of another pathogenic species can be used as a proxy to

| The viral overexpression system
Due to the limited effectiveness of both T3SS and Agrobacterium-mediated transient expression in most cereal species, viruses have been developed as efficient vectors for heterologous protein expression (viral overexpression, VOX) (Lee et al., 2012).
The barley stripe mosaic virus (BSMV) was first verified as a tool for protein expression when used to overexpress the luciferase reporter gene in protoplast cells and later to express green fluorescent protein (GFP) in planta (Joshi et al., 1990;Haupt et al., 2001;Lawrence and Jackson, 2001). The BSMV vector was adapted for use in the VOX system and used to characterize the function of the fungal effector ToxA (Manning et al., 2010) (Figure 3). However, the compact nature of the virus results in a negative correlation between fragment size and stability of the viral vector (Avesani et al., 2007;Bruun-Rasmussen et al., 2007). BSMV-VOX has been widely used for heterologous expression of proteins up to 150 amino acids; however, as previously stated there is no agreed size limit for an effector (Figure 3a; Bouton et al., 2018).
Another limitation of BSMV for use in effector discovery is that this virus has a tripartite RNA genome (Figure 3b). The heterologous protein is inserted into the γ subgenome, yet all three subgenomes must combine for successful expression in planta, making BSMV-VOX unsuitable for high-throughput screening assays.
The foxtail mosaic virus (FoMV) has been adapted for use in VOX systems in cereals (Bouton et al., 2018). Vectors derived from FoMV such as PV101 avoid many of the caveats of those from BSMV.
FoMV has a monopartite RNA genome and the PV101 vector can be used to successfully express proteins up to 600 amino acids in size. In addition, unlike BSMV vectors, PV101 allows for heterologous expression of proteins in their native form, including possible signal peptides, without the need for processing from proteases that may only be 90% efficient (Bouton et al., 2018). In situations where the effector expressed from the VOX vector rapidly triggers R protein-mediated defences, virus spread is halted and therefore the phenotypic readout in the bioassay is the lack of systemic spread of the recombinant virus (Saintenac et al., 2018).
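The size limits quoted above (proteins up to 150 amino acids for BSMV-VOX and up to 600 amino acids for the FoMV-derived PV101 vector) lend themselves to a quick pre-screen of a candidate list. The sketch below is an illustration only: the function name and the candidate identifiers are hypothetical, and it assumes the stated limits apply to the full-length protein as supplied.

```python
# Upper size limits taken from the text (Bouton et al., 2018).
BSMV_MAX_AA = 150
FOMV_PV101_MAX_AA = 600


def vox_vectors_for(length_aa):
    """List the VOX vectors whose reported size limit accommodates a
    protein of `length_aa` amino acids."""
    if length_aa <= 0:
        raise ValueError("protein length must be positive")
    vectors = []
    if length_aa <= BSMV_MAX_AA:
        vectors.append("BSMV")
    if length_aa <= FOMV_PV101_MAX_AA:
        vectors.append("FoMV PV101")
    return vectors


if __name__ == "__main__":
    # Hypothetical candidates: identifier -> protein length in amino acids.
    candidates = {"cand_A": 120, "cand_B": 310, "cand_C": 742}
    for name, length in candidates.items():
        print(name, length, vox_vectors_for(length) or "too large for either vector")
```

Size is only one criterion; the tripartite BSMV genome and the processing requirements noted above would still need to be weighed separately.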

| Where do they go?
Knowing the localization of candidate effectors within host tissues not only demonstrates that the protein can be translocated from the pathogen to its host, but also suggests where the effector target(s) may be found. Traditionally in situ hybridization assays were done where antibodies were raised against the effector or an added epitope tag and detected using transmission electron microscopy (TEM). Translocation of fungal effectors into the host cell was first shown using an immunocytochemical approach in rusts. The gold- and fluorescence-labelling of four independently raised antibodies to the RTP1p protein in Uromyces fabae and its homolog in Uromyces striatus showed that in later stages of infection RTP1p translocated from the extrahaustorial matrix to inside the plant cell itself (Kemen et al., 2005).
For apoplastic effectors, localization was often determined by means of their isolation. The C. fulvum effectors Avr2, Avr4, Avr9, and Ecp6 were directly isolated from the apoplastic fluid, whereas the P. infestans protease inhibitor EPIC1 was isolated from the apoplast after antibodies were raised (Rooney et al., 2005;Tian et al., 2007;Bolton et al., 2008). Whilst successful, these approaches are laborious, expensive, and not suited to high-throughput screening of either apoplastic or cytoplasmic effector candidates (Dalio et al., 2017).
The nuclear localization of the P. infestans CRN effectors was determined using N-terminal GFP tagging and confocal microscopy.
By overexpressing five GFP-CRN (without the signal peptide) fusion proteins in planta the effectors were shown to accumulate within plant cell nuclei (Schornack et al., 2010). High-throughput screening of 61 candidate effectors (ChECs) from the anthracnose fungus Colletotrichum higginsianum using this method found that whilst nine of the ChECs were imported into the nucleus, others localized to the Golgi bodies, microtubules, and peroxisomes, all novel targets for fungal effectors (Robin et al., 2018).
The U. maydis effectors Cmu1 and Tin2 have been shown to localize to the maize cytoplasm; however, this could not be demonstrated when fluorescently tagged (Djamei et al., 2011;Tanaka et al., 2014;Tanaka et al., 2015). This may be due to the tags inhibiting the partial unfolding of the effectors, thereby preventing their translocation, or the incorrect refolding of the tags themselves upon entering the cytoplasm (Lo Presti et al., 2015).
Whilst investigating the translocation of M. oryzae effectors into rice cells, fluorescent-tagged cytoplasmic effectors were seen to first accumulate in the plant-membrane derived infection structure, the biotrophic interfacial complex (BIC), prior to delivery into the cytoplasm, whereas tagged apoplastic effectors localized to the invasion hyphae (Mosquera et al., 2009;Khang et al., 2010).
The BIC's role in effector translocation could only be confirmed by the addition of a nuclear localization signal (NLS) to cytoplasmic effectors, causing artificial accumulation in the nucleus of the neighbouring rice cells. This approach concentrated the fluorescent signal into discrete foci observable using live cell imaging (Khang et al., 2010).
For apoplastic effectors it is difficult to distinguish between apoplastic or cytoplasmic localization when the fluorescently tagged candidates appear to localize to the plasma membrane or cell wall. and was evenly distributed throughout the enlarged space (Oparka, 1994;Doehlemann et al., 2009).

| A shot in the dark: unbiased screening
Unbiased "forward" screening to find protein-protein interactions (PPI) is a common technique used in many aspects of molecular biology. The yeast two-hybrid system (Y2H), first developed 30 years ago, allows for the large-scale screening of cDNA libraries derived from pathogen-infected plants for effector target identification (Fields and Song, 1989;Mukhtar et al., 2011). Interactions detected by Y2H screens must be validated by additional PPI assays as this approach is prone to false positives.
The most common Y2H validation technique is co-immunoprecipitation (Co-IP). Co-immunoprecipitation is used to screen effector interactors in heterologous systems. When 20 candidate poplar rust fungus (M. larici-populina) effectors were tagged with GFP and expressed in N. benthamiana, five were found to specifically interact with plant proteins by pull-down assays using anti-GFP followed by protein purification (Figure 4a) (Petre et al., 2015).
Biotinylation is also used for proximity labelling based on tools such as BioID. A benefit of proximity labelling over co-immunoprecipitation is the possibility of identifying proteins that only weakly or transiently interact with the target (Figure 4b).
Recently a new proximity labelling tool, TurboID, has been shown to provide more efficient labelling in planta compared to BioID and can also reduce the biotin incubation time from 16 hr to 10 min (Branon et al., 2018;Zhang et al., 2019). These new advances in PPI technology pave the way for higher-throughput effector interaction screening in planta.

| Split-marker complementation
The effector Pep1 is essential for the pathogenicity of the corn smut fungus U. maydis (Doehlemann et al., 2009). The direct interaction between Pep1 and the plant peroxidase POX12 was validated using the bimolecular fluorescence complementation (BiFC) assay (Figure 4c), which involves two parts of a fluorescent marker being fused to candidate interactors. Only when the interactors meet can the full-length fluorescent marker assemble and be detected.
Alternatively, the firefly-derived enzyme luciferase can be used for split-marker complementation. This has the advantage over BiFC for in planta studies because luciferase does not require excitation by light for detection, thereby eliminating autofluorescence interference. However, using split-marker complementation for PPI validation is not infallible as heterologous overexpression of proteins in N. benthamiana can affect protein localization and therefore interactors.

| Structural interactions: pinpointing the surface contacts and their strengths
Knowledge of effector structures whilst in complex with their targets gives us a greater insight into the molecular basis of these cross-kingdom interactions.
The C. fulvum effector Avr4 was one of the first to be characterized from a family of effectors that bind to and protect fungal cell-wall chitin from host chitinase (van den Burg et al., 2006). Recently the crystal structure of Avr4 in complex with its chitin ligand (resolved to 1.95 Å) has highlighted the residues required for this function (Hurlburt et al., 2018). Structural mutant studies have also shown that recognition of Avr4 by the cognate Cf-4 immune receptor does not depend on the same ligand binding as previously thought (Hurlburt et al., 2018).
The crystal structure of the rice intracellular NLR immune receptor Pik in complex with the M. oryzae effector Avr-Pik (1.6 Å resolution) reveals molecular details of the recognition event that leads to HR-induced cell death (Maqbool et al., 2015). The effector surface involved in this interaction was also identified as being involved in the surface interactions between Avr-Pia and the NLR-RATX1 in M. oryzae (Ortiz et al., 2017).
In the past decade protein structures have increasingly been resolved by cryo-electron microscopy (cryo-EM), without the need to form crystals or use damaging X-rays. This technique is widely used to resolve proteins in complexes and has been used to show both the inactive Arabidopsis NLR complex ZAR1-RKS1 and the intermediate form when the complex interacts with a protein modified by the bacterial effector AvrAC (Xanthomonas campestris pv. campestris).
Cryo-EM, despite gaining popularity in structural biology, is unable to resolve proteins smaller than 65 kDa, a size exclusion that would include many fungal and oomycete effectors (Muench et al., 2019).
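To see how the 65 kDa limit relates to effector length, a rough mass estimate can be made from the number of residues. The sketch below assumes an average residue mass of about 110 Da, a common rule of thumb rather than a value from this review; the function names are hypothetical, and the check considers the effector alone rather than any complex it forms with a target.

```python
AVERAGE_RESIDUE_MASS_DA = 110.0   # rule-of-thumb approximation (assumption)
CRYO_EM_MIN_KDA = 65.0            # lower size limit quoted in the text


def estimate_mass_kda(length_aa):
    """Approximate the molecular mass (kDa) of a protein from its length."""
    if length_aa <= 0:
        raise ValueError("protein length must be positive")
    return length_aa * AVERAGE_RESIDUE_MASS_DA / 1000.0


def below_cryo_em_limit(length_aa):
    """Return True when the estimated mass falls under the 65 kDa limit."""
    return estimate_mass_kda(length_aa) < CRYO_EM_MIN_KDA


if __name__ == "__main__":
    for length in (150, 300, 700):
        print(length, round(estimate_mass_kda(length), 1), below_cryo_em_limit(length))
```

An exact mass would require the actual sequence and any post-translational modifications, so this estimate is only a first-pass guide.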
The strength of effector-target interactions can be determined using isothermal titration calorimetry whereby direct measurement of the heat that is either released or absorbed during the molecular binding event gives a complete thermodynamic picture of the reaction, including affinity, enthalpy, and stoichiometry (Duff et al., 2011).
For the conserved M. oryzae MAX effector Avr1-CO39, isothermal titration calorimetry was used to confirm that direct interaction with the heavy-metal associated (HMA) domain of the rice NLR RGA5 was required for effector binding (Guo et al., 2018).
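The thermodynamic quantities reported by isothermal titration calorimetry are linked by the standard relations ΔG = RT ln(Kd) and ΔG = ΔH − TΔS. The sketch below shows how the free energy and the entropic term follow from a measured dissociation constant and enthalpy. The function name and the input values are illustrative assumptions, not measurements from the cited studies.

```python
import math

GAS_CONSTANT = 8.314462618  # J mol^-1 K^-1


def binding_thermodynamics(kd_molar, delta_h_kj_mol, temp_k=298.15):
    """Return (delta_G, minus_T_delta_S) in kJ/mol for a binding event.

    delta_G is derived from the dissociation constant (standard state 1 M);
    the entropic term follows from delta_G = delta_H - T * delta_S.
    """
    if kd_molar <= 0 or temp_k <= 0:
        raise ValueError("Kd and temperature must be positive")
    delta_g = GAS_CONSTANT * temp_k * math.log(kd_molar) / 1000.0
    minus_t_delta_s = delta_g - delta_h_kj_mol
    return delta_g, minus_t_delta_s


if __name__ == "__main__":
    # Hypothetical values: Kd = 1 micromolar, delta_H = -40 kJ/mol.
    dg, mtds = binding_thermodynamics(1e-6, -40.0)
    print(f"delta_G = {dg:.2f} kJ/mol, -T*delta_S = {mtds:.2f} kJ/mol")
```

A more negative ΔG indicates tighter binding, and comparing ΔH with −TΔS indicates whether the interaction is driven mainly by enthalpy or by entropy.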
A greater understanding of how structural interactions aid the specificity of Avr recognition is vital for future work in developing sustainable disease resistance in important food crops.

| EXPLOITING EFFECTOR DISCOVERIES TO CONTROL CROP PLANT DISEASES
Knowing is not enough; we must apply. Willing is not enough; we must do.
Johann Wolfgang von Goethe, Wilhelm Meister's Journeyman Years

FIGURE 4 Protein-protein interaction techniques. (a) Co-immunoprecipitation, effectors are tagged with a peptide sequence such as green fluorescent protein (GFP) and expressed in planta. Antibodies are used to pull down the protein complexes that can then be analysed using liquid chromatography and mass spectrometry (LC-MS/MS). (b) Biotinylation, effectors are fused to mutant biotin ligase enzymes and expressed in vivo. The fusion protein catalyses the biotinylation of interacting and proximal proteins in the presence of biotin. The biotinylated proteins are captured using streptavidin beads (Roux et al., 2012). (c) Bimolecular fluorescence complementation, the effector and putative interactors are tagged with nonfluorescent fragments of yellow fluorescent protein (YFP). Direct interaction of the tagged effectors results in YFP reassembly visualized in vivo or quantified using flow cytometry (Kerppola, 2008;Graciet and Wellmer, 2010;Miller et al., 2015).
The ultimate goal of effector discovery, from identification to characterization to target interactions, is to apply this knowledge to the control of multiple pathogens that threaten our food security.

| "Effectoromics"
For over 100 years disease resistance loci have been introduced into crops and subsequently shuffled through traditional breeding techniques, whether that be as individual genes or stacked to achieve often only short-lived resistance to pathogens (Vleeshouwers et al., 2011;Langner et al., 2018). Despite this, the search for novel R genes with durable or broad-spectrum resistance remains ongoing.
The term "effectoromics" is used to describe the use of effectors in high-throughput screening for R protein function in either the germplasm of crop cultivars or a sexually compatible species. Avr effectors can be harnessed to screen rapidly for HR phenotypes, a hallmark of an ETI response (Vleeshouwers and Oliver, 2014 Solanum species (Takken et al., 2000;Du et al., 2014).
The search for broad-spectrum or more robust R genes for breeding purposes may be more nuanced than previously thought as multiple unrelated R genes can recognize the same pathogen effector (Aguilera-Galvez et al., 2018).

| Screening with necrosis-inducing effectors to remove host susceptibility loci
The necrosis-inducing effector ToxA was isolated from the wheat tan spot fungus P. tritici-repentis in 1996. Infiltration of purified ToxA into the apoplastic space of a susceptible wheat cultivar containing the Tsn1 susceptibility (S) gene is itself sufficient to induce tan spot symptoms (Tomas et al., 1990;Ballance et al., 1996;Ciuffetti et al., 1997;Welti and Wang, 2004). Wheat breeders routinely use the purified toxin to screen all new wheat germplasm to eliminate susceptible lines from their breeding programmes. This method is preferred over screening for molecular markers linked to the corresponding Tsn1 locus due to the ease of application and speed of results (Vleeshouwers and Oliver, 2014). Tsn1 removal from all newly commercially released wheat varieties has improved resistance to tan spot disease and Australia has seen a 26% reduction in ToxA-sensitive wheat grown in the 10 years prior to 2016 (See et al., 2018).

| KEEPING TRACK OF EFFECTOR DISCOVERIES IN MULTIPLE SPECIES IN AN INCREASINGLY DATA-RICH WORLD
A place for everything, and everything in its place.
Idiom from 17th century

In the past two decades effector discovery and characterization have exploded with regard to crop pests and pathogens. This key information is found in multiple original research publications, review articles, UniProt, individual pathogen genome browsers, and species-specific websites. However, to aid future research and guide the direction of work, the genotype and fine phenotyping data surrounding these discoveries and new insights need to be FAIR (Findable, Accessible, Interoperable, and Reusable) to molecular plant pathologists as well as the wider life sciences communities.
Publicly available repositories of curated data regarding proteins with confirmed roles in pathogenicity and virulence are a fundamental tool for effector study. The Pathogen-Host Interactions database (PHI-base, www.phi-base.org) is a manually curated database comprising over 6,780 genes from 268 pathogens of over 210 hosts (September 2019), of which 60% are plants (Urban et al., 2020).

| CONCLUSIONS AND OUTLOOK
"Would you tell me, please, which way I ought to go from here?" "That depends a good deal on where you want to get to," said the Cat.

Lewis Carroll, Alice in Wonderland
Effectors are the mysterious molecular tools evolved and used by plant pathogens in multiple ways. Effector studies are of vital importance in addressing the global food security challenge, yet the explosion in research efforts aimed at understanding effector biology over the last few decades has left us with a dichotomy in our knowledge.  (McGrann et al., 2016;Lopez et al., 2018).
The arrival of full genome sequencing almost two decades ago has been a double-edged sword. Bioinformatic pipelines and the development of prediction software have sped up the refinement of putative effectors whilst simultaneously highlighting the vastness of the gene repertoires to be investigated. For effector characterization, future efficiency depends not only on the development of ultrahigh-throughput functional assays but also on their use in combination with lower-throughput novel and well-established techniques such as QTL mapping and GWAS.
Whilst multiple developments in effector discovery have increased our understanding of these enigmatic proteins, arguably the explosion in effector research can be attributed to the development of three approaches: genome sequencing, bespoke bioinformatic pipelines, and Agrobacterium-mediated transient expression in planta. Armed with only an annotated genome, even understudied conifer-infecting fungal pathogens can be screened for the presence of putative effector proteins (Raffaello and Asiegbu, 2017). With this in mind, genome reannotations and improvements to prediction algorithms continuously widen the pool of effector candidates available, especially in well-studied crop pathogens (Frantzeskakis et al., 2018). Therefore, perhaps the greatest roadblock to effector discovery is the accuracy of genome assembly and annotation, an issue that will take at least 5-10 years to resolve with the inclusion of pangenomes (Cissé and Stajich, 2019).
The genome annotation of multiple isolates through the construction of pathogen pangenomes allows for intraspecific genome analysis and will provide insight into the links between high polymorphisms and host specificity. The use of pangenome analyses has already led to the differentiation between core candidate effectors and novel candidate effectors in Z. tritici and M. oryzae (Singh et al., 2019;Badet et al., 2019). Machine-learning-based prediction tools as well as the robotic implementation of practical molecular techniques should help to fast track the progress from effector prediction to characterization. This anticipated progress will undoubtedly erode some of the disparity in our interspecies knowledge and lift the veil on the enigmatic filamentous phytopathogen effector repertoire.
Many novel functions, locations, interactions, and generic underlying themes remain to be discovered.

ACKNOWLEDGMENTS
We would like to acknowledge the invaluable input given by friends Grant (BB/P016855/1). The PHI-base is funded from two UK BBSRC grants BB/K020056/1 and BB/S020020/1.

DATA AVAILABILITY STATEMENT
Data sharing is not applicable to this article as no new data were created or analysed in this study.