EffectorK, a comprehensive resource to mine for Ralstonia, Xanthomonas, and other published effector interactors in the Arabidopsis proteome

Abstract Pathogens deploy effector proteins that interact with host proteins to manipulate the host physiology to the pathogen's own benefit. However, effectors can also be recognized by host immune proteins, leading to the activation of defence responses. Effectors are thus essential components in determining the outcome of plant–pathogen interactions. Despite major efforts to decipher effector functions, our current knowledge on effector biology is scattered and often limited. In this study, we conducted two systematic large‐scale yeast two‐hybrid screenings to detect interactions between Arabidopsis thaliana proteins and effectors from two vascular bacterial pathogens: Ralstonia pseudosolanacearum and Xanthomonas campestris. We then constructed an interactomic network focused on Arabidopsis and effector proteins from a wide variety of bacterial, oomycete, fungal, and invertebrate pathogens. This network contains our experimental data and protein–protein interactions from 2,035 peer‐reviewed publications (48,200 Arabidopsis–Arabidopsis and 1,300 Arabidopsis–effector protein interactions). Our results show that effectors from different species interact with both common and specific Arabidopsis interactors, suggesting dual roles as modulators of generic and adaptive host processes. Network analyses revealed that effector interactors, particularly “effector hubs” and bacterial core effector interactors, occupy important positions for network organization, as shown by their larger number of protein interactions and centrality. These interactomic data were incorporated in EffectorK, a new graph‐oriented knowledge database that allows users to navigate the network, search for homology, or find possible paths between host and/or effector proteins. EffectorK is available at www.effectork.org and allows users to submit their own interactomic data.


| INTRODUCTION
Plants are continuously confronted with a wide variety of pathogens, including bacteria, oomycetes, fungi, nematodes, and insects.
To prevent their proliferation, plants have evolved a complex multilayered immune system (Jones and Dangl, 2006). Plants are able to recognize highly conserved pathogen-associated molecular patterns (PAMPs) through pattern-recognition receptors, triggering induced defence responses collectively known as PAMP-triggered immunity (PTI) (Zipfel, 2014). These responses are usually enough to stop most potential invaders; however, some pathogens secrete effector proteins to subvert the defence responses and alter diverse cellular processes to ease their proliferation (Ma et al., 2018). Plants, in turn, have evolved several intracellular nucleotide-binding site-leucine-rich repeat (NBS-LRR) receptors that recognize these effectors and activate potent defence responses collectively known as effector-triggered immunity (ETI) (Cui et al., 2015).
Although the interactors and molecular functions of some effectors have been characterized (Büttner, 2016; Giron et al., 2016; Sharpee and Dean, 2016; Vieira and Gleason, 2019), for most effectors they are still unknown. The main factors complicating the large-scale identification and characterization of effector-host protein interactions are the wide diversity of pathosystems, the difficulty in identifying bona fide effector genes, the collective contribution of effector proteins, the complexity of the host responses, and the lack of robust high-throughput techniques. For the model species Arabidopsis thaliana (Ath), to our knowledge, there are only two studies in which systematic effector-host protein interactions at the effectome-scale have been identified (Mukhtar et al., 2011; Weßling et al., 2014). In these studies, plant interactors of effector proteins from Pseudomonas syringae (Psy, bacterium), Hyaloperonospora arabidopsidis (Hpa, oomycete), and Golovinomyces orontii (Gor, fungus) were identified by yeast two-hybrid (Y2H) assays. They reported that the effectors of these species converged onto a limited set of Ath proteins. These studies also demonstrated that many effector interactors are important for plant immunity and showed that their importance correlates with the level of effector convergence. Bacterial wilt, caused by Ralstonia pseudosolanacearum (Ralstonia solanacearum phylotype I, Rps), and black rot, caused by Xanthomonas campestris pv. campestris (Xcc), are listed among the top 10 scientifically and economically important plant bacterial diseases (Mansfield et al., 2012). Both Rps and Xcc are xylem-colonizing bacteria able to infect the model plant Ath (Deslandes et al., 1998; Buell, 2002).
They both rely on their type III secretion system for full virulence (Arlat et al., 1991, 1992). This "molecular syringe" allows the pathogen to deliver type III effector proteins (T3Es) directly into the host cell in order to promote disease. The roles of several of their T3Es have been characterized (White et al., 2009; Coll and Valls, 2013), but most knowledge on T3E functions comes from the study of Psy, which resides on leaf surfaces and in the leaf apoplast (Lindeberg et al., 2012; Büttner, 2016). Focusing mainly on a few species offers a partial view of effector biology. It is therefore crucial to expand our studies to other species to grasp the existing diversity of effector proteins and pathogen lifestyles.
To obtain a deeper understanding of the global Ath-effector protein interactome, we conducted three systematic large-scale screenings with T3Es from Rps and Xcc, the first vascular pathogens screened in this manner. Additionally, we conducted an extensive literature survey to gather published Ath interactors of effector proteins from pathogens from four different kingdoms of life: Bacteria, Chromista, Fungi, and Animalia. Combining all these data allowed us to identify 100 new "effector hubs" (i.e., Ath proteins interacting with two or more effectors). Together with Ath-Ath protein interactions retrieved from public databases, we generated an Ath-effector protein network that captures the wide diversity of Ath pathogens.
This network allowed us to detect general trends of effector interference with the host proteome. We have created a publicly available interactive knowledge database called EffectorK (for Effector Knowledge) that allows users to access and augment this network.

| Systematic identification of Arabidopsis interactors of R. pseudosolanacearum and X. campestris effectors
Three Y2H screenings were performed to identify Ath interactors of Rps and Xcc effector proteins. In a first screening, we identified 42 Ath interactors for 21 out of 56 T3Es from Rps strain GMI1000 screened against a library of more than 8,000 full-length Ath cDNAs (8K space). The choice of the 56 Rps T3Es was guided by the available clones at the time of screening. In the second and third screenings, we identified 176 Ath interactors for 32 out of 48 T3Es from Rps strain GMI1000 and 52 Ath interactors for 18 out of 25 T3Es from Xcc strain 8004 screened against an extended version of the previous library containing more than 12,000 Ath full-length cDNAs (12K space) (Figure S1 and Table S1). Here the choice of Rps T3Es was constrained by a pool maximum imposed by the screening method (see Materials and Methods). T3Es were picked according to their highest degree of conservation within the species complex (Peeters et al., 2013). On average, 10.7 and 5.7 Ath interactors were found per Rps and Xcc T3E, respectively. These Ath cDNA libraries had been previously used to test interactions with 57 and 32 effector proteins from Hpa and Psy, respectively (8K space), and 46 effector proteins from Gor (12K space) (Mukhtar et al., 2011; Weßling et al., 2014). The subset of interactions of effectors from Rps, Xcc, and Gor in the 8K space was used to compare with previously published Hpa and Psy data (Figure 1). In general, Rps effectors interacted on average with more Ath proteins than those of the other screened species; however, this difference is only statistically significant when compared to Gor effectors (one-tailed Wilcoxon signed-rank test p < .001). These data show that effector proteins from these five different species, on average, tend to interact with a similar number of Ath proteins regardless of kingdom, lifestyle, or effectome size.

KEYWORDS database, effectors, interactomics, network, Ralstonia, Xanthomonas

| Effectors converge onto a limited set of Arabidopsis proteins
We compared the Rps and Xcc effector interactors identified in our screenings with the interactors previously identified for Hpa, Psy, and Gor effector proteins (Mukhtar et al., 2011;Weßling et al., 2014).
To avoid bias related to the size of the screened library, we considered only the subset of effector interactors present in the 8K space (Figure S2) [...] Gor. The total number of species-specific effector interactors was 221 out of 299 (73.9%). These data show that most effector interactors are kingdom- and species-specific.
To evaluate whether Rps and Xcc effectors interact randomly or converge onto a common set of Ath proteins we performed simulations rewiring effector-Ath protein interactions within the 8K space.
In these simulations, each effector was assigned randomly as many Ath proteins as it had interacted with in our screenings. Then, the number of interactors found on all simulations was plotted and compared with the experimental data ( Figure 2a). The number of effector interactors observed in our screenings was significantly lower than the numbers obtained in the random simulations for both Rps and Xcc. Similar results had been reported for effectors from Hpa, Psy, and Gor (Mukhtar et al., 2011;Weßling et al., 2014). This shows that, similarly to other species, both Rps and Xcc effectors also interact with a common subset of Ath proteins (i.e., intraspecific convergence).
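The rewiring test described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code used in the study: the function names, the toy interaction data, and the number of simulations are hypothetical. Each effector keeps its observed number of interactors but draws them uniformly at random from the screened library, and the observed number of distinct interactors is compared with the simulated distribution.

```python
import random

def count_distinct_interactors(interactions):
    """Number of distinct host proteins bound by at least one effector."""
    return len(set().union(*interactions.values())) if interactions else 0

def rewire_once(interactions, library, rng):
    """Assign each effector as many random library proteins as it had in the screen."""
    return {effector: set(rng.sample(library, len(targets)))
            for effector, targets in interactions.items()}

def convergence_p_value(interactions, library, n_sim=1000, seed=0):
    """Empirical one-sided p value: share of simulations yielding as few or
    fewer distinct interactors than observed (fewer = more convergence)."""
    rng = random.Random(seed)
    observed = count_distinct_interactors(interactions)
    simulated = [count_distinct_interactors(rewire_once(interactions, library, rng))
                 for _ in range(n_sim)]
    as_extreme = sum(1 for s in simulated if s <= observed)
    return observed, (as_extreme + 1) / (n_sim + 1)

# Toy data with hypothetical identifiers: three effectors sharing two host proteins.
library = [f"P{i}" for i in range(200)]
observed_net = {
    "E1": {"P1", "P2", "P3"},
    "E2": {"P1", "P2", "P4"},
    "E3": {"P1", "P2", "P5"},
}
observed, p_value = convergence_p_value(observed_net, library)
```

In this toy case the three effectors cover only five distinct proteins, far fewer than random assignment from a 200-protein library would typically give, so the empirical p value is small.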
These random rewiring simulations also allowed us to determine whether effectors from different species interact randomly or convergently with Ath proteins. For this, the number of common interactors of effectors from different species was compared with the experimental data ( Figure 2b). When comparing all three kingdoms, the number of common interactors observed was significantly higher than expected by random rewiring. We then analysed all possible binary, ternary, quaternary, and quinary combinations of species and in all cases the number of common interactors observed was higher than expected randomly ( Figure 2c). These differences were all statistically significant except for the common interactors of effectors from Psy and Xcc (p = .058; Figure S3). This could indicate that these two species are the most different in terms of effector targeting.
However, considering that Psy and Xcc are precisely the two species with the lowest number of effectors for which interactors have been identified (Psy: 32 and Xcc: 18 effector proteins), it is likely that the high p value is caused by the limited sample size. This shows that effectors from all these five species interact with a common subset of Ath proteins (i.e., interspecific convergence).
Altogether, our data indicate that Rps and Xcc effectors converge both intra- and interspecifically onto a limited set of Ath proteins, behaving similarly to effectors from other previously screened pathogen species. This suggests the existence of a convergent set of effector interactors common to evolutionarily distant pathogens that might have a predominant role in the general modulation of the host responses.

Arabidopsis-effector protein interactions
In order to gather more knowledge on Ath-effector protein interactions, we conducted an extensive literature search compiling data from a wider spectrum of bacterial, fungal, oomycete, and invertebrate effector proteins. We only considered published direct protein-protein interactions that had been confirmed by classic techniques such as Y2H, co-immunoprecipitation, pull-down, protein-fragment complementation, fluorescence resonance energy transfer, or mass spectrometry. We compiled 287 interactions found in 80 peer-reviewed publications involving 218 Ath proteins and 72 effectors from 22 pathogen species (Table S2).
Among these 22 pathogens, there were nine bacterial species, mostly proteobacteria but also a phytoplasma species; eight invertebrate species, including both nematodes and insects; four oomycete species; and one fungal species. While this collection of species does not represent the full diversity of Ath pathogens, it covers the majority of pathogens for which effector interactors have been reported. Although fungi are one of the major pathogen classes, few studies have described fungal effector interactors. This illustrates one of the current gaps in our knowledge of effector interactors in Ath.

FIGURE 1 Arabidopsis thaliana (Ath) degree of effector proteins from Golovinomyces orontii (Gor), Hyaloperonospora arabidopsidis (Hpa), Pseudomonas syringae (Psy), Xanthomonas campestris pv. campestris (Xcc), and Ralstonia pseudosolanacearum (Rps). Comparison of the Ath degree (i.e., number of Ath interactors per effector) of effector proteins from Gor, Hpa, Psy, Xcc, and Rps found in the 8,000-Ath-cDNA collection (8K space). Horizontal black bars represent the median. Colours represent the kingdom (orange: Fungi, yellow: Chromista, and blue: Bacteria)

FIGURE 2 Effectors converge intra- and interspecifically onto a common set of Arabidopsis thaliana (Ath) proteins. (a) Left: random and intraspecific convergent interactions of effectors (purple squares) with Ath proteins (green circles) can be distinguished by random network rewiring and simulation. Adapted from Weßling et al. (2014). Middle and right: number of Ath interactors in the 8K space of effectors from Xanthomonas campestris pv. campestris (Xcc) strain 8004 and Ralstonia pseudosolanacearum (Rps) strain GMI1000 found in 10,000 degree-preserving simulations (grey) versus the observed number (red arrow). (b) Left: random and interspecific convergent interactions of effectors from different species (purple and orange squares) with Ath proteins (green circles) can be distinguished by random network rewiring and simulation. Right: number of common Ath interactors in the 8K space of effectors from Chromista, Bacteria, and Fungi found in 10,000 simulations (grey) versus the observed number (red arrow). (c) Scatterplot of observed versus simulated number of common Ath interactors between all binary, ternary, quaternary, and quinary combinations of species. x = y regression is represented with a dashed grey line

| Identification of one hundred new "effector hubs"
To compare experimental and published data, we combined all the interactions curated from the published data together with data from our large-scale Y2H screenings. This resulted in a total of 564 different Ath proteins interacting with pathogen effectors.
Our screenings on Rps and Xcc effectors identified 235 interactors. Similar published screenings on Psy, Gor, or Hpa effectors had identified 200 interactors (Mukhtar et al., 2011; Weßling et al., 2014). The literature curation allowed us to identify 218 effector interactors. Of the 235 Rps and Xcc effector interactors found in our screenings, 166 were new, which represents 29.4% of the total interactors compiled in this study (Figure 3). This highlights the potential of such systematic and high-throughput large-scale screenings in identifying novel effector interactors. The average effector degree (i.e., the number of effectors interacting with a given Ath protein) was 2.3 but it was unevenly distributed among the 564 interactors, with 350 of them interacting with only one effector (62%) and 14 interacting with more than 10 effectors (2.5%) (Figure S4). The contribution of our experimental data was important in the identification of single interactors, as we identified 93 out of the 350 (26.6%). More remarkable was the contribution to the identification of "effector hubs," defined here as Ath proteins interacting with two or more effectors (Figure 4). The definition of "hub" has been debated and it has been traditionally associated with proteins that are highly connected in interactomic networks (Vandereyken et al., 2018). Our definition of "effector hub" came from the need to designate the Ath proteins that interact with several effectors and is based exclusively on the number of interacting effector proteins. We identified 100 new effector hubs and increased the degree of 42 previously described effector hubs (Table S3).
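The effector degree and the "effector hub" definition given above translate directly into a small computation. The following Python sketch is only an illustration: the function names and the effector and protein identifiers are hypothetical placeholders, not entries from Tables S1 to S3.

```python
from collections import defaultdict

def effector_degree(interactions):
    """Map each Ath protein to the number of distinct effectors it interacts with."""
    partners = defaultdict(set)
    for effector, ath_protein in interactions:
        partners[ath_protein].add(effector)
    return {protein: len(effectors) for protein, effectors in partners.items()}

def effector_hubs(interactions, min_effectors=2):
    """Ath proteins interacting with at least `min_effectors` effectors."""
    degrees = effector_degree(interactions)
    return {protein for protein, degree in degrees.items() if degree >= min_effectors}

# Hypothetical (effector, Ath protein) pairs; duplicates are counted once.
pairs = [
    ("Eff1", "AthA"), ("Eff2", "AthA"),
    ("Eff1", "AthB"),
    ("Eff1", "AthC"), ("Eff2", "AthC"), ("Eff3", "AthC"), ("Eff3", "AthC"),
]
degrees = effector_degree(pairs)
hubs = effector_hubs(pairs)
mean_effector_degree = sum(degrees.values()) / len(degrees)
```

Here AthA and AthC qualify as effector hubs, AthB is a single interactor, and the mean effector degree is 2.0.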
To evaluate the potential relevance of the newly identified effector hubs in plant immunity, we conducted a second literature survey to check if the corresponding Ath genes had previously reported functions in plant immunity or in pathogen fitness in planta (Table 1).
Sixteen out of the 100 new effector hub genes have already been described for their altered infection or other immunity-related phenotype when mutated, silenced or overexpressed. Additionally, the orthologs of three other new hubs in other plant species also produced altered infection phenotypes when silenced or overexpressed. A total of 19 out of the 100 newly identified effector hubs have already been shown to be involved in biotic stress responses.
Considering that many of the remaining newly defined effector hubs have been poorly characterized (e.g., hypothetical proteins or descriptions based on homology or belonging to a protein family), it is likely that the number of effector hubs involved in immunity was underestimated. This constitutes a valuable source of novel candidates for further functional characterization.
In terms of organism of origin, most of the 564 interactors are bacterial effector interactors, as could be expected considering that 132 out of the 266 total effectors compiled came from bacteria ( Figure S4). In the case of effector hubs, it is noteworthy that 133 out of the 214 hubs described in this work interact with effectors from a single kingdom while there are only 64, 16, and one hubs interacting with effectors from two, three or four different kingdoms, respectively (Table S3). Although biased by the structure of the data, this could suggest kingdom specificity of effector targeting.

FIGURE 3 Overlap among effector interactors depending on the origin of the data set. Area-proportional Venn diagram showing the overlap among effector interactors identified in the large-scale yeast two-hybrid (Y2H) screenings performed in this study, in similar large-scale Y2H screenings already published, and in the manual curation of the literature. The total number of effector interactors coming from each data set is indicated in parentheses

| Construction of an interaction network involving Arabidopsis and effector proteins
We constructed an Ath-effector protein interaction network compiling the previously described experimental and literature-compiled data with Ath-Ath protein interactions from public databases and the literature (Stark et al., 2006; Dreze et al., 2011; Orchard et al., 2014; Smakowska-Luzan et al., 2018). From the total of 49,500 interactions compiled in this study, 48,597 were grouped into a single connected component constituting what we defined as our Ath-effector interactomic network (Table S4).
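Restricting a compiled interaction list to its main connected component, as described above, can be sketched as follows. This is an illustrative Python example, not the pipeline used for Table S4; the function names and the toy edge list are assumptions.

```python
from collections import defaultdict, deque

def connected_components(edges):
    """Return the connected components (sets of proteins) of an undirected edge list."""
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    seen, components = set(), []
    for start in adjacency:
        if start in seen:
            continue
        seen.add(start)
        queue, component = deque([start]), {start}
        while queue:  # breadth-first traversal from `start`
            node = queue.popleft()
            for neighbour in adjacency[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    component.add(neighbour)
                    queue.append(neighbour)
        components.append(component)
    return components

def edges_in_largest_component(edges):
    """Keep only the interactions whose two partners lie in the largest component."""
    largest = max(connected_components(edges), key=len)
    return [(a, b) for a, b in edges if a in largest and b in largest]

# Hypothetical toy network: one main component plus an isolated pair.
toy_edges = [("Eff1", "AthA"), ("AthA", "AthB"), ("AthB", "AthC"), ("AthX", "AthY")]
main_network = edges_in_largest_component(toy_edges)
```

The isolated AthX-AthY interaction is dropped, which mirrors how interactions outside the single connected component are excluded from the network.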

| Effector interactors tend to occupy key positions in the Arabidopsis-effector protein interaction network
To further investigate the potential impact of effectors on the plant interactome, we evaluated the importance of their interactors for the organization of the network. We focused on two main network topology parameters: "degree" and "betweenness centrality" (Figure 4).
The "degree" of a protein represents the number of proteins that it interacts with. In this study we differentiated two types of degrees depending on the nature of the interacting proteins: the Ath degree of a given effector or Ath protein (i.e., the number of interacting Ath proteins) and the effector degree of a given Ath protein (i.e., the number of interacting effector proteins). The "betweenness centrality" of a protein is the fraction of all shortest paths connecting two proteins from the network that pass through it. There are two main types of key proteins in a network (Li et al., 2017): (a) proteins important for local network organization, typically showing high degree, and (b) proteins important for the global diffusion of the information through the network, characterized by high betweenness centrality. It had been previously reported in more limited networks that effectors tend to interact with host proteins with high degree and high centrality (Memišević et al., 2015; Li et al., 2017; Ahmed et al., 2018). We then [...] (Table 2). Indeed, the area under the curve value of effector interactors was higher than the value of the rest of the Ath proteins. This indicates that effector interactors generally have a higher Ath degree than the rest of the Ath proteins. Similarly, we compared the betweenness centrality of these two groups of proteins (Table 2 and Figure S5). Effector interactors also presented significantly higher betweenness centrality values than the rest of the Ath proteins.
Altogether, these results indicate that effectors preferentially interact with Ath proteins that are more connected to other Ath proteins and that occupy more central positions in the interactomic network as reported for smaller networks (Li et al., 2017;Ahmed et al., 2018).
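For readers who want to reproduce the two topology parameters on their own data, the sketch below computes the degree and a normalised betweenness centrality for an unweighted, undirected network. It uses Brandes' algorithm, which is a standard choice but an assumption here, since the text does not state which software was used; the function names and the toy network are hypothetical.

```python
from collections import defaultdict, deque

def build_adjacency(edges):
    """Undirected adjacency sets from a list of (protein, protein) interactions."""
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    return adjacency

def betweenness_centrality(adjacency):
    """Brandes' algorithm; scores are divided by the number of ordered node
    pairs excluding the node itself, so they range from 0 to 1."""
    nodes = list(adjacency)
    score = dict.fromkeys(nodes, 0.0)
    for source in nodes:
        # Breadth-first search counting the shortest paths from `source`.
        order, predecessors = [], {v: [] for v in nodes}
        n_paths = dict.fromkeys(nodes, 0)
        n_paths[source] = 1
        distance = {source: 0}
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adjacency[v]:
                if w not in distance:
                    distance[w] = distance[v] + 1
                    queue.append(w)
                if distance[w] == distance[v] + 1:
                    n_paths[w] += n_paths[v]
                    predecessors[w].append(v)
        # Accumulate pair dependencies, starting from the farthest nodes.
        dependency = dict.fromkeys(nodes, 0.0)
        for w in reversed(order):
            for v in predecessors[w]:
                dependency[v] += n_paths[v] / n_paths[w] * (1 + dependency[w])
            if w != source:
                score[w] += dependency[w]
    n = len(nodes)
    ordered_pairs = (n - 1) * (n - 2)
    return {v: (s / ordered_pairs if ordered_pairs else 0.0) for v, s in score.items()}

# Hypothetical toy network: protein H links A, B, and C; C also links D.
adjacency = build_adjacency([("H", "A"), ("H", "B"), ("H", "C"), ("C", "D")])
degree = {protein: len(neighbours) for protein, neighbours in adjacency.items()}
centrality = betweenness_centrality(adjacency)
```

In this toy network H has the highest degree (3) and lies on five of the six shortest paths between the other proteins, whereas the peripheral proteins A, B, and D have a betweenness centrality of zero.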

Arabidopsis-effector interaction network
We then wanted to test if the Ath degree and betweenness centrality values differed among distinct types of effector interactors (Table 2 and Figure S5). First, we compared multipathogen and pathogen-specific interactors as previously described (Figure S2). Multipathogen effector interactors presented significantly higher Ath degree and betweenness centrality compared to pathogen-specific effector interactors. We also compared effector hubs with single effector interactors. Similarly, effector hubs also showed higher betweenness centrality and Ath degree than single effector interactors. This last observation implies that an Ath protein that interacts with several effectors tends also to interact with more Ath proteins. To evaluate whether this is biologically relevant or a bias of the "stickiness" of a protein, we compared the Ath and effector degree values of all effector interactors. Our results showed that these two parameters are only weakly correlated (Pearson correlation coefficient = 0.3221; Figure S6). This suggests that effector hubs interact with more Ath proteins than single effector interactors and that this is not due to a higher stickiness of these proteins. Altogether, these results show that the general tendencies of effector interactors (i.e., more connected to other Ath proteins and more central in the Arabidopsis-effector interaction network) are stronger among effector hubs compared to single interactors, and among multipathogen effector interactors compared to pathogen-specific interactors. This reflects the importance of interfering with key position proteins for the modulation of host-pathogen interactions.

FIGURE 4 Network topology parameters. Example of a simple interactomic network of three effector proteins (purple squares) and nine Arabidopsis thaliana (Ath) proteins (green circles) to illustrate our definition of "effector hub" (i.e., Ath protein interacting with two or more effectors; highlighted in red) and the three network topology parameters analysed in this study. 1, Effector degree: number of effectors that interact with a given Ath protein; 2, Ath degree: number of Ath proteins that interact with a given effector or Ath protein; 3, Betweenness centrality: fraction of all shortest paths connecting two proteins from the network that pass through a given protein

(Table 1, footnote b) Orthologous gene in other plant species, as defined by EnsemblPlants (Kersey et al., 2018), characterized for a role in immunity.
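The "stickiness" check above relies on the Pearson correlation between the Ath degree and the effector degree of each interactor. A minimal Python sketch of that coefficient is given below; the function name and the example values are hypothetical and are not taken from Figure S6.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long numeric sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length (at least 2 values)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return covariance / sqrt(var_x * var_y)

# Hypothetical degrees for five effector interactors (perfectly proportional).
ath_degree = [1, 2, 3, 4, 5]
eff_degree = [2, 4, 6, 8, 10]
r_example = pearson_r(ath_degree, eff_degree)
```

A coefficient close to 1 or -1 would point to a strong linear relationship between the two degrees, while values near 0 indicate that a protein's number of effector partners says little about its number of Ath partners.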

| Bacterial core T3Es interact with more connected and central Ath proteins
Our work on Rps and Xcc together with previous work on Psy T3Es [...] on bacterial pathogen species for which other resources have been generated, particularly in terms of abundance and diversity of sequenced genomes and thus curated T3E repertoires (Lindeberg et al., 2012; Guy et al., 2013; Peeters et al., 2013; Roux et al., 2015; Dillon et al., 2019; Sabbagh et al., 2019). The most conserved set of T3Es, or "core effectome," from each of the three bacterial species has been previously defined (Guy et al., 2013; Dillon et al., 2019; Sabbagh et al., 2019). We then tested whether these subsets of T3Es behaved differently from the rest of bacterial T3Es in terms of interaction with host proteins (Table 2 and Figure S7). Our data showed that core and variable T3Es from the three species do not differ in either Ath degree or betweenness centrality. We then tested if there were any differences between the network properties of the interactors of core T3Es and the other bacterial T3E interactors. Core T3E interactors showed higher effector degree, Ath degree, and betweenness centrality than the rest of the interactors of bacterial T3Es. This suggests that, although core T3Es in general do not have more interactors than the rest of bacterial T3Es, they do interact with more highly connected and central Ath proteins. This might imply that core T3Es have a larger potential to interfere with the host interactome, which could explain the selective pressure to maintain them in the majority of strains.

TABLE 2 Cumulative Ath and effector degrees and betweenness centrality of different groups of effector interactors

| EffectorK, an online interactive knowledge database to explore the Arabidopsis-effector interactomic data
In order to facilitate the access and exploration of all the data presented in this work, we have generated EffectorK (for "Effector Knowledge") [...] Additionally, EffectorK also allows users to find the shortest paths between two queried proteins in the network.
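The idea of a shortest path between two queried proteins can be illustrated with a breadth-first search on an unweighted interaction network. The Python sketch below is only a conceptual example; it does not reproduce the EffectorK implementation, returns a single shortest path rather than all of them, and uses hypothetical function and protein names.

```python
from collections import defaultdict, deque

def shortest_path(edges, source, target):
    """Return one shortest path between two proteins, or None if they are not connected."""
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    if source not in adjacency or target not in adjacency:
        return None
    previous = {source: None}  # also serves as the set of visited proteins
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            path = []
            while node is not None:  # walk back to the source
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbour in adjacency[node]:
            if neighbour not in previous:
                previous[neighbour] = node
                queue.append(neighbour)
    return None

# Hypothetical toy network.
toy_edges = [("Eff1", "AthA"), ("AthA", "AthB"), ("AthB", "AthC"),
             ("Eff1", "AthD"), ("AthX", "AthY")]
path = shortest_path(toy_edges, "Eff1", "AthC")
```

Because breadth-first search explores proteins in order of increasing distance from the query, the first time the target is reached the recovered path has the minimum possible number of interactions.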
In order to update, expand, and further improve EffectorK, we encourage users to submit their own interactomic data by filling in and sending a dedicated template available on the site. These data will be verified by the curator team prior to their incorporation in the database. More information about usage, content, and data submission is accessible online, under the tabs "Help" and "Contribute" of the database web server. Please contact us by email at contact@effectork.org if you have any questions or suggestions.

| DISCUSSION
In this study we systematically identified Ath interactors of effectors from the vascular bacterial pathogens Rps and Xcc. We combined this information with other Ath interactors identified in similar experimental setups. Additionally, we conducted an extensive literature review to gather published Ath interactors of effectors from a wide variety of pathogens, including other bacterial species and also oomycete, fungal, and invertebrate pathogens. Studying this combined interactomic dataset allowed us to identify new trends of how effectors interfere with the plant proteome and evaluate whether previously described network principles were still supported on a wider scale. We showed that there are no substantial differences in terms of connectivity among the effectomes of five different pathogen species screened systematically (Figure 1). We have reinforced previously described intra- and interspecific convergence of effector targeting with effectors from two new species (Mukhtar et al., 2011; Weßling et al., 2014), and showed at the same time that most effector interactors are pathogen specific (Figures 2 and S2).
Our analyses also supported the previously described tendency of effectors to interact with plant proteins better connected and central in the network (Li et al., 2017; Ahmed et al., 2018), and showed that this tendency is even stronger among effector hubs, multipathogen interactors, and bacterial core T3E interactors (Table 2 and Figure S5).

FIGURE 6 Graphical representation of interactomic data on EffectorK. Graphical representation of interactomic data from Xcc effector XopAC (AvrAC). XopAC, in purple, interacts with 36 Ath proteins, in green (only 12 shown for better visualization). The size of a protein node is proportional to its degree (e.g., CSN5B interacts with 50 proteins, BIK1 with six, and APK1A only with XopAC). The thickness of the connecting edges indicates the level of confidence: narrow edges represent physical interaction detected by only one technique, whereas thick edges indicate that the interaction has been detected by at least two independent techniques (e.g., XopAC interaction with BIK1 has been detected by coimmunoprecipitation and pulldown assays, whereas the interaction with APK1A, only by Y2H)

| The balance between interactor specificity and convergence
Our data showed that most effector interactors were pathogen-specific (Figure S2) but at the same time effectors converge interspecifically onto a small subset of Ath proteins (Figure 2b) [...] This was the case when we compared the percentage of species-specific interactors of effectors from Hpa, Psy, and Gor, which decreased from 73.9%, 64.9%, and 46.7% in previous works (Mukhtar et al., 2011; Weßling et al., 2014) to 51.7%, 58.9%, and 35.6%, respectively, in the present study (Figure S2). Nevertheless, a total of five screened species is probably not powerful enough to sustain this claim. (b) If, in contrast, the interactor specificity increased with the number of screened species, it would mean that the different pathogens have evolved unique ways to modulate the interaction with the host. If this were the case, deeper analyses comparing related pathogens (e.g., species with similar lifestyle or from the same kingdom) could allow the identification of trait-specific interactors (e.g., effector interactors exclusive to vascular pathogen effectors). In any case, to better understand the similarities and particularities of how effectors modulate host processes, it is essential to increase the number of pathogen species screened for effector interactors at the effectome-scale.

| Large-scale screenings fill the gap in the identification of effector interactors
Including manually curated data from the literature has allowed us to significantly broaden the diversity of plant pathogen species compared to similar studies. However, 346 out of the 564 described Arabidopsis effector interactors have been identified exclusively through large-scale Y2H screenings against partial libraries of Ath cDNAs. As with any other large-scale screening, the technical limitations together with the incompleteness of the library might have led to an underestimation of the plant-effector interactome of the five screened species (Brückner et al., 2009). The relatively small overlap between the large-scale Y2H screenings and manually curated literature data sets might be a consequence of this limitation (Figure 3). This small overlap illustrates the current knowledge gap in the characterization of the full plant interactome of pathogen effectors. Extensive work will be required to characterize further effector-host protein interactions in other pathosystems. Y2H is one of the simplest yet most powerful high-throughput techniques for protein-protein interaction detection, and our work, like others before, highlights the potential of large-scale Y2H screenings to identify novel effector interactors in an easy, cheap, and systematic manner.

| EffectorK, an entry point to explore and make sense of plant-effector interactomics
To conclude, our work also provides valuable resources for the plant-pathogen interaction community. We described 540 new Ath-Rps and Ath-Xcc effector protein interactions that allowed us to identify 166 new effector interactors (Table S1). We also manually curated several publications to assemble a collection of 287 Ath-effector protein interactions from a wide variety of pathogens (Table S2). All this allowed us to identify 100 novel effector hubs (Table S3). The contribution to plant immunity of these effector hubs has been described for 19 of them, but remains untested for the majority (Table 1). This constitutes a list of promising candidates for further functional characterization. All these data were integrated in EffectorK, a knowledge database where users can have easy access to the Ath-effector protein interactions and explore the resulting interactomic network visually and interactively.
While major efforts were made to capture the maximal diversity on the pathogen side, we limited our work to the Arabidopsis plant model. Thanks to the built-in homology search tools available, users can also use their own data as query regardless of the species studied. It is therefore feasible to use EffectorK as a starting point to build on and extend to crop plant-effector protein interactomics. In the long term, these data could be exploited to better understand how pathogens interact with these crops with the prospect of selecting breeding candidates for improved tolerance or resistance against pathogens.

| Cloning of Rps and Xcc T3E genes
All T3E genes from Rps and Xcc were cloned by Gateway BP or TOPO cloning (Thermo Fisher Scientific, Waltham, MA, USA) to generate pENTRY plasmids, which were later transferred into the appropriate Y2H plasmids (Mukhtar et al., 2011) using the Gateway LR reaction (Thermo Fisher Scientific). Table S5 contains all the PCR primers and final plasmid identities describing the collection of plasmids used in this study. Gene sequence information for Rps strain GMI1000 (GenBank accessions: NC_003295 and NC_003296) (Salanoubat et al., 2002) can be obtained from www.ralsto-T3E.org (Sabbagh et al., 2019), and that for Xcc from the published genome of strain 8004 (NC_007086) (Qian, 2005).

| Y2H screenings
The Y2H screening was performed in semi-liquid ("8K space" screening) and liquid ("12K space" screening) media as recently reported (Monachello et al., 2019), which is an adaptation of a previously developed Y2H-solid pipeline (Dreze et al., 2010). In both protocols the same low copy number yeast expression vectors and the two yeast strains, Saccharomyces cerevisiae Y8930 and Y8800, were used. The expression of the GAL1-HIS3 reporter gene was tested with 1 mM 3AT (3-amino-1,2,4-triazole, a competitive inhibitor of the HIS3 gene product) unless described otherwise. Prior to Y2H screening, DB-X strains were tested for auto-activation of the GAL1-HIS3 reporter gene in the absence of AD-Y plasmid. In case of auto-activation, DB-X were physically removed from the collection of baits and screened against the (DB)-Ath-cDNA collections using their AD-X constructs.
Briefly, DB-X baits expressing yeasts were individually grown (30 °C for 72 hr) in 50-ml polypropylene conical tubes containing 5 ml of fresh selective media (Sc−leucine, Sc−Leu). Pools were created by mixing a maximum of 72 and 50 individual bait yeast strains for the "8K space" and "12K space", respectively. Subsequently, 120 and 50 µl of these individual pools were plated into 96-well and 384-well low-profile microplates for Ath-cDNA "8K space" and "12K space" collections, respectively. Glycerol stocks of the (AD)-Ath-cDNA "8K space" and "12K space" collections were thawed, replicated by handpicking or using a colony picker Qpix2 XT into 96-well and 384-well plates filled with 120 and 50 µl of fresh selective media (Sc−trypto-

| Database content and manual curation
Binary interactions of Ath proteins with each other and with pathogen effector proteins were compiled in tabular form, keeping track of the protein names and accessions, species and ecotypes/strains of origin, techniques used to detect the interactions, and the reference. Ath-Ath protein interactions were compiled from the Arabidopsis Interactome (Smakowska-Luzan et al., 2018) and the public databases BioGrid (www.thebiogrid.org [Stark et al., 2006], downloaded in September 2019) and IntAct (www.ebi.ac.uk/intact [Orchard et al., 2014], downloaded in September 2019). We only kept the direct interactions with the evidence codes "co-crystal structure," "FRET" (fluorescence resonance energy transfer), "PCA" (protein-fragment complementation assay), "reconstituted complex" or "two-hybrid" on BioGrid and "physical association" on IntAct. Ath-effector protein interactions were gathered from our experimental Y2H data together with the similarly produced data on Hpa, Psy, and Gor effectors (Mukhtar et al., 2011; Weßling et al., 2014). In addition, an extensive keyword search of the effector-Arabidopsis literature was done to retrieve interactions from 80 published articles. A confidence level was assigned to each interaction depending on the number of independent techniques used in a publication for validation: "1" if the interaction was detected by only one technique and "2" if the interaction was validated by at least a second technique. Some interactions lacked important information but, in order to maximize the extent of our network, several assumptions were made instead of discarding useful data. First, gene models for Ath proteins were rarely mentioned in publications, so we assumed the first gene model available in the latest version of the Arabidopsis genome (Araport11; Cheng et al., 2017). Second, when the ecotype/strain of the organism was not explicitly stated, a generic "NA" (not available) was assigned.
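The confidence rule described above can be illustrated with a minimal Python sketch. It is not the curation code used for EffectorK: the record layout, the function name `assign_confidence`, and the choice to treat A-B and B-A as the same interaction are assumptions made for illustration only.

```python
from collections import defaultdict

def assign_confidence(records):
    """Assign a confidence level to each interaction reported in a publication.

    records: iterable of (protein_a, protein_b, technique, reference) tuples
    (a hypothetical layout). Returns {((protein, protein), reference): level},
    with level 1 for a single technique and 2 for two or more techniques.
    """
    techniques = defaultdict(set)
    for a, b, technique, reference in records:
        pair = tuple(sorted((a, b)))  # assumption: interactions are undirected
        techniques[(pair, reference)].add(technique)
    return {key: 1 if len(found) == 1 else 2 for key, found in techniques.items()}
```

Grouping by publication reflects the statement that the level depends on the techniques used "in a publication"; pooling techniques across publications would be a different rule.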

| Computational simulations of random targeting of Ath proteins by single pathogen effectors (intraspecific convergence)
Significance of the intraspecific convergence was tested, comparing our experimental data with random simulations as previously published (Weßling et al., 2014). Briefly, for each effector of Xcc and Rps we assigned randomly the same number of Ath interactors as experimentally observed from the degree-preserved list of 8K proteins.
The distribution obtained from 10,000 simulations was plotted and compared to the experimentally obtained data. The p value of the experimental data was calculated as the number of simulations in which the number of interactors was lower than or equal to the experimentally observed number, divided by the total number of simulations. When no simulation met this criterion, the p value was set to <.0001.
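As a rough illustration of this procedure, the following Python sketch draws random interactors for each effector and computes the empirical p value. It is a simplified sketch, not the published simulation code: the function name and parameters are hypothetical, and sampling here is uniform over the supplied protein list, so the degree-preserving aspect of the original list is not reproduced.

```python
import random

def intraspecific_p_value(effector_degrees, space, observed_total,
                          n_sim=10_000, seed=0):
    """Empirical p value for intraspecific convergence.

    effector_degrees: number of Ath interactors observed for each effector.
    space: list of unique Ath protein identifiers in the screened space.
    observed_total: number of distinct interactors observed experimentally.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sim):
        targeted = set()
        for k in effector_degrees:
            # Each effector receives k distinct, uniformly drawn interactors.
            targeted.update(rng.sample(space, k))
        # Convergence means as few or fewer distinct interactors than observed.
        if len(targeted) <= observed_total:
            hits += 1
    return hits / n_sim
```

A returned value of 0 with 10,000 simulations corresponds to the case reported in the text as p < .0001.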

| Computational simulations of random targeting of Ath proteins by several pathogen effectors (interspecific convergence)
The significance of the interspecific convergence was tested by comparing our experimental data and previously published data with random simulations as published (Mukhtar et al., 2011; Weßling et al., 2014). Briefly, for each effector of all compared pathogens we assigned the same number of Ath interactors as experimentally observed/published from the list of 8K proteins. The distribution obtained from 10,000 simulations was plotted and compared to the experimental and published data. The p values of the experimental data were calculated as the number of simulations in which the number of common interactors between species was higher than or equal to the experimentally observed number, divided by the total number of simulations. When no simulation met this criterion, the p value was set to <.0001.
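The interspecific test differs from the intraspecific one in its statistic (interactors shared between species) and in the direction of the comparison. The Python sketch below illustrates this under stated assumptions: names and parameters are hypothetical, sampling is uniform over the supplied list, and "common interactors" is taken to mean proteins targeted by every species in the comparison, which coincides with the usual definition when two species are compared.

```python
import random

def interspecific_p_value(degrees_by_species, space, observed_common,
                          n_sim=10_000, seed=0):
    """Empirical p value for interspecific convergence.

    degrees_by_species: {species: [number of interactors per effector]}.
    space: list of unique Ath protein identifiers in the screened space.
    observed_common: number of interactors shared between species, as observed.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sim):
        per_species = []
        for degrees in degrees_by_species.values():
            targeted = set()
            for k in degrees:
                targeted.update(rng.sample(space, k))
            per_species.append(targeted)
        # Assumption: "common" means targeted by all compared species.
        common = set.intersection(*per_species)
        if len(common) >= observed_common:
            hits += 1
    return hits / n_sim
```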

| Overlap of effector interactors
The overlap of effector interactors from the different datasets was calculated without limiting the screening space. For representation of the data, Venn diagrams were generated using the Venn Diagrams tool from VIB-UGent Center for Plant Systems Biology (www.bioinformatics.psb.ugent.be/webtools/Venn/). For an area-proportional representation of the data, a Venn diagram was generated using BioVenn (Hulsen et al., 2008).

| Network topology analyses
The topology parameters of the Ath-effector interactomic network were calculated in Cytoscape 3.7.2 (Shannon, 2003). Our analyses focused on two key node parameters: degree and betweenness centrality. The degree of a protein is a measure of its connectivity and denotes the number of proteins interacting with it. Throughout this work, we have differentiated two kinds of degrees: (a) effector degree (i.e. number of interacting effector proteins) and (b) Ath degree (i.e. number of interacting Ath proteins).
The betweenness centrality of a node measures the proportion of shortest paths between pairs of other proteins that pass through that node.
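For readers who want to see these two parameters computed outside Cytoscape, the following Python sketch derives the effector degree, the Ath degree, and an unnormalized betweenness centrality from an edge list. It is only an illustration: the function names are hypothetical, the betweenness is computed with Brandes' algorithm for unweighted undirected graphs, and the normalization applied by Cytoscape is not reproduced.

```python
from collections import defaultdict, deque

def build_graph(edges):
    """Undirected adjacency sets from (protein, protein) pairs; self-loops skipped."""
    adj = defaultdict(set)
    for a, b in edges:
        if a != b:
            adj[a].add(b)
            adj[b].add(a)
    return adj

def split_degree(adj, effectors):
    """Return {ath_protein: (effector_degree, ath_degree)}."""
    return {node: (len(nbrs & effectors), len(nbrs - effectors))
            for node, nbrs in adj.items() if node not in effectors}

def betweenness(adj):
    """Unnormalized betweenness centrality (Brandes' algorithm)."""
    score = dict.fromkeys(adj, 0.0)
    for s in adj:
        stack, queue = [], deque([s])
        preds = {v: [] for v in adj}
        sigma = dict.fromkeys(adj, 0)  # number of shortest paths from s
        sigma[s] = 1
        dist = {s: 0}
        while queue:  # breadth-first search from s
            v = queue.popleft()
            stack.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = dict.fromkeys(adj, 0.0)
        while stack:  # accumulate dependencies, farthest nodes first
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each unordered pair of endpoints was counted from both ends.
    return {v: c / 2 for v, c in score.items()}
```

In a three-protein chain A-B-C, for example, the only shortest path between A and C passes through B, so B has a betweenness of 1 and the two end nodes have 0.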
These parameters were compared across different subsets of data, and statistical tests were performed in R (R Core Team, 2019). The cumulative distributions of these parameters among the different subsets of data were plotted and the area under the curve was estimated using Simpson's rule with the "Bolstad2" package (Bolstad, 2009).
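Simpson's rule itself can be sketched in a few lines of Python. This is the textbook composite rule on evenly spaced points, given only to illustrate the estimate; it is not a port of the "Bolstad2" routine, and the function name and arguments are hypothetical.

```python
def simpson_auc(y, dx):
    """Area under a curve sampled at evenly spaced points (composite Simpson's rule).

    y: curve values at x0, x0 + dx, x0 + 2*dx, ...
    dx: spacing between consecutive points.
    """
    n = len(y) - 1  # number of intervals
    if n < 2 or n % 2:
        raise ValueError("need an even number of intervals (odd number of points)")
    odd = sum(y[1:-1:2])   # y1, y3, ..., y(n-1)
    even = sum(y[2:-1:2])  # y2, y4, ..., y(n-2)
    return dx / 3 * (y[0] + y[-1] + 4 * odd + 2 * even)
```

The rule is exact for polynomials up to degree three, so for y = x² sampled at x = 0, 1, 2 it returns 8/3, the exact integral.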

| Database construction
The databases were built using the software architecture recently described (Carrère et al., 2019). The files submitted by the curator team were automatically checked for typographic mistakes using ad hoc Perl scripts and loaded into a Neo4J database and indexed in an ElasticSearch search engine. Each release was rebuilt from scratch.
Data were made accessible through a web interface (see Results and Discussion) built on the Cytoscape.js library (Franz et al., 2016). The raw data used for the database setup are available in the "Data" section of www.effectork.org and the source code is available at https://framagit.org/LIPM-BIOINFO/KGBB.

CONFLICT OF INTEREST
None of the authors has a conflict of interest to declare.

DATA AVAILABILITY STATEMENT
The data that support the findings of this study are openly available in EffectorK at www.effectork.org.

$$p\ \text{value} = \frac{\text{number of simulations where the number of interactors} \le \text{experimentally observed number of interactors}}{\text{number of simulations}} \tag{1}$$

ORCID
$$p\ \text{value} = \frac{\text{number of simulations where the number of common interactors} \ge \text{experimentally observed number of common interactors}}{\text{number of simulations}} \tag{2}$$