Topological analysis of a bacterial DedA protein associated with alkaline tolerance and antimicrobial resistance

Maintaining membrane integrity is of paramount importance to the survival of bacteria because the membrane is the site of multiple crucial cellular processes, including energy generation, nutrient uptake, and antimicrobial efflux. The DedA family of integral membrane proteins is widespread in bacteria and is associated with maintaining the integrity of the membrane. In addition, DedA proteins have been linked to resistance to multiple classes of antimicrobials in various microorganisms. Therefore, the DedA family are attractive targets for the development of new antibiotics. Although DedA family members are physiologically important in many bacteria, their structure, function and precise physiological role remain unclear. To help illuminate the structure of the bacterial DedA proteins, we have performed substituted cysteine accessibility method (SCAM) analysis on the most comprehensively characterized bacterial DedA protein, YqjA from Escherichia coli. By probing the accessibility of 15 cysteine residues across the length of YqjA using thiol-reactive reagents, we have mapped the topology of the protein. Using these data, we have experimentally validated a structural model of YqjA generated using evolutionary covariance, which consists of an α-helical bundle with two re-entrant hairpin loops reminiscent of several secondary active transporters. In addition, our cysteine accessibility data suggest that YqjA forms an oligomer in which the protomers are arranged in a parallel fashion. This experimentally verified model of YqjA lays the foundation for future work in understanding the function and mechanism of this interesting and important family.


Introduction
The bacterial cell envelope is involved in many crucial processes, from interaction with the environment, signaling and nutrient uptake to energy production. As such, maintaining the integrity of the membrane is of paramount importance to the survival of the bacterium during the various rigours of its existence. Therefore, targeting proteins involved in the maintenance of membrane integrity has great potential for the production or enhancement of antimicrobials. One such protein family is the DedA superfamily of integral membrane proteins, which is widespread in bacteria, including many human pathogens [1,2]. Members of the DedA superfamily are enigmatic; their function and physiological role remain unclear. However, the effects of disrupting DedA function are substantial. The E. coli genome encodes 8 DedA genes that are collectively essential [3]; in addition, the single DedA gene encoded by the Borrelia burgdorferi genome is also essential [4], indicating a crucial physiological role for this protein family. Deletion of the genes encoding two DedA family members in E. coli, YqjA and YghB, results in a pleiotropic phenotype that includes sensitivity to elevated temperatures and pH [5-8], cell division defects caused by an inability to secrete periplasmic amidases [9], and sensitivity to multiple antimicrobial agents [10]. This sensitivity to antimicrobial compounds can be mitigated by reducing the pH of the growth medium, thus increasing the proton motive force, by increasing the extracellular Na+ concentration, or by overexpression of mdfA, which encodes a proton-coupled multidrug efflux transporter that also moonlights as a monovalent cation/H+ exchanger [10]. These findings, and the observation that two conserved, membrane-embedded acidic residues are essential for function [10], strongly suggest that members of the DedA superfamily have transport activity, likely involving proton flux. Beyond E. coli, members of the DedA superfamily are required for colistin resistance in Klebsiella pneumoniae and Burkholderia thailandensis [11-13], and are involved in resistance to cationic antimicrobial peptides (CAMP) in Salmonella enterica and Neisseria meningitidis [14,15] and to the macrocyclic alkaloid halicyclamine A in Mycobacterium bovis [16], further highlighting the broad potential impact of understanding DedA structure and function for the treatment of drug-resistant infections.
Understanding the function and mechanism of the DedA family has been hampered by a lack of experimentally derived structural information. Based on hydropathy profile alignments, it has been proposed that the DedA family shares a similar fold with the LeuT family of transporters [17]. LeuT transporters contain an inverted structural repeat of 5 transmembrane (TM) helices that evolved via gene duplication and inversion of a 5 TM progenitor [18]. Due to the similarity of their hydropathy profiles, which report on general structural features, the DedA superfamily was proposed to be the 5 TM LeuT progenitor, suggesting that DedA proteins form dual topology oligomers in the membrane [17]. More recently, evolutionary covariance analysis has been used to generate a 3D model for members of the DedA superfamily, suggesting the presence of 2 re-entrant hairpin loops [2,19]. While this topology has been partially experimentally verified for a human DedA superfamily member [2], there has been no experimental validation of the structural arrangement of any bacterial DedA protein; these are distantly related to the human homologues and may have diverged in function [1].
Here, we have sought to gain a better understanding of the structure of bacterial DedA proteins by performing substituted cysteine accessibility method (SCAM) analysis on the most comprehensively characterized DedA protein, YqjA, from E. coli. Our data reveal that the accessibility of several residues does not match what is expected from the topology prediction software traditionally used to investigate membrane protein topology. However, our SCAM data do support a structural model generated using evolutionary covariance analysis, which, in support of previous studies [2,19], predicts a compact structure for YqjA that includes 2 re-entrant hairpins.
Furthermore, our data strongly indicate that YqjA forms an oligomer in which the subunits are arranged in a strictly parallel fashion.

Phenotypic rescue assays
To perform the temperature sensitivity rescue assay, BW25113wt or BW25113∆∆ cells were freshly transformed with an arabinose-inducible pBAD-based plasmid encoding a variant of yqjA or gltph (a non-DedA membrane protein used to control for the effects that overexpression of membrane proteins has on bacterial growth). The transformed strains were grown overnight in LB supplemented with 100 µg/ml ampicillin, harvested, normalized to an OD600 of 1, then serially diluted. 5 µl of each dilution was spotted onto LB agar supplemented with 100 µg/ml ampicillin and 0.001% (w/v) L-arabinose. Once dry, the plates were incubated at a permissive temperature of

To perform SCAM, E. coli TOP10 cells were freshly transformed with a pBAD plasmid encoding a variant of yqjA upstream of a histidine tag. A single colony of transformed cells was grown overnight at 37 °C in LB supplemented with 100 µg/ml ampicillin. Following overnight incubation, the cultures were diluted to an OD600 of 0.2 using fresh LB supplemented with ampicillin and grown at 30 °C for 1.5 h. Protein expression was then induced by addition of 0.1% (w/v) L-arabinose and the cells were grown for 1 h. Cells were harvested, resuspended in PBS to an OD of 0.6-0.8, then divided into 4 samples of equal volume. One sample was incubated with 10 mM MTSES, one was incubated with 10 mM NEM, and the 2 remaining samples were incubated in the absence of thiol-reactive reagent (replaced with an equal volume of water). The samples were incubated at room temperature in the dark for 1 hour. The treated cells were harvested by centrifugation, washed with PBS to remove excess MTSES and NEM, resuspended in lysis buffer (15 mM Tris pH 7.6, 1% (w/v) SDS, 6.2 M urea) with 6.25 mM mPEG5K, DNase and protease inhibitors, and incubated at room temperature for 1 hour.
SDS-PAGE sample buffer was added to the samples, which were then separated using SDS-PAGE and visualized using Western blotting with an anti-His tag antibody (Invitrogen).

Copper phenanthroline-based crosslinking
Cell samples containing overexpressed YqjA variants were prepared as described above for the SCAM assay. Once prepared, the samples were incubated at room temperature in the presence of

Structural models for YqjA were generated using EVfold [25] with the default parameters. High quality modelling data were generated for various bitscore thresholds (0.1, 0.3, 0.5 and 0.7). We selected the model generated from the 0.7 threshold dataset because it had the most sequence coverage and produced the highest number of feasible models. Models were visualized using PyMOL, which was also used to generate the structural images.

Results
Consensus topological analysis suggests YqjA contains 5 transmembrane helices
To provide preliminary information on the topological arrangement of YqjA, we determined a hydropathy plot of YqjA, which provides an indication of the level of hydrophobicity in the primary sequence. Analysis of the YqjA hydropathy plot reveals 6 distinctly hydrophobic regions that likely correspond to transmembrane-spanning regions (Fig. 1A). To take this analysis further, we used TOPCONS and TMHMM to generate a consensus predicted topology for YqjA [23,24]. TOPCONS compares the topology predictions from 5 separate topology prediction algorithms (OCTOPUS, Philius, PolyPhobius, SCAMPI and SPOCTOPUS) and produces a consensus predicted topology. Comparison of the outputs from the different prediction algorithms revealed substantial variation in the number of predicted transmembrane regions and the location of the N-terminus; 4 out of the 5 algorithms predicted 5 TMs with an Nout/Cin orientation, although there was variation in the location of the TMs (Fig. 1B). Philius predicted 6 TMs with an Nin/Cin orientation, which was also the prediction made by TMHMM (Fig. 1B). On balance, the consensus topology model for YqjA is 5 TM with Nout/Cin (Fig. 1C).
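As a rough illustration of how the predictions described above combine, the following sketch tallies the per-algorithm outputs and takes a plain majority vote. This is a simplification for illustration only and is not the consensus procedure that TOPCONS itself implements; the dictionary layout and the function names are hypothetical. The sketch also checks the parity rule that an odd number of membrane-spanning helices places the two termini on opposite sides of the membrane.

```python
from collections import Counter

# Predictions as described in the text: (number of TMs, N-terminus side, C-terminus side).
# "out" = periplasmic, "in" = cytoplasmic. The layout is an illustrative assumption.
predictions = {
    "OCTOPUS": (5, "out", "in"),
    "Philius": (6, "in", "in"),
    "PolyPhobius": (5, "out", "in"),
    "SCAMPI": (5, "out", "in"),
    "SPOCTOPUS": (5, "out", "in"),
    "TMHMM": (6, "in", "in"),
}


def termini_consistent(n_tm, n_side, c_side):
    """An odd TM count must put the termini on opposite sides, an even count on the same side."""
    return (n_tm % 2 == 1) == (n_side != c_side)


def majority_topology(preds):
    """Return the most common (n_tm, n_side, c_side) tuple and its vote count."""
    return Counter(preds.values()).most_common(1)[0]


consensus, votes = majority_topology(predictions)
print(f"{consensus[0]} TM, N-{consensus[1]}/C-{consensus[2]}: "
      f"{votes} of {len(predictions)} predictors")
```

Note that every prediction in the tally places the C-terminus in the cytoplasm, even though the predictors disagree on the number of TMs and on the location of the N-terminus.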

SCAM analysis of YqjA reveals a topological map incongruent with a 5 TM model
To experimentally probe the membrane topology of YqjA, we used the substituted cysteine accessibility method (SCAM). While there are many variants of the SCAM approach [26], the basic premise is that single cysteines are individually introduced into an integral membrane protein, and their location in relation to the membrane (periplasmic or cytoplasmic) is assessed by the accessibility of each cysteine to thiol-reactive reagents that are either permeable or impermeable to the intact membrane. Here, we employed an approach similar to SCAM approaches used

To perform SCAM on YqjA, we introduced single cysteine residues individually into a cysteine-free variant of YqjA to produce a library of single cysteine mutants. Each member of the mutant library was then expressed independently in E. coli from a plasmid, in-frame with a C-terminal histidine tag that allowed detection of expressed YqjA in whole cell extracts via Western blotting. We harvested the cells expressing the cysteine variants and incubated samples with either 2-sulfonatoethyl methanethiosulfonate (MTSES), a cysteine-reactive reagent that is impermeable to the inner membrane (but can traverse the outer membrane), or N-ethylmaleimide (NEM), which is permeable to both E. coli membranes. Thus, MTSES would only conjugate to cysteines accessible from the periplasm, whereas NEM could react with cysteines accessible from either the periplasm or the cytoplasm. Cysteine residues buried in the protein core or in the middle of a transmembrane region would likely react with neither MTSES nor NEM. Conjugation of an introduced cysteine to MTSES or NEM would protect that position from further thiol-specific reactions. Thus, the level of protection against further reaction afforded by MTSES and NEM is an indicator of the position of the cysteine relative to the membrane.
To assess the level of thiol protection, we solubilised the membrane and denatured the protein using SDS, and incubated the sample with methoxypolyethylene glycol maleimide (mPEG5K), which reacts with free cysteines to add 5 kDa of mass to the protein, making it separable from the unmodified protein using SDS-PAGE. We then visualized YqjA using Western blotting with an anti-His tag antibody to assess the level of YqjA PEGylation.
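The interpretation rule described above can be written as a small decision function. This is a minimal sketch, assuming that protection is scored as a simple yes/no for each reagent; the function name and the "ambiguous" category are hypothetical and do not come from the study.

```python
def classify_cysteine(mtses_protects: bool, nem_protects: bool) -> str:
    """Assign a location from protection against mPEG5K labelling.

    MTSES cannot cross the inner membrane, so protection by MTSES implies
    periplasmic exposure. NEM crosses both membranes, so protection by NEM
    alone implies cytoplasmic exposure. Protection by neither reagent implies
    a position buried in the membrane or protein core.
    """
    if mtses_protects and nem_protects:
        return "periplasmic"
    if nem_protects:
        return "cytoplasmic"
    if mtses_protects:
        # Not expected under the stated assumptions, since NEM should reach
        # every site that MTSES can reach; flagged rather than interpreted.
        return "ambiguous"
    return "buried"


for pattern in [(True, True), (False, True), (False, False)]:
    print(pattern, "->", classify_cysteine(*pattern))
```

A protected cysteine shows no mass shift after mPEG5K treatment, whereas an unprotected cysteine gives the higher molecular weight PEGylated band, so the two booleans correspond to the absence of the shifted band in the MTSES-treated and NEM-treated lanes.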

To perform SCAM analysis of YqjA, we first needed to produce a cysteine-free version of YqjA by mutating the two native cysteines, C83 and C191, to serine (hereafter referred to as YqjAcysless). To test that YqjAcysless was fully functional, we expressed the mutated gene from an arabinose-inducible plasmid and assessed its ability to restore growth in the double dedA deletion E. coli strain, BW25113∆yqjA∆yghB (BW25113∆∆, hereafter), which is unable to grow at elevated temperatures (the same phenotype seen for the well characterized strain BC202, which has the same double dedA deletion in an E. coli W3110 background [5]). Expression of both wildtype YqjA (YqjAwt) and YqjAcysless restored growth to BW25113∆∆ at 44 °C, whereas expression of an unrelated integral membrane protein, the aspartate transporter GltPh, was unable to restore growth at this temperature, demonstrating that YqjAcysless was functional and folded (Fig. 2A). Using YqjAcysless as a background, we generated a panel of 15 single cysteine YqjA mutants with cysteines distributed throughout the amino acid sequence of YqjA (Fig. 1C). We established that all 15 single cysteine YqjA mutants were functional, as demonstrated by their ability to restore growth to BW25113∆∆ at elevated temperatures (Fig. 2A), giving us confidence that they would accurately report on the topology of the fully folded protein.

Each of the single cysteine variants was expressed in E. coli and subjected to the SCAM procedure described previously. Treatment of each of the 15 cysteine mutants with mPEG5K alone resulted in a single higher molecular weight band, demonstrating that each protein contained a single cysteine that was able to react with cysteine-reactive reagents (Fig. 2B). Based on the assumption that MTSES would only protect against PEGylation for cysteines exposed to the periplasm, and NEM would protect cysteines exposed to either the periplasm or the cytoplasm, our SCAM data suggested that A20C, S52C, V59C and V180C were exposed to the periplasm, L53C, V55C, V99C, L117C and G217C were exposed to the cytoplasm, and L38C, C83, L139C, L162C, C191, and V200C were all buried in the membrane/protein core (Fig. 2B).
Mapping these experimentally defined accessibility measurements onto the 2D topology models revealed that while most of the locations determined by SCAM matched the predicted topology, 4 positions were the complete opposite; V99C and L117C were predicted to be periplasmic, but were located in the cytoplasm by SCAM, whereas S52C and V59C were predicted to be cytoplasmic but were located in the periplasm according to our SCAM data (Fig. 2C). These data suggest that YqjA adopts a substantially different arrangement from that represented in the 2D models.

While performing our SCAM analysis, we noticed that, unlike all the other single cysteine mutants, V180C had a substantial band in the Western blot at approximately 50 kDa. This ~50 kDa band was present in all of the samples for V180C, but was most prominent in the sample that was incubated in the absence of any cysteine labelling reagent (Fig. 2B).
The molecular weight of this band corresponds approximately to that of dimeric YqjA, which we reasoned was likely stabilized by inter-protomer disulfide formation between V180C residues, as has been seen previously for C191 in YqjA [30]. To investigate this possibility further, we incubated YqjAV180C-expressing cells with increasing concentrations of the oxidizing agent copper phenanthroline (CuPhen) and observed a CuPhen concentration-dependent increase in the ~50 kDa band intensity with a concurrent decrease in the intensity of the band corresponding to the YqjA monomer (Fig. 3A). In addition, we observed no higher molecular weight band in the presence of the reducing agent dithiothreitol (DTT), nor when YqjAcysless was incubated with the same range of CuPhen concentrations (Fig. 3A). To investigate whether this phenomenon was specific to V180C, and to rule out the possibility that we were observing crosslinking of the cysteines after the protein had been denatured for SDS-PAGE analysis, we incubated YqjAA20C-expressing cells with the same CuPhen concentrations; A20C is located on the periplasmic side of YqjA but in a different region of the protein from V180C. We observed no higher molecular weight band in the absence of CuPhen for YqjAA20C, and only very minimal apparent crosslinking at the highest CuPhen concentration (Fig. 3A). Taken together, our data suggest that V180C residues from two YqjA protomers are able to form an intermolecular disulfide, which is formed when the protein is folded in the membrane.
Because disulfide formation requires close proximity, these data suggest that YqjA is an oligomer (although the oligomeric state cannot be gleaned from these data) and that the region containing V180C likely forms an oligomeric interface. Furthermore, because V180C is located on the periplasmic side of the protein, these data suggest that the protomers involved in disulfide bond formation are arranged in a parallel fashion, contrary to previous suggestions that DedA proteins form an antiparallel, dual-topology arrangement (Fig. 3B) [17].

SCAM analysis supports a model for YqjA based on evolutionary covariance analysis
Ab initio models of members of the DedA superfamily have been generated by evolutionary covariance analysis using trRosetta; these models suggest that DedA proteins form an α-helical bundle and contain 2 re-entrant hairpin loops [2,19]. To generate a 3D model of YqjA on which to map our SCAM data, we also used evolutionary covariance analysis, but with EVfold, which has previously been used to model a distantly related eukaryotic homologue of YqjA [2].

EVfold analysis of YqjA produced high quality modelling data (according to the EVfold output) for datasets with bitscores ranging from 0.1 (48199 sequences) to 0.7 (3917 sequences). For the generation of the working model for our analysis, we selected the model generated using the dataset with a bitscore of 0.7 because it provided the best protein sequence coverage and because the majority of the models it produced (29 out of the top 30 scoring models) placed the C-terminal helix in a sensible arrangement (an overlay of the top 30 scoring models is shown in SI Fig. 1A). One model generated using the 0.7 bitscore dataset arranged the C-terminal helix in a position where it would be inserted into the core of the bilayer, which we consider neither feasible nor compatible with our experimental data (SI Fig. 1A). While the highest scoring model generated using the dataset with the greatest evolutionary depth (bitscore of 0.1) was similar to the model from the 0.7 bitscore dataset, the majority of the other high scoring models generated using this dataset mishandled the C-terminal helix to produce a series of what we consider to be unfeasible structures that are also inconsistent with our experimental data (an overlay of the top 30 scoring models from the 0.1 bitscore dataset is shown in SI Fig. 1B). This mishandling of the C-terminal helix is likely due to this region of YqjA being involved in the formation of an oligomeric interface.
As with previous ab initio models generated for DedA superfamily members [2,19], the structural model for YqjA generated using EVfold is an α-helical bundle consisting of 3 membrane-spanning helices, 2 re-entrant hairpin loops and a short α-helix perpendicular to the cytoplasmic side of the membrane (Fig. 4A and B).
While the evolutionary covariance approach cannot assign membrane orientation, all of the in silico prediction software predict a cytoplasmic C-terminus (Fig. 1B), which also matches our experimental SCAM data. The tips of the two predicted re-entrant hairpins, which we have named HPin and HPout, are predicted to meet in approximately the centre of the membrane (Fig. 4C). Mapping on the known functionally essential residues, E39, D51, R130 and R136 [6,30,31], we note with interest that they are clustered at the interface of these 2 predicted hairpins, suggesting that this region constitutes a crucial active/binding site, and providing support for this structural model (Fig. 4D).

Mapping our experimental SCAM data onto the EVfold structural model, we find that the cysteine protection we observed with MTSES and NEM is structurally rationalized for 13 out of the 15 positions tested (Fig. 4E). A20C, S52C, V59C, and V180C, which were all protected by the membrane-impermeable MTSES, suggesting they are accessible from the periplasmic side of the membrane, are clearly positioned on the periplasmic side of the protein, with 3 of the positions located in loop regions (Fig. 4E, blue residues). L38C, C83, C191, L139C, L162C, and L200C, which were protected by neither MTSES nor NEM, suggesting they are embedded in the protein/membrane core, form a band around the centre of the protein that is likely membrane embedded, explaining the lack of accessibility (Fig. 4E, orange residues). V99C, L117C and G217C, which were only protected by NEM, suggesting a cytoplasmic location, are positioned in cytoplasmic loops in the structural model.
Residues L53C and V55C are predicted to be positioned in the arm of HPout on the periplasmic side of the protein, but are only protectable by NEM (Fig. 4E and 2B). While this may make the structural model and SCAM data seem incongruent, the exclusive accessibility of L53C and V55C to NEM can be explained by one of two possibilities: NEM, which is relatively hydrophobic compared to the negatively charged MTSES, is able to penetrate deeper into hydrophobic pockets on the periplasmic side of YqjA; or conformational changes occur in the re-entrant hairpin loops, as seen for hairpin-containing secondary active transporters [32-35], exposing that region of HPout to the cytoplasmic solution. The fact that S52C, which is necessarily proximal to L53C and V55C in the primary sequence, is protectable by MTSES, which cannot penetrate the bilayer, very strongly supports the positioning of this region on the periplasmic side of the protein.
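The agreement described above between the SCAM read-out and the EVfold model can be tallied as follows. This is a minimal sketch: the two dictionaries simply transcribe the locations stated in the preceding paragraphs (residue labels as given there), and the variable names are hypothetical. An NEM-only protection pattern is treated as a cytoplasmic read-out, following the simple interpretation rule used for the SCAM data.

```python
# Location inferred from SCAM protection patterns, as stated in the text.
scam = {
    "A20C": "periplasmic", "S52C": "periplasmic", "V59C": "periplasmic",
    "V180C": "periplasmic",
    "L38C": "buried", "C83": "buried", "C191": "buried",
    "L139C": "buried", "L162C": "buried", "L200C": "buried",
    "V99C": "cytoplasmic", "L117C": "cytoplasmic", "G217C": "cytoplasmic",
    "L53C": "cytoplasmic", "V55C": "cytoplasmic",  # protected by NEM only
}

# Location of each position in the EVfold model, as described in the text.
# L53C and V55C sit in the arm of HPout on the periplasmic side.
model = dict(scam, L53C="periplasmic", V55C="periplasmic")

concordant = [pos for pos in scam if scam[pos] == model[pos]]
discordant = [pos for pos in scam if scam[pos] != model[pos]]

print(f"{len(concordant)} of {len(scam)} positions rationalized by the model")
print("Positions needing further explanation:", ", ".join(discordant))
```

The two discordant positions are the ones for which alternative explanations (deeper penetration by NEM, or conformational changes in the hairpin) are offered in the text.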
Therefore, our SCAM data support the EVfold model, but more information on the structure and dynamics of YqjA is required to fully rationalize all of the SCAM data collected. To provide further support for the EVfold model, we also obtained the structural model for YqjA from the AlphaFold Protein Structure Database [36]. The AlphaFold model of YqjA has the same overall topology as the EVfold model and contains the two re-entrant hairpin loops (SI Fig. 2). In addition, the accessibility of each cysteine from our SCAM data can be rationalised similarly in both models. The primary difference between the two YqjA models is the positioning of the C-terminal helix, which overlays HPin in the AlphaFold model but is adjacent to HPout in the EVfold model (SI Fig. 2B). However, with the C-terminal helix likely involved in homo-oligomer interface formation, it is a problematic region to model using evolutionary coupling due to the difficulty in differentiating between intra- and inter-protomer residue coupling.

In this study, we have performed a cysteine accessibility scan on the most comprehensively characterized DedA protein, YqjA from E. coli, and determined the position of 15 individually substituted cysteine residues relative to the membrane. By comparing our experimental SCAM data to a 2D topological model obtained using traditional approaches, we find that our experimental data disagree with the 2D topology map at several positions. However, our SCAM data are in good agreement with a structural model generated using evolutionary covariance. This is the first experimental verification of a 3D structural model of a bacterial DedA family member. The structural model of YqjA predicts the presence of 2 re-entrant hairpin loops, the tips of which converge to bring 4 functionally relevant amino acids into close proximity, likely forming the binding or active site of the protein.
This experimentally tested model is consistent with a recent model of YqjA generated using a similar modelling approach [19], and with a structural model of a distantly related eukaryotic DedA superfamily member [2]. Our SCAM analysis of YqjA also suggests that V180 is able to form interprotomer disulfide bonds, which strongly suggests that YqjA is an oligomer; due to the position of V180 on the periplasmic side of the membrane, these data indicate that the oligomer forms a parallel arrangement in the membrane.

substrate interactions. Re-entrant loops are also thought to be involved in gating access to the binding site and undergo conformational changes by which they control ingress and egress of the substrate(s) to and from the binding site [32,33,35,44]. Four functionally essential residues have been identified in YqjA: E39, D51, R130 and R136 [6,30,31]. While the exact role these residues play in the function of YqjA is unknown, they coalesce at the tips of the re-entrant hairpins, demonstrating the importance of this structural motif in YqjA. While direct functional measurements are yet to be made for any member of the DedA superfamily, there is strong circumstantial evidence that at least one of the functions of YqjA is as a monovalent cation/H+ exchanger. For example, YqjA is required for growth at high pH [8], and alkalinotolerance mechanisms often involve Na+ or K+/H+ exchangers, e.g. NhaA and MdtM [45-47]. In addition, YqjA contains membrane-embedded acidic residues that are essential for its ability to rescue growth defects caused by disruption of the yqjA and yghB genes in E. coli [10]. Furthermore, the growth defects observed in the E. coli strains in which yqjA and yghB are disrupted can be rescued by lowering the external pH to artificially bolster the proton motive force (PMF), by increasing the external monovalent cation concentration, or by overexpressing mdfA, which encodes a multidrug efflux pump that also has Na+/H+ activity [10]. Due to the positioning of the well-conserved, functionally important residues at the tips of the re-entrant hairpins, it is likely that this is the substrate binding site, with the binding and release of protons by E39 and D51 a central part of the mechanism. However, more structural information is required to resolve the details of this protein region, identify the exact position of the binding site, identify the ligand(s), and assess the conformational dynamics that may be required for its mechanism.

YqjA likely forms a homooligomer
Our SCAM data show that the YqjAV180C variant produced a higher-than-expected molecular weight band on the Western blot (Fig. 2B), which we demonstrated is likely due to disulfide formation between cysteines in the same position in two protomers, suggesting that YqjA forms an oligomer. The fact that only one of the cysteine mutants gave a dimeric species indicates that this dimerization is likely due to close proximity of the V180 residues. It has previously been suggested that E. coli YqjA is able to form an oligomer due to the observation that the native cysteine residue

between C191 and L195C in YqjA, it has been asserted that the protein forms a dimer. However, as disulfides would only ever form between a maximum of 2 cysteines, a dimeric band in the membrane would also be observed if YqjA formed a higher oligomeric state, for example, a trimer or a tetramer. Therefore, more detailed biochemical analyses of purified DedA superfamily members are required before a definitive oligomeric state can be concluded.

Based on hydropathy profile alignments using the AlignMe program, it was hypothesized that the DedA family shares a common ancestor with LeuT-fold transporters [48]. The LeuT fold core is composed of an inverted structural repeat of 5 transmembrane helices that evolved via duplication and fusion of an ancestral gene [49,50]. However, for the repeats to be inverted, the ancestral gene would need to produce a protein that was able to reside in the membrane in a mixed topology. Due to the marked similarity between the hydropathy profiles of certain DedA proteins and one repeat of selected LeuT fold proteins, it was suggested that the DedA family could represent a "half-module" of the LeuT fold [17]. Our SCAM data demonstrate that YqjA resides in the membrane in a single orientation.
Firstly, if YqjA had a mixed topology in the membrane, then we would expect to observe incomplete protection by MTSES, which is impermeable to the membrane, for all positions predicted to be accessible to the periplasm or cytoplasm. However, in all cases, MTSES fully protects cysteines accessible to the periplasm, but has no effect on cysteines located in the cytoplasm. Secondly, V180C is able to form an interprotomer disulfide bond, and as this position is predicted, and shown experimentally, to be present in a periplasmic loop, this observation strongly suggests that the two crosslinked protomers adopt the same orientation in the membrane. Taken together, these data demonstrate that YqjA is present in the membrane in a single orientation. However, what is true for E. coli YqjA may not be the case for other members of the DedA superfamily, as there are clearly different functional DedA groups even within the same organism [3], and there is strong evidence that some DedA proteins have an ambiguous charge bias [17], which could allow them to be dual topology. Therefore, it is possible that some DedA proteins adopt a dual topology, but further work is required to resolve this issue.
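The first argument above can be made concrete with a simple expectation. This is an illustrative sketch under stated assumptions, not an analysis from the study: it assumes that MTSES labels every periplasm-facing cysteine to completion, and that a fraction `p_reference` of protomers (a hypothetical parameter) adopts the reference orientation while the remainder is inverted.

```python
def expected_mtses_protection(side_in_reference: str, p_reference: float) -> float:
    """Expected fraction of protein protected by MTSES at one position.

    side_in_reference: location of the cysteine in the reference orientation
        ("periplasmic", "cytoplasmic" or "buried").
    p_reference: fraction of protomers in the reference orientation
        (1.0 = single orientation, 0.5 = evenly mixed dual topology).
    """
    if not 0.0 <= p_reference <= 1.0:
        raise ValueError("p_reference must lie between 0 and 1")
    if side_in_reference == "periplasmic":
        return p_reference          # only reference-oriented copies are exposed
    if side_in_reference == "cytoplasmic":
        return 1.0 - p_reference    # only inverted copies are exposed
    return 0.0                      # buried positions are never exposed


for p in (1.0, 0.5):
    print(f"p_reference = {p}:",
          {side: expected_mtses_protection(side, p)
           for side in ("periplasmic", "cytoplasmic")})
```

Under these assumptions, an evenly mixed population would give partial protection at both periplasmic and cytoplasmic positions, whereas a single orientation gives the all-or-nothing pattern reported in the text.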

Here, we have combined evolutionary covariance modelling with cysteine labelling to generate and experimentally validate a structural model of the archetypal DedA protein, YqjA from E. coli. This work provides insight into the architecture of bacterial DedA proteins and will aid in the delineation of the function and mechanism of this widespread, physiologically important protein family.

Rainbow-coloured ribbon representation of the AlphaFold (left) and EVfold (right) models of YqjA oriented according to the hairpin loop positioning. Note that the main difference between the models is the positioning of the C-terminal helix, coloured in red in both cases. B) The same models as in A), but with the predicted re-entrant hairpin loops highlighted (HPout in blue, HPin in orange).