The Open Protein Structure Annotation Network


    Table of contents
    1. Protein Summary
    2. Ligand Summary
    3. References

    Title Structure of the first representative of Pfam family PF09410 (DUF2006) reveals a structural signature of the calycin superfamily that suggests a role in lipid metabolism. Acta Crystallogr., Sect. F 66, 1153-1159 (2010)
    Site JCSG
    PDB Id 2ich
    Target Id 360535
    Molecular Characteristics
    Source Nitrosomonas europaea ATCC 19718
    Alias Ids TPS1448, NP_841447.1, BIG_240
    Molecular Weight 37521.90 Da
    Residues 334
    Isoelectric Point 4.99
    Sequence vlapvvpgkalefpqdfgahndfriewwyvtgwletptgkplgfqitffrtateidrdnpshfapdqliiahvalsdpaigklqhdqkiaragfdlayartgntdvklddwifvretdgryrtrieaedftltfiltpsqplmlqgengfsrkgpgapqasyyysephlqvsgiinrqgedipvtgtawldrewsseyldpnaagwdwisanlddgsalmafqirgkddskiwayaalrdasghtrlftpdqvsfhpirtwrsartqavypvatrvltgetewqitplmddqeldsrasagavywegavtftrdgqpagrgymeltgyvrplsm

    Structure Determination
    Method XRAY
    Chains 2
    Resolution (Å) 2.00
    Rfree 0.232
    Matthews coefficient (Å³/Da) 2.35
    Rfactor 0.182
    Waters 394
    Solvent Content (%) 47.34
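
    The Matthews coefficient and the solvent content listed above are related quantities. The sketch below applies the widely used Matthews approximation, solvent fraction = 1 - 1.23/V_M. The constant 1.23 is an assumption here, since the page does not state how the solvent content was calculated. The function name is illustrative only.

```python
def solvent_fraction(matthews_coefficient, protein_factor=1.23):
    """Estimate the crystal solvent fraction from V_M (in Å^3/Da).

    Uses the Matthews approximation 1 - protein_factor / V_M.
    protein_factor = 1.23 is the conventional value for a typical protein
    density; it is an assumption, not a value taken from this entry.
    """
    if matthews_coefficient <= 0:
        raise ValueError("Matthews coefficient must be positive")
    return 1.0 - protein_factor / matthews_coefficient


# V_M = 2.35 as reported for 2ich
estimate = solvent_fraction(2.35)
print(f"Estimated solvent content: {estimate:.1%}")  # prints 47.7%
```

    The estimate of about 47.7% is close to the reported 47.34%. The small difference presumably comes from rounding of the coefficient or from a slightly different density constant.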

    Ligand Information



    Protein Summary

    Gene NE1406 from Nitrosomonas europaea encodes the protein NP_841447, which belongs to the DUF2006 group (PF09410) with over 400 homologs in bacteria, archaea and fungi. The Pfam DUF2006 family overlaps with the COG5621 family of predicted secreted hydrolases and is distantly related to two other Pfam families, hydroxyneurosporene synthases (PF07143) and Svf1-like proteins (PF08622). Intriguingly, a small family of animal proteins, exemplified by Drosophila melanogaster protein CG3706, Caenorhabditis elegans protein C05D2.7, amphioxus protein BRAFLDRAFT_218873 and, surprisingly, chicken protein XP_414184, also seems to be related to the DUF2006 family. While this rather strange distribution in animal genomes may suggest lateral gene transfer, the protein is present in all known nematode and arthropod genomes, and all animal homologs are closely related to each other and rather distant from all bacterial and archaeal proteins.

    The crystal structure of NE1406 consists of two repeats of a novel variant of an up-and-down beta barrel, with the first repeat containing a three-strand insert relative to the common fold. This structure has some similarity to the structure of the putative phenolic acid decarboxylase pc05870a from Lactobacillus plantarum, recently solved by JCSG (2gc9), despite very low sequence identity (9%). Proteins with a beta barrel fold are found both as soluble proteins, predominantly as members of the so-called calycin superfamily (Flower 1993), and as integral transmembrane proteins, especially porins in the outer membrane of Gram-negative bacteria or mitochondria (Galdiero et al 2007). As in pc05870a, the barrel has an elongated shape that can almost be described as consisting of two sheets, whereas integral membrane beta barrel proteins are much more symmetrical and adopt an almost cylindrical shape.

    The crystal structure of NE1406 contains a ligand (NHE, 2-[N-cyclohexylamino]ethanesulfonic acid) that interacts with the conserved residues Trp49, Trp228 and Trp230 as well as with other residues that are not conserved. NHE is bound at the interface formed by the two beta barrels within one polypeptide chain. Phe71 lies next to the entrance of the NHE-binding cleft, while two large Trp side chains (49 and 228) sandwich NHE from both sides. There is also an empty space next to the bound NHE, surrounded by residues that include the conserved Tyr51, Glu346 and Asp213.

    There is a second, smaller cleft within one of the beta barrel structures harboring glycerol (from the crystallization medium). This ligand is bound close to the end of the beta barrel that is capped by a short 3₁₀-helix formed by residues 37-40, starting after a conserved Pro36. The side chains of the conserved residues Glu48 and Trp50 form the bottom of this cavity.

    Another cluster of conserved residues (Gly54, Leu56, Phe156, Gly195) may be important for proper folding. The remaining conserved residues are mainly glycines and prolines, which probably play only a structural role.
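
    The residue numbers used in this summary follow the numbering of the structure, not the position in the sequence listed at the top of the page. As a minimal sketch, the snippet below checks several of the residues named above against the first 69 residues of that sequence. The offset of 22 between the two numberings is an assumption: it is not stated on the page and is inferred here because it makes the listed motif 46-RIEWWYVTGWLET-58 line up with the sequence. All names in the code are illustrative.

```python
# First 69 residues of the sequence listed on this page (one-letter code).
N_TERMINAL_SEGMENT = (
    "vlapvvpgkalefpqdfgahndfriewwyvtgwletptgk"
    "plgfqitffrtateidrdnpshfapdqli"
).upper()

# Assumption (inferred, not stated on the page):
# structure residue number = position in the listed sequence + 22.
NUMBERING_OFFSET = 22

THREE_TO_ONE = {
    "Pro": "P", "Glu": "E", "Trp": "W", "Tyr": "Y",
    "Gly": "G", "Leu": "L", "Phe": "F",
}


def residue_at(structure_number):
    """Return the one-letter residue at a structure residue number."""
    index = structure_number - NUMBERING_OFFSET - 1  # 0-based string index
    if not 0 <= index < len(N_TERMINAL_SEGMENT):
        raise IndexError(f"residue {structure_number} is outside the segment")
    return N_TERMINAL_SEGMENT[index]


# Residues named in the text that fall inside the segment (residues 23-91).
NAMED_RESIDUES = [
    ("Pro", 36), ("Glu", 48), ("Trp", 49), ("Trp", 50),
    ("Tyr", 51), ("Gly", 54), ("Leu", 56), ("Phe", 71),
]

mismatches = [
    (name, number, residue_at(number))
    for name, number in NAMED_RESIDUES
    if residue_at(number) != THREE_TO_ONE[name]
]
print("mismatches:", mismatches)  # prints: mismatches: []
```

    With this offset every residue checked agrees with the text, which supports reading the residue numbers as structure numbering.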

    Analysis of the crystallographic packing of NP_841447.1 using the PQS server indicates that a monomer is the biologically relevant form.




    Figure 1: Structure of NE1406. Stereo ribbon diagram colored from blue to red, N- to C-terminus. Secondary structure elements and termini are indicated.


    SCOP classifies NE1406 as a new fold termed AttH-like. The structure is composed of two domains that likely arose from gene duplication (Fig. 2). The N-terminal domain (residues 24-220) comprises 13 β-strands arranged in the form of a flattened barrel, with a 3₁₀-helix (H1 in Fig. 1) capping the barrel at one end. The C-terminal domain (residues 221-352) is arranged perpendicular to the long axis of the N-terminal barrel. It comprises 10 β-strands and can be superimposed on the N-terminal domain with a main-chain rmsd of 2.4 Å over 105 residues (Fig. 2A), although the sequence identity is insignificant at only 9% (Fig. 2B). Strands β5-β6 are absent from the C-terminal domain, while strand β11 is replaced by a loop (Fig. 2A). The 3₁₀-helix capping the N-terminal barrel is replaced in the C-terminal domain by two longer strands, β18-β19, that cross over in the middle to hydrogen-bond again at opposite ends (Figs. 1, 2A).





    Figure 2: NE1406 exhibits domain duplication. (A) Stereo ribbon diagram of the NE1406 N-terminal domain (residues 24-220, in blue) superimposed on the NE1406 C-terminal domain (residues 221-352, in gray). (B) Structure-guided alignment of the N- and C-terminal domains of NE1406. Identical residues are boxed in orange, conservative substitutions in purple. Ala-74 is underlined to denote the eight-residue break in the chain between Ala-74 and Ser-83. The missing region was not modeled due to poor electron density and is likely to be flexible.


    A search with FATCAT using the entire NE1406 structure gave no significant hits. Individually, the N- and C-terminal domains both showed the closest structural similarity to the outer membrane protein PagL from Pseudomonas aeruginosa (PDB id: 2ERV), with main-chain rmsds of 3.4 Å and 3.2 Å over 198 and 160 residues and sequence identities of 3% and 4%, respectively. Other structurally similar proteins superimposable on the N-terminal domain of NE1406 with zero twists include outer membrane proteins (PDB ids: 2JMM, 1K24, 1P4T), a phenolic acid decarboxylase (PDB id: 2GC9), avidin- and streptavidin-related proteins (PDB ids: 1AVD, 1WBI, 1Y52, 2CIQ, 2UYW, 1STP), fatty-acid binding proteins (PDB ids: 1G5W, 2Q9S), nitrophorins (PDB ids: 1D2U, 1U17) and a retinoic-acid binding protein (PDB id: 1BLR). Similar results were obtained for the C-terminal domain.
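
    The rmsd values quoted above measure how far apart equivalent main-chain atoms lie once two structures have been superimposed. The following minimal sketch shows only that final step, for two already superimposed and equally long coordinate lists. It does not perform the superposition or the flexible alignment that FATCAT carries out, and the coordinates in the example are made up for illustration.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between paired (x, y, z) coordinates.

    Both lists must already be superimposed and list equivalent atoms
    in the same order.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared_sum = sum(
        (ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
        for (ax, ay, az), (bx, by, bz) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared_sum / len(coords_a))


# Hypothetical coordinates in Å: every atom is displaced by (3, 4, 0),
# i.e. by 5 Å, so the rmsd is 5 Å.
domain_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 1.0, 0.0)]
domain_b = [(3.0, 4.0, 0.0), (4.0, 4.0, 0.0), (5.0, 5.0, 0.0)]
print(f"rmsd = {rmsd(domain_a, domain_b):.1f} Å")  # prints: rmsd = 5.0 Å
```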


    Phenolic acid decarboxylases, avidins, fatty-acid binding proteins, nitrophorins and retinoic-acid binding proteins are all members of the calycin superfamily (Flower 1993a, Flower 1993b, Pfam clan CL0116). This structural superfamily with very low global sequence similarity is characterized by an antiparallel 8-stranded β-barrel and a short 3₁₀-helix, located directly before the first β-strand, that closes off the barrel at one end. The 3₁₀-helix is latched by a conserved cation-pi interaction involving a tryptophan from the first β-strand and a lysine or arginine residue from the final β-strand of the barrel, with both residues additionally forming hydrogen bonds with main-chain atoms in the 3₁₀-helix (Flower 2000). This signature is conserved in NE1406 (Fig. 3) and the DUF2006 family. In NE1406, the side-chain of Arg-214 from strand β13 interacts with main-chain atoms in both strand β1 and the N-terminal 3₁₀-helix, whereas hydrogen-bonding of the Trp-50 side-chain to the 3₁₀-helix is mediated via a glycerol molecule (Fig. 3). A strictly conserved glutamate (Glu-48 in NE1406) provides additional main-chain bonding to the loop between the 3₁₀-helix and strand β1. The calycin signature is absent from the NE1406 C-terminal domain (Fig. 2).




    Figure 3: NE1406 exhibits the calycin superfamily structural signature. Stereo ribbon diagram of the N-terminal domain of NE1406 shows, in red, the stacked arginine and tryptophan residues characteristic of the calycin fold (Flower 2000). Hydrogen bonds are indicated with dashed lines. A glycerol molecule, in cyan, mediates bonding of Trp-50 to the 3₁₀-helix.


    Analysis of the structural superposition of NE1406 with members of the calycin superfamily reveals a number of systematic differences (Fig. 4). The sheets forming the NE1406 barrel are both longer and flatter than the ones seen in calycins, resulting in a narrower opening at the bottom of the barrel, where the binding site of calycins resides. Secondary structure elements, such as the long C-terminal alpha-helix characteristic of lipocalins (e.g. nitrophorin), a structurally and functionally distinct subclass (Flower 2000; Skerra 2000) of the calycins, are also absent from NE1406. Finally, the side-chains of the signature residues are in different conformations than those typically described for calycins, with Trp-50 adopting a different rotamer in NE1406 than the one found in calycins and Arg-214 not adopting a fully extended conformation.


    NE1406 likely provides the first structural template for two additional protein families. A search with HHpred against Pfam gave E-values of 1E-15 and 1.5E-07 for PF07143 and PF08622 respectively. PF07143 is implicated in carotene metabolism while PF08622 is involved in promoting survival during oxidative stress. The role of isoprenoids in photoprotection in plants (Penuelas 2005) and anti-oxidant defence in other eukaryotes (Tapiero 2004, Rao 2008) is well documented. A number of lipocalins, such as apolipoprotein D (Sanchez 2006, Charron 2008, Eichinger et al 2007), neutrophil gelatinase-associated lipocalin (Roudkenar 2008, Goetz et al 2002) and alpha1-microglobulin (Olson 2008, Schönfeld et al 2008) provide protection against oxidative stress by means of isoprenoids. Other members of the calycin superfamily, such as avidins (PF01382) and triabins (PF03973), are not involved in this response. We therefore searched for other indications that NE1406 might be related to the lipocalin/cytosolic fatty-acid binding protein family (PF00061).




    Figure 4: Structural superposition of NE1406 with members of the calycin superfamily. (A) Ribbon diagrams depicting front and back view of NE1406 (PDB id: 2ICH, residues 24-220, in grey) and nitrophorin 4 from Rhodnius prolixus (PDB id: 1D2U, residues 22-205, in green). (B) Ribbon diagrams depicting front and back view of NE1406 (PDB id: 2ICH, residues 24-220, in grey) and avidin from Gallus gallus (PDB id: 1AVD, residues 3-125, in blue). The Trp-Arg signatures are depicted in ball-and-stick. The ligands, heme for nitrophorin 4 and biotin for avidin, are colored in orange.


    In addition to their common structural features, lipocalins are characterized by the presence of three short conserved regions (SCRs) in their primary sequence that act as a family signature (see Grzyb 2006 for review). While kernel lipocalins contain all three SCRs, outlier lipocalins possess only one or two. The sequence of NE1406 shows weak similarity to the SCRs determined for bacterial lipocalins (Bishop 2000). As in lipocalins, the locations of SCR1 and SCR3 in NE1406 correspond to the first and last strands of the barrel. SCR2 is located in loop regions between strands in both NE1406 and bacterial lipocalins. However, in bacterial lipocalins SCR2 is located between strands β6-β7, whereas in NE1406 it is found between strands β2-β3, making the significance of this limited similarity unclear.


    SCR1 signature from E. coli (PDB id: 2ACO)    38-RYLGTWYEIARTD-50
    SCR1 signature from P. aeruginosa             38-RYQGTWYELARLP-50
    SCR1 signature from C. jejuni                 22-NYMGEWLEIARKP-34
    NE1406                                        46-RIEWWYVTGWLET-58

    SCR2 signature from E. coli (PDB id: 2ACO)   121-LDREYRHALV-130
    SCR2 signature from P. aeruginosa            128-VDDDYRTALV-137
    SCR2 signature from C. jejuni                107-VDSEYKVAIV-116
    NE1406                                        77-IDRDNPSHFA-86

    SCR3 signature from E. coli (PDB id: 2ACO)   136-DYLWILSRTP-145
    SCR3 signature from P. aeruginosa            143-EYLWLLSRTP-152
    SCR3 signature from C. jejuni                122-KYLWILARNI-131
    NE1406                                       207-TGTAWLDREW-216
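
    To give a rough sense of how weak the similarity in these rows is, the sketch below counts exact identities between each bacterial lipocalin signature and the corresponding NE1406 segment. It is an illustration only: it compares the rows exactly as listed, without gaps and without scoring conservative substitutions, and it is not the method used to define the SCRs. The function and variable names are illustrative.

```python
def count_identities(row_a, row_b):
    """Count positions with identical residues in two equal-length rows."""
    if len(row_a) != len(row_b):
        raise ValueError("rows must have the same length")
    return sum(1 for a, b in zip(row_a, row_b) if a == b)


# Signature rows as listed on this page.
SCR_ROWS = {
    "SCR1": {
        "E. coli": "RYLGTWYEIARTD",
        "P. aeruginosa": "RYQGTWYELARLP",
        "C. jejuni": "NYMGEWLEIARKP",
        "NE1406": "RIEWWYVTGWLET",
    },
    "SCR2": {
        "E. coli": "LDREYRHALV",
        "P. aeruginosa": "VDDDYRTALV",
        "C. jejuni": "VDSEYKVAIV",
        "NE1406": "IDRDNPSHFA",
    },
    "SCR3": {
        "E. coli": "DYLWILSRTP",
        "P. aeruginosa": "EYLWLLSRTP",
        "C. jejuni": "KYLWILARNI",
        "NE1406": "TGTAWLDREW",
    },
}


def identities_to_ne1406(rows):
    """Identities of each lipocalin row to the NE1406 row of one SCR."""
    reference = rows["NE1406"]
    return {
        name: count_identities(sequence, reference)
        for name, sequence in rows.items()
        if name != "NE1406"
    }


for region, rows in SCR_ROWS.items():
    print(region, identities_to_ne1406(rows))
```

    In this ungapped comparison NE1406 matches each lipocalin row at no more than two positions, whereas the E. coli and P. aeruginosa SCR1 rows match each other at 9 of 13 positions. This is consistent with the caution expressed above about the significance of the similarity.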



    A search with CASTp identified the largest cavity (820 Å³) in NE1406 along the interface between the two domains. The cavity is lined by a large number of aromatic (Trp-49, Tyr-51, Phe-71, Tyr-185, Trp-216, Tyr-220, Trp-228, Trp-230, Phe-244, Tyr-326) and hydrophobic (Ile-47, Ile-92, Ala-93, Leu-221, Ile-246, Ala-322) residues. The NE1406 domain interface is also the region of the DUF2006 family that exhibits the highest degree of sequence conservation. In the structure, this interface harbors a bound buffer molecule (NHE) as ligand. The residues involved in the interaction (Ile-47, Trp-49, Phe-71, Lys-175, Tyr-185, Glu-215, Ser-217 and Tyr-220 from the N-terminal domain; Leu-221, Trp-228, Trp-230, Phe-244, Ile-246 and Glu-346 from the C-terminal domain) are highly or strictly conserved among DUF2006 homologs, suggesting that this interface plays a functional role.


    Lipocalins have been likened to antibodies because of the high degree of structural plasticity that their binding sites exhibit with numerous examples where structural consolidation occurs upon binding (see Skerra 2008 for review). As a result, the lipocalin fold has been employed in a number of protein-engineering studies (Beste 1999, Korndorfer 2003). While it is possible that the two lipocalin-like barrels in NE1406 adopt different conformations in the presence of a ligand, in the crystal structure they lack the large internal cavity that is typical for lipocalins and also the long structurally flexible loops at the open end of the beta barrel (Skerra 2000). In fact, only one of the beta barrel domains of NE1406 harbors a small glycerol molecule as ligand.


    The ability to form dimers is another feature of the lipocalin family, with ligand presence influencing oligomerization (Grzyb 2006). Although NE1406 is not predicted to form a dimer in solution, the relative orientation of the two protein domains within the polypeptide chain could be subject to regulation. With a molecule bound at the domain interface, the two barrels are stabilized in a perpendicular orientation with respect to each other. Binding at the domain interface might also play a role in regulating the shape of the binding cavity within one or both of the beta barrels.


    Finally, some lipocalins, such as Blc, ApoD and lazarillo, are known to be peripherally anchored to biological membranes, where they are thought to play a role in membrane biogenesis and repair (Bishop 2000). A search with PROFtmb shows that NE1406 is not predicted to be a transmembrane beta-barrel (Z-score 2.9). However, the loop between strands β2-β3 and the 23 N-terminal residues, both of which lie on the same side of the molecule, are predicted to associate with membranes (Fig. 5), which may be similar to the type of association with lipoprotein particles presumed for ApoD (Eichinger et al 2007).




    Figure 5: Model of NE1406 interacting with a membrane. The dashed lines indicate the regions not modeled in the structure. N indicates the N-terminus.


    Bacterial lipocalins are peripheral outer membrane proteins fixed via a lipoprotein lipid anchor. Expressed under conditions known to exert stress on the bacterial envelope, the Blc from E. coli has a high affinity for lysophospholipids (LPLs), which may also be bound inside the beta-barrel, and is thought to be involved in cell envelope LPL transport (Campanacci 2006). Although the exact mechanisms of transperiplasmic movement of lipids between inner and outer membranes are largely unknown, ATP-binding cassette transporters are involved in this process (Doerrler 2004, Reyes 2005).




    The genome context (http://string.embl.de) of NE1406 shows a predicted functional association with the lipoprotein-releasing system ATP-binding protein LolD (lolD) and with a putative uroporphyrin-III C-methyltransferase (NE1403), an enzyme implicated in heme biosynthesis, as well as co-occurrence with an ATP-binding protein ABC transporter (NE1404). ATP-dependent ABC transporters show a high degree of confidence in a functional association with most members of the DUF2006 family, as do other transmembrane proteins, including Na+/H+ antiporters, sensor histidine kinases and lipoproteins (e.g. LprI precursor in Mycobacterium tuberculosis). The systematic presence of ATP-binding cassettes and lipoproteins is compatible with a role for the DUF2006 family in energy-requiring lipid transport, while the presence of numerous signal transduction genes might indicate expression under specific conditions, such as environmental stress. Further experiments will be required to functionally characterize NE1406 and to determine whether it associates with lipids in vitro or in vivo and whether its transcription is subject to environmental regulation.


    Given the wide phylogenetic presence of the DUF2006 family, if an experimental connection to calycins or lipocalins is determined, this would present the first evidence of a lipocalin-related protein in the Archaea domain and would settle the question of whether or not this family may have arisen via horizontal transfer to eukaryotic cells from the endosymbiotic alpha-proteobacterial ancestor of the mitochondrion (Bishop 2000).




    Ligand Summary





    1. (No Results)


    All content on this site is licensed under a Creative Commons Attribution 3.0 License