Conserved domains on  [gi|1937369563|ref|NP_148981|]

protein transport protein Sec31A [Rattus norvegicus]

Protein Classification

List of domain hits

Name Accession Description Interval E-value
WD40 super family cl29593
WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions ...
84-340 2.58e-29

WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel β-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed-ring propeller structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment.


The actual alignment was detected with superfamily member cd00200:

Pssm-ID: 475233 [Multi-domain]  Cd Length: 289  Bit Score: 118.98  E-value: 2.58e-29
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1937369563   84 DVSGVLIAGGENGNIILYDPSkiiagDKEVVIAQKDkHTGPVRALDVnIFQTNLVASGANESEIYIWDLNNFATPMTPGA 163
Cdd:cd00200     19 PDGKLLATGSGDGTIKVWDLE-----TGELLRTLKG-HTGPVRDVAA-SADGTYLASGSSDKTIRLWDLETGECVRTLTG 91
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1937369563  164 KTQppeDISCIAWNRQvQHILASASPSGRATVWDLRKNEPIIKVSDHSNRMHCsgLAWHPDvaTQMVLASEDDRLpvVQM 243
Cdd:cd00200     92 HTS---YVSSVAFSPD-GRILSSSSRDKTIKVWDVETGKCLTTLRGHTDWVNS--VAFSPD--GTFVASSSQDGT--IKL 161
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1937369563  244 WDLRfASSPLRVLENHARGILAIAWSmADPELLLSCGKDAKILCSNPNTGEVLYELPTNTQWCFDIQWCPrNPAVLSAAS 323
Cdd:cd00200    162 WDLR-TGKCVATLTGHTGEVNSVAFS-PDGEKLLSSSSDGTIKLWDLSTGKCLGTLRGHENGVNSVAFSP-DGYLLASGS 238
                          250
                   ....*....|....*..
gi 1937369563  324 FDGRIRVYSIMGGSIDG 340
Cdd:cd00200    239 EDGTIRVWDLRTGECVQ 255
ACE1-Sec16-like super family cl14807
Ancestral coatomer element 1 (ACE1) of COPII coat complex assembly protein Sec16; COPII coat ...
573-767 5.95e-09

Ancestral coatomer element 1 (ACE1) of COPII coat complex assembly protein Sec16. The COPII coat complex plays an important role in vesicular traffic of newly synthesized proteins from the endoplasmic reticulum (ER) to the Golgi apparatus by mediating the formation of transport vesicles. COPII consists of an outer coat, made up of the scaffold proteins Sec31 and Sec13, and the cargo adaptor complex, Sec23 and Sec24, which are recruited by the small GTPase Sar1. Sec16 is involved in the early steps of the assembly process. Sec16 forms elongated heterotetramers with Sec13, Sec13-(Sec16)2-Sec13. It interacts with Sec13 by insertion of a single beta-blade to close the six-bladed beta propeller of Sec13. Sec13 interacts in the same way with Sec31 and with Nup145C, a nuclear pore protein; all of these contain a structurally related ancestral coatomer element 1 (ACE1). Sec16 is believed to be a key component in maintaining the integrity of the ER exit site.


The actual alignment was detected with superfamily member cd09233:

Pssm-ID: 449359 [Multi-domain]  Cd Length: 314  Bit Score: 59.19  E-value: 5.95e-09
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1937369563  573 ITRALLTGNFESAVDLCLhDNRM-ADAIILAIAGGQELLAQTQKKyFAKSQSKIT---RLITAVVMKNWKEIVESC---- 644
Cdd:cd09233     69 FRNLLLTGNRKEALELAL-DNGLwAHALLLASSLGKETWAEVVSR-FARSESKLNdplQTLYQLFSGNSPEAITELadnp 146
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1937369563  645 -----DLKNWREALAAVLTYAKPD-EFSALCDLlgarleseGDSLLRTQ----ACLCYICAGnverlvacwtkAQDGSNP 714
Cdd:cd09233    147 aeaewALGNWREHLAIILSNRTSNlDLEALVEL--------GDLLAQRGlveaAHICYLLAG-----------VPLGPYP 207
                          170       180       190       200       210       220       230
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1937369563  715 LS-------LQDLIEKVVILR--KAVQLT------QALDTNTVG--ALLAEKMsQYANLLAAQGSIAAAL 767
Cdd:cd09233    208 SSpsscllgGAVHNKSPRTFAtpEAIQLTeiyeyaLSLGNPQFGlpHLQPYKL-IHAARLAELGLVSEAL 276
PHA03247 super family cl33720
large tegument protein UL36; Provisional
795-1146 1.04e-07

large tegument protein UL36; Provisional


The actual alignment was detected with superfamily member PHA03247:

Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 56.87  E-value: 1.04e-07
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1937369563  795 PGQESSRSSYEGQPLPKGGPGPLAGH--PQVSRVQ-SQQYYPQVRIAPTVTTWSDRTPTALPshPPAAcpSDTQGGNPPP 871
Cdd:PHA03247  2628 PPSPSPAANEPDPHPPPTVPPPERPRddPAPGRVSrPRRARRLGRAAQASSPPQRPRRRAAR--PTVG--SLTSLADPPP 2703
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1937369563  872 PGfimhgnvVPNSPAPLPTSPGhmhsqpppypqpqpyQPAQQYSFGTGGSAVYRPQQPVAPPAsnayPNAPYVSPVASYS 951
Cdd:PHA03247  2704 PP-------PTPEPAPHALVSA---------------TPLPPGPAAARQASPALPAAPAPPAV----PAGPATPGGPARP 2757
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1937369563  952 GQPQMYTAQPASSPTSSSAPLPPPPSSGASFQHGGPGAPPSSSAYALPPGTTGTPPAASELPASQRTGPqngwNDPPALN 1031
Cdd:PHA03247  2758 ARPPTTAGPPAPAPPAAPAAGPPRRLTRPAVASLSESRESLPSPWDPADPPAAVLAPAAALPPAASPAG----PLPPPTS 2833
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1937369563 1032 RVPKKKKLPENFMPPVPITSPIMNPGGDPQ---PQGLQQQPSASGPRSSHASFPQPHLAggqpfhgiQQPLAQTGMPPSF 1108
Cdd:PHA03247  2834 AQPTAPPPPPGPPPPSLPLGGSVAPGGDVRrrpPSRSPAAKPAAPARPPVRRLARPAVS--------RSTESFALPPDQP 2905
                          330       340       350
                   ....*....|....*....|....*....|....*...
gi 1937369563 1109 SKPNTEGAPGAPIGNTIQHVQALPTEKITKKPIPDEHL 1146
Cdd:PHA03247  2906 ERPPQPQAPPPPQPQPQPPPPPQPQPPPPPPPRPQPPL 2943
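The summary line that precedes each alignment (Pssm-ID, Cd Length, Bit Score, E-value) and the model coordinates on the Cdd rows can be read programmatically. The following is a minimal Python sketch, not an NCBI tool: the names parse_summary and model_coverage are hypothetical, and the regular expression assumes the line layout shown on this page.

```python
import re

# Matches lines such as:
# "Pssm-ID: 475233 [Multi-domain]  Cd Length: 289  Bit Score: 118.98  E-value: 2.58e-29"
# The bracketed tag is optional (some hits on this page omit it).
SUMMARY_RE = re.compile(
    r"Pssm-ID:\s*(\d+)(?:\s*\[([^\]]*)\])?\s+"
    r"Cd Length:\s*(\d+)\s+Bit Score:\s*([\d.]+)\s+E-value:\s*(\S+)"
)


def parse_summary(line):
    """Return the fields of one hit summary line as a dict (hypothetical helper)."""
    match = SUMMARY_RE.search(line)
    if match is None:
        raise ValueError("not a hit summary line: %r" % line)
    pssm_id, tag, cd_length, bit_score, evalue = match.groups()
    return {
        "pssm_id": int(pssm_id),
        "tag": tag,  # e.g. "Multi-domain", or None when absent
        "cd_length": int(cd_length),
        "bit_score": float(bit_score),
        "evalue": float(evalue),
    }


def model_coverage(model_start, model_end, cd_length):
    """Fraction of the domain model spanned by the aligned model positions."""
    return (model_end - model_start + 1) / cd_length


hit = parse_summary(
    "Pssm-ID: 475233 [Multi-domain]  Cd Length: 289  Bit Score: 118.98  E-value: 2.58e-29"
)
# The cd00200 rows above start at model position 19 and end at 255.
print(hit["pssm_id"], round(model_coverage(19, 255, hit["cd_length"]), 3))
```

Coverage here is measured on the model, not on the query: it indicates how much of the cd00200 profile the 84-340 interval of the query actually spans.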
 
Name Accession Description Interval E-value
WD40 cd00200
WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions ...
84-340 2.58e-29

WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel β-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed-ring propeller structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment.


Pssm-ID: 238121 [Multi-domain]  Cd Length: 289  Bit Score: 118.98  E-value: 2.58e-29
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1937369563   84 DVSGVLIAGGENGNIILYDPSkiiagDKEVVIAQKDkHTGPVRALDVnIFQTNLVASGANESEIYIWDLNNFATPMTPGA 163
Cdd:cd00200     19 PDGKLLATGSGDGTIKVWDLE-----TGELLRTLKG-HTGPVRDVAA-SADGTYLASGSSDKTIRLWDLETGECVRTLTG 91
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1937369563  164 KTQppeDISCIAWNRQvQHILASASPSGRATVWDLRKNEPIIKVSDHSNRMHCsgLAWHPDvaTQMVLASEDDRLpvVQM 243
Cdd:cd00200     92 HTS---YVSSVAFSPD-GRILSSSSRDKTIKVWDVETGKCLTTLRGHTDWVNS--VAFSPD--GTFVASSSQDGT--IKL 161
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1937369563  244 WDLRfASSPLRVLENHARGILAIAWSmADPELLLSCGKDAKILCSNPNTGEVLYELPTNTQWCFDIQWCPrNPAVLSAAS 323
Cdd:cd00200    162 WDLR-TGKCVATLTGHTGEVNSVAFS-PDGEKLLSSSSDGTIKLWDLSTGKCLGTLRGHENGVNSVAFSP-DGYLLASGS 238
                          250
                   ....*....|....*..
gi 1937369563  324 FDGRIRVYSIMGGSIDG 340
Cdd:cd00200    239 EDGTIRVWDLRTGECVQ 255
WD40 COG2319
WD40 repeat [General function prediction only];
89-333 6.26e-24

WD40 repeat [General function prediction only];


Pssm-ID: 441893 [Multi-domain]  Cd Length: 403  Bit Score: 105.76  E-value: 6.26e-24
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1937369563   89 LIAGGENGNIILYDpskiIAGDKEvvIAQKDKHTGPVRALDVNiFQTNLVASGANESEIYIWDLNNFATPMTPGAKTQPp 168
Cdd:COG2319    177 LASGSDDGTVRLWD----LATGKL--LRTLTGHTGAVRSVAFS-PDGKLLASGSADGTVRLWDLATGKLLRTLTGHSGS- 248
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1937369563  169 edISCIAWNRQVQHiLASASPSGRATVWDLRKNEPIIKVSDHSNRMHcsGLAWHPDvATQMVLASEDDRlpvVQMWDLRf 248
Cdd:COG2319    249 --VRSVAFSPDGRL-LASGSADGTVRLWDLATGELLRTLTGHSGGVN--SVAFSPD-GKLLASGSDDGT---VRLWDLA- 318
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1937369563  249 ASSPLRVLENHARGILAIAWSmADPELLLSCGKDAKILCSNPNTGEVLYELPTNTQWCFDIQWCPRNPAVLSaASFDGRI 328
Cdd:COG2319    319 TGKLLRTLTGHTGAVRSVAFS-PDGKTLASGSDDGTVRLWDLATGELLRTLTGHTGAVTSVAFSPDGRTLAS-GSADGTV 396

                   ....*
gi 1937369563  329 RVYSI 333
Cdd:COG2319    397 RLWDL 401
ACE1-Sec16-like cd09233
Ancestral coatomer element 1 (ACE1) of COPII coat complex assembly protein Sec16; COPII coat ...
573-767 5.95e-09

Ancestral coatomer element 1 (ACE1) of COPII coat complex assembly protein Sec16. The COPII coat complex plays an important role in vesicular traffic of newly synthesized proteins from the endoplasmic reticulum (ER) to the Golgi apparatus by mediating the formation of transport vesicles. COPII consists of an outer coat, made up of the scaffold proteins Sec31 and Sec13, and the cargo adaptor complex, Sec23 and Sec24, which are recruited by the small GTPase Sar1. Sec16 is involved in the early steps of the assembly process. Sec16 forms elongated heterotetramers with Sec13, Sec13-(Sec16)2-Sec13. It interacts with Sec13 by insertion of a single beta-blade to close the six-bladed beta propeller of Sec13. Sec13 interacts in the same way with Sec31 and with Nup145C, a nuclear pore protein; all of these contain a structurally related ancestral coatomer element 1 (ACE1). Sec16 is believed to be a key component in maintaining the integrity of the ER exit site.


Pssm-ID: 187750 [Multi-domain]  Cd Length: 314  Bit Score: 59.19  E-value: 5.95e-09
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1937369563  573 ITRALLTGNFESAVDLCLhDNRM-ADAIILAIAGGQELLAQTQKKyFAKSQSKIT---RLITAVVMKNWKEIVESC---- 644
Cdd:cd09233     69 FRNLLLTGNRKEALELAL-DNGLwAHALLLASSLGKETWAEVVSR-FARSESKLNdplQTLYQLFSGNSPEAITELadnp 146
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1937369563  645 -----DLKNWREALAAVLTYAKPD-EFSALCDLlgarleseGDSLLRTQ----ACLCYICAGnverlvacwtkAQDGSNP 714
Cdd:cd09233    147 aeaewALGNWREHLAIILSNRTSNlDLEALVEL--------GDLLAQRGlveaAHICYLLAG-----------VPLGPYP 207
                          170       180       190       200       210       220       230
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1937369563  715 LS-------LQDLIEKVVILR--KAVQLT------QALDTNTVG--ALLAEKMsQYANLLAAQGSIAAAL 767
Cdd:cd09233    208 SSpsscllgGAVHNKSPRTFAtpEAIQLTeiyeyaLSLGNPQFGlpHLQPYKL-IHAARLAELGLVSEAL 276
Sec16_C pfam12931
Sec23-binding domain of Sec16; Sec16 is a multi-domain vesicle coat protein. The C-terminal ...
573-767 7.19e-08

Sec23-binding domain of Sec16; Sec16 is a multi-domain vesicle coat protein. The C-terminal region is the part that binds to Sec23, a COPII vesicle coat protein. This association is part of the transport vesicle coat structure.


Pssm-ID: 432884  Cd Length: 279  Bit Score: 55.26  E-value: 7.19e-08
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1937369563  573 ITRALLTGNFESAVDLCLhDNRM-ADAIILAIAGGQELLAQTQKKY----FAKSQSKITRLItAVVMK----NWKEIVE- 642
Cdd:pfam12931    1 IRALLLTGDREKALWLAL-DKKLwAHALLIASTLGKEKWKEVVQEFvrseFKGSNNKSGESL-AALYQvfagNSEEAVDe 78
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1937369563  643 --------SCDLKNWREALAAVLTYAKPDEFSALCDlLGARLESEGdslLRTQACLCYICAgNVERLVACWTKAQDGSNP 714
Cdd:pfam12931   79 lvppsknaLWALDNWRETLALVLSNRSPGDVEALLA-LGDLLAQYG---RTEAAHICFLLA-GLPLSQTVLLGADHVRFP 153
                          170       180       190       200       210       220
                   ....*....|....*....|....*....|....*....|....*....|....*....|....
gi 1937369563  715 LSLQDLIEkvvilrkAVQLTQ----ALDTNTVGA-------LLAEKMsQYANLLAAQGSIAAAL 767
Cdd:pfam12931  154 STFGNDLE-------SILLTEiyeyALSLSPPQPpfvglphLLPYKL-QHAAVLAEYGLVSEAQ 209
PHA03247 PHA03247
large tegument protein UL36; Provisional
795-1146 1.04e-07

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 56.87  E-value: 1.04e-07
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1937369563  795 PGQESSRSSYEGQPLPKGGPGPLAGH--PQVSRVQ-SQQYYPQVRIAPTVTTWSDRTPTALPshPPAAcpSDTQGGNPPP 871
Cdd:PHA03247  2628 PPSPSPAANEPDPHPPPTVPPPERPRddPAPGRVSrPRRARRLGRAAQASSPPQRPRRRAAR--PTVG--SLTSLADPPP 2703
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1937369563  872 PGfimhgnvVPNSPAPLPTSPGhmhsqpppypqpqpyQPAQQYSFGTGGSAVYRPQQPVAPPAsnayPNAPYVSPVASYS 951
Cdd:PHA03247  2704 PP-------PTPEPAPHALVSA---------------TPLPPGPAAARQASPALPAAPAPPAV----PAGPATPGGPARP 2757
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1937369563  952 GQPQMYTAQPASSPTSSSAPLPPPPSSGASFQHGGPGAPPSSSAYALPPGTTGTPPAASELPASQRTGPqngwNDPPALN 1031
Cdd:PHA03247  2758 ARPPTTAGPPAPAPPAAPAAGPPRRLTRPAVASLSESRESLPSPWDPADPPAAVLAPAAALPPAASPAG----PLPPPTS 2833
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1937369563 1032 RVPKKKKLPENFMPPVPITSPIMNPGGDPQ---PQGLQQQPSASGPRSSHASFPQPHLAggqpfhgiQQPLAQTGMPPSF 1108
Cdd:PHA03247  2834 AQPTAPPPPPGPPPPSLPLGGSVAPGGDVRrrpPSRSPAAKPAAPARPPVRRLARPAVS--------RSTESFALPPDQP 2905
                          330       340       350
                   ....*....|....*....|....*....|....*...
gi 1937369563 1109 SKPNTEGAPGAPIGNTIQHVQALPTEKITKKPIPDEHL 1146
Cdd:PHA03247  2906 ERPPQPQAPPPPQPQPQPPPPPQPQPPPPPPPRPQPPL 2943
Atrophin-1 pfam03154
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ...
772-1127 2.02e-06

Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA (OMIM:125370) is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1 that is thought to confer toxicity to the protein, possibly by altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.


Pssm-ID: 460830 [Multi-domain]  Cd Length: 991  Bit Score: 52.46  E-value: 2.02e-06
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1937369563  772 DNTNQPDIVQLRDRLCRAQGRSVPGQESSRSSYEGQPLPKGGPGPLAGHPQVSRVQSQQYYPQVRIAPTVTTwSDRTPTA 851
Cdd:pfam03154  159 DSSAQQQILQTQPPVLQAQSGAASPPSPPPPGTTQAATAGPTPSAPSVPPQGSPATSQPPNQTQSTAAPHTL-IQQTPTL 237
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1937369563  852 LPSHPPAACPSDTQGGNPPPPGFI---------MHGNVVPnSPAPLPTSPGHMHSQP-----PPYPQPQPYQPAQQYSFG 917
Cdd:pfam03154  238 HPQRLPSPHPPLQPMTQPPPPSQVspqplpqpsLHGQMPP-MPHSLQTGPSHMQHPVppqpfPLTPQSSQSQVPPGPSPA 316
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1937369563  918 TGGSAVYRPQQPVA--------PPASNAYPNAPYVSP---VASYSGQPQMYTAQpassptssSAPLPPPPSSGASFQHGG 986
Cdd:pfam03154  317 APGQSQQRIHTPPSqsqlqsqqPPREQPLPPAPLSMPhikPPPTTPIPQLPNPQ--------SHKHPPHLSGPSPFQMNS 388
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1937369563  987 PGAPP------SSSAYALPPgtTGTPPAASELPASQ---------------RTGPQNGWNDPP--ALNRVPKKKKLPEN- 1042
Cdd:pfam03154  389 NLPPPpalkplSSLSTHHPP--SAHPPPLQLMPQSQqlppppaqppvltqsQSLPPPAASHPPtsGLHQVPSQSPFPQHp 466
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1937369563 1043 FMP--PVPITSPiMNPGGDPQPQGLQQQPSASGPRSSHASFPQPHLAGGQPFHGIQQPLAQTGMPPSFSKPNTEGAPGAP 1120
Cdd:pfam03154  467 FVPggPPPITPP-SGPPTSTSSAMPGIQPPSSASVSSSGPVPAAVSCPLPPVQIKEEALDEAEEPESPPPPPRSPSPEPT 545

                   ....*..
gi 1937369563 1121 IGNTIQH 1127
Cdd:pfam03154  546 VVNTPSH 552
PLN00181 PLN00181
protein SPA1-RELATED; Provisional
203-333 4.61e-04

protein SPA1-RELATED; Provisional


Pssm-ID: 177776 [Multi-domain]  Cd Length: 793  Bit Score: 44.69  E-value: 4.61e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1937369563  203 PIIKVSdhsNRMHCSGLAWHPDVATQMVLASEDDrlpVVQMWDLrfASSPLRV-LENHARGILAIAWSMADPELLLSCGK 281
Cdd:PLN00181   525 PVVELA---SRSKLSGICWNSYIKSQVASSNFEG---VVQVWDV--ARSQLVTeMKEHEKRVWSIDYSSADPTLLASGSD 596
                           90       100       110       120       130
                   ....*....|....*....|....*....|....*....|....*....|..
gi 1937369563  282 DAKILCSNPNTGEVLYELPTNTQWCFdIQWCPRNPAVLSAASFDGRIRVYSI 333
Cdd:PLN00181   597 DGSVKLWSINQGVSIGTIKTKANICC-VQFPSESGRSLAFGSADHKVYYYDL 647
WD40 smart00320
WD40 repeats; Note that these repeats are permuted with respect to the structural repeats ...
292-332 7.66e-03

WD40 repeats; Note that these repeats are permuted with respect to the structural repeats (blades) of the beta propeller domain.


Pssm-ID: 197651 [Multi-domain]  Cd Length: 40  Bit Score: 35.37  E-value: 7.66e-03
                            10        20        30        40
                    ....*....|....*....|....*....|....*....|.
gi 1937369563   292 TGEVLYELPTNTQWCFDIQWCPRNPAVLSAaSFDGRIRVYS 332
Cdd:smart00320    1 SGELLKTLKGHTGPVTSVAFSPDGKYLASG-SDDGTIKLWD 40
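Each alignment block pairs a query row (gi ...) with a model row (Cdd:...), so a rough percent identity can be computed column by column. The sketch below is illustrative only: the function name alignment_identity is hypothetical, and it assumes the usual CD-Search display convention that lowercase letters mark unaligned residues and "-" marks gaps, so both are skipped. The two rows are copied from the smart00320 hit directly above.

```python
def alignment_identity(query_row, model_row):
    """Count identical and aligned columns in one pair of alignment rows.

    Columns with a gap ("-") or a lowercase (assumed unaligned) residue
    on either row are skipped.
    """
    if len(query_row) != len(model_row):
        raise ValueError("rows must have the same number of columns")
    aligned = 0
    identical = 0
    for q, m in zip(query_row, model_row):
        if q == "-" or m == "-" or q.islower() or m.islower():
            continue
        aligned += 1
        if q == m:
            identical += 1
    return identical, aligned


# Rows from the smart00320 alignment (query residues 292-332, model 1-40).
QUERY = "TGEVLYELPTNTQWCFDIQWCPRNPAVLSAaSFDGRIRVYS"
MODEL = "SGELLKTLKGHTGPVTSVAFSPDGKYLASG-SDDGTIKLWD"

identical, aligned = alignment_identity(QUERY, MODEL)
print("identity: %d/%d = %.1f%%" % (identical, aligned, 100.0 * identical / aligned))
```

For hits that span several blocks, the same function can be applied to the concatenated rows. Identity computed this way is only a descriptive figure; the reported E-value comes from the profile score, not from residue identity.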
 
Name Accession Description Interval E-value
WD40 cd00200
WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions ...
13-332 2.13e-25

WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel β-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed-ring propeller structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment.


Pssm-ID: 238121 [Multi-domain]  Cd Length: 289  Bit Score: 107.81  E-value: 2.13e-25
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1937369563   13 AWSPAQNhpiYLATGtsaqqldatfSTNASLEIFELD-------LSDPSLDMKSCATFSSSHRyhkliwgphkmdskgdv 85
Cdd:cd00200     16 AFSPDGK---LLATG----------SGDGTIKVWDLEtgellrtLKGHTGPVRDVAASADGTY----------------- 65
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1937369563   86 sgvLIAGGENGNIILYDPSKiiagdKEVVIAQKDkHTGPVRALDvniFQTN--LVASGANESEIYIWDLNNFATPMTPGA 163
Cdd:cd00200     66 ---LASGSSDKTIRLWDLET-----GECVRTLTG-HTSYVSSVA---FSPDgrILSSSSRDKTIKVWDVETGKCLTTLRG 133
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1937369563  164 KTQPpedISCIAWNrQVQHILASASPSGRATVWDLRKNEPIIKVSDHSNRMHCsgLAWHPDvATQMVLASEDDrlpVVQM 243
Cdd:cd00200    134 HTDW---VNSVAFS-PDGTFVASSSQDGTIKLWDLRTGKCVATLTGHTGEVNS--VAFSPD-GEKLLSSSSDG---TIKL 203
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1937369563  244 WDLRfASSPLRVLENHARGILAIAWSmADPELLLSCGKDAKILCSNPNTGEVLYELPTNTQWCFDIQWCPRNPAVLSaAS 323
Cdd:cd00200    204 WDLS-TGKCLGTLRGHENGVNSVAFS-PDGYLLASGSEDGTIRVWDLRTGECVQTLSGHTNSVTSLAWSPDGKRLAS-GS 280

                   ....*....
gi 1937369563  324 FDGRIRVYS 332
Cdd:cd00200    281 ADGTIRIWD 289
WD40 COG2319
WD40 repeat [General function prediction only];
89-333 2.42e-22

WD40 repeat [General function prediction only];


Pssm-ID: 441893 [Multi-domain]  Cd Length: 403  Bit Score: 101.14  E-value: 2.42e-22
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1937369563   89 LIAGGENGNIILYDpskiIAGDKEvvIAQKDKHTGPVRALDvniFQTN--LVASGANESEIYIWDLNNFATPMTPGAKTQ 166
Cdd:COG2319    135 LASGSADGTVRLWD----LATGKL--LRTLTGHSGAVTSVA---FSPDgkLLASGSDDGTVRLWDLATGKLLRTLTGHTG 205
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1937369563  167 PpedISCIAWNRQvQHILASASPSGRATVWDLRKNEPIIKVSDHSNRMHCsgLAWHPDvATQMVLASEDDRlpvVQMWDL 246
Cdd:COG2319    206 A---VRSVAFSPD-GKLLASGSADGTVRLWDLATGKLLRTLTGHSGSVRS--VAFSPD-GRLLASGSADGT---VRLWDL 275
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1937369563  247 RfASSPLRVLENHARGILAIAWSmADPELLLSCGKDAKILCSNPNTGEVLYELPTNTQWCFDIQWCPRNPAVLSaASFDG 326
Cdd:COG2319    276 A-TGELLRTLTGHSGGVNSVAFS-PDGKLLASGSDDGTVRLWDLATGKLLRTLTGHTGAVRSVAFSPDGKTLAS-GSDDG 352

                   ....*..
gi 1937369563  327 RIRVYSI 333
Cdd:COG2319    353 TVRLWDL 359
WD40 COG2319
WD40 repeat [General function prediction only];
121-338 4.43e-20

WD40 repeat [General function prediction only];


Pssm-ID: 441893 [Multi-domain]  Cd Length: 403  Bit Score: 94.21  E-value: 4.43e-20
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1937369563  121 HTGPVRALDVNiFQTNLVASGANESEIYIWDLnnfATPMTPGAKTQPPEDISCIAWNRQvQHILASASPSGRATVWDLRK 200
Cdd:COG2319     77 HTAAVLSVAFS-PDGRLLASASADGTVRLWDL---ATGLLLRTLTGHTGAVRSVAFSPD-GKTLASGSADGTVRLWDLAT 151
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1937369563  201 NEPIIKVSDHSNRMHCsgLAWHPDvATQMVLASEDDRlpvVQMWDLRfASSPLRVLENHARGILAIAWSmADPELLLSCG 280
Cdd:COG2319    152 GKLLRTLTGHSGAVTS--VAFSPD-GKLLASGSDDGT---VRLWDLA-TGKLLRTLTGHTGAVRSVAFS-PDGKLLASGS 223
                          170       180       190       200       210
                   ....*....|....*....|....*....|....*....|....*....|....*...
gi 1937369563  281 KDAKILCSNPNTGEVLYELPTNTQWCFDIQWCPrNPAVLSAASFDGRIRVYSIMGGSI 338
Cdd:COG2319    224 ADGTVRLWDLATGKLLRTLTGHSGSVRSVAFSP-DGRLLASGSADGTVRLWDLATGEL 280
WD40 cd00200
WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions ...
171-337 3.54e-17

WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel β-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment.


Pssm-ID: 238121 [Multi-domain]  Cd Length: 289  Bit Score: 83.54  E-value: 3.54e-17
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1937369563  171 ISCIAWNRQvQHILASASPSGRATVWDLRKNEPIIKVSDHSNRMhcSGLAWHPDvATQMVLASEDDrlpVVQMWDLRfAS 250
Cdd:cd00200     12 VTCVAFSPD-GKLLATGSGDGTIKVWDLETGELLRTLKGHTGPV--RDVAASAD-GTYLASGSSDK---TIRLWDLE-TG 83
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1937369563  251 SPLRVLENHARGILAIAWSmADPELLLSCGKDAKILCSNPNTGEVLYELPTNTQWCFDIQWCPrNPAVLSAASFDGRIRV 330
Cdd:cd00200     84 ECVRTLTGHTSYVSSVAFS-PDGRILSSSSRDKTIKVWDVETGKCLTTLRGHTDWVNSVAFSP-DGTFVASSSQDGTIKL 161

                   ....*..
gi 1937369563  331 YSIMGGS 337
Cdd:cd00200    162 WDLRTGK 168
ACE1-Sec16-like cd09233
Ancestral coatomer element 1 (ACE1) of COPII coat complex assembly protein Sec16; COPII coat ...
573-767 5.95e-09

Ancestral coatomer element 1 (ACE1) of COPII coat complex assembly protein Sec16; the COPII coat complex plays an important role in vesicular traffic of newly synthesized proteins from the endoplasmic reticulum (ER) to the Golgi apparatus by mediating the formation of transport vesicles. COPII consists of an outer coat, made up of the scaffold proteins Sec31 and Sec13, and the cargo adaptor complex, Sec23 and Sec24, which are recruited by the small GTPase Sar1. Sec16 is involved in the early steps of the assembly process. Sec16 forms elongated heterotetramers with Sec13, Sec13-(Sec16)2-Sec13. It interacts with Sec13 by insertion of a single beta-blade to close the six-bladed beta propeller of Sec13. Sec13 interacts in the same way with Sec31 and with Nup145C, a nuclear pore protein; all of these contain a structurally related ancestral coatomer element 1 (ACE1). Sec16 is believed to be a key component in maintaining the integrity of the ER exit site.


Pssm-ID: 187750 [Multi-domain]  Cd Length: 314  Bit Score: 59.19  E-value: 5.95e-09
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1937369563  573 ITRALLTGNFESAVDLCLhDNRM-ADAIILAIAGGQELLAQTQKKyFAKSQSKIT---RLITAVVMKNWKEIVESC---- 644
Cdd:cd09233     69 FRNLLLTGNRKEALELAL-DNGLwAHALLLASSLGKETWAEVVSR-FARSESKLNdplQTLYQLFSGNSPEAITELadnp 146
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1937369563  645 -----DLKNWREALAAVLTYAKPD-EFSALCDLlgarleseGDSLLRTQ----ACLCYICAGnverlvacwtkAQDGSNP 714
Cdd:cd09233    147 aeaewALGNWREHLAIILSNRTSNlDLEALVEL--------GDLLAQRGlveaAHICYLLAG-----------VPLGPYP 207
                          170       180       190       200       210       220       230
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1937369563  715 LS-------LQDLIEKVVILR--KAVQLT------QALDTNTVG--ALLAEKMsQYANLLAAQGSIAAAL 767
Cdd:cd09233    208 SSpsscllgGAVHNKSPRTFAtpEAIQLTeiyeyaLSLGNPQFGlpHLQPYKL-IHAARLAELGLVSEAL 276
WD40 COG2319
WD40 repeat [General function prediction only];
89-247 9.77e-09

WD40 repeat [General function prediction only];


Pssm-ID: 441893 [Multi-domain]  Cd Length: 403  Bit Score: 59.15  E-value: 9.77e-09
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1937369563   89 LIAGGENGNIILYDpskiIAGDKEVVIaqKDKHTGPVRALDVNiFQTNLVASGANESEIYIWDLNNFATPMTPGAKTqpp 168
Cdd:COG2319    261 LASGSADGTVRLWD----LATGELLRT--LTGHSGGVNSVAFS-PDGKLLASGSDDGTVRLWDLATGKLLRTLTGHT--- 330
                           90       100       110       120       130       140       150
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....
gi 1937369563  169 EDISCIAWNRQVQhILASASPSGRATVWDLRKNEPIIKVSDHSNRMHcsGLAWHPDvATQMVLASEDDRlpvVQMWDLR 247
Cdd:COG2319    331 GAVRSVAFSPDGK-TLASGSDDGTVRLWDLATGELLRTLTGHTGAVT--SVAFSPD-GRTLASGSADGT---VRLWDLA 402
Sec16_C pfam12931
Sec23-binding domain of Sec16; Sec16 is a multi-domain vesicle coat protein. The C-terminal ...
573-767 7.19e-08

Sec23-binding domain of Sec16; Sec16 is a multi-domain vesicle coat protein. The C-terminal region is the part that binds to Sec23, a COPII vesicle coat protein. This association is part of the transport vesicle coat structure.


Pssm-ID: 432884  Cd Length: 279  Bit Score: 55.26  E-value: 7.19e-08
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1937369563  573 ITRALLTGNFESAVDLCLhDNRM-ADAIILAIAGGQELLAQTQKKY----FAKSQSKITRLItAVVMK----NWKEIVE- 642
Cdd:pfam12931    1 IRALLLTGDREKALWLAL-DKKLwAHALLIASTLGKEKWKEVVQEFvrseFKGSNNKSGESL-AALYQvfagNSEEAVDe 78
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1937369563  643 --------SCDLKNWREALAAVLTYAKPDEFSALCDlLGARLESEGdslLRTQACLCYICAgNVERLVACWTKAQDGSNP 714
Cdd:pfam12931   79 lvppsknaLWALDNWRETLALVLSNRSPGDVEALLA-LGDLLAQYG---RTEAAHICFLLA-GLPLSQTVLLGADHVRFP 153
                          170       180       190       200       210       220
                   ....*....|....*....|....*....|....*....|....*....|....*....|....
gi 1937369563  715 LSLQDLIEkvvilrkAVQLTQ----ALDTNTVGA-------LLAEKMsQYANLLAAQGSIAAAL 767
Cdd:pfam12931  154 STFGNDLE-------SILLTEiyeyALSLSPPQPpfvglphLLPYKL-QHAAVLAEYGLVSEAQ 209
PHA03247 PHA03247
large tegument protein UL36; Provisional
795-1146 1.04e-07

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 56.87  E-value: 1.04e-07
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1937369563  795 PGQESSRSSYEGQPLPKGGPGPLAGH--PQVSRVQ-SQQYYPQVRIAPTVTTWSDRTPTALPshPPAAcpSDTQGGNPPP 871
Cdd:PHA03247  2628 PPSPSPAANEPDPHPPPTVPPPERPRddPAPGRVSrPRRARRLGRAAQASSPPQRPRRRAAR--PTVG--SLTSLADPPP 2703
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1937369563  872 PGfimhgnvVPNSPAPLPTSPGhmhsqpppypqpqpyQPAQQYSFGTGGSAVYRPQQPVAPPAsnayPNAPYVSPVASYS 951
Cdd:PHA03247  2704 PP-------PTPEPAPHALVSA---------------TPLPPGPAAARQASPALPAAPAPPAV----PAGPATPGGPARP 2757
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1937369563  952 GQPQMYTAQPASSPTSSSAPLPPPPSSGASFQHGGPGAPPSSSAYALPPGTTGTPPAASELPASQRTGPqngwNDPPALN 1031
Cdd:PHA03247  2758 ARPPTTAGPPAPAPPAAPAAGPPRRLTRPAVASLSESRESLPSPWDPADPPAAVLAPAAALPPAASPAG----PLPPPTS 2833
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1937369563 1032 RVPKKKKLPENFMPPVPITSPIMNPGGDPQ---PQGLQQQPSASGPRSSHASFPQPHLAggqpfhgiQQPLAQTGMPPSF 1108
Cdd:PHA03247  2834 AQPTAPPPPPGPPPPSLPLGGSVAPGGDVRrrpPSRSPAAKPAAPARPPVRRLARPAVS--------RSTESFALPPDQP 2905
                          330       340       350
                   ....*....|....*....|....*....|....*...
gi 1937369563 1109 SKPNTEGAPGAPIGNTIQHVQALPTEKITKKPIPDEHL 1146
Cdd:PHA03247  2906 ERPPQPQAPPPPQPQPQPPPPPQPQPPPPPPPRPQPPL 2943
PHA03247 PHA03247
large tegument protein UL36; Provisional
751-1119 1.86e-07

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 56.10  E-value: 1.86e-07
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1937369563  751 SQYANLLAAQGSIAAALAFLPDNTNQPDIVQLRDRlCRAQGRsvPGQESSrssyegqplPKGGPGPLAGHPQVSRVQS-- 828
Cdd:PHA03247  2632 SPAANEPDPHPPPTVPPPERPRDDPAPGRVSRPRR-ARRLGR--AAQASS---------PPQRPRRRAARPTVGSLTSla 2699
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1937369563  829 QQYYPQVRIAPTVTTWSDRTPTAL-PSHPPAACPSDTQGGNPPPPGfimHGNVVPNSPAPLPTSPghmhsqpppypqpqp 907
Cdd:PHA03247  2700 DPPPPPPTPEPAPHALVSATPLPPgPAAARQASPALPAAPAPPAVP---AGPATPGGPARPARPP--------------- 2761
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1937369563  908 yqpaqqysfgtggsAVYRPQQPvAPPASNAYPNAPYVSPVASYSGQPQMYTAQPASSPTSSSAPLPPPPSSGASFQHGGP 987
Cdd:PHA03247  2762 --------------TTAGPPAP-APPAAPAAGPPRRLTRPAVASLSESRESLPSPWDPADPPAAVLAPAAALPPAASPAG 2826
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1937369563  988 GAPPSSSAYALPPGTTGTPPAASELPAS--------QRTGPQNGWNDPPALNRVPKKKKLPENFMPPVPITSPIMNPGgd 1059
Cdd:PHA03247  2827 PLPPPTSAQPTAPPPPPGPPPPSLPLGGsvapggdvRRRPPSRSPAAKPAAPARPPVRRLARPAVSRSTESFALPPDQ-- 2904
                          330       340       350       360       370       380
                   ....*....|....*....|....*....|....*....|....*....|....*....|
gi 1937369563 1060 pqPQGLQQQPSASGPRSSHASFPQPHLAGGQPfhgiQQPLAQTGMPPSfskPNTEGAPGA 1119
Cdd:PHA03247  2905 --PERPPQPQAPPPPQPQPQPPPPPQPQPPPP----PPPRPQPPLAPT---TDPAGAGEP 2955
WD40 cd00200
WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions ...
252-336 8.19e-07

WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel β-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment.


Pssm-ID: 238121 [Multi-domain]  Cd Length: 289  Bit Score: 52.34  E-value: 8.19e-07
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1937369563  252 PLRVLENHARGILAIAWSmADPELLLSCGKDAKILCSNPNTGEVLYELPTNTQWCFDIQWCPRNPAVLSaASFDGRIRVY 331
Cdd:cd00200      1 LRRTLKGHTGGVTCVAFS-PDGKLLATGSGDGTIKVWDLETGELLRTLKGHTGPVRDVAASADGTYLAS-GSSDKTIRLW 78

                   ....*
gi 1937369563  332 SIMGG 336
Cdd:cd00200     79 DLETG 83
Atrophin-1 pfam03154
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ...
772-1127 2.02e-06

Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA OMIM:125370 is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1, which is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.


Pssm-ID: 460830 [Multi-domain]  Cd Length: 991  Bit Score: 52.46  E-value: 2.02e-06
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1937369563  772 DNTNQPDIVQLRDRLCRAQGRSVPGQESSRSSYEGQPLPKGGPGPLAGHPQVSRVQSQQYYPQVRIAPTVTTwSDRTPTA 851
Cdd:pfam03154  159 DSSAQQQILQTQPPVLQAQSGAASPPSPPPPGTTQAATAGPTPSAPSVPPQGSPATSQPPNQTQSTAAPHTL-IQQTPTL 237
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1937369563  852 LPSHPPAACPSDTQGGNPPPPGFI---------MHGNVVPnSPAPLPTSPGHMHSQP-----PPYPQPQPYQPAQQYSFG 917
Cdd:pfam03154  238 HPQRLPSPHPPLQPMTQPPPPSQVspqplpqpsLHGQMPP-MPHSLQTGPSHMQHPVppqpfPLTPQSSQSQVPPGPSPA 316
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1937369563  918 TGGSAVYRPQQPVA--------PPASNAYPNAPYVSP---VASYSGQPQMYTAQpassptssSAPLPPPPSSGASFQHGG 986
Cdd:pfam03154  317 APGQSQQRIHTPPSqsqlqsqqPPREQPLPPAPLSMPhikPPPTTPIPQLPNPQ--------SHKHPPHLSGPSPFQMNS 388
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1937369563  987 PGAPP------SSSAYALPPgtTGTPPAASELPASQ---------------RTGPQNGWNDPP--ALNRVPKKKKLPEN- 1042
Cdd:pfam03154  389 NLPPPpalkplSSLSTHHPP--SAHPPPLQLMPQSQqlppppaqppvltqsQSLPPPAASHPPtsGLHQVPSQSPFPQHp 466
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1937369563 1043 FMP--PVPITSPiMNPGGDPQPQGLQQQPSASGPRSSHASFPQPHLAGGQPFHGIQQPLAQTGMPPSFSKPNTEGAPGAP 1120
Cdd:pfam03154  467 FVPggPPPITPP-SGPPTSTSSAMPGIQPPSSASVSSSGPVPAAVSCPLPPVQIKEEALDEAEEPESPPPPPRSPSPEPT 545

                   ....*..
gi 1937369563 1121 IGNTIQH 1127
Cdd:pfam03154  546 VVNTPSH 552
PHA03247 PHA03247
large tegument protein UL36; Provisional
789-1056 9.63e-05

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 47.24  E-value: 9.63e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1937369563  789 AQGRSVPGQESSRSSyegQPLPKGGPGPLAghPQVSRVQSQQYYPqvriAPTVTTWSDRTPTALPSHPPAACPSDTQGGN 868
Cdd:PHA03247  2745 PAGPATPGGPARPAR---PPTTAGPPAPAP--PAAPAAGPPRRLT----RPAVASLSESRESLPSPWDPADPPAAVLAPA 2815
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1937369563  869 PPPPGFIMHGNVVPNSPAPLPTSPghmhsqpPPYPQPQPYQPAQQYSFGTGGSAVYRP--QQPVAPPASNAYPNAPYVS- 945
Cdd:PHA03247  2816 AALPPAASPAGPLPPPTSAQPTAP-------PPPPGPPPPSLPLGGSVAPGGDVRRRPpsRSPAAKPAAPARPPVRRLAr 2888
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1937369563  946 PVASYSGQPQmytaqpasSPTSSSAPLPPPPSSGASFQHGGPGAPPSSSAYALPPGTTGTPPAASELPASQRTGPQNGWN 1025
Cdd:PHA03247  2889 PAVSRSTESF--------ALPPDQPERPPQPQAPPPPQPQPQPPPPPQPQPPPPPPPRPQPPLAPTTDPAGAGEPSGAVP 2960
                          250       260       270
                   ....*....|....*....|....*....|....*
gi 1937369563 1026 DPPALNRVPKKKKLPENFM----PPVPITSPIMNP 1056
Cdd:PHA03247  2961 QPWLGALVPGRVAVPRFRVpqpaPSREAPASSTPP 2995
PLN00181 PLN00181
protein SPA1-RELATED; Provisional
203-333 4.61e-04

protein SPA1-RELATED; Provisional


Pssm-ID: 177776 [Multi-domain]  Cd Length: 793  Bit Score: 44.69  E-value: 4.61e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1937369563  203 PIIKVSdhsNRMHCSGLAWHPDVATQMVLASEDDrlpVVQMWDLrfASSPLRV-LENHARGILAIAWSMADPELLLSCGK 281
Cdd:PLN00181   525 PVVELA---SRSKLSGICWNSYIKSQVASSNFEG---VVQVWDV--ARSQLVTeMKEHEKRVWSIDYSSADPTLLASGSD 596
                           90       100       110       120       130
                   ....*....|....*....|....*....|....*....|....*....|..
gi 1937369563  282 DAKILCSNPNTGEVLYELPTNTQWCFdIQWCPRNPAVLSAASFDGRIRVYSI 333
Cdd:PLN00181   597 DGSVKLWSINQGVSIGTIKTKANICC-VQFPSESGRSLAFGSADHKVYYYDL 647
WD40 smart00320
WD40 repeats; Note that these repeats are permuted with respect to the structural repeats ...
292-332 7.66e-03

WD40 repeats; Note that these repeats are permuted with respect to the structural repeats (blades) of the beta propeller domain.


Pssm-ID: 197651 [Multi-domain]  Cd Length: 40  Bit Score: 35.37  E-value: 7.66e-03
                            10        20        30        40
                    ....*....|....*....|....*....|....*....|.
gi 1937369563   292 TGEVLYELPTNTQWCFDIQWCPRNPAVLSAaSFDGRIRVYS 332
Cdd:smart00320    1 SGELLKTLKGHTGPVTSVAFSPDGKYLASG-SDDGTIKLWD 40
PRK12323 PRK12323
DNA polymerase III subunit gamma/tau;
839-1100 9.51e-03

DNA polymerase III subunit gamma/tau;


Pssm-ID: 237057 [Multi-domain]  Cd Length: 700  Bit Score: 40.24  E-value: 9.51e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1937369563  839 PTVTTWSDRTPTALPSHPPAACPSDTQGGNPPPPgfimhgnvvpnSPAPLPTSPGHMHSQPPPYPQPQPYQPAQQYSFGT 918
Cdd:PRK12323   375 ATAAAAPVAQPAPAAAAPAAAAPAPAAPPAAPAA-----------APAAAAAARAVAAAPARRSPAPEALAAARQASARG 443
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1937369563  919 GGSAVYRPQQPVAPPASNAYPNAPYVSPVASYSGQPQMYTAQPASSPTSSSAPLPPPPSsgasfqhggPGAPPSSSAYAL 998
Cdd:PRK12323   444 PGGAPAPAPAPAAAPAAAARPAAAGPRPVAAAAAAAPARAAPAAAPAPADDDPPPWEEL---------PPEFASPAPAQP 514
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1937369563  999 PPGTTGTPPAASELPA-SQRTGPQNGWNDPPALNRVPKKKKLPENFMPPVPitspimnpggdpqpqglqqqPSASGPRSS 1077
Cdd:PRK12323   515 DAAPAGWVAESIPDPAtADPDDAFETLAPAPAAAPAPRAAAATEPVVAPRP--------------------PRASASGLP 574
                          250       260
                   ....*....|....*....|....
gi 1937369563 1078 HASFPQ-PHLAGGQPFHGIQQPLA 1100
Cdd:PRK12323   575 DMFDGDwPALAARLPVRGLAQQLA 598
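
The alignment rows in the hit list above share a fixed layout: a label, a start coordinate, the aligned residues, and an end coordinate. As a minimal sketch (not part of the CD-Search output), the Python below parses one query/subject row pair copied from the smart00320 hit and counts identical aligned columns. The helper names `parse_row` and `count_identities` are hypothetical, and treating letter case as insignificant when comparing residues is an assumption made for illustration.

```python
import re

# One alignment row: label ("gi <number>" or "Cdd:<accession>"),
# start coordinate, aligned residues (with "-" for gaps), end coordinate.
ROW = re.compile(r"^\s*(gi \d+|Cdd:\S+)\s+(\d+)\s+(\S+)\s+(\d+)\s*$")


def parse_row(line):
    """Split one alignment row into (label, start, residues, end)."""
    match = ROW.match(line)
    if match is None:
        raise ValueError(f"not an alignment row: {line!r}")
    label, start, residues, end = match.groups()
    return label, int(start), residues, int(end)


def count_identities(query, subject):
    """Return (identical, aligned) over columns where neither row has a gap.

    Assumption: case is ignored when comparing residues.
    """
    if len(query) != len(subject):
        raise ValueError("rows must have the same number of columns")
    identical = aligned = 0
    for q, s in zip(query, subject):
        if q == "-" or s == "-":
            continue
        aligned += 1
        if q.upper() == s.upper():
            identical += 1
    return identical, aligned


# Row pair copied from the smart00320 hit (query residues 292-332).
query_row = "gi 1937369563   292 TGEVLYELPTNTQWCFDIQWCPRNPAVLSAaSFDGRIRVYS 332"
subject_row = "Cdd:smart00320    1 SGELLKTLKGHTGPVTSVAFSPDGKYLASG-SDDGTIKLWD 40"

_, q_start, q_seq, q_end = parse_row(query_row)
_, s_start, s_seq, s_end = parse_row(subject_row)
identical, aligned = count_identities(q_seq, s_seq)
print(f"{identical} of {aligned} aligned columns are identical")
```

A quick consistency check on any parsed row is that the number of non-gap residues equals `end - start + 1`; a mismatch usually means the row was wrapped or truncated during extraction.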
 
Blast search parameters
Data Source: Precalculated data, version = cdd.v.3.21
Preset Options: Database: CDSEARCH/cdd   Low complexity filter: no   Composition Based Adjustment: yes   E-value threshold: 0.01
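
Each hit on this page carries a summary line of the form `Pssm-ID: … Cd Length: … Bit Score: … E-value: …`. The following is a minimal sketch, assuming that layout, of how such a line could be parsed and compared against the 0.01 E-value threshold listed above. The function name `parse_summary` and the interpretation that a hit is reported when its E-value is at or below the threshold are assumptions for illustration, not a description of how CD-Search itself is implemented.

```python
import re

# Summary line layout as it appears on this page; "[Multi-domain]" is optional.
SUMMARY = re.compile(
    r"Pssm-ID:\s*(\d+).*?Cd Length:\s*(\d+)\s+"
    r"Bit Score:\s*([\d.]+)\s+E-value:\s*(\S+)"
)

# Threshold copied from the "Preset Options" line.
E_VALUE_THRESHOLD = 0.01


def parse_summary(line):
    """Return a dict with the numeric fields of one hit summary line."""
    match = SUMMARY.search(line)
    if match is None:
        raise ValueError(f"not a hit summary line: {line!r}")
    pssm_id, cd_length, bit_score, e_value = match.groups()
    return {
        "pssm_id": int(pssm_id),
        "cd_length": int(cd_length),
        "bit_score": float(bit_score),
        "e_value": float(e_value),
    }


# Two summary lines copied from the hit list above.
lines = [
    "Pssm-ID: 197651 [Multi-domain]  Cd Length: 40  Bit Score: 35.37  E-value: 7.66e-03",
    "Pssm-ID: 432884  Cd Length: 279  Bit Score: 55.26  E-value: 7.19e-08",
]

hits = [parse_summary(line) for line in lines]
for hit in hits:
    # Assumed rule: a hit passes when its E-value does not exceed the threshold.
    passes = hit["e_value"] <= E_VALUE_THRESHOLD
    print(hit["pssm_id"], hit["e_value"], "passes" if passes else "filtered")
```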
