Conserved domains on  [gi|1318663194|ref|NP_001346277|]

protein transport protein Sec31A isoform d [Mus musculus]


List of domain hits

Name Accession Description Interval E-value
WD40 super family cl29593
WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions ...
13-332 8.57e-25

WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel β-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment.


The actual alignment was detected with superfamily member cd00200:

Pssm-ID: 475233 [Multi-domain]  Cd Length: 289  Bit Score: 105.88  E-value: 8.57e-25
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663194   13 AWSPAQNhpiYLATGtsaqqldatfSTNASLEIFELD-------LSDPSLDMKSCATFSSSHRyhkliwgphkmdskgdv 85
Cdd:cd00200     16 AFSPDGK---LLATG----------SGDGTIKVWDLEtgellrtLKGHTGPVRDVAASADGTY----------------- 65
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663194   86 sgvLIAGGENGNIILYDPSKiiagdKEVVIAQKDkHTGPVRALDvniFQTN--LVASGANESEIYIWDLNNFATPMTPGA 163
Cdd:cd00200     66 ---LASGSSDKTIRLWDLET-----GECVRTLTG-HTSYVSSVA---FSPDgrILSSSSRDKTIKVWDVETGKCLTTLRG 133
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663194  164 KTQPpedISCIAWNrQVQHILASASPSGRATVWDLRKNEPIIKVSDHSNRMHCsgLAWHPDvATQMVLASEDDrlpVIQM 243
Cdd:cd00200    134 HTDW---VNSVAFS-PDGTFVASSSQDGTIKLWDLRTGKCVATLTGHTGEVNS--VAFSPD-GEKLLSSSSDG---TIKL 203
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663194  244 WDLRfASSPLRVLENHARGILAVAWSmADPELLLSCGKDAKILCSNPNTGEVLYELPTNTQWCFDIQWCPRNPAVLSaAS 323
Cdd:cd00200    204 WDLS-TGKCLGTLRGHENGVNSVAFS-PDGYLLASGSEDGTIRVWDLRTGECVQTLSGHTNSVTSLAWSPDGKRLAS-GS 280

                   ....*....
gi 1318663194  324 FDGRISVYS 332
Cdd:cd00200    281 ADGTIRIWD 289
PHA03247 super family cl33720
large tegument protein UL36; Provisional
792-1088 5.97e-10

large tegument protein UL36; Provisional


The actual alignment was detected with superfamily member PHA03247:

Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 64.19  E-value: 5.97e-10
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663194  792 PVSGQESSQSPYERQPLSKGRPGPVAGHSQMPRVQTQQYYPhgenPPPPGFIMQGNVIPNPAAPLPTAPGH-MPSQLPPY 870
Cdd:PHA03247  2702 PPPPPTPEPAPHALVSATPLPPGPAAARQASPALPAAPAPP----AVPAGPATPGGPARPARPPTTAGPPApAPPAAPAA 2777
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663194  871 PQPQPYQPAQQYSFGTGGAAAYRPQQPVAPPASNAYPNTPYISPVASYSGQPQMYTAQQASSPTSSSAASfPPPSSGASF 950
Cdd:PHA03247  2778 GPPRRLTRPAVASLSESRESLPSPWDPADPPAAVLAPAAALPPAASPAGPLPPPTSAQPTAPPPPPGPPP-PSLPLGGSV 2856
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663194  951 QHGGPGA--PPSSSAYALPpgTTGTPPAASELPASQRTGPQNGWNDPPalnrvPKKKKMPENFMPPVPITSPIMNPSGDP 1028
Cdd:PHA03247  2857 APGGDVRrrPPSRSPAAKP--AAPARPPVRRLARPAVSRSTESFALPP-----DQPERPPQPQAPPPPQPQPQPPPPPQP 2929
                          250       260       270       280       290       300
                   ....*....|....*....|....*....|....*....|....*....|....*....|.
gi 1318663194 1029 QSQgLQQQPSTPGPLSshasfPQQHLAG-GQPFHGVQQPLAQTGMPPSFSKPNTEGAPGAP 1088
Cdd:PHA03247  2930 QPP-PPPPPRPQPPLA-----PTTDPAGaGEPSGAVPQPWLGALVPGRVAVPRFRVPQPAP 2984
ACE1-Sec16-like super family cl14807
Ancestral coatomer element 1 (ACE1) of COPII coat complex assembly protein Sec16; COPII coat ...
572-766 1.76e-09

Ancestral coatomer element 1 (ACE1) of COPII coat complex assembly protein Sec16; the COPII coat complex plays an important role in vesicular traffic of newly synthesized proteins from the endoplasmic reticulum (ER) to the Golgi apparatus by mediating the formation of transport vesicles. COPII consists of an outer coat, made up of the scaffold proteins Sec31 and Sec13, and the cargo adaptor complex, Sec23 and Sec24, which are recruited by the small GTPase Sar1. Sec16 is involved in the early steps of the assembly process. Sec16 forms elongated heterotetramers with Sec13, Sec13-(Sec16)2-Sec13. It interacts with Sec13 by insertion of a single beta-blade to close the six-bladed beta propeller of Sec13. Sec13 interacts in the same way with Sec31 and Nup145C, a nuclear pore protein; all of these contain a structurally related ancestral coatomer element 1 (ACE1). Sec16 is believed to be a key component in maintaining the integrity of the ER exit site.


The actual alignment was detected with superfamily member cd09233:

Pssm-ID: 449359 [Multi-domain]  Cd Length: 314  Bit Score: 60.73  E-value: 1.76e-09
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663194  572 ITRALLTGNFESAVDLCLhDNRM-ADAIILAIAGGQELLAQTQKKyFAKSQSKIT---RLITAVVMKNWREIVESC---- 643
Cdd:cd09233     69 FRNLLLTGNRKEALELAL-DNGLwAHALLLASSLGKETWAEVVSR-FARSESKLNdplQTLYQLFSGNSPEAITELadnp 146
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663194  644 -----DLKNWREALAAVLTYAKPD-EFSALCdLLGTRLEREGDSLlrtQACLCYICAGnverlvacwtkAQDGSSPLS-- 715
Cdd:cd09233    147 aeaewALGNWREHLAIILSNRTSNlDLEALV-ELGDLLAQRGLVE---AAHICYLLAG-----------VPLGPYPSSps 211
                          170       180       190       200       210       220
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*.
gi 1318663194  716 -----LQDLIEKVVILR--KAVQLT------QALDTNTVG--ALLAEKMsQYASLLAAQGSIAAAL 766
Cdd:cd09233    212 scllgGAVHNKSPRTFAtpEAIQLTeiyeyaLSLGNPQFGlpHLQPYKL-IHAARLAELGLVSEAL 276
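The alignment blocks above pair a query row (`gi ...`) with a model row (`Cdd:...`), each carrying a start coordinate, an aligned string and an end coordinate. The following is a minimal Python sketch for reading one such pair from this text dump and counting identical positions. The function names are hypothetical, and the sketch assumes that lowercase letters mark unaligned residues and `-` marks gaps, which is how the rows appear to be laid out here; it is not an NCBI tool or API.

```python
def parse_row(line):
    """Split one alignment row into (start, sequence, end).

    Assumes the last three whitespace-separated fields are the start
    coordinate, the aligned string and the end coordinate.
    """
    start, seq, end = line.split()[-3:]
    return int(start), seq, int(end)


def residue_count(seq):
    """Count residues in an aligned string, ignoring gap characters."""
    return sum(1 for ch in seq if ch.isalpha())


def aligned_identity(query_seq, model_seq):
    """Return (identical, aligned) over columns where both rows are uppercase.

    Assumption: lowercase letters are unaligned insertions and '-' is a gap,
    so only uppercase/uppercase columns are treated as aligned.
    """
    if len(query_seq) != len(model_seq):
        raise ValueError("rows must have the same number of columns")
    identical = 0
    aligned = 0
    for q, m in zip(query_seq, model_seq):
        if q.isupper() and m.isupper():
            aligned += 1
            if q == m:
                identical += 1
    return identical, aligned


if __name__ == "__main__":
    q_start, q_seq, q_end = parse_row("gi 1318663194  324 FDGRISVYS 332")
    m_start, m_seq, m_end = parse_row("Cdd:cd00200    281 ADGTIRIWD 289")
    # The residue count should match the coordinate span of each row.
    print(residue_count(q_seq) == q_end - q_start + 1)
    print(aligned_identity(q_seq, m_seq))
```

Checking that the residue count equals `end - start + 1` is a cheap way to catch rows that were damaged during text extraction before computing anything from them.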
 
Name Accession Description Interval E-value
WD40 cd00200
WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions ...
13-332 8.57e-25

WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel β-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment.


Pssm-ID: 238121 [Multi-domain]  Cd Length: 289  Bit Score: 105.88  E-value: 8.57e-25
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663194   13 AWSPAQNhpiYLATGtsaqqldatfSTNASLEIFELD-------LSDPSLDMKSCATFSSSHRyhkliwgphkmdskgdv 85
Cdd:cd00200     16 AFSPDGK---LLATG----------SGDGTIKVWDLEtgellrtLKGHTGPVRDVAASADGTY----------------- 65
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663194   86 sgvLIAGGENGNIILYDPSKiiagdKEVVIAQKDkHTGPVRALDvniFQTN--LVASGANESEIYIWDLNNFATPMTPGA 163
Cdd:cd00200     66 ---LASGSSDKTIRLWDLET-----GECVRTLTG-HTSYVSSVA---FSPDgrILSSSSRDKTIKVWDVETGKCLTTLRG 133
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663194  164 KTQPpedISCIAWNrQVQHILASASPSGRATVWDLRKNEPIIKVSDHSNRMHCsgLAWHPDvATQMVLASEDDrlpVIQM 243
Cdd:cd00200    134 HTDW---VNSVAFS-PDGTFVASSSQDGTIKLWDLRTGKCVATLTGHTGEVNS--VAFSPD-GEKLLSSSSDG---TIKL 203
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663194  244 WDLRfASSPLRVLENHARGILAVAWSmADPELLLSCGKDAKILCSNPNTGEVLYELPTNTQWCFDIQWCPRNPAVLSaAS 323
Cdd:cd00200    204 WDLS-TGKCLGTLRGHENGVNSVAFS-PDGYLLASGSEDGTIRVWDLRTGECVQTLSGHTNSVTSLAWSPDGKRLAS-GS 280

                   ....*....
gi 1318663194  324 FDGRISVYS 332
Cdd:cd00200    281 ADGTIRIWD 289
WD40 COG2319
WD40 repeat [General function prediction only];
89-333 1.31e-23

WD40 repeat [General function prediction only];


Pssm-ID: 441893 [Multi-domain]  Cd Length: 403  Bit Score: 104.99  E-value: 1.31e-23
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663194   89 LIAGGENGNIILYDpskiIAGDKEvvIAQKDKHTGPVRALDVNiFQTNLVASGANESEIYIWDLNNFATPMTPGAKTQPp 168
Cdd:COG2319    177 LASGSDDGTVRLWD----LATGKL--LRTLTGHTGAVRSVAFS-PDGKLLASGSADGTVRLWDLATGKLLRTLTGHSGS- 248
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663194  169 edISCIAWNRQVQHiLASASPSGRATVWDLRKNEPIIKVSDHSNRMHcsGLAWHPDvATQMVLASEDDRlpvIQMWDLRf 248
Cdd:COG2319    249 --VRSVAFSPDGRL-LASGSADGTVRLWDLATGELLRTLTGHSGGVN--SVAFSPD-GKLLASGSDDGT---VRLWDLA- 318
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663194  249 ASSPLRVLENHARGILAVAWSmADPELLLSCGKDAKILCSNPNTGEVLYELPTNTQWCFDIQWCPRNPAVLSaASFDGRI 328
Cdd:COG2319    319 TGKLLRTLTGHTGAVRSVAFS-PDGKTLASGSDDGTVRLWDLATGELLRTLTGHTGAVTSVAFSPDGRTLAS-GSADGTV 396

                   ....*
gi 1318663194  329 SVYSI 333
Cdd:COG2319    397 RLWDL 401
PHA03247 PHA03247
large tegument protein UL36; Provisional
792-1088 5.97e-10

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 64.19  E-value: 5.97e-10
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663194  792 PVSGQESSQSPYERQPLSKGRPGPVAGHSQMPRVQTQQYYPhgenPPPPGFIMQGNVIPNPAAPLPTAPGH-MPSQLPPY 870
Cdd:PHA03247  2702 PPPPPTPEPAPHALVSATPLPPGPAAARQASPALPAAPAPP----AVPAGPATPGGPARPARPPTTAGPPApAPPAAPAA 2777
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663194  871 PQPQPYQPAQQYSFGTGGAAAYRPQQPVAPPASNAYPNTPYISPVASYSGQPQMYTAQQASSPTSSSAASfPPPSSGASF 950
Cdd:PHA03247  2778 GPPRRLTRPAVASLSESRESLPSPWDPADPPAAVLAPAAALPPAASPAGPLPPPTSAQPTAPPPPPGPPP-PSLPLGGSV 2856
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663194  951 QHGGPGA--PPSSSAYALPpgTTGTPPAASELPASQRTGPQNGWNDPPalnrvPKKKKMPENFMPPVPITSPIMNPSGDP 1028
Cdd:PHA03247  2857 APGGDVRrrPPSRSPAAKP--AAPARPPVRRLARPAVSRSTESFALPP-----DQPERPPQPQAPPPPQPQPQPPPPPQP 2929
                          250       260       270       280       290       300
                   ....*....|....*....|....*....|....*....|....*....|....*....|.
gi 1318663194 1029 QSQgLQQQPSTPGPLSshasfPQQHLAG-GQPFHGVQQPLAQTGMPPSFSKPNTEGAPGAP 1088
Cdd:PHA03247  2930 QPP-PPPPPRPQPPLA-----PTTDPAGaGEPSGAVPQPWLGALVPGRVAVPRFRVPQPAP 2984
ACE1-Sec16-like cd09233
Ancestral coatomer element 1 (ACE1) of COPII coat complex assembly protein Sec16; COPII coat ...
572-766 1.76e-09

Ancestral coatomer element 1 (ACE1) of COPII coat complex assembly protein Sec16; the COPII coat complex plays an important role in vesicular traffic of newly synthesized proteins from the endoplasmic reticulum (ER) to the Golgi apparatus by mediating the formation of transport vesicles. COPII consists of an outer coat, made up of the scaffold proteins Sec31 and Sec13, and the cargo adaptor complex, Sec23 and Sec24, which are recruited by the small GTPase Sar1. Sec16 is involved in the early steps of the assembly process. Sec16 forms elongated heterotetramers with Sec13, Sec13-(Sec16)2-Sec13. It interacts with Sec13 by insertion of a single beta-blade to close the six-bladed beta propeller of Sec13. Sec13 interacts in the same way with Sec31 and Nup145C, a nuclear pore protein; all of these contain a structurally related ancestral coatomer element 1 (ACE1). Sec16 is believed to be a key component in maintaining the integrity of the ER exit site.


Pssm-ID: 187750 [Multi-domain]  Cd Length: 314  Bit Score: 60.73  E-value: 1.76e-09
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663194  572 ITRALLTGNFESAVDLCLhDNRM-ADAIILAIAGGQELLAQTQKKyFAKSQSKIT---RLITAVVMKNWREIVESC---- 643
Cdd:cd09233     69 FRNLLLTGNRKEALELAL-DNGLwAHALLLASSLGKETWAEVVSR-FARSESKLNdplQTLYQLFSGNSPEAITELadnp 146
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663194  644 -----DLKNWREALAAVLTYAKPD-EFSALCdLLGTRLEREGDSLlrtQACLCYICAGnverlvacwtkAQDGSSPLS-- 715
Cdd:cd09233    147 aeaewALGNWREHLAIILSNRTSNlDLEALV-ELGDLLAQRGLVE---AAHICYLLAG-----------VPLGPYPSSps 211
                          170       180       190       200       210       220
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*.
gi 1318663194  716 -----LQDLIEKVVILR--KAVQLT------QALDTNTVG--ALLAEKMsQYASLLAAQGSIAAAL 766
Cdd:cd09233    212 scllgGAVHNKSPRTFAtpEAIQLTeiyeyaLSLGNPQFGlpHLQPYKL-IHAARLAELGLVSEAL 276
Atrophin-1 pfam03154
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ...
769-1048 2.50e-08

Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA (OMIM:125370) is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1, which is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.


Pssm-ID: 460830 [Multi-domain]  Cd Length: 991  Bit Score: 58.63  E-value: 2.50e-08
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663194  769 LPDNTNQPNIVQLRDRLCKAQGKPVSGQESSQSPYERQPLSKG--RPGPVAGHSQMPRVQTQQYYPHGENPPPpgFIMQG 846
Cdd:pfam03154  311 PGPSPAAPGQSQQRIHTPPSQSQLQSQQPPREQPLPPAPLSMPhiKPPPTTPIPQLPNPQSHKHPPHLSGPSP--FQMNS 388
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663194  847 NVIPNPA-APLPTAPGHMPSQLPPYPQPQPYQPAQQYSfgtggAAAYRP---QQPVAPPASNAYPNTPYISPVASYSGQP 922
Cdd:pfam03154  389 NLPPPPAlKPLSSLSTHHPPSAHPPPLQLMPQSQQLPP-----PPAQPPvltQSQSLPPPAASHPPTSGLHQVPSQSPFP 463
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663194  923 QMytaqqassptsssaasfpppssgaSFQHGGPGA--PPSSSAYALPPGTTGTPPAASELPASQRTGPqngwndppalnr 1000
Cdd:pfam03154  464 QH------------------------PFVPGGPPPitPPSGPPTSTSSAMPGIQPPSSASVSSSGPVP------------ 507
                          250       260       270       280       290
                   ....*....|....*....|....*....|....*....|....*....|..
gi 1318663194 1001 vpkkkKMPENFMPPVPITSPIMNPSGDPQSQGLQQQPSTPGP----LSSHAS 1048
Cdd:pfam03154  508 -----AAVSCPLPPVQIKEEALDEAEEPESPPPPPRSPSPEPtvvnTPSHAS 554
Sec16_C pfam12931
Sec23-binding domain of Sec16; Sec16 is a multi-domain vesicle coat protein. The C-terminal ...
572-766 3.56e-07

Sec23-binding domain of Sec16; Sec16 is a multi-domain vesicle coat protein. The C-terminal region is the part that binds to Sec23, a COPII vesicle coat protein. This association is part of the transport vesicle coat structure.


Pssm-ID: 432884  Cd Length: 279  Bit Score: 53.33  E-value: 3.56e-07
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663194  572 ITRALLTGNFESAVDLCLhDNRM-ADAIILAIAGGQELLAQTQKKY----FAKSQSKITRLItAVVMK----NWREIVE- 641
Cdd:pfam12931    1 IRALLLTGDREKALWLAL-DKKLwAHALLIASTLGKEKWKEVVQEFvrseFKGSNNKSGESL-AALYQvfagNSEEAVDe 78
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663194  642 --------SCDLKNWREALAAVLTYAKPDEFSALCDlLGTRLEREGdslLRTQACLCYICAgNVERLVACWTKAQDGSSP 713
Cdd:pfam12931   79 lvppsknaLWALDNWRETLALVLSNRSPGDVEALLA-LGDLLAQYG---RTEAAHICFLLA-GLPLSQTVLLGADHVRFP 153
                          170       180       190       200       210       220
                   ....*....|....*....|....*....|....*....|....*....|....*....|....
gi 1318663194  714 LSLQDLIEkvvilrkAVQLTQ----ALDTNTVGA-------LLAEKMsQYASLLAAQGSIAAAL 766
Cdd:pfam12931  154 STFGNDLE-------SILLTEiyeyALSLSPPQPpfvglphLLPYKL-QHAAVLAEYGLVSEAQ 209
KREPA2 cd23959
Kinetoplastid RNA Editing Protein A2 (KREPA2); The KREPA2 (TbMP63) protein is a component of ...
885-989 4.97e-03

Kinetoplastid RNA Editing Protein A2 (KREPA2); The KREPA2 (TbMP63) protein is a component of the parasitic protozoan's KREPA RNA editing catalytic complex (RECC). Kinetoplastid RNA editing (KRE) proteins occur as pairs or sets of related proteins in multiple complexes. The KREPA complex is composed of six components (KREPA1-6), which share a conserved C-terminal region containing an oligonucleotide-binding (OB)-fold-like domain. KREPAs are responsible for the site-specific insertion and deletion of U nucleotides in the kinetoplastid mitochondrial pre-messenger RNA. Apart from the conserved C-terminal OB-fold domain, KREPA1, KREPA2, and KREPA3 contain two conserved C2H2 zinc-finger domains. KREPA2 and kinetoplastid RNA editing ligase 1 (KREL1) are specific for ligation post-U-deletion and are paralogous to KREL2 and KREPA1, which are specific for ligation post-U-insertion. KREPA2 is critical for RECC stability and KREL1 integration into the complex.


Pssm-ID: 467780 [Multi-domain]  Cd Length: 424  Bit Score: 41.01  E-value: 4.97e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663194  885 GTGGAAAYRPQQPVAP-PASNAYPNTPY-------ISPVASYSGQPQMYTAQQASSPTSSSAASFPPPSSGASFQHGGPG 956
Cdd:cd23959    123 SSTQRETHKTAQVAPPkAEPQTAPVTPFgqlpmfgQHPPPAKPLPAAAAAQQSSASPGEVASPFASGTVSASPFATATDT 202
                           90       100       110
                   ....*....|....*....|....*....|....*...
gi 1318663194  957 APPSSSAYALP-----PGTTGTPPAASELPASQRTGPQ 989
Cdd:cd23959    203 APSSGAPDGFPaeasaPSPFAAPASAASFPAAPVANGE 240
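Several hits in the list above cover overlapping stretches of the query (for example, PHA03247 at 792-1088 and Atrophin-1 at 769-1048). The sketch below shows one way to read the "Interval E-value" and "Pssm-ID ..." lines from this text dump and measure how many query residues two hits share. The function names and regular expression are illustrative assumptions about this particular text layout, not part of any NCBI interface.

```python
import re

# Pattern for the per-hit statistics line as it appears in this dump;
# the bracketed "[Multi-domain]" flag is optional.
STATS_RE = re.compile(
    r"Pssm-ID:\s*(?P<pssm_id>\d+)"
    r"(?:\s*\[(?P<flag>[^\]]+)\])?"
    r"\s*Cd Length:\s*(?P<cd_length>\d+)"
    r"\s*Bit Score:\s*(?P<bit_score>[\d.]+)"
    r"\s*E-value:\s*(?P<e_value>[\d.eE+-]+)"
)


def parse_interval(line):
    """Parse a line such as '792-1088 5.97e-10' into (start, end, e_value)."""
    interval, e_value = line.split()
    start, end = interval.split("-")
    return int(start), int(end), float(e_value)


def parse_stats(line):
    """Parse a 'Pssm-ID: ... E-value: ...' line into a dict, or None."""
    m = STATS_RE.search(line)
    if m is None:
        return None
    return {
        "pssm_id": int(m.group("pssm_id")),
        "multi_domain": m.group("flag") == "Multi-domain",
        "cd_length": int(m.group("cd_length")),
        "bit_score": float(m.group("bit_score")),
        "e_value": float(m.group("e_value")),
    }


def overlap(a, b):
    """Number of query residues shared by two inclusive (start, end, ...) hits."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]) + 1)


if __name__ == "__main__":
    pha = parse_interval("792-1088 5.97e-10")
    atrophin = parse_interval("769-1048 2.50e-08")
    print(overlap(pha, atrophin))
    print(parse_stats("Pssm-ID: 432884  Cd Length: 279  Bit Score: 53.33  E-value: 3.56e-07"))
```

Intervals are treated as inclusive on both ends, matching the way the residue ranges are written in the hit list.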
 
Name Accession Description Interval E-value
WD40 COG2319
WD40 repeat [General function prediction only];
89-333 4.79e-22

WD40 repeat [General function prediction only];


Pssm-ID: 441893 [Multi-domain]  Cd Length: 403  Bit Score: 99.99  E-value: 4.79e-22
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663194   89 LIAGGENGNIILYDpskiIAGDKEvvIAQKDKHTGPVRALDvniFQTN--LVASGANESEIYIWDLNNFATPMTPGAKTQ 166
Cdd:COG2319    135 LASGSADGTVRLWD----LATGKL--LRTLTGHSGAVTSVA---FSPDgkLLASGSDDGTVRLWDLATGKLLRTLTGHTG 205
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663194  167 PpedISCIAWNRQvQHILASASPSGRATVWDLRKNEPIIKVSDHSNRMHCsgLAWHPDvATQMVLASEDDRlpvIQMWDL 246
Cdd:COG2319    206 A---VRSVAFSPD-GKLLASGSADGTVRLWDLATGKLLRTLTGHSGSVRS--VAFSPD-GRLLASGSADGT---VRLWDL 275
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663194  247 RfASSPLRVLENHARGILAVAWSmADPELLLSCGKDAKILCSNPNTGEVLYELPTNTQWCFDIQWCPRNPAVLSaASFDG 326
Cdd:COG2319    276 A-TGELLRTLTGHSGGVNSVAFS-PDGKLLASGSDDGTVRLWDLATGKLLRTLTGHTGAVRSVAFSPDGKTLAS-GSDDG 352

                   ....*..
gi 1318663194  327 RISVYSI 333
Cdd:COG2319    353 TVRLWDL 359
WD40 COG2319
WD40 repeat [General function prediction only];
121-338 6.70e-20

WD40 repeat [General function prediction only];


Pssm-ID: 441893 [Multi-domain]  Cd Length: 403  Bit Score: 93.82  E-value: 6.70e-20
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663194  121 HTGPVRALDVNiFQTNLVASGANESEIYIWDLnnfATPMTPGAKTQPPEDISCIAWNRQvQHILASASPSGRATVWDLRK 200
Cdd:COG2319     77 HTAAVLSVAFS-PDGRLLASASADGTVRLWDL---ATGLLLRTLTGHTGAVRSVAFSPD-GKTLASGSADGTVRLWDLAT 151
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663194  201 NEPIIKVSDHSNRMHCsgLAWHPDvATQMVLASEDDRlpvIQMWDLRfASSPLRVLENHARGILAVAWSmADPELLLSCG 280
Cdd:COG2319    152 GKLLRTLTGHSGAVTS--VAFSPD-GKLLASGSDDGT---VRLWDLA-TGKLLRTLTGHTGAVRSVAFS-PDGKLLASGS 223
                          170       180       190       200       210
                   ....*....|....*....|....*....|....*....|....*....|....*...
gi 1318663194  281 KDAKILCSNPNTGEVLYELPTNTQWCFDIQWCPrNPAVLSAASFDGRISVYSIMGGSI 338
Cdd:COG2319    224 ADGTVRLWDLATGKLLRTLTGHSGSVRSVAFSP-DGRLLASGSADGTVRLWDLATGEL 280
WD40 cd00200
WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions ...
171-337 1.09e-16

WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel β-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment.


Pssm-ID: 238121 [Multi-domain]  Cd Length: 289  Bit Score: 82.00  E-value: 1.09e-16
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663194  171 ISCIAWNRQvQHILASASPSGRATVWDLRKNEPIIKVSDHSNRMhcSGLAWHPDvATQMVLASEDDrlpVIQMWDLRfAS 250
Cdd:cd00200     12 VTCVAFSPD-GKLLATGSGDGTIKVWDLETGELLRTLKGHTGPV--RDVAASAD-GTYLASGSSDK---TIRLWDLE-TG 83
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663194  251 SPLRVLENHARGILAVAWSmADPELLLSCGKDAKILCSNPNTGEVLYELPTNTQWCFDIQWCPrNPAVLSAASFDGRISV 330
Cdd:cd00200     84 ECVRTLTGHTSYVSSVAFS-PDGRILSSSSRDKTIKVWDVETGKCLTTLRGHTDWVNSVAFSP-DGTFVASSSQDGTIKL 161

                   ....*..
gi 1318663194  331 YSIMGGS 337
Cdd:cd00200    162 WDLRTGK 168
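
The hit summary line and the paired alignment rows shown above follow a regular layout, so they can be read programmatically. The following is a minimal Python sketch, not an NCBI tool: the function names (`parse_summary`, `parse_row`, `count_identities`) and the regular expressions are assumptions based only on the layout visible on this page, and other CDD output variants may need different patterns.

```python
import re

# Assumed layout of a hit summary line, e.g.
# "Pssm-ID: 238121 [Multi-domain]  Cd Length: 289  Bit Score: 82.00  E-value: 1.09e-16"
SUMMARY_RE = re.compile(
    r"Pssm-ID:\s*(\d+).*?Cd Length:\s*(\d+)\s+Bit Score:\s*([\d.]+)\s+E-value:\s*(\S+)"
)

# Assumed layout of one alignment row: label, start, residues, end, e.g.
# "gi 1318663194  331 YSIMGGS 337" or "Cdd:cd00200    162 WDLRTGK 168"
ROW_RE = re.compile(r"^(gi \d+|Cdd:\S+)\s+(\d+)\s+(\S+)\s+(\d+)\s*$")


def parse_summary(line):
    """Return the numeric fields of a hit summary line as a dict."""
    m = SUMMARY_RE.search(line)
    if m is None:
        raise ValueError("not a hit summary line: %r" % line)
    return {
        "pssm_id": int(m.group(1)),
        "cd_length": int(m.group(2)),
        "bit_score": float(m.group(3)),
        "e_value": float(m.group(4)),
    }


def parse_row(line):
    """Return (label, start, residues, end) for one alignment row."""
    m = ROW_RE.match(line.strip())
    if m is None:
        raise ValueError("not an alignment row: %r" % line)
    return m.group(1), int(m.group(2)), m.group(3), int(m.group(4))


def count_identities(query_row, subject_row):
    """Count identical residues over columns where neither row has a gap.

    Lowercase residues are compared case-insensitively.
    Returns (identical, aligned_columns).
    """
    _, _, q, _ = parse_row(query_row)
    _, _, s, _ = parse_row(subject_row)
    if len(q) != len(s):
        raise ValueError("rows have different lengths")
    aligned = [(a, b) for a, b in zip(q, s) if a != "-" and b != "-"]
    identical = sum(1 for a, b in aligned if a.upper() == b.upper())
    return identical, len(aligned)


if __name__ == "__main__":
    summary = parse_summary(
        "Pssm-ID: 238121 [Multi-domain]  Cd Length: 289  Bit Score: 82.00  E-value: 1.09e-16"
    )
    print(summary)
    print(count_identities("gi 1318663194  331 YSIMGGS 337",
                           "Cdd:cd00200    162 WDLRTGK 168"))
```

Counting identities per block and summing across the blocks of one hit gives a rough percent identity for that hit; this is a convenience measure only and is not the score CDD itself reports.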
PHA03247 PHA03247
large tegument protein UL36; Provisional
792-1088 5.97e-10

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 64.19  E-value: 5.97e-10
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663194  792 PVSGQESSQSPYERQPLSKGRPGPVAGHSQMPRVQTQQYYPhgenPPPPGFIMQGNVIPNPAAPLPTAPGH-MPSQLPPY 870
Cdd:PHA03247  2702 PPPPPTPEPAPHALVSATPLPPGPAAARQASPALPAAPAPP----AVPAGPATPGGPARPARPPTTAGPPApAPPAAPAA 2777
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663194  871 PQPQPYQPAQQYSFGTGGAAAYRPQQPVAPPASNAYPNTPYISPVASYSGQPQMYTAQQASSPTSSSAASfPPPSSGASF 950
Cdd:PHA03247  2778 GPPRRLTRPAVASLSESRESLPSPWDPADPPAAVLAPAAALPPAASPAGPLPPPTSAQPTAPPPPPGPPP-PSLPLGGSV 2856
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663194  951 QHGGPGA--PPSSSAYALPpgTTGTPPAASELPASQRTGPQNGWNDPPalnrvPKKKKMPENFMPPVPITSPIMNPSGDP 1028
Cdd:PHA03247  2857 APGGDVRrrPPSRSPAAKP--AAPARPPVRRLARPAVSRSTESFALPP-----DQPERPPQPQAPPPPQPQPQPPPPPQP 2929
                          250       260       270       280       290       300
                   ....*....|....*....|....*....|....*....|....*....|....*....|.
gi 1318663194 1029 QSQgLQQQPSTPGPLSshasfPQQHLAG-GQPFHGVQQPLAQTGMPPSFSKPNTEGAPGAP 1088
Cdd:PHA03247  2930 QPP-PPPPPRPQPPLA-----PTTDPAGaGEPSGAVPQPWLGALVPGRVAVPRFRVPQPAP 2984
ACE1-Sec16-like cd09233
Ancestral coatomer element 1 (ACE1) of COPII coat complex assembly protein Sec16; COPII coat ...
572-766 1.76e-09

Ancestral coatomer element 1 (ACE1) of COPII coat complex assembly protein Sec16; the COPII coat complex plays an important role in vesicular traffic of newly synthesized proteins from the endoplasmic reticulum (ER) to the Golgi apparatus by mediating the formation of transport vesicles. COPII consists of an outer coat, made up of the scaffold proteins Sec31 and Sec13, and the cargo adaptor complex, Sec23 and Sec24, which are recruited by the small GTPase Sar1. Sec16 is involved in the early steps of the assembly process. Sec16 forms elongated heterotetramers with Sec13, Sec13-(Sec16)2-Sec13. It interacts with Sec13 by insertion of a single beta-blade to close the six-bladed beta propeller of Sec13. In the same way Sec13 interacts with Sec31 and Nup145C, a nuclear pore protein; all of these contain a structurally related ancestral coatomer element 1 (ACE1). Sec16 is believed to be a key component in maintaining the integrity of the ER exit site.


Pssm-ID: 187750 [Multi-domain]  Cd Length: 314  Bit Score: 60.73  E-value: 1.76e-09
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663194  572 ITRALLTGNFESAVDLCLhDNRM-ADAIILAIAGGQELLAQTQKKyFAKSQSKIT---RLITAVVMKNWREIVESC---- 643
Cdd:cd09233     69 FRNLLLTGNRKEALELAL-DNGLwAHALLLASSLGKETWAEVVSR-FARSESKLNdplQTLYQLFSGNSPEAITELadnp 146
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663194  644 -----DLKNWREALAAVLTYAKPD-EFSALCdLLGTRLEREGDSLlrtQACLCYICAGnverlvacwtkAQDGSSPLS-- 715
Cdd:cd09233    147 aeaewALGNWREHLAIILSNRTSNlDLEALV-ELGDLLAQRGLVE---AAHICYLLAG-----------VPLGPYPSSps 211
                          170       180       190       200       210       220
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*.
gi 1318663194  716 -----LQDLIEKVVILR--KAVQLT------QALDTNTVG--ALLAEKMsQYASLLAAQGSIAAAL 766
Cdd:cd09233    212 scllgGAVHNKSPRTFAtpEAIQLTeiyeyaLSLGNPQFGlpHLQPYKL-IHAARLAELGLVSEAL 276
WD40 COG2319
WD40 repeat [General function prediction only];
89-247 3.37e-09

WD40 repeat [General function prediction only];


Pssm-ID: 441893 [Multi-domain]  Cd Length: 403  Bit Score: 60.31  E-value: 3.37e-09
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663194   89 LIAGGENGNIILYDpskiIAGDKEVVIaqKDKHTGPVRALDVNiFQTNLVASGANESEIYIWDLNNFATPMTPGAKTqpp 168
Cdd:COG2319    261 LASGSADGTVRLWD----LATGELLRT--LTGHSGGVNSVAFS-PDGKLLASGSDDGTVRLWDLATGKLLRTLTGHT--- 330
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663194  169 EDISCIAWNRQVQhILASASPSGRATVWDLRKNEPIIKVSDHSNRMHcsGLAWHPD---VATqmvlASEDDRlpvIQMWD 245
Cdd:COG2319    331 GAVRSVAFSPDGK-TLASGSDDGTVRLWDLATGELLRTLTGHTGAVT--SVAFSPDgrtLAS----GSADGT---VRLWD 400

                   ..
gi 1318663194  246 LR 247
Cdd:COG2319    401 LA 402
Atrophin-1 pfam03154
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ...
769-1048 2.50e-08

Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA (OMIM:125370) is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1 that is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.


Pssm-ID: 460830 [Multi-domain]  Cd Length: 991  Bit Score: 58.63  E-value: 2.50e-08
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663194  769 LPDNTNQPNIVQLRDRLCKAQGKPVSGQESSQSPYERQPLSKG--RPGPVAGHSQMPRVQTQQYYPHGENPPPpgFIMQG 846
Cdd:pfam03154  311 PGPSPAAPGQSQQRIHTPPSQSQLQSQQPPREQPLPPAPLSMPhiKPPPTTPIPQLPNPQSHKHPPHLSGPSP--FQMNS 388
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663194  847 NVIPNPA-APLPTAPGHMPSQLPPYPQPQPYQPAQQYSfgtggAAAYRP---QQPVAPPASNAYPNTPYISPVASYSGQP 922
Cdd:pfam03154  389 NLPPPPAlKPLSSLSTHHPPSAHPPPLQLMPQSQQLPP-----PPAQPPvltQSQSLPPPAASHPPTSGLHQVPSQSPFP 463
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663194  923 QMytaqqassptsssaasfpppssgaSFQHGGPGA--PPSSSAYALPPGTTGTPPAASELPASQRTGPqngwndppalnr 1000
Cdd:pfam03154  464 QH------------------------PFVPGGPPPitPPSGPPTSTSSAMPGIQPPSSASVSSSGPVP------------ 507
                          250       260       270       280       290
                   ....*....|....*....|....*....|....*....|....*....|..
gi 1318663194 1001 vpkkkKMPENFMPPVPITSPIMNPSGDPQSQGLQQQPSTPGP----LSSHAS 1048
Cdd:pfam03154  508 -----AAVSCPLPPVQIKEEALDEAEEPESPPPPPRSPSPEPtvvnTPSHAS 554
Atrophin-1 pfam03154
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ...
806-1095 3.17e-08

Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA (OMIM:125370) is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1 that is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.


Pssm-ID: 460830 [Multi-domain]  Cd Length: 991  Bit Score: 58.24  E-value: 3.17e-08
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663194  806 QPLSKGRPGPvaghsQMPRVQTQQYYPHGENPPPPGFIMQGNVI---PNPAAPLPTAPGHMPSQLPPYPQPQPYQPAQQY 882
Cdd:pfam03154  250 QPMTQPPPPS-----QVSPQPLPQPSLHGQMPPMPHSLQTGPSHmqhPVPPQPFPLTPQSSQSQVPPGPSPAAPGQSQQR 324
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663194  883 SFGTGGAAAYRPQQPVA----PPASNAYPNT--PYISPVASY-SGQPQMYTAQQASSPTSSSAASFPPPS-----SGASF 950
Cdd:pfam03154  325 IHTPPSQSQLQSQQPPReqplPPAPLSMPHIkpPPTTPIPQLpNPQSHKHPPHLSGPSPFQMNSNLPPPPalkplSSLST 404
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663194  951 QHGGPGAPPS----SSAYALPPgttgtPPAASELPASQRTGPQNGWNDPP--ALNRVPKKKKMPEN-FMP--PVPITSPI 1021
Cdd:pfam03154  405 HHPPSAHPPPlqlmPQSQQLPP-----PPAQPPVLTQSQSLPPPAASHPPtsGLHQVPSQSPFPQHpFVPggPPPITPPS 479
                          250       260       270       280       290       300       310
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....
gi 1318663194 1022 MNPSGDPQSQGLQQQPSTpGPLSSHASFPQQHLAGGQPFHGVQQPLAQTGMPPSFSKPNTEGAPGAPIGNTIQH 1095
Cdd:pfam03154  480 GPPTSTSSAMPGIQPPSS-ASVSSSGPVPAAVSCPLPPVQIKEEALDEAEEPESPPPPPRSPSPEPTVVNTPSH 552
Atrophin-1 pfam03154
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ...
819-1094 1.56e-07

Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA (OMIM:125370) is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1 that is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.


Pssm-ID: 460830 [Multi-domain]  Cd Length: 991  Bit Score: 55.93  E-value: 1.56e-07
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663194  819 HSQMPRVQTQQYYPHGENPPPPGFIMQGNVIPNPAAPL----PTAPGHMPSQLPPYPQPQPYQPAQQYSFGTGG-AAAYR 893
Cdd:pfam03154  168 QTQPPVLQAQSGAASPPSPPPPGTTQAATAGPTPSAPSvppqGSPATSQPPNQTQSTAAPHTLIQQTPTLHPQRlPSPHP 247
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663194  894 PQQPVAPPA-SNAYPNTPYISPVASYSGQPQMYTAQQ--------------ASSPTSSSAASFPPPSSGASFQ-HGGPGA 957
Cdd:pfam03154  248 PLQPMTQPPpPSQVSPQPLPQPSLHGQMPPMPHSLQTgpshmqhpvppqpfPLTPQSSQSQVPPGPSPAAPGQsQQRIHT 327
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663194  958 PPSSSAYA--LPPGTTGTPPAASEL---------PASQRTGPQNgwNDPPALNRVPKKKKMPENfMPPVPITSPIMN--- 1023
Cdd:pfam03154  328 PPSQSQLQsqQPPREQPLPPAPLSMphikpppttPIPQLPNPQS--HKHPPHLSGPSPFQMNSN-LPPPPALKPLSSlst 404
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663194 1024 ---PSGDP-------QSQGLQ----------QQPSTPGPLSSHASFPQQHLAGGQPfhgvqqPLAQ----TGMPPSFSKP 1079
Cdd:pfam03154  405 hhpPSAHPpplqlmpQSQQLPpppaqppvltQSQSLPPPAASHPPTSGLHQVPSQS------PFPQhpfvPGGPPPITPP 478
                          330
                   ....*....|....*
gi 1318663194 1080 NTEGAPGAPIGNTIQ 1094
Cdd:pfam03154  479 SGPPTSTSSAMPGIQ 493
Med15 pfam09606
ARC105 or Med15 subunit of Mediator complex non-fungal; The approx. 70 residue Med15 domain of ...
773-1152 1.66e-07

ARC105 or Med15 subunit of Mediator complex non-fungal; The approx. 70 residue Med15 domain of the ARC-Mediator co-activator is a three-helix bundle with marked similarity to the KIX domain. The sterol regulatory element binding protein (SREBP) family of transcription activators uses the ARC105 subunit to activate target genes in the regulation of cholesterol and fatty acid homeostasis. In addition, Med15 is a critical transducer of gene activation signals that control early metazoan development.


Pssm-ID: 312941 [Multi-domain]  Cd Length: 732  Bit Score: 55.78  E-value: 1.66e-07
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663194  773 TNQPNIVQLRDRLCKAQGKPVSGQESSQSPYERQPLSKGRP----GPVAGHSQMPRVQTQQyyPHGENPPP-------PG 841
Cdd:pfam09606   90 AGQGTRPQMMGPMGPGPGGPMGQQMGGPGTASNLLASLGRPqmpmGGAGFPSQMSRVGRMQ--PGGQAGGMmqpssgqPG 167
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663194  842 FIMQGNVIPNPAAPLPTAPGHMPSQLPPYPQPQPYQPAQQYSFGT--GGAAAYRPQQPVAPPASN----AYPNTPYispv 915
Cdd:pfam09606  168 SGTPNQMGPNGGPGQGQAGGMNGGQQGPMGGQMPPQMGVPGMPGPadAGAQMGQQAQANGGMNPQqmggAPNQVAM---- 243
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663194  916 asySGQPQMYTAQQASSPTSSSAASFPPPSSGASFQHGGPG---APPSSSAYALPP-GTTGTPPAASELPASQRTGPQNG 991
Cdd:pfam09606  244 ---QQQQPQQQGQQSQLGMGINQMQQMPQGVGGGAGQGGPGqpmGPPGQQPGAMPNvMSIGDQNNYQQQQTRQQQQQQGG 320
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663194  992 wNDPPALNRVP------------------KKKKMPENF----MPPVPITSPIMNPSGDP----------QSQGLQQ--QP 1037
Cdd:pfam09606  321 -NHPAAHQQQMnqsvgqggqvvalgglnhLETWNPGNFgglgANPMQRGQPGMMSSPSPvpgqqvrqvtPNQFMRQspQP 399
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663194 1038 STPGPLSSHASFPQQHLAGGQPF-HGVQQPLAQTG-MPPSFSKPNTEGaPGAPIGNTIQ---------HVQALPTEK--- 1103
Cdd:pfam09606  400 SVPSPQGPGSQPPQSHPGGMIPSpALIPSPSPQMSqQPAQQRTIGQDS-PGGSLNTPGQsavnsplnpQEEQLYREKyrq 478
                          410       420       430       440
                   ....*....|....*....|....*....|....*....|....*....
gi 1318663194 1104 ITKKPIPEEHLILKTTfedliqrclssaTDPQTKRKLDDASKRLEFLYD 1152
Cdd:pfam09606  479 LTKYIEPLKRMIAKME------------NDPGDIDKMNKMKRLLEILSN 515
PHA03247 PHA03247
large tegument protein UL36; Provisional
790-1090 1.75e-07

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 56.10  E-value: 1.75e-07
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663194  790 GKPVSGQESSQSPYERQPLSKGRPGPVAGHSQMPRVQTQQYYPHGENPPPPGFimqgnviPNPAAPLPTA--------PG 861
Cdd:PHA03247  2631 PSPAANEPDPHPPPTVPPPERPRDDPAPGRVSRPRRARRLGRAAQASSPPQRP-------RRRAARPTVGsltsladpPP 2703
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663194  862 HMPSQLPPYPQPQPYQPAQQYSFGTGGAAAYRPQQPVAP--------PASNAYPNTPYI-------SPVASYSGQPQMYT 926
Cdd:PHA03247  2704 PPPTPEPAPHALVSATPLPPGPAAARQASPALPAAPAPPavpagpatPGGPARPARPPTtagppapAPPAAPAAGPPRRL 2783
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663194  927 AQQASSPTSSSAASFPPPSSGASFQHGGPGAPPSSSAYALPPGTTGTPPAASELPASQRTGP------QNGWNDP--PAL 998
Cdd:PHA03247  2784 TRPAVASLSESRESLPSPWDPADPPAAVLAPAAALPPAASPAGPLPPPTSAQPTAPPPPPGPpppslpLGGSVAPggDVR 2863
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663194  999 NRVPKKKKMPENFMPPVPITSPIMNPSGDPQSQGLQQQPSTPGPLSSHASFPQ-QHLAGGQPFHGVQQPLAQTGMPPSFS 1077
Cdd:PHA03247  2864 RRPPSRSPAAKPAAPARPPVRRLARPAVSRSTESFALPPDQPERPPQPQAPPPpQPQPQPPPPPQPQPPPPPPPRPQPPL 2943
                          330
                   ....*....|....
gi 1318663194 1078 KPNTEGAP-GAPIG 1090
Cdd:PHA03247  2944 APTTDPAGaGEPSG 2957
Atrophin-1 pfam03154
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ...
919-1113 2.43e-07

Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA (OMIM:125370) is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1 that is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.


Pssm-ID: 460830 [Multi-domain]  Cd Length: 991  Bit Score: 55.16  E-value: 2.43e-07
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663194  919 SGQPQMYTAQQASSPTSSSAASFPPPSSGASFQHGGPGAPPSS-SAYALPPGTTGTPPAASELPASQRTGPQNGWNDPPa 997
Cdd:pfam03154  161 SAQQQILQTQPPVLQAQSGAASPPSPPPPGTTQAATAGPTPSApSVPPQGSPATSQPPNQTQSTAAPHTLIQQTPTLHP- 239
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663194  998 lNRVPK--------KKKMPENFMPPVPITSPIMNPSGDPQSQGLQQQPS--------TPGPLSSHASFPQ-----QHLAG 1056
Cdd:pfam03154  240 -QRLPSphpplqpmTQPPPPSQVSPQPLPQPSLHGQMPPMPHSLQTGPShmqhpvppQPFPLTPQSSQSQvppgpSPAAP 318
                          170       180       190       200       210
                   ....*....|....*....|....*....|....*....|....*....|....*..
gi 1318663194 1057 GQPFHGVQQPLAQTgMPPSFSKPNTEGAPGAPIgnTIQHVQALPTEKITKKPIPEEH 1113
Cdd:pfam03154  319 GQSQQRIHTPPSQS-QLQSQQPPREQPLPPAPL--SMPHIKPPPTTPIPQLPNPQSH 372
Sec16_C pfam12931
Sec23-binding domain of Sec16; Sec16 is a multi-domain vesicle coat protein. The C-terminal ...
572-766 3.56e-07

Sec23-binding domain of Sec16; Sec16 is a multi-domain vesicle coat protein. The C-terminal region is the part that binds to Sec23, a COPII vesicle coat protein. This association is part of the transport vesicle coat structure.


Pssm-ID: 432884  Cd Length: 279  Bit Score: 53.33  E-value: 3.56e-07
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663194  572 ITRALLTGNFESAVDLCLhDNRM-ADAIILAIAGGQELLAQTQKKY----FAKSQSKITRLItAVVMK----NWREIVE- 641
Cdd:pfam12931    1 IRALLLTGDREKALWLAL-DKKLwAHALLIASTLGKEKWKEVVQEFvrseFKGSNNKSGESL-AALYQvfagNSEEAVDe 78
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663194  642 --------SCDLKNWREALAAVLTYAKPDEFSALCDlLGTRLEREGdslLRTQACLCYICAgNVERLVACWTKAQDGSSP 713
Cdd:pfam12931   79 lvppsknaLWALDNWRETLALVLSNRSPGDVEALLA-LGDLLAQYG---RTEAAHICFLLA-GLPLSQTVLLGADHVRFP 153
                          170       180       190       200       210       220
                   ....*....|....*....|....*....|....*....|....*....|....*....|....
gi 1318663194  714 LSLQDLIEkvvilrkAVQLTQ----ALDTNTVGA-------LLAEKMsQYASLLAAQGSIAAAL 766
Cdd:pfam12931  154 STFGNDLE-------SILLTEiyeyALSLSPPQPpfvglphLLPYKL-QHAAVLAEYGLVSEAQ 209
PHA03247 PHA03247
large tegument protein UL36; Provisional
852-1044 1.56e-06

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 53.02  E-value: 1.56e-06
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663194  852 PAAPLPTAPGHMPSQLPPYPQPQPYQPAQQYSFGTGGAAAYRP--QQPVAPPASNAYPntpyisPVASYSgQPQMYTAQQ 929
Cdd:PHA03247  2824 PAGPLPPPTSAQPTAPPPPPGPPPPSLPLGGSVAPGGDVRRRPpsRSPAAKPAAPARP------PVRRLA-RPAVSRSTE 2896
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663194  930 ASSPTSSSAASFPPPSSGASFQHGGPGAPPSSSAYALPPGTTGTPPAASELPASQRTGPQNGWNDPPALNRVPKKKKMPE 1009
Cdd:PHA03247  2897 SFALPPDQPERPPQPQAPPPPQPQPQPPPPPQPQPPPPPPPRPQPPLAPTTDPAGAGEPSGAVPQPWLGALVPGRVAVPR 2976
                          170       180       190       200
                   ....*....|....*....|....*....|....*....|....*..
gi 1318663194 1010 NFM----PPVPITSPIMNPSGDPQSQGLQQQPS--------TPGPLS 1044
Cdd:PHA03247  2977 FRVpqpaPSREAPASSTPPLTGHSLSRVSSWASslalheetDPPPVS 3023
WD40 cd00200
WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions ...
252-336 2.53e-06

WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel β-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment.


Pssm-ID: 238121 [Multi-domain]  Cd Length: 289  Bit Score: 50.80  E-value: 2.53e-06
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663194  252 PLRVLENHARGILAVAWSmADPELLLSCGKDAKILCSNPNTGEVLYELPTNTQWCFDIQWCPRNPAVLSaASFDGRISVY 331
Cdd:cd00200      1 LRRTLKGHTGGVTCVAFS-PDGKLLATGSGDGTIKVWDLETGELLRTLKGHTGPVRDVAASADGTYLAS-GSSDKTIRLW 78

                   ....*
gi 1318663194  332 SIMGG 336
Cdd:cd00200     79 DLETG 83
DUF4813 pfam16072
Domain of unknown function (DUF4813); This family of proteins is functionally uncharacterized. ...
886-982 1.92e-04

Domain of unknown function (DUF4813); This family of proteins is functionally uncharacterized. This family of proteins is found in eukaryotes. Proteins in this family are typically between 345 and 672 amino acids in length.


Pssm-ID: 435117 [Multi-domain]  Cd Length: 288  Bit Score: 44.75  E-value: 1.92e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663194  886 TGGAAAYRPQQPVAPPASNAYPNTPYISPVASYSGQPQMYTAQQASSPtsssaasfPPPSSGASFQHGGPGAPPSSSAYA 965
Cdd:pfam16072  163 AGGQQPAAPAAPAYPVAPAAYPAQAPAAAPAPAPGAPQTPLAPLNPVA--------AAPAAAAGAAAAPVVAAAAPAAAA 234
                           90
                   ....*....|....*..
gi 1318663194  966 LPPGTTGTPPAASELPA 982
Cdd:pfam16072  235 PPPPAPAAPPADAAPPA 251
PHA03247 PHA03247
large tegument protein UL36; Provisional
894-1088 1.98e-04

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 46.08  E-value: 1.98e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663194  894 PQQPVAPPASNAYPNTPYISPVASYSGQPQMYTAQQASSPTSSSAASFPPPSSGASFQHGGPGAPPSSSAYALPPGttGT 973
Cdd:PHA03247  2564 PDRSVPPPRPAPRPSEPAVTSRARRPDAPPQSARPRAPVDDRGDPRGPAPPSPLPPDTHAPDPPPPSPSPAANEPD--PH 2641
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663194  974 PPAASELPASQRTGPQNGWNDPP----ALNRVPKKKKMPENFMPPV--PITSPIMNpSGDPQSQGLQQQPStPGPLSSHA 1047
Cdd:PHA03247  2642 PPPTVPPPERPRDDPAPGRVSRPrrarRLGRAAQASSPPQRPRRRAarPTVGSLTS-LADPPPPPPTPEPA-PHALVSAT 2719
                          170       180       190       200
                   ....*....|....*....|....*....|....*....|.
gi 1318663194 1048 SFPQQHLAGGQPFhgvQQPLAQTGMPPSFSKPNTEGAPGAP 1088
Cdd:PHA03247  2720 PLPPGPAAARQAS---PALPAAPAPPAVPAGPATPGGPARP 2757
dnaA PRK14086
chromosomal replication initiator protein DnaA;
891-1088 3.73e-04

chromosomal replication initiator protein DnaA;


Pssm-ID: 237605 [Multi-domain]  Cd Length: 617  Bit Score: 44.82  E-value: 3.73e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663194  891 AYRPQQPVAPPASNAYPNTPYIS--PVASYSGQPQMYTAQQASSPTSSSAasfPPPSSGASFQHGGPGA-PPSSSAYALP 967
Cdd:PRK14086    92 AGEPAPPPPHARRTSEPELPRPGrrPYEGYGGPRADDRPPGLPRQDQLPT---ARPAYPAYQQRPEPGAwPRAADDYGWQ 168
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663194  968 PGTTGTPPAASELPASQRTGPQNGWNDPPALNRvpkkkkmPENFMPPVPITSPimNPSGDPQSQGLQQQPStPGPLSSHA 1047
Cdd:PRK14086   169 QQRLGFPPRAPYASPASYAPEQERDREPYDAGR-------PEYDQRRRDYDHP--RPDWDRPRRDRTDRPE-PPPGAGHV 238
                          170       180       190       200
                   ....*....|....*....|....*....|....*....|....
gi 1318663194 1048 sfpqqHLAGGQPFHGVQQPL--AQTGMPPSFSK-PNTEGAPGAP 1088
Cdd:PRK14086   239 -----HRGGPGPPERDDAPVvpIRPSAPGPLAAqPAPAPGPGEP 277
PHA03247 PHA03247
large tegument protein UL36; Provisional
837-1090 5.43e-04

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 44.54  E-value: 5.43e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663194  837 PPPPGfiMQGNVIPNPAAPLPTAPghmpsqlppypqpqpyqpaqqysfgtggAAAYRPQQPVAPPASNAyPNTPyISPVA 916
Cdd:PHA03247  2559 APPAA--PDRSVPPPRPAPRPSEP----------------------------AVTSRARRPDAPPQSAR-PRAP-VDDRG 2606
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663194  917 SYSGQPQMYTAQQASSPTSssaasfPPPSSGAS--FQHGGPG---APPSSSAYALPPGTTGTPP--AASELPASQRTGPQ 989
Cdd:PHA03247  2607 DPRGPAPPSPLPPDTHAPD------PPPPSPSPaaNEPDPHPpptVPPPERPRDDPAPGRVSRPrrARRLGRAAQASSPP 2680
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663194  990 NGWNdPPALNrvPKKKKMPENFMPPVPITSPimNPSGDPQSQGLqqqPSTPGPLSSHASFPQQHLAGGQPF--HGVQQPL 1067
Cdd:PHA03247  2681 QRPR-RRAAR--PTVGSLTSLADPPPPPPTP--EPAPHALVSAT---PLPPGPAAARQASPALPAAPAPPAvpAGPATPG 2752
                          250       260
                   ....*....|....*....|....*
gi 1318663194 1068 AQT--GMPPSFSKPNTEGAPGAPIG 1090
Cdd:PHA03247  2753 GPArpARPPTTAGPPAPAPPAAPAA 2777
PRK14959 PRK14959
DNA polymerase III subunits gamma and tau; Provisional
944-1059 5.77e-04

DNA polymerase III subunits gamma and tau; Provisional


Pssm-ID: 184923 [Multi-domain]  Cd Length: 624  Bit Score: 44.29  E-value: 5.77e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663194  944 PSSGASFQHGGPGAP-PSSSAYAL--PPGTTGTPPAAselPASQRTgPQNGWNDPPALNRVPkKKKMPENFMPPVPITSP 1020
Cdd:PRK14959   373 PSGGGASAPSGSAAEgPASGGAATipTPGTQGPQGTA---PAAGMT-PSSAAPATPAPSAAP-SPRVPWDDAPPAPPRSG 447
                           90       100       110
                   ....*....|....*....|....*....|....*....
gi 1318663194 1021 IMnPSGDPQSQGLQQQPSTPGPLSSHASFPQQHLAGGQP 1059
Cdd:PRK14959   448 IP-PRPAPRMPEASPVPGAPDSVASASDAPPTLGDPSDT 485
PAT1 pfam09770
Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate ...
898-1069 6.10e-04

Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate chromosome transmission during cell division.


Pssm-ID: 401645 [Multi-domain]  Cd Length: 846  Bit Score: 44.26  E-value: 6.10e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663194  898 VAPPASNAYPNTPYISPVASYSGQP------------QMyTAQQASSPTSSSAASFPPPSSGASFQHGGPGAPPSSSAYA 965
Cdd:pfam09770  165 VAPKKAAAPAPAPQPAAQPASLPAPsrkmmsleeveaAM-RAQAKKPAQQPAPAPAQPPAAPPAQQAQQQQQFPPQIQQQ 243
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663194  966 LPPGTTGTPP---AASELPASQRTGPQNGWNDPPALNRVPKKKKMPENfMPPVPI--TSPIMNPS--GDPQSQGLQQQPS 1038
Cdd:pfam09770  244 QQPQQQPQQPqqhPGQGHPVTILQRPQSPQPDPAQPSIQPQAQQFHQQ-PPPVPVqpTQILQNPNrlSAARVGYPQNPQP 322
                          170       180       190
                   ....*....|....*....|....*....|..
gi 1318663194 1039 TPGPLSSHASFPQQHLAGGQ-PFHGVQQPLAQ 1069
Cdd:pfam09770  323 GVQPAPAHQAHRQQGSFGRQaPIITHPQQLAQ 354
PHA03247 PHA03247
large tegument protein UL36; Provisional
813-1111 7.74e-04


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 44.16  E-value: 7.74e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663194  813 PGP-VAGHSQMPRVQTQQYYPHGENPPPPGFimQGNVIPNPAAPLPTAPGHMPSQLppypqpqpyqPAQQYSFGTGGAAA 891
Cdd:PHA03247  2578 SEPaVTSRARRPDAPPQSARPRAPVDDRGDP--RGPAPPSPLPPDTHAPDPPPPSP----------SPAANEPDPHPPPT 2645
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663194  892 Y-----------------------------------RPQQPVAPPA---------SNAYPNTPYISPVASYSGQP-QMYT 926
Cdd:PHA03247  2646 VppperprddpapgrvsrprrarrlgraaqassppqRPRRRAARPTvgsltsladPPPPPPTPEPAPHALVSATPlPPGP 2725
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663194  927 AQQASSPTSSSAASFPPPSSGASFQHGGPGAPPSSSAYALPPGTtgTPPAASELPASQRTGPQNGWNDPPALNRVPKKKK 1006
Cdd:PHA03247  2726 AAARQASPALPAAPAPPAVPAGPATPGGPARPARPPTTAGPPAP--APPAAPAAGPPRRLTRPAVASLSESRESLPSPWD 2803
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663194 1007 ---MPENFMPPVPITSPIMNPSG--DPQSQGLQQQPSTPGPLSSHASFPQQHLAGGQPFHGVQQPLAQTGMPPSFSKPNT 1081
Cdd:PHA03247  2804 padPPAAVLAPAAALPPAASPAGplPPPTSAQPTAPPPPPGPPPPSLPLGGSVAPGGDVRRRPPSRSPAAKPAAPARPPV 2883
                          330       340       350
                   ....*....|....*....|....*....|
gi 1318663194 1082 EGAPGAPIGNTIQHvQALPTEKITKKPIPE 1111
Cdd:PHA03247  2884 RRLARPAVSRSTES-FALPPDQPERPPQPQ 2912
PRK07764 PRK07764
DNA polymerase III subunits gamma and tau; Validated
885-1090 1.06e-03


Pssm-ID: 236090 [Multi-domain]  Cd Length: 824  Bit Score: 43.44  E-value: 1.06e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663194  885 GTGGAAAyRPQQPVAPPASNAYPNTPYISPVASYSGQPQMYTAQQASSPTSSSAASFPPPSSGASFQHGGPGAPPSSSAY 964
Cdd:PRK07764   589 GPAPGAA-GGEGPPAPASSGPPEEAARPAAPAAPAAPAAPAPAGAAAAPAEASAAPAPGVAAPEHHPKHVAVPDASDGGD 667
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663194  965 ALPPGTTGTPPAAsELPASQRTGPQNGWNDPPALNRV-PKKKKMPENFMPPVPITSPIMNPSGDPQSQGLQQQPSTPGPl 1043
Cdd:PRK07764   668 GWPAKAGGAAPAA-PPPAPAPAAPAAPAGAAPAQPAPaPAATPPAGQADDPAAQPPQAAQGASAPSPAADDPVPLPPEP- 745
                          170       180       190       200
                   ....*....|....*....|....*....|....*....|....*..
gi 1318663194 1044 sSHASFPQQHLAGGQPFHGVQQPLAQTGMPPSFSKPNTEGAPGAPIG 1090
Cdd:PRK07764   746 -DDPPDPAGAPAQPPPPPAPAPAAAPAAAPPPSPPSEEEEMAEDDAP 791
PRK10263 PRK10263
DNA translocase FtsK; Provisional
947-1182 1.21e-03


Pssm-ID: 236669 [Multi-domain]  Cd Length: 1355  Bit Score: 43.15  E-value: 1.21e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663194  947 GASFQHGGPgappsssAYALPPGTTGTPPAASELPASQR---TGPQNGWNDPPALNRV---PKKKKMPENFMPPV--PIT 1018
Cdd:PRK10263   677 GEQYQHDVP-------VNAEDADAAAEAELARQFAQTQQqrySGEQPAGANPFSLDDFefsPMKALLDDGPHEPLftPIV 749
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663194 1019 SPIMNPSGDPQSQGLQQQPSTPGPLSSHASFPQQHLAGGQPFHGVQQPLAQtgmPPSFSKPNTEGAPGAPIGNTIQHVQA 1098
Cdd:PRK10263   750 EPVQQPQQPVAPQQQYQQPQQPVAPQPQYQQPQQPVAPQPQYQQPQQPVAP---QPQYQQPQQPVAPQPQYQQPQQPVAP 826
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663194 1099 LPTEKITKKPI---PEEHLILKTTFEDLIQRCLSSATDPQTKrklddaskrLEFLYDKLREqtLSPTIINGLHSIARSIE 1175
Cdd:PRK10263   827 QPQYQQPQQPVapqPQDTLLHPLLMRNGDSRPLHKPTTPLPS---------LDLLTPPPSE--VEPVDTFALEQMARLVE 895
                          250
                   ....*....|....*...
gi 1318663194 1176 TR-----------NYSEG 1182
Cdd:PRK10263   896 ARladfrikadvvNYSPG 913
PRK12323 PRK12323
DNA polymerase III subunit gamma/tau;
814-1016 1.70e-03


Pssm-ID: 237057 [Multi-domain]  Cd Length: 700  Bit Score: 42.56  E-value: 1.70e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663194  814 GPVAghSQMPRVQTQQYYPHGENPPPPgfimqgnVIPNPAAPLPTAPGHMPSQLPPYPQPQPYQPAQQYSFGTGGAAAYR 893
Cdd:PRK12323   380 APVA--QPAPAAAAPAAAAPAPAAPPA-------APAAAPAAAAAARAVAAAPARRSPAPEALAAARQASARGPGGAPAP 450
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663194  894 PQQPVAPPASNAYPNTPYISPVASYSGQPQmytAQQASSPTSSSAASFPPPSSGAsfqhggPGAPPSSSAYALPPGTTGT 973
Cdd:PRK12323   451 APAPAAAPAAAARPAAAGPRPVAAAAAAAP---ARAAPAAAPAPADDDPPPWEEL------PPEFASPAPAQPDAAPAGW 521
                          170       180       190       200
                   ....*....|....*....|....*....|....*....|....
gi 1318663194  974 PPAASELPA-SQRTGPQNGWNDPPALNRVPKKKKMPENFMPPVP 1016
Cdd:PRK12323   522 VAESIPDPAtADPDDAFETLAPAPAAAPAPRAAAATEPVVAPRP 565
dnaA PRK14086
chromosomal replication initiator protein DnaA;
792-988 3.04e-03


Pssm-ID: 237605 [Multi-domain]  Cd Length: 617  Bit Score: 41.73  E-value: 3.04e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663194  792 PVSGQESSQSP----YERQPLSKGRPGPVAGHSQMPRVQTQQYYPHGENPP--PPGFimqgnvipnPAAPLPTAPGHMPS 865
Cdd:PRK14086    90 PSAGEPAPPPPharrTSEPELPRPGRRPYEGYGGPRADDRPPGLPRQDQLPtaRPAY---------PAYQQRPEPGAWPR 160
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663194  866 QLPPYPQPQPYQPAQQYSFGTGGAAAYRPQQPVAPPASNAYPNTPYISPVASYSGQPQMYTAQQASSPTSssaasfPPPS 945
Cdd:PRK14086   161 AADDYGWQQQRLGFPPRAPYASPASYAPEQERDREPYDAGRPEYDQRRRDYDHPRPDWDRPRRDRTDRPE------PPPG 234
                          170       180       190       200
                   ....*....|....*....|....*....|....*....|....
gi 1318663194  946 SGASfQHGGPGAPPSSSAYALPPGTT-GTPPAASELPASQRTGP 988
Cdd:PRK14086   235 AGHV-HRGGPGPPERDDAPVVPIRPSaPGPLAAQPAPAPGPGEP 277
PRK12323 PRK12323
DNA polymerase III subunit gamma/tau;
885-1081 3.84e-03


Pssm-ID: 237057 [Multi-domain]  Cd Length: 700  Bit Score: 41.40  E-value: 3.84e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663194  885 GTGGAAAYRP--QQPVAPPASNAYPNTPYISPVASYSGQPQMYTAQQASSPTSSSAASFPPPSSGA-------SFQHGGP 955
Cdd:PRK12323   367 QSGGGAGPATaaAAPVAQPAPAAAAPAAAAPAPAAPPAAPAAAPAAAAAARAVAAAPARRSPAPEAlaaarqaSARGPGG 446
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663194  956 GAPPSSSAYALPPGTTGTPPAASELPASQRTGPQNGWNDP----PALNRVPKKKKMPENFMPPVPITspiMNPSGDPQSQ 1031
Cdd:PRK12323   447 APAPAPAPAAAPAAAARPAAAGPRPVAAAAAAAPARAAPAaapaPADDDPPPWEELPPEFASPAPAQ---PDAAPAGWVA 523
                          170       180       190       200       210
                   ....*....|....*....|....*....|....*....|....*....|
gi 1318663194 1032 GLQQQPSTPGPLSSHASFPQQHLAGGQPFHGVQQPLAQTGMPPSFSKPNT 1081
Cdd:PRK12323   524 ESIPDPATADPDDAFETLAPAPAAAPAPRAAAATEPVVAPRPPRASASGL 573
DUF3824 pfam12868
Domain of unknown function (DUF3824); This is a repeating domain found in fungal proteins. It ...
794-910 4.49e-03

Domain of unknown function (DUF3824); This is a repeating domain found in fungal proteins. It is proline-rich, and the function is not known.


Pssm-ID: 372351 [Multi-domain]  Cd Length: 145  Bit Score: 38.95  E-value: 4.49e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663194  794 SGQESSQSPYERQPLSKGRPGPVAGHsqmprvqtQQYYPHGEN-PPPPGFIMQGNVIPNPAAPL-PTAPGHMPSqlppyp 871
Cdd:pfam12868   47 DYRDYYEDPYSPSPYPPSPAGPYASQ--------GQYYPETNYfPPPPGSTPQPPVDPQPNAPPpPYNPADYPP------ 112
                           90       100       110
                   ....*....|....*....|....*....|....*....
gi 1318663194  872 qpqpyqpaqqysfGTGGAAAYRPQQPVAPPASNAYPNTP 910
Cdd:pfam12868  113 -------------PPGAAPPPQPYQYPPPPGPDPYAPRP 138
KREPA2 cd23959
Kinetoplastid RNA Editing Protein A2 (KREPA2); The KREPA2 (TbMP63) protein is a component of ...
885-989 4.97e-03

Kinetoplastid RNA Editing Protein A2 (KREPA2); The KREPA2 (TbMP63) protein is a component of the parasitic protozoan's KREPA RNA editing catalytic complex (RECC). Kinetoplastid RNA editing (KRE) proteins occur as pairs or sets of related proteins in multiple complexes. The KREPA complex is composed of six components (KREPA1-6), which share a conserved C-terminal region containing an oligonucleotide-binding (OB)-fold-like domain. KREPAs are responsible for the site-specific insertion and deletion of U nucleotides in the kinetoplastid mitochondrial pre-messenger RNA. Apart from the conserved C-terminal OB-fold domain, KREPA1, KREPA2, and KREPA3 contain two conserved C2H2 zinc-finger domains. KREPA2 and kinetoplastid RNA editing ligase 1 (KREL1) are specific for ligation post-U-deletion and are paralogous to KREL2 and KREPA1, which are specific for ligation post-U-insertion. KREPA2 is critical for RECC stability and KREL1 integration into the complex.


Pssm-ID: 467780 [Multi-domain]  Cd Length: 424  Bit Score: 41.01  E-value: 4.97e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663194  885 GTGGAAAYRPQQPVAP-PASNAYPNTPY-------ISPVASYSGQPQMYTAQQASSPTSSSAASFPPPSSGASFQHGGPG 956
Cdd:cd23959    123 SSTQRETHKTAQVAPPkAEPQTAPVTPFgqlpmfgQHPPPAKPLPAAAAAQQSSASPGEVASPFASGTVSASPFATATDT 202
                           90       100       110
                   ....*....|....*....|....*....|....*...
gi 1318663194  957 APPSSSAYALP-----PGTTGTPPAASELPASQRTGPQ 989
Cdd:cd23959    203 APSSGAPDGFPaeasaPSPFAAPASAASFPAAPVANGE 240
PHA03379 PHA03379
EBNA-3A; Provisional
788-1088 5.36e-03


Pssm-ID: 223066 [Multi-domain]  Cd Length: 935  Bit Score: 41.20  E-value: 5.36e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663194  788 AQGKPVSGQESSQSPYERQPLSKGRPGPV-----AGH---------SQMPRVQTQQYYPH---GENPPPPGFIMQGNVIP 850
Cdd:PHA03379   468 AQLPPGPLQDLEPGDQLPGVVQDGRPACApvpapAGPivrpweaslSQVPGVAFAPVMPQpmpVEPVPVPTVALERPVCP 547
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663194  851 NPAAPLPTAPGHMPSQLPPYPQPQPYQ-------PAQQYSFGTGGAAAYRPQQPVAPPASNAYPNTPYISPVASYSGqPQ 923
Cdd:PHA03379   548 APPLIAMQGPGETSGIVRVRERWRPAPwtpnpprSPSQMSVRDRLARLRAEAQPYQASVEVQPPQLTQVSPQQPMEY-PL 626
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663194  924 MYTAQQASSPtsssaasfPPPSSGASFQHGG-PGAPPSSSAYALP-PGTTGTPPAaselPASQRTGPqngwndppalnRV 1001
Cdd:PHA03379   627 EPEQQMFPGS--------PFSQVADVMRAGGvPAMQPQYFDLPLQqPISQGAPLA----PLRASMGP-----------VP 683
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663194 1002 PKKKKMPENFmpPVPITSPImnPSGDPQSQGLQQQPSTPGPLSSHASFPqqhlaGGQPFHGVQQPLAQT---GMPpsFSK 1078
Cdd:PHA03379   684 PVPATQPQYF--DIPLTEPI--NQGASAAHFLPQQPMEGPLVPERWMFQ-----GATLSQSVRPGVAQSqyfDLP--LTQ 752
                          330
                   ....*....|
gi 1318663194 1079 PNTEGAPGAP 1088
Cdd:PHA03379   753 PINHGAPAAH 762
 
Blast search parameters
Data Source: Precalculated data, version = cdd.v.3.21
Preset Options: Database: CDSEARCH/cdd; Low complexity filter: no; Composition Based Adjustment: yes; E-value threshold: 0.01
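
The E-value threshold above can be checked against the per-hit summary lines in this listing (the lines beginning with "Pssm-ID:"). The following Python sketch parses one such line and tests its E-value against the 0.01 cutoff. The names `SUMMARY_RE`, `parse_summary`, and `passes_threshold` are hypothetical helpers written for illustration and are not part of any NCBI tool; the pattern assumes the field order shown in this listing, and treating the cutoff as inclusive is also an assumption.

```python
import re

# Assumed layout of a per-hit summary line, copied from this listing, e.g.
# "Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 44.54  E-value: 5.43e-04"
SUMMARY_RE = re.compile(
    r"Pssm-ID:\s*(?P<pssm_id>\d+)\s*(?:\[(?P<hit_type>[^\]]+)\])?\s*"
    r"Cd Length:\s*(?P<cd_length>\d+)\s*"
    r"Bit Score:\s*(?P<bit_score>[\d.]+)\s*"
    r"E-value:\s*(?P<e_value>[\d.]+(?:[eE][-+]?\d+)?)"
)


def parse_summary(line):
    """Return the fields of one summary line as a dict, or None if it does not match."""
    m = SUMMARY_RE.search(line)
    if m is None:
        return None
    return {
        "pssm_id": int(m.group("pssm_id")),
        "hit_type": m.group("hit_type"),  # e.g. "Multi-domain"; None if absent
        "cd_length": int(m.group("cd_length")),
        "bit_score": float(m.group("bit_score")),
        "e_value": float(m.group("e_value")),
    }


def passes_threshold(hit, threshold=0.01):
    """True if the hit's E-value is at or below the cutoff (inclusive by assumption)."""
    return hit["e_value"] <= threshold


if __name__ == "__main__":
    line = "Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 44.54  E-value: 5.43e-04"
    hit = parse_summary(line)
    print(hit, passes_threshold(hit))
```

Every hit listed on this page reports an E-value below 0.01, so each would pass this check.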
