Conserved domains on  [gi|971402013|ref|XP_015143811|]

protein transport protein Sec24C isoform X3 [Gallus gallus]

List of domain hits

Name Accession Description Interval E-value
COG5028 super family cl34873
Vesicle coat complex COPII, subunit SEC24/subunit SFB2/subunit SFB3 [Intracellular trafficking ...
321-1115 8.47e-175

Vesicle coat complex COPII, subunit SEC24/subunit SFB2/subunit SFB3 [Intracellular trafficking and secretion];


The actual alignment was detected with superfamily member COG5028:

Pssm-ID: 227361 [Multi-domain]  Cd Length: 861  Bit Score: 535.53  E-value: 8.47e-175
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 971402013  321 PQPNYGGAYPGTPNYGSQPGPppkrldpdSIPS-PIQVIEDdrnnrgSEPFVTGVRG----QVPPLvTTNFLVKDQGNAS 395
Cdd:COG5028    85 SQQKFSSPYGGSMADGTAPKP--------TNPLvPVDLFED------QPPPISDLFLppppIVPPL-TTNFVGSEQSNCS 149
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 971402013  396 PRYIRCTSYNIPCTSDMAKQSQVPLAAVIKPLATLPPEETLPYLVDHGEsgPVRCNRCKAYMCPFMQFIEGGRRFQCCFC 475
Cdd:COG5028   150 PKYVRSTMYAIPETNDLLKKSKIPFGLVIRPFLELYPEEDPVPLVEDGS--IVRCRRCRSYINPFVQFIEQGRKWRCNIC 227
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 971402013  476 SCVTEVPAHYFQHLDHTGKRVDFYDRPELSLGSYEFLATVDYckNNKFPSPPAFIFMIDVSYNAVKSGLVRLICEELKSI 555
Cdd:COG5028   228 RSKNDVPEGFDNPSGPNDPRSDRYSRPELKSGVVDFLAPKEY--SLRQPPPPVYVFLIDVSFEAIKNGLVKAAIRAILEN 305
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 971402013  556 LDYLPregNMEESAiRVGFVTYNKVLHFYNVKSSLaQPQMMVVSDVADMFVPLLDG-FLVNVNESRTVITSLLDQIPEMF 634
Cdd:COG5028   306 LDQIP---NFDPRT-KIAIICFDSSLHFFKLSPDL-DEQMLIVSDLDEPFLPFPSGlFVLPLKSCKQIIETLLDRVPRIF 380
                         330       340       350       360       370       380       390       400
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 971402013  635 ADTRETETVFGPviqagleALKAA-----ECAGKLFIFHTSLPIAeAPGKLKNRDDKklintdkEKTLFQPQTSFYSNLA 709
Cdd:COG5028   381 QDNKSPKNALGP-------ALKAAksligGTGGKIIVFLSTLPNM-GIGKLQLREDK-------ESSLLSCKDSFYKEFA 445
                         410       420       430       440       450       460       470       480
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 971402013  710 KDCVAQGCCVDLFLFPNQYLDVATLGVVTYQTGGSIYKYAYFQLE--ADQDRFLNDLRRDVQKEVGFDAVMRVRTSTGIR 787
Cdd:COG5028   446 IECSKVGISVDLFLTSEDYIDVATLSHLCRYTGGQTYFYPNFSATrpNDATKLANDLVSHLSMEIGYEAVMRVRCSTGLR 525
                         490       500       510       520       530       540       550       560
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 971402013  788 ATDFFGAFYMSNTTDVEMAGLDCDKTITVEFKHDDKLSeDSGALLQCALLYTSCAGQRRLRIHNLSLNCCTQLADLYRNC 867
Cdd:COG5028   526 VSSFYGNFFNRSSDLCAFSTMPRDTSLLVEFSIDEKLM-TSDVYFQVALLYTLNDGERRIRVVNLSLPTSSSIREVYASA 604
                         570       580       590       600       610       620       630       640
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 971402013  868 ETDTLINYLAKYAYRGVLNSPVKSVRDSLINQCAQILACYRKNCASPSSAGQLILPECMKLLPVYLNCVLKSDVLQPGpE 947
Cdd:COG5028   605 DQLAIACILAKKASTKALNSSLKEARVLINKSMVDILKAYKKELVKSNTSTQLPLPANLKLLPLLMLALLKSSAFRSG-S 683
                         650       660       670       680       690       700       710       720
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 971402013  948 VTTDDRAYIRQLVTSMDVAETNVFFYPRLLPLTKADVDSDS-------LPAAIRNSEERLSKGDIYLLENGLNIFVWVGV 1020
Cdd:COG5028   684 TPSDIRISALNRLTSLPLKQLMRNIYPTLYALHDMPIEAGLpdegllvLPSPINATSSLLESGGLYLIDTGQKIFLWFGK 763
                         730       740       750       760       770       780       790       800
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 971402013 1021 NVQQGLIQNLFGVSSFSQISSTLSTLPVLENPFSKKVRSIIDMLHLQ-RSRYMKLIIVKQ--EDKLEMLFKHFLVEDKSL 1097
Cdd:COG5028   764 DAVPSLLQDLFGVDSLSDIPSGKFTLPPTGNEFNERVRNIIGELRSVnDDSTLPLVLVRGggDPSLRLWFFSTLVEDKTL 843
                         810
                  ....*....|....*...
gi 971402013 1098 tGGASYVDFLCHMHKEIR 1115
Cdd:COG5028   844 -NIPSYLDYLQILHEKIK 860
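The alignment rows above follow a fixed layout: a label, a start coordinate, the aligned sequence, and an end coordinate. The following is a minimal Python sketch, not part of the NCBI output, that parses one query row and one Cdd row and counts identical residues over gap-free columns. The names `AlignedRow`, `parse_row`, and `identity` are hypothetical, and the sketch assumes that `-` marks a gap and that lowercase query residues sit opposite gaps, as they do in the rows shown here.

```python
from typing import NamedTuple, Tuple


class AlignedRow(NamedTuple):
    label: str   # e.g. "gi 971402013" or "Cdd:COG5028"
    start: int   # coordinate of the first residue on this row
    seq: str     # aligned sequence, "-" marks a gap
    end: int     # coordinate of the last residue on this row


def parse_row(line: str) -> AlignedRow:
    """Split one alignment row; the last three fields are start, sequence, end."""
    parts = line.split()
    if len(parts) < 4:
        raise ValueError(f"not an alignment row: {line!r}")
    label = " ".join(parts[:-3])
    return AlignedRow(label, int(parts[-3]), parts[-2], int(parts[-1]))


def identity(query: AlignedRow, hit: AlignedRow) -> Tuple[int, int]:
    """Return (identical residues, gap-free columns) for two aligned rows."""
    if len(query.seq) != len(hit.seq):
        raise ValueError("rows must have the same aligned length")
    identical = aligned = 0
    for q, h in zip(query.seq, hit.seq):
        if q == "-" or h == "-":
            continue  # skip columns where either row has a gap
        aligned += 1
        if q.upper() == h.upper():
            identical += 1
    return identical, aligned


# The final block of the COG5028 alignment, copied from the listing above.
query = parse_row("gi 971402013 1098 tGGASYVDFLCHMHKEIR 1115")
hit = parse_row("Cdd:COG5028   844 -NIPSYLDYLQILHEKIK 860")
print(identity(query, hit))  # (6, 17)
```

Summing the two counts over every block of a hit would give an approximate percent identity for the whole alignment.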
Atrophin-1 super family cl38111
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ...
3-281 3.26e-13

Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA (OMIM:125370) is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1, which is thought to confer toxicity to the protein, possibly by altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.


The actual alignment was detected with superfamily member pfam03154:

Pssm-ID: 460830 [Multi-domain]  Cd Length: 991  Bit Score: 74.42  E-value: 3.26e-13
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 971402013     3 VNQHTHAGPPygqPQPGYQGYQQPAYGGqPLPGVPHT-QYGAYNGPMPGYQQPVPP-----QGSVRALPTSGAPPPASGT 76
Cdd:pfam03154  249 LQPMTQPPPP---SQVSPQPLPQPSLHG-QMPPMPHSlQTGPSHMQHPVPPQPFPLtpqssQSQVPPGPSPAAPGQSQQR 324
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 971402013    77 SLPSGHQGYSQFGQGDVQNGIPTSTAPMQ--RPPASQPFLPGSA------PAPVSQPSTFQQYG--PPPCSVQQLSNHMa 146
Cdd:pfam03154  325 IHTPPSQSQLQSQQPPREQPLPPAPLSMPhiKPPPTTPIPQLPNpqshkhPPHLSGPSPFQMNSnlPPPPALKPLSSLS- 403
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 971402013   147 gmTIGSTSVSAPP----PAGLGYGPPTSVPPV---SGSFSATGSGLYTPYTASPGPpppsvpqglplAQPPFSGQPVPTQ 219
Cdd:pfam03154  404 --THHPPSAHPPPlqlmPQSQQLPPPPAQPPVltqSQSLPPPAASHPPTSGLHQVP-----------SQSPFPQHPFVPG 470
                          250       260       270       280       290       300
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*.
gi 971402013   220 RLPTEVPGFAPPPPA----TGIGASSYPPPTGAPRPPPMPGPPLSGQTVAGPPMSQPNHVSSPPPP 281
Cdd:pfam03154  471 GPPPITPPSGPPTSTssamPGIQPPSSASVSSSGPVPAAVSCPLPPVQIKEEALDEAEEPESPPPP 536
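Each hit is summarized by a `Pssm-ID: ... Cd Length: ... Bit Score: ... E-value: ...` line, and the Cdd rows give the span of the domain model that was aligned. As a rough illustration only, the Python sketch below parses such a summary line and computes the fraction of the model covered by the alignment. The regular expression and the names `parse_summary` and `model_coverage` are assumptions made for this example, not an NCBI API.

```python
import re

# Pattern for the per-hit summary line as it appears in this listing (assumed layout).
SUMMARY_RE = re.compile(
    r"Pssm-ID:\s*(?P<pssm_id>\d+)\s*\[(?P<kind>[^\]]+)\]\s*"
    r"Cd Length:\s*(?P<cd_length>\d+)\s*"
    r"Bit Score:\s*(?P<bit_score>[\d.]+)\s*"
    r"E-value:\s*(?P<e_value>\S+)"
)


def parse_summary(line: str) -> dict:
    """Extract the numeric fields from one summary line."""
    match = SUMMARY_RE.search(line)
    if match is None:
        raise ValueError(f"not a summary line: {line!r}")
    return {
        "pssm_id": int(match["pssm_id"]),
        "kind": match["kind"],
        "cd_length": int(match["cd_length"]),
        "bit_score": float(match["bit_score"]),
        "e_value": float(match["e_value"]),
    }


def model_coverage(model_start: int, model_end: int, cd_length: int) -> float:
    """Fraction of the domain model spanned by the alignment (inclusive coordinates)."""
    return (model_end - model_start + 1) / cd_length


# Summary line and model coordinates (249 to 536) of the pfam03154 hit above.
hit = parse_summary(
    "Pssm-ID: 460830 [Multi-domain]  Cd Length: 991  Bit Score: 74.42  E-value: 3.26e-13"
)
print(round(model_coverage(249, 536, hit["cd_length"]), 3))  # 0.291
```

A low coverage value such as this one, together with the proline- and glutamine-rich sequence in the alignment, suggests reading the hit with some caution.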
 
Sec24-like cd01479
Sec24-like: Protein and membrane traffic in eukaryotes is mediated at least in part by the ...
524-772 1.40e-121

Sec24-like: Protein and membrane traffic in eukaryotes is mediated at least in part by the budding and fusion of intracellular transport vesicles that selectively carry cargo proteins and lipids from donor to acceptor organelles. The two main classes of vesicular carriers within the endocytic and the biosynthetic pathways are COP- and clathrin-coated vesicles. Formation of COPII vesicles requires the ordered assembly of the coat, which is built from several cytosolic components: the GTPase Sar1 and the Sec23-Sec24 and Sec13-Sec31 complexes. The process is initiated by the conversion of GDP to GTP by the GTPase Sar1, which then recruits the heterodimeric complex of Sec23 and Sec24. This heterodimeric complex generates the pre-budding complex. The final step, leading to membrane deformation and budding of COPII-coated vesicles, is carried out by the heterodimeric complex Sec13-Sec31. The members of this CD belong to the Sec23-like family. Sec24 is very similar to Sec23. The Sec23 and Sec24 polypeptides fold into five distinct domains: a beta-barrel, a zinc finger, a vWA or trunk domain, an all-helical region and a carboxy-terminal gelsolin domain. The members of this subgroup carry a partial MIDAS motif and have the overall para-Rossmann type fold that is characteristic of this superfamily.


Pssm-ID: 238756 [Multi-domain]  Cd Length: 244  Bit Score: 373.53  E-value: 1.40e-121
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 971402013  524 PSPPAFIFMIDVSYNAVKSGLVRLICEELKSILDYLPREgnmeESAIRVGFVTYNKVLHFYNVKSSLAQPQMMVVSDVAD 603
Cdd:cd01479     1 PQPAVYVFLIDVSYNAIKSGLLATACEALLSNLDNLPGD----DPRTRVGFITFDSTLHFFNLKSSLEQPQMMVVSDLDD 76
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 971402013  604 MFVPLLDGFLVNVNESRTVITSLLDQIPEMFADTRETETVFGPVIQAGLEALKaaECAGKLFIFHTSLPIAEApGKLKNR 683
Cdd:cd01479    77 PFLPLPDGLLVNLKESRQVIEDLLDQIPEMFQDTKETESALGPALQAAFLLLK--ETGGKIIVFQSSLPTLGA-GKLKSR 153
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 971402013  684 DDKKLINTDKEKTLFQPQTSFYSNLAKDCVAQGCCVDLFLFPNQYLDVATLGVVTYQTGGSIYKYA--YFQLEADQDRFL 761
Cdd:cd01479   154 EDPKLLSTDKEKQLLQPQTDFYKKLALECVKSQISVDLFLFSNQYVDVATLGCLSRLTGGQVYYYPsfNFSAPNDVEKLV 233
                         250
                  ....*....|.
gi 971402013  762 NDLRRDVQKEV 772
Cdd:cd01479   234 NELARYLTRKI 244
Sec23_trunk pfam04811
Sec23/Sec24 trunk domain; COPII-coated vesicles carry proteins from the endoplasmic reticulum ...
524-768 6.62e-115

Sec23/Sec24 trunk domain; COPII-coated vesicles carry proteins from the endoplasmic reticulum to the Golgi complex. This vesicular transport can be reconstituted by using three cytosolic components containing five proteins: the small GTPase Sar1p, the Sec23p/24p complex, and the Sec13p/Sec31p complex. This domain is known as the trunk domain and has an alpha/beta vWA fold and forms the dimer interface.


Pssm-ID: 398467 [Multi-domain]  Cd Length: 241  Bit Score: 355.79  E-value: 6.62e-115
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 971402013   524 PSPPAFIFMIDVSYNAVKSGLVRLICEELKSILDYLPREGNMeesaiRVGFVTYNKVLHFYNVKSSLAQPQMMVVSDVAD 603
Cdd:pfam04811    1 PQPPVFLFVIDVSYNAIKSGLLAALKESLLQSLDLLPGDPRA-----RVGFITFDSTVHFFNLGSSLRQPQMLVVSDLQD 75
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 971402013   604 MFVPLLDGFLVNVNESRTVITSLLDQIPEMFADTRETETVFGPVIQAGLEALKAAECAGKLFIFHTSLPIAEAPGKLKNR 683
Cdd:pfam04811   76 MFLPLPDRFLVPLSECRFVLEDLLEQLPPMFPVTKRPERCLGPALQAAFLLLKAAFTGGKIMVFQGGLPTVGPGGKLKSR 155
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 971402013   684 DDKKLINTDKEKTLFQPQT-SFYSNLAKDCVAQGCCVDLFLFPNQYLDVATLGVVTYQTGGSIYKYAYFQLEADQDRFLN 762
Cdd:pfam04811  156 LDESHHGTDKEKAKLVKKAdKFYKSLAKECVKQGHSVDLFAFSLDYVDVATLGQLSRLTGGQVYLYPSFQADVDGSKFKQ 235

                   ....*.
gi 971402013   763 DLRRDV 768
Cdd:pfam04811  236 DLQRYF 241
PTZ00395 PTZ00395
Sec24-related protein; Provisional
522-1116 1.73e-49

Sec24-related protein; Provisional


Pssm-ID: 185594 [Multi-domain]  Cd Length: 1560  Bit Score: 192.60  E-value: 1.73e-49
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 971402013  522 KFPS-----PPAFIFMIDVSYNAVKSGLVRLICEELKSILDylpregNMEESAIRVGFVTYNKVLHFYNVKSSLAQP--- 593
Cdd:PTZ00395  943 KYPQvknmlPPYFVFVVECSYNAIYNNITYTILEGIRYAVQ------NVKCPQTKIAIITFNSSIYFYHCKGGKGVSgee 1016
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 971402013  594 ----------QMMVVSDVADMFVPL-LDGFLVNVNESRTVITSLLDQIPEMFADTRETETVFGPVIQAGLEALKAAECAG 662
Cdd:PTZ00395 1017 gdggggsgnhQVIVMSDVDDPFLPLpLEDLFFGCVEEIDKINTLIDTIKSVSTTMQSYGSCGNSALKIAMDMLKERNGLG 1096
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 971402013  663 KLFIFHTSLPIAeAPGKLKnrddkKLINTDKEKTLFQPQTSFYSNLAKDCVAQGCCVDLFLFP--NQYLDVATLGVVTYQ 740
Cdd:PTZ00395 1097 SICMFYTTTPNC-GIGAIK-----ELKKDLQENFLEVKQKIFYDSLLLDLYAFNISVDIFIISsnNVRVCVPSLQYVAQN 1170
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 971402013  741 TGGSIYKYAYFQLEAD-QDRFLNDLRRDVQKEVGFDAVMRVRTSTGIRATDFFGAFYMSNTT----DVEMAGLDCDKTIT 815
Cdd:PTZ00395 1171 TGGKILFVENFLWQKDyKEIYMNIMDTLTSEDIAYCCELKLRYSHHMSVKKLFCCNNNFNSIisvdTIKIPKIRHDQTFA 1250
                         330       340       350       360       370       380       390       400
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 971402013  816 VEFKHDDKLSEDSGALLQCALLYTSCAGQRRLRIHNLSLNCCTQLADLYRNCETDTLINYLAKYAYRGVLNSPVKSvrDS 895
Cdd:PTZ00395 1251 FLLNYSDISESKKQIYFQCACIYTNLWGDRFVRLHTTHMNLTSSLSTVFRYTDAEALMNILIKQLCTNILHNDNYS--KI 1328
                         410       420       430       440       450       460       470       480
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 971402013  896 LINQCAQILACYRKNCASPSSAGQLILPECMKLLPVYLNCVLKSDVLQpgPEVTTDDRAYIRQLVTSMDVAETNVFFYP- 974
Cdd:PTZ00395 1329 IIDNLAAILFSYRINCASSAHSGQLILPDTLKLLPLFTSSLLKHNVTK--KEILHDLKVYSLIKLLSMPIISSLLYVYPv 1406
                         490       500       510       520       530       540       550       560
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 971402013  975 --------RLLPLTKADVDSD-SLPAAIRNSEERLSKGDIYLLENGLNIFVWVGVNVQQGLIQNLFGVSSFSQISSTLSt 1045
Cdd:PTZ00395 1407 myvihikgKTNEIDSMDVDDDlFIPKTIPSSAEKIYSNGIYLLDACTHFYLYFGFHSDANFAKEIVGDIPTEKNAHELN- 1485
                         570       580       590       600       610       620       630
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*
gi 971402013 1046 lpVLENPFSKKVRSIIDML----HLqrSRYMKLIIVKQEDKLEMLFKHFLVEDKSlTGGASYVDFLCHMHKEIRQ 1116
Cdd:PTZ00395 1486 --LTDTPNAQKVQRIIKNLsrihHF--NKYVPLVMVAPKSNEEEHLISLCVEDKA-DKEYSYVNFLCFIHKLVHK 1555
PHA03247 PHA03247
large tegument protein UL36; Provisional
46-380 3.15e-10

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 64.96  E-value: 3.15e-10
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 971402013   46 GPMPGYQQPVPPQGSvralPTSGAPPPASGTSLPSGHQGYSQFGQgdvqngiPTSTAPMQRPPASQPFLPGSAPAPVSQP 125
Cdd:PHA03247 2693 GSLTSLADPPPPPPT----PEPAPHALVSATPLPPGPAAARQASP-------ALPAAPAPPAVPAGPATPGGPARPARPP 2761
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 971402013  126 STfqqYGPPpcsvqqlsnhmagmtigstsvSAPPPAGLGYGPP--TSVPPVSGSFSATGSGLYTPYTASPGPPPPSVPQG 203
Cdd:PHA03247 2762 TT---AGPP---------------------APAPPAAPAAGPPrrLTRPAVASLSESRESLPSPWDPADPPAAVLAPAAA 2817
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 971402013  204 LPLAQPPFSGQPVPTQRLPTEVPGFAPPPPATGIGASSYPPPTGAPRPPPMPGPPLSGQTVAGPPMSQPNHVSSPPPPLT 283
Cdd:PHA03247 2818 LPPAASPAGPLPPPTSAQPTAPPPPPGPPPPSLPLGGSVAPGGDVRRRPPSRSPAAKPAAPARPPVRRLARPAVSRSTES 2897
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 971402013  284 LSGPHPGPPMSGPPPPTHPPQPGYQMQQNgsfgqvrgPQPNyggayPGTPNYGSQPGPPPKRLDPDSIPSPIQVIEDDRN 363
Cdd:PHA03247 2898 FALPPDQPERPPQPQAPPPPQPQPQPPPP--------PQPQ-----PPPPPPPRPQPPLAPTTDPAGAGEPSGAVPQPWL 2964
                         330
                  ....*....|....*..
gi 971402013  364 NRGSEPFVTGVRGQVPP 380
Cdd:PHA03247 2965 GALVPGRVAVPRFRVPQ 2981
PABP-1234 TIGR01628
polyadenylate binding protein, human types 1, 2, 3, 4 family; These eukaryotic proteins ...
2-148 1.44e-04

polyadenylate binding protein, human types 1, 2, 3, 4 family; These eukaryotic proteins recognize the poly-A of mRNA and consist of four tandem RNA recognition domains at the N-terminus (rrm: pfam00076) followed by a PABP-specific domain (pfam00658) at the C-terminus. The protein is involved in the transport of mRNAs from the nucleus to the cytoplasm. There are four paralogs in Homo sapiens: one expressed in testis, one in platelets, one broadly expressed, and one of unknown tissue range.


Pssm-ID: 130689 [Multi-domain]  Cd Length: 562  Bit Score: 45.95  E-value: 1.44e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 971402013     2 SVNQHTHAGPPYGQPQPGYQgyqqpaYGGQPLpGVPHTQygayngPMPGYQQPVPPQGsvralPTSGAPPPASGTSlpsg 81
Cdd:TIGR01628  389 SPMGGAMGQPPYYGQGPQQQ------FNGQPL-GWPRMS------MMPTPMGPGGPLR-----PNGLAPMNAVRAP---- 446
                           90       100       110       120       130       140
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*..
gi 971402013    82 hqgysqfGQGDVQNGIPTSTAPMQRPPASQPfLPGSAPAPVSQPSTFQQyGPPPCSVQQLSNHMAGM 148
Cdd:TIGR01628  447 -------SRNAQNAAQKPPMQPVMYPPNYQS-LPLSQDLPQPQSTASQG-GQNKKLAQVLASATPQM 504
PPE COG5651
PPE-repeat protein [Function unknown];
63-285 7.42e-03

PPE-repeat protein [Function unknown];


Pssm-ID: 444372 [Multi-domain]  Cd Length: 385  Bit Score: 39.88  E-value: 7.42e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 971402013   63 ALPTSGAPPPASGTslPSGHQGYSQFG--QGDVQNGIPTSTAPMQRPPASQPFLPGSAPAPVSqpstfqqygPPPCSVQQ 140
Cdd:COG5651   163 ALTPFTQPPPTITN--PGGLLGAQNAGsgNTSSNPGFANLGLTGLNQVGIGGLNSGSGPIGLN---------SGPGNTGF 231
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 971402013  141 LSNHMAGMTIGSTSVSAPPPAGLGYGPPTSVPPVSGSFSATGSGLYTPYTASPGPPPPSVPQGLPLAQPPFSGQPVPTQR 220
Cdd:COG5651   232 AGTGAAAGAAAAAAAAAAAAGAGASAALASLAATLLNASSLGLAATAASSAATNLGLAGSPLGLAGGGAGAAAATGLGLG 311
                         170       180       190       200       210       220
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*
gi 971402013  221 LPTEVPGFAPPPPATGIGASSYPPPTGAPRPPPMPGPPLSGQTVAGPPMSQPNHVSSPPPPLTLS 285
Cdd:COG5651   312 AGGAAGAAGATGAGAALGAGAAAAAAGAAAGAGAAAAAAAGGAGGGGGGALGAGGGGGSAGAAAG 376
 
Sec24-like cd01479
Sec24-like: Protein and membrane traffic in eukaryotes is mediated at least in part by the ...
524-772 1.40e-121

Sec24-like: Protein and membrane traffic in eukaryotes is mediated at least in part by the budding and fusion of intracellular transport vesicles that selectively carry cargo proteins and lipids from donor to acceptor organelles. The two main classes of vesicular carriers within the endocytic and the biosynthetic pathways are COP- and clathrin-coated vesicles. Formation of COPII vesicles requires the ordered assembly of the coat from several cytosolic components: the GTPase Sar1 and the Sec23-Sec24 and Sec13-Sec31 complexes. The process is initiated by the exchange of GDP for GTP on the GTPase Sar1, which then recruits the heterodimeric complex of Sec23 and Sec24. This heterodimeric complex generates the pre-budding complex. The final step, leading to membrane deformation and budding of COPII-coated vesicles, is carried out by the heterodimeric complex Sec13-Sec31. The members of this CD belong to the Sec23-like family. Sec24 is very similar to Sec23. The Sec23 and Sec24 polypeptides fold into five distinct domains: a beta-barrel, a zinc finger, a vWA or trunk, an all-helical region, and a carboxy-terminal gelsolin domain. The members of this subgroup carry a partial MIDAS motif and have the overall Para-Rossmann type fold that is characteristic of this superfamily.


Pssm-ID: 238756 [Multi-domain]  Cd Length: 244  Bit Score: 373.53  E-value: 1.40e-121
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 971402013  524 PSPPAFIFMIDVSYNAVKSGLVRLICEELKSILDYLPREgnmeESAIRVGFVTYNKVLHFYNVKSSLAQPQMMVVSDVAD 603
Cdd:cd01479     1 PQPAVYVFLIDVSYNAIKSGLLATACEALLSNLDNLPGD----DPRTRVGFITFDSTLHFFNLKSSLEQPQMMVVSDLDD 76
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 971402013  604 MFVPLLDGFLVNVNESRTVITSLLDQIPEMFADTRETETVFGPVIQAGLEALKaaECAGKLFIFHTSLPIAEApGKLKNR 683
Cdd:cd01479    77 PFLPLPDGLLVNLKESRQVIEDLLDQIPEMFQDTKETESALGPALQAAFLLLK--ETGGKIIVFQSSLPTLGA-GKLKSR 153
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 971402013  684 DDKKLINTDKEKTLFQPQTSFYSNLAKDCVAQGCCVDLFLFPNQYLDVATLGVVTYQTGGSIYKYA--YFQLEADQDRFL 761
Cdd:cd01479   154 EDPKLLSTDKEKQLLQPQTDFYKKLALECVKSQISVDLFLFSNQYVDVATLGCLSRLTGGQVYYYPsfNFSAPNDVEKLV 233
                         250
                  ....*....|.
gi 971402013  762 NDLRRDVQKEV 772
Cdd:cd01479   234 NELARYLTRKI 244
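The per-hit summary line shown with each alignment (Pssm-ID, Cd Length, Bit Score, E-value) has a regular layout, so it can be read into a record for sorting or filtering hits. The following is a minimal Python sketch, assuming the line keeps exactly the field labels used on this page; the function name `parse_pssm_summary` is hypothetical and is not part of any NCBI tool.

```python
import re

# Pattern for the summary line as it appears on this page.
# Assumption: the field labels and their order do not change.
_SUMMARY = re.compile(
    r"Pssm-ID:\s*(?P<pssm_id>\d+)\s*"
    r"(?:\[(?P<kind>[^\]]+)\])?\s*"
    r"Cd Length:\s*(?P<cd_length>\d+)\s*"
    r"Bit Score:\s*(?P<bit_score>[\d.]+)\s*"
    r"E-value:\s*(?P<e_value>[\d.eE+-]+)"
)


def parse_pssm_summary(line):
    """Parse one 'Pssm-ID: ...' summary line into a dict (hypothetical helper)."""
    match = _SUMMARY.search(line)
    if match is None:
        raise ValueError(f"not a Pssm-ID summary line: {line!r}")
    return {
        "pssm_id": int(match["pssm_id"]),
        "kind": match["kind"],  # e.g. "Multi-domain"; None if the bracket is absent
        "cd_length": int(match["cd_length"]),
        "bit_score": float(match["bit_score"]),
        "e_value": float(match["e_value"]),
    }


record = parse_pssm_summary(
    "Pssm-ID: 238756 [Multi-domain]  Cd Length: 244  Bit Score: 373.53  E-value: 1.40e-121"
)
print(record)
```

Converting the E-value to a float keeps hits comparable numerically, which matters because sorting the raw strings would not order values such as 1.40e-121 and 8.86e-103 correctly.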
Sec23_trunk pfam04811
Sec23/Sec24 trunk domain; COPII-coated vesicles carry proteins from the endoplasmic reticulum ...
524-768 6.62e-115

Sec23/Sec24 trunk domain; COPII-coated vesicles carry proteins from the endoplasmic reticulum to the Golgi complex. This vesicular transport can be reconstituted by using three cytosolic components containing five proteins: the small GTPase Sar1p, the Sec23p/24p complex, and the Sec13p/Sec31p complex. This domain is known as the trunk domain and has an alpha/beta vWA fold and forms the dimer interface.


Pssm-ID: 398467 [Multi-domain]  Cd Length: 241  Bit Score: 355.79  E-value: 6.62e-115
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 971402013   524 PSPPAFIFMIDVSYNAVKSGLVRLICEELKSILDYLPREGNMeesaiRVGFVTYNKVLHFYNVKSSLAQPQMMVVSDVAD 603
Cdd:pfam04811    1 PQPPVFLFVIDVSYNAIKSGLLAALKESLLQSLDLLPGDPRA-----RVGFITFDSTVHFFNLGSSLRQPQMLVVSDLQD 75
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 971402013   604 MFVPLLDGFLVNVNESRTVITSLLDQIPEMFADTRETETVFGPVIQAGLEALKAAECAGKLFIFHTSLPIAEAPGKLKNR 683
Cdd:pfam04811   76 MFLPLPDRFLVPLSECRFVLEDLLEQLPPMFPVTKRPERCLGPALQAAFLLLKAAFTGGKIMVFQGGLPTVGPGGKLKSR 155
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 971402013   684 DDKKLINTDKEKTLFQPQT-SFYSNLAKDCVAQGCCVDLFLFPNQYLDVATLGVVTYQTGGSIYKYAYFQLEADQDRFLN 762
Cdd:pfam04811  156 LDESHHGTDKEKAKLVKKAdKFYKSLAKECVKQGHSVDLFAFSLDYVDVATLGQLSRLTGGQVYLYPSFQADVDGSKFKQ 235

                   ....*.
gi 971402013   763 DLRRDV 768
Cdd:pfam04811  236 DLQRYF 241
trunk_domain cd01468
trunk domain. COPII-coated vesicles carry proteins from the endoplasmic reticulum to the Golgi ...
524-766 8.86e-103

trunk domain. COPII-coated vesicles carry proteins from the endoplasmic reticulum to the Golgi complex. This vesicular transport can be reconstituted by using three cytosolic components containing five proteins: the small GTPase Sar1p, the Sec23p/24p complex, and the Sec13p/Sec31p complex. This domain is known as the trunk domain and has an alpha/beta vWA fold and forms the dimer interface. Some members of this family possess a partial MIDAS motif that is a characteristic feature of most vWA domain proteins.


Pssm-ID: 238745 [Multi-domain]  Cd Length: 239  Bit Score: 323.43  E-value: 8.86e-103
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 971402013  524 PSPPAFIFMIDVSYNAVKSGLVRLICEELKSILDYLPREGNMeesaiRVGFVTYNKVLHFYNVKSSLAQPQMMVVSDVAD 603
Cdd:cd01468     1 PQPPVFVFVIDVSYEAIKEGLLQALKESLLASLDLLPGDPRA-----RVGLITYDSTVHFYNLSSDLAQPKMYVVSDLKD 75
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 971402013  604 MFVPLLDGFLVNVNESRTVITSLLDQIPEMFAD--TRETETVFGPVIQAGLEALKAAECAGKLFIFHTSLPIAEaPGKLK 681
Cdd:cd01468    76 VFLPLPDRFLVPLSECKKVIHDLLEQLPPMFWPvpTHRPERCLGPALQAAFLLLKGTFAGGRIIVFQGGLPTVG-PGKLK 154
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 971402013  682 NRDDKKLINTDKEKTLFQPQTSFYSNLAKDCVAQGCCVDLFLFPNQYLDVATLGVVTYQTGGSIYKYAYFQLEADQDRFL 761
Cdd:cd01468   155 SREDKEPIRSHDEAQLLKPATKFYKSLAKECVKSGICVDLFAFSLDYVDVATLKQLAKSTGGQVYLYDSFQAPNDGSKFK 234

                  ....*
gi 971402013  762 NDLRR 766
Cdd:cd01468   235 QDLQR 239
PTZ00395 PTZ00395
Sec24-related protein; Provisional
522-1116 1.73e-49

Sec24-related protein; Provisional


Pssm-ID: 185594 [Multi-domain]  Cd Length: 1560  Bit Score: 192.60  E-value: 1.73e-49
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 971402013  522 KFPS-----PPAFIFMIDVSYNAVKSGLVRLICEELKSILDylpregNMEESAIRVGFVTYNKVLHFYNVKSSLAQP--- 593
Cdd:PTZ00395  943 KYPQvknmlPPYFVFVVECSYNAIYNNITYTILEGIRYAVQ------NVKCPQTKIAIITFNSSIYFYHCKGGKGVSgee 1016
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 971402013  594 ----------QMMVVSDVADMFVPL-LDGFLVNVNESRTVITSLLDQIPEMFADTRETETVFGPVIQAGLEALKAAECAG 662
Cdd:PTZ00395 1017 gdggggsgnhQVIVMSDVDDPFLPLpLEDLFFGCVEEIDKINTLIDTIKSVSTTMQSYGSCGNSALKIAMDMLKERNGLG 1096
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 971402013  663 KLFIFHTSLPIAeAPGKLKnrddkKLINTDKEKTLFQPQTSFYSNLAKDCVAQGCCVDLFLFP--NQYLDVATLGVVTYQ 740
Cdd:PTZ00395 1097 SICMFYTTTPNC-GIGAIK-----ELKKDLQENFLEVKQKIFYDSLLLDLYAFNISVDIFIISsnNVRVCVPSLQYVAQN 1170
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 971402013  741 TGGSIYKYAYFQLEAD-QDRFLNDLRRDVQKEVGFDAVMRVRTSTGIRATDFFGAFYMSNTT----DVEMAGLDCDKTIT 815
Cdd:PTZ00395 1171 TGGKILFVENFLWQKDyKEIYMNIMDTLTSEDIAYCCELKLRYSHHMSVKKLFCCNNNFNSIisvdTIKIPKIRHDQTFA 1250
                         330       340       350       360       370       380       390       400
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 971402013  816 VEFKHDDKLSEDSGALLQCALLYTSCAGQRRLRIHNLSLNCCTQLADLYRNCETDTLINYLAKYAYRGVLNSPVKSvrDS 895
Cdd:PTZ00395 1251 FLLNYSDISESKKQIYFQCACIYTNLWGDRFVRLHTTHMNLTSSLSTVFRYTDAEALMNILIKQLCTNILHNDNYS--KI 1328
                         410       420       430       440       450       460       470       480
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 971402013  896 LINQCAQILACYRKNCASPSSAGQLILPECMKLLPVYLNCVLKSDVLQpgPEVTTDDRAYIRQLVTSMDVAETNVFFYP- 974
Cdd:PTZ00395 1329 IIDNLAAILFSYRINCASSAHSGQLILPDTLKLLPLFTSSLLKHNVTK--KEILHDLKVYSLIKLLSMPIISSLLYVYPv 1406
                         490       500       510       520       530       540       550       560
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 971402013  975 --------RLLPLTKADVDSD-SLPAAIRNSEERLSKGDIYLLENGLNIFVWVGVNVQQGLIQNLFGVSSFSQISSTLSt 1045
Cdd:PTZ00395 1407 myvihikgKTNEIDSMDVDDDlFIPKTIPSSAEKIYSNGIYLLDACTHFYLYFGFHSDANFAKEIVGDIPTEKNAHELN- 1485
                         570       580       590       600       610       620       630
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*
gi 971402013 1046 lpVLENPFSKKVRSIIDML----HLqrSRYMKLIIVKQEDKLEMLFKHFLVEDKSlTGGASYVDFLCHMHKEIRQ 1116
Cdd:PTZ00395 1486 --LTDTPNAQKVQRIIKNLsrihHF--NKYVPLVMVAPKSNEEEHLISLCVEDKA-DKEYSYVNFLCFIHKLVHK 1555
Sec23_helical pfam04815
Sec23/Sec24 helical domain; COPII-coated vesicles carry proteins from the endoplasmic ...
870-968 3.23e-35

Sec23/Sec24 helical domain; COPII-coated vesicles carry proteins from the endoplasmic reticulum to the Golgi complex. This vesicular transport can be reconstituted by using three cytosolic components containing five proteins: the small GTPase Sar1p, the Sec23p/24p complex, and the Sec13p/Sec31p complex. This domain is composed of five alpha helices.


Pssm-ID: 461441 [Multi-domain]  Cd Length: 103  Bit Score: 129.54  E-value: 3.23e-35
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 971402013   870 DTLINYLAKYAYRGVLNSPVKSVRDSLINQCAQILACYRKNCASPSSAGQLILPECMKLLPVYLNCVLKSDVLQPGPEVT 949
Cdd:pfam04815    3 EAIAVLLAKKAVEKALSSSLSDAREALDNKLVDILAAYRKYCASSSSPGQLILPESLKLLPLYMLALLKSPALRGGNSSP 82
                           90
                   ....*....|....*....
gi 971402013   950 TDDRAYIRQLVTSMDVAET 968
Cdd:pfam04815   83 SDERAYARHLLLSLPVEEL 101
Sec23_BS pfam08033
Sec23/Sec24 beta-sandwich domain;
773-856 6.02e-30

Sec23/Sec24 beta-sandwich domain;


Pssm-ID: 429794 [Multi-domain]  Cd Length: 86  Bit Score: 113.79  E-value: 6.02e-30
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 971402013   773 GFDAVMRVRTSTGIRATDFFGAFYMSNTTD-VEMAGLDCDKTITVEFKHDDKLSEDSGALLQCALLYTSCAGQRRLRIHN 851
Cdd:pfam08033    1 GFNAVLRVRTSKGLKVSGFIGNFVSRSSGDtWKLPSLDPDTSYAFEFDIDEPLPNGSNAYIQFALLYTHSSGERRIRVTT 80

                   ....*
gi 971402013   852 LSLNC 856
Cdd:pfam08033   81 VALPV 85
PLN00162 PLN00162
transport protein sec23; Provisional
403-849 2.00e-20

transport protein sec23; Provisional


Pssm-ID: 215083 [Multi-domain]  Cd Length: 761  Bit Score: 97.32  E-value: 2.00e-20
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 971402013  403 SYNI-PCTSDMAKQSQVPLAAVIKPLATLPPEETLPYlvdhgesGPVRCNRCKAYMCPFMQFIEGGRRFQCCFCSCVTEV 481
Cdd:PLN00162   15 SWNVwPSSKIEASKCVIPLAALYTPLKPLPELPVLPY-------DPLRCRTCRAVLNPYCRVDFQAKIWICPFCFQRNHF 87
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 971402013  482 PAHYF----QHLDhtgkrvdfydrPELslgsYEFLATVDY---CKNNKFPSPPAFIFMIDVS-----YNAVKSglvrlic 549
Cdd:PLN00162   88 PPHYSsiseTNLP-----------AEL----FPQYTTVEYtlpPGSGGAPSPPVFVFVVDTCmieeeLGALKS------- 145
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 971402013  550 eELKSILDYLPregnmeESAiRVGFVTY----------------------------NKVLHFYNVKSSLAQPQMMVVSDV 601
Cdd:PLN00162  146 -ALLQAIALLP------ENA-LVGLITFgthvhvhelgfsecsksyvfrgnkevskDQILEQLGLGGKKRRPAGGGIAGA 217
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 971402013  602 ADMFVPL-LDGFLVNVNESRTVITSLLDQI-PEMF---ADTRETETVfGPVIQ--AGLEALKAAECAGKLFIFhTSLPIA 674
Cdd:PLN00162  218 RDGLSSSgVNRFLLPASECEFTLNSALEELqKDPWpvpPGHRPARCT-GAALSvaAGLLGACVPGTGARIMAF-VGGPCT 295
                         330       340       350       360       370       380       390       400
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 971402013  675 EAPGKLKNRDDKKLINTDKE-----KTLFQPQTSFYSNLAKDCVAQGCCVDLFLFPnqyLD---VATLGVVTYQTGGSIY 746
Cdd:PLN00162  296 EGPGAIVSKDLSEPIRSHKDldkdaAPYYKKAVKFYEGLAKQLVAQGHVLDVFACS---LDqvgVAEMKVAVERTGGLVV 372
                         410       420       430       440       450       460       470       480
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 971402013  747 KYAYFqleaDQDRFLNDLRRDVQKE------VGFDAVMRVRTSTGIRATDFFG---------------AFYMSNTTDVEM 805
Cdd:PLN00162  373 LAESF----GHSVFKDSLRRVFERDgegslgLSFNGTFEVNCSKDVKVQGAIGpcaslekkgpsvsdtEIGEGGTTAWKL 448
                         490       500       510       520
                  ....*....|....*....|....*....|....*....|....*....
gi 971402013  806 AGLDCDKTITVEFKHDDKLSEDSGA-----LLQCALLYTSCAGQRRLRI 849
Cdd:PLN00162  449 CGLDKKTSLAVFFEVANSGQSNPQPpgqqfFLQFLTRYQHSNGQTRLRV 497
zf-Sec23_Sec24 pfam04810
Sec23/Sec24 zinc finger; COPII-coated vesicles carry proteins from the endoplasmic reticulum ...
447-484 6.11e-16

Sec23/Sec24 zinc finger; COPII-coated vesicles carry proteins from the endoplasmic reticulum to the Golgi complex. This vesicular transport can be reconstituted by using three cytosolic components containing five proteins: the small GTPase Sar1p, the Sec23p/24p complex, and the Sec13p/Sec31p complex. This domain is a zinc-binding domain.


Pssm-ID: 461437 [Multi-domain]  Cd Length: 38  Bit Score: 72.48  E-value: 6.11e-16
                           10        20        30
                   ....*....|....*....|....*....|....*...
gi 971402013   447 PVRCNRCKAYMCPFMQFIEGGRRFQCCFCSCVTEVPAH 484
Cdd:pfam04810    1 PVRCRRCRAYLNPFCQFDFGGKKWTCNFCGTRNPVPPE 38
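Because each alignment block pairs a query row with a domain-model row of equal width, percent identity can be counted column by column. The sketch below does this for the zinc finger alignment above. It is an illustration only: the function name `alignment_identity` is hypothetical, and skipping lowercase residues assumes that lowercase marks unaligned positions in this page's display.

```python
def alignment_identity(query_row, subject_row):
    """Return (identical, aligned) column counts for two alignment rows."""
    if len(query_row) != len(subject_row):
        raise ValueError("alignment rows must have equal length")
    identical = 0
    aligned = 0
    for q, s in zip(query_row, subject_row):
        # Skip gap columns and, by assumption, lowercase (unaligned) residues.
        if q == "-" or s == "-" or q.islower() or s.islower():
            continue
        aligned += 1
        if q == s:
            identical += 1
    return identical, aligned


# Rows copied from the pfam04810 alignment (query residues 447-484).
query = "PVRCNRCKAYMCPFMQFIEGGRRFQCCFCSCVTEVPAH"
subject = "PVRCRRCRAYLNPFCQFDFGGKKWTCNFCGTRNPVPPE"

identical, aligned = alignment_identity(query, subject)
print(f"{identical}/{aligned} identical ({100 * identical / aligned:.1f}%)")
```

For alignments that span several blocks, the rows of each block would be concatenated before calling the function.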
SEC23 COG5047
Vesicle coat complex COPII, subunit SEC23 [Intracellular trafficking and secretion];
399-1024 2.90e-14

Vesicle coat complex COPII, subunit SEC23 [Intracellular trafficking and secretion];


Pssm-ID: 227380 [Multi-domain]  Cd Length: 755  Bit Score: 77.61  E-value: 2.90e-14
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 971402013  399 IRCTSYNIPCTSDMAKQSQVPLAAVIKPLATLPPEETLPYlvdhgesGPVRCNR-CKAYMCPFMQFIEGGRRFQCCFCSC 477
Cdd:COG5047    12 IRLTWNVFPATRGDATRTVIPIACLYTPLHEDDALTVNYY-------EPVKCTApCKAVLNPYCHIDERNQSWICPFCNQ 84
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 971402013  478 VTEVPAHYfqhLDHTGKRVDfydrPELSLGSyeflATVDYCKNNKFPSPPAFIFMIDVsynAVKSGLVRLICEELKSILD 557
Cdd:COG5047    85 RNTLPPQY---RDISNANLP----LELLPQS----STIEYTLSKPVILPPVFFFVVDA---CCDEEELTALKDSLIVSLS 150
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 971402013  558 YLPREgnmeesAIrVGFVTYNKVLHFYNVkSSLAQPQMMVVSDVADMFVPLLD--------------------------- 610
Cdd:COG5047   151 LLPPE------AL-VGLITYGTSIQVHEL-NAENHRRSYVFSGNKEYTKENLQellalskptksggfeskisgigqfass 222
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 971402013  611 GFLVNVNESRTVITSLLDQIPEmfaDTRETETVFGPVIQAGLeALKAA---------ECAGKLFIFhTSLPIAEAPGKLK 681
Cdd:COG5047   223 RFLLPTQQCEFKLLNILEQLQP---DPWPVPAGKRPLRCTGS-ALNIAsslleqcfpNAGCHIVLF-AGGPCTVGPGTVV 297
                         330       340       350       360       370       380       390       400
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 971402013  682 NRDDKK------LINTDKEKtLFQPQTSFYSNLAKDCVAQGCCVDLFLFPNQYLDVATLGVVTYQTGGSIYKYAYFQLEA 755
Cdd:COG5047   298 STELKEpmrshhDIESDSAQ-HSKKATKFYKGLAERVANQGHALDIFAGCLDQIGIMEMEPLTTSTGGALVLSDSFTTSI 376
                         410       420       430       440       450       460       470       480
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 971402013  756 DQDRFLNDLRRDVQK--EVGFDAVMRVRTSTGIRATDFFG---------------AFYMSNTTDVEMAGLDCDKTITVEF 818
Cdd:COG5047   377 FKQSFQRIFNRDSEGylKMGFNANMEVKTSKNLKIKGLIGhavsvkkkannisdsEIGIGATNSWKMASLSPKSNYALYF 456
                         490       500       510       520       530       540       550       560
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 971402013  819 K-----HDDKLSEDSGALLQCALLYTSCAGQRRLRIHNLSLNCCTQLADL-YRNCETDTLINYLAKYAyrgVLNSPVKSV 892
Cdd:COG5047   457 EialgaASGSAQRPAEAYIQFITTYQHSSGTYRIRVTTVARMFTDGGLPKiNRSFDQEAAAVFMARIA---AFKAETEDI 533
                         570       580       590       600       610       620       630       640
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 971402013  893 RDS-------LINQCaQILACYRKNcaSPSSAGqliLPECMKLLPVYLNCVLKSDVLQPGpEVTTDDRAYIRQLVTSMDV 965
Cdd:COG5047   534 IDVfrwidrnLIRLC-QKFADYRKD--DPSSFR---LDPNFTLYPQFMYHLRRSPFLSVF-NNSPDETAFYRHMLNNADV 606
                         650       660       670       680       690       700
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*
gi 971402013  966 AETNVFFYPRLLPLTKAD------VDSDSLPAAIrnseerlskgdIYLLENGLNIFVWVGVNVQQ 1024
Cdd:COG5047   607 NDSLIMIQPTLQSYSFEKggvpvlLDSVSVKPDV-----------ILLLDTFFHILIFHGSYIAQ 660
Atrophin-1 pfam03154
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ...
3-281 3.26e-13

Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA (OMIM:125370) is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1, which is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.


Pssm-ID: 460830 [Multi-domain]  Cd Length: 991  Bit Score: 74.42  E-value: 3.26e-13
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 971402013     3 VNQHTHAGPPygqPQPGYQGYQQPAYGGqPLPGVPHT-QYGAYNGPMPGYQQPVPP-----QGSVRALPTSGAPPPASGT 76
Cdd:pfam03154  249 LQPMTQPPPP---SQVSPQPLPQPSLHG-QMPPMPHSlQTGPSHMQHPVPPQPFPLtpqssQSQVPPGPSPAAPGQSQQR 324
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 971402013    77 SLPSGHQGYSQFGQGDVQNGIPTSTAPMQ--RPPASQPFLPGSA------PAPVSQPSTFQQYG--PPPCSVQQLSNHMa 146
Cdd:pfam03154  325 IHTPPSQSQLQSQQPPREQPLPPAPLSMPhiKPPPTTPIPQLPNpqshkhPPHLSGPSPFQMNSnlPPPPALKPLSSLS- 403
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 971402013   147 gmTIGSTSVSAPP----PAGLGYGPPTSVPPV---SGSFSATGSGLYTPYTASPGPpppsvpqglplAQPPFSGQPVPTQ 219
Cdd:pfam03154  404 --THHPPSAHPPPlqlmPQSQQLPPPPAQPPVltqSQSLPPPAASHPPTSGLHQVP-----------SQSPFPQHPFVPG 470
                          250       260       270       280       290       300
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*.
gi 971402013   220 RLPTEVPGFAPPPPA----TGIGASSYPPPTGAPRPPPMPGPPLSGQTVAGPPMSQPNHVSSPPPP 281
Cdd:pfam03154  471 GPPPITPPSGPPTSTssamPGIQPPSSASVSSSGPVPAAVSCPLPPVQIKEEALDEAEEPESPPPP 536
PHA03247 PHA03247
large tegument protein UL36; Provisional
46-380 3.15e-10

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 64.96  E-value: 3.15e-10
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 971402013   46 GPMPGYQQPVPPQGSvralPTSGAPPPASGTSLPSGHQGYSQFGQgdvqngiPTSTAPMQRPPASQPFLPGSAPAPVSQP 125
Cdd:PHA03247 2693 GSLTSLADPPPPPPT----PEPAPHALVSATPLPPGPAAARQASP-------ALPAAPAPPAVPAGPATPGGPARPARPP 2761
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 971402013  126 STfqqYGPPpcsvqqlsnhmagmtigstsvSAPPPAGLGYGPP--TSVPPVSGSFSATGSGLYTPYTASPGPPPPSVPQG 203
Cdd:PHA03247 2762 TT---AGPP---------------------APAPPAAPAAGPPrrLTRPAVASLSESRESLPSPWDPADPPAAVLAPAAA 2817
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 971402013  204 LPLAQPPFSGQPVPTQRLPTEVPGFAPPPPATGIGASSYPPPTGAPRPPPMPGPPLSGQTVAGPPMSQPNHVSSPPPPLT 283
Cdd:PHA03247 2818 LPPAASPAGPLPPPTSAQPTAPPPPPGPPPPSLPLGGSVAPGGDVRRRPPSRSPAAKPAAPARPPVRRLARPAVSRSTES 2897
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 971402013  284 LSGPHPGPPMSGPPPPTHPPQPGYQMQQNgsfgqvrgPQPNyggayPGTPNYGSQPGPPPKRLDPDSIPSPIQVIEDDRN 363
Cdd:PHA03247 2898 FALPPDQPERPPQPQAPPPPQPQPQPPPP--------PQPQ-----PPPPPPPRPQPPLAPTTDPAGAGEPSGAVPQPWL 2964
                         330
                  ....*....|....*..
gi 971402013  364 NRGSEPFVTGVRGQVPP 380
Cdd:PHA03247 2965 GALVPGRVAVPRFRVPQ 2981
Gelsolin pfam00626
Gelsolin repeat;
989-1061 3.60e-10

Gelsolin repeat;


Pssm-ID: 395501 [Multi-domain]  Cd Length: 76  Bit Score: 57.32  E-value: 3.60e-10
                           10        20        30        40        50        60        70
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*
gi 971402013   989 LPAAIRNSEERLSKGDIYLLENGLNIFVWVGVNvqQGLIQNLFGVSSFSQISST-LSTLPVLEN-PFSKKVRSII 1061
Cdd:pfam00626    4 LPPPVPLSQESLNSGDCYLLDNGFTIFLWVGKG--SSLLEKLFAALLAAQLDDDeRFPLPEVIRvPQGKEPARFL 76
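Each hit record lists its query interval and E-value on one line (for example `989-1061 3.60e-10`), and the interval is inclusive at both ends. As a small sketch under that assumption, the hypothetical helper `parse_interval_line` below splits such a line and reports how many query residues the hit spans.

```python
def parse_interval_line(line):
    """Split an 'START-END E-VALUE' line into (start, end, e_value)."""
    interval, e_value = line.split()
    start, end = (int(part) for part in interval.split("-"))
    if end < start:
        raise ValueError(f"interval end precedes start: {line!r}")
    return start, end, float(e_value)


def span_length(start, end):
    """Number of residues in an inclusive 1-based interval."""
    return end - start + 1


start, end, e_value = parse_interval_line("989-1061 3.60e-10")
print(start, end, e_value, span_length(start, end))
```

The span of the query can differ from the model's Cd Length, since gaps and partial matches mean the two sequences do not contribute the same number of residues to the alignment.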
PHA03247 PHA03247
large tegument protein UL36; Provisional
8-350 2.79e-08

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 58.41  E-value: 2.79e-08
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 971402013    8 HAGPPYGQPQPgyqgyqQPAYGGQPLPGVPHTQYGAYNGPMPGYQQPVPPQGSvrALPTSGAPPPASGTSlpsghqgysq 87
Cdd:PHA03247 2702 PPPPPTPEPAP------HALVSATPLPPGPAAARQASPALPAAPAPPAVPAGP--ATPGGPARPARPPTT---------- 2763
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 971402013   88 fgqgdvqNGIPTSTAPMQRPPASQPFL--PGSAPAPVSQPSTFQQYGPPPCSVQQLSNHMAGMTIGS-TSVSAPPPAGLG 164
Cdd:PHA03247 2764 -------AGPPAPAPPAAPAAGPPRRLtrPAVASLSESRESLPSPWDPADPPAAVLAPAAALPPAASpAGPLPPPTSAQP 2836
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 971402013  165 YGPPTSVPPVSGSFS-----ATGSGLYTPYTASPGPPPPSVPQGLP---LAQPPFSGQPVPtQRLPTEVPGFAPPPPAtg 236
Cdd:PHA03247 2837 TAPPPPPGPPPPSLPlggsvAPGGDVRRRPPSRSPAAKPAAPARPPvrrLARPAVSRSTES-FALPPDQPERPPQPQA-- 2913
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 971402013  237 igassyppptgapRPPPMPGPPLSGQTVAGPPMSQPNHVSSPPPPLTlsgphpgppmsgPPPPTHPPQPGYQMQQNGSFG 316
Cdd:PHA03247 2914 -------------PPPPQPQPQPPPPPQPQPPPPPPPRPQPPLAPTT------------DPAGAGEPSGAVPQPWLGALV 2968
                         330       340       350
                  ....*....|....*....|....*....|....
gi 971402013  317 QVRGPQPNYGGAYPGTPNYGSQPGPPPKRLDPDS 350
Cdd:PHA03247 2969 PGRVAVPRFRVPQPAPSREAPASSTPPLTGHSLS 3002
Atrophin-1 pfam03154
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ...
15-353 4.71e-08

Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA (OMIM:125370) is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1, which is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.


Pssm-ID: 460830 [Multi-domain]  Cd Length: 991  Bit Score: 57.47  E-value: 4.71e-08
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 971402013    15 QPQPGYQGYQQPAYGgQPLPGVPHTQYGAYNGPMPGyQQPVPPQGSVralPTSgaPPPASGTSLPSGHQGYSQFGQGDVQ 94
Cdd:pfam03154  168 QTQPPVLQAQSGAAS-PPSPPPPGTTQAATAGPTPS-APSVPPQGSP---ATS--QPPNQTQSTAAPHTLIQQTPTLHPQ 240
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 971402013    95 NgIPTSTAPMQrpPASQPFLPGS-APAPVSQPSTFQQYGPPPCSVQQLSNHMAgMTIGSTSVSAPPPAGLGYGPPTSVPP 173
Cdd:pfam03154  241 R-LPSPHPPLQ--PMTQPPPPSQvSPQPLPQPSLHGQMPPMPHSLQTGPSHMQ-HPVPPQPFPLTPQSSQSQVPPGPSPA 316
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 971402013   174 VSGSFSATgsgLYTPytaspgppppsVPQGLPLAQPPFSGQPVPTQrlPTEVPGFAPPP----PATGIGASSYPPPTGAP 249
Cdd:pfam03154  317 APGQSQQR---IHTP-----------PSQSQLQSQQPPREQPLPPA--PLSMPHIKPPPttpiPQLPNPQSHKHPPHLSG 380
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 971402013   250 RPPPMPGPPLSGQTVAGPPMSQPNH--VSSPPPPLTLSGPHPGPPMSGPPPPThppqpgyqMQQNGSFGQVRGPQPNYGG 327
Cdd:pfam03154  381 PSPFQMNSNLPPPPALKPLSSLSTHhpPSAHPPPLQLMPQSQQLPPPPAQPPV--------LTQSQSLPPPAASHPPTSG 452
                          330       340       350
                   ....*....|....*....|....*....|.
gi 971402013   328 AYPGT-----PNYGSQPGPPPKRLDPDSIPS 353
Cdd:pfam03154  453 LHQVPsqspfPQHPFVPGGPPPITPPSGPPT 483
PRK07764 PRK07764
DNA polymerase III subunits gamma and tau; Validated
9-234 9.61e-07

DNA polymerase III subunits gamma and tau; Validated


Pssm-ID: 236090 [Multi-domain]  Cd Length: 824  Bit Score: 53.07  E-value: 9.61e-07
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 971402013    9 AGPPYGQPQPGYQGYQQPAYGGQPLPGVPHTQYGAYNGPMPGYQQPVPPQGSVRALPTSGAPPPASGTSLPSGhqgysqf 88
Cdd:PRK07764  598 EGPPAPASSGPPEEAARPAAPAAPAAPAAPAPAGAAAAPAEASAAPAPGVAAPEHHPKHVAVPDASDGGDGWP------- 670
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 971402013   89 gqGDVQNGIPTSTAPMQRPPASqpflPGSAPAPVSQPSTFQQYGPPPCSVQQLSNHMAGmtiGSTSVSAPPPAGLGYGPP 168
Cdd:PRK07764  671 --AKAGGAAPAAPPPAPAPAAP----AAPAGAAPAQPAPAPAATPPAGQADDPAAQPPQ---AAQGASAPSPAADDPVPL 741
                         170       180       190       200       210       220
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*.
gi 971402013  169 TSVPPVSGSFSATGSGLYTPytaspgppppsvpQGLPLAQPPFSGQPVPTQRLPTEVPGFAPPPPA 234
Cdd:PRK07764  742 PPEPDDPPDPAGAPAQPPPP-------------PAPAPAAAPAAAPPPSPPSEEEEMAEDDAPSMD 794
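Each hit in this list carries a statistics line of the form "Pssm-ID: ... Cd Length: ... Bit Score: ... E-value: ...". The following is a minimal sketch of how such a line could be read into typed fields. The names `HitStats` and `parse_hit_stats` are hypothetical helpers introduced here for illustration and are not part of any NCBI tool; the pattern assumes the field order shown on this page.

```python
import re
from typing import NamedTuple


class HitStats(NamedTuple):
    """Fields of one 'Pssm-ID: ...' statistics line (hypothetical container)."""
    pssm_id: int
    multi_domain: bool
    cd_length: int
    bit_score: float
    e_value: float


# Assumes the field order used on this page; the "[Multi-domain]" tag is optional.
_STATS_RE = re.compile(
    r"Pssm-ID:\s*(\d+)\s*(\[Multi-domain\])?\s*"
    r"Cd Length:\s*(\d+)\s*"
    r"Bit Score:\s*([\d.]+)\s*"
    r"E-value:\s*([\d.eE+-]+)"
)


def parse_hit_stats(line: str) -> HitStats:
    """Parse one statistics line; raise ValueError if it does not match."""
    m = _STATS_RE.search(line)
    if m is None:
        raise ValueError(f"not a hit statistics line: {line!r}")
    return HitStats(
        pssm_id=int(m.group(1)),
        multi_domain=m.group(2) is not None,
        cd_length=int(m.group(3)),
        bit_score=float(m.group(4)),
        e_value=float(m.group(5)),
    )


if __name__ == "__main__":
    line = ("Pssm-ID: 236090 [Multi-domain]  Cd Length: 824  "
            "Bit Score: 53.07  E-value: 9.61e-07")
    print(parse_hit_stats(line))
```

Parsed this way, the hits can be sorted or filtered by E-value without re-reading the alignment blocks.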
Pro-rich pfam15240
Proline-rich protein; This family includes several eukaryotic proline-rich proteins.
5-135 5.00e-06

Proline-rich protein; This family includes several eukaryotic proline-rich proteins.


Pssm-ID: 464580 [Multi-domain]  Cd Length: 167  Bit Score: 48.11  E-value: 5.00e-06
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 971402013     5 QHTHAGPPYGQPQPGyqGYQQ-PAYGGQPLPGVPHTQYGayNGPMPGYQQPVPPQGSvralPTSGAPPPasgtslPSGHQ 83
Cdd:pfam15240   57 QPPASDDPPGPPPPG--GPQQpPPQGGKQKPQGPPPQGG--PRPPPGKPQGPPPQGG----NQQQGPPP------PGKPQ 122
                           90       100       110       120       130
                   ....*....|....*....|....*....|....*....|....*....|..
gi 971402013    84 GYSQFGQGDvqngiPTSTAPMQRPPASQPFLPGSAPAPVSQPStfQQYGPPP 135
Cdd:pfam15240  123 GPPPQGGGP-----PPQGGNQQGPPPPPPGNPQGPPQRPPQPG--NPQGPPQ 167
PHA03247 PHA03247
large tegument protein UL36; Provisional
5-137 1.50e-05

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 49.55  E-value: 1.50e-05
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 971402013    5 QHTHAGPPYGQPQPGYQGYQQPAYGGQPLPGVPHTQYGAY------------NGPMPGYQQPVPPQGSVRALPTSG---- 68
Cdd:PHA03247 2928 QPQPPPPPPPRPQPPLAPTTDPAGAGEPSGAVPQPWLGALvpgrvavprfrvPQPAPSREAPASSTPPLTGHSLSRvssw 3007
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 971402013   69 ----------APPPAS---------------GTSLPSGHQGYSQFGQGDVQNGIPTStapmqrPPASQPFLPgsAPAPVS 123
Cdd:PHA03247 3008 asslalheetDPPPVSlkqtlwppddtedsdADSLFDSDSERSDLEALDPLPPEPHD------PFAHEPDPA--TPEAGA 3079
                         170
                  ....*....|....
gi 971402013  124 QPSTFQQYGPPPCS 137
Cdd:PHA03247 3080 RESPSSQFGPPPLS 3093
PAT1 pfam09770
Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate ...
8-143 1.57e-05

Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate chromosome transmission during cell division.


Pssm-ID: 401645 [Multi-domain]  Cd Length: 846  Bit Score: 49.26  E-value: 1.57e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 971402013     8 HAGPPYGQPQPGYQGYQQPAyggqplpgvPHTQYGAYNGPMPGYQQPVPPQG-SVRAL--PTSGAPPPASGTSLPSGHQG 84
Cdd:pfam09770  218 PAQPPAAPPAQQAQQQQQFP---------PQIQQQQQPQQQPQQPQQHPGQGhPVTILqrPQSPQPDPAQPSIQPQAQQF 288
                           90       100       110       120       130       140
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....
gi 971402013    85 YSQFGQGDVQngiPT------STAPMQRPPASQPFLPGSAPAP--VSQPSTFQQYGPPPCSV--QQLSN 143
Cdd:pfam09770  289 HQQPPPVPVQ---PTqilqnpNRLSAARVGYPQNPQPGVQPAPahQAHRQQGSFGRQAPIIThpQQLAQ 354
PHA03377 PHA03377
EBNA-3C; Provisional
9-173 1.84e-05

EBNA-3C; Provisional


Pssm-ID: 177614 [Multi-domain]  Cd Length: 1000  Bit Score: 48.90  E-value: 1.84e-05
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 971402013    9 AGPPygQPQPGYQGYQQPayggqplpgvPHTQygaynGPMPGYQQPVPPQG--------SVRALPTSGAPPPASGTSLPS 80
Cdd:PHA03377  739 APPP--SHQAPYSGHEEP----------QAQQ-----APYPGYWEPRPPQApylgyqepQAQGVQVSSYPGYAGPWGLRA 801
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 971402013   81 GHQGYSQ---FGQGDVQNGiPTSTAPMQRPPASQPFLPGSAPAPVSQPSTFQ----QYGPPP---CSVQQLSNHMAGMTI 150
Cdd:PHA03377  802 QHPRYRHswaYWSQYPGHG-HPQGPWAPRPPHLPPQWDGSAGHGQDQVSQFPhlqsETGPPRlqlSQVPQLPYSQTLVSS 880
                         170       180
                  ....*....|....*....|...
gi 971402013  151 GSTSVSAPPPAGLGYGPPTSVPP 173
Cdd:PHA03377  881 SAPSWSSPQPRAPIRPIPTRFPP 903
PRK10263 PRK10263
DNA translocase FtsK; Provisional
15-160 3.42e-05

DNA translocase FtsK; Provisional


Pssm-ID: 236669 [Multi-domain]  Cd Length: 1355  Bit Score: 48.16  E-value: 3.42e-05
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 971402013   15 QPQPGYQGyQQPAYGGQPLPGVPHTQYG----AYNGPmpgYQQPVPPQGSVRALPTSGAPPPASGTSLPSGHQGYSQFGQ 90
Cdd:PRK10263  361 QPVPGPQT-GEPVIAPAPEGYPQQSQYAqpavQYNEP---LQQPVQPQQPYYAPAAEQPAQQPYYAPAPEQPAQQPYYAP 436
                          90       100       110       120       130       140       150
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 971402013   91 GDVQngiPTSTAPMQRPPASQPFlpgsAPAPVSQPstFQQYGPPpcsVQQLSNHMAGMTIGSTSVSAPPP 160
Cdd:PRK10263  437 APEQ---PVAGNAWQAEEQQSTF----APQSTYQT--EQTYQQP---AAQEPLYQQPQPVEQQPVVEPEP 494
Pro-rich pfam15240
Proline-rich protein; This family includes several eukaryotic proline-rich proteins.
12-140 7.21e-05

Proline-rich protein; This family includes several eukaryotic proline-rich proteins.


Pssm-ID: 464580 [Multi-domain]  Cd Length: 167  Bit Score: 44.64  E-value: 7.21e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 971402013    12 PYGQPQPGYQGYQQPAYGGQPLPgvphtqygaynGPMPGYQQPVPPQGSVRALPTSGAPPPASGTSLPSGHQGYSQFGQG 91
Cdd:pfam15240   35 EEGQSQQGGQGPQGPPPGGFPPQ-----------PPASDDPPGPPPPGGPQQPPPQGGKQKPQGPPPQGGPRPPPGKPQG 103
                           90       100       110       120
                   ....*....|....*....|....*....|....*....|....*....
gi 971402013    92 DVQNGIPTSTAPMQRPPASQPFLPGSAPAPvsQPSTFQQYGPPPCSVQQ 140
Cdd:pfam15240  104 PPPQGGNQQQGPPPPGKPQGPPPQGGGPPP--QGGNQQGPPPPPPGNPQ 150
PRK07764 PRK07764
DNA polymerase III subunits gamma and tau; Validated
7-234 7.34e-05

DNA polymerase III subunits gamma and tau; Validated


Pssm-ID: 236090 [Multi-domain]  Cd Length: 824  Bit Score: 46.90  E-value: 7.34e-05
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 971402013    7 THAGPPYGQPQPGYQGYQQPAYGGQPlpgvphTQYGAYNGPMPGYQQPVPPQGSVRALPTSGAPPPASGTSLPSGHQGYS 86
Cdd:PRK07764  590 PAPGAAGGEGPPAPASSGPPEEAARP------AAPAAPAAPAAPAPAGAAAAPAEASAAPAPGVAAPEHHPKHVAVPDAS 663
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 971402013   87 QfgQGDVQNGIPTSTAPMQRPPASQPFLP-GSAPAPVSQPSTFQQYGPPPCSVQQLSNHMAGmtiGSTSVSAPPPAGLGY 165
Cdd:PRK07764  664 D--GGDGWPAKAGGAAPAAPPPAPAPAAPaAPAGAAPAQPAPAPAATPPAGQADDPAAQPPQ---AAQGASAPSPAADDP 738
                         170       180       190       200       210       220
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....
gi 971402013  166 GPPTSVPPvsgsfsatgsglytpytaspgppppsvPQGLPLAQPPFSGQPVPTQRLPTEVPGFAPPPPA 234
Cdd:PRK07764  739 VPLPPEPD---------------------------DPPDPAGAPAQPPPPPAPAPAAAPAAAPPPSPPS 780
Atrophin-1 pfam03154
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ...
34-279 7.82e-05

Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA (OMIM:125370) is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1 that is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.


Pssm-ID: 460830 [Multi-domain]  Cd Length: 991  Bit Score: 47.07  E-value: 7.82e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 971402013    34 PGVPHTQYGAYNGPMPGYQQPVPPQGSVRALPTSGAPPPASGTSLPSghqgysqfgqgdvqngiptsTAPMQRPPASQPF 113
Cdd:pfam03154  146 PSIPSPQDNESDSDSSAQQQILQTQPPVLQAQSGAASPPSPPPPGTT--------------------QAATAGPTPSAPS 205
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 971402013   114 LPGSAPAPVSQPSTFQQYGPPPCSVQQlsnhmAGMTIGSTSVSAPPPAGLGYGPPTsvPPVSGSFSATGS-GLYTPytas 192
Cdd:pfam03154  206 VPPQGSPATSQPPNQTQSTAAPHTLIQ-----QTPTLHPQRLPSPHPPLQPMTQPP--PPSQVSPQPLPQpSLHGQ---- 274
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 971402013   193 pgppppSVPQGLPL-AQPPFSGQPVPTQRLPTEVP---GFAPPPPATGI-GASSYPPPTGAPRPPPMPGPPLSGQTVAGP 267
Cdd:pfam03154  275 ------MPPMPHSLqTGPSHMQHPVPPQPFPLTPQssqSQVPPGPSPAApGQSQQRIHTPPSQSQLQSQQPPREQPLPPA 348
                          250
                   ....*....|..
gi 971402013   268 PMSQPnHVSSPP 279
Cdd:pfam03154  349 PLSMP-HIKPPP 359
SSDP pfam04503
Single-stranded DNA binding protein, SSDP; This is a family of eukaryotic single-stranded DNA ...
46-176 1.19e-04

Single-stranded DNA binding protein, SSDP; This is a family of eukaryotic single-stranded DNA binding proteins with specificity to a pyrimidine-rich element found in the promoter region of the alpha2(I) collagen gene.


Pssm-ID: 461334 [Multi-domain]  Cd Length: 293  Bit Score: 45.33  E-value: 1.19e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 971402013    46 GPMPGYQQPVPPqGSVRALPTSGAPPPASGTSLPSGHQGYSQFGQGDVQNGIPTSTAPMQR-----PPASQPFLPGSApA 120
Cdd:pfam04503   37 GPMPPGFFQSPP-SHPSSQPSPHAQPPPHNPATMMGPHSQPFMGPRYPGGPRPSVRMPQQGndfngPPGQQPMMPNSM-D 114
                           90       100       110       120       130       140
                   ....*....|....*....|....*....|....*....|....*....|....*....|.
gi 971402013   121 PVSQPSTFQQYGPPPCSVQQLSNHMAGMTIGSTSVS--APPP---AGLGYGPPTSVPPVSG 176
Cdd:pfam04503  115 PTRPGGHPNMGGPMQRMNPPRGPGMGPMGPQSYGPGmrGPPPnstDGPGGMPPMNMGPGGR 175
PHA03378 PHA03378
EBNA-3B; Provisional
5-173 1.32e-04

EBNA-3B; Provisional


Pssm-ID: 223065 [Multi-domain]  Cd Length: 991  Bit Score: 46.21  E-value: 1.32e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 971402013    5 QHTHAGPPYGQPQPGYQGYQQPAyGGQPLPGVPHTQYGAYNGPMPGYQQPVPPQGSVRALPTSGAPPPASGTSL------ 78
Cdd:PHA03378  735 RPPAAAPGRARPPAAAPGRARPP-AAAPGRARPPAAAPGAPTPQPPPQAPPAPQQRPRGAPTPQPPPQAGPTSMqlmpra 813
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 971402013   79 PSGHQGYS-----QFGQGDVQNGIPTSTAPM----QRPPASQPFLPGSAPAPVSQPSTFQqygPPPCSVQQLSNHMAGMT 149
Cdd:PHA03378  814 APGQQGPTkqilrQLLTGGVKRGRPSLKKPAalerQAAAGPTPSPGSGTSDKIVQAPVFY---PPVLQPIQVMRQLGSVR 890
                         170       180       190
                  ....*....|....*....|....*....|.
gi 971402013  150 IGSTSVSAPPPA-------GLGYGPPTSVPP 173
Cdd:PHA03378  891 AAAASTVTQAPTeytgerrGVGPMHPTDIPP 921
PABP-1234 TIGR01628
polyadenylate binding protein, human types 1, 2, 3, 4 family; These eukaryotic proteins ...
2-148 1.44e-04

polyadenylate binding protein, human types 1, 2, 3, 4 family; These eukaryotic proteins recognize the poly-A of mRNA and consist of four tandem RNA recognition domains at the N-terminus (rrm: pfam00076) followed by a PABP-specific domain (pfam00658) at the C-terminus. The protein is involved in the transport of mRNAs from the nucleus to the cytoplasm. There are four paralogs in Homo sapiens: one expressed in testis, one expressed in platelets, one broadly expressed, and one of unknown tissue range.


Pssm-ID: 130689 [Multi-domain]  Cd Length: 562  Bit Score: 45.95  E-value: 1.44e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 971402013     2 SVNQHTHAGPPYGQPQPGYQgyqqpaYGGQPLpGVPHTQygayngPMPGYQQPVPPQGsvralPTSGAPPPASGTSlpsg 81
Cdd:TIGR01628  389 SPMGGAMGQPPYYGQGPQQQ------FNGQPL-GWPRMS------MMPTPMGPGGPLR-----PNGLAPMNAVRAP---- 446
                           90       100       110       120       130       140
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*..
gi 971402013    82 hqgysqfGQGDVQNGIPTSTAPMQRPPASQPfLPGSAPAPVSQPSTFQQyGPPPCSVQQLSNHMAGM 148
Cdd:TIGR01628  447 -------SRNAQNAAQKPPMQPVMYPPNYQS-LPLSQDLPQPQSTASQG-GQNKKLAQVLASATPQM 504
PRK12323 PRK12323
DNA polymerase III subunit gamma/tau;
40-236 2.00e-04

DNA polymerase III subunit gamma/tau;


Pssm-ID: 237057 [Multi-domain]  Cd Length: 700  Bit Score: 45.64  E-value: 2.00e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 971402013   40 QYGAYNGPMPGYQQPVPPQGSVRALPTSGAPPPASGTSLPSGHQGYSQFGQGDVQNGIPTSTAPMQRPPASQPFLPG--- 116
Cdd:PRK12323  367 QSGGGAGPATAAAAPVAQPAPAAAAPAAAAPAPAAPPAAPAAAPAAAAAARAVAAAPARRSPAPEALAAARQASARGpgg 446
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 971402013  117 -SAPAPVSQPSTFQQYGPPPCSVQQLSNHMAGMTIGSTSVSAPPPAGLGYGPPTSVPPVSGSFSATGSGLYTPYTASPGP 195
Cdd:PRK12323  447 aPAPAPAPAAAPAAAARPAAAGPRPVAAAAAAAPARAAPAAAPAPADDDPPPWEELPPEFASPAPAQPDAAPAGWVAESI 526
                         170       180       190       200
                  ....*....|....*....|....*....|....*....|....
gi 971402013  196 PPPSVPQ---GLPLAQPPFSGQPVPTQRLPTEvPGFAPPPPATG 236
Cdd:PRK12323  527 PDPATADpddAFETLAPAPAAAPAPRAAAATE-PVVAPRPPRAS 569
PHA02682 PHA02682
ORF080 virion core protein; Provisional
24-135 4.74e-04

ORF080 virion core protein; Provisional


Pssm-ID: 177464 [Multi-domain]  Cd Length: 280  Bit Score: 43.31  E-value: 4.74e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 971402013   24 QQPAyGGQPLPGVPHTqygayngPMPGYQQPVPPQGSVRALPTSGAP----PPASGTSLPSGHQGYSQfgqGDVQNGIPT 99
Cdd:PHA02682   74 QRPS-GQSPLAPSPAC-------AAPAPACPACAPAAPAPAVTCPAPapacPPATAPTCPPPAVCPAP---ARPAPACPP 142
                          90       100       110
                  ....*....|....*....|....*....|....*.
gi 971402013  100 STApmQRPPAsqPFLPGSAPAPVSQPSTFQQYGPPP 135
Cdd:PHA02682  143 STR--QCPPA--PPLPTPKPAPAAKPIFLHNQLPPP 174
PHA03378 PHA03378
EBNA-3B; Provisional
47-283 5.59e-04

EBNA-3B; Provisional


Pssm-ID: 223065 [Multi-domain]  Cd Length: 991  Bit Score: 44.29  E-value: 5.59e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 971402013   47 PMPGYQQ-PVPPQGSVRALPTSGAPPPASGTSLPSGHqGYSQFGQGDVQNGIPTSTAPMQRPPASQP------------F 113
Cdd:PHA03378  565 PAPGLGPlQIQPLTSPTTSQLASSAPSYAQTPWPVPH-PSQTPEPPTTQSHIPETSAPRQWPMPLRPipmrplrmqpitF 643
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 971402013  114 LPGSAPAPVSQP--------STFQQYGPPPCSVQQlSNHMAGMTIGSTSVSA-PPPAGLGYGPPTSVPPVSGSFSATGSG 184
Cdd:PHA03378  644 NVLVFPTPHQPPqveitpykPTWTQIGHIPYQPSP-TGANTMLPIQWAPGTMqPPPRAPTPMRPPAAPPGRAQRPAAATG 722
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 971402013  185 LYTPYTASPGPPPPSvpQGLPLAQPPFSGQPVPTQRlPTEVPGFAPPPPATGiGASSypPPTGAPRPPPMPGPPLSGQTV 264
Cdd:PHA03378  723 RARPPAAAPGRARPP--AAAPGRARPPAAAPGRARP-PAAAPGRARPPAAAP-GAPT--PQPPPQAPPAPQQRPRGAPTP 796
                         250
                  ....*....|....*....
gi 971402013  265 AGPPMSQPNHVSSPPPPLT 283
Cdd:PHA03378  797 QPPPQAGPTSMQLMPRAAP 815
PHA03247 PHA03247
large tegument protein UL36; Provisional
46-285 5.63e-04

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 44.54  E-value: 5.63e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 971402013   46 GPMPGYQQPVPPQGSVRALPTSGAPPPASGTSLPSGH-------QGYSQFGQGDVQNGIPTSTAPMQRPPAS---QPFLP 115
Cdd:PHA03247 2550 DPPPPLPPAAPPAAPDRSVPPPRPAPRPSEPAVTSRArrpdappQSARPRAPVDDRGDPRGPAPPSPLPPDThapDPPPP 2629
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 971402013  116 GSAPAPVSQPSTFQQYGPPPC------SVQQLSNHMAGMTIGSTSVSAPPPAGlgYGPPTSVPPVsGSFSATGSGLYTPY 189
Cdd:PHA03247 2630 SPSPAANEPDPHPPPTVPPPErprddpAPGRVSRPRRARRLGRAAQASSPPQR--PRRRAARPTV-GSLTSLADPPPPPP 2706
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 971402013  190 TASPGPPPPSVPQGLPLAqPPFSGQPVPTqrlPTEVPgfAPPPPATGIGASSyppptgaprpppMPGPPLSGQTVAGPPM 269
Cdd:PHA03247 2707 TPEPAPHALVSATPLPPG-PAAARQASPA---LPAAP--APPAVPAGPATPG------------GPARPARPPTTAGPPA 2768
                         250
                  ....*....|....*.
gi 971402013  270 SQPNHVSSPPPPLTLS 285
Cdd:PHA03247 2769 PAPPAAPAAGPPRRLT 2784
PHA03247 PHA03247
large tegument protein UL36; Provisional
18-354 6.08e-04

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 44.16  E-value: 6.08e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 971402013   18 PGYQGYQQPAYGGQPlpgvphtqYGAYNGPMPGYQQPVPPQgsvrALPTSGAPPPASGTSLPSGH----------QGYSQ 87
Cdd:PHA03247 2475 PGAPVYRRPAEARFP--------FAAGAAPDPGGGGPPDPD----APPAPSRLAPAILPDEPVGEpvhprmltwiRGLEE 2542
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 971402013   88 FGQGDVqNGIPTSTAPMQRPPASQPFLPGSAPAP-VSQPSTFQQYGPPPCSVQQLSnhmaGMTIGSTSVSAPPPAglgyg 166
Cdd:PHA03247 2543 LASDDA-GDPPPPLPPAAPPAAPDRSVPPPRPAPrPSEPAVTSRARRPDAPPQSAR----PRAPVDDRGDPRGPA----- 2612
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 971402013  167 PPTSVPPVSGsfsatgsglytpytaspgppppsvpqglPLAQPPFSGQPVPTQrLPTEVPGFAPPPPATGIGASsypppt 246
Cdd:PHA03247 2613 PPSPLPPDTH----------------------------APDPPPPSPSPAANE-PDPHPPPTVPPPERPRDDPA------ 2657
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 971402013  247 GAPRPPPMPGPPLSGQTVAGPPMSQPNHVSSPPP--PLTlSGPHPGPPMSGPPPPTHPPQPGYQMQQNGSFGQVRGPQPN 324
Cdd:PHA03247 2658 PGRVSRPRRARRLGRAAQASSPPQRPRRRAARPTvgSLT-SLADPPPPPPTPEPAPHALVSATPLPPGPAAARQASPALP 2736
                         330       340       350
                  ....*....|....*....|....*....|...
gi 971402013  325 YGGAYPGTPNYGSQPG---PPPKRLDPDSIPSP 354
Cdd:PHA03247 2737 AAPAPPAVPAGPATPGgpaRPARPPTTAGPPAP 2769
Atrophin-1 pfam03154
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ...
11-138 6.51e-04

Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA (OMIM:125370) is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1 that is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.


Pssm-ID: 460830 [Multi-domain]  Cd Length: 991  Bit Score: 43.99  E-value: 6.51e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 971402013    11 PPYGQPQPGY---QGYQQPAYGGQPlPGVPHTQY---GAYNGPMPGYQQPVPPQ-----------GSVRALPTSGaPPPA 73
Cdd:pfam03154  407 PPSAHPPPLQlmpQSQQLPPPPAQP-PVLTQSQSlppPAASHPPTSGLHQVPSQspfpqhpfvpgGPPPITPPSG-PPTS 484
                           90       100       110       120       130       140
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....
gi 971402013    74 SGTSLPSGHQ--GYSQFGQGDVQNGIPTSTAPMQ--RPPASQPFLPGSAPAPVSQPStfqqygPPPCSV 138
Cdd:pfam03154  485 TSSAMPGIQPpsSASVSSSGPVPAAVSCPLPPVQikEEALDEAEEPESPPPPPRSPS------PEPTVV 547
PRK07764 PRK07764
DNA polymerase III subunits gamma and tau; Validated
46-235 2.00e-03

DNA polymerase III subunits gamma and tau; Validated


Pssm-ID: 236090 [Multi-domain]  Cd Length: 824  Bit Score: 42.28  E-value: 2.00e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 971402013   46 GPMPGYQQPVPPQGSVRALPTSGAPPPASGTSLPSGHQGYSQFGQGDVQNGIPTSTAPMQRPPASQPFLPGSAPAPVSQP 125
Cdd:PRK07764  599 GPPAPASSGPPEEAARPAAPAAPAAPAAPAPAGAAAAPAEASAAPAPGVAAPEHHPKHVAVPDASDGGDGWPAKAGGAAP 678
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 971402013  126 StfqQYGPPPCSVQQLSNHMAGMTIGSTSVSAPPPAGLGYGPPTSVPPVSGSFSATGsglytPYTASPGPPPPSVpqGLP 205
Cdd:PRK07764  679 A---APPPAPAPAAPAAPAGAAPAQPAPAPAATPPAGQADDPAAQPPQAAQGASAPS-----PAADDPVPLPPEP--DDP 748
                         170       180       190
                  ....*....|....*....|....*....|
gi 971402013  206 LAQPPFSGQPVPTQRLPTEVPGFAPPPPAT 235
Cdd:PRK07764  749 PDPAGAPAQPPPPPAPAPAAAPAAAPPPSP 778
PHA03378 PHA03378
EBNA-3B; Provisional
12-356 3.54e-03

EBNA-3B; Provisional


Pssm-ID: 223065 [Multi-domain]  Cd Length: 991  Bit Score: 41.59  E-value: 3.54e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 971402013   12 PYGQPQPGYQG---YQQPAYGGQPLPGVPHTQYGAYNGPMPGYQQP------------VPPQGSVRALPTSGAPPPASGT 76
Cdd:PHA03378  457 PPTQPLEGPTGplsVQAPLEPWQPLPHPQVTPVILHQPPAQGVQAHgsmldllekddeDMEQRVMATLLPPSPPQPRAGR 536
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 971402013   77 SLPSGHQgysqfGQGDVQNGIPTSTAPMQRPPASQPFLPGSAPAPVSQPSTFQ------QYGPPPCSVQQLSNHMAGMTI 150
Cdd:PHA03378  537 RAPCVYT-----EDLDIESDEPASTEPVHDQLLPAPGLGPLQIQPLTSPTTSQlassapSYAQTPWPVPHPSQTPEPPTT 611
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 971402013  151 GS--TSVSAPPPAGLGYGPPTSVPPVSGSFSATGSGLYTPYTASPGPPPPSVPQGLPLAQPPFsgQPVP----TQRLPTE 224
Cdd:PHA03378  612 QShiPETSAPRQWPMPLRPIPMRPLRMQPITFNVLVFPTPHQPPQVEITPYKPTWTQIGHIPY--QPSPtganTMLPIQW 689
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 971402013  225 VPGFAPPPPATgigassyppptgaprpppmpGPPLSGQTVAGPPMSQPNHVSSP-PPPLTLSGPHPGPPMSGPPPPTHPP 303
Cdd:PHA03378  690 APGTMQPPPRA--------------------PTPMRPPAAPPGRAQRPAAATGRaRPPAAAPGRARPPAAAPGRARPPAA 749
                         330       340       350       360       370
                  ....*....|....*....|....*....|....*....|....*....|....
gi 971402013  304 QPGYQMQQNGSFGQVRGPQpnyggAYPGTPNYGSQP-GPPPKRLDPDSIPSPIQ 356
Cdd:PHA03378  750 APGRARPPAAAPGRARPPA-----AAPGAPTPQPPPqAPPAPQQRPRGAPTPQP 798
dnaA PRK14086
chromosomal replication initiator protein DnaA;
9-162 3.66e-03

chromosomal replication initiator protein DnaA;


Pssm-ID: 237605 [Multi-domain]  Cd Length: 617  Bit Score: 41.35  E-value: 3.66e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 971402013    9 AGPPYGQP--QPGYQGYQQPAYGGQPLPGVPHTQ--------------------YGAYN-----GPMPGYQQPVPPQgSV 61
Cdd:PRK14086   92 AGEPAPPPphARRTSEPELPRPGRRPYEGYGGPRaddrppglprqdqlptarpaYPAYQqrpepGAWPRAADDYGWQ-QQ 170
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 971402013   62 RALPTSGAPPPASGTSLPSGHQGYSQFGQGDVQNGIPTSTAPMQRPPASQPF-----LPGSAPAPVSQPSTFQQYGPPPC 136
Cdd:PRK14086  171 RLGFPPRAPYASPASYAPEQERDREPYDAGRPEYDQRRRDYDHPRPDWDRPRrdrtdRPEPPPGAGHVHRGGPGPPERDD 250
                         170       180
                  ....*....|....*....|....*.
gi 971402013  137 SVQQLSNHMAGMTIGSTSVSAPPPAG 162
Cdd:PRK14086  251 APVVPIRPSAPGPLAAQPAPAPGPGE 276
Glutenin_hmw pfam03157
High molecular weight glutenin subunit; Members of this family include high molecular weight ...
14-215 3.84e-03

High molecular weight glutenin subunit; Members of this family include high molecular weight subunits of glutenin. This group of gluten proteins is thought to be largely responsible for the elastic properties of gluten, and hence, doughs. Indeed, glutenin high molecular weight subunits are classified as elastomeric proteins, because the glutenin network can withstand significant deformations without breaking, and return to the original conformation when the stress is removed. Elastomeric proteins differ considerably in amino acid sequence, but they are all polymers whose subunits consist of elastomeric domains, composed of repeated motifs, and non-elastic domains that mediate cross-linking between the subunits. The elastomeric domain motifs are all rich in glycine residues in addition to other hydrophobic residues. High molecular weight glutenin subunits have an extensive central elastomeric domain, flanked by two terminal non-elastic domains that form disulphide cross-links. The central elastomeric domain is characterized by the following three repeated motifs: PGQGQQ, GYYPTS[P/L]QQ, GQQ. It possesses overlapping beta-turns within and between the repeated motifs, and assumes a regular helical secondary structure with a diameter of approx. 1.9 nm and a pitch of approx. 1.5 nm.


Pssm-ID: 367362 [Multi-domain]  Cd Length: 786  Bit Score: 41.47  E-value: 3.84e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 971402013    14 GQPQPGY--QGYQQPAYGGQPLPGVPHTQYG-AYNGPMPGYQQPvPPQGSVRALPTSGAPP--------PASGTSLPSGH 82
Cdd:pfam03157  363 GQGQPGYypTSQQQPQQGQQPEQGQQGQQQGqGQQGQQPGQGQQ-PGQGQPGYYPTSPQQSgqgqpgyyPTSPQQSGQGQ 441
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 971402013    83 Q-------GYSQFGQGD-----VQNGIPTSTAPMQRPPASQpflPGSAPAPVSQPSTFQQYGPPPCSVQQLSNHMAgmti 150
Cdd:pfam03157  442 QpgqgqqpGQEQPGQGQqpgqgQQGQQPGQPEQGQQPGQGQ---PGYYPTSPQQSGQGQQLGQWQQQGQGQPGYYP---- 514
                          170       180       190       200       210       220
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*
gi 971402013   151 gsTSVSAPPPAGLGYGPPTSVPPVSGSFSATgsgLYTPYTASPGPPPPSVPQGLPLAQPPFSGQP 215
Cdd:pfam03157  515 --TSPLQPGQGQPGYYPTSPQQPGQGQQLGQ---LQQPTQGQQGQQSGQGQQGQQPGQGQQGQQP 574
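The Glutenin_hmw description above names three repeated motifs in the central elastomeric domain: PGQGQQ, GYYPTS[P/L]QQ and GQQ. As a rough illustration only, the sketch below counts non-overlapping occurrences of each motif in a gap-free protein sequence. The function name `count_glutenin_motifs` and the motif labels are assumptions made for this example; each motif is counted independently, so a GQQ inside a PGQGQQ is counted under both.

```python
import re

# Motifs as listed in the family description; [P/L] is written as the
# regular-expression character class [PL]. Labels are illustrative only.
GLUTENIN_MOTIFS = {
    "hexapeptide": re.compile(r"PGQGQQ"),
    "nonapeptide": re.compile(r"GYYPTS[PL]QQ"),
    "tripeptide": re.compile(r"GQQ"),
}


def count_glutenin_motifs(sequence: str) -> dict:
    """Count non-overlapping matches of each motif in a protein sequence.

    Alignment gap characters ('-') and whitespace are removed first, and the
    sequence is upper-cased so lowercase (unaligned) residues are included.
    """
    cleaned = re.sub(r"[\s-]", "", sequence).upper()
    return {
        name: len(pattern.findall(cleaned))
        for name, pattern in GLUTENIN_MOTIFS.items()
    }


if __name__ == "__main__":
    # Fragment taken from the Cdd:pfam03157 row of the alignment above.
    fragment = "GQGQPGYYptSQQQPQQGQQPEQGQQGQQQG-QGQQGQQPGQGQQ"
    print(count_glutenin_motifs(fragment))
```

A count like this only describes composition; it says nothing about whether the query protein shares the structural properties of glutenin, and low-complexity hits such as this one are usually read with that caution in mind.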
PHA02682 PHA02682
ORF080 virion core protein; Provisional
64-237 5.27e-03

ORF080 virion core protein; Provisional


Pssm-ID: 177464 [Multi-domain]  Cd Length: 280  Bit Score: 40.23  E-value: 5.27e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 971402013   64 LPTSGAPPPASGTSLPSGHQGYSQFGQGdVQNGIPTSTAPMQRPPASQPFLPgSAPAPVSQPSTFQQYGPPPCSvqqlsn 143
Cdd:PHA02682   34 IPAPAAPCPPDADVDPLDKYSVKEAGRY-YQSRLKANSACMQRPSGQSPLAP-SPACAAPAPACPACAPAAPAP------ 105
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 971402013  144 hmagmtigSTSVSAPPPAGLGYGPPTSVPPVSGSFSATGSGLYTPYTAS-PGPPPPSVPQGLPLAQPPFSGQPVPTQRLP 222
Cdd:PHA02682  106 --------AVTCPAPAPACPPATAPTCPPPAVCPAPARPAPACPPSTRQcPPAPPLPTPKPAPAAKPIFLHNQLPPPDYP 177
                         170
                  ....*....|....*.
gi 971402013  223 -TEVPGFAPPPPATGI 237
Cdd:PHA02682  178 aASCPTIETAPAASPV 193
GGN pfam15685
Gametogenetin; GGN is a family of proteins largely found in mammals. It reacts with POG in the ...
54-379 5.81e-03

Gametogenetin; GGN is a family of proteins largely found in mammals. It reacts with POG in the maturation of sperm and is expressed virtually only in the testis. It is found to be associated with the intracellular membrane, binds with GGNBP1 and may be involved in vesicular trafficking.


Pssm-ID: 434857 [Multi-domain]  Cd Length: 668  Bit Score: 40.91  E-value: 5.81e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 971402013    54 PV-PPQGSVRALPTSGAPPPASGTSLPS----------------GHQGYSQFGQGDvqngiPTSTAPMQRPPASQPFLPG 116
Cdd:pfam15685   83 PVtPPPEEAAAAAVSTAPPPAVGSLLPApskwrkptgtavarirGLLEASHRGQGD-----PLSLRPLLPLLPRQLIEKD 157
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 971402013   117 SAP-APVSQPSTF---QQYGPPPCSVQQLSN--------------------HMA-GMTIGSTSVSAPPPAGLG------- 164
Cdd:pfam15685  158 PAPgAPAPPPPTPlepRKPPPLPPSDRQPPNrgitpalatsatsptdsqakHIAeGKTAGGACGGAPPQAGEGemarfaa 237
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 971402013   165 --------------YGPPTSVPPVSGSFSAT-------GSGLYTpyTASPGPPPPSVPQGlPLaqPPfsGQPvptqRLPT 223
Cdd:pfam15685  238 sesglsllckvtfkSAAPLCPAAASGPLAAKaslggggGGGLFA--ASGAISCAEVLKQG-PL--AP--GAA----RPLG 306
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 971402013   224 EVPGFAPPPPAtGIGassyppptgaprpppmpgpplSGQTVAGPPMSQPNHVSS-PPPPLTLSGPHPGPPMSGPPPPTHP 302
Cdd:pfam15685  307 EVPRAALETEG-GEG---------------------DGEGCSGGPAAPASHARAlPPPAYTTFPGSKPKFDWVSPPDGPE 364
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 971402013   303 PqpgyQMQQNGSFGQVRGPQPNYGG-----AYPGTPNYGSQPGPPPKRLDPDSIPSPIQVIEDDRNNRGSEPFVTGVRGQ 377
Cdd:pfam15685  365 R----HFRFNGAGGGIGAPRRRAAAlsgpwGSPPPPPGKAHPIPGPRRPAPALLAPPMFIFPAPTNGEPVRPGPPAPQAL 440

                   ..
gi 971402013   378 VP 379
Cdd:pfam15685  441 LP 442
PRK12727 PRK12727
flagellar biosynthesis protein FlhF;
99-285 6.25e-03

flagellar biosynthesis protein FlhF;


Pssm-ID: 237182 [Multi-domain]  Cd Length: 559  Bit Score: 40.36  E-value: 6.25e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 971402013   99 TSTAPMQRPPASQPFLPGSAPAPVSQPSTFQQYGPPPCSVQQLSNHM-AGMTI-GSTSVSAPPPAGLGYGPPTSVPPVSG 176
Cdd:PRK12727   62 TPATAAAPAPAPQAPTKPAAPVHAPLKLSANANMSQRQRVASAAEDMiAAMALrQPVSVPRQAPAAAPVRAASIPSPAAQ 141
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 971402013  177 SFSATGSGLYTPYTASPGPPPPSVPQGLPLAQPPFSGQPVPtqrlPTEVPGFAPPPPATGIGASSYPPPTGAPRPPPMPG 256
Cdd:PRK12727  142 ALAHAAAVRTAPRQEHALSAVPEQLFADFLTTAPVPRAPVQ----APVVAAPAPVPAIAAALAAHAAYAQDDDEQLDDDG 217
                         170       180       190
                  ....*....|....*....|....*....|
gi 971402013  257 PPL-SGQTVAGPPMSQPNHVSSPPPPLTLS 285
Cdd:PRK12727  218 FDLdDALPQILPPAALPPIVVAPAAPAALA 247
PPE COG5651
PPE-repeat protein [Function unknown];
63-285 7.42e-03

PPE-repeat protein [Function unknown];


Pssm-ID: 444372 [Multi-domain]  Cd Length: 385  Bit Score: 39.88  E-value: 7.42e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 971402013   63 ALPTSGAPPPASGTslPSGHQGYSQFG--QGDVQNGIPTSTAPMQRPPASQPFLPGSAPAPVSqpstfqqygPPPCSVQQ 140
Cdd:COG5651   163 ALTPFTQPPPTITN--PGGLLGAQNAGsgNTSSNPGFANLGLTGLNQVGIGGLNSGSGPIGLN---------SGPGNTGF 231
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 971402013  141 LSNHMAGMTIGSTSVSAPPPAGLGYGPPTSVPPVSGSFSATGSGLYTPYTASPGPPPPSVPQGLPLAQPPFSGQPVPTQR 220
Cdd:COG5651   232 AGTGAAAGAAAAAAAAAAAAGAGASAALASLAATLLNASSLGLAATAASSAATNLGLAGSPLGLAGGGAGAAAATGLGLG 311
                         170       180       190       200       210       220
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*
gi 971402013  221 LPTEVPGFAPPPPATGIGASSYPPPTGAPRPPPMPGPPLSGQTVAGPPMSQPNHVSSPPPPLTLS 285
Cdd:COG5651   312 AGGAAGAAGATGAGAALGAGAAAAAAGAAAGAGAAAAAAAGGAGGGGGGALGAGGGGGSAGAAAG 376
 
Blast search parameters
Data Source: Precalculated data, version = cdd.v.3.21
Preset Options: Database: CDSEARCH/cdd; Low complexity filter: no; Composition Based Adjustment: yes; E-value threshold: 0.01
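
The per-hit summary lines in the listing above (Pssm-ID, Cd Length, Bit Score, E-value) can be read programmatically and checked against the E-value threshold stated in the search parameters. The following is a minimal sketch, not part of any NCBI tool: the function name `parse_summary`, the regular expression, and the choice of `<=` for the threshold comparison are assumptions made for illustration. The sample lines are copied from the hits listed in this section.

```python
import re

# Hypothetical pattern for the per-hit summary line as it appears in this
# listing; the field order is taken from the page text, not from an NCBI spec.
SUMMARY = re.compile(
    r"Pssm-ID:\s*(?P<pssm>\d+)\s*(?:\[(?P<kind>[^\]]+)\])?\s*"
    r"Cd Length:\s*(?P<length>\d+)\s*"
    r"Bit Score:\s*(?P<bits>[\d.]+)\s*"
    r"E-value:\s*(?P<evalue>[\d.eE+-]+)"
)


def parse_summary(line):
    """Return a dict of the summary fields, or None if the line does not match."""
    m = SUMMARY.search(line)
    if m is None:
        return None
    return {
        "pssm_id": int(m.group("pssm")),
        "multi_domain": m.group("kind") == "Multi-domain",
        "cd_length": int(m.group("length")),
        "bit_score": float(m.group("bits")),
        "e_value": float(m.group("evalue")),
    }


# Summary lines copied from the hits in this section.
lines = [
    "Pssm-ID: 177464 [Multi-domain]  Cd Length: 280  Bit Score: 40.23  E-value: 5.27e-03",
    "Pssm-ID: 434857 [Multi-domain]  Cd Length: 668  Bit Score: 40.91  E-value: 5.81e-03",
    "Pssm-ID: 237182 [Multi-domain]  Cd Length: 559  Bit Score: 40.36  E-value: 6.25e-03",
    "Pssm-ID: 444372 [Multi-domain]  Cd Length: 385  Bit Score: 39.88  E-value: 7.42e-03",
]

E_VALUE_THRESHOLD = 0.01  # value stated under "Preset Options"

hits = [h for h in map(parse_summary, lines) if h is not None]
# Assumption: a hit is kept when its E-value does not exceed the threshold.
kept = [h for h in hits if h["e_value"] <= E_VALUE_THRESHOLD]

for h in kept:
    print(h["pssm_id"], h["bit_score"], h["e_value"])
```

All four hits shown here fall under the 0.01 threshold, which is consistent with their appearing in the results.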

References:

  • Wang J et al. (2023), "The conserved domain database in 2023", Nucleic Acids Res. 51(D1):D384-D388.
  • Lu S et al. (2020), "The conserved domain database in 2020", Nucleic Acids Res. 48(D1):D265-D268.
  • Marchler-Bauer A et al. (2017), "CDD/SPARCLE: functional classification of proteins via subfamily domain architectures", Nucleic Acids Res. 45(D1):D200-D203.