| Name | Accession | Description | Interval | E-value |
| COG5028 | COG5028 | Vesicle coat complex COPII, subunit SEC24/subunit SFB2/subunit SFB3 [Intracellular trafficking ... | 182-1028 | 4.46e-167 |
|
Vesicle coat complex COPII, subunit SEC24/subunit SFB2/subunit SFB3 [Intracellular trafficking and secretion];
Pssm-ID: 227361 [Multi-domain] Cd Length: 861 Bit Score: 512.80 E-value: 4.46e-167
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 182 GASPLPLPMYRPDG--LSGPPPPNAQYQPpplpgQTLGAGYPPQQAANSGPQMAGAQL---SYPGGF----PGGPAQMAG 252
Cdd:COG5028 7 GVYPQAQSQVHTGAasSKKSARPHRAYAN-----FSAGQMGMPPYTTPPLQQQSRRQIdqaATAMHNtganNPAPSVMSP 81
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 253 PPQPQKK-------------LDPDSIPSPIQVIENDRASRGGQVYATnTRGQIPPLvTTDCMIQDQGNASPRFIRCTTYC 319
Cdd:COG5028 82 AFQSQQKfsspyggsmadgtAPKPTNPLVPVDLFEDQPPPISDLFLP-PPPIVPPL-TTNFVGSEQSNCSPKYVRSTMYA 159
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 320 FPCTSDMAKQAQIPLAAVIKPFATIPSNESPLYLVNHGEsgPVRCNRCKAYMCPFMQFIEGGRRYQCGFCNCVNDVPPFY 399
Cdd:COG5028 160 IPETNDLLKKSKIPFGLVIRPFLELYPEEDPVPLVEDGS--IVRCRRCRSYINPFVQFIEQGRKWRCNICRSKNDVPEGF 237
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 400 FQHLDHIGRRLDHYEKPELSLGSYEYVATLDYcrKSKPPNPPAFIFMIDVSYSNIKNGLVKLICEELKTMLEKIPKEEQE 479
Cdd:COG5028 238 DNPSGPNDPRSDRYSRPELKSGVVDFLAPKEY--SLRQPPPPVYVFLIDVSFEAIKNGLVKAAIRAILENLDQIPNFDPR 315
|
330 340 350 360 370 380 390 400
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 480 etsaIRVGFITYNKVLHFFNVKSNLaQPQMMVVTDVGEVFVPLLDG-FLVNYQESQSVIHNLLDQIPDMFADSNENETVF 558
Cdd:COG5028 316 ----TKIAIICFDSSLHFFKLSPDL-DEQMLIVSDLDEPFLPFPSGlFVLPLKSCKQIIETLLDRVPRIFQDNKSPKNAL 390
|
410 420 430 440 450 460 470 480
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 559 APVIQAGMEALKAadCPGKLFIFHSSLPTAeAPGKLKNRDDKklvntdkEKILFQPQTNVYDSLAKDCVAHGCSVTLFLF 638
Cdd:COG5028 391 GPALKAAKSLIGG--TGGKIIVFLSTLPNM-GIGKLQLREDK-------ESSLLSCKDSFYKEFAIECSKVGISVDLFLT 460
|
490 500 510 520 530 540 550 560
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 639 PSQYVDVASLGLVPQLTGGTLYKYNNFQM--HLDRQQFLNDLRNDIEKKIGFDAIMRVRTSTGFRATDFFGGILMNNTTD 716
Cdd:COG5028 461 SEDYIDVATLSHLCRYTGGQTYFYPNFSAtrPNDATKLANDLVSHLSMEIGYEAVMRVRCSTGLRVSSFYGNFFNRSSDL 540
|
570 580 590 600 610 620 630 640
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 717 VEMAAIDCDKAVTVEFKHDDKLSeDSGALIQCAVLYTTISGQRRLRIHNLGLNCSSQLADLYKSCETDALINFFAKSAFK 796
Cdd:COG5028 541 CAFSTMPRDTSLLVEFSIDEKLM-TSDVYFQVALLYTLNDGERRIRVVNLSLPTSSSIREVYASADQLAIACILAKKAST 619
|
650 660 670 680 690 700 710 720
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 797 AVLHQPLKVIREILVNQTAHMLACYRKNCASPSAASQLILPDSMKVLPVYMNCLLKNcVLLSRPEISTDERAYQRQLVMT 876
Cdd:COG5028 620 KALNSSLKEARVLINKSMVDILKAYKKELVKSNTSTQLPLPANLKLLPLLMLALLKS-SAFRSGSTPSDIRISALNRLTS 698
|
730 740 750 760 770 780 790 800
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 877 MGVADSQLFFYPQLLPIHTL-------DVKSTMLPAAVRCSESRLSEEGIFLLANGLHMFLWLGVSSPPELIQGIFNVPS 949
Cdd:COG5028 699 LPLKQLMRNIYPTLYALHDMpieaglpDEGLLVLPSPINATSSLLESGGLYLIDTGQKIFLWFGKDAVPSLLQDLFGVDS 778
|
810 820 830 840 850 860 870 880
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 950 FAHINTDMTLLPEVGNPYSQQLRMIMGIIQQKRPYS-MKLTIVKQREQP--EMVFRQFLVEDKgLYGGSSYVDFLCCVHK 1026
Cdd:COG5028 779 LSDIPSGKFTLPPTGNEFNERVRNIIGELRSVNDDStLPLVLVRGGGDPslRLWFFSTLVEDK-TLNIPSYLDYLQILHE 857
|
..
gi 968121920 1027 EI 1028
Cdd:COG5028 858 KI 859
|
|
| Sec24-like | cd01479 | Sec24-like: Protein and membrane traffic in eukaryotes is mediated at least in part by the ... | 438-697 | 7.00e-119 |
|
Sec24-like: Protein and membrane traffic in eukaryotes is mediated at least in part by the budding and fusion of intracellular transport vesicles that selectively carry cargo proteins and lipids from donor to acceptor organelles. The two main classes of vesicular carriers within the endocytic and the biosynthetic pathways are COP- and clathrin-coated vesicles. Formation of COPII vesicles requires the ordered assembly of a coat built from several cytosolic components: the GTPase Sar1 and the Sec23-Sec24 and Sec13-Sec31 complexes. The process is initiated by the exchange of GDP for GTP on the GTPase Sar1, which then recruits the heterodimeric complex of Sec23 and Sec24. This heterodimeric complex generates the pre-budding complex. The final step, leading to membrane deformation and budding of COPII-coated vesicles, is carried out by the heterodimeric complex Sec13-Sec31. The members of this CD belong to the Sec23-like family. Sec24 is very similar to Sec23. The Sec23 and Sec24 polypeptides fold into five distinct domains: a beta-barrel, a zinc finger, a vWA or trunk domain, an all-helical region, and a carboxy-terminal gelsolin domain. The members of this subgroup carry a partial MIDAS motif and have the overall Para-Rossmann type fold that is characteristic of this superfamily.
Pssm-ID: 238756 [Multi-domain] Cd Length: 244 Bit Score: 364.29 E-value: 7.00e-119
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 438 PNPPAFIFMIDVSYSNIKNGLVKLICEELKTMLEKIPKEEqeetSAIRVGFITYNKVLHFFNVKSNLAQPQMMVVTDVGE 517
Cdd:cd01479 1 PQPAVYVFLIDVSYNAIKSGLLATACEALLSNLDNLPGDD----PRTRVGFITFDSTLHFFNLKSSLEQPQMMVVSDLDD 76
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 518 VFVPLLDGFLVNYQESQSVIHNLLDQIPDMFADSNENETVFAPVIQAGMEALKaaDCPGKLFIFHSSLPTAEApGKLKNR 597
Cdd:cd01479 77 PFLPLPDGLLVNLKESRQVIEDLLDQIPEMFQDTKETESALGPALQAAFLLLK--ETGGKIIVFQSSLPTLGA-GKLKSR 153
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 598 DDKKLVNTDKEKILFQPQTNVYDSLAKDCVAHGCSVTLFLFPSQYVDVASLGLVPQLTGGTLYKYNNFQmhldrqqflND 677
Cdd:cd01479 154 EDPKLLSTDKEKQLLQPQTDFYKKLALECVKSQISVDLFLFSNQYVDVATLGCLSRLTGGQVYYYPSFN---------FS 224
|
250 260
....*....|....*....|
gi 968121920 678 LRNDIEKKIGFDAIMRVRTS 697
Cdd:cd01479 225 APNDVEKLVNELARYLTRKI 244
|
|
| Sec23_trunk | pfam04811 | Sec23/Sec24 trunk domain; COPII-coated vesicles carry proteins from the endoplasmic reticulum ... | 438-682 | 6.44e-111 |
|
Sec23/Sec24 trunk domain; COPII-coated vesicles carry proteins from the endoplasmic reticulum to the Golgi complex. This vesicular transport can be reconstituted by using three cytosolic components containing five proteins: the small GTPase Sar1p, the Sec23p/24p complex, and the Sec13p/Sec31p complex. This domain is known as the trunk domain and has an alpha/beta vWA fold and forms the dimer interface.
Pssm-ID: 398467 [Multi-domain] Cd Length: 241 Bit Score: 343.46 E-value: 6.44e-111
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 438 PNPPAFIFMIDVSYSNIKNGLVKLICEELKTMLEKIPKEeqeetSAIRVGFITYNKVLHFFNVKSNLAQPQMMVVTDVGE 517
Cdd:pfam04811 1 PQPPVFLFVIDVSYNAIKSGLLAALKESLLQSLDLLPGD-----PRARVGFITFDSTVHFFNLGSSLRQPQMLVVSDLQD 75
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 518 VFVPLLDGFLVNYQESQSVIHNLLDQIPDMFADSNENETVFAPVIQAGMEALKAADCPGKLFIFHSSLPTAEAPGKLKNR 597
Cdd:pfam04811 76 MFLPLPDRFLVPLSECRFVLEDLLEQLPPMFPVTKRPERCLGPALQAAFLLLKAAFTGGKIMVFQGGLPTVGPGGKLKSR 155
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 598 DDKKLVNTDKEKILFQPQTN-VYDSLAKDCVAHGCSVTLFLFPSQYVDVASLGLVPQLTGGTLYKYNNFQMHLDRQQFLN 676
Cdd:pfam04811 156 LDESHHGTDKEKAKLVKKADkFYKSLAKECVKQGHSVDLFAFSLDYVDVATLGQLSRLTGGQVYLYPSFQADVDGSKFKQ 235
|
....*.
gi 968121920 677 DLRNDI 682
Cdd:pfam04811 236 DLQRYF 241
|
|
| PTZ00395 | PTZ00395 | Sec24-related protein; Provisional | 440-1026 | 1.29e-50 |
|
Sec24-related protein; Provisional
Pssm-ID: 185594 [Multi-domain] Cd Length: 1560 Bit Score: 195.68 E-value: 1.29e-50
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 440 PPAFIFMIDVSYSNIKNGLVKLICEELKTMLEKIpkeeqeETSAIRVGFITYNKVLHFFNVKSNLAQP------------ 507
Cdd:PTZ00395 952 PPYFVFVVECSYNAIYNNITYTILEGIRYAVQNV------KCPQTKIAIITFNSSIYFYHCKGGKGVSgeegdggggsgn 1025
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 508 -QMMVVTDVGEVFVPL-LDGFLVNYQESQSVIHNLLDQIPDMFADSNENETVFAPVIQAGMEALKAADCPGKLFIFHSSL 585
Cdd:PTZ00395 1026 hQVIVMSDVDDPFLPLpLEDLFFGCVEEIDKINTLIDTIKSVSTTMQSYGSCGNSALKIAMDMLKERNGLGSICMFYTTT 1105
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 586 PTAeAPGKLKnrddkKLVNTDKEKILFQPQTNVYDSLAKDCVAHGCSVTLFLFPSQYVDVA--SLGLVPQLTGGTLYKYN 663
Cdd:PTZ00395 1106 PNC-GIGAIK-----ELKKDLQENFLEVKQKIFYDSLLLDLYAFNISVDIFIISSNNVRVCvpSLQYVAQNTGGKILFVE 1179
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 664 NFQMHLDRQQ-FLNDLRNDIEKKIGFDAIMRVRTSTG------FRATDFFGGILMNNTtdVEMAAIDCDKAVTVEFKHDD 736
Cdd:PTZ00395 1180 NFLWQKDYKEiYMNIMDTLTSEDIAYCCELKLRYSHHmsvkklFCCNNNFNSIISVDT--IKIPKIRHDQTFAFLLNYSD 1257
|
330 340 350 360 370 380 390 400
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 737 KLSEDSGALIQCAVLYTTISGQRRLRIHNLGLNCSSQLADLYKSCETDALINFFAKSAFKAVLHQplKVIREILVNQTAH 816
Cdd:PTZ00395 1258 ISESKKQIYFQCACIYTNLWGDRFVRLHTTHMNLTSSLSTVFRYTDAEALMNILIKQLCTNILHN--DNYSKIIIDNLAA 1335
|
410 420 430 440 450 460 470 480
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 817 MLACYRKNCASPSAASQLILPDSMKVLPVYMNCLLKNCVllSRPEISTDERAYQRQLVMTMGVADSQLFFYPQLLPIH-- 894
Cdd:PTZ00395 1336 ILFSYRINCASSAHSGQLILPDTLKLLPLFTSSLLKHNV--TKKEILHDLKVYSLIKLLSMPIISSLLYVYPVMYVIHik 1413
|
490 500 510 520 530 540 550 560
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 895 -------TLDVKSTM-LPAAVRCSESRLSEEGIFLLANGLHMFLWLGVSSPPELIQGIF-NVPSFAHINTdmtlLPEVGN 965
Cdd:PTZ00395 1414 gktneidSMDVDDDLfIPKTIPSSAEKIYSNGIYLLDACTHFYLYFGFHSDANFAKEIVgDIPTEKNAHE----LNLTDT 1489
|
570 580 590 600 610 620
....*....|....*....|....*....|....*....|....*....|....*....|...
gi 968121920 966 PYSQQLRMIMGIIQQKRPYS--MKLTIVKQREQPEMVFRQFLVEDKGlYGGSSYVDFLCCVHK 1026
Cdd:PTZ00395 1490 PNAQKVQRIIKNLSRIHHFNkyVPLVMVAPKSNEEEHLISLCVEDKA-DKEYSYVNFLCFIHK 1551
|
|
| PHA03247 | PHA03247 | large tegument protein UL36; Provisional | 8-294 | 2.05e-10 |
|
large tegument protein UL36; Provisional
Pssm-ID: 223021 [Multi-domain] Cd Length: 3151 Bit Score: 65.34 E-value: 2.05e-10
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 8 ATPPYSQPQPGIGLSPPHYGhygDPSHTASPTGMMKPAGPlGATATRGMLPPGPPPPGPhqfgqnGAHATGHPPQRFPGP 87
Cdd:PHA03247 2718 ATPLPPGPAAARQASPALPA---APAPPAVPAGPATPGGP-ARPARPPTTAGPPAPAPP------AAPAAGPPRRLTRPA 2787
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 88 PPVNNVASSHAPyQPSAQSSYPGPISTSSVTQLGSQLSAmqinsygSGMAPPSQGPPgplsaTSLQTPPRPPQPSiLQPG 167
Cdd:PHA03247 2788 VASLSESRESLP-SPWDPADPPAAVLAPAAALPPAASPA-------GPLPPPTSAQP-----TAPPPPPGPPPPS-LPLG 2853
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 168 SQVLPPPPTTLNGPGASPLPLPMYRP----DGLSGPPPPNaQYQPPPLPGQTLGAGYPPQQAANSGPQMAGAQLSYPGGF 243
Cdd:PHA03247 2854 GSVAPGGDVRRRPPSRSPAAKPAAPArppvRRLARPAVSR-STESFALPPDQPERPPQPQAPPPPQPQPQPPPPPQPQPP 2932
|
250 260 270 280 290
....*....|....*....|....*....|....*....|....*....|.
gi 968121920 244 PGGPAQMAGPPQPQKKLDPDSIPSPIQVIENDRASRGGQVYAtnTRGQIPP 294
Cdd:PHA03247 2933 PPPPPRPQPPLAPTTDPAGAGEPSGAVPQPWLGALVPGRVAV--PRFRVPQ 2981
|
|
| Atrophin-1 | pfam03154 | Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ... | 96-260 | 1.45e-09 |
|
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA (OMIM:125370) is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1 that is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.
Pssm-ID: 460830 [Multi-domain] Cd Length: 991 Bit Score: 62.48 E-value: 1.45e-09
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 96 SHAPYQPSAQSSYPGPISTSSVTQLGSQLSAMQINSygsgmAPPSQGPPGPLSATSLQTP-PRPPQPSILQPGSQVLPPP 174
Cdd:pfam03154 143 STSPSIPSPQDNESDSDSSAQQQILQTQPPVLQAQS-----GAASPPSPPPPGTTQAATAgPTPSAPSVPPQGSPATSQP 217
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 175 PTTLNGPgASPLPL----PMYRPDGLSGPPPPNAQYQPPPLPGQTLGAGYPPQQAANSGP------QMAGAQLSYPG--- 241
Cdd:pfam03154 218 PNQTQST-AAPHTLiqqtPTLHPQRLPSPHPPLQPMTQPPPPSQVSPQPLPQPSLHGQMPpmphslQTGPSHMQHPVppq 296
|
170 180
....*....|....*....|.
gi 968121920 242 GFPGGP--AQMAGPPQPQKKL 260
Cdd:pfam03154 297 PFPLTPqsSQSQVPPGPSPAA 317
|
|
| PABP-1234 | TIGR01628 | polyadenylate binding protein, human types 1, 2, 3, 4 family; These eukaryotic proteins ... | 102-232 | 9.60e-06 |
|
polyadenylate binding protein, human types 1, 2, 3, 4 family; These eukaryotic proteins recognize the poly-A of mRNA and consist of four tandem RNA recognition domains at the N-terminus (rrm: pfam00076) followed by a PABP-specific domain (pfam00658) at the C-terminus. The protein is involved in the transport of mRNAs from the nucleus to the cytoplasm. There are four paralogs in Homo sapiens, which are, respectively, expressed in testis, expressed in platelets, broadly expressed, and of unknown tissue range.
Pssm-ID: 130689 [Multi-domain] Cd Length: 562 Bit Score: 49.42 E-value: 9.60e-06
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 102 PSAQSSYPGPISTS-SVTQLGSQLSAMQINSY-----GSGMAPPSQGppgplsatslqtPPRPPQPSILQPGSQVLPPPP 175
Cdd:TIGR01628 381 RMRQLPMGSPMGGAmGQPPYYGQGPQQQFNGQplgwpRMSMMPTPMG------------PGGPLRPNGLAPMNAVRAPSR 448
|
90 100 110 120 130
....*....|....*....|....*....|....*....|....*....|....*..
gi 968121920 176 TTLNGPGASPLPLPMYRPDGLSGPPPPNAQyQPPPLPGQTLGAGYPPQQAANSGPQM 232
Cdd:TIGR01628 449 NAQNAAQKPPMQPVMYPPNYQSLPLSQDLP-QPQSTASQGGQNKKLAQVLASATPQM 504
|
|
| BimA_first | NF040984 | trimeric autotransporter actin-nucleating factor BimA; BimA (B. pseudomallei intracellular ... | 94-211 | 2.43e-04 |
|
trimeric autotransporter actin-nucleating factor BimA; BimA (B. pseudomallei intracellular motility protein A) is a trimeric autotransporter, homologous in its C-terminal half to a number of trimeric autotransporter adhesins. It is a virulence factor that nucleates actin, so that actin polymerization can drive escape by B. pseudomallei out of one cell and into a neighboring cell. HMM NF040983 describes a homolog with similar activity but substantial difference in sequence architecture in the N-terminal region.
Pssm-ID: 468914 [Multi-domain] Cd Length: 517 Bit Score: 44.86 E-value: 2.43e-04
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 94 ASSHAPYQPSaqssyPGPISTSSVTQLGSQLSAMQINSYGSgmappsqgPPGPLSATSLQTPPrpPQPSilqPGSQVLPP 173
Cdd:NF040984 6 SSSHAPDAPK-----PSSIATTLCRALASLSLGLSMDAEAN--------PPEPPGGTNIPVPP--PMPG---GGANIPVP 67
|
90 100 110
....*....|....*....|....*....|....*...
gi 968121920 174 PPTTLNGPGASPLPLPmyrPDGLSGPPPpnaqyQPPPL 211
Cdd:NF040984 68 PPMPGGGANIPPPPPP---PGGIGGATP-----SPPPL 97
|
|
| PABP-1234 | TIGR01628 | polyadenylate binding protein, human types 1, 2, 3, 4 family; These eukaryotic proteins ... | 153-260 | 9.21e-04 |
|
polyadenylate binding protein, human types 1, 2, 3, 4 family; These eukaryotic proteins recognize the poly-A of mRNA and consist of four tandem RNA recognition domains at the N-terminus (rrm: pfam00076) followed by a PABP-specific domain (pfam00658) at the C-terminus. The protein is involved in the transport of mRNAs from the nucleus to the cytoplasm. There are four paralogs in Homo sapiens, which are, respectively, expressed in testis, expressed in platelets, broadly expressed, and of unknown tissue range.
Pssm-ID: 130689 [Multi-domain] Cd Length: 562 Bit Score: 43.26 E-value: 9.21e-04
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 153 QTPPRPPQPSILQP--GSQVLPP----------PPTTLNGPGASPLPLPMY--RPDGLSGPPPPNAQYQPP----PLPGQ 214
Cdd:TIGR01628 377 QLQPRMRQLPMGSPmgGAMGQPPyygqgpqqqfNGQPLGWPRMSMMPTPMGpgGPLRPNGLAPMNAVRAPSrnaqNAAQK 456
|
90 100 110 120 130
....*....|....*....|....*....|....*....|....*....|..
gi 968121920 215 TLGAGYPPQQAANSGPQMAGAQLSYPGGFPGGP----AQM--AGPPQPQKKL 260
Cdd:TIGR01628 457 PPMQPVMYPPNYQSLPLSQDLPQPQSTASQGGQnkklAQVlaSATPQMQKQV 508
|
|
| BimA_second | NF040983 | trimeric autotransporter actin-nucleating factor BimA; This HMM describes BimA (Burkholderia ... | 139-247 | 1.35e-03 |
|
trimeric autotransporter actin-nucleating factor BimA; This HMM describes BimA (Burkholderia intracellular motility A), WP_004266405.1-like proteins in Burkholderia mallei or B. pseudomallei. The term BimA has also been used for WP_011205626.1-like homologs that have a very different N-terminal half.
Pssm-ID: 468913 [Multi-domain] Cd Length: 382 Bit Score: 42.20 E-value: 1.35e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 139 PSQGPPGPlsatslqtPPRPPQPSILQPGSQVLPPPPttlngPGASPLPLPmyrpdglsgPPPPNAQYQPPPlPGQTLGA 218
Cdd:NF040983 86 PNKVPPPP--------PPPPPPPPPPPTPPPPPPPPP-----PPPPPSPPP---------PPPPSPPPSPPP-PTTTPPT 142
|
90 100
....*....|....*....|....*....
gi 968121920 219 GYPPQQAANSgPQMAGAQlsyPGGFPGGP 247
Cdd:NF040983 143 RTTPSTTTPT-PSMHPIQ---PTQLPSIP 167
|
|
| PPE | COG5651 | PPE-repeat protein [Function unknown]; | 70-254 | 2.87e-03 |
|
PPE-repeat protein [Function unknown];
Pssm-ID: 444372 [Multi-domain] Cd Length: 385 Bit Score: 41.42 E-value: 2.87e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 70 GQNGAHATGhPPQRFPGPPPVNNVASSHAPYQPSAQSSYPGPISTSS-VTQLGSQLSAMQINSYGSGMAPPSQGPPGPLS 148
Cdd:COG5651 179 GLLGAQNAG-SGNTSSNPGFANLGLTGLNQVGIGGLNSGSGPIGLNSgPGNTGFAGTGAAAGAAAAAAAAAAAAGAGASA 257
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 149 AtslqtpprpPQPSILQPGSQVLPPPPTTLNGPGASPLPLPmyrPDGLSGPPPPNAQYQPPPLPGQTLGAGYPPQQAANS 228
Cdd:COG5651 258 A---------LASLAATLLNASSLGLAATAASSAATNLGLA---GSPLGLAGGGAGAAAATGLGLGAGGAAGAAGATGAG 325
|
170 180
....*....|....*....|....*.
gi 968121920 229 GPQMAGAQLSYPGGFPGGPAQMAGPP 254
Cdd:COG5651 326 AALGAGAAAAAAGAAAGAGAAAAAAA 351
|
|
| SAV_2336_NTERM | NF041121 | SAV_2336 family N-terminal domain; This HMM describes an N-terminal domain shared by SAV_2336 ... | 138-205 | 5.41e-03 |
|
SAV_2336 family N-terminal domain; This HMM describes an N-terminal domain shared by SAV_2336 (BAC70047.1) whose C-terminal region suggests restriction enzyme activity (PMID: 18456708), and with other proteins with unrelated C-terminal regions. A member protein was also identified in a kanamycin biosynthetic gene cluster (PMID:16766657), while N-terminal regions of two other member proteins were named Trypco1 in a bioinformatic study (PMID:32101166) of predicted bacterial conflict systems.
Pssm-ID: 469044 [Multi-domain] Cd Length: 473 Bit Score: 40.37 E-value: 5.41e-03
10 20 30 40 50 60
....*....|....*....|....*....|....*....|....*....|....*....|....*...
gi 968121920 138 PPSQGPPGPLSATSLQTPPRPPQPSILQPGSQVLPPPPTTLNGPGASPLPLPMYRPDglsGPPPPNAQ 205
Cdd:NF041121 39 PPPAAPPSPPGDPPEPPAPEPAPLPAPYPGSLAPPPPPPPGPAGAAPGAALPVRVPA---PPALPNPL 103
|
|
| SAV_2336_NTERM | NF041121 | SAV_2336 family N-terminal domain; This HMM describes an N-terminal domain shared by SAV_2336 ... | 137-214 | 7.50e-03 |
|
SAV_2336 family N-terminal domain; This HMM describes an N-terminal domain shared by SAV_2336 (BAC70047.1) whose C-terminal region suggests restriction enzyme activity (PMID: 18456708), and with other proteins with unrelated C-terminal regions. A member protein was also identified in a kanamycin biosynthetic gene cluster (PMID:16766657), while N-terminal regions of two other member proteins were named Trypco1 in a bioinformatic study (PMID:32101166) of predicted bacterial conflict systems.
Pssm-ID: 469044 [Multi-domain] Cd Length: 473 Bit Score: 39.99 E-value: 7.50e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 137 APPSQGPPGPLSATSLQTPPRPPQPSiLQPGSQVLPPPPTTLNGPGASP--LPLPMYRPDGLSGPPPPNAQ----YQPPP 210
Cdd:NF041121 20 APPSPEGPAPTAASQPATPPPPAAPP-SPPGDPPEPPAPEPAPLPAPYPgsLAPPPPPPPGPAGAAPGAALpvrvPAPPA 98
|
....
gi 968121920 211 LPGQ 214
Cdd:NF041121 99 LPNP 102
|
|
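Each hit in the listing above carries a statistics line of the form `Pssm-ID: ... Cd Length: ... Bit Score: ... E-value: ...`. As a minimal sketch (not part of the search output itself), the Python below parses such a line and estimates how much of the domain model an alignment spans. The names `STATS_RE`, `parse_stats`, and `model_coverage` are hypothetical helpers introduced here for illustration, and the regular expression assumes the field order shown in this listing.

```python
import re

# Assumed layout of the per-hit statistics line, e.g.
# "Pssm-ID: 227361 [Multi-domain] Cd Length: 861 Bit Score: 512.80 E-value: 4.46e-167"
STATS_RE = re.compile(
    r"Pssm-ID:\s*(?P<pssm_id>\d+)\s*"
    r"(?:\[(?P<hit_type>[^\]]+)\])?\s*"
    r"Cd Length:\s*(?P<cd_length>\d+)\s*"
    r"Bit Score:\s*(?P<bit_score>[\d.]+)\s*"
    r"E-value:\s*(?P<e_value>[\d.]+(?:[eE][+-]?\d+)?)"
)


def parse_stats(line):
    """Return the fields of one statistics line as a dict, or None if no match."""
    m = STATS_RE.search(line)
    if m is None:
        return None
    return {
        "pssm_id": int(m["pssm_id"]),
        "hit_type": m["hit_type"],
        "cd_length": int(m["cd_length"]),
        "bit_score": float(m["bit_score"]),
        "e_value": float(m["e_value"]),
    }


def model_coverage(model_start, model_end, cd_length):
    """Fraction of the domain model spanned by the alignment (inclusive positions)."""
    return (model_end - model_start + 1) / cd_length


# Example with the COG5028 hit: its alignment rows run from model
# position 7 to 859, i.e. 853 of the 861 model positions.
stats = parse_stats(
    "Pssm-ID: 227361 [Multi-domain] Cd Length: 861 "
    "Bit Score: 512.80 E-value: 4.46e-167"
)
coverage = model_coverage(7, 859, stats["cd_length"])
```

The span-based coverage ignores gaps inside the alignment, so it is an upper bound on the fraction of model positions that are actually aligned.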
| trunk_domain | cd01468 | trunk domain. COPII-coated vesicles carry proteins from the endoplasmic reticulum to the Golgi ... | 438-678 | 7.24e-98 |
|
trunk domain. COPII-coated vesicles carry proteins from the endoplasmic reticulum to the Golgi complex. This vesicular transport can be reconstituted by using three cytosolic components containing five proteins: the small GTPase Sar1p, the Sec23p/24p complex, and the Sec13p/Sec31p complex. This domain is known as the trunk domain and has an alpha/beta vWA fold and forms the dimer interface. Some members of this family possess a partial MIDAS motif that is a characteristic feature of most vWA domain proteins.
Pssm-ID: 238745 [Multi-domain] Cd Length: 239 Bit Score: 308.79 E-value: 7.24e-98
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 438 PNPPAFIFMIDVSYSNIKNGLVKLICEELKTMLEKIPKEeqeetSAIRVGFITYNKVLHFFNVKSNLAQPQMMVVTDVGE 517
Cdd:cd01468 1 PQPPVFVFVIDVSYEAIKEGLLQALKESLLASLDLLPGD-----PRARVGLITYDSTVHFYNLSSDLAQPKMYVVSDLKD 75
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 518 VFVPLLDGFLVNYQESQSVIHNLLDQIPDMFAD--SNENETVFAPVIQAGMEALKAADCPGKLFIFHSSLPTAEaPGKLK 595
Cdd:cd01468 76 VFLPLPDRFLVPLSECKKVIHDLLEQLPPMFWPvpTHRPERCLGPALQAAFLLLKGTFAGGRIIVFQGGLPTVG-PGKLK 154
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 596 NRDDKKLVNTDKEKILFQPQTNVYDSLAKDCVAHGCSVTLFLFPSQYVDVASLGLVPQLTGGTLYKYNNFQMHLDRQQFL 675
Cdd:cd01468 155 SREDKEPIRSHDEAQLLKPATKFYKSLAKECVKSGICVDLFAFSLDYVDVATLKQLAKSTGGQVYLYDSFQAPNDGSKFK 234
|
...
gi 968121920 676 NDL 678
Cdd:cd01468 235 QDL 237
|
|
| PTZ00395 | PTZ00395 | Sec24-related protein; Provisional | 440-1026 | 1.29e-50 |
|
Sec24-related protein; Provisional
Pssm-ID: 185594 [Multi-domain] Cd Length: 1560 Bit Score: 195.68 E-value: 1.29e-50
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 440 PPAFIFMIDVSYSNIKNGLVKLICEELKTMLEKIpkeeqeETSAIRVGFITYNKVLHFFNVKSNLAQP------------ 507
Cdd:PTZ00395 952 PPYFVFVVECSYNAIYNNITYTILEGIRYAVQNV------KCPQTKIAIITFNSSIYFYHCKGGKGVSgeegdggggsgn 1025
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 508 -QMMVVTDVGEVFVPL-LDGFLVNYQESQSVIHNLLDQIPDMFADSNENETVFAPVIQAGMEALKAADCPGKLFIFHSSL 585
Cdd:PTZ00395 1026 hQVIVMSDVDDPFLPLpLEDLFFGCVEEIDKINTLIDTIKSVSTTMQSYGSCGNSALKIAMDMLKERNGLGSICMFYTTT 1105
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 586 PTAeAPGKLKnrddkKLVNTDKEKILFQPQTNVYDSLAKDCVAHGCSVTLFLFPSQYVDVA--SLGLVPQLTGGTLYKYN 663
Cdd:PTZ00395 1106 PNC-GIGAIK-----ELKKDLQENFLEVKQKIFYDSLLLDLYAFNISVDIFIISSNNVRVCvpSLQYVAQNTGGKILFVE 1179
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 664 NFQMHLDRQQ-FLNDLRNDIEKKIGFDAIMRVRTSTG------FRATDFFGGILMNNTtdVEMAAIDCDKAVTVEFKHDD 736
Cdd:PTZ00395 1180 NFLWQKDYKEiYMNIMDTLTSEDIAYCCELKLRYSHHmsvkklFCCNNNFNSIISVDT--IKIPKIRHDQTFAFLLNYSD 1257
|
330 340 350 360 370 380 390 400
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 737 KLSEDSGALIQCAVLYTTISGQRRLRIHNLGLNCSSQLADLYKSCETDALINFFAKSAFKAVLHQplKVIREILVNQTAH 816
Cdd:PTZ00395 1258 ISESKKQIYFQCACIYTNLWGDRFVRLHTTHMNLTSSLSTVFRYTDAEALMNILIKQLCTNILHN--DNYSKIIIDNLAA 1335
|
410 420 430 440 450 460 470 480
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 817 MLACYRKNCASPSAASQLILPDSMKVLPVYMNCLLKNCVllSRPEISTDERAYQRQLVMTMGVADSQLFFYPQLLPIH-- 894
Cdd:PTZ00395 1336 ILFSYRINCASSAHSGQLILPDTLKLLPLFTSSLLKHNV--TKKEILHDLKVYSLIKLLSMPIISSLLYVYPVMYVIHik 1413
|
490 500 510 520 530 540 550 560
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 895 -------TLDVKSTM-LPAAVRCSESRLSEEGIFLLANGLHMFLWLGVSSPPELIQGIF-NVPSFAHINTdmtlLPEVGN 965
Cdd:PTZ00395 1414 gktneidSMDVDDDLfIPKTIPSSAEKIYSNGIYLLDACTHFYLYFGFHSDANFAKEIVgDIPTEKNAHE----LNLTDT 1489
|
570 580 590 600 610 620
....*....|....*....|....*....|....*....|....*....|....*....|...
gi 968121920 966 PYSQQLRMIMGIIQQKRPYS--MKLTIVKQREQPEMVFRQFLVEDKGlYGGSSYVDFLCCVHK 1026
Cdd:PTZ00395 1490 PNAQKVQRIIKNLSRIHHFNkyVPLVMVAPKSNEEEHLISLCVEDKA-DKEYSYVNFLCFIHK 1551
|
|
| Sec23_helical |
pfam04815 |
Sec23/Sec24 helical domain; COPII-coated vesicles carry proteins from the endoplasmic ... |
784-884 |
1.62e-32 |
|
Sec23/Sec24 helical domain; COPII-coated vesicles carry proteins from the endoplasmic reticulum to the Golgi complex. This vesicular transport can be reconstituted by using three cytosolic components containing five proteins: the small GTPase Sar1p, the Sec23p/24p complex, and the Sec13p/Sec31p complex. This domain is composed of five alpha helices.
Pssm-ID: 461441 [Multi-domain] Cd Length: 103 Bit Score: 121.46 E-value: 1.62e-32
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 784 DALINFFAKSAFKAVLHQPLKVIREILVNQTAHMLACYRKNCASPSAASQLILPDSMKVLPVYMNCLLKNCVLLSRPEIS 863
Cdd:pfam04815 3 EAIAVLLAKKAVEKALSSSLSDAREALDNKLVDILAAYRKYCASSSSPGQLILPESLKLLPLYMLALLKSPALRGGNSSP 82
|
90 100
....*....|....*....|.
gi 968121920 864 TDERAYQRQLVMTMGVADSQL 884
Cdd:pfam04815 83 SDERAYARHLLLSLPVEELLL 103
|
|
| Sec23_BS |
pfam08033 |
Sec23/Sec24 beta-sandwich domain; |
687-771 |
4.52e-29 |
|
Sec23/Sec24 beta-sandwich domain;
Pssm-ID: 429794 [Multi-domain] Cd Length: 86 Bit Score: 111.09 E-value: 4.52e-29
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 687 GFDAIMRVRTSTGFRATDFFGGILMNNTTD-VEMAAIDCDKAVTVEFKHDDKLSEDSGALIQCAVLYTTISGQRRLRIHN 765
Cdd:pfam08033 1 GFNAVLRVRTSKGLKVSGFIGNFVSRSSGDtWKLPSLDPDTSYAFEFDIDEPLPNGSNAYIQFALLYTHSSGERRIRVTT 80
|
....*.
gi 968121920 766 LGLNCS 771
Cdd:pfam08033 81 VALPVT 86
|
|
| SEC23 |
COG5047 |
Vesicle coat complex COPII, subunit SEC23 [Intracellular trafficking and secretion]; |
313-901 |
1.70e-21 |
|
Vesicle coat complex COPII, subunit SEC23 [Intracellular trafficking and secretion];
Pssm-ID: 227380 [Multi-domain] Cd Length: 755 Bit Score: 100.73 E-value: 1.70e-21
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 313 IRCTTYCFPCTSDMAKQAQIPLAAVIKPFatipsNESPLYLVNHGEsgPVRCNR-CKAYMCPFMQFIEGGRRYQCGFCNC 391
Cdd:COG5047 12 IRLTWNVFPATRGDATRTVIPIACLYTPL-----HEDDALTVNYYE--PVKCTApCKAVLNPYCHIDERNQSWICPFCNQ 84
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 392 VNDVPPFYfqhldhigRRLDHYEKP-ELSLGS--YEYVAtldycrkSKPPN-PPAFIFMIDVSYSNIKNGLVKlicEELK 467
Cdd:COG5047 85 RNTLPPQY--------RDISNANLPlELLPQSstIEYTL-------SKPVIlPPVFFFVVDACCDEEELTALK---DSLI 146
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 468 TMLEKIPKEeqeetsAIrVGFITYNKVLHFFNV------KSNLAQP----QMMVVTDVGEVFVPLLDG------------ 525
Cdd:COG5047 147 VSLSLLPPE------AL-VGLITYGTSIQVHELnaenhrRSYVFSGnkeyTKENLQELLALSKPTKSGgfeskisgigqf 219
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 526 ----FLVNYQESQSVIHNLLDQI-PDMFADSNENE----TVFAPVIQAGMEALKAADCPGKLFIFHSSlPTAEAPGKLKN 596
Cdd:COG5047 220 assrFLLPTQQCEFKLLNILEQLqPDPWPVPAGKRplrcTGSALNIASSLLEQCFPNAGCHIVLFAGG-PCTVGPGTVVS 298
|
330 340 350 360 370 380 390 400
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 597 RDDKK------LVNTDKEKiLFQPQTNVYDSLAKDCVAHGCSVTLFLfpSQYVDVASLGLVP--QLTGGTLYKYNNFQMH 668
Cdd:COG5047 299 TELKEpmrshhDIESDSAQ-HSKKATKFYKGLAERVANQGHALDIFA--GCLDQIGIMEMEPltTSTGGALVLSDSFTTS 375
|
410 420 430 440 450 460 470 480
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 669 LDRQQFLN--DLRNDIEKKIGFDAIMRVRTSTGFRATDFFG---------------GILMNNTTDVEMAAIDCDKAVTVE 731
Cdd:COG5047 376 IFKQSFQRifNRDSEGYLKMGFNANMEVKTSKNLKIKGLIGhavsvkkkannisdsEIGIGATNSWKMASLSPKSNYALY 455
|
490 500 510 520 530 540 550 560
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 732 FK-----HDDKLSEDSGALIQCAVLYTTISGQRRLRIHNLGLNCSSQLADL-YKSCETDALINFFAK-SAFKAVLHQPLK 804
Cdd:COG5047 456 FEialgaASGSAQRPAEAYIQFITTYQHSSGTYRIRVTTVARMFTDGGLPKiNRSFDQEAAAVFMARiAAFKAETEDIID 535
|
570 580 590 600 610 620 630 640
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 805 VIREI---LVNQTAHmLACYRKNcaspsAASQLILPDSMKVLPVYMnCLLKNCVLLSRPEISTDERAYQRQLVMTMGVAD 881
Cdd:COG5047 536 VFRWIdrnLIRLCQK-FADYRKD-----DPSSFRLDPNFTLYPQFM-YHLRRSPFLSVFNNSPDETAFYRHMLNNADVND 608
|
650 660
....*....|....*....|....*...
gi 968121920 882 SQLFFYPQLLPIH--------TLDVKST 901
Cdd:COG5047 609 SLIMIQPTLQSYSfekggvpvLLDSVSV 636
|
|
| zf-Sec23_Sec24 |
pfam04810 |
Sec23/Sec24 zinc finger; COPII-coated vesicles carry proteins from the endoplasmic reticulum ... |
361-397 |
1.03e-15 |
|
Sec23/Sec24 zinc finger; COPII-coated vesicles carry proteins from the endoplasmic reticulum to the Golgi complex. This vesicular transport can be reconstituted by using three cytosolic components containing five proteins: the small GTPase Sar1p, the Sec23p/24p complex, and the Sec13p/Sec31p complex. This domain has been found to be a zinc-binding domain.
Pssm-ID: 461437 [Multi-domain] Cd Length: 38 Bit Score: 71.71 E-value: 1.03e-15
10 20 30
....*....|....*....|....*....|....*..
gi 968121920 361 PVRCNRCKAYMCPFMQFIEGGRRYQCGFCNCVNDVPP 397
Cdd:pfam04810 1 PVRCRRCRAYLNPFCQFDFGGKKWTCNFCGTRNPVPP 37
|
|
| PLN00162 |
PLN00162 |
transport protein sec23; Provisional |
313-657 |
6.19e-11 |
|
transport protein sec23; Provisional
Pssm-ID: 215083 [Multi-domain] Cd Length: 761 Bit Score: 66.50 E-value: 6.19e-11
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 313 IRCTTYCFPCTSDMAKQAQIPLAAVIKPFAtiPSNESPL--YlvnhgesGPVRCNRCKAYMCPFMQFIEGGRRYQCGFCN 390
Cdd:PLN00162 12 VRMSWNVWPSSKIEASKCVIPLAALYTPLK--PLPELPVlpY-------DPLRCRTCRAVLNPYCRVDFQAKIWICPFCF 82
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 391 CVNDVPPFYFQhldhIGrrlDHYEKPELslgsYEYVATLDY---CRKSKPPNPPAFIFMIDVSYSNIKNGLVKlicEELK 467
Cdd:PLN00162 83 QRNHFPPHYSS----IS---ETNLPAEL----FPQYTTVEYtlpPGSGGAPSPPVFVFVVDTCMIEEELGALK---SALL 148
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 468 TMLEKIPkeeqeETSaiRVGFITY----------------------------NKVLHFFNVKSNLAQPQMMVVTDVGEVF 519
Cdd:PLN00162 149 QAIALLP-----ENA--LVGLITFgthvhvhelgfsecsksyvfrgnkevskDQILEQLGLGGKKRRPAGGGIAGARDGL 221
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 520 VPL-LDGFLVNYQESQSVIHNLLDQI-PDMFADSNENE----TVFAPVIQAGMEALKAADCPGKLFIFHSSlPTAEAPGK 593
Cdd:PLN00162 222 SSSgVNRFLLPASECEFTLNSALEELqKDPWPVPPGHRparcTGAALSVAAGLLGACVPGTGARIMAFVGG-PCTEGPGA 300
|
330 340 350 360 370 380
....*....|....*....|....*....|....*....|....*....|....*....|....*....
gi 968121920 594 LKNRDDKKLVNTDKEKI-----LFQPQTNVYDSLAKDCVAHGCSVTLFLFPSQYVDVASLGLVPQLTGG 657
Cdd:PLN00162 301 IVSKDLSEPIRSHKDLDkdaapYYKKAVKFYEGLAKQLVAQGHVLDVFACSLDQVGVAEMKVAVERTGG 369
|
|
| PHA03247 |
PHA03247 |
large tegument protein UL36; Provisional |
8-294 |
2.05e-10 |
|
large tegument protein UL36; Provisional
Pssm-ID: 223021 [Multi-domain] Cd Length: 3151 Bit Score: 65.34 E-value: 2.05e-10
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 8 ATPPYSQPQPGIGLSPPHYGhygDPSHTASPTGMMKPAGPlGATATRGMLPPGPPPPGPhqfgqnGAHATGHPPQRFPGP 87
Cdd:PHA03247 2718 ATPLPPGPAAARQASPALPA---APAPPAVPAGPATPGGP-ARPARPPTTAGPPAPAPP------AAPAAGPPRRLTRPA 2787
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 88 PPVNNVASSHAPyQPSAQSSYPGPISTSSVTQLGSQLSAmqinsygSGMAPPSQGPPgplsaTSLQTPPRPPQPSiLQPG 167
Cdd:PHA03247 2788 VASLSESRESLP-SPWDPADPPAAVLAPAAALPPAASPA-------GPLPPPTSAQP-----TAPPPPPGPPPPS-LPLG 2853
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 168 SQVLPPPPTTLNGPGASPLPLPMYRP----DGLSGPPPPNaQYQPPPLPGQTLGAGYPPQQAANSGPQMAGAQLSYPGGF 243
Cdd:PHA03247 2854 GSVAPGGDVRRRPPSRSPAAKPAAPArppvRRLARPAVSR-STESFALPPDQPERPPQPQAPPPPQPQPQPPPPPQPQPP 2932
|
250 260 270 280 290
....*....|....*....|....*....|....*....|....*....|.
gi 968121920 244 PGGPAQMAGPPQPQKKLDPDSIPSPIQVIENDRASRGGQVYAtnTRGQIPP 294
Cdd:PHA03247 2933 PPPPPRPQPPLAPTTDPAGAGEPSGAVPQPWLGALVPGRVAV--PRFRVPQ 2981
|
|
| PHA03378 |
PHA03378 |
EBNA-3B; Provisional |
96-270 |
2.76e-10 |
|
EBNA-3B; Provisional
Pssm-ID: 223065 [Multi-domain] Cd Length: 991 Bit Score: 64.70 E-value: 2.76e-10
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 96 SHAPyQPSAQSSYPGPISTSSVTQLGSQLSAMQINSYGSGMAPPsQGPPGPLSATSLQTPPRPPQPSILQPGSQVLPP-P 174
Cdd:PHA03378 613 SHIP-ETSAPRQWPMPLRPIPMRPLRMQPITFNVLVFPTPHQPP-QVEITPYKPTWTQIGHIPYQPSPTGANTMLPIQwA 690
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 175 PTTLNGPGASPLPL--------PMYRPDGLSGP-PPPNAQYQPPPLPGQTLGAGYPPQQA--ANSGPQMAGAQLSYPGGF 243
Cdd:PHA03378 691 PGTMQPPPRAPTPMrppaappgRAQRPAAATGRaRPPAAAPGRARPPAAAPGRARPPAAApgRARPPAAAPGRARPPAAA 770
|
170 180 190
....*....|....*....|....*....|
gi 968121920 244 PGGPAQM---AGPPQPQKKldPDSIPSPIQ 270
Cdd:PHA03378 771 PGAPTPQpppQAPPAPQQR--PRGAPTPQP 798
|
|
| Gelsolin |
pfam00626 |
Gelsolin repeat; |
900-975 |
1.06e-09 |
|
Gelsolin repeat;
Pssm-ID: 395501 [Multi-domain] Cd Length: 76 Bit Score: 55.78 E-value: 1.06e-09
10 20 30 40 50 60 70
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*...
gi 968121920 900 STMLPAAVRCSESRLSEEGIFLLANGLHMFLWLGVSSppELIQGIFNVPSFAHINTDM-TLLPEVGN-PYSQQLRMIM 975
Cdd:pfam00626 1 KFVLPPPVPLSQESLNSGDCYLLDNGFTIFLWVGKGS--SLLEKLFAALLAAQLDDDErFPLPEVIRvPQGKEPARFL 76
|
|
| PHA03378 |
PHA03378 |
EBNA-3B; Provisional |
93-260 |
1.45e-09 |
|
EBNA-3B; Provisional
Pssm-ID: 223065 [Multi-domain] Cd Length: 991 Bit Score: 62.39 E-value: 1.45e-09
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 93 VASSHAPYQPSaqssypgPISTSSVTQLGSQLSAMQINSYGSGMAPPSQGPPGPLS-ATSLQTPPRPPQ--PSILQPGSQ 169
Cdd:PHA03378 667 TQIGHIPYQPS-------PTGANTMLPIQWAPGTMQPPPRAPTPMRPPAAPPGRAQrPAAATGRARPPAaaPGRARPPAA 739
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 170 V---LPPP---PTTLNGPGASPLPLPmyRPDGLSGPPPPNAQYQPPPLPGQTLGAGYPPQQAANSGPqmAGAQLSypggf 243
Cdd:PHA03378 740 ApgrARPPaaaPGRARPPAAAPGRAR--PPAAAPGAPTPQPPPQAPPAPQQRPRGAPTPQPPPQAGP--TSMQLM----- 810
|
170
....*....|....*..
gi 968121920 244 pggPAQMAGPPQPQKKL 260
Cdd:PHA03378 811 ---PRAAPGQQGPTKQI 824
|
|
| Atrophin-1 |
pfam03154 |
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ... |
96-260 |
1.45e-09 |
|
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA OMIM:125370 is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1 that is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.
Pssm-ID: 460830 [Multi-domain] Cd Length: 991 Bit Score: 62.48 E-value: 1.45e-09
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 96 SHAPYQPSAQSSYPGPISTSSVTQLGSQLSAMQINSygsgmAPPSQGPPGPLSATSLQTP-PRPPQPSILQPGSQVLPPP 174
Cdd:pfam03154 143 STSPSIPSPQDNESDSDSSAQQQILQTQPPVLQAQS-----GAASPPSPPPPGTTQAATAgPTPSAPSVPPQGSPATSQP 217
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 175 PTTLNGPgASPLPL----PMYRPDGLSGPPPPNAQYQPPPLPGQTLGAGYPPQQAANSGP------QMAGAQLSYPG--- 241
Cdd:pfam03154 218 PNQTQST-AAPHTLiqqtPTLHPQRLPSPHPPLQPMTQPPPPSQVSPQPLPQPSLHGQMPpmphslQTGPSHMQHPVppq 296
|
170 180
....*....|....*....|.
gi 968121920 242 GFPGGP--AQMAGPPQPQKKL 260
Cdd:pfam03154 297 PFPLTPqsSQSQVPPGPSPAA 317
|
|
| Atrophin-1 |
pfam03154 |
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ... |
93-274 |
5.69e-09 |
|
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA OMIM:125370 is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1 that is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.
Pssm-ID: 460830 [Multi-domain] Cd Length: 991 Bit Score: 60.17 E-value: 5.69e-09
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 93 VASSHAPYQPSAQSSYPGPISTSSVTQLGSQLSAMQINSYGSGMAPPS---QGP-------PGPLSATSLQTPPRPPQPS 162
Cdd:pfam03154 182 SPPSPPPPGTTQAATAGPTPSAPSVPPQGSPATSQPPNQTQSTAAPHTliqQTPtlhpqrlPSPHPPLQPMTQPPPPSQV 261
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 163 ILQPGSQV-----LPPPPTTLN-GPGASPLPLPmyrPDGLsGPPPPNAQYQPPPLPgQTLGAGYPPQQAANSGPQMAGAQ 236
Cdd:pfam03154 262 SPQPLPQPslhgqMPPMPHSLQtGPSHMQHPVP---PQPF-PLTPQSSQSQVPPGP-SPAAPGQSQQRIHTPPSQSQLQS 336
|
170 180 190 200 210
....*....|....*....|....*....|....*....|....*....|...
gi 968121920 237 LSYPGGFPGGPAQMAGP-------------PQPQKKLDPD--SIPSPIQVIEN 274
Cdd:pfam03154 337 QQPPREQPLPPAPLSMPhikpppttpipqlPNPQSHKHPPhlSGPSPFQMNSN 389
|
|
| Atrophin-1 |
pfam03154 |
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ... |
99-290 |
3.08e-08 |
|
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA OMIM:125370 is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1 that is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.
Pssm-ID: 460830 [Multi-domain] Cd Length: 991 Bit Score: 57.85 E-value: 3.08e-08
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 99 PYQPSAQS-SYPGPISTSSVTQLGSQLS---AMQINSYGSGMAPPSQGPPgPLS--ATSLQTPPRPPQPSILQPgSQVLP 172
Cdd:pfam03154 364 PQLPNPQShKHPPHLSGPSPFQMNSNLPpppALKPLSSLSTHHPPSAHPP-PLQlmPQSQQLPPPPAQPPVLTQ-SQSLP 441
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 173 P-----PPTTLNGPGASPLPLPMYrPDGLSGPPPPNAQYQPPPLPGQTLGAGYPPQQA--ANSGPQMAGAQLSYPggfpg 245
Cdd:pfam03154 442 PpaashPPTSGLHQVPSQSPFPQH-PFVPGGPPPITPPSGPPTSTSSAMPGIQPPSSAsvSSSGPVPAAVSCPLP----- 515
|
170 180 190 200 210
....*....|....*....|....*....|....*....|....*....|..
gi 968121920 246 gPAQMAGPP-----QPQKKLDPDSIPSPIQVIEN--DRASRGGQVYATNTRG 290
Cdd:pfam03154 516 -PVQIKEEAldeaeEPESPPPPPRSPSPEPTVVNtpSHASQSARFYKHLDRG 566
|
|
| PHA03247 |
PHA03247 |
large tegument protein UL36; Provisional |
10-262 |
3.47e-08 |
|
large tegument protein UL36; Provisional
Pssm-ID: 223021 [Multi-domain] Cd Length: 3151 Bit Score: 58.03 E-value: 3.47e-08
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 10 PPYSQPQPGIGLSPPHYGHYGDPSHTASPTGMMKPAGPLGATATRGMLPPGPPPPGPHQFGQNGAHATGHPPQRFPGPPP 89
Cdd:PHA03247 2741 PPAVPAGPATPGGPARPARPPTTAGPPAPAPPAAPAAGPPRRLTRPAVASLSESRESLPSPWDPADPPAAVLAPAAALPP 2820
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 90 VNNVASSHAP---YQPSAQSSYPGPISTSSVTQlGSQLSAMQINSYGSGMAPPSQ-------------GPPGPLSATSLQ 153
Cdd:PHA03247 2821 AASPAGPLPPptsAQPTAPPPPPGPPPPSLPLG-GSVAPGGDVRRRPPSRSPAAKpaaparppvrrlaRPAVSRSTESFA 2899
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 154 TPPRPPQPsilqPGSQVLPPPPTTLNGPGASPLPLPMYRPDGLsgPPPPNAQYQPPPLPGQTLGAGYPPQQAANSGPQMA 233
Cdd:PHA03247 2900 LPPDQPER----PPQPQAPPPPQPQPQPPPPPQPQPPPPPPPR--PQPPLAPTTDPAGAGEPSGAVPQPWLGALVPGRVA 2973
|
250 260
....*....|....*....|....*....
gi 968121920 234 GAQLSYPGGFPGGPAQMAGPPQPQKKLDP 262
Cdd:PHA03247 2974 VPRFRVPQPAPSREAPASSTPPLTGHSLS 3002
|
|
| PHA03247 |
PHA03247 |
large tegument protein UL36; Provisional |
138-350 |
2.70e-07 |
|
large tegument protein UL36; Provisional
Pssm-ID: 223021 [Multi-domain] Cd Length: 3151 Bit Score: 54.94 E-value: 2.70e-07
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 138 PPSQGP---PGPLSATS-LQTPPRPPQPSILQPGSQVLPPPPTTLNGPGASPLPLPMYRPDGLSGPPPPNAQYQPP--PL 211
Cdd:PHA03247 2701 PPPPPPtpePAPHALVSaTPLPPGPAAARQASPALPAAPAPPAVPAGPATPGGPARPARPPTTAGPPAPAPPAAPAagPP 2780
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 212 PGQTLGAGYPPQQAANSGP---QMAGAQLSYPGGFPG-----GPAQMAGPPQPQKKLDPDSIPSPIQVIENDRAS--RGG 281
Cdd:PHA03247 2781 RRLTRPAVASLSESRESLPspwDPADPPAAVLAPAAAlppaaSPAGPLPPPTSAQPTAPPPPPGPPPPSLPLGGSvaPGG 2860
|
170 180 190 200 210 220 230
....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 282 QVYATNTRGQIPPLVTTDCMIQDQGNASPRFIRcTTYCFPCTSD-MAKQAQIPLAAVIKPFATIPSNESP 350
Cdd:PHA03247 2861 DVRRRPPSRSPAAKPAAPARPPVRRLARPAVSR-STESFALPPDqPERPPQPQAPPPPQPQPQPPPPPQP 2929
|
|
| Pro-rich |
pfam15240 |
Proline-rich protein; This family includes several eukaryotic proline-rich proteins. |
103-249 |
1.27e-06 |
|
Proline-rich protein; This family includes several eukaryotic proline-rich proteins.
Pssm-ID: 464580 [Multi-domain] Cd Length: 167 Bit Score: 49.65 E-value: 1.27e-06
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 103 SAQSSYPGPISTSSVTQLGSQLSAMQINSYGSGMAPPSQGPPGPLSATSLQTPPRP--PQPSILQPGS---QVLPPPPTT 177
Cdd:pfam15240 15 SAQSSSEDVSQEDSPSLISEEEGQSQQGGQGPQGPPPGGFPPQPPASDDPPGPPPPggPQQPPPQGGKqkpQGPPPQGGP 94
|
90 100 110 120 130 140 150
....*....|....*....|....*....|....*....|....*....|....*....|....*....|...
gi 968121920 178 LNGPGASPLPLPMYRPDGLSGPPPPNAQYQPPPLPGQTLGAGYPPQ-QAANSGPQMAGAQLSYPGGFPGGPAQ 249
Cdd:pfam15240 95 RPPPGKPQGPPPQGGNQQQGPPPPGKPQGPPPQGGGPPPQGGNQQGpPPPPPGNPQGPPQRPPQPGNPQGPPQ 167
|
|
| PAT1 |
pfam09770 |
Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate ... |
93-255 |
2.20e-06 |
|
Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate chromosome transmission during cell division.
Pssm-ID: 401645 [Multi-domain] Cd Length: 846 Bit Score: 51.96 E-value: 2.20e-06
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 93 VASSHAPYQPSAQSSYPGPIS-TSSVTQLGSQLSAMQINS---YGSGMAPPSQGPPGPLSATSLQTPPRPPQPSILQPGS 168
Cdd:pfam09770 171 AAPAPAPQPAAQPASLPAPSRkMMSLEEVEAAMRAQAKKPaqqPAPAPAQPPAAPPAQQAQQQQQFPPQIQQQQQPQQQP 250
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 169 QVLPP------PPTTLNGPGASPLPLPMYRPdglsGPPPPNAQYQPPPLPGQTL------------GAGYPPQQAANSGP 230
Cdd:pfam09770 251 QQPQQhpgqghPVTILQRPQSPQPDPAQPSI----QPQAQQFHQQPPPVPVQPTqilqnpnrlsaaRVGYPQNPQPGVQP 326
|
170 180
....*....|....*....|....*
gi 968121920 231 QMAGAQLSYPGGFPGGPAQMAGPPQ 255
Cdd:pfam09770 327 APAHQAHRQQGSFGRQAPIITHPQQ 351
|
|
| Pro-rich |
pfam15240 |
Proline-rich protein; This family includes several eukaryotic proline-rich proteins. |
122-270 |
2.96e-06 |
|
Proline-rich protein; This family includes several eukaryotic proline-rich proteins.
Pssm-ID: 464580 [Multi-domain] Cd Length: 167 Bit Score: 48.50 E-value: 2.96e-06
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 122 SQLSAMQINSYGSGMAPPSQGPPGPLSatslQTPPRPPQPSILQPGSQVLPPPPTTLNGPGASPlplpmyrPDGLSGPPP 201
Cdd:pfam15240 30 SLISEEEGQSQQGGQGPQGPPPGGFPP----QPPASDDPPGPPPPGGPQQPPPQGGKQKPQGPP-------PQGGPRPPP 98
|
90 100 110 120 130 140 150
....*....|....*....|....*....|....*....|....*....|....*....|....*....|.
gi 968121920 202 PNAQYQPPPLPGQTLGAGYPPQQaaNSGPQMAGAQLSYPGGFPGGPAQMAGPPQ--PQKKLDPDSIPSPIQ 270
Cdd:pfam15240 99 GKPQGPPPQGGNQQQGPPPPGKP--QGPPPQGGGPPPQGGNQQGPPPPPPGNPQgpPQRPPQPGNPQGPPQ 167
|
|
| dnaA |
PRK14086 |
chromosomal replication initiator protein DnaA; |
98-262 |
3.74e-06 |
|
chromosomal replication initiator protein DnaA;
Pssm-ID: 237605 [Multi-domain] Cd Length: 617 Bit Score: 50.98 E-value: 3.74e-06
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 98 APYQPSAQSSYPGPISTSSVTQLGSQLSAMQINSYGSGMAPPSQGPPGPLSATSLQTPPRP----PQPSILQPGSQVLPP 173
Cdd:PRK14086 95 PAPPPPHARRTSEPELPRPGRRPYEGYGGPRADDRPPGLPRQDQLPTARPAYPAYQQRPEPgawpRAADDYGWQQQRLGF 174
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 174 PPTTLNGPGASPLPLPMY--------RPDGLSGPPP---PNAQYQPP-----PLPGQTLGAGYPPQqAANSGPQMAGAQL 237
Cdd:PRK14086 175 PPRAPYASPASYAPEQERdrepydagRPEYDQRRRDydhPRPDWDRPrrdrtDRPEPPPGAGHVHR-GGPGPPERDDAPV 253
|
170 180 190
....*....|....*....|....*....|
gi 968121920 238 -----SYPGGFPGGPAQMAGPPQPQKKLDP 262
Cdd:PRK14086 254 vpirpSAPGPLAAQPAPAPGPGEPTARLNP 283
|
|
| SOBP |
pfam15279 |
Sine oculis-binding protein; SOBP is associated with syndromic and nonsyndromic intellectual ... |
90-231 |
5.06e-06 |
|
Sine oculis-binding protein; SOBP is associated with syndromic and nonsyndromic intellectual disability. It carries a zinc-finger of the zf-C2H2 type at the N-terminus, and a highly characteristic C-terminal PhPhPhPhPhPh motif. The deduced 873-amino acid protein contains an N-terminal nuclear localization signal (NLS), followed by 2 FCS-type zinc finger motifs, a proline-rich region (PR1), a putative RNA-binding motif region, and a C-terminal NLS embedded in a second proline-rich motif. SOBP is expressed in various human tissues, including developing mouse brain at embryonic day 14. In postnatal and adult mouse brain SOBP is expressed in all neurons, with intense staining in the limbic system. Highest expression is in layer V cortical neurons, hippocampus, pyriform cortex, dorsomedial nucleus of thalamus, amygdala, and hypothalamus. Postnatal expression of SOBP in the limbic system corresponds to a time of active synaptogenesis. The family is also referred to as Jackson circler, JXC1. In seven affected siblings from a consanguineous Israeli Arab family with mental retardation, anterior maxillary protrusion, and strabismus, mutations were found in this protein.
Pssm-ID: 464609 [Multi-domain] Cd Length: 325 Bit Score: 49.81 E-value: 5.06e-06
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 90 VNNVASSHAPYQPSAQSSYPGPISTSSVT----------QLGSQLSAMQINSYGSGMAPPSQGPPGPLSAT----SLQTP 155
Cdd:pfam15279 119 VASSSKLLAPKPHEPPSLPPPPLPPKKGRrhrpglhpplGRPPGSPPMSMTPRGLLGKPQQHPPPSPLPAFmepsSMPPP 198
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 156 PRPPQPSILQPGSQVLPP------PPTTLNGPGASPlPLPMYRPD-GLSGPPPPNAQYQPPPLPGQTLGAGYPPQQAANS 228
Cdd:pfam15279 199 FLRPPPSIPQPNSPLSNPmlpgigPPPKPPRNLGPP-SNPMHRPPfSPHHPPPPPTPPGPPPGLPPPPPRGFTPPFGPPF 277
|
...
gi 968121920 229 GPQ 231
Cdd:pfam15279 278 PPV 280
|
|
| PAT1 |
pfam09770 |
Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate ... |
94-227 |
9.42e-06 |
|
Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate chromosome transmission during cell division.
Pssm-ID: 401645 [Multi-domain] Cd Length: 846 Bit Score: 49.65 E-value: 9.42e-06
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 94 ASSHAPYQPSAQSSYPGPISTSSVTQLGSQLSAMQINSYGSGMAPPSQGPPGPLSA---------TSLQTPPRPPQPSIL 164
Cdd:pfam09770 205 AQAKKPAQQPAPAPAQPPAAPPAQQAQQQQQFPPQIQQQQQPQQQPQQPQQHPGQGhpvtilqrpQSPQPDPAQPSIQPQ 284
|
90 100 110 120 130 140 150
....*....|....*....|....*....|....*....|....*....|....*....|....*....|.
gi 968121920 165 QPGSQVLPPP----PTTL----NGPGASPLPLPMYRPDGlSGPPPPNAQYQPPPLPGQTLGAGYPPQQAAN 227
Cdd:pfam09770 285 AQQFHQQPPPvpvqPTQIlqnpNRLSAARVGYPQNPQPG-VQPAPAHQAHRQQGSFGRQAPIITHPQQLAQ 354
|
|
| PABP-1234 |
TIGR01628 |
polyadenylate binding protein, human types 1, 2, 3, 4 family; These eukaryotic proteins ... |
102-232 |
9.60e-06 |
|
polyadenylate binding protein, human types 1, 2, 3, 4 family; These eukaryotic proteins recognize the poly-A of mRNA and consist of four tandem RNA recognition domains at the N-terminus (rrm: pfam00076) followed by a PABP-specific domain (pfam00658) at the C-terminus. The protein is involved in the transport of mRNAs from the nucleus to the cytoplasm. There are four paralogs in Homo sapiens which are expressed in testis, platelets, broadly expressed and of unknown tissue range.
Pssm-ID: 130689 [Multi-domain] Cd Length: 562 Bit Score: 49.42 E-value: 9.60e-06
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 102 PSAQSSYPGPISTS-SVTQLGSQLSAMQINSY-----GSGMAPPSQGppgplsatslqtPPRPPQPSILQPGSQVLPPPP 175
Cdd:TIGR01628 381 RMRQLPMGSPMGGAmGQPPYYGQGPQQQFNGQplgwpRMSMMPTPMG------------PGGPLRPNGLAPMNAVRAPSR 448
|
90 100 110 120 130
....*....|....*....|....*....|....*....|....*....|....*..
gi 968121920 176 TTLNGPGASPLPLPMYRPDGLSGPPPPNAQyQPPPLPGQTLGAGYPPQQAANSGPQM 232
Cdd:TIGR01628 449 NAQNAAQKPPMQPVMYPPNYQSLPLSQDLP-QPQSTASQGGQNKKLAQVLASATPQM 504
|
|
| Med15 |
pfam09606 |
ARC105 or Med15 subunit of Mediator complex non-fungal; The approx. 70 residue Med15 domain of ... |
91-262 |
1.18e-05 |
|
ARC105 or Med15 subunit of Mediator complex non-fungal; The approx. 70 residue Med15 domain of the ARC-Mediator co-activator is a three-helix bundle with marked similarity to the KIX domain. The sterol regulatory element binding protein (SREBP) family of transcription activators use the ARC105 subunit to activate target genes in the regulation of cholesterol and fatty acid homeostasis. In addition, Med15 is a critical transducer of gene activation signals that control early metazoan development.
Pssm-ID: 312941 [Multi-domain] Cd Length: 732 Bit Score: 49.24 E-value: 1.18e-05
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 91 NNVASSHAPYQPSAQSSYPGPISTSSVTQLGSQLSAMQINSYGSGMA------PPSQGPPGPLSATSLQTPPRPPQpsIL 164
Cdd:pfam09606 122 NLLASLGRPQMPMGGAGFPSQMSRVGRMQPGGQAGGMMQPSSGQPGSgtpnqmGPNGGPGQGQAGGMNGGQQGPMG--GQ 199
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 165 QPGSQVLPPPPTTLNGPGAsplplpMYRPDGLSGPPPPNAQYQPPPlpgQTLGAGYPPQQAANSGP-QMAGAQLS-YPGG 242
Cdd:pfam09606 200 MPPQMGVPGMPGPADAGAQ------MGQQAQANGGMNPQQMGGAPN---QVAMQQQQPQQQGQQSQlGMGINQMQqMPQG 270
|
170 180
....*....|....*....|....*
gi 968121920 243 FP-----GGPAQMAGPPQPQKKLDP 262
Cdd:pfam09606 271 VGggagqGGPGQPMGPPGQQPGAMP 295
|
|
| PRK07764 |
PRK07764 |
DNA polymerase III subunits gamma and tau; Validated |
133-268 |
1.28e-05 |
|
DNA polymerase III subunits gamma and tau; Validated
Pssm-ID: 236090 [Multi-domain] Cd Length: 824 Bit Score: 49.21 E-value: 1.28e-05
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 133 GSGMAPPSQGPPGPlSATSLQTPPRPPQPSILQPGSQVLPPPPTTLNGPGASPLPLPMYRPDGLSGPPPPNAQYQPPPLP 212
Cdd:PRK07764 623 APAAPAPAGAAAAP-AEASAAPAPGVAAPEHHPKHVAVPDASDGGDGWPAKAGGAAPAAPPPAPAPAAPAAPAGAAPAQP 701
|
90 100 110 120 130
....*....|....*....|....*....|....*....|....*....|....*.
gi 968121920 213 GQTLGAGYPPQQAANSGPQMAGAQLSYPGGFPGGPAQMAGPPQPQKKLDPDSIPSP 268
Cdd:PRK07764 702 APAPAATPPAGQADDPAAQPPQAAQGASAPSPAADDPVPLPPEPDDPPDPAGAPAQ 757
|
|
| PRK13729 |
PRK13729 |
conjugal transfer pilus assembly protein TraB; Provisional |
117-230 |
1.45e-05 |
|
conjugal transfer pilus assembly protein TraB; Provisional
Pssm-ID: 184281 [Multi-domain] Cd Length: 475 Bit Score: 48.67 E-value: 1.45e-05
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 117 VTQLGSQLSAMQINSYGSGMAPP-SQGPPGPLSATSLQTPPRPPQPSilqPGSQVLPPppttlNGPGASPLPLPMYrpDG 195
Cdd:PRK13729 106 IEKLGQDNAALAEQVKALGANPVtATGEPVPQMPASPPGPEGEPQPG---NTPVSFPP-----QGSVAVPPPTAFY--PG 175
|
90 100 110
....*....|....*....|....*....|....*..
gi 968121920 196 LSGPPPPNAQYQPPPLPG--QTLGAGYPPQQAANSGP 230
Cdd:PRK13729 176 NGVTPPPQVTYQSVPVPNriQRKTFTYNEGKKGPSLP 212
|
|
| PRK07764 |
PRK07764 |
DNA polymerase III subunits gamma and tau; Validated |
30-285 |
1.83e-05 |
|
DNA polymerase III subunits gamma and tau; Validated
Pssm-ID: 236090 [Multi-domain] Cd Length: 824 Bit Score: 48.83 E-value: 1.83e-05
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 30 GDPSHTASPTGMMKPAGPLGATATRGMLPPGPPPPGPHQFGQNGAHATGHPPQRFPGPPPVNNVASSHAPYQPSAQSSYP 109
Cdd:PRK07764 589 GPAPGAAGGEGPPAPASSGPPEEAARPAAPAAPAAPAAPAPAGAAAAPAEASAAPAPGVAAPEHHPKHVAVPDASDGGDG 668
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 110 GPISTSSVTQLgsqlsamqinsyGSGMAPPSQGPPGPLSATSLQTPPRPPQPSILQPGSQVLPPPPttlnGPGASPLPLP 189
Cdd:PRK07764 669 WPAKAGGAAPA------------APPPAPAPAAPAAPAGAAPAQPAPAPAATPPAGQADDPAAQPP----QAAQGASAPS 732
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 190 MYRPDGLSGPPPPnaqyQPPPLPGQTLGAGYPPQQAANSGPqmagaqlsypggfPGGPAQMAGPPQPQKKLDPDsIPSPi 269
Cdd:PRK07764 733 PAADDPVPLPPEP----DDPPDPAGAPAQPPPPPAPAPAAA-------------PAAAPPPSPPSEEEEMAEDD-APSM- 793
|
250
....*....|....*.
gi 968121920 270 qvieNDRASRGGQVYA 285
Cdd:PRK07764 794 ----DDEDRRDAEEVA 805
|
|
| PRK12323 |
PRK12323 |
DNA polymerase III subunit gamma/tau; |
134-294 |
1.91e-05 |
|
DNA polymerase III subunit gamma/tau;
Pssm-ID: 237057 [Multi-domain] Cd Length: 700 Bit Score: 48.72 E-value: 1.91e-05
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 134 SGMAPPSQGPPGPLSATSLQTPPRPPQPSILQPGSQVLPPPPTTLNGPGA---SPLPLPMYRPDGLSGPPPPNAqyqPPP 210
Cdd:PRK12323 376 TAAAAPVAQPAPAAAAPAAAAPAPAAPPAAPAAAPAAAAAARAVAAAPARrspAPEALAAARQASARGPGGAPA---PAP 452
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 211 LPGQTLGAGYPPQQAANSGPQMAGAQLSYPGGFPGGPAQMAGPPQPQKKLdPDSIPSPiQVIENDRASRGGQVYATNTRG 290
Cdd:PRK12323 453 APAAAPAAAARPAAAGPRPVAAAAAAAPARAAPAAAPAPADDDPPPWEEL-PPEFASP-APAQPDAAPAGWVAESIPDPA 530
|
....
gi 968121920 291 QIPP 294
Cdd:PRK12323 531 TADP 534
|
|
| PHA03247 |
PHA03247 |
large tegument protein UL36; Provisional |
86-268 |
2.03e-05 |
|
large tegument protein UL36; Provisional
Pssm-ID: 223021 [Multi-domain] Cd Length: 3151 Bit Score: 48.78 E-value: 2.03e-05
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 86 GPPPVNNVASSHAPYQPSA--------QSSYPGPISTSSVTQLGSQLSAMQINSYGSGMAPPSQGPP----GPLSATSLQ 153
Cdd:PHA03247 2598 PRAPVDDRGDPRGPAPPSPlppdthapDPPPPSPSPAANEPDPHPPPTVPPPERPRDDPAPGRVSRPrrarRLGRAAQAS 2677
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 154 TPPRPPQPSILQPGSQVL-------PPPPTTLNGPGA--SPLPLPMYRPDGLSGPPPPNAQYQPPPLPGQTLGAGYPPQQ 224
Cdd:PHA03247 2678 SPPQRPRRRAARPTVGSLtsladppPPPPTPEPAPHAlvSATPLPPGPAAARQASPALPAAPAPPAVPAGPATPGGPARP 2757
|
170 180 190 200
....*....|....*....|....*....|....*....|....*.
gi 968121920 225 AANSGPqmAGAQLSYPGGFPGGPAQMAGPPQPQKKLDP--DSIPSP 268
Cdd:PHA03247 2758 ARPPTT--AGPPAPAPPAAPAAGPPRRLTRPAVASLSEsrESLPSP 2801
|
|
| MISS |
pfam15822 |
MAPK-interacting and spindle-stabilising protein-like; MISS is a family of eukaryotic ... |
139-256 |
2.34e-05 |
|
MAPK-interacting and spindle-stabilising protein-like; MISS is a family of eukaryotic MAPK-interacting and spindle-stabilising protein-like proteins. MISS is rich in prolines and has four potential MAPK-phosphorylation sites, a MAPK-docking site, a PEST sequence (PEST motif) and a bipartite nuclear localization signal. The endogenous protein accumulates during mouse meiotic maturation and is found as discrete dots on the MII spindle. MISS is the first example of a physiological MAPK substrate that is stabilized in MII and specifically regulates MII spindle integrity during CSF arrest.
Pssm-ID: 318115 [Multi-domain] Cd Length: 238 Bit Score: 46.90 E-value: 2.34e-05
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 139 PSQGPPGPLSATSLQTPPRPPQ--PSILQPGSQVLPPPPT----TLNGPGASPLPLPMYRPDGLSGPPPpNAQYQPPPLP 212
Cdd:pfam15822 26 PPQGWPGSNPWNNPSAPPAVPSglPPSTAPSTVPFGPAPTgmypSIPLTGPSPGPPAPFPPSGPSCPPP-GGPYPAPTVP 104
|
90 100 110 120
....*....|....*....|....*....|....*....|....
gi 968121920 213 GQTLGAGYPPqqaansgPQMAGAQLSYPGGFPGGPAQmAGPPQP 256
Cdd:pfam15822 105 GPGPIGPYPT-------PNMPFPELPRPYGAPTDPAA-AAPSGP 140
|
|
| PRK07764 |
PRK07764 |
DNA polymerase III subunits gamma and tau; Validated |
104-294 |
2.57e-05 |
|
DNA polymerase III subunits gamma and tau; Validated
Pssm-ID: 236090 [Multi-domain] Cd Length: 824 Bit Score: 48.44 E-value: 2.57e-05
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 104 AQSSYPGPISTSSVTQLGSQLsamQINsYGSGMAPPSQGPPGPLSATSLQTPPRPPQPSilQPGSQVLPPPPTTLNGPGA 183
Cdd:PRK07764 562 ASPGNAEVLVTALAEELGGDW---QVE-AVVGPAPGAAGGEGPPAPASSGPPEEAARPA--APAAPAAPAAPAPAGAAAA 635
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 184 SPLPLPMYRPDGLSGPPPPNAQYQPPPlPGQTLGAGYPPQQAANSGPQMAGAqlsyPGGFPGGPAQMAGPPQPQKKLDPD 263
Cdd:PRK07764 636 PAEASAAPAPGVAAPEHHPKHVAVPDA-SDGGDGWPAKAGGAAPAAPPPAPA----PAAPAAPAGAAPAQPAPAPAATPP 710
|
170 180 190
....*....|....*....|....*....|.
gi 968121920 264 SIPSPIQVIENDRASRGGQVYATNTRGQIPP 294
Cdd:PRK07764 711 AGQADDPAAQPPQAAQGASAPSPAADDPVPL 741
|
|
| PHA03307 |
PHA03307 |
transcriptional regulator ICP4; Provisional |
137-281 |
2.99e-05 |
|
transcriptional regulator ICP4; Provisional
Pssm-ID: 223039 [Multi-domain] Cd Length: 1352 Bit Score: 48.24 E-value: 2.99e-05
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 137 APPSQGPPGPLSATSLQTPPRPPQPsilqpgsqvLPPPPTTlnGPGASplplPMYRPDGLSGPPPPNAQYQPPPLPGQTL 216
Cdd:PHA03307 99 SPAREGSPTPPGPSSPDPPPPTPPP---------ASPPPSP--APDLS----EMLRPVGSPGPPPAASPPAAGASPAAVA 163
|
90 100 110 120 130 140
....*....|....*....|....*....|....*....|....*....|....*....|....*...
gi 968121920 217 GAGYPPQQAA---NSGPQMAGAQLSYPGGFPGGPAQMAGPPQPQKKLDPDSIPSPIQVIENDRASRGG 281
Cdd:PHA03307 164 SDAASSRQAAlplSSPEETARAPSSPPAEPPPSTPPAAASPRPPRRSSPISASASSPAPAPGRSAADD 231
|
|
| FAP |
pfam07174 |
Fibronectin-attachment protein (FAP); This family contains bacterial fibronectin-attachment ... |
137-228 |
4.59e-05 |
|
Fibronectin-attachment protein (FAP); This family contains bacterial fibronectin-attachment proteins (FAP). Family members are rich in alanine and proline, are approximately 300 residues long, and seem to be restricted to mycobacteria. These proteins contain a fibronectin-binding motif that allows mycobacteria to bind to fibronectin in the extracellular matrix.
Pssm-ID: 429334 Cd Length: 301 Bit Score: 46.46 E-value: 4.59e-05
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 137 APPSQGPPGPLSATSLQTPPRPPQPsilqPGSQVLPPPPTTLNGPGASPlplpmyrpdglsGPPPPNAQYQPPPLPgqtl 216
Cdd:pfam07174 39 ADPEPAPPPPSTATAPPAPPPPPPA----PAAPAPPPPPAAPNAPNAPP------------PPADPNAPPPPPADP---- 98
|
90
....*....|..
gi 968121920 217 GAGYPPQQAANS 228
Cdd:pfam07174 99 NAPPPPAVDPNA 110
|
|
| DUF3824 |
pfam12868 |
Domain of unknown function (DUF3824); This is a repeating domain found in fungal proteins. It ... |
172-256 |
4.92e-05 |
|
Domain of unknown function (DUF3824); This is a repeating domain found in fungal proteins. It is proline-rich, and the function is not known.
Pssm-ID: 372351 [Multi-domain] Cd Length: 145 Bit Score: 44.35 E-value: 4.92e-05
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 172 PPPPTtlnGPGASPLPlpmYRPDGLSGPPPPNAQYQPPPLPgqTLGAGYPPQQAANSGPQMAGAQLSYPGGFPGGPAQMA 251
Cdd:pfam12868 62 PPSPA---GPYASQGQ---YYPETNYFPPPPGSTPQPPVDP--QPNAPPPPYNPADYPPPPGAAPPPQPYQYPPPPGPDP 133
|
....*
gi 968121920 252 GPPQP 256
Cdd:pfam12868 134 YAPRP 138
|
|
| DUF3729 |
pfam12526 |
Protein of unknown function (DUF3729); This family of proteins is found in viruses. Proteins ... |
136-221 |
7.18e-05 |
|
Protein of unknown function (DUF3729); This family of proteins is found in viruses. Proteins in this family are typically between 145 and 1707 amino acids in length. The family is found in association with pfam01443, pfam01661, pfam05417, pfam01660, pfam00978. There is a single completely conserved residue L that may be functionally important.
Pssm-ID: 372164 [Multi-domain] Cd Length: 115 Bit Score: 43.14 E-value: 7.18e-05
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 136 MAPPSQGPPGPLSATSLQTPPRPPQPSilqPGSQVLPPPPTTLNGPGASPLPLPMYRPDGLSGPPppnaqyQPPPLPGQT 215
Cdd:pfam12526 39 PPPPVGDPRPPVVDTPPPVSAVWVLPP---PSEPAAPEPDLVPPVTGPAGPPSPLAPPAPAQKPP------LPPPRPQRR 109
|
....*.
gi 968121920 216 LGAGYP 221
Cdd:pfam12526 110 LLHTYP 115
|
|
| Med15 |
pfam09606 |
ARC105 or Med15 subunit of Mediator complex non-fungal; The approx. 70 residue Med15 domain of ... |
123-257 |
7.30e-05 |
|
ARC105 or Med15 subunit of Mediator complex non-fungal; The approximately 70-residue Med15 domain of the ARC-Mediator co-activator is a three-helix bundle with marked similarity to the KIX domain. The sterol regulatory element binding protein (SREBP) family of transcription activators uses the ARC105 subunit to activate target genes in the regulation of cholesterol and fatty acid homeostasis. In addition, Med15 is a critical transducer of gene activation signals that control early metazoan development.
Pssm-ID: 312941 [Multi-domain] Cd Length: 732 Bit Score: 46.93 E-value: 7.30e-05
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 123 QLSAMQINSYGSGMAPPSQGPPGPLSAtsLQTPP----RPPQPSILQPGSQvlPPPPTTLNGPG-ASPLPLPMYRPD--- 194
Cdd:pfam09606 59 QQQQPQGGQGNGGMGGGQQGMPDPINA--LQNLAgqgtRPQMMGPMGPGPG--GPMGQQMGGPGtASNLLASLGRPQmpm 134
|
90 100 110 120 130 140
....*....|....*....|....*....|....*....|....*....|....*....|....*....
gi 968121920 195 GLSGPPPPNAQYQPPPLPGQTLGAGYPPQQAANS------GPQMAGAQLSYPGGFPGGPAQMAGPPQPQ 257
Cdd:pfam09606 135 GGAGFPSQMSRVGRMQPGGQAGGMMQPSSGQPGSgtpnqmGPNGGPGQGQAGGMNGGQQGPMGGQMPPQ 203
|
|
| Med15 |
pfam09606 |
ARC105 or Med15 subunit of Mediator complex non-fungal; The approx. 70 residue Med15 domain of ... |
148-294 |
8.95e-05 |
|
ARC105 or Med15 subunit of Mediator complex non-fungal; The approximately 70-residue Med15 domain of the ARC-Mediator co-activator is a three-helix bundle with marked similarity to the KIX domain. The sterol regulatory element binding protein (SREBP) family of transcription activators uses the ARC105 subunit to activate target genes in the regulation of cholesterol and fatty acid homeostasis. In addition, Med15 is a critical transducer of gene activation signals that control early metazoan development.
Pssm-ID: 312941 [Multi-domain] Cd Length: 732 Bit Score: 46.54 E-value: 8.95e-05
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 148 SATSLQTPPRPPQPSILQPGSQVLPPPPTTL-NGPGASPLPlPMYRPDGLSGPPPPNAQYQPPPLPGQTLGAGYPPQQAA 226
Cdd:pfam09606 56 KAAQQQQPQGGQGNGGMGGGQQGMPDPINALqNLAGQGTRP-QMMGPMGPGPGGPMGQQMGGPGTASNLLASLGRPQMPM 134
|
90 100 110 120 130 140 150
....*....|....*....|....*....|....*....|....*....|....*....|....*....|..
gi 968121920 227 NSG----PQMAGAQLSYPGGFPGGPAQMAGPPQPQKKLDPDSIPSPIQViendrASRGGQVYATNTRGQIPP 294
Cdd:pfam09606 135 GGAgfpsQMSRVGRMQPGGQAGGMMQPSSGQPGSGTPNQMGPNGGPGQG-----QAGGMNGGQQGPMGGQMP 201
|
|
| PRK12323 |
PRK12323 |
DNA polymerase III subunit gamma/tau; |
94-268 |
1.34e-04 |
|
DNA polymerase III subunit gamma/tau;
Pssm-ID: 237057 [Multi-domain] Cd Length: 700 Bit Score: 46.02 E-value: 1.34e-04
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 94 ASSHAPYQPSAQSSYPGPISTSSVTQLGSQLSAMQINSYGSGMAPPSQGPPGPLSATSLQTPPRPPQPSILQPGSQVLPP 173
Cdd:PRK12323 400 AAPPAAPAAAPAAAAAARAVAAAPARRSPAPEALAAARQASARGPGGAPAPAPAPAAAPAAAARPAAAGPRPVAAAAAAA 479
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 174 PPttLNGPGASPLPLPmyrpdglSGPPPPNAQYQPPPLPGQTLGAGYPPQQAANSGPQMAGAQLSYPggFPGGPAQMAGP 253
Cdd:PRK12323 480 PA--RAAPAAAPAPAD-------DDPPPWEELPPEFASPAPAQPDAAPAGWVAESIPDPATADPDDA--FETLAPAPAAA 548
|
170
....*....|....*
gi 968121920 254 PQPQKKLDPDSIPSP 268
Cdd:PRK12323 549 PAPRAAAATEPVVAP 563
|
|
| PHA02682 |
PHA02682 |
ORF080 virion core protein; Provisional |
98-264 |
1.39e-04 |
|
ORF080 virion core protein; Provisional
Pssm-ID: 177464 [Multi-domain] Cd Length: 280 Bit Score: 44.85 E-value: 1.39e-04
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 98 APYQPSAQSSYPGPISTSSVTQLGSQL-SAMQINS----YGSGMAPPSQGPPGPLSATSLQTP-PRPPQPSILQPGSQVL 171
Cdd:PHA02682 36 APAAPCPPDADVDPLDKYSVKEAGRYYqSRLKANSacmqRPSGQSPLAPSPACAAPAPACPACaPAAPAPAVTCPAPAPA 115
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 172 PPPPTTLNGPGASPLPLPMyRPDglSGPPPPNAQYQP-PPLP-GQTLGAGYP---------PQQAANSGPQMAGAQLSYP 240
Cdd:PHA02682 116 CPPATAPTCPPPAVCPAPA-RPA--PACPPSTRQCPPaPPLPtPKPAPAAKPiflhnqlppPDYPAASCPTIETAPAASP 192
|
170 180
....*....|....*....|....
gi 968121920 241 ggfpggpaqMAGPPQPQKKLDPDS 264
Cdd:PHA02682 193 ---------VLEPRIPDKIIDADN 207
|
|
| PRK07764 |
PRK07764 |
DNA polymerase III subunits gamma and tau; Validated |
133-265 |
1.46e-04 |
|
DNA polymerase III subunits gamma and tau; Validated
Pssm-ID: 236090 [Multi-domain] Cd Length: 824 Bit Score: 45.75 E-value: 1.46e-04
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 133 GSGMAPPSQGPPGPLSATSLQTPPRPPQPsilqPGSQVLPPPPTTLNGPGASPLPLPMYRPDGLSGPPPPNAQYQPPPLP 212
Cdd:PRK07764 389 GGAGAPAAAAPSAAAAAPAAAPAPAAAAP----AAAAAPAPAAAPQPAPAPAPAPAPPSPAGNAPAGGAPSPPPAAAPSA 464
|
90 100 110 120 130
....*....|....*....|....*....|....*....|....*....|...
gi 968121920 213 GQTLGAGYPPQQAANSGPQMAGAQLSypggfPGGPAQMAGPPQPQKKLDPDSI 265
Cdd:PRK07764 465 QPAPAPAAAPEPTAAPAPAPPAAPAP-----AAAPAAPAAPAAPAGADDAATL 512
|
|
| DUF2076 |
pfam09849 |
Uncharacterized protein conserved in bacteria (DUF2076); This domain, found in various ... |
179-251 |
1.47e-04 |
|
Uncharacterized protein conserved in bacteria (DUF2076); This domain, found in various hypothetical prokaryotic proteins, has no known function. The domain, however, is found in various periplasmic ligand-binding sensor proteins.
Pssm-ID: 430876 Cd Length: 263 Bit Score: 44.73 E-value: 1.47e-04
10 20 30 40 50 60 70
....*....|....*....|....*....|....*....|....*....|....*....|....*....|...
gi 968121920 179 NGPGASPLPLPMYRPDGLSGPPPPNAQYQPPPlPGQtlGAGYPPQQAANSGPQMAGAQLSYPGGFPGGPAQMA 251
Cdd:pfam09849 97 GGSQSRPPPPPQARPAWPAGQAPGQPQPYPGQ-PGY--AQQGQPQYGQPAQPPRGPWGPGGGGGFLGGALQTA 166
|
|
| COG3416 |
COG3416 |
Uncharacterized conserved protein, DUF2076 domain [Function unknown]; |
117-251 |
1.89e-04 |
|
Uncharacterized conserved protein, DUF2076 domain [Function unknown];
Pssm-ID: 442642 [Multi-domain] Cd Length: 237 Bit Score: 44.24 E-value: 1.89e-04
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 117 VTQLGSQLSAMQinsygsgmAPPSQGPPGPLSATSLQTPPRPPQPSILQPGSQVLPPPPttlngpgasplplpmyrpdgl 196
Cdd:COG3416 64 IQELEAQLAQLQ--------QQQPQSSGGFLSGLFGGGQRPPPAPQPSQPGPQQQPAPP--------------------- 114
|
90 100 110 120 130
....*....|....*....|....*....|....*....|....*....|....*
gi 968121920 197 sgPPPPNAQYQPPPLPGQtlgagypPQQAAnsgPQMAGAQlsyPGGFPGGPAQMA 251
Cdd:COG3416 115 --SGPWGQAAPQQPGYGQ-------PQYGQ---PAAGPSG---GGGFLGGALQTA 154
|
|
| Drf_FH1 |
pfam06346 |
Formin Homology Region 1; This region is found in some of the Diaphanous related formins (Drfs) ... |
143-212 |
2.41e-04 |
|
Formin Homology Region 1; This region is found in some of the Diaphanous related formins (Drfs). It consists of low complexity repeats of around 12 residues.
Pssm-ID: 461881 [Multi-domain] Cd Length: 157 Bit Score: 42.55 E-value: 2.41e-04
10 20 30 40 50 60 70
....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 143 PPGPLSATSLQTPPRPPQPsilqpGSQVLPPPPTTLngPGASPLPLPMYRPDGLSGPPPPNAQYQPPPLP 212
Cdd:pfam06346 83 PPPPLPGGAGIPPPPPPLP-----GGAGVPPPPPPL--PGGPGIPPPPPFPGGPGIPPPPPGMGMPPPPP 145
|
|
| BimA_first |
NF040984 |
trimeric autotransporter actin-nucleating factor BimA; BimA (B. pseudomallei intracellular ... |
94-211 |
2.43e-04 |
|
trimeric autotransporter actin-nucleating factor BimA; BimA (B. pseudomallei intracellular motility protein A) is a trimeric autotransporter, homologous in its C-terminal half to a number of trimeric autotransporter adhesins. It is a virulence factor that nucleates actin, so that actin polymerization can drive escape by B. pseudomallei out of one cell and into a neighboring cell. HMM NF040983 describes a homolog with similar activity but substantial difference in sequence architecture in the N-terminal region.
Pssm-ID: 468914 [Multi-domain] Cd Length: 517 Bit Score: 44.86 E-value: 2.43e-04
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 94 ASSHAPYQPSaqssyPGPISTSSVTQLGSQLSAMQINSYGSgmappsqgPPGPLSATSLQTPPrpPQPSilqPGSQVLPP 173
Cdd:NF040984 6 SSSHAPDAPK-----PSSIATTLCRALASLSLGLSMDAEAN--------PPEPPGGTNIPVPP--PMPG---GGANIPVP 67
|
90 100 110
....*....|....*....|....*....|....*...
gi 968121920 174 PPTTLNGPGASPLPLPmyrPDGLSGPPPpnaqyQPPPL 211
Cdd:NF040984 68 PPMPGGGANIPPPPPP---PGGIGGATP-----SPPPL 97
|
|
| PBP1 |
COG5180 |
PAB1-binding protein, interacts with poly(A)-binding protein [RNA processing and modification]; ... |
99-276 |
2.62e-04 |
|
PAB1-binding protein, interacts with poly(A)-binding protein [RNA processing and modification];
Pssm-ID: 444064 [Multi-domain] Cd Length: 548 Bit Score: 45.05 E-value: 2.62e-04
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 99 PYQPSAQSSyPGPISTSSvtqlgSQLSAMQINSYGSGMAPPSQGPPGPLSATSLQTPPRPPQPSIL-----QPGSQVLPP 173
Cdd:COG5180 278 PGLPVLEAG-SEPQSDAP-----EAETARPIDVKGVASAPPATRPVRPPGGARDPGTPRPGQPTERpagvpEAASDAGQP 351
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 174 PPTTLNGPGASPL-PLPMYRPD------------GLSGPPPPNAQYQPPPLPGQTLGAGYPP----QQAANSGPQMAGAQ 236
Cdd:COG5180 352 PSAYPPAEEAVPGkPLEQGAPRpgssggdgapfqPPNGAPQPGLGRRGAPGPPMGAGDLVQAaldgGGRETASLGGAAGG 431
|
170 180 190 200
....*....|....*....|....*....|....*....|
gi 968121920 237 LSYPGGFPGGPAQMAGPPQPQKKLDPDSIPSPIQVIENDR 276
Cdd:COG5180 432 AGQGPKADFVPGDAESVSGPAGLADQAGAAASTAMADFVA 471
|
|
| PHA03379 |
PHA03379 |
EBNA-3A; Provisional |
98-333 |
3.69e-04 |
|
EBNA-3A; Provisional
Pssm-ID: 223066 [Multi-domain] Cd Length: 935 Bit Score: 44.66 E-value: 3.69e-04
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 98 APYQPSAQSSYPGPISTSSVTQLGSQlsamqinsygSGMAP--PSQGPPGPLSATSL--QTP-----PRP-PQPSILQPG 167
Cdd:PHA03379 434 ATSHGSAQVPEPPPVHDLEPGPLHDQ----------HSMAPcpVAQLPPGPLQDLEPgdQLPgvvqdGRPaCAPVPAPAG 503
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 168 SQVLPPPPTTLNGPGASPLPlpmYRPDGLSGP--PPPNAQYQPPPLPGQTLGAGYPPqqAANSGPQMAGAQLSYPGGFPg 245
Cdd:PHA03379 504 PIVRPWEASLSQVPGVAFAP---VMPQPMPVEpvPVPTVALERPVCPAPPLIAMQGP--GETSGIVRVRERWRPAPWTP- 577
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 246 gpaqmaGPPQPqkkldpdsiPSPIQVieNDRASRG---GQVYATNTRGQIPPLVTTDCMIQDQGNASPRFIRCTTYCFPC 322
Cdd:PHA03379 578 ------NPPRS---------PSQMSV--RDRLARLraeAQPYQASVEVQPPQLTQVSPQQPMEYPLEPEQQMFPGSPFSQ 640
|
250
....*....|.
gi 968121920 323 TSDMAKQAQIP 333
Cdd:PHA03379 641 VADVMRAGGVP 651
|
|
| PRK10263 |
PRK10263 |
DNA translocase FtsK; Provisional |
100-260 |
6.01e-04 |
|
DNA translocase FtsK; Provisional
Pssm-ID: 236669 [Multi-domain] Cd Length: 1355 Bit Score: 43.92 E-value: 6.01e-04
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 100 YQPSAQSsYPGPISTSSVTQLGSQLSAMQINSYG----SGMAPPS--------------QGPPGPLSATSLQTPPRPPQP 161
Cdd:PRK10263 680 YQHDVPV-NAEDADAAAEAELARQFAQTQQQRYSgeqpAGANPFSlddfefspmkalldDGPHEPLFTPIVEPVQQPQQP 758
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 162 SILQPGSQVL--PPPPTTLNGPGASPLPlPMYRPDGLSGPPPPNAQYQPPPLPGQTLGAGYPPQQAANSGPQMAGAQlsy 239
Cdd:PRK10263 759 VAPQQQYQQPqqPVAPQPQYQQPQQPVA-PQPQYQQPQQPVAPQPQYQQPQQPVAPQPQYQQPQQPVAPQPQYQQPQ--- 834
|
170 180
....*....|....*....|.
gi 968121920 240 pggfpggpAQMAgpPQPQKKL 260
Cdd:PRK10263 835 --------QPVA--PQPQDTL 845
|
|
| DUF3729 |
pfam12526 |
Protein of unknown function (DUF3729); This family of proteins is found in viruses. Proteins ... |
139-214 |
7.20e-04 |
|
Protein of unknown function (DUF3729); This family of proteins is found in viruses. Proteins in this family are typically between 145 and 1707 amino acids in length. The family is found in association with pfam01443, pfam01661, pfam05417, pfam01660, pfam00978. There is a single completely conserved residue L that may be functionally important.
Pssm-ID: 372164 [Multi-domain] Cd Length: 115 Bit Score: 40.45 E-value: 7.20e-04
10 20 30 40 50 60 70
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*.
gi 968121920 139 PSQGPPGPLSATSLQTPPRPPQPSILQPGSQVLPPPPTTLNGPGASPLPLPMYRPDGLSGPPPPNAQYQPPPLPGQ 214
Cdd:pfam12526 29 FSPPESAHPDPPPPVGDPRPPVVDTPPPVSAVWVLPPPSEPAAPEPDLVPPVTGPAGPPSPLAPPAPAQKPPLPPP 104
|
|
| Gag_spuma |
pfam03276 |
Spumavirus gag protein; |
134-272 |
7.80e-04 |
|
Spumavirus gag protein;
Pssm-ID: 460872 [Multi-domain] Cd Length: 614 Bit Score: 43.58 E-value: 7.80e-04
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 134 SGMAPPSQGPPGPLSATSlqtppRPPQPSILQPGSQVLPPPPTTLNGPGASPLPLPMYRPDGLSGPPPPNAQYQPPPLPG 213
Cdd:pfam03276 176 AEISPGAQGGIPPGASFS-----GLPSLPAIGGIHLPAIPGIHARAPPGNIARSLGDDIMPSLGDAGMPQPRFAFHPGNP 250
|
90 100 110 120 130 140
....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 214 QTLGAGYPPQQAANSGPQmagAQLSYPG-GFPGGPAQMAGPPQPQKKLDPDSIPSPIQVI 272
Cdd:pfam03276 251 FAEAEGHPFAEAEGERPR---DIPRAPRiDAPSAPAIPAIQPIAPPMIPPIGAPIPIPHG 307
|
|
| Pro-rich |
pfam15240 |
Proline-rich protein; This family includes several eukaryotic proline-rich proteins. |
160-262 |
8.98e-04 |
|
Proline-rich protein; This family includes several eukaryotic proline-rich proteins.
Pssm-ID: 464580 [Multi-domain] Cd Length: 167 Bit Score: 41.18 E-value: 8.98e-04
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 160 QPSILQPGSQVLPPPPTTLNGPGASPLPLPMYRPDGLSGPPPPNAQYQPPPLPGQTLGAGYPPQQAANSGPqmaGAQLSY 239
Cdd:pfam15240 28 SPSLISEEEGQSQQGGQGPQGPPPGGFPPQPPASDDPPGPPPPGGPQQPPPQGGKQKPQGPPPQGGPRPPP---GKPQGP 104
|
90 100
....*....|....*....|...
gi 968121920 240 PggfPGGPAQMAGPPQPQKKLDP 262
Cdd:pfam15240 105 P---PQGGNQQQGPPPPGKPQGP 124
|
|
| PABP-1234 |
TIGR01628 |
polyadenylate binding protein, human types 1, 2, 3, 4 family; These eukaryotic proteins ... |
153-260 |
9.21e-04 |
|
polyadenylate binding protein, human types 1, 2, 3, 4 family; These eukaryotic proteins recognize the poly-A of mRNA and consist of four tandem RNA recognition domains at the N-terminus (rrm: pfam00076) followed by a PABP-specific domain (pfam00658) at the C-terminus. The protein is involved in the transport of mRNAs from the nucleus to the cytoplasm. Homo sapiens has four paralogs, which are respectively expressed in testis, expressed in platelets, broadly expressed, and of unknown tissue range.
Pssm-ID: 130689 [Multi-domain] Cd Length: 562 Bit Score: 43.26 E-value: 9.21e-04
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 153 QTPPRPPQPSILQP--GSQVLPP----------PPTTLNGPGASPLPLPMY--RPDGLSGPPPPNAQYQPP----PLPGQ 214
Cdd:TIGR01628 377 QLQPRMRQLPMGSPmgGAMGQPPyygqgpqqqfNGQPLGWPRMSMMPTPMGpgGPLRPNGLAPMNAVRAPSrnaqNAAQK 456
|
90 100 110 120 130
....*....|....*....|....*....|....*....|....*....|..
gi 968121920 215 TLGAGYPPQQAANSGPQMAGAQLSYPGGFPGGP----AQM--AGPPQPQKKL 260
Cdd:TIGR01628 457 PPMQPVMYPPNYQSLPLSQDLPQPQSTASQGGQnkklAQVlaSATPQMQKQV 508
|
|
| SP6_N |
cd22544 |
N-terminal domain of transcription factor Specificity Protein (SP) 6; Specificity Proteins ... |
114-254 |
9.48e-04 |
|
N-terminal domain of transcription factor Specificity Protein (SP) 6; Specificity Proteins (SPs) are transcription factors that are involved in many cellular processes, including cell differentiation, cell growth, apoptosis, immune responses, response to DNA damage, and chromatin remodeling. SP6, also known as epiprofin, shows a specific expression pattern in hair follicles and the apical ectodermal ridge (AER) of the developing limbs. SP6 null mice are nude and show defects in skin, teeth, limbs (syndactyly and oligodactyly), and lung alveoli. SP6 belongs to a family of proteins, called the SP/Kruppel or Krueppel-like Factor (KLF) family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide, R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. SP factors preferentially bind GC boxes, while KLFs bind CACCC boxes. Another characteristic hallmark of SP factors is the presence of the Buttonhead (BTD) box CXCPXC, just N-terminal to the zinc fingers. The function of the BTD box is unknown, but it is thought to play an important physiological role. Another feature of most SP factors is the presence of a conserved amino acid stretch, the so-called SP box, located close to the N-terminus. SP factors may be separated into three groups based on their domain architecture and the similarity of their N-terminal transactivation domains: SP1-4, SP5, and SP6-9. The transactivation domains between the three groups are not homologous to one another. This model represents the N-terminal domain of SP6.
Pssm-ID: 411693 [Multi-domain] Cd Length: 245 Bit Score: 42.22 E-value: 9.48e-04
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 114 TSSVTQLGSQLSAMQINSYGSGMAPPSQ-----------GPPGPLSATSLQTPPRPPQPSILQPGSQVlPPPPTTLNGPG 182
Cdd:cd22544 3 TAVCGSLGNQHSETPRASPPTLDLQPLQpyqihsspeagDYPSPLQPTELQSLPLGPGVDFSARESYE-PHSSRRTCLDL 81
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 183 ASPLPLPMYRPdgLSGPPPPNAQ-YQP---PPLPGQTLGAG----------------YPPQQAANSGPQMAGAQLSYPGG 242
Cdd:cd22544 82 ESDLPLGPFPK--LLHPPPDMAHpYESwfrPPHPGGSGEEGgvpswwdlhagsswmdLQHGQGGLQSPGPPGGLQPPLGG 159
|
170
....*....|..
gi 968121920 243 FpGGPAQMAGPP 254
Cdd:cd22544 160 Y-GSEHQLCGPP 170
|
|
| PRK14951 |
PRK14951 |
DNA polymerase III subunits gamma and tau; Provisional |
93-217 |
1.08e-03 |
|
DNA polymerase III subunits gamma and tau; Provisional
Pssm-ID: 237865 [Multi-domain] Cd Length: 618 Bit Score: 42.78 E-value: 1.08e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 93 VASSHAPYQPSAQSSYPGPIStssVTQLGSQLSAMQINSYGSGMAPPSQGPPGPLSATSLQTPPRPPQPsilqpGSQVLP 172
Cdd:PRK14951 375 PAEKKTPARPEAAAPAAAPVA---QAAAAPAPAAAPAAAASAPAAPPAAAPPAPVAAPAAAAPAAAPAA-----APAAVA 446
|
90 100 110 120
....*....|....*....|....*....|....*....|....*...
gi 968121920 173 PPPTTLNGPGASPLPLPMY---RPDGLSGPPPPNAQYQPPPLPGQTLG 217
Cdd:PRK14951 447 LAPAPPAQAAPETVAIPVRvapEPAVASAAPAPAAAPAAARLTPTEEG 494
|
|
| PHA03307 |
PHA03307 |
transcriptional regulator ICP4; Provisional |
94-269 |
1.28e-03 |
|
transcriptional regulator ICP4; Provisional
Pssm-ID: 223039 [Multi-domain] Cd Length: 1352 Bit Score: 42.85 E-value: 1.28e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 94 ASSHAPYQPSAQSSYPGPISTSSVTQL---GSQLSAMQINSYGSGMAPPSQGPPGPLSATSLQT-PPRPPQPSILQPgsq 169
Cdd:PHA03307 101 AREGSPTPPGPSSPDPPPPTPPPASPPpspAPDLSEMLRPVGSPGPPPAASPPAAGASPAAVASdAASSRQAALPLS--- 177
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 170 vLPPPPTTLNGPGASPLPLpmyRPDGLSGPPPPNAQYQPPPLPGQTLGAGYPPQQAANSGPQMAGAQLSYPGGFPGGPAQ 249
Cdd:PHA03307 178 -SPEETARAPSSPPAEPPP---STPPAAASPRPPRRSSPISASASSPAPAPGRSAADDAGASSSDSSSSESSGCGWGPEN 253
|
170 180
....*....|....*....|
gi 968121920 250 MAGPPQPQKKLDPDSIPSPI 269
Cdd:PHA03307 254 ECPLPRPAPITLPTRIWEAS 273
|
|
| BimA_second |
NF040983 |
trimeric autotransporter actin-nucleating factor BimA; This HMM describes BimA (Burkholderia ... |
139-247 |
1.35e-03 |
|
trimeric autotransporter actin-nucleating factor BimA; This HMM describes BimA (Burkholderia intracellular motility A), WP_004266405.1-like proteins in Burkholderia mallei or B. pseudomallei. The term BimA has also been used for WP_011205626.1-like homologs that have a very different N-terminal half.
Pssm-ID: 468913 [Multi-domain] Cd Length: 382 Bit Score: 42.20 E-value: 1.35e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 139 PSQGPPGPlsatslqtPPRPPQPSILQPGSQVLPPPPttlngPGASPLPLPmyrpdglsgPPPPNAQYQPPPlPGQTLGA 218
Cdd:NF040983 86 PNKVPPPP--------PPPPPPPPPPPTPPPPPPPPP-----PPPPPSPPP---------PPPPSPPPSPPP-PTTTPPT 142
|
90 100
....*....|....*....|....*....
gi 968121920 219 GYPPQQAANSgPQMAGAQlsyPGGFPGGP 247
Cdd:NF040983 143 RTTPSTTTPT-PSMHPIQ---PTQLPSIP 167
|
|
| PRK14971 |
PRK14971 |
DNA polymerase III subunit gamma/tau; |
121-226 |
1.51e-03 |
|
DNA polymerase III subunit gamma/tau;
Pssm-ID: 237874 [Multi-domain] Cd Length: 614 Bit Score: 42.46 E-value: 1.51e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 121 GSQLSAMQINSYGSGMAPPSQGPPGPLSATSLQTPPRPPQPSILQPGSQVLP-PPPTTLNGPGASPLPLPMYRPDglSGP 199
Cdd:PRK14971 371 GGRGPKQHIKPVFTQPAAAPQPSAAAAASPSPSQSSAAAQPSAPQSATQPAGtPPTVSVDPPAAVPVNPPSTAPQ--AVR 448
|
90 100 110
....*....|....*....|....*....|...
gi 968121920 200 PPPNAQYQPPPLPGQ------TLGAGYPPQQAA 226
Cdd:PRK14971 449 PAQFKEEKKIPVSKVsslgpsTLRPIQEKAEQA 481
|
|
| PRK07764 |
PRK07764 |
DNA polymerase III subunits gamma and tau; Validated |
132-257 |
1.79e-03 |
|
DNA polymerase III subunits gamma and tau; Validated
Pssm-ID: 236090 [Multi-domain] Cd Length: 824 Bit Score: 42.28 E-value: 1.79e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 132 YGSGMAPPSQGPPGPLSATslqTPPRPPQPsilqpgsqvlPPPPTTLNGPGASPLPLPMYRPDglSGPPPPNAQYQPPPL 211
Cdd:PRK07764 387 VAGGAGAPAAAAPSAAAAA---PAAAPAPA----------AAAPAAAAAPAPAAAPQPAPAPA--PAPAPPSPAGNAPAG 451
|
90 100 110 120
....*....|....*....|....*....|....*....|....*.
gi 968121920 212 PGQTLGAGypPQQAANSGPQMAGAQLSYPGGFPGGPAQMAGPPQPQ 257
Cdd:PRK07764 452 GAPSPPPA--AAPSAQPAPAPAAAPEPTAAPAPAPPAAPAPAAAPA 495
|
|
| Prog_receptor |
pfam02161 |
Progesterone receptor; |
99-200 |
1.83e-03 |
|
Progesterone receptor;
Pssm-ID: 460470 Cd Length: 564 Bit Score: 42.22 E-value: 1.83e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 99 PYQPSAQSSYPGPIS------TSSVTQLGSQLSAMQINSYGSGMAPPSQGPPGPLsatslqtPPRPPQPSILQPGSQVLP 172
Cdd:pfam02161 424 PLPPRAPSSRPGEAAvaaapaSASVSSASSSGSTLECILYKAEGAPPQQGPFAPP-------PCKPPGAGACLLPRDGLP 496
|
90 100
....*....|....*....|....*...
gi 968121920 173 PPPTTLNGPGASPlplPMYRPDGLSGPP 200
Cdd:pfam02161 497 STSASAAAAGAAP---ALYPPLGLNGLP 521
|
|
| PHA03307 |
PHA03307 |
transcriptional regulator ICP4; Provisional |
132-268 |
2.04e-03 |
|
transcriptional regulator ICP4; Provisional
Pssm-ID: 223039 [Multi-domain] Cd Length: 1352 Bit Score: 42.47 E-value: 2.04e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 132 YGS-GMAPPSQGP---PGPLSATSLQTPPRPPQ-----PSILQPGSQVLPPPPTTLNGPGASPLPLPMYRPDGLSGPPPP 202
Cdd:PHA03307 37 SGSqGQLVSDSAElaaVTVVAGAAACDRFEPPTgpppgPGTEAPANESRSTPTWSLSTLAPASPAREGSPTPPGPSSPDP 116
|
90 100 110 120 130 140 150
....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 203 NAQYQPPPLPGQTLGAGYPPQQAANSGPQMAGAQLSYPGGFPGGPAQMAGPPQPQKKL----DPDSIPSP 268
Cdd:PHA03307 117 PPPTPPPASPPPSPAPDLSEMLRPVGSPGPPPAASPPAAGASPAAVASDAASSRQAALplssPEETARAP 186
|
|
| PRK13729 |
PRK13729 |
conjugal transfer pilus assembly protein TraB; Provisional |
181-257 |
2.41e-03 |
|
conjugal transfer pilus assembly protein TraB; Provisional
Pssm-ID: 184281 [Multi-domain] Cd Length: 475 Bit Score: 41.73 E-value: 2.41e-03
10 20 30 40 50 60 70
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*...
gi 968121920 181 PGASP-LPLPMYRPDGLSGPPPPNAQYQPPPLPgqtlgAGYPPQQAANSGPQMAgaqlSYPGGFPGGPAQMAGPPQPQ 257
Cdd:PRK13729 123 LGANPvTATGEPVPQMPASPPGPEGEPQPGNTP-----VSFPPQGSVAVPPPTA----FYPGNGVTPPPQVTYQSVPV 191
|
|
| PHA03377 |
PHA03377 |
EBNA-3C; Provisional |
101-271 |
2.83e-03 |
|
EBNA-3C; Provisional
Pssm-ID: 177614 [Multi-domain] Cd Length: 1000 Bit Score: 41.58 E-value: 2.83e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 101 QPSAQSSYPGPISTSSVTQLGSQLSAMQINSYGSGMAPPSqgPPGPLSATSLQTPPRPPQPSILQPGSQVLPPPPTTLNG 180
Cdd:PHA03377 671 QPATQSTPPRPSWLPSVFVLPSVDAGRAQPSEESHLSSMS--PTQPISHEEQPRYEDPDDPLDLSLHPDQAPPPSHQAPY 748
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 181 PGASPLPLPMYRPDGLSGPPPPNAQY---QPPPLPGQTLgAGYPpqqaANSGPQMAGAQL-----------SYPG-GFPG 245
Cdd:PHA03377 749 SGHEEPQAQQAPYPGYWEPRPPQAPYlgyQEPQAQGVQV-SSYP----GYAGPWGLRAQHpryrhswaywsQYPGhGHPQ 823
|
170 180
....*....|....*....|....*.
gi 968121920 246 GPAQMAgPPQPQKKLDPDSIPSPIQV 271
Cdd:PHA03377 824 GPWAPR-PPHLPPQWDGSAGHGQDQV 848
|
|
| PPE |
COG5651 |
PPE-repeat protein [Function unknown]; |
70-254 |
2.87e-03 |
|
PPE-repeat protein [Function unknown];
Pssm-ID: 444372 [Multi-domain] Cd Length: 385 Bit Score: 41.42 E-value: 2.87e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 70 GQNGAHATGhPPQRFPGPPPVNNVASSHAPYQPSAQSSYPGPISTSS-VTQLGSQLSAMQINSYGSGMAPPSQGPPGPLS 148
Cdd:COG5651 179 GLLGAQNAG-SGNTSSNPGFANLGLTGLNQVGIGGLNSGSGPIGLNSgPGNTGFAGTGAAAGAAAAAAAAAAAAGAGASA 257
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 149 AtslqtpprpPQPSILQPGSQVLPPPPTTLNGPGASPLPLPmyrPDGLSGPPPPNAQYQPPPLPGQTLGAGYPPQQAANS 228
Cdd:COG5651 258 A---------LASLAATLLNASSLGLAATAASSAATNLGLA---GSPLGLAGGGAGAAAATGLGLGAGGAAGAAGATGAG 325
|
170 180
....*....|....*....|....*.
gi 968121920 229 GPQMAGAQLSYPGGFPGGPAQMAGPP 254
Cdd:COG5651 326 AALGAGAAAAAAGAAAGAGAAAAAAA 351
|
|
| PHA03247 |
PHA03247 |
large tegument protein UL36; Provisional |
137-226 |
3.97e-03 |
|
large tegument protein UL36; Provisional
Pssm-ID: 223021 [Multi-domain] Cd Length: 3151 Bit Score: 41.46 E-value: 3.97e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 137 APPSQGPPGPLSATSLQTPPRPPQPSIlqpgsqvlPPPPTTLNGPGASPL----PLPMYRPDGLSGPPPPNAQYQPPPLP 212
Cdd:PHA03247 388 ARHAATPFARGPGGDDQTRPAAPVPAS--------VPTPAPTPVPASAPPppatPLPSAEPGSDDGPAPPPERQPPAPAT 459
|
90
....*....|....
gi 968121920 213 GQTLGAGYPPQQAA 226
Cdd:PHA03247 460 EPAPDDPDDATRKA 473
|
|
| Drf_FH1 |
pfam06346 |
Formin Homology Region 1; This region is found in some of the Diaphanous related formins (Drfs) ... |
143-254 |
4.52e-03 |
|
Formin Homology Region 1; This region is found in some of the Diaphanous-related formins (Drfs). It consists of low-complexity repeats of around 12 residues.
Pssm-ID: 461881 [Multi-domain] Cd Length: 157 Bit Score: 39.08 E-value: 4.52e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 143 PPGPLSATSLQTPPRPPQPSIlqpgsqvlpPPPTTLNGPGASPLPLPMYRPDGLSGPPP-PNAQY--QPPPLPGQTLGAG 219
Cdd:pfam06346 1 PPPPPLPGDSSTIPLPPGACI---------PTPPPLPGGGGPPPPPPLPGSAAIPPPPPlPGGTSipPPPPLPGAASIPP 71
|
90 100 110
....*....|....*....|....*....|....*
gi 968121920 220 YPPQQAANSGPQmagaqlsyPGGFPGGPAQMAGPP 254
Cdd:pfam06346 72 PPPLPGSTGIPP--------PPPLPGGAGIPPPPP 98
|
|
| SAV_2336_NTERM |
NF041121 |
SAV_2336 family N-terminal domain; This HMM describes an N-terminal domain shared by SAV_2336 ... |
138-205 |
5.41e-03 |
|
SAV_2336 family N-terminal domain; This HMM describes an N-terminal domain shared by SAV_2336 (BAC70047.1), whose C-terminal region suggests restriction enzyme activity (PMID: 18456708), and by other proteins with unrelated C-terminal regions. A member protein was also identified in a kanamycin biosynthetic gene cluster (PMID: 16766657), while N-terminal regions of two other member proteins were named Trypco1 in a bioinformatic study (PMID: 32101166) of predicted bacterial conflict systems.
Pssm-ID: 469044 [Multi-domain] Cd Length: 473 Bit Score: 40.37 E-value: 5.41e-03
10 20 30 40 50 60
....*....|....*....|....*....|....*....|....*....|....*....|....*...
gi 968121920 138 PPSQGPPGPLSATSLQTPPRPPQPSILQPGSQVLPPPPTTLNGPGASPLPLPMYRPDglsGPPPPNAQ 205
Cdd:NF041121 39 PPPAAPPSPPGDPPEPPAPEPAPLPAPYPGSLAPPPPPPPGPAGAAPGAALPVRVPA---PPALPNPL 103
|
|
| Med15 |
pfam09606 |
ARC105 or Med15 subunit of Mediator complex non-fungal; The approx. 70 residue Med15 domain of ... |
3-231 |
7.12e-03 |
|
ARC105 or Med15 subunit of Mediator complex non-fungal; The approximately 70-residue Med15 domain of the ARC-Mediator co-activator is a three-helix bundle with marked similarity to the KIX domain. The sterol regulatory element binding protein (SREBP) family of transcription activators uses the ARC105 subunit to activate target genes in the regulation of cholesterol and fatty acid homeostasis. In addition, Med15 is a critical transducer of gene activation signals that control early metazoan development.
Pssm-ID: 312941 [Multi-domain] Cd Length: 732 Bit Score: 40.38 E-value: 7.12e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 3 QQGYVATPPYSQPQPGIGLSPPHYGHYGDPShtASPTGMmkPAGPLGATATR-----------GMLPPGPPPPGPHQFGQ 71
Cdd:pfam09606 243 MQQQQPQQQGQQSQLGMGINQMQQMPQGVGG--GAGQGG--PGQPMGPPGQQpgampnvmsigDQNNYQQQQTRQQQQQQ 318
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 72 NGAHATGHPPQRFPGPPPVNNVASshAPYQPSAQSSYPGPISTSSVTQLGSQLSAMqinsygsgMAPPSQGPPGPL-SAT 150
Cdd:pfam09606 319 GGNHPAAHQQQMNQSVGQGGQVVA--LGGLNHLETWNPGNFGGLGANPMQRGQPGM--------MSSPSPVPGQQVrQVT 388
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 151 SLQTPPRPPQPSILQPGSQVLPPPPTTLNGPGASPLPLPMYRPDGLSGPPPPNAQYQPPplPGQTLGAgyPPQQAANS-- 228
Cdd:pfam09606 389 PNQFMRQSPQPSVPSPQGPGSQPPQSHPGGMIPSPALIPSPSPQMSQQPAQQRTIGQDS--PGGSLNT--PGQSAVNSpl 464
|
...
gi 968121920 229 GPQ 231
Cdd:pfam09606 465 NPQ 467
|
|
| SAV_2336_NTERM |
NF041121 |
SAV_2336 family N-terminal domain; This HMM describes an N-terminal domain shared by SAV_2336 ... |
137-214 |
7.50e-03 |
|
SAV_2336 family N-terminal domain; This HMM describes an N-terminal domain shared by SAV_2336 (BAC70047.1), whose C-terminal region suggests restriction enzyme activity (PMID: 18456708), and by other proteins with unrelated C-terminal regions. A member protein was also identified in a kanamycin biosynthetic gene cluster (PMID: 16766657), while N-terminal regions of two other member proteins were named Trypco1 in a bioinformatic study (PMID: 32101166) of predicted bacterial conflict systems.
Pssm-ID: 469044 [Multi-domain] Cd Length: 473 Bit Score: 39.99 E-value: 7.50e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 137 APPSQGPPGPLSATSLQTPPRPPQPSiLQPGSQVLPPPPTTLNGPGASP--LPLPMYRPDGLSGPPPPNAQ----YQPPP 210
Cdd:NF041121 20 APPSPEGPAPTAASQPATPPPPAAPP-SPPGDPPEPPAPEPAPLPAPYPgsLAPPPPPPPGPAGAAPGAALpvrvPAPPA 98
|
....
gi 968121920 211 LPGQ 214
Cdd:NF041121 99 LPNP 102
|
|
| PHA03201 |
PHA03201 |
uracil DNA glycosylase; Provisional |
156-225 |
7.99e-03 |
|
uracil DNA glycosylase; Provisional
Pssm-ID: 165468 Cd Length: 318 Bit Score: 39.49 E-value: 7.99e-03
10 20 30 40 50 60 70
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*..
gi 968121920 156 PRPPQPSilqPGSQVLPPPPTTLNGPGASPLPLPMYRPDGLSGPPPPNAQYQPPPL-------PGQTLGAGYPPQQA 225
Cdd:PHA03201 4 ARSRSPS---PPRRPSPPRPTPPRSPDASPEETPPSPPGPGAEPPPGRAAGPAAPRrrprgcpAGVTFSSSAPPRPP 77
|
|
| PHA03379 |
PHA03379 |
EBNA-3A; Provisional |
94-279 |
8.67e-03 |
|
EBNA-3A; Provisional
Pssm-ID: 223066 [Multi-domain] Cd Length: 935 Bit Score: 40.04 E-value: 8.67e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 94 ASSHAPYQ----PSAQSSYPGPISTSS------VTQL----------GSQLSAmqINSYGSGMAPPSQGPPGPL----SA 149
Cdd:PHA03379 434 ATSHGSAQvpepPPVHDLEPGPLHDQHsmapcpVAQLppgplqdlepGDQLPG--VVQDGRPACAPVPAPAGPIvrpwEA 511
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920 150 TSLQTPPRPPQPSILQPGSQVLPPPPTTLNGPGASPLP--LPMYRPDGLSG---------PPP--PNAQYQPPPLPGQT- 215
Cdd:PHA03379 512 SLSQVPGVAFAPVMPQPMPVEPVPVPTVALERPVCPAPplIAMQGPGETSGivrvrerwrPAPwtPNPPRSPSQMSVRDr 591
|
170 180 190 200 210 220
....*....|....*....|....*....|....*....|....*....|....*....|....
gi 968121920 216 LGAGYPPQQAANSGPQMAGAQLsyPGGFPGGPaqMAGPPQPQKKLDPDSIPSpiQVIENDRASR 279
Cdd:PHA03379 592 LARLRAEAQPYQASVEVQPPQL--TQVSPQQP--MEYPLEPEQQMFPGSPFS--QVADVMRAGG 649
|
|
|