Conserved domains on  [gi|968121920|ref|NP_001304995|]

protein transport protein Sec24D isoform 2 [Homo sapiens]

Protein Classification

SEC24 family transport protein (domain architecture ID 1001573)

SEC24 family transport protein is a component of the coat protein complex II (COPII), which promotes the formation of transport vesicles from the endoplasmic reticulum (ER).


List of domain hits

Name Accession Description Interval E-value
COG5028 super family cl34873
Vesicle coat complex COPII, subunit SEC24/subunit SFB2/subunit SFB3 [Intracellular trafficking ...
182-1028 4.46e-167

Vesicle coat complex COPII, subunit SEC24/subunit SFB2/subunit SFB3 [Intracellular trafficking and secretion];


The actual alignment was detected with superfamily member COG5028:

Pssm-ID: 227361 [Multi-domain]  Cd Length: 861  Bit Score: 512.80  E-value: 4.46e-167
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920  182 GASPLPLPMYRPDG--LSGPPPPNAQYQPpplpgQTLGAGYPPQQAANSGPQMAGAQL---SYPGGF----PGGPAQMAG 252
Cdd:COG5028     7 GVYPQAQSQVHTGAasSKKSARPHRAYAN-----FSAGQMGMPPYTTPPLQQQSRRQIdqaATAMHNtganNPAPSVMSP 81
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920  253 PPQPQKK-------------LDPDSIPSPIQVIENDRASRGGQVYATnTRGQIPPLvTTDCMIQDQGNASPRFIRCTTYC 319
Cdd:COG5028    82 AFQSQQKfsspyggsmadgtAPKPTNPLVPVDLFEDQPPPISDLFLP-PPPIVPPL-TTNFVGSEQSNCSPKYVRSTMYA 159
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920  320 FPCTSDMAKQAQIPLAAVIKPFATIPSNESPLYLVNHGEsgPVRCNRCKAYMCPFMQFIEGGRRYQCGFCNCVNDVPPFY 399
Cdd:COG5028   160 IPETNDLLKKSKIPFGLVIRPFLELYPEEDPVPLVEDGS--IVRCRRCRSYINPFVQFIEQGRKWRCNICRSKNDVPEGF 237
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920  400 FQHLDHIGRRLDHYEKPELSLGSYEYVATLDYcrKSKPPNPPAFIFMIDVSYSNIKNGLVKLICEELKTMLEKIPKEEQE 479
Cdd:COG5028   238 DNPSGPNDPRSDRYSRPELKSGVVDFLAPKEY--SLRQPPPPVYVFLIDVSFEAIKNGLVKAAIRAILENLDQIPNFDPR 315
                         330       340       350       360       370       380       390       400
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920  480 etsaIRVGFITYNKVLHFFNVKSNLaQPQMMVVTDVGEVFVPLLDG-FLVNYQESQSVIHNLLDQIPDMFADSNENETVF 558
Cdd:COG5028   316 ----TKIAIICFDSSLHFFKLSPDL-DEQMLIVSDLDEPFLPFPSGlFVLPLKSCKQIIETLLDRVPRIFQDNKSPKNAL 390
                         410       420       430       440       450       460       470       480
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920  559 APVIQAGMEALKAadCPGKLFIFHSSLPTAeAPGKLKNRDDKklvntdkEKILFQPQTNVYDSLAKDCVAHGCSVTLFLF 638
Cdd:COG5028   391 GPALKAAKSLIGG--TGGKIIVFLSTLPNM-GIGKLQLREDK-------ESSLLSCKDSFYKEFAIECSKVGISVDLFLT 460
                         490       500       510       520       530       540       550       560
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920  639 PSQYVDVASLGLVPQLTGGTLYKYNNFQM--HLDRQQFLNDLRNDIEKKIGFDAIMRVRTSTGFRATDFFGGILMNNTTD 716
Cdd:COG5028   461 SEDYIDVATLSHLCRYTGGQTYFYPNFSAtrPNDATKLANDLVSHLSMEIGYEAVMRVRCSTGLRVSSFYGNFFNRSSDL 540
                         570       580       590       600       610       620       630       640
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920  717 VEMAAIDCDKAVTVEFKHDDKLSeDSGALIQCAVLYTTISGQRRLRIHNLGLNCSSQLADLYKSCETDALINFFAKSAFK 796
Cdd:COG5028   541 CAFSTMPRDTSLLVEFSIDEKLM-TSDVYFQVALLYTLNDGERRIRVVNLSLPTSSSIREVYASADQLAIACILAKKAST 619
                         650       660       670       680       690       700       710       720
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920  797 AVLHQPLKVIREILVNQTAHMLACYRKNCASPSAASQLILPDSMKVLPVYMNCLLKNcVLLSRPEISTDERAYQRQLVMT 876
Cdd:COG5028   620 KALNSSLKEARVLINKSMVDILKAYKKELVKSNTSTQLPLPANLKLLPLLMLALLKS-SAFRSGSTPSDIRISALNRLTS 698
                         730       740       750       760       770       780       790       800
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920  877 MGVADSQLFFYPQLLPIHTL-------DVKSTMLPAAVRCSESRLSEEGIFLLANGLHMFLWLGVSSPPELIQGIFNVPS 949
Cdd:COG5028   699 LPLKQLMRNIYPTLYALHDMpieaglpDEGLLVLPSPINATSSLLESGGLYLIDTGQKIFLWFGKDAVPSLLQDLFGVDS 778
                         810       820       830       840       850       860       870       880
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920  950 FAHINTDMTLLPEVGNPYSQQLRMIMGIIQQKRPYS-MKLTIVKQREQP--EMVFRQFLVEDKgLYGGSSYVDFLCCVHK 1026
Cdd:COG5028   779 LSDIPSGKFTLPPTGNEFNERVRNIIGELRSVNDDStLPLVLVRGGGDPslRLWFFSTLVEDK-TLNIPSYLDYLQILHE 857

                  ..
gi 968121920 1027 EI 1028
Cdd:COG5028   858 KI 859
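
Each alignment on this page is preceded by a summary line giving the Pssm-ID, the model length (Cd Length), the bit score and the E-value. As a minimal sketch, such a line could be parsed in Python as shown here. The function name `parse_summary` and the regular expression are illustrative assumptions, not part of any NCBI tool, and the pattern is only known to fit the summary lines as they appear in this text dump.

```python
import re

# Hypothetical pattern for the "Pssm-ID: ..." summary line shown on this page.
SUMMARY_RE = re.compile(
    r"Pssm-ID:\s*(?P<pssm_id>\d+)\s*"
    r"(?:\[(?P<kind>[^\]]+)\])?\s*"
    r"Cd Length:\s*(?P<cd_length>\d+)\s*"
    r"Bit Score:\s*(?P<bit_score>[\d.]+)\s*"
    r"E-value:\s*(?P<e_value>[\d.eE+-]+)"
)


def parse_summary(line):
    """Return the fields of one summary line as a dict, or None if no match."""
    match = SUMMARY_RE.search(line)
    if match is None:
        return None
    return {
        "pssm_id": int(match.group("pssm_id")),
        "kind": match.group("kind"),  # e.g. "Multi-domain"; None if absent
        "cd_length": int(match.group("cd_length")),
        "bit_score": float(match.group("bit_score")),
        "e_value": float(match.group("e_value")),
    }


# Summary line copied from the COG5028 hit above.
record = parse_summary(
    "Pssm-ID: 227361 [Multi-domain]  Cd Length: 861  "
    "Bit Score: 512.80  E-value: 4.46e-167"
)
```

Here `record["cd_length"]` is 861 and `record["e_value"]` is the float 4.46e-167, which is small but still representable as a double.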
PHA03247 super family cl33720
large tegument protein UL36; Provisional
8-294 2.05e-10

large tegument protein UL36; Provisional


The actual alignment was detected with superfamily member PHA03247:

Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 65.34  E-value: 2.05e-10
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920    8 ATPPYSQPQPGIGLSPPHYGhygDPSHTASPTGMMKPAGPlGATATRGMLPPGPPPPGPhqfgqnGAHATGHPPQRFPGP 87
Cdd:PHA03247 2718 ATPLPPGPAAARQASPALPA---APAPPAVPAGPATPGGP-ARPARPPTTAGPPAPAPP------AAPAAGPPRRLTRPA 2787
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920   88 PPVNNVASSHAPyQPSAQSSYPGPISTSSVTQLGSQLSAmqinsygSGMAPPSQGPPgplsaTSLQTPPRPPQPSiLQPG 167
Cdd:PHA03247 2788 VASLSESRESLP-SPWDPADPPAAVLAPAAALPPAASPA-------GPLPPPTSAQP-----TAPPPPPGPPPPS-LPLG 2853
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920  168 SQVLPPPPTTLNGPGASPLPLPMYRP----DGLSGPPPPNaQYQPPPLPGQTLGAGYPPQQAANSGPQMAGAQLSYPGGF 243
Cdd:PHA03247 2854 GSVAPGGDVRRRPPSRSPAAKPAAPArppvRRLARPAVSR-STESFALPPDQPERPPQPQAPPPPQPQPQPPPPPQPQPP 2932
                         250       260       270       280       290
                  ....*....|....*....|....*....|....*....|....*....|.
gi 968121920  244 PGGPAQMAGPPQPQKKLDPDSIPSPIQVIENDRASRGGQVYAtnTRGQIPP 294
Cdd:PHA03247 2933 PPPPPRPQPPLAPTTDPAGAGEPSGAVPQPWLGALVPGRVAV--PRFRVPQ 2981
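
The paired rows in each alignment block can be compared column by column. The sketch below counts identical residues over the aligned columns of one query/Cdd row pair. It assumes that uppercase letters mark aligned residues while lowercase letters and `-` mark unaligned residues and gaps, which is how the blocks on this page appear to be laid out. The function name `aligned_identity` is hypothetical.

```python
def aligned_identity(query_row, subject_row):
    """Count (identical, aligned) columns for one pair of alignment rows.

    A column is treated as aligned only when both rows carry an uppercase
    residue; lowercase residues and '-' gaps are skipped (assumed convention).
    """
    if len(query_row) != len(subject_row):
        raise ValueError("alignment rows must have the same length")
    identical = 0
    aligned = 0
    for q, s in zip(query_row, subject_row):
        if q.isupper() and s.isupper():
            aligned += 1
            if q == s:
                identical += 1
    return identical, aligned


# Last block of the PHA03247 alignment above (sequence text only).
query = "PGGPAQMAGPPQPQKKLDPDSIPSPIQVIENDRASRGGQVYAtnTRGQIPP"
model = "PPPPPRPQPPLAPTTDPAGAGEPSGAVPQPWLGALVPGRVAV--PRFRVPQ"
identical, aligned = aligned_identity(query, model)
percent_identity = 100.0 * identical / aligned
```

For a proline-rich, low-complexity region such as this one, the identity figure mostly reflects shared composition, so it should be read alongside the E-value rather than on its own.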
 
Name Accession Description Interval E-value
Sec24-like cd01479
Sec24-like: Protein and membrane traffic in eukaryotes is mediated at least in part by the ...
438-697 7.00e-119

Sec24-like: Protein and membrane traffic in eukaryotes is mediated at least in part by the budding and fusion of intracellular transport vesicles that selectively carry cargo proteins and lipids from donor to acceptor organelles. The two main classes of vesicular carriers within the endocytic and the biosynthetic pathways are COP- and clathrin-coated vesicles. Formation of COPII vesicles requires the ordered assembly of the coat built from several cytosolic components: the GTPase Sar1 and the Sec23-Sec24 and Sec13-Sec31 complexes. The process is initiated by the exchange of GDP for GTP on the GTPase Sar1, which then recruits the heterodimeric complex of Sec23 and Sec24. This heterodimeric complex generates the pre-budding complex. The final step, leading to membrane deformation and budding of COPII-coated vesicles, is carried out by the heterodimeric complex Sec13-Sec31. The members of this CD belong to the Sec23-like family. Sec24 is very similar to Sec23. The Sec23 and Sec24 polypeptides fold into five distinct domains: a beta-barrel, a zinc finger, a vWA or trunk domain, an all-helical region and a carboxy-terminal gelsolin domain. The members of this subgroup carry a partial MIDAS motif and have the overall Para-Rossmann type fold that is characteristic of this superfamily.


Pssm-ID: 238756 [Multi-domain]  Cd Length: 244  Bit Score: 364.29  E-value: 7.00e-119
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920  438 PNPPAFIFMIDVSYSNIKNGLVKLICEELKTMLEKIPKEEqeetSAIRVGFITYNKVLHFFNVKSNLAQPQMMVVTDVGE 517
Cdd:cd01479     1 PQPAVYVFLIDVSYNAIKSGLLATACEALLSNLDNLPGDD----PRTRVGFITFDSTLHFFNLKSSLEQPQMMVVSDLDD 76
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920  518 VFVPLLDGFLVNYQESQSVIHNLLDQIPDMFADSNENETVFAPVIQAGMEALKaaDCPGKLFIFHSSLPTAEApGKLKNR 597
Cdd:cd01479    77 PFLPLPDGLLVNLKESRQVIEDLLDQIPEMFQDTKETESALGPALQAAFLLLK--ETGGKIIVFQSSLPTLGA-GKLKSR 153
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920  598 DDKKLVNTDKEKILFQPQTNVYDSLAKDCVAHGCSVTLFLFPSQYVDVASLGLVPQLTGGTLYKYNNFQmhldrqqflND 677
Cdd:cd01479   154 EDPKLLSTDKEKQLLQPQTDFYKKLALECVKSQISVDLFLFSNQYVDVATLGCLSRLTGGQVYYYPSFN---------FS 224
                         250       260
                  ....*....|....*....|
gi 968121920  678 LRNDIEKKIGFDAIMRVRTS 697
Cdd:cd01479   225 APNDVEKLVNELARYLTRKI 244
Sec23_trunk pfam04811
Sec23/Sec24 trunk domain; COPII-coated vesicles carry proteins from the endoplasmic reticulum ...
438-682 6.44e-111

Sec23/Sec24 trunk domain; COPII-coated vesicles carry proteins from the endoplasmic reticulum to the Golgi complex. This vesicular transport can be reconstituted by using three cytosolic components containing five proteins: the small GTPase Sar1p, the Sec23p/24p complex, and the Sec13p/Sec31p complex. This domain is known as the trunk domain and has an alpha/beta vWA fold and forms the dimer interface.


Pssm-ID: 398467 [Multi-domain]  Cd Length: 241  Bit Score: 343.46  E-value: 6.44e-111
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920   438 PNPPAFIFMIDVSYSNIKNGLVKLICEELKTMLEKIPKEeqeetSAIRVGFITYNKVLHFFNVKSNLAQPQMMVVTDVGE 517
Cdd:pfam04811    1 PQPPVFLFVIDVSYNAIKSGLLAALKESLLQSLDLLPGD-----PRARVGFITFDSTVHFFNLGSSLRQPQMLVVSDLQD 75
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920   518 VFVPLLDGFLVNYQESQSVIHNLLDQIPDMFADSNENETVFAPVIQAGMEALKAADCPGKLFIFHSSLPTAEAPGKLKNR 597
Cdd:pfam04811   76 MFLPLPDRFLVPLSECRFVLEDLLEQLPPMFPVTKRPERCLGPALQAAFLLLKAAFTGGKIMVFQGGLPTVGPGGKLKSR 155
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920   598 DDKKLVNTDKEKILFQPQTN-VYDSLAKDCVAHGCSVTLFLFPSQYVDVASLGLVPQLTGGTLYKYNNFQMHLDRQQFLN 676
Cdd:pfam04811  156 LDESHHGTDKEKAKLVKKADkFYKSLAKECVKQGHSVDLFAFSLDYVDVATLGQLSRLTGGQVYLYPSFQADVDGSKFKQ 235

                   ....*.
gi 968121920   677 DLRNDI 682
Cdd:pfam04811  236 DLQRYF 241
PTZ00395 PTZ00395
Sec24-related protein; Provisional
440-1026 1.29e-50

Sec24-related protein; Provisional


Pssm-ID: 185594 [Multi-domain]  Cd Length: 1560  Bit Score: 195.68  E-value: 1.29e-50
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920  440 PPAFIFMIDVSYSNIKNGLVKLICEELKTMLEKIpkeeqeETSAIRVGFITYNKVLHFFNVKSNLAQP------------ 507
Cdd:PTZ00395  952 PPYFVFVVECSYNAIYNNITYTILEGIRYAVQNV------KCPQTKIAIITFNSSIYFYHCKGGKGVSgeegdggggsgn 1025
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920  508 -QMMVVTDVGEVFVPL-LDGFLVNYQESQSVIHNLLDQIPDMFADSNENETVFAPVIQAGMEALKAADCPGKLFIFHSSL 585
Cdd:PTZ00395 1026 hQVIVMSDVDDPFLPLpLEDLFFGCVEEIDKINTLIDTIKSVSTTMQSYGSCGNSALKIAMDMLKERNGLGSICMFYTTT 1105
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920  586 PTAeAPGKLKnrddkKLVNTDKEKILFQPQTNVYDSLAKDCVAHGCSVTLFLFPSQYVDVA--SLGLVPQLTGGTLYKYN 663
Cdd:PTZ00395 1106 PNC-GIGAIK-----ELKKDLQENFLEVKQKIFYDSLLLDLYAFNISVDIFIISSNNVRVCvpSLQYVAQNTGGKILFVE 1179
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920  664 NFQMHLDRQQ-FLNDLRNDIEKKIGFDAIMRVRTSTG------FRATDFFGGILMNNTtdVEMAAIDCDKAVTVEFKHDD 736
Cdd:PTZ00395 1180 NFLWQKDYKEiYMNIMDTLTSEDIAYCCELKLRYSHHmsvkklFCCNNNFNSIISVDT--IKIPKIRHDQTFAFLLNYSD 1257
                         330       340       350       360       370       380       390       400
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920  737 KLSEDSGALIQCAVLYTTISGQRRLRIHNLGLNCSSQLADLYKSCETDALINFFAKSAFKAVLHQplKVIREILVNQTAH 816
Cdd:PTZ00395 1258 ISESKKQIYFQCACIYTNLWGDRFVRLHTTHMNLTSSLSTVFRYTDAEALMNILIKQLCTNILHN--DNYSKIIIDNLAA 1335
                         410       420       430       440       450       460       470       480
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920  817 MLACYRKNCASPSAASQLILPDSMKVLPVYMNCLLKNCVllSRPEISTDERAYQRQLVMTMGVADSQLFFYPQLLPIH-- 894
Cdd:PTZ00395 1336 ILFSYRINCASSAHSGQLILPDTLKLLPLFTSSLLKHNV--TKKEILHDLKVYSLIKLLSMPIISSLLYVYPVMYVIHik 1413
                         490       500       510       520       530       540       550       560
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920  895 -------TLDVKSTM-LPAAVRCSESRLSEEGIFLLANGLHMFLWLGVSSPPELIQGIF-NVPSFAHINTdmtlLPEVGN 965
Cdd:PTZ00395 1414 gktneidSMDVDDDLfIPKTIPSSAEKIYSNGIYLLDACTHFYLYFGFHSDANFAKEIVgDIPTEKNAHE----LNLTDT 1489
                         570       580       590       600       610       620
                  ....*....|....*....|....*....|....*....|....*....|....*....|...
gi 968121920  966 PYSQQLRMIMGIIQQKRPYS--MKLTIVKQREQPEMVFRQFLVEDKGlYGGSSYVDFLCCVHK 1026
Cdd:PTZ00395 1490 PNAQKVQRIIKNLSRIHHFNkyVPLVMVAPKSNEEEHLISLCVEDKA-DKEYSYVNFLCFIHK 1551
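
The Cdd row of each alignment carries the model coordinates, so the fraction of a model spanned by a hit can be estimated from the first and last model positions together with the Cd Length. The sketch below is a worked example of that arithmetic using values read from this page; `model_coverage` is a hypothetical helper. It measures the span between the end points, so internal gaps are not subtracted.

```python
def model_coverage(first_pos, last_pos, cd_length):
    """Fraction of the domain model spanned by the aligned region (inclusive)."""
    if not 1 <= first_pos <= last_pos <= cd_length:
        raise ValueError("positions must satisfy 1 <= first <= last <= cd_length")
    return (last_pos - first_pos + 1) / cd_length


# PTZ00395: model positions 952-1551 of a 1560-residue model (from the block above).
ptz00395_coverage = model_coverage(952, 1551, 1560)

# pfam04811: model positions 1-241 of a 241-residue model.
pfam04811_coverage = model_coverage(1, 241, 241)
```

With these inputs the PTZ00395 hit spans roughly 38% of its model, whereas the pfam04811 hit spans the entire trunk-domain model.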
Atrophin-1 pfam03154
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ...
96-260 1.45e-09

Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA (OMIM:125370) is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1 that is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.


Pssm-ID: 460830 [Multi-domain]  Cd Length: 991  Bit Score: 62.48  E-value: 1.45e-09
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920    96 SHAPYQPSAQSSYPGPISTSSVTQLGSQLSAMQINSygsgmAPPSQGPPGPLSATSLQTP-PRPPQPSILQPGSQVLPPP 174
Cdd:pfam03154  143 STSPSIPSPQDNESDSDSSAQQQILQTQPPVLQAQS-----GAASPPSPPPPGTTQAATAgPTPSAPSVPPQGSPATSQP 217
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920   175 PTTLNGPgASPLPL----PMYRPDGLSGPPPPNAQYQPPPLPGQTLGAGYPPQQAANSGP------QMAGAQLSYPG--- 241
Cdd:pfam03154  218 PNQTQST-AAPHTLiqqtPTLHPQRLPSPHPPLQPMTQPPPPSQVSPQPLPQPSLHGQMPpmphslQTGPSHMQHPVppq 296
                          170       180
                   ....*....|....*....|.
gi 968121920   242 GFPGGP--AQMAGPPQPQKKL 260
Cdd:pfam03154  297 PFPLTPqsSQSQVPPGPSPAA 317
PABP-1234 TIGR01628
polyadenylate binding protein, human types 1, 2, 3, 4 family; These eukaryotic proteins ...
102-232 9.60e-06

polyadenylate binding protein, human types 1, 2, 3, 4 family; These eukaryotic proteins recognize the poly-A of mRNA and consist of four tandem RNA recognition domains at the N-terminus (rrm: pfam00076) followed by a PABP-specific domain (pfam00658) at the C-terminus. The protein is involved in the transport of mRNAs from the nucleus to the cytoplasm. There are four paralogs in Homo sapiens: one expressed in testis, one in platelets, one broadly expressed, and one of unknown tissue range.


Pssm-ID: 130689 [Multi-domain]  Cd Length: 562  Bit Score: 49.42  E-value: 9.60e-06
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920   102 PSAQSSYPGPISTS-SVTQLGSQLSAMQINSY-----GSGMAPPSQGppgplsatslqtPPRPPQPSILQPGSQVLPPPP 175
Cdd:TIGR01628  381 RMRQLPMGSPMGGAmGQPPYYGQGPQQQFNGQplgwpRMSMMPTPMG------------PGGPLRPNGLAPMNAVRAPSR 448
                           90       100       110       120       130
                   ....*....|....*....|....*....|....*....|....*....|....*..
gi 968121920   176 TTLNGPGASPLPLPMYRPDGLSGPPPPNAQyQPPPLPGQTLGAGYPPQQAANSGPQM 232
Cdd:TIGR01628  449 NAQNAAQKPPMQPVMYPPNYQSLPLSQDLP-QPQSTASQGGQNKKLAQVLASATPQM 504
BimA_first NF040984
trimeric autotransporter actin-nucleating factor BimA; BimA (B. pseudomallei intracellular ...
94-211 2.43e-04

trimeric autotransporter actin-nucleating factor BimA; BimA (B. pseudomallei intracellular motility protein A) is a trimeric autotransporter, homologous in its C-terminal half to a number of trimeric autotransporter adhesins. It is a virulence factor that nucleates actin, so that actin polymerization can drive escape by B. pseudomallei out of one cell and into a neighboring cell. HMM NF040983 describes a homolog with similar activity but substantial difference in sequence architecture in the N-terminal region.


Pssm-ID: 468914 [Multi-domain]  Cd Length: 517  Bit Score: 44.86  E-value: 2.43e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920   94 ASSHAPYQPSaqssyPGPISTSSVTQLGSQLSAMQINSYGSgmappsqgPPGPLSATSLQTPPrpPQPSilqPGSQVLPP 173
Cdd:NF040984    6 SSSHAPDAPK-----PSSIATTLCRALASLSLGLSMDAEAN--------PPEPPGGTNIPVPP--PMPG---GGANIPVP 67
                          90       100       110
                  ....*....|....*....|....*....|....*...
gi 968121920  174 PPTTLNGPGASPLPLPmyrPDGLSGPPPpnaqyQPPPL 211
Cdd:NF040984   68 PPMPGGGANIPPPPPP---PGGIGGATP-----SPPPL 97
PABP-1234 TIGR01628
polyadenylate binding protein, human types 1, 2, 3, 4 family; These eukaryotic proteins ...
153-260 9.21e-04

polyadenylate binding protein, human types 1, 2, 3, 4 family; These eukaryotic proteins recognize the poly-A of mRNA and consist of four tandem RNA recognition domains at the N-terminus (rrm: pfam00076) followed by a PABP-specific domain (pfam00658) at the C-terminus. The protein is involved in the transport of mRNAs from the nucleus to the cytoplasm. There are four paralogs in Homo sapiens: one expressed in testis, one in platelets, one broadly expressed, and one of unknown tissue range.


Pssm-ID: 130689 [Multi-domain]  Cd Length: 562  Bit Score: 43.26  E-value: 9.21e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920   153 QTPPRPPQPSILQP--GSQVLPP----------PPTTLNGPGASPLPLPMY--RPDGLSGPPPPNAQYQPP----PLPGQ 214
Cdd:TIGR01628  377 QLQPRMRQLPMGSPmgGAMGQPPyygqgpqqqfNGQPLGWPRMSMMPTPMGpgGPLRPNGLAPMNAVRAPSrnaqNAAQK 456
                           90       100       110       120       130
                   ....*....|....*....|....*....|....*....|....*....|..
gi 968121920   215 TLGAGYPPQQAANSGPQMAGAQLSYPGGFPGGP----AQM--AGPPQPQKKL 260
Cdd:TIGR01628  457 PPMQPVMYPPNYQSLPLSQDLPQPQSTASQGGQnkklAQVlaSATPQMQKQV 508
BimA_second NF040983
trimeric autotransporter actin-nucleating factor BimA; This HMM describes BimA (Burkholderia ...
139-247 1.35e-03

trimeric autotransporter actin-nucleating factor BimA; This HMM describes BimA (Burkholderia intracellular motility A), WP_004266405.1-like proteins in Burkholderia mallei or B. pseudomallei. The term BimA has also been used for WP_011205626.1-like homologs that have a very different N-terminal half.


Pssm-ID: 468913 [Multi-domain]  Cd Length: 382  Bit Score: 42.20  E-value: 1.35e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920  139 PSQGPPGPlsatslqtPPRPPQPSILQPGSQVLPPPPttlngPGASPLPLPmyrpdglsgPPPPNAQYQPPPlPGQTLGA 218
Cdd:NF040983   86 PNKVPPPP--------PPPPPPPPPPPTPPPPPPPPP-----PPPPPSPPP---------PPPPSPPPSPPP-PTTTPPT 142
                          90       100
                  ....*....|....*....|....*....
gi 968121920  219 GYPPQQAANSgPQMAGAQlsyPGGFPGGP 247
Cdd:NF040983  143 RTTPSTTTPT-PSMHPIQ---PTQLPSIP 167
PPE COG5651
PPE-repeat protein [Function unknown];
70-254 2.87e-03

PPE-repeat protein [Function unknown];


Pssm-ID: 444372 [Multi-domain]  Cd Length: 385  Bit Score: 41.42  E-value: 2.87e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920   70 GQNGAHATGhPPQRFPGPPPVNNVASSHAPYQPSAQSSYPGPISTSS-VTQLGSQLSAMQINSYGSGMAPPSQGPPGPLS 148
Cdd:COG5651   179 GLLGAQNAG-SGNTSSNPGFANLGLTGLNQVGIGGLNSGSGPIGLNSgPGNTGFAGTGAAAGAAAAAAAAAAAAGAGASA 257
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920  149 AtslqtpprpPQPSILQPGSQVLPPPPTTLNGPGASPLPLPmyrPDGLSGPPPPNAQYQPPPLPGQTLGAGYPPQQAANS 228
Cdd:COG5651   258 A---------LASLAATLLNASSLGLAATAASSAATNLGLA---GSPLGLAGGGAGAAAATGLGLGAGGAAGAAGATGAG 325
                         170       180
                  ....*....|....*....|....*.
gi 968121920  229 GPQMAGAQLSYPGGFPGGPAQMAGPP 254
Cdd:COG5651   326 AALGAGAAAAAAGAAAGAGAAAAAAA 351
SAV_2336_NTERM NF041121
SAV_2336 family N-terminal domain; This HMM describes an N-terminal domain shared by SAV_2336 ...
138-205 5.41e-03

SAV_2336 family N-terminal domain; This HMM describes an N-terminal domain shared by SAV_2336 (BAC70047.1) whose C-terminal region suggests restriction enzyme activity (PMID: 18456708), and with other proteins with unrelated C-terminal regions. A member protein was also identified in a kanamycin biosynthetic gene cluster (PMID:16766657), while N-terminal regions of two other member proteins were named Trypco1 in a bioinformatic study (PMID:32101166) of predicted bacterial conflict systems.


Pssm-ID: 469044 [Multi-domain]  Cd Length: 473  Bit Score: 40.37  E-value: 5.41e-03
                          10        20        30        40        50        60
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*...
gi 968121920  138 PPSQGPPGPLSATSLQTPPRPPQPSILQPGSQVLPPPPTTLNGPGASPLPLPMYRPDglsGPPPPNAQ 205
Cdd:NF041121   39 PPPAAPPSPPGDPPEPPAPEPAPLPAPYPGSLAPPPPPPPGPAGAAPGAALPVRVPA---PPALPNPL 103
SAV_2336_NTERM NF041121
SAV_2336 family N-terminal domain; This HMM describes an N-terminal domain shared by SAV_2336 ...
137-214 7.50e-03

SAV_2336 family N-terminal domain; This HMM describes an N-terminal domain shared by SAV_2336 (BAC70047.1), whose C-terminal region suggests restriction enzyme activity (PMID:18456708), and by other proteins with unrelated C-terminal regions. A member protein was also identified in a kanamycin biosynthetic gene cluster (PMID:16766657), while the N-terminal regions of two other member proteins were named Trypco1 in a bioinformatic study (PMID:32101166) of predicted bacterial conflict systems.


Pssm-ID: 469044 [Multi-domain]  Cd Length: 473  Bit Score: 39.99  E-value: 7.50e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920  137 APPSQGPPGPLSATSLQTPPRPPQPSiLQPGSQVLPPPPTTLNGPGASP--LPLPMYRPDGLSGPPPPNAQ----YQPPP 210
Cdd:NF041121   20 APPSPEGPAPTAASQPATPPPPAAPP-SPPGDPPEPPAPEPAPLPAPYPgsLAPPPPPPPGPAGAAPGAALpvrvPAPPA 98

                  ....
gi 968121920  211 LPGQ 214
Cdd:NF041121   99 LPNP 102
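
Each hit in this listing carries an interval line (for example `137-214 7.50e-03`) and a summary line that begins with `Pssm-ID:`. A minimal Python sketch for turning those two lines into structured values could look like the following. The function names `parse_interval` and `parse_summary` and both regular expressions are illustrative assumptions inferred from the text layout of this page; they are not part of any NCBI tool or API.

```python
import re

# Hypothetical patterns inferred from how the lines appear on this page.
INTERVAL_RE = re.compile(r"^\s*(\d+)-(\d+)\s+(\S+)\s*$")
SUMMARY_RE = re.compile(
    r"Pssm-ID:\s*(?P<pssm_id>\d+)\s*"
    r"\[(?P<hit_type>[^\]]+)\]\s*"
    r"Cd Length:\s*(?P<cd_length>\d+)\s*"
    r"Bit Score:\s*(?P<bit_score>[\d.]+)\s*"
    r"E-value:\s*(?P<e_value>\S+)"
)


def parse_interval(line):
    """Parse an interval line such as '137-214 7.50e-03'.

    Returns (start, end, e_value) with 1-based inclusive query coordinates.
    """
    m = INTERVAL_RE.match(line)
    if m is None:
        raise ValueError(f"not an interval line: {line!r}")
    return int(m.group(1)), int(m.group(2)), float(m.group(3))


def parse_summary(line):
    """Parse a 'Pssm-ID: ... E-value: ...' summary line into a dict."""
    m = SUMMARY_RE.search(line)
    if m is None:
        raise ValueError(f"not a summary line: {line!r}")
    return {
        "pssm_id": int(m.group("pssm_id")),
        "hit_type": m.group("hit_type"),
        "cd_length": int(m.group("cd_length")),
        "bit_score": float(m.group("bit_score")),
        "e_value": float(m.group("e_value")),
    }
```

Applied to the second SAV_2336_NTERM hit above, `parse_interval("137-214 7.50e-03")` gives the query interval 137 to 214, and `parse_summary` on its `Pssm-ID: 469044` line gives the model length (473) and bit score (39.99) as numbers that can be sorted or filtered.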
 
COG5028 COG5028
Vesicle coat complex COPII, subunit SEC24/subunit SFB2/subunit SFB3 [Intracellular trafficking ...
182-1028 4.46e-167

Vesicle coat complex COPII, subunit SEC24/subunit SFB2/subunit SFB3 [Intracellular trafficking and secretion];


Pssm-ID: 227361 [Multi-domain]  Cd Length: 861  Bit Score: 512.80  E-value: 4.46e-167
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920  182 GASPLPLPMYRPDG--LSGPPPPNAQYQPpplpgQTLGAGYPPQQAANSGPQMAGAQL---SYPGGF----PGGPAQMAG 252
Cdd:COG5028     7 GVYPQAQSQVHTGAasSKKSARPHRAYAN-----FSAGQMGMPPYTTPPLQQQSRRQIdqaATAMHNtganNPAPSVMSP 81
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920  253 PPQPQKK-------------LDPDSIPSPIQVIENDRASRGGQVYATnTRGQIPPLvTTDCMIQDQGNASPRFIRCTTYC 319
Cdd:COG5028    82 AFQSQQKfsspyggsmadgtAPKPTNPLVPVDLFEDQPPPISDLFLP-PPPIVPPL-TTNFVGSEQSNCSPKYVRSTMYA 159
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920  320 FPCTSDMAKQAQIPLAAVIKPFATIPSNESPLYLVNHGEsgPVRCNRCKAYMCPFMQFIEGGRRYQCGFCNCVNDVPPFY 399
Cdd:COG5028   160 IPETNDLLKKSKIPFGLVIRPFLELYPEEDPVPLVEDGS--IVRCRRCRSYINPFVQFIEQGRKWRCNICRSKNDVPEGF 237
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920  400 FQHLDHIGRRLDHYEKPELSLGSYEYVATLDYcrKSKPPNPPAFIFMIDVSYSNIKNGLVKLICEELKTMLEKIPKEEQE 479
Cdd:COG5028   238 DNPSGPNDPRSDRYSRPELKSGVVDFLAPKEY--SLRQPPPPVYVFLIDVSFEAIKNGLVKAAIRAILENLDQIPNFDPR 315
                         330       340       350       360       370       380       390       400
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920  480 etsaIRVGFITYNKVLHFFNVKSNLaQPQMMVVTDVGEVFVPLLDG-FLVNYQESQSVIHNLLDQIPDMFADSNENETVF 558
Cdd:COG5028   316 ----TKIAIICFDSSLHFFKLSPDL-DEQMLIVSDLDEPFLPFPSGlFVLPLKSCKQIIETLLDRVPRIFQDNKSPKNAL 390
                         410       420       430       440       450       460       470       480
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920  559 APVIQAGMEALKAadCPGKLFIFHSSLPTAeAPGKLKNRDDKklvntdkEKILFQPQTNVYDSLAKDCVAHGCSVTLFLF 638
Cdd:COG5028   391 GPALKAAKSLIGG--TGGKIIVFLSTLPNM-GIGKLQLREDK-------ESSLLSCKDSFYKEFAIECSKVGISVDLFLT 460
                         490       500       510       520       530       540       550       560
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920  639 PSQYVDVASLGLVPQLTGGTLYKYNNFQM--HLDRQQFLNDLRNDIEKKIGFDAIMRVRTSTGFRATDFFGGILMNNTTD 716
Cdd:COG5028   461 SEDYIDVATLSHLCRYTGGQTYFYPNFSAtrPNDATKLANDLVSHLSMEIGYEAVMRVRCSTGLRVSSFYGNFFNRSSDL 540
                         570       580       590       600       610       620       630       640
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920  717 VEMAAIDCDKAVTVEFKHDDKLSeDSGALIQCAVLYTTISGQRRLRIHNLGLNCSSQLADLYKSCETDALINFFAKSAFK 796
Cdd:COG5028   541 CAFSTMPRDTSLLVEFSIDEKLM-TSDVYFQVALLYTLNDGERRIRVVNLSLPTSSSIREVYASADQLAIACILAKKAST 619
                         650       660       670       680       690       700       710       720
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920  797 AVLHQPLKVIREILVNQTAHMLACYRKNCASPSAASQLILPDSMKVLPVYMNCLLKNcVLLSRPEISTDERAYQRQLVMT 876
Cdd:COG5028   620 KALNSSLKEARVLINKSMVDILKAYKKELVKSNTSTQLPLPANLKLLPLLMLALLKS-SAFRSGSTPSDIRISALNRLTS 698
                         730       740       750       760       770       780       790       800
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920  877 MGVADSQLFFYPQLLPIHTL-------DVKSTMLPAAVRCSESRLSEEGIFLLANGLHMFLWLGVSSPPELIQGIFNVPS 949
Cdd:COG5028   699 LPLKQLMRNIYPTLYALHDMpieaglpDEGLLVLPSPINATSSLLESGGLYLIDTGQKIFLWFGKDAVPSLLQDLFGVDS 778
                         810       820       830       840       850       860       870       880
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920  950 FAHINTDMTLLPEVGNPYSQQLRMIMGIIQQKRPYS-MKLTIVKQREQP--EMVFRQFLVEDKgLYGGSSYVDFLCCVHK 1026
Cdd:COG5028   779 LSDIPSGKFTLPPTGNEFNERVRNIIGELRSVNDDStLPLVLVRGGGDPslRLWFFSTLVEDK-TLNIPSYLDYLQILHE 857

                  ..
gi 968121920 1027 EI 1028
Cdd:COG5028   858 KI 859
Sec24-like cd01479
Sec24-like: Protein and membrane traffic in eukaryotes is mediated at least in part by the ...
438-697 7.00e-119

Sec24-like: Protein and membrane traffic in eukaryotes is mediated at least in part by the budding and fusion of intracellular transport vesicles that selectively carry cargo proteins and lipids from donor to acceptor organelles. The two main classes of vesicular carriers within the endocytic and the biosynthetic pathways are COP- and clathrin-coated vesicles. Formation of COPII vesicles requires the ordered assembly of a coat built from several cytosolic components: the GTPase Sar1 and the Sec23-Sec24 and Sec13-Sec31 complexes. The process is initiated by the exchange of GDP for GTP on the GTPase Sar1, which then recruits the heterodimeric complex of Sec23 and Sec24. This heterodimeric complex generates the pre-budding complex. The final step, leading to membrane deformation and budding of COPII-coated vesicles, is carried out by the heterodimeric complex Sec13-Sec31. The members of this CD belong to the Sec23-like family. Sec24 is very similar to Sec23. The Sec23 and Sec24 polypeptides fold into five distinct domains: a beta-barrel, a zinc finger, a vWA or trunk domain, an all-helical region, and a carboxy-terminal gelsolin domain. The members of this subgroup carry a partial MIDAS motif and have the overall para-Rossmann type fold that is characteristic of this superfamily.


Pssm-ID: 238756 [Multi-domain]  Cd Length: 244  Bit Score: 364.29  E-value: 7.00e-119
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920  438 PNPPAFIFMIDVSYSNIKNGLVKLICEELKTMLEKIPKEEqeetSAIRVGFITYNKVLHFFNVKSNLAQPQMMVVTDVGE 517
Cdd:cd01479     1 PQPAVYVFLIDVSYNAIKSGLLATACEALLSNLDNLPGDD----PRTRVGFITFDSTLHFFNLKSSLEQPQMMVVSDLDD 76
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920  518 VFVPLLDGFLVNYQESQSVIHNLLDQIPDMFADSNENETVFAPVIQAGMEALKaaDCPGKLFIFHSSLPTAEApGKLKNR 597
Cdd:cd01479    77 PFLPLPDGLLVNLKESRQVIEDLLDQIPEMFQDTKETESALGPALQAAFLLLK--ETGGKIIVFQSSLPTLGA-GKLKSR 153
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920  598 DDKKLVNTDKEKILFQPQTNVYDSLAKDCVAHGCSVTLFLFPSQYVDVASLGLVPQLTGGTLYKYNNFQmhldrqqflND 677
Cdd:cd01479   154 EDPKLLSTDKEKQLLQPQTDFYKKLALECVKSQISVDLFLFSNQYVDVATLGCLSRLTGGQVYYYPSFN---------FS 224
                         250       260
                  ....*....|....*....|
gi 968121920  678 LRNDIEKKIGFDAIMRVRTS 697
Cdd:cd01479   225 APNDVEKLVNELARYLTRKI 244
Sec23_trunk pfam04811
Sec23/Sec24 trunk domain; COPII-coated vesicles carry proteins from the endoplasmic reticulum ...
438-682 6.44e-111

Sec23/Sec24 trunk domain; COPII-coated vesicles carry proteins from the endoplasmic reticulum to the Golgi complex. This vesicular transport can be reconstituted by using three cytosolic components containing five proteins: the small GTPase Sar1p, the Sec23p/24p complex, and the Sec13p/Sec31p complex. This domain is known as the trunk domain and has an alpha/beta vWA fold and forms the dimer interface.


Pssm-ID: 398467 [Multi-domain]  Cd Length: 241  Bit Score: 343.46  E-value: 6.44e-111
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920   438 PNPPAFIFMIDVSYSNIKNGLVKLICEELKTMLEKIPKEeqeetSAIRVGFITYNKVLHFFNVKSNLAQPQMMVVTDVGE 517
Cdd:pfam04811    1 PQPPVFLFVIDVSYNAIKSGLLAALKESLLQSLDLLPGD-----PRARVGFITFDSTVHFFNLGSSLRQPQMLVVSDLQD 75
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920   518 VFVPLLDGFLVNYQESQSVIHNLLDQIPDMFADSNENETVFAPVIQAGMEALKAADCPGKLFIFHSSLPTAEAPGKLKNR 597
Cdd:pfam04811   76 MFLPLPDRFLVPLSECRFVLEDLLEQLPPMFPVTKRPERCLGPALQAAFLLLKAAFTGGKIMVFQGGLPTVGPGGKLKSR 155
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920   598 DDKKLVNTDKEKILFQPQTN-VYDSLAKDCVAHGCSVTLFLFPSQYVDVASLGLVPQLTGGTLYKYNNFQMHLDRQQFLN 676
Cdd:pfam04811  156 LDESHHGTDKEKAKLVKKADkFYKSLAKECVKQGHSVDLFAFSLDYVDVATLGQLSRLTGGQVYLYPSFQADVDGSKFKQ 235

                   ....*.
gi 968121920   677 DLRNDI 682
Cdd:pfam04811  236 DLQRYF 241
trunk_domain cd01468
trunk domain. COPII-coated vesicles carry proteins from the endoplasmic reticulum to the Golgi ...
438-678 7.24e-98

trunk domain. COPII-coated vesicles carry proteins from the endoplasmic reticulum to the Golgi complex. This vesicular transport can be reconstituted by using three cytosolic components containing five proteins: the small GTPase Sar1p, the Sec23p/24p complex, and the Sec13p/Sec31p complex. This domain is known as the trunk domain and has an alpha/beta vWA fold and forms the dimer interface. Some members of this family possess a partial MIDAS motif that is a characteristic feature of most vWA domain proteins.


Pssm-ID: 238745 [Multi-domain]  Cd Length: 239  Bit Score: 308.79  E-value: 7.24e-98
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920  438 PNPPAFIFMIDVSYSNIKNGLVKLICEELKTMLEKIPKEeqeetSAIRVGFITYNKVLHFFNVKSNLAQPQMMVVTDVGE 517
Cdd:cd01468     1 PQPPVFVFVIDVSYEAIKEGLLQALKESLLASLDLLPGD-----PRARVGLITYDSTVHFYNLSSDLAQPKMYVVSDLKD 75
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920  518 VFVPLLDGFLVNYQESQSVIHNLLDQIPDMFAD--SNENETVFAPVIQAGMEALKAADCPGKLFIFHSSLPTAEaPGKLK 595
Cdd:cd01468    76 VFLPLPDRFLVPLSECKKVIHDLLEQLPPMFWPvpTHRPERCLGPALQAAFLLLKGTFAGGRIIVFQGGLPTVG-PGKLK 154
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920  596 NRDDKKLVNTDKEKILFQPQTNVYDSLAKDCVAHGCSVTLFLFPSQYVDVASLGLVPQLTGGTLYKYNNFQMHLDRQQFL 675
Cdd:cd01468   155 SREDKEPIRSHDEAQLLKPATKFYKSLAKECVKSGICVDLFAFSLDYVDVATLKQLAKSTGGQVYLYDSFQAPNDGSKFK 234

                  ...
gi 968121920  676 NDL 678
Cdd:cd01468   235 QDL 237
PTZ00395 PTZ00395
Sec24-related protein; Provisional
440-1026 1.29e-50

Sec24-related protein; Provisional


Pssm-ID: 185594 [Multi-domain]  Cd Length: 1560  Bit Score: 195.68  E-value: 1.29e-50
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920  440 PPAFIFMIDVSYSNIKNGLVKLICEELKTMLEKIpkeeqeETSAIRVGFITYNKVLHFFNVKSNLAQP------------ 507
Cdd:PTZ00395  952 PPYFVFVVECSYNAIYNNITYTILEGIRYAVQNV------KCPQTKIAIITFNSSIYFYHCKGGKGVSgeegdggggsgn 1025
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920  508 -QMMVVTDVGEVFVPL-LDGFLVNYQESQSVIHNLLDQIPDMFADSNENETVFAPVIQAGMEALKAADCPGKLFIFHSSL 585
Cdd:PTZ00395 1026 hQVIVMSDVDDPFLPLpLEDLFFGCVEEIDKINTLIDTIKSVSTTMQSYGSCGNSALKIAMDMLKERNGLGSICMFYTTT 1105
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920  586 PTAeAPGKLKnrddkKLVNTDKEKILFQPQTNVYDSLAKDCVAHGCSVTLFLFPSQYVDVA--SLGLVPQLTGGTLYKYN 663
Cdd:PTZ00395 1106 PNC-GIGAIK-----ELKKDLQENFLEVKQKIFYDSLLLDLYAFNISVDIFIISSNNVRVCvpSLQYVAQNTGGKILFVE 1179
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920  664 NFQMHLDRQQ-FLNDLRNDIEKKIGFDAIMRVRTSTG------FRATDFFGGILMNNTtdVEMAAIDCDKAVTVEFKHDD 736
Cdd:PTZ00395 1180 NFLWQKDYKEiYMNIMDTLTSEDIAYCCELKLRYSHHmsvkklFCCNNNFNSIISVDT--IKIPKIRHDQTFAFLLNYSD 1257
                         330       340       350       360       370       380       390       400
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920  737 KLSEDSGALIQCAVLYTTISGQRRLRIHNLGLNCSSQLADLYKSCETDALINFFAKSAFKAVLHQplKVIREILVNQTAH 816
Cdd:PTZ00395 1258 ISESKKQIYFQCACIYTNLWGDRFVRLHTTHMNLTSSLSTVFRYTDAEALMNILIKQLCTNILHN--DNYSKIIIDNLAA 1335
                         410       420       430       440       450       460       470       480
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920  817 MLACYRKNCASPSAASQLILPDSMKVLPVYMNCLLKNCVllSRPEISTDERAYQRQLVMTMGVADSQLFFYPQLLPIH-- 894
Cdd:PTZ00395 1336 ILFSYRINCASSAHSGQLILPDTLKLLPLFTSSLLKHNV--TKKEILHDLKVYSLIKLLSMPIISSLLYVYPVMYVIHik 1413
                         490       500       510       520       530       540       550       560
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920  895 -------TLDVKSTM-LPAAVRCSESRLSEEGIFLLANGLHMFLWLGVSSPPELIQGIF-NVPSFAHINTdmtlLPEVGN 965
Cdd:PTZ00395 1414 gktneidSMDVDDDLfIPKTIPSSAEKIYSNGIYLLDACTHFYLYFGFHSDANFAKEIVgDIPTEKNAHE----LNLTDT 1489
                         570       580       590       600       610       620
                  ....*....|....*....|....*....|....*....|....*....|....*....|...
gi 968121920  966 PYSQQLRMIMGIIQQKRPYS--MKLTIVKQREQPEMVFRQFLVEDKGlYGGSSYVDFLCCVHK 1026
Cdd:PTZ00395 1490 PNAQKVQRIIKNLSRIHHFNkyVPLVMVAPKSNEEEHLISLCVEDKA-DKEYSYVNFLCFIHK 1551
Sec23_helical pfam04815
Sec23/Sec24 helical domain; COPII-coated vesicles carry proteins from the endoplasmic ...
784-884 1.62e-32

Sec23/Sec24 helical domain; COPII-coated vesicles carry proteins from the endoplasmic reticulum to the Golgi complex. This vesicular transport can be reconstituted by using three cytosolic components containing five proteins: the small GTPase Sar1p, the Sec23p/24p complex, and the Sec13p/Sec31p complex. This domain is composed of five alpha helices.


Pssm-ID: 461441 [Multi-domain]  Cd Length: 103  Bit Score: 121.46  E-value: 1.62e-32
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920   784 DALINFFAKSAFKAVLHQPLKVIREILVNQTAHMLACYRKNCASPSAASQLILPDSMKVLPVYMNCLLKNCVLLSRPEIS 863
Cdd:pfam04815    3 EAIAVLLAKKAVEKALSSSLSDAREALDNKLVDILAAYRKYCASSSSPGQLILPESLKLLPLYMLALLKSPALRGGNSSP 82
                           90       100
                   ....*....|....*....|.
gi 968121920   864 TDERAYQRQLVMTMGVADSQL 884
Cdd:pfam04815   83 SDERAYARHLLLSLPVEELLL 103
Sec23_BS pfam08033
Sec23/Sec24 beta-sandwich domain;
687-771 4.52e-29

Sec23/Sec24 beta-sandwich domain;


Pssm-ID: 429794 [Multi-domain]  Cd Length: 86  Bit Score: 111.09  E-value: 4.52e-29
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920   687 GFDAIMRVRTSTGFRATDFFGGILMNNTTD-VEMAAIDCDKAVTVEFKHDDKLSEDSGALIQCAVLYTTISGQRRLRIHN 765
Cdd:pfam08033    1 GFNAVLRVRTSKGLKVSGFIGNFVSRSSGDtWKLPSLDPDTSYAFEFDIDEPLPNGSNAYIQFALLYTHSSGERRIRVTT 80

                   ....*.
gi 968121920   766 LGLNCS 771
Cdd:pfam08033   81 VALPVT 86
SEC23 COG5047
Vesicle coat complex COPII, subunit SEC23 [Intracellular trafficking and secretion];
313-901 1.70e-21

Vesicle coat complex COPII, subunit SEC23 [Intracellular trafficking and secretion];


Pssm-ID: 227380 [Multi-domain]  Cd Length: 755  Bit Score: 100.73  E-value: 1.70e-21
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920  313 IRCTTYCFPCTSDMAKQAQIPLAAVIKPFatipsNESPLYLVNHGEsgPVRCNR-CKAYMCPFMQFIEGGRRYQCGFCNC 391
Cdd:COG5047    12 IRLTWNVFPATRGDATRTVIPIACLYTPL-----HEDDALTVNYYE--PVKCTApCKAVLNPYCHIDERNQSWICPFCNQ 84
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920  392 VNDVPPFYfqhldhigRRLDHYEKP-ELSLGS--YEYVAtldycrkSKPPN-PPAFIFMIDVSYSNIKNGLVKlicEELK 467
Cdd:COG5047    85 RNTLPPQY--------RDISNANLPlELLPQSstIEYTL-------SKPVIlPPVFFFVVDACCDEEELTALK---DSLI 146
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920  468 TMLEKIPKEeqeetsAIrVGFITYNKVLHFFNV------KSNLAQP----QMMVVTDVGEVFVPLLDG------------ 525
Cdd:COG5047   147 VSLSLLPPE------AL-VGLITYGTSIQVHELnaenhrRSYVFSGnkeyTKENLQELLALSKPTKSGgfeskisgigqf 219
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920  526 ----FLVNYQESQSVIHNLLDQI-PDMFADSNENE----TVFAPVIQAGMEALKAADCPGKLFIFHSSlPTAEAPGKLKN 596
Cdd:COG5047   220 assrFLLPTQQCEFKLLNILEQLqPDPWPVPAGKRplrcTGSALNIASSLLEQCFPNAGCHIVLFAGG-PCTVGPGTVVS 298
                         330       340       350       360       370       380       390       400
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920  597 RDDKK------LVNTDKEKiLFQPQTNVYDSLAKDCVAHGCSVTLFLfpSQYVDVASLGLVP--QLTGGTLYKYNNFQMH 668
Cdd:COG5047   299 TELKEpmrshhDIESDSAQ-HSKKATKFYKGLAERVANQGHALDIFA--GCLDQIGIMEMEPltTSTGGALVLSDSFTTS 375
                         410       420       430       440       450       460       470       480
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920  669 LDRQQFLN--DLRNDIEKKIGFDAIMRVRTSTGFRATDFFG---------------GILMNNTTDVEMAAIDCDKAVTVE 731
Cdd:COG5047   376 IFKQSFQRifNRDSEGYLKMGFNANMEVKTSKNLKIKGLIGhavsvkkkannisdsEIGIGATNSWKMASLSPKSNYALY 455
                         490       500       510       520       530       540       550       560
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920  732 FK-----HDDKLSEDSGALIQCAVLYTTISGQRRLRIHNLGLNCSSQLADL-YKSCETDALINFFAK-SAFKAVLHQPLK 804
Cdd:COG5047   456 FEialgaASGSAQRPAEAYIQFITTYQHSSGTYRIRVTTVARMFTDGGLPKiNRSFDQEAAAVFMARiAAFKAETEDIID 535
                         570       580       590       600       610       620       630       640
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920  805 VIREI---LVNQTAHmLACYRKNcaspsAASQLILPDSMKVLPVYMnCLLKNCVLLSRPEISTDERAYQRQLVMTMGVAD 881
Cdd:COG5047   536 VFRWIdrnLIRLCQK-FADYRKD-----DPSSFRLDPNFTLYPQFM-YHLRRSPFLSVFNNSPDETAFYRHMLNNADVND 608
                         650       660
                  ....*....|....*....|....*...
gi 968121920  882 SQLFFYPQLLPIH--------TLDVKST 901
Cdd:COG5047   609 SLIMIQPTLQSYSfekggvpvLLDSVSV 636
zf-Sec23_Sec24 pfam04810
Sec23/Sec24 zinc finger; COPII-coated vesicles carry proteins from the endoplasmic reticulum ...
361-397 1.03e-15

Sec23/Sec24 zinc finger; COPII-coated vesicles carry proteins from the endoplasmic reticulum to the Golgi complex. This vesicular transport can be reconstituted by using three cytosolic components containing five proteins: the small GTPase Sar1p, the Sec23p/24p complex, and the Sec13p/Sec31p complex. This domain has been found to be a zinc-binding domain.


Pssm-ID: 461437 [Multi-domain]  Cd Length: 38  Bit Score: 71.71  E-value: 1.03e-15
                           10        20        30
                   ....*....|....*....|....*....|....*..
gi 968121920   361 PVRCNRCKAYMCPFMQFIEGGRRYQCGFCNCVNDVPP 397
Cdd:pfam04810    1 PVRCRRCRAYLNPFCQFDFGGKKWTCNFCGTRNPVPP 37
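
The alignment rows on this page all follow one pattern: a label (`gi ...` for the query, `Cdd:...` for the domain model), a start coordinate, the aligned residues, and an end coordinate. As a rough sketch, one row pair can be parsed and its identical columns counted as shown below. The names `Row`, `parse_row`, `residue_count`, and `identical_columns` are hypothetical helpers. Treating lowercase letters as unaligned residues and `-` as gaps is an assumption about the display convention, not something this page states.

```python
import re
from typing import NamedTuple

# Hypothetical pattern for one alignment row as printed on this page.
ROW_RE = re.compile(r"^(gi \d+|Cdd:\S+)\s+(\d+)\s+(\S+)\s+(\d+)$")


class Row(NamedTuple):
    label: str   # 'gi 968121920' or 'Cdd:pfam04810'
    start: int   # coordinate of the first residue in the row
    seq: str     # aligned residues, '-' marks a gap
    end: int     # coordinate of the last residue in the row


def parse_row(line):
    """Split one alignment row into label, start, sequence, and end."""
    m = ROW_RE.match(line.strip())
    if m is None:
        raise ValueError(f"not an alignment row: {line!r}")
    return Row(m.group(1), int(m.group(2)), m.group(3), int(m.group(4)))


def residue_count(row):
    """Count residues in a row, ignoring gap characters."""
    return sum(ch.isalpha() for ch in row.seq)


def identical_columns(query, subject):
    """Count columns where both rows show the same uppercase residue.

    Assumption: lowercase letters are unaligned and are never counted.
    """
    if len(query.seq) != len(subject.seq):
        raise ValueError("rows must cover the same number of columns")
    return sum(
        1 for q, s in zip(query.seq, subject.seq)
        if q.isupper() and q == s
    )
```

A useful sanity check on any parsed row is that `residue_count(row)` equals `row.end - row.start + 1`; if it does not, the row was probably truncated or mis-split during extraction.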
PLN00162 PLN00162
transport protein sec23; Provisional
313-657 6.19e-11

transport protein sec23; Provisional


Pssm-ID: 215083 [Multi-domain]  Cd Length: 761  Bit Score: 66.50  E-value: 6.19e-11
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920  313 IRCTTYCFPCTSDMAKQAQIPLAAVIKPFAtiPSNESPL--YlvnhgesGPVRCNRCKAYMCPFMQFIEGGRRYQCGFCN 390
Cdd:PLN00162   12 VRMSWNVWPSSKIEASKCVIPLAALYTPLK--PLPELPVlpY-------DPLRCRTCRAVLNPYCRVDFQAKIWICPFCF 82
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920  391 CVNDVPPFYFQhldhIGrrlDHYEKPELslgsYEYVATLDY---CRKSKPPNPPAFIFMIDVSYSNIKNGLVKlicEELK 467
Cdd:PLN00162   83 QRNHFPPHYSS----IS---ETNLPAEL----FPQYTTVEYtlpPGSGGAPSPPVFVFVVDTCMIEEELGALK---SALL 148
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920  468 TMLEKIPkeeqeETSaiRVGFITY----------------------------NKVLHFFNVKSNLAQPQMMVVTDVGEVF 519
Cdd:PLN00162  149 QAIALLP-----ENA--LVGLITFgthvhvhelgfsecsksyvfrgnkevskDQILEQLGLGGKKRRPAGGGIAGARDGL 221
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920  520 VPL-LDGFLVNYQESQSVIHNLLDQI-PDMFADSNENE----TVFAPVIQAGMEALKAADCPGKLFIFHSSlPTAEAPGK 593
Cdd:PLN00162  222 SSSgVNRFLLPASECEFTLNSALEELqKDPWPVPPGHRparcTGAALSVAAGLLGACVPGTGARIMAFVGG-PCTEGPGA 300
                         330       340       350       360       370       380
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....
gi 968121920  594 LKNRDDKKLVNTDKEKI-----LFQPQTNVYDSLAKDCVAHGCSVTLFLFPSQYVDVASLGLVPQLTGG 657
Cdd:PLN00162  301 IVSKDLSEPIRSHKDLDkdaapYYKKAVKFYEGLAKQLVAQGHVLDVFACSLDQVGVAEMKVAVERTGG 369
PHA03247 PHA03247
large tegument protein UL36; Provisional
8-294 2.05e-10

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 65.34  E-value: 2.05e-10
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920    8 ATPPYSQPQPGIGLSPPHYGhygDPSHTASPTGMMKPAGPlGATATRGMLPPGPPPPGPhqfgqnGAHATGHPPQRFPGP 87
Cdd:PHA03247 2718 ATPLPPGPAAARQASPALPA---APAPPAVPAGPATPGGP-ARPARPPTTAGPPAPAPP------AAPAAGPPRRLTRPA 2787
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920   88 PPVNNVASSHAPyQPSAQSSYPGPISTSSVTQLGSQLSAmqinsygSGMAPPSQGPPgplsaTSLQTPPRPPQPSiLQPG 167
Cdd:PHA03247 2788 VASLSESRESLP-SPWDPADPPAAVLAPAAALPPAASPA-------GPLPPPTSAQP-----TAPPPPPGPPPPS-LPLG 2853
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920  168 SQVLPPPPTTLNGPGASPLPLPMYRP----DGLSGPPPPNaQYQPPPLPGQTLGAGYPPQQAANSGPQMAGAQLSYPGGF 243
Cdd:PHA03247 2854 GSVAPGGDVRRRPPSRSPAAKPAAPArppvRRLARPAVSR-STESFALPPDQPERPPQPQAPPPPQPQPQPPPPPQPQPP 2932
                         250       260       270       280       290
                  ....*....|....*....|....*....|....*....|....*....|.
gi 968121920  244 PGGPAQMAGPPQPQKKLDPDSIPSPIQVIENDRASRGGQVYAtnTRGQIPP 294
Cdd:PHA03247 2933 PPPPPRPQPPLAPTTDPAGAGEPSGAVPQPWLGALVPGRVAV--PRFRVPQ 2981
PHA03378 PHA03378
EBNA-3B; Provisional
96-270 2.76e-10

EBNA-3B; Provisional


Pssm-ID: 223065 [Multi-domain]  Cd Length: 991  Bit Score: 64.70  E-value: 2.76e-10
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920   96 SHAPyQPSAQSSYPGPISTSSVTQLGSQLSAMQINSYGSGMAPPsQGPPGPLSATSLQTPPRPPQPSILQPGSQVLPP-P 174
Cdd:PHA03378  613 SHIP-ETSAPRQWPMPLRPIPMRPLRMQPITFNVLVFPTPHQPP-QVEITPYKPTWTQIGHIPYQPSPTGANTMLPIQwA 690
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920  175 PTTLNGPGASPLPL--------PMYRPDGLSGP-PPPNAQYQPPPLPGQTLGAGYPPQQA--ANSGPQMAGAQLSYPGGF 243
Cdd:PHA03378  691 PGTMQPPPRAPTPMrppaappgRAQRPAAATGRaRPPAAAPGRARPPAAAPGRARPPAAApgRARPPAAAPGRARPPAAA 770
                         170       180       190
                  ....*....|....*....|....*....|
gi 968121920  244 PGGPAQM---AGPPQPQKKldPDSIPSPIQ 270
Cdd:PHA03378  771 PGAPTPQpppQAPPAPQQR--PRGAPTPQP 798
Gelsolin pfam00626
Gelsolin repeat;
900-975 1.06e-09

Gelsolin repeat;


Pssm-ID: 395501 [Multi-domain]  Cd Length: 76  Bit Score: 55.78  E-value: 1.06e-09
                           10        20        30        40        50        60        70
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*...
gi 968121920   900 STMLPAAVRCSESRLSEEGIFLLANGLHMFLWLGVSSppELIQGIFNVPSFAHINTDM-TLLPEVGN-PYSQQLRMIM 975
Cdd:pfam00626    1 KFVLPPPVPLSQESLNSGDCYLLDNGFTIFLWVGKGS--SLLEKLFAALLAAQLDDDErFPLPEVIRvPQGKEPARFL 76
PHA03378 PHA03378
EBNA-3B; Provisional
93-260 1.45e-09

EBNA-3B; Provisional


Pssm-ID: 223065 [Multi-domain]  Cd Length: 991  Bit Score: 62.39  E-value: 1.45e-09
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920   93 VASSHAPYQPSaqssypgPISTSSVTQLGSQLSAMQINSYGSGMAPPSQGPPGPLS-ATSLQTPPRPPQ--PSILQPGSQ 169
Cdd:PHA03378  667 TQIGHIPYQPS-------PTGANTMLPIQWAPGTMQPPPRAPTPMRPPAAPPGRAQrPAAATGRARPPAaaPGRARPPAA 739
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920  170 V---LPPP---PTTLNGPGASPLPLPmyRPDGLSGPPPPNAQYQPPPLPGQTLGAGYPPQQAANSGPqmAGAQLSypggf 243
Cdd:PHA03378  740 ApgrARPPaaaPGRARPPAAAPGRAR--PPAAAPGAPTPQPPPQAPPAPQQRPRGAPTPQPPPQAGP--TSMQLM----- 810
                         170
                  ....*....|....*..
gi 968121920  244 pggPAQMAGPPQPQKKL 260
Cdd:PHA03378  811 ---PRAAPGQQGPTKQI 824
Atrophin-1 pfam03154
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ...
96-260 1.45e-09

Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA (OMIM:125370) is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1 that is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.


Pssm-ID: 460830 [Multi-domain]  Cd Length: 991  Bit Score: 62.48  E-value: 1.45e-09
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920    96 SHAPYQPSAQSSYPGPISTSSVTQLGSQLSAMQINSygsgmAPPSQGPPGPLSATSLQTP-PRPPQPSILQPGSQVLPPP 174
Cdd:pfam03154  143 STSPSIPSPQDNESDSDSSAQQQILQTQPPVLQAQS-----GAASPPSPPPPGTTQAATAgPTPSAPSVPPQGSPATSQP 217
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920   175 PTTLNGPgASPLPL----PMYRPDGLSGPPPPNAQYQPPPLPGQTLGAGYPPQQAANSGP------QMAGAQLSYPG--- 241
Cdd:pfam03154  218 PNQTQST-AAPHTLiqqtPTLHPQRLPSPHPPLQPMTQPPPPSQVSPQPLPQPSLHGQMPpmphslQTGPSHMQHPVppq 296
                          170       180
                   ....*....|....*....|.
gi 968121920   242 GFPGGP--AQMAGPPQPQKKL 260
Cdd:pfam03154  297 PFPLTPqsSQSQVPPGPSPAA 317
Atrophin-1 pfam03154
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ...
93-274 5.69e-09

Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA (OMIM:125370) is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1 that is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.


Pssm-ID: 460830 [Multi-domain]  Cd Length: 991  Bit Score: 60.17  E-value: 5.69e-09
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920    93 VASSHAPYQPSAQSSYPGPISTSSVTQLGSQLSAMQINSYGSGMAPPS---QGP-------PGPLSATSLQTPPRPPQPS 162
Cdd:pfam03154  182 SPPSPPPPGTTQAATAGPTPSAPSVPPQGSPATSQPPNQTQSTAAPHTliqQTPtlhpqrlPSPHPPLQPMTQPPPPSQV 261
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920   163 ILQPGSQV-----LPPPPTTLN-GPGASPLPLPmyrPDGLsGPPPPNAQYQPPPLPgQTLGAGYPPQQAANSGPQMAGAQ 236
Cdd:pfam03154  262 SPQPLPQPslhgqMPPMPHSLQtGPSHMQHPVP---PQPF-PLTPQSSQSQVPPGP-SPAAPGQSQQRIHTPPSQSQLQS 336
                          170       180       190       200       210
                   ....*....|....*....|....*....|....*....|....*....|...
gi 968121920   237 LSYPGGFPGGPAQMAGP-------------PQPQKKLDPD--SIPSPIQVIEN 274
Cdd:pfam03154  337 QQPPREQPLPPAPLSMPhikpppttpipqlPNPQSHKHPPhlSGPSPFQMNSN 389
Atrophin-1 pfam03154
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ...
99-290 3.08e-08

Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA (OMIM:125370) is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1 that is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.


Pssm-ID: 460830 [Multi-domain]  Cd Length: 991  Bit Score: 57.85  E-value: 3.08e-08
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920    99 PYQPSAQS-SYPGPISTSSVTQLGSQLS---AMQINSYGSGMAPPSQGPPgPLS--ATSLQTPPRPPQPSILQPgSQVLP 172
Cdd:pfam03154  364 PQLPNPQShKHPPHLSGPSPFQMNSNLPpppALKPLSSLSTHHPPSAHPP-PLQlmPQSQQLPPPPAQPPVLTQ-SQSLP 441
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920   173 P-----PPTTLNGPGASPLPLPMYrPDGLSGPPPPNAQYQPPPLPGQTLGAGYPPQQA--ANSGPQMAGAQLSYPggfpg 245
Cdd:pfam03154  442 PpaashPPTSGLHQVPSQSPFPQH-PFVPGGPPPITPPSGPPTSTSSAMPGIQPPSSAsvSSSGPVPAAVSCPLP----- 515
                          170       180       190       200       210
                   ....*....|....*....|....*....|....*....|....*....|..
gi 968121920   246 gPAQMAGPP-----QPQKKLDPDSIPSPIQVIEN--DRASRGGQVYATNTRG 290
Cdd:pfam03154  516 -PVQIKEEAldeaeEPESPPPPPRSPSPEPTVVNtpSHASQSARFYKHLDRG 566
PHA03247 PHA03247
large tegument protein UL36; Provisional
10-262 3.47e-08

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 58.03  E-value: 3.47e-08
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920   10 PPYSQPQPGIGLSPPHYGHYGDPSHTASPTGMMKPAGPLGATATRGMLPPGPPPPGPHQFGQNGAHATGHPPQRFPGPPP 89
Cdd:PHA03247 2741 PPAVPAGPATPGGPARPARPPTTAGPPAPAPPAAPAAGPPRRLTRPAVASLSESRESLPSPWDPADPPAAVLAPAAALPP 2820
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920   90 VNNVASSHAP---YQPSAQSSYPGPISTSSVTQlGSQLSAMQINSYGSGMAPPSQ-------------GPPGPLSATSLQ 153
Cdd:PHA03247 2821 AASPAGPLPPptsAQPTAPPPPPGPPPPSLPLG-GSVAPGGDVRRRPPSRSPAAKpaaparppvrrlaRPAVSRSTESFA 2899
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920  154 TPPRPPQPsilqPGSQVLPPPPTTLNGPGASPLPLPMYRPDGLsgPPPPNAQYQPPPLPGQTLGAGYPPQQAANSGPQMA 233
Cdd:PHA03247 2900 LPPDQPER----PPQPQAPPPPQPQPQPPPPPQPQPPPPPPPR--PQPPLAPTTDPAGAGEPSGAVPQPWLGALVPGRVA 2973
                         250       260
                  ....*....|....*....|....*....
gi 968121920  234 GAQLSYPGGFPGGPAQMAGPPQPQKKLDP 262
Cdd:PHA03247 2974 VPRFRVPQPAPSREAPASSTPPLTGHSLS 3002
PHA03247 PHA03247
large tegument protein UL36; Provisional
138-350 2.70e-07

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 54.94  E-value: 2.70e-07
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920  138 PPSQGP---PGPLSATS-LQTPPRPPQPSILQPGSQVLPPPPTTLNGPGASPLPLPMYRPDGLSGPPPPNAQYQPP--PL 211
Cdd:PHA03247 2701 PPPPPPtpePAPHALVSaTPLPPGPAAARQASPALPAAPAPPAVPAGPATPGGPARPARPPTTAGPPAPAPPAAPAagPP 2780
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920  212 PGQTLGAGYPPQQAANSGP---QMAGAQLSYPGGFPG-----GPAQMAGPPQPQKKLDPDSIPSPIQVIENDRAS--RGG 281
Cdd:PHA03247 2781 RRLTRPAVASLSESRESLPspwDPADPPAAVLAPAAAlppaaSPAGPLPPPTSAQPTAPPPPPGPPPPSLPLGGSvaPGG 2860
                         170       180       190       200       210       220       230
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920  282 QVYATNTRGQIPPLVTTDCMIQDQGNASPRFIRcTTYCFPCTSD-MAKQAQIPLAAVIKPFATIPSNESP 350
Cdd:PHA03247 2861 DVRRRPPSRSPAAKPAAPARPPVRRLARPAVSR-STESFALPPDqPERPPQPQAPPPPQPQPQPPPPPQP 2929
Pro-rich pfam15240
Proline-rich protein; This family includes several eukaryotic proline-rich proteins.
103-249 1.27e-06

Proline-rich protein; This family includes several eukaryotic proline-rich proteins.


Pssm-ID: 464580 [Multi-domain]  Cd Length: 167  Bit Score: 49.65  E-value: 1.27e-06
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920   103 SAQSSYPGPISTSSVTQLGSQLSAMQINSYGSGMAPPSQGPPGPLSATSLQTPPRP--PQPSILQPGS---QVLPPPPTT 177
Cdd:pfam15240   15 SAQSSSEDVSQEDSPSLISEEEGQSQQGGQGPQGPPPGGFPPQPPASDDPPGPPPPggPQQPPPQGGKqkpQGPPPQGGP 94
                           90       100       110       120       130       140       150
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|...
gi 968121920   178 LNGPGASPLPLPMYRPDGLSGPPPPNAQYQPPPLPGQTLGAGYPPQ-QAANSGPQMAGAQLSYPGGFPGGPAQ 249
Cdd:pfam15240   95 RPPPGKPQGPPPQGGNQQQGPPPPGKPQGPPPQGGGPPPQGGNQQGpPPPPPGNPQGPPQRPPQPGNPQGPPQ 167
PAT1 pfam09770
Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate ...
93-255 2.20e-06

Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate chromosome transmission during cell division.


Pssm-ID: 401645 [Multi-domain]  Cd Length: 846  Bit Score: 51.96  E-value: 2.20e-06
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920    93 VASSHAPYQPSAQSSYPGPIS-TSSVTQLGSQLSAMQINS---YGSGMAPPSQGPPGPLSATSLQTPPRPPQPSILQPGS 168
Cdd:pfam09770  171 AAPAPAPQPAAQPASLPAPSRkMMSLEEVEAAMRAQAKKPaqqPAPAPAQPPAAPPAQQAQQQQQFPPQIQQQQQPQQQP 250
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920   169 QVLPP------PPTTLNGPGASPLPLPMYRPdglsGPPPPNAQYQPPPLPGQTL------------GAGYPPQQAANSGP 230
Cdd:pfam09770  251 QQPQQhpgqghPVTILQRPQSPQPDPAQPSI----QPQAQQFHQQPPPVPVQPTqilqnpnrlsaaRVGYPQNPQPGVQP 326
                          170       180
                   ....*....|....*....|....*
gi 968121920   231 QMAGAQLSYPGGFPGGPAQMAGPPQ 255
Cdd:pfam09770  327 APAHQAHRQQGSFGRQAPIITHPQQ 351
Pro-rich pfam15240
Proline-rich protein; This family includes several eukaryotic proline-rich proteins.
122-270 2.96e-06

Proline-rich protein; This family includes several eukaryotic proline-rich proteins.


Pssm-ID: 464580 [Multi-domain]  Cd Length: 167  Bit Score: 48.50  E-value: 2.96e-06
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920   122 SQLSAMQINSYGSGMAPPSQGPPGPLSatslQTPPRPPQPSILQPGSQVLPPPPTTLNGPGASPlplpmyrPDGLSGPPP 201
Cdd:pfam15240   30 SLISEEEGQSQQGGQGPQGPPPGGFPP----QPPASDDPPGPPPPGGPQQPPPQGGKQKPQGPP-------PQGGPRPPP 98
                           90       100       110       120       130       140       150
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|.
gi 968121920   202 PNAQYQPPPLPGQTLGAGYPPQQaaNSGPQMAGAQLSYPGGFPGGPAQMAGPPQ--PQKKLDPDSIPSPIQ 270
Cdd:pfam15240   99 GKPQGPPPQGGNQQQGPPPPGKP--QGPPPQGGGPPPQGGNQQGPPPPPPGNPQgpPQRPPQPGNPQGPPQ 167
dnaA PRK14086
chromosomal replication initiator protein DnaA;
98-262 3.74e-06

chromosomal replication initiator protein DnaA;


Pssm-ID: 237605 [Multi-domain]  Cd Length: 617  Bit Score: 50.98  E-value: 3.74e-06
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920   98 APYQPSAQSSYPGPISTSSVTQLGSQLSAMQINSYGSGMAPPSQGPPGPLSATSLQTPPRP----PQPSILQPGSQVLPP 173
Cdd:PRK14086   95 PAPPPPHARRTSEPELPRPGRRPYEGYGGPRADDRPPGLPRQDQLPTARPAYPAYQQRPEPgawpRAADDYGWQQQRLGF 174
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920  174 PPTTLNGPGASPLPLPMY--------RPDGLSGPPP---PNAQYQPP-----PLPGQTLGAGYPPQqAANSGPQMAGAQL 237
Cdd:PRK14086  175 PPRAPYASPASYAPEQERdrepydagRPEYDQRRRDydhPRPDWDRPrrdrtDRPEPPPGAGHVHR-GGPGPPERDDAPV 253
                         170       180       190
                  ....*....|....*....|....*....|
gi 968121920  238 -----SYPGGFPGGPAQMAGPPQPQKKLDP 262
Cdd:PRK14086  254 vpirpSAPGPLAAQPAPAPGPGEPTARLNP 283
SOBP pfam15279
Sine oculis-binding protein; SOBP is associated with syndromic and nonsyndromic intellectual ...
90-231 5.06e-06

Sine oculis-binding protein; SOBP is associated with syndromic and nonsyndromic intellectual disability. It carries a zinc-finger of the zf-C2H2 type at the N-terminus, and a highly characteristic C-terminal PhPhPhPhPhPh motif. The deduced 873-amino acid protein contains an N-terminal nuclear localization signal (NLS), followed by 2 FCS-type zinc finger motifs, a proline-rich region (PR1), a putative RNA-binding motif region, and a C-terminal NLS embedded in a second proline-rich motif. SOBP is expressed in various human tissues, and in developing mouse brain at embryonic day 14. In postnatal and adult mouse brain, SOBP is expressed in all neurons, with intense staining in the limbic system. Highest expression is in layer V cortical neurons, hippocampus, pyriform cortex, dorsomedial nucleus of thalamus, amygdala, and hypothalamus. Postnatal expression of SOBP in the limbic system corresponds to a time of active synaptogenesis. The family is also referred to as Jackson circler, JXC1. Mutations in this protein were found in seven affected siblings from a consanguineous Israeli Arab family with mental retardation, anterior maxillary protrusion, and strabismus.


Pssm-ID: 464609 [Multi-domain]  Cd Length: 325  Bit Score: 49.81  E-value: 5.06e-06
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920    90 VNNVASSHAPYQPSAQSSYPGPISTSSVT----------QLGSQLSAMQINSYGSGMAPPSQGPPGPLSAT----SLQTP 155
Cdd:pfam15279  119 VASSSKLLAPKPHEPPSLPPPPLPPKKGRrhrpglhpplGRPPGSPPMSMTPRGLLGKPQQHPPPSPLPAFmepsSMPPP 198
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920   156 PRPPQPSILQPGSQVLPP------PPTTLNGPGASPlPLPMYRPD-GLSGPPPPNAQYQPPPLPGQTLGAGYPPQQAANS 228
Cdd:pfam15279  199 FLRPPPSIPQPNSPLSNPmlpgigPPPKPPRNLGPP-SNPMHRPPfSPHHPPPPPTPPGPPPGLPPPPPRGFTPPFGPPF 277

                   ...
gi 968121920   229 GPQ 231
Cdd:pfam15279  278 PPV 280
PAT1 pfam09770
Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate ...
94-227 9.42e-06

Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate chromosome transmission during cell division.


Pssm-ID: 401645 [Multi-domain]  Cd Length: 846  Bit Score: 49.65  E-value: 9.42e-06
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920    94 ASSHAPYQPSAQSSYPGPISTSSVTQLGSQLSAMQINSYGSGMAPPSQGPPGPLSA---------TSLQTPPRPPQPSIL 164
Cdd:pfam09770  205 AQAKKPAQQPAPAPAQPPAAPPAQQAQQQQQFPPQIQQQQQPQQQPQQPQQHPGQGhpvtilqrpQSPQPDPAQPSIQPQ 284
                           90       100       110       120       130       140       150
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|.
gi 968121920   165 QPGSQVLPPP----PTTL----NGPGASPLPLPMYRPDGlSGPPPPNAQYQPPPLPGQTLGAGYPPQQAAN 227
Cdd:pfam09770  285 AQQFHQQPPPvpvqPTQIlqnpNRLSAARVGYPQNPQPG-VQPAPAHQAHRQQGSFGRQAPIITHPQQLAQ 354
PABP-1234 TIGR01628
polyadenylate binding protein, human types 1, 2, 3, 4 family; These eukaryotic proteins ...
102-232 9.60e-06

polyadenylate binding protein, human types 1, 2, 3, 4 family; These eukaryotic proteins recognize the poly-A of mRNA and consist of four tandem RNA recognition domains at the N-terminus (rrm: pfam00076) followed by a PABP-specific domain (pfam00658) at the C-terminus. The protein is involved in the transport of mRNAs from the nucleus to the cytoplasm. There are four paralogs in Homo sapiens, described respectively as expressed in testis, expressed in platelets, broadly expressed, and of unknown tissue range.


Pssm-ID: 130689 [Multi-domain]  Cd Length: 562  Bit Score: 49.42  E-value: 9.60e-06
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920   102 PSAQSSYPGPISTS-SVTQLGSQLSAMQINSY-----GSGMAPPSQGppgplsatslqtPPRPPQPSILQPGSQVLPPPP 175
Cdd:TIGR01628  381 RMRQLPMGSPMGGAmGQPPYYGQGPQQQFNGQplgwpRMSMMPTPMG------------PGGPLRPNGLAPMNAVRAPSR 448
                           90       100       110       120       130
                   ....*....|....*....|....*....|....*....|....*....|....*..
gi 968121920   176 TTLNGPGASPLPLPMYRPDGLSGPPPPNAQyQPPPLPGQTLGAGYPPQQAANSGPQM 232
Cdd:TIGR01628  449 NAQNAAQKPPMQPVMYPPNYQSLPLSQDLP-QPQSTASQGGQNKKLAQVLASATPQM 504
Med15 pfam09606
ARC105 or Med15 subunit of Mediator complex non-fungal; The approx. 70 residue Med15 domain of ...
91-262 1.18e-05

ARC105 or Med15 subunit of Mediator complex non-fungal; The approx. 70 residue Med15 domain of the ARC-Mediator co-activator is a three-helix bundle with marked similarity to the KIX domain. The sterol regulatory element binding protein (SREBP) family of transcription activators use the ARC105 subunit to activate target genes in the regulation of cholesterol and fatty acid homeostasis. In addition, Med15 is a critical transducer of gene activation signals that control early metazoan development.


Pssm-ID: 312941 [Multi-domain]  Cd Length: 732  Bit Score: 49.24  E-value: 1.18e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920    91 NNVASSHAPYQPSAQSSYPGPISTSSVTQLGSQLSAMQINSYGSGMA------PPSQGPPGPLSATSLQTPPRPPQpsIL 164
Cdd:pfam09606  122 NLLASLGRPQMPMGGAGFPSQMSRVGRMQPGGQAGGMMQPSSGQPGSgtpnqmGPNGGPGQGQAGGMNGGQQGPMG--GQ 199
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920   165 QPGSQVLPPPPTTLNGPGAsplplpMYRPDGLSGPPPPNAQYQPPPlpgQTLGAGYPPQQAANSGP-QMAGAQLS-YPGG 242
Cdd:pfam09606  200 MPPQMGVPGMPGPADAGAQ------MGQQAQANGGMNPQQMGGAPN---QVAMQQQQPQQQGQQSQlGMGINQMQqMPQG 270
                          170       180
                   ....*....|....*....|....*
gi 968121920   243 FP-----GGPAQMAGPPQPQKKLDP 262
Cdd:pfam09606  271 VGggagqGGPGQPMGPPGQQPGAMP 295
PRK07764 PRK07764
DNA polymerase III subunits gamma and tau; Validated
133-268 1.28e-05

DNA polymerase III subunits gamma and tau; Validated


Pssm-ID: 236090 [Multi-domain]  Cd Length: 824  Bit Score: 49.21  E-value: 1.28e-05
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920  133 GSGMAPPSQGPPGPlSATSLQTPPRPPQPSILQPGSQVLPPPPTTLNGPGASPLPLPMYRPDGLSGPPPPNAQYQPPPLP 212
Cdd:PRK07764  623 APAAPAPAGAAAAP-AEASAAPAPGVAAPEHHPKHVAVPDASDGGDGWPAKAGGAAPAAPPPAPAPAAPAAPAGAAPAQP 701
                          90       100       110       120       130
                  ....*....|....*....|....*....|....*....|....*....|....*.
gi 968121920  213 GQTLGAGYPPQQAANSGPQMAGAQLSYPGGFPGGPAQMAGPPQPQKKLDPDSIPSP 268
Cdd:PRK07764  702 APAPAATPPAGQADDPAAQPPQAAQGASAPSPAADDPVPLPPEPDDPPDPAGAPAQ 757
PRK13729 PRK13729
conjugal transfer pilus assembly protein TraB; Provisional
117-230 1.45e-05

conjugal transfer pilus assembly protein TraB; Provisional


Pssm-ID: 184281 [Multi-domain]  Cd Length: 475  Bit Score: 48.67  E-value: 1.45e-05
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920  117 VTQLGSQLSAMQINSYGSGMAPP-SQGPPGPLSATSLQTPPRPPQPSilqPGSQVLPPppttlNGPGASPLPLPMYrpDG 195
Cdd:PRK13729  106 IEKLGQDNAALAEQVKALGANPVtATGEPVPQMPASPPGPEGEPQPG---NTPVSFPP-----QGSVAVPPPTAFY--PG 175
                          90       100       110
                  ....*....|....*....|....*....|....*..
gi 968121920  196 LSGPPPPNAQYQPPPLPG--QTLGAGYPPQQAANSGP 230
Cdd:PRK13729  176 NGVTPPPQVTYQSVPVPNriQRKTFTYNEGKKGPSLP 212
PRK07764 PRK07764
DNA polymerase III subunits gamma and tau; Validated
30-285 1.83e-05

DNA polymerase III subunits gamma and tau; Validated


Pssm-ID: 236090 [Multi-domain]  Cd Length: 824  Bit Score: 48.83  E-value: 1.83e-05
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920   30 GDPSHTASPTGMMKPAGPLGATATRGMLPPGPPPPGPHQFGQNGAHATGHPPQRFPGPPPVNNVASSHAPYQPSAQSSYP 109
Cdd:PRK07764  589 GPAPGAAGGEGPPAPASSGPPEEAARPAAPAAPAAPAAPAPAGAAAAPAEASAAPAPGVAAPEHHPKHVAVPDASDGGDG 668
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920  110 GPISTSSVTQLgsqlsamqinsyGSGMAPPSQGPPGPLSATSLQTPPRPPQPSILQPGSQVLPPPPttlnGPGASPLPLP 189
Cdd:PRK07764  669 WPAKAGGAAPA------------APPPAPAPAAPAAPAGAAPAQPAPAPAATPPAGQADDPAAQPP----QAAQGASAPS 732
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920  190 MYRPDGLSGPPPPnaqyQPPPLPGQTLGAGYPPQQAANSGPqmagaqlsypggfPGGPAQMAGPPQPQKKLDPDsIPSPi 269
Cdd:PRK07764  733 PAADDPVPLPPEP----DDPPDPAGAPAQPPPPPAPAPAAA-------------PAAAPPPSPPSEEEEMAEDD-APSM- 793
                         250
                  ....*....|....*.
gi 968121920  270 qvieNDRASRGGQVYA 285
Cdd:PRK07764  794 ----DDEDRRDAEEVA 805
PRK12323 PRK12323
DNA polymerase III subunit gamma/tau;
134-294 1.91e-05

DNA polymerase III subunit gamma/tau;


Pssm-ID: 237057 [Multi-domain]  Cd Length: 700  Bit Score: 48.72  E-value: 1.91e-05
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920  134 SGMAPPSQGPPGPLSATSLQTPPRPPQPSILQPGSQVLPPPPTTLNGPGA---SPLPLPMYRPDGLSGPPPPNAqyqPPP 210
Cdd:PRK12323  376 TAAAAPVAQPAPAAAAPAAAAPAPAAPPAAPAAAPAAAAAARAVAAAPARrspAPEALAAARQASARGPGGAPA---PAP 452
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920  211 LPGQTLGAGYPPQQAANSGPQMAGAQLSYPGGFPGGPAQMAGPPQPQKKLdPDSIPSPiQVIENDRASRGGQVYATNTRG 290
Cdd:PRK12323  453 APAAAPAAAARPAAAGPRPVAAAAAAAPARAAPAAAPAPADDDPPPWEEL-PPEFASP-APAQPDAAPAGWVAESIPDPA 530

                  ....
gi 968121920  291 QIPP 294
Cdd:PRK12323  531 TADP 534
PHA03247 PHA03247
large tegument protein UL36; Provisional
86-268 2.03e-05

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 48.78  E-value: 2.03e-05
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920   86 GPPPVNNVASSHAPYQPSA--------QSSYPGPISTSSVTQLGSQLSAMQINSYGSGMAPPSQGPP----GPLSATSLQ 153
Cdd:PHA03247 2598 PRAPVDDRGDPRGPAPPSPlppdthapDPPPPSPSPAANEPDPHPPPTVPPPERPRDDPAPGRVSRPrrarRLGRAAQAS 2677
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920  154 TPPRPPQPSILQPGSQVL-------PPPPTTLNGPGA--SPLPLPMYRPDGLSGPPPPNAQYQPPPLPGQTLGAGYPPQQ 224
Cdd:PHA03247 2678 SPPQRPRRRAARPTVGSLtsladppPPPPTPEPAPHAlvSATPLPPGPAAARQASPALPAAPAPPAVPAGPATPGGPARP 2757
                         170       180       190       200
                  ....*....|....*....|....*....|....*....|....*.
gi 968121920  225 AANSGPqmAGAQLSYPGGFPGGPAQMAGPPQPQKKLDP--DSIPSP 268
Cdd:PHA03247 2758 ARPPTT--AGPPAPAPPAAPAAGPPRRLTRPAVASLSEsrESLPSP 2801
MISS pfam15822
MAPK-interacting and spindle-stabilising protein-like; MISS is a family of eukaryotic ...
139-256 2.34e-05

MAPK-interacting and spindle-stabilising protein-like; MISS is a family of eukaryotic MAPK-interacting and spindle-stabilising protein-like proteins. MISS is rich in prolines and has four potential MAPK-phosphorylation sites, a MAPK-docking site, a PEST sequence (PEST motif) and a bipartite nuclear localization signal. The endogenous protein accumulates during mouse meiotic maturation and is found as discrete dots on the MII spindle. MISS is the first example of a physiological MAPK-substrate that is stabilized in MII that specifically regulates MII spindle integrity during the CSF arrest.


Pssm-ID: 318115 [Multi-domain]  Cd Length: 238  Bit Score: 46.90  E-value: 2.34e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920   139 PSQGPPGPLSATSLQTPPRPPQ--PSILQPGSQVLPPPPT----TLNGPGASPLPLPMYRPDGLSGPPPpNAQYQPPPLP 212
Cdd:pfam15822   26 PPQGWPGSNPWNNPSAPPAVPSglPPSTAPSTVPFGPAPTgmypSIPLTGPSPGPPAPFPPSGPSCPPP-GGPYPAPTVP 104
                           90       100       110       120
                   ....*....|....*....|....*....|....*....|....
gi 968121920   213 GQTLGAGYPPqqaansgPQMAGAQLSYPGGFPGGPAQmAGPPQP 256
Cdd:pfam15822  105 GPGPIGPYPT-------PNMPFPELPRPYGAPTDPAA-AAPSGP 140
PRK07764 PRK07764
DNA polymerase III subunits gamma and tau; Validated
104-294 2.57e-05

DNA polymerase III subunits gamma and tau; Validated


Pssm-ID: 236090 [Multi-domain]  Cd Length: 824  Bit Score: 48.44  E-value: 2.57e-05
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920  104 AQSSYPGPISTSSVTQLGSQLsamQINsYGSGMAPPSQGPPGPLSATSLQTPPRPPQPSilQPGSQVLPPPPTTLNGPGA 183
Cdd:PRK07764  562 ASPGNAEVLVTALAEELGGDW---QVE-AVVGPAPGAAGGEGPPAPASSGPPEEAARPA--APAAPAAPAAPAPAGAAAA 635
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920  184 SPLPLPMYRPDGLSGPPPPNAQYQPPPlPGQTLGAGYPPQQAANSGPQMAGAqlsyPGGFPGGPAQMAGPPQPQKKLDPD 263
Cdd:PRK07764  636 PAEASAAPAPGVAAPEHHPKHVAVPDA-SDGGDGWPAKAGGAAPAAPPPAPA----PAAPAAPAGAAPAQPAPAPAATPP 710
                         170       180       190
                  ....*....|....*....|....*....|.
gi 968121920  264 SIPSPIQVIENDRASRGGQVYATNTRGQIPP 294
Cdd:PRK07764  711 AGQADDPAAQPPQAAQGASAPSPAADDPVPL 741
PHA03307 PHA03307
transcriptional regulator ICP4; Provisional
137-281 2.99e-05

transcriptional regulator ICP4; Provisional


Pssm-ID: 223039 [Multi-domain]  Cd Length: 1352  Bit Score: 48.24  E-value: 2.99e-05
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920  137 APPSQGPPGPLSATSLQTPPRPPQPsilqpgsqvLPPPPTTlnGPGASplplPMYRPDGLSGPPPPNAQYQPPPLPGQTL 216
Cdd:PHA03307   99 SPAREGSPTPPGPSSPDPPPPTPPP---------ASPPPSP--APDLS----EMLRPVGSPGPPPAASPPAAGASPAAVA 163
                          90       100       110       120       130       140
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*...
gi 968121920  217 GAGYPPQQAA---NSGPQMAGAQLSYPGGFPGGPAQMAGPPQPQKKLDPDSIPSPIQVIENDRASRGG 281
Cdd:PHA03307  164 SDAASSRQAAlplSSPEETARAPSSPPAEPPPSTPPAAASPRPPRRSSPISASASSPAPAPGRSAADD 231
FAP pfam07174
Fibronectin-attachment protein (FAP); This family contains bacterial fibronectin-attachment ...
137-228 4.59e-05

Fibronectin-attachment protein (FAP); This family contains bacterial fibronectin-attachment proteins (FAP). Family members are rich in alanine and proline, are approximately 300 residues long, and seem to be restricted to mycobacteria. These proteins contain a fibronectin-binding motif that allows mycobacteria to bind to fibronectin in the extracellular matrix.


Pssm-ID: 429334  Cd Length: 301  Bit Score: 46.46  E-value: 4.59e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920   137 APPSQGPPGPLSATSLQTPPRPPQPsilqPGSQVLPPPPTTLNGPGASPlplpmyrpdglsGPPPPNAQYQPPPLPgqtl 216
Cdd:pfam07174   39 ADPEPAPPPPSTATAPPAPPPPPPA----PAAPAPPPPPAAPNAPNAPP------------PPADPNAPPPPPADP---- 98
                           90
                   ....*....|..
gi 968121920   217 GAGYPPQQAANS 228
Cdd:pfam07174   99 NAPPPPAVDPNA 110
DUF3824 pfam12868
Domain of unknwon function (DUF3824); This is a repeating domain found in fungal proteins. It ...
172-256 4.92e-05

Domain of unknown function (DUF3824); This is a repeating domain found in fungal proteins. It is proline-rich, and the function is not known.


Pssm-ID: 372351 [Multi-domain]  Cd Length: 145  Bit Score: 44.35  E-value: 4.92e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920   172 PPPPTtlnGPGASPLPlpmYRPDGLSGPPPPNAQYQPPPLPgqTLGAGYPPQQAANSGPQMAGAQLSYPGGFPGGPAQMA 251
Cdd:pfam12868   62 PPSPA---GPYASQGQ---YYPETNYFPPPPGSTPQPPVDP--QPNAPPPPYNPADYPPPPGAAPPPQPYQYPPPPGPDP 133

                   ....*
gi 968121920   252 GPPQP 256
Cdd:pfam12868  134 YAPRP 138
DUF3729 pfam12526
Protein of unknown function (DUF3729); This family of proteins is found in viruses. Proteins ...
136-221 7.18e-05

Protein of unknown function (DUF3729); This family of proteins is found in viruses. Proteins in this family are typically between 145 and 1707 amino acids in length. The family is found in association with pfam01443, pfam01661, pfam05417, pfam01660, pfam00978. There is a single completely conserved residue L that may be functionally important.


Pssm-ID: 372164 [Multi-domain]  Cd Length: 115  Bit Score: 43.14  E-value: 7.18e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920   136 MAPPSQGPPGPLSATSLQTPPRPPQPSilqPGSQVLPPPPTTLNGPGASPLPLPMYRPDGLSGPPppnaqyQPPPLPGQT 215
Cdd:pfam12526   39 PPPPVGDPRPPVVDTPPPVSAVWVLPP---PSEPAAPEPDLVPPVTGPAGPPSPLAPPAPAQKPP------LPPPRPQRR 109

                   ....*.
gi 968121920   216 LGAGYP 221
Cdd:pfam12526  110 LLHTYP 115
Med15 pfam09606
ARC105 or Med15 subunit of Mediator complex non-fungal; The approx. 70 residue Med15 domain of ...
123-257 7.30e-05

ARC105 or Med15 subunit of Mediator complex non-fungal; The approx. 70 residue Med15 domain of the ARC-Mediator co-activator is a three-helix bundle with marked similarity to the KIX domain. The sterol regulatory element binding protein (SREBP) family of transcription activators use the ARC105 subunit to activate target genes in the regulation of cholesterol and fatty acid homeostasis. In addition, Med15 is a critical transducer of gene activation signals that control early metazoan development.


Pssm-ID: 312941 [Multi-domain]  Cd Length: 732  Bit Score: 46.93  E-value: 7.30e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920   123 QLSAMQINSYGSGMAPPSQGPPGPLSAtsLQTPP----RPPQPSILQPGSQvlPPPPTTLNGPG-ASPLPLPMYRPD--- 194
Cdd:pfam09606   59 QQQQPQGGQGNGGMGGGQQGMPDPINA--LQNLAgqgtRPQMMGPMGPGPG--GPMGQQMGGPGtASNLLASLGRPQmpm 134
                           90       100       110       120       130       140
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....
gi 968121920   195 GLSGPPPPNAQYQPPPLPGQTLGAGYPPQQAANS------GPQMAGAQLSYPGGFPGGPAQMAGPPQPQ 257
Cdd:pfam09606  135 GGAGFPSQMSRVGRMQPGGQAGGMMQPSSGQPGSgtpnqmGPNGGPGQGQAGGMNGGQQGPMGGQMPPQ 203
Med15 pfam09606
ARC105 or Med15 subunit of Mediator complex non-fungal; The approx. 70 residue Med15 domain of ...
148-294 8.95e-05

ARC105 or Med15 subunit of Mediator complex non-fungal; The approx. 70 residue Med15 domain of the ARC-Mediator co-activator is a three-helix bundle with marked similarity to the KIX domain. The sterol regulatory element binding protein (SREBP) family of transcription activators use the ARC105 subunit to activate target genes in the regulation of cholesterol and fatty acid homeostasis. In addition, Med15 is a critical transducer of gene activation signals that control early metazoan development.


Pssm-ID: 312941 [Multi-domain]  Cd Length: 732  Bit Score: 46.54  E-value: 8.95e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920   148 SATSLQTPPRPPQPSILQPGSQVLPPPPTTL-NGPGASPLPlPMYRPDGLSGPPPPNAQYQPPPLPGQTLGAGYPPQQAA 226
Cdd:pfam09606   56 KAAQQQQPQGGQGNGGMGGGQQGMPDPINALqNLAGQGTRP-QMMGPMGPGPGGPMGQQMGGPGTASNLLASLGRPQMPM 134
                           90       100       110       120       130       140       150
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|..
gi 968121920   227 NSG----PQMAGAQLSYPGGFPGGPAQMAGPPQPQKKLDPDSIPSPIQViendrASRGGQVYATNTRGQIPP 294
Cdd:pfam09606  135 GGAgfpsQMSRVGRMQPGGQAGGMMQPSSGQPGSGTPNQMGPNGGPGQG-----QAGGMNGGQQGPMGGQMP 201
PRK12323 PRK12323
DNA polymerase III subunit gamma/tau;
94-268 1.34e-04

DNA polymerase III subunit gamma/tau;


Pssm-ID: 237057 [Multi-domain]  Cd Length: 700  Bit Score: 46.02  E-value: 1.34e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920   94 ASSHAPYQPSAQSSYPGPISTSSVTQLGSQLSAMQINSYGSGMAPPSQGPPGPLSATSLQTPPRPPQPSILQPGSQVLPP 173
Cdd:PRK12323  400 AAPPAAPAAAPAAAAAARAVAAAPARRSPAPEALAAARQASARGPGGAPAPAPAPAAAPAAAARPAAAGPRPVAAAAAAA 479
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920  174 PPttLNGPGASPLPLPmyrpdglSGPPPPNAQYQPPPLPGQTLGAGYPPQQAANSGPQMAGAQLSYPggFPGGPAQMAGP 253
Cdd:PRK12323  480 PA--RAAPAAAPAPAD-------DDPPPWEELPPEFASPAPAQPDAAPAGWVAESIPDPATADPDDA--FETLAPAPAAA 548
                         170
                  ....*....|....*
gi 968121920  254 PQPQKKLDPDSIPSP 268
Cdd:PRK12323  549 PAPRAAAATEPVVAP 563
PHA02682 PHA02682
ORF080 virion core protein; Provisional
98-264 1.39e-04

ORF080 virion core protein; Provisional


Pssm-ID: 177464 [Multi-domain]  Cd Length: 280  Bit Score: 44.85  E-value: 1.39e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920   98 APYQPSAQSSYPGPISTSSVTQLGSQL-SAMQINS----YGSGMAPPSQGPPGPLSATSLQTP-PRPPQPSILQPGSQVL 171
Cdd:PHA02682   36 APAAPCPPDADVDPLDKYSVKEAGRYYqSRLKANSacmqRPSGQSPLAPSPACAAPAPACPACaPAAPAPAVTCPAPAPA 115
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920  172 PPPPTTLNGPGASPLPLPMyRPDglSGPPPPNAQYQP-PPLP-GQTLGAGYP---------PQQAANSGPQMAGAQLSYP 240
Cdd:PHA02682  116 CPPATAPTCPPPAVCPAPA-RPA--PACPPSTRQCPPaPPLPtPKPAPAAKPiflhnqlppPDYPAASCPTIETAPAASP 192
                         170       180
                  ....*....|....*....|....
gi 968121920  241 ggfpggpaqMAGPPQPQKKLDPDS 264
Cdd:PHA02682  193 ---------VLEPRIPDKIIDADN 207
PRK07764 PRK07764
DNA polymerase III subunits gamma and tau; Validated
133-265 1.46e-04

DNA polymerase III subunits gamma and tau; Validated


Pssm-ID: 236090 [Multi-domain]  Cd Length: 824  Bit Score: 45.75  E-value: 1.46e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920  133 GSGMAPPSQGPPGPLSATSLQTPPRPPQPsilqPGSQVLPPPPTTLNGPGASPLPLPMYRPDGLSGPPPPNAQYQPPPLP 212
Cdd:PRK07764  389 GGAGAPAAAAPSAAAAAPAAAPAPAAAAP----AAAAAPAPAAAPQPAPAPAPAPAPPSPAGNAPAGGAPSPPPAAAPSA 464
                          90       100       110       120       130
                  ....*....|....*....|....*....|....*....|....*....|...
gi 968121920  213 GQTLGAGYPPQQAANSGPQMAGAQLSypggfPGGPAQMAGPPQPQKKLDPDSI 265
Cdd:PRK07764  465 QPAPAPAAAPEPTAAPAPAPPAAPAP-----AAAPAAPAAPAAPAGADDAATL 512
DUF2076 pfam09849
Uncharacterized protein conserved in bacteria (DUF2076); This domain, found in various ...
179-251 1.47e-04

Uncharacterized protein conserved in bacteria (DUF2076); This domain, found in various hypothetical prokaryotic proteins, has no known function. The domain, however, is found in various periplasmic ligand-binding sensor proteins.


Pssm-ID: 430876  Cd Length: 263  Bit Score: 44.73  E-value: 1.47e-04
                           10        20        30        40        50        60        70
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|...
gi 968121920   179 NGPGASPLPLPMYRPDGLSGPPPPNAQYQPPPlPGQtlGAGYPPQQAANSGPQMAGAQLSYPGGFPGGPAQMA 251
Cdd:pfam09849   97 GGSQSRPPPPPQARPAWPAGQAPGQPQPYPGQ-PGY--AQQGQPQYGQPAQPPRGPWGPGGGGGFLGGALQTA 166
COG3416 COG3416
Uncharacterized conserved protein, DUF2076 domain [Function unknown];
117-251 1.89e-04

Uncharacterized conserved protein, DUF2076 domain [Function unknown];


Pssm-ID: 442642 [Multi-domain]  Cd Length: 237  Bit Score: 44.24  E-value: 1.89e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920  117 VTQLGSQLSAMQinsygsgmAPPSQGPPGPLSATSLQTPPRPPQPSILQPGSQVLPPPPttlngpgasplplpmyrpdgl 196
Cdd:COG3416    64 IQELEAQLAQLQ--------QQQPQSSGGFLSGLFGGGQRPPPAPQPSQPGPQQQPAPP--------------------- 114
                          90       100       110       120       130
                  ....*....|....*....|....*....|....*....|....*....|....*
gi 968121920  197 sgPPPPNAQYQPPPLPGQtlgagypPQQAAnsgPQMAGAQlsyPGGFPGGPAQMA 251
Cdd:COG3416   115 --SGPWGQAAPQQPGYGQ-------PQYGQ---PAAGPSG---GGGFLGGALQTA 154
Drf_FH1 pfam06346
Formin Homology Region 1; This region is found in some of the Diaphanous related formins (Drfs) ...
143-212 2.41e-04

Formin Homology Region 1; This region is found in some of the Diaphanous related formins (Drfs). It consists of low complexity repeats of around 12 residues.


Pssm-ID: 461881 [Multi-domain]  Cd Length: 157  Bit Score: 42.55  E-value: 2.41e-04
                           10        20        30        40        50        60        70
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920   143 PPGPLSATSLQTPPRPPQPsilqpGSQVLPPPPTTLngPGASPLPLPMYRPDGLSGPPPPNAQYQPPPLP 212
Cdd:pfam06346   83 PPPPLPGGAGIPPPPPPLP-----GGAGVPPPPPPL--PGGPGIPPPPPFPGGPGIPPPPPGMGMPPPPP 145
BimA_first NF040984
trimeric autotransporter actin-nucleating factor BimA; BimA (B. pseudomallei intracellular ...
94-211 2.43e-04

trimeric autotransporter actin-nucleating factor BimA; BimA (B. pseudomallei intracellular motility protein A) is a trimeric autotransporter, homologous in its C-terminal half to a number of trimeric autotransporter adhesins. It is a virulence factor that nucleates actin, so that actin polymerization can drive escape by B. pseudomallei out of one cell and into a neighboring cell. HMM NF040983 describes a homolog with similar activity but substantial difference in sequence architecture in the N-terminal region.


Pssm-ID: 468914 [Multi-domain]  Cd Length: 517  Bit Score: 44.86  E-value: 2.43e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920   94 ASSHAPYQPSaqssyPGPISTSSVTQLGSQLSAMQINSYGSgmappsqgPPGPLSATSLQTPPrpPQPSilqPGSQVLPP 173
Cdd:NF040984    6 SSSHAPDAPK-----PSSIATTLCRALASLSLGLSMDAEAN--------PPEPPGGTNIPVPP--PMPG---GGANIPVP 67
                          90       100       110
                  ....*....|....*....|....*....|....*...
gi 968121920  174 PPTTLNGPGASPLPLPmyrPDGLSGPPPpnaqyQPPPL 211
Cdd:NF040984   68 PPMPGGGANIPPPPPP---PGGIGGATP-----SPPPL 97
PBP1 COG5180
PAB1-binding protein, interacts with poly(A)-binding protein [RNA processing and modification]; ...
99-276 2.62e-04

PAB1-binding protein, interacts with poly(A)-binding protein [RNA processing and modification];


Pssm-ID: 444064 [Multi-domain]  Cd Length: 548  Bit Score: 45.05  E-value: 2.62e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920   99 PYQPSAQSSyPGPISTSSvtqlgSQLSAMQINSYGSGMAPPSQGPPGPLSATSLQTPPRPPQPSIL-----QPGSQVLPP 173
Cdd:COG5180   278 PGLPVLEAG-SEPQSDAP-----EAETARPIDVKGVASAPPATRPVRPPGGARDPGTPRPGQPTERpagvpEAASDAGQP 351
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920  174 PPTTLNGPGASPL-PLPMYRPD------------GLSGPPPPNAQYQPPPLPGQTLGAGYPP----QQAANSGPQMAGAQ 236
Cdd:COG5180   352 PSAYPPAEEAVPGkPLEQGAPRpgssggdgapfqPPNGAPQPGLGRRGAPGPPMGAGDLVQAaldgGGRETASLGGAAGG 431
                         170       180       190       200
                  ....*....|....*....|....*....|....*....|
gi 968121920  237 LSYPGGFPGGPAQMAGPPQPQKKLDPDSIPSPIQVIENDR 276
Cdd:COG5180   432 AGQGPKADFVPGDAESVSGPAGLADQAGAAASTAMADFVA 471
PHA03379 PHA03379
EBNA-3A; Provisional
98-333 3.69e-04

EBNA-3A; Provisional


Pssm-ID: 223066 [Multi-domain]  Cd Length: 935  Bit Score: 44.66  E-value: 3.69e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920   98 APYQPSAQSSYPGPISTSSVTQLGSQlsamqinsygSGMAP--PSQGPPGPLSATSL--QTP-----PRP-PQPSILQPG 167
Cdd:PHA03379  434 ATSHGSAQVPEPPPVHDLEPGPLHDQ----------HSMAPcpVAQLPPGPLQDLEPgdQLPgvvqdGRPaCAPVPAPAG 503
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920  168 SQVLPPPPTTLNGPGASPLPlpmYRPDGLSGP--PPPNAQYQPPPLPGQTLGAGYPPqqAANSGPQMAGAQLSYPGGFPg 245
Cdd:PHA03379  504 PIVRPWEASLSQVPGVAFAP---VMPQPMPVEpvPVPTVALERPVCPAPPLIAMQGP--GETSGIVRVRERWRPAPWTP- 577
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920  246 gpaqmaGPPQPqkkldpdsiPSPIQVieNDRASRG---GQVYATNTRGQIPPLVTTDCMIQDQGNASPRFIRCTTYCFPC 322
Cdd:PHA03379  578 ------NPPRS---------PSQMSV--RDRLARLraeAQPYQASVEVQPPQLTQVSPQQPMEYPLEPEQQMFPGSPFSQ 640
                         250
                  ....*....|.
gi 968121920  323 TSDMAKQAQIP 333
Cdd:PHA03379  641 VADVMRAGGVP 651
PRK10263 PRK10263
DNA translocase FtsK; Provisional
100-260 6.01e-04

DNA translocase FtsK; Provisional


Pssm-ID: 236669 [Multi-domain]  Cd Length: 1355  Bit Score: 43.92  E-value: 6.01e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920  100 YQPSAQSsYPGPISTSSVTQLGSQLSAMQINSYG----SGMAPPS--------------QGPPGPLSATSLQTPPRPPQP 161
Cdd:PRK10263  680 YQHDVPV-NAEDADAAAEAELARQFAQTQQQRYSgeqpAGANPFSlddfefspmkalldDGPHEPLFTPIVEPVQQPQQP 758
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920  162 SILQPGSQVL--PPPPTTLNGPGASPLPlPMYRPDGLSGPPPPNAQYQPPPLPGQTLGAGYPPQQAANSGPQMAGAQlsy 239
Cdd:PRK10263  759 VAPQQQYQQPqqPVAPQPQYQQPQQPVA-PQPQYQQPQQPVAPQPQYQQPQQPVAPQPQYQQPQQPVAPQPQYQQPQ--- 834
                         170       180
                  ....*....|....*....|.
gi 968121920  240 pggfpggpAQMAgpPQPQKKL 260
Cdd:PRK10263  835 --------QPVA--PQPQDTL 845
DUF3729 pfam12526
Protein of unknown function (DUF3729); This family of proteins is found in viruses. Proteins ...
139-214 7.20e-04

Protein of unknown function (DUF3729); This family of proteins is found in viruses. Proteins in this family are typically between 145 and 1707 amino acids in length. The family is found in association with pfam01443, pfam01661, pfam05417, pfam01660, pfam00978. There is a single completely conserved residue L that may be functionally important.


Pssm-ID: 372164 [Multi-domain]  Cd Length: 115  Bit Score: 40.45  E-value: 7.20e-04
                           10        20        30        40        50        60        70
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*.
gi 968121920   139 PSQGPPGPLSATSLQTPPRPPQPSILQPGSQVLPPPPTTLNGPGASPLPLPMYRPDGLSGPPPPNAQYQPPPLPGQ 214
Cdd:pfam12526   29 FSPPESAHPDPPPPVGDPRPPVVDTPPPVSAVWVLPPPSEPAAPEPDLVPPVTGPAGPPSPLAPPAPAQKPPLPPP 104
Gag_spuma pfam03276
Spumavirus gag protein;
134-272 7.80e-04

Spumavirus gag protein;


Pssm-ID: 460872 [Multi-domain]  Cd Length: 614  Bit Score: 43.58  E-value: 7.80e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920   134 SGMAPPSQGPPGPLSATSlqtppRPPQPSILQPGSQVLPPPPTTLNGPGASPLPLPMYRPDGLSGPPPPNAQYQPPPLPG 213
Cdd:pfam03276  176 AEISPGAQGGIPPGASFS-----GLPSLPAIGGIHLPAIPGIHARAPPGNIARSLGDDIMPSLGDAGMPQPRFAFHPGNP 250
                           90       100       110       120       130       140
                   ....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920   214 QTLGAGYPPQQAANSGPQmagAQLSYPG-GFPGGPAQMAGPPQPQKKLDPDSIPSPIQVI 272
Cdd:pfam03276  251 FAEAEGHPFAEAEGERPR---DIPRAPRiDAPSAPAIPAIQPIAPPMIPPIGAPIPIPHG 307
Pro-rich pfam15240
Proline-rich protein; This family includes several eukaryotic proline-rich proteins.
160-262 8.98e-04

Proline-rich protein; This family includes several eukaryotic proline-rich proteins.


Pssm-ID: 464580 [Multi-domain]  Cd Length: 167  Bit Score: 41.18  E-value: 8.98e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920   160 QPSILQPGSQVLPPPPTTLNGPGASPLPLPMYRPDGLSGPPPPNAQYQPPPLPGQTLGAGYPPQQAANSGPqmaGAQLSY 239
Cdd:pfam15240   28 SPSLISEEEGQSQQGGQGPQGPPPGGFPPQPPASDDPPGPPPPGGPQQPPPQGGKQKPQGPPPQGGPRPPP---GKPQGP 104
                           90       100
                   ....*....|....*....|...
gi 968121920   240 PggfPGGPAQMAGPPQPQKKLDP 262
Cdd:pfam15240  105 P---PQGGNQQQGPPPPGKPQGP 124
PABP-1234 TIGR01628
polyadenylate binding protein, human types 1, 2, 3, 4 family; These eukaryotic proteins ...
153-260 9.21e-04

polyadenylate binding protein, human types 1, 2, 3, 4 family; These eukaryotic proteins recognize the poly-A of mRNA and consist of four tandem RNA recognition domains at the N-terminus (rrm: pfam00076) followed by a PABP-specific domain (pfam00658) at the C-terminus. The protein is involved in the transport of mRNAs from the nucleus to the cytoplasm. There are four paralogs in Homo sapiens: one expressed in testis, one in platelets, one broadly expressed, and one of unknown tissue range.


Pssm-ID: 130689 [Multi-domain]  Cd Length: 562  Bit Score: 43.26  E-value: 9.21e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920   153 QTPPRPPQPSILQP--GSQVLPP----------PPTTLNGPGASPLPLPMY--RPDGLSGPPPPNAQYQPP----PLPGQ 214
Cdd:TIGR01628  377 QLQPRMRQLPMGSPmgGAMGQPPyygqgpqqqfNGQPLGWPRMSMMPTPMGpgGPLRPNGLAPMNAVRAPSrnaqNAAQK 456
                           90       100       110       120       130
                   ....*....|....*....|....*....|....*....|....*....|..
gi 968121920   215 TLGAGYPPQQAANSGPQMAGAQLSYPGGFPGGP----AQM--AGPPQPQKKL 260
Cdd:TIGR01628  457 PPMQPVMYPPNYQSLPLSQDLPQPQSTASQGGQnkklAQVlaSATPQMQKQV 508
SP6_N cd22544
N-terminal domain of transcription factor Specificity Protein (SP) 6; Specificity Proteins ...
114-254 9.48e-04

N-terminal domain of transcription factor Specificity Protein (SP) 6; Specificity Proteins (SPs) are transcription factors that are involved in many cellular processes, including cell differentiation, cell growth, apoptosis, immune responses, response to DNA damage, and chromatin remodeling. SP6, also known as epiprofin, shows a specific expression pattern in hair follicles and the apical ectodermal ridge (AER) of the developing limbs. SP6 null mice are nude and show defects in skin, teeth, limbs (syndactyly and oligodactyly), and lung alveoli. SP6 belongs to a family of proteins, called the SP/Kruppel or Krueppel-like Factor (KLF) family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. SP factors preferentially bind GC boxes, while KLFs bind CACCC boxes. Another characteristic hallmark of SP factors is the presence of the Buttonhead (BTD) box CXCPXC, just N-terminal to the zinc fingers. The function of the BTD box is unknown, but it is thought to play an important physiological role. Another feature of most SP factors is the presence of a conserved amino acid stretch, the so-called SP box, located close to the N-terminus. SP factors may be separated into three groups based on their domain architecture and the similarity of their N-terminal transactivation domains: SP1-4, SP5, and SP6-9. The transactivation domains between the three groups are not homologous to one another. This model represents the N-terminal domain of SP6.
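
The degenerate-base notation in the consensus motif above (N = any nucleotide, R = A/G, Y = C/T) can be expanded into a regular expression for scanning a DNA string. The following is a minimal sketch only, not part of the CDD record: the function names and the test sequence are hypothetical, and it scans a single strand in the given orientation without any binding-affinity model.

```python
import re

# Degenerate codes as defined in the description above:
# N = any nucleotide, R = A/G, Y = C/T. Plain bases map to themselves.
CODES = {"A": "A", "C": "C", "G": "G", "T": "T",
         "R": "[AG]", "Y": "[CT]", "N": "[ACGT]"}


def motif_to_regex(motif):
    """Translate a degenerate motif into a compiled regex (hypothetical helper).

    The body is wrapped in a lookahead so overlapping occurrences are found.
    """
    body = "".join(CODES[ch] for ch in motif.upper())
    return re.compile("(?=" + body + ")")


def find_motif(motif, sequence):
    """Return 0-based start positions of the motif on the given strand."""
    pattern = motif_to_regex(motif)
    return [m.start() for m in pattern.finditer(sequence.upper())]


if __name__ == "__main__":
    # Made-up sequence for illustration; not taken from any real promoter.
    print(find_motif("NNRCRCCYY", "TTAAGCACCCTGG"))
```

A regex is used here because the motif has fixed length and only per-position alternatives; a position weight matrix would be the usual choice if graded matches were needed.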


Pssm-ID: 411693 [Multi-domain]  Cd Length: 245  Bit Score: 42.22  E-value: 9.48e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920  114 TSSVTQLGSQLSAMQINSYGSGMAPPSQ-----------GPPGPLSATSLQTPPRPPQPSILQPGSQVlPPPPTTLNGPG 182
Cdd:cd22544     3 TAVCGSLGNQHSETPRASPPTLDLQPLQpyqihsspeagDYPSPLQPTELQSLPLGPGVDFSARESYE-PHSSRRTCLDL 81
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920  183 ASPLPLPMYRPdgLSGPPPPNAQ-YQP---PPLPGQTLGAG----------------YPPQQAANSGPQMAGAQLSYPGG 242
Cdd:cd22544    82 ESDLPLGPFPK--LLHPPPDMAHpYESwfrPPHPGGSGEEGgvpswwdlhagsswmdLQHGQGGLQSPGPPGGLQPPLGG 159
                         170
                  ....*....|..
gi 968121920  243 FpGGPAQMAGPP 254
Cdd:cd22544   160 Y-GSEHQLCGPP 170
PRK14951 PRK14951
DNA polymerase III subunits gamma and tau; Provisional
93-217 1.08e-03

DNA polymerase III subunits gamma and tau; Provisional


Pssm-ID: 237865 [Multi-domain]  Cd Length: 618  Bit Score: 42.78  E-value: 1.08e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920   93 VASSHAPYQPSAQSSYPGPIStssVTQLGSQLSAMQINSYGSGMAPPSQGPPGPLSATSLQTPPRPPQPsilqpGSQVLP 172
Cdd:PRK14951  375 PAEKKTPARPEAAAPAAAPVA---QAAAAPAPAAAPAAAASAPAAPPAAAPPAPVAAPAAAAPAAAPAA-----APAAVA 446
                          90       100       110       120
                  ....*....|....*....|....*....|....*....|....*...
gi 968121920  173 PPPTTLNGPGASPLPLPMY---RPDGLSGPPPPNAQYQPPPLPGQTLG 217
Cdd:PRK14951  447 LAPAPPAQAAPETVAIPVRvapEPAVASAAPAPAAAPAAARLTPTEEG 494
PHA03307 PHA03307
transcriptional regulator ICP4; Provisional
94-269 1.28e-03

transcriptional regulator ICP4; Provisional


Pssm-ID: 223039 [Multi-domain]  Cd Length: 1352  Bit Score: 42.85  E-value: 1.28e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920   94 ASSHAPYQPSAQSSYPGPISTSSVTQL---GSQLSAMQINSYGSGMAPPSQGPPGPLSATSLQT-PPRPPQPSILQPgsq 169
Cdd:PHA03307  101 AREGSPTPPGPSSPDPPPPTPPPASPPpspAPDLSEMLRPVGSPGPPPAASPPAAGASPAAVASdAASSRQAALPLS--- 177
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920  170 vLPPPPTTLNGPGASPLPLpmyRPDGLSGPPPPNAQYQPPPLPGQTLGAGYPPQQAANSGPQMAGAQLSYPGGFPGGPAQ 249
Cdd:PHA03307  178 -SPEETARAPSSPPAEPPP---STPPAAASPRPPRRSSPISASASSPAPAPGRSAADDAGASSSDSSSSESSGCGWGPEN 253
                         170       180
                  ....*....|....*....|
gi 968121920  250 MAGPPQPQKKLDPDSIPSPI 269
Cdd:PHA03307  254 ECPLPRPAPITLPTRIWEAS 273
BimA_second NF040983
trimeric autotransporter actin-nucleating factor BimA; This HMM describes BimA (Burkholderia ...
139-247 1.35e-03

trimeric autotransporter actin-nucleating factor BimA; This HMM describes BimA (Burkholderia intracellular motility A), WP_004266405.1-like proteins in Burkholderia mallei or B. pseudomallei. The term BimA has also been used for WP_011205626.1-like homologs that have a very different N-terminal half.


Pssm-ID: 468913 [Multi-domain]  Cd Length: 382  Bit Score: 42.20  E-value: 1.35e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920  139 PSQGPPGPlsatslqtPPRPPQPSILQPGSQVLPPPPttlngPGASPLPLPmyrpdglsgPPPPNAQYQPPPlPGQTLGA 218
Cdd:NF040983   86 PNKVPPPP--------PPPPPPPPPPPTPPPPPPPPP-----PPPPPSPPP---------PPPPSPPPSPPP-PTTTPPT 142
                          90       100
                  ....*....|....*....|....*....
gi 968121920  219 GYPPQQAANSgPQMAGAQlsyPGGFPGGP 247
Cdd:NF040983  143 RTTPSTTTPT-PSMHPIQ---PTQLPSIP 167
PRK14971 PRK14971
DNA polymerase III subunit gamma/tau;
121-226 1.51e-03

DNA polymerase III subunit gamma/tau;


Pssm-ID: 237874 [Multi-domain]  Cd Length: 614  Bit Score: 42.46  E-value: 1.51e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920  121 GSQLSAMQINSYGSGMAPPSQGPPGPLSATSLQTPPRPPQPSILQPGSQVLP-PPPTTLNGPGASPLPLPMYRPDglSGP 199
Cdd:PRK14971  371 GGRGPKQHIKPVFTQPAAAPQPSAAAAASPSPSQSSAAAQPSAPQSATQPAGtPPTVSVDPPAAVPVNPPSTAPQ--AVR 448
                          90       100       110
                  ....*....|....*....|....*....|...
gi 968121920  200 PPPNAQYQPPPLPGQ------TLGAGYPPQQAA 226
Cdd:PRK14971  449 PAQFKEEKKIPVSKVsslgpsTLRPIQEKAEQA 481
PRK07764 PRK07764
DNA polymerase III subunits gamma and tau; Validated
132-257 1.79e-03

DNA polymerase III subunits gamma and tau; Validated


Pssm-ID: 236090 [Multi-domain]  Cd Length: 824  Bit Score: 42.28  E-value: 1.79e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920  132 YGSGMAPPSQGPPGPLSATslqTPPRPPQPsilqpgsqvlPPPPTTLNGPGASPLPLPMYRPDglSGPPPPNAQYQPPPL 211
Cdd:PRK07764  387 VAGGAGAPAAAAPSAAAAA---PAAAPAPA----------AAAPAAAAAPAPAAAPQPAPAPA--PAPAPPSPAGNAPAG 451
                          90       100       110       120
                  ....*....|....*....|....*....|....*....|....*.
gi 968121920  212 PGQTLGAGypPQQAANSGPQMAGAQLSYPGGFPGGPAQMAGPPQPQ 257
Cdd:PRK07764  452 GAPSPPPA--AAPSAQPAPAPAAAPEPTAAPAPAPPAAPAPAAAPA 495
Prog_receptor pfam02161
Progesterone receptor;
99-200 1.83e-03

Progesterone receptor;


Pssm-ID: 460470  Cd Length: 564  Bit Score: 42.22  E-value: 1.83e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920    99 PYQPSAQSSYPGPIS------TSSVTQLGSQLSAMQINSYGSGMAPPSQGPPGPLsatslqtPPRPPQPSILQPGSQVLP 172
Cdd:pfam02161  424 PLPPRAPSSRPGEAAvaaapaSASVSSASSSGSTLECILYKAEGAPPQQGPFAPP-------PCKPPGAGACLLPRDGLP 496
                           90       100
                   ....*....|....*....|....*...
gi 968121920   173 PPPTTLNGPGASPlplPMYRPDGLSGPP 200
Cdd:pfam02161  497 STSASAAAAGAAP---ALYPPLGLNGLP 521
PHA03307 PHA03307
transcriptional regulator ICP4; Provisional
132-268 2.04e-03

transcriptional regulator ICP4; Provisional


Pssm-ID: 223039 [Multi-domain]  Cd Length: 1352  Bit Score: 42.47  E-value: 2.04e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920  132 YGS-GMAPPSQGP---PGPLSATSLQTPPRPPQ-----PSILQPGSQVLPPPPTTLNGPGASPLPLPMYRPDGLSGPPPP 202
Cdd:PHA03307   37 SGSqGQLVSDSAElaaVTVVAGAAACDRFEPPTgpppgPGTEAPANESRSTPTWSLSTLAPASPAREGSPTPPGPSSPDP 116
                          90       100       110       120       130       140       150
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920  203 NAQYQPPPLPGQTLGAGYPPQQAANSGPQMAGAQLSYPGGFPGGPAQMAGPPQPQKKL----DPDSIPSP 268
Cdd:PHA03307  117 PPPTPPPASPPPSPAPDLSEMLRPVGSPGPPPAASPPAAGASPAAVASDAASSRQAALplssPEETARAP 186
PRK13729 PRK13729
conjugal transfer pilus assembly protein TraB; Provisional
181-257 2.41e-03

conjugal transfer pilus assembly protein TraB; Provisional


Pssm-ID: 184281 [Multi-domain]  Cd Length: 475  Bit Score: 41.73  E-value: 2.41e-03
                          10        20        30        40        50        60        70
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*...
gi 968121920  181 PGASP-LPLPMYRPDGLSGPPPPNAQYQPPPLPgqtlgAGYPPQQAANSGPQMAgaqlSYPGGFPGGPAQMAGPPQPQ 257
Cdd:PRK13729  123 LGANPvTATGEPVPQMPASPPGPEGEPQPGNTP-----VSFPPQGSVAVPPPTA----FYPGNGVTPPPQVTYQSVPV 191
PHA03377 PHA03377
EBNA-3C; Provisional
101-271 2.83e-03

EBNA-3C; Provisional


Pssm-ID: 177614 [Multi-domain]  Cd Length: 1000  Bit Score: 41.58  E-value: 2.83e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920  101 QPSAQSSYPGPISTSSVTQLGSQLSAMQINSYGSGMAPPSqgPPGPLSATSLQTPPRPPQPSILQPGSQVLPPPPTTLNG 180
Cdd:PHA03377  671 QPATQSTPPRPSWLPSVFVLPSVDAGRAQPSEESHLSSMS--PTQPISHEEQPRYEDPDDPLDLSLHPDQAPPPSHQAPY 748
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920  181 PGASPLPLPMYRPDGLSGPPPPNAQY---QPPPLPGQTLgAGYPpqqaANSGPQMAGAQL-----------SYPG-GFPG 245
Cdd:PHA03377  749 SGHEEPQAQQAPYPGYWEPRPPQAPYlgyQEPQAQGVQV-SSYP----GYAGPWGLRAQHpryrhswaywsQYPGhGHPQ 823
                         170       180
                  ....*....|....*....|....*.
gi 968121920  246 GPAQMAgPPQPQKKLDPDSIPSPIQV 271
Cdd:PHA03377  824 GPWAPR-PPHLPPQWDGSAGHGQDQV 848
PPE COG5651
PPE-repeat protein [Function unknown];
70-254 2.87e-03

PPE-repeat protein [Function unknown];


Pssm-ID: 444372 [Multi-domain]  Cd Length: 385  Bit Score: 41.42  E-value: 2.87e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920   70 GQNGAHATGhPPQRFPGPPPVNNVASSHAPYQPSAQSSYPGPISTSS-VTQLGSQLSAMQINSYGSGMAPPSQGPPGPLS 148
Cdd:COG5651   179 GLLGAQNAG-SGNTSSNPGFANLGLTGLNQVGIGGLNSGSGPIGLNSgPGNTGFAGTGAAAGAAAAAAAAAAAAGAGASA 257
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920  149 AtslqtpprpPQPSILQPGSQVLPPPPTTLNGPGASPLPLPmyrPDGLSGPPPPNAQYQPPPLPGQTLGAGYPPQQAANS 228
Cdd:COG5651   258 A---------LASLAATLLNASSLGLAATAASSAATNLGLA---GSPLGLAGGGAGAAAATGLGLGAGGAAGAAGATGAG 325
                         170       180
                  ....*....|....*....|....*.
gi 968121920  229 GPQMAGAQLSYPGGFPGGPAQMAGPP 254
Cdd:COG5651   326 AALGAGAAAAAAGAAAGAGAAAAAAA 351
PHA03247 PHA03247
large tegument protein UL36; Provisional
137-226 3.97e-03

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 41.46  E-value: 3.97e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920  137 APPSQGPPGPLSATSLQTPPRPPQPSIlqpgsqvlPPPPTTLNGPGASPL----PLPMYRPDGLSGPPPPNAQYQPPPLP 212
Cdd:PHA03247  388 ARHAATPFARGPGGDDQTRPAAPVPAS--------VPTPAPTPVPASAPPppatPLPSAEPGSDDGPAPPPERQPPAPAT 459
                          90
                  ....*....|....
gi 968121920  213 GQTLGAGYPPQQAA 226
Cdd:PHA03247  460 EPAPDDPDDATRKA 473
Drf_FH1 pfam06346
Formin Homology Region 1; This region is found in some of the Diaphanous related formins (Drfs) ...
143-254 4.52e-03

Formin Homology Region 1; This region is found in some of the Diaphanous related formins (Drfs). It consists of low complexity repeats of around 12 residues.


Pssm-ID: 461881 [Multi-domain]  Cd Length: 157  Bit Score: 39.08  E-value: 4.52e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920   143 PPGPLSATSLQTPPRPPQPSIlqpgsqvlpPPPTTLNGPGASPLPLPMYRPDGLSGPPP-PNAQY--QPPPLPGQTLGAG 219
Cdd:pfam06346    1 PPPPPLPGDSSTIPLPPGACI---------PTPPPLPGGGGPPPPPPLPGSAAIPPPPPlPGGTSipPPPPLPGAASIPP 71
                           90       100       110
                   ....*....|....*....|....*....|....*
gi 968121920   220 YPPQQAANSGPQmagaqlsyPGGFPGGPAQMAGPP 254
Cdd:pfam06346   72 PPPLPGSTGIPP--------PPPLPGGAGIPPPPP 98
SAV_2336_NTERM NF041121
SAV_2336 family N-terminal domain; This HMM describes an N-terminal domain shared by SAV_2336 ...
138-205 5.41e-03

SAV_2336 family N-terminal domain; This HMM describes an N-terminal domain shared by SAV_2336 (BAC70047.1) whose C-terminal region suggests restriction enzyme activity (PMID: 18456708), and with other proteins with unrelated C-terminal regions. A member protein was also identified in a kanamycin biosynthetic gene cluster (PMID:16766657), while N-terminal regions of two other member proteins were named Trypco1 in a bioinformatic study (PMID:32101166) of predicted bacterial conflict systems.


Pssm-ID: 469044 [Multi-domain]  Cd Length: 473  Bit Score: 40.37  E-value: 5.41e-03
                          10        20        30        40        50        60
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*...
gi 968121920  138 PPSQGPPGPLSATSLQTPPRPPQPSILQPGSQVLPPPPTTLNGPGASPLPLPMYRPDglsGPPPPNAQ 205
Cdd:NF041121   39 PPPAAPPSPPGDPPEPPAPEPAPLPAPYPGSLAPPPPPPPGPAGAAPGAALPVRVPA---PPALPNPL 103
Med15 pfam09606
ARC105 or Med15 subunit of Mediator complex non-fungal; The approx. 70 residue Med15 domain of ...
3-231 7.12e-03

ARC105 or Med15 subunit of Mediator complex non-fungal; The approx. 70 residue Med15 domain of the ARC-Mediator co-activator is a three-helix bundle with marked similarity to the KIX domain. The sterol regulatory element binding protein (SREBP) family of transcription activators uses the ARC105 subunit to activate target genes in the regulation of cholesterol and fatty acid homeostasis. In addition, Med15 is a critical transducer of gene activation signals that control early metazoan development.


Pssm-ID: 312941 [Multi-domain]  Cd Length: 732  Bit Score: 40.38  E-value: 7.12e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920     3 QQGYVATPPYSQPQPGIGLSPPHYGHYGDPShtASPTGMmkPAGPLGATATR-----------GMLPPGPPPPGPHQFGQ 71
Cdd:pfam09606  243 MQQQQPQQQGQQSQLGMGINQMQQMPQGVGG--GAGQGG--PGQPMGPPGQQpgampnvmsigDQNNYQQQQTRQQQQQQ 318
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920    72 NGAHATGHPPQRFPGPPPVNNVASshAPYQPSAQSSYPGPISTSSVTQLGSQLSAMqinsygsgMAPPSQGPPGPL-SAT 150
Cdd:pfam09606  319 GGNHPAAHQQQMNQSVGQGGQVVA--LGGLNHLETWNPGNFGGLGANPMQRGQPGM--------MSSPSPVPGQQVrQVT 388
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920   151 SLQTPPRPPQPSILQPGSQVLPPPPTTLNGPGASPLPLPMYRPDGLSGPPPPNAQYQPPplPGQTLGAgyPPQQAANS-- 228
Cdd:pfam09606  389 PNQFMRQSPQPSVPSPQGPGSQPPQSHPGGMIPSPALIPSPSPQMSQQPAQQRTIGQDS--PGGSLNT--PGQSAVNSpl 464

                   ...
gi 968121920   229 GPQ 231
Cdd:pfam09606  465 NPQ 467
SAV_2336_NTERM NF041121
SAV_2336 family N-terminal domain; This HMM describes an N-terminal domain shared by SAV_2336 ...
137-214 7.50e-03

SAV_2336 family N-terminal domain; This HMM describes an N-terminal domain shared by SAV_2336 (BAC70047.1) whose C-terminal region suggests restriction enzyme activity (PMID: 18456708), and with other proteins with unrelated C-terminal regions. A member protein was also identified in a kanamycin biosynthetic gene cluster (PMID:16766657), while N-terminal regions of two other member proteins were named Trypco1 in a bioinformatic study (PMID:32101166) of predicted bacterial conflict systems.


Pssm-ID: 469044 [Multi-domain]  Cd Length: 473  Bit Score: 39.99  E-value: 7.50e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920  137 APPSQGPPGPLSATSLQTPPRPPQPSiLQPGSQVLPPPPTTLNGPGASP--LPLPMYRPDGLSGPPPPNAQ----YQPPP 210
Cdd:NF041121   20 APPSPEGPAPTAASQPATPPPPAAPP-SPPGDPPEPPAPEPAPLPAPYPgsLAPPPPPPPGPAGAAPGAALpvrvPAPPA 98

                  ....
gi 968121920  211 LPGQ 214
Cdd:NF041121   99 LPNP 102
PHA03201 PHA03201
uracil DNA glycosylase; Provisional
156-225 7.99e-03


Pssm-ID: 165468  Cd Length: 318  Bit Score: 39.49  E-value: 7.99e-03
                          10        20        30        40        50        60        70
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*..
gi 968121920  156 PRPPQPSilqPGSQVLPPPPTTLNGPGASPLPLPMYRPDGLSGPPPPNAQYQPPPL-------PGQTLGAGYPPQQA 225
Cdd:PHA03201    4 ARSRSPS---PPRRPSPPRPTPPRSPDASPEETPPSPPGPGAEPPPGRAAGPAAPRrrprgcpAGVTFSSSAPPRPP 77
PHA03379 PHA03379
EBNA-3A; Provisional
94-279 8.67e-03


Pssm-ID: 223066 [Multi-domain]  Cd Length: 935  Bit Score: 40.04  E-value: 8.67e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920   94 ASSHAPYQ----PSAQSSYPGPISTSS------VTQL----------GSQLSAmqINSYGSGMAPPSQGPPGPL----SA 149
Cdd:PHA03379  434 ATSHGSAQvpepPPVHDLEPGPLHDQHsmapcpVAQLppgplqdlepGDQLPG--VVQDGRPACAPVPAPAGPIvrpwEA 511
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 968121920  150 TSLQTPPRPPQPSILQPGSQVLPPPPTTLNGPGASPLP--LPMYRPDGLSG---------PPP--PNAQYQPPPLPGQT- 215
Cdd:PHA03379  512 SLSQVPGVAFAPVMPQPMPVEPVPVPTVALERPVCPAPplIAMQGPGETSGivrvrerwrPAPwtPNPPRSPSQMSVRDr 591
                         170       180       190       200       210       220
                  ....*....|....*....|....*....|....*....|....*....|....*....|....
gi 968121920  216 LGAGYPPQQAANSGPQMAGAQLsyPGGFPGGPaqMAGPPQPQKKLDPDSIPSpiQVIENDRASR 279
Cdd:PHA03379  592 LARLRAEAQPYQASVEVQPPQL--TQVSPQQP--MEYPLEPEQQMFPGSPFS--QVADVMRAGG 649
 
Blast search parameters
Data Source: Precalculated data, version = cdd.v.3.21
Preset Options: Database: CDSEARCH/cdd   Low complexity filter: no   Composition Based Adjustment: yes   E-value threshold: 0.01
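
The hit list above can be checked against the E-value threshold with a short script. The following is a minimal sketch, not part of CD-Search itself: the helper names `parse_summary` and `passes_threshold` are hypothetical, the regular expression assumes the "Pssm-ID / Cd Length / Bit Score / E-value" line layout shown on this page, and treating a hit as passing when its E-value is at or below the threshold is an assumption.

```python
import re

# Hypothetical pattern for the per-hit summary line as it appears on this page,
# e.g. "Pssm-ID: 469044 [Multi-domain]  Cd Length: 473  Bit Score: 40.37  E-value: 5.41e-03".
# The bracketed tag is optional because some hits on this page do not carry it.
SUMMARY_RE = re.compile(
    r"Pssm-ID:\s*(?P<pssm_id>\d+)"
    r"(?:\s*\[(?P<tag>[^\]]+)\])?"
    r"\s+Cd Length:\s*(?P<cd_length>\d+)"
    r"\s+Bit Score:\s*(?P<bit_score>[\d.]+)"
    r"\s+E-value:\s*(?P<e_value>[\d.eE+-]+)"
)


def parse_summary(line):
    """Return the fields of one summary line as a dict, or None if it does not match."""
    match = SUMMARY_RE.search(line)
    if match is None:
        return None
    return {
        "pssm_id": int(match.group("pssm_id")),
        "multi_domain": match.group("tag") == "Multi-domain",
        "cd_length": int(match.group("cd_length")),
        "bit_score": float(match.group("bit_score")),
        "e_value": float(match.group("e_value")),
    }


def passes_threshold(hit, threshold=0.01):
    """Assumed rule: keep a hit whose E-value is at or below the threshold."""
    return hit["e_value"] <= threshold


if __name__ == "__main__":
    lines = [
        "Pssm-ID: 469044 [Multi-domain]  Cd Length: 473  Bit Score: 40.37  E-value: 5.41e-03",
        "Pssm-ID: 165468  Cd Length: 318  Bit Score: 39.49  E-value: 7.99e-03",
    ]
    for line in lines:
        hit = parse_summary(line)
        print(hit["pssm_id"], hit["e_value"], passes_threshold(hit))
```

The two sample lines are copied from the hits listed above; a different threshold can be passed as the second argument of `passes_threshold`.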

References:

  • Wang J et al. (2023), "The conserved domain database in 2023", Nucleic Acids Res. 51(D1):D384-D388.
  • Lu S et al. (2020), "The conserved domain database in 2020", Nucleic Acids Res. 48(D1):D265-D268.
  • Marchler-Bauer A et al. (2017), "CDD/SPARCLE: functional classification of proteins via subfamily domain architectures", Nucleic Acids Res. 45(D1):D200-D203.