Conserved domains on  [gi|442619636|ref|NP_001262676|]

uncharacterized protein Dmel_CG14322, isoform D [Drosophila melanogaster]

Protein Classification

WD40 repeat domain-containing protein (domain architecture ID 11455410)

WD40 repeat domain-containing protein similar to proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly

CATH:  2.130.10.10
PubMed:  10322433|8090199
SCOP:  4002744

List of domain hits

Name  Accession  Description  Interval  E-value
WD40  COG2319  WD40 repeat [General function prediction only]  1307-1566  6.71e-08

Pssm-ID: 441893 [Multi-domain]  Cd Length: 403  Bit Score: 56.84  E-value: 6.71e-08
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442619636 1307 VIAAAEDGDIYVFHLVTHKLEQKITKHSEAITNMFLSEKDSILYTTSADGffkkssllnlervfeTVYL-----KEPLQS 1381
Cdd:COG2319    93 LASASADGTVRLWDLATGLLLRTLTGHTGAVRSVAFSPDGKTLASGSADG---------------TVRLwdlatGKLLRT 157
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442619636 1382 M----DVAWGLAF--------IGSRWGQISTFNVVTNKVVeKPLVSTGQSIIAIKATKEGvrKILVLGCKGNFVQMHDAG 1449
Cdd:COG2319   158 LtghsGAVTSVAFspdgkllaSGSDDGTVRLWDLATGKLL-RTLTGHTGAVRSVAFSPDG--KLLASGSADGTVRLWDLA 234
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442619636 1450 NGLLLRhVFIAEGLNIYSLLL--DEGHIYCGTQKNELYQLEFVSGNLVTKFSCGNGAVAVAAY--GERYLLVGCYDGYIY 1525
Cdd:COG2319   235 TGKLLR-TLTGHSGSVRSVAFspDGRLLASGSADGTVRLWDLATGELLRTLTGHSGGVNSVAFspDGKLLASGSDDGTVR 313
                         250       260       270       280
                  ....*....|....*....|....*....|....*....|...
gi 442619636 1526 VLNKITGTQTGRFAGAGRMV--LALSVVGDKIVTSSKDNSLAI 1566
Cdd:COG2319   314 LWDLATGKLLRTLTGHTGAVrsVAFSPDGKTLASGSDDGTVRL 356
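
The summary line that precedes each alignment (Pssm-ID, Cd Length, Bit Score, E-value) has a regular shape, so it can be pulled into structured fields. The following is a minimal sketch that assumes the line looks exactly as it does on this page; the names `SUMMARY_RE` and `parse_summary` are illustrative and not part of any NCBI tool.

```python
import re

# Assumed layout, copied from this page:
# "Pssm-ID: 441893 [Multi-domain]  Cd Length: 403  Bit Score: 56.84  E-value: 6.71e-08"
SUMMARY_RE = re.compile(
    r"Pssm-ID:\s*(?P<pssm_id>\d+)\s*"
    r"(?:\[(?P<model_type>[^\]]+)\]\s*)?"   # bracketed tag, e.g. [Multi-domain]
    r"Cd Length:\s*(?P<cd_length>\d+)\s*"
    r"Bit Score:\s*(?P<bit_score>\d+(?:\.\d+)?)\s*"
    r"E-value:\s*(?P<evalue>\d+(?:\.\d+)?(?:[eE][+-]?\d+)?)"
)


def parse_summary(line: str) -> dict:
    """Return the numeric fields of one hit summary line as a dict."""
    match = SUMMARY_RE.search(line)
    if match is None:
        raise ValueError(f"not a hit summary line: {line!r}")
    return {
        "pssm_id": int(match["pssm_id"]),
        "model_type": match["model_type"],  # None when the tag is absent
        "cd_length": int(match["cd_length"]),
        "bit_score": float(match["bit_score"]),
        "evalue": float(match["evalue"]),
    }


if __name__ == "__main__":
    line = ("Pssm-ID: 441893 [Multi-domain]  Cd Length: 403  "
            "Bit Score: 56.84  E-value: 6.71e-08")
    print(parse_summary(line))
```
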
PTZ00395 super family  cl33180  Sec24-related protein; Provisional  5-257  9.25e-05

The actual alignment was detected with superfamily member PTZ00395:

Pssm-ID: 185594 [Multi-domain]  Cd Length: 1560  Bit Score: 47.38  E-value: 9.25e-05
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442619636    5 QQQRGESGGGAPPDTGRDSA--SQPAKSSTA-SGSGHSATCSHSPSSKNRGhFNNRVHSNHNQRQTPYPKSNYENRGRSD 81
Cdd:PTZ00395  373 PDARGAWAGGPHSNASYNCAaySNAAQSNAAqSNAGFSNAGYSNPGNSNPG-YNNAPNSNTPYNNPPNSNTPYSNPPNSN 451
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442619636   82 TRSSNQERQERDYTKrTDFRERNTRFSDERHSGRYSDYHRR------------NHYHRFKSWGSEGRSYRRDKGARDLSK 149
Cdd:PTZ00395  452 PPYSNLPYSNTPYSN-APLSNAPPSSAKDHHSAYHAAYQHRaanqpaanlptaNQPAANNFHGAAGNSVGNPFASRPFGS 530
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442619636  150 SRYDNDASGSSlirgaseirrknghncDPNSDASKTENSEDIQSCINHYQTEENKIDTEQSKDQGANRTSPTEQLS---- 225
Cdd:PTZ00395  531 APYGGNAATTA----------------DPNGIAKREDHPEGGTNRQKYEQSDEESVESSSSENSSENENEVTDKGEeiys 594
                         250       260       270       280
                  ....*....|....*....|....*....|....*....|...
gi 442619636  226 ---------DISKQSNPIALEGDQNKKTNELKESSCS--SKPS 257
Cdd:PTZ00395  595 llkktinriDMNKIPRPIINTQEKKKKKNLKVFETCKyiSPPS 637
 
assembly_YfgL  TIGR03300  outer membrane assembly lipoprotein YfgL; Members of this protein family are YfgL, a ...  1468-1568  1.74e-04

outer membrane assembly lipoprotein YfgL; Members of this protein family are YfgL, a lipoprotein component of a complex that acts in protein insertion into the bacterial outer membrane. Other members of this complex are NlpB, YfiO, and YaeT. This protein contains multiple copies of a repeat that, in other contexts, is associated with binding of the coenzyme PQQ. [Protein fate, Protein and peptide secretion and trafficking]


Pssm-ID: 274511 [Multi-domain]  Cd Length: 377  Bit Score: 45.69  E-value: 1.74e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442619636  1468 LLLDEGHIYCGTQKNELYQLEFVSGNLVTKfscgNGAV------AVAAYGeRYLLVGCYDGYIYVLNkitgTQTGRFagA 1541
Cdd:TIGR03300  275 PAVDDNRLYVTDADGVVVALDRRSGSELWK----NDELkyrqltAPAVLG-GYLVVGDFEGYLHWLD----RDDGSF--V 343
                           90       100       110
                   ....*....|....*....|....*....|....
gi 442619636  1542 GRM-------VLALSVVGDKIVTSSKDNSLAILE 1568
Cdd:TIGR03300  344 ARLktdgsgiASPPVVVGDGLLVQTRDGDLYAFR 377
WD40  cd00200  WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions ...  1304-1359  4.28e-04

WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel β-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment.


Pssm-ID: 238121 [Multi-domain]  Cd Length: 289  Bit Score: 44.25  E-value: 4.28e-04
                          10        20        30        40        50
                  ....*....|....*....|....*....|....*....|....*....|....*.
gi 442619636 1304 RDNVIAAAEDGDIYVFHLVTHKLEQKITKHSEAITNMFLSEKDSILYTTSADGFFK 1359
Cdd:cd00200   231 GYLLASGSEDGTIRVWDLRTGECVQTLSGHTNSVTSLAWSPDGKRLASGSADGTIR 286
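
An alignment block such as the cd00200 hit above can be summarised by counting identical columns. The sketch below assumes the display convention used on this page, where uppercase letters are aligned residues while lowercase letters and dashes mark unaligned residues and gaps. The function name `aligned_identity` is hypothetical, and the two strings are copied from the cd00200 block.

```python
def aligned_identity(query: str, subject: str) -> tuple[int, int]:
    """Return (identical, aligned) column counts for two alignment rows.

    Only columns where both rows carry an uppercase residue are counted as
    aligned; lowercase residues and '-' are treated as unaligned (assumption).
    """
    if len(query) != len(subject):
        raise ValueError("alignment rows must have the same length")
    identical = aligned = 0
    for q, s in zip(query, subject):
        if q.isupper() and s.isupper():
            aligned += 1
            if q == s:
                identical += 1
    return identical, aligned


# Rows copied from the cd00200 alignment (query 1304-1359, model 231-286).
QUERY = "RDNVIAAAEDGDIYVFHLVTHKLEQKITKHSEAITNMFLSEKDSILYTTSADGFFK"
SUBJECT = "GYLLASGSEDGTIRVWDLRTGECVQTLSGHTNSVTSLAWSPDGKRLASGSADGTIR"

if __name__ == "__main__":
    identical, aligned = aligned_identity(QUERY, SUBJECT)
    print(f"{identical}/{aligned} identical columns "
          f"({100 * identical / aligned:.1f}%)")
```

For multi-block alignments, concatenate the sequence portions of each block (without the leading labels and coordinates) before calling the function.
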
U2AF_lg  TIGR01642  U2 snRNP auxiliary factor, large subunit, splicing factor; These splicing factors consist of ...  51-151  9.03e-04

U2 snRNP auxiliary factor, large subunit, splicing factor; These splicing factors consist of an N-terminal arginine-rich low complexity domain followed by three tandem RNA recognition motifs (pfam00076). The well-characterized members of this family are auxiliary components of the U2 small nuclear ribonucleoprotein splicing factor (U2AF). These proteins are closely related to the CC1-like subfamily of splicing factors (TIGR01622). Members of this subfamily are found in plants, metazoa and fungi.


Pssm-ID: 273727 [Multi-domain]  Cd Length: 509  Bit Score: 43.73  E-value: 9.03e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442619636    51 RGHFNNRVHSNHNQRQTPYPKSNYENRGRSDTRSSNQERQERDYTKRTDFRERNTRFSDERHSGRYSDYHRRNHYHRFKS 130
Cdd:TIGR01642    1 RDEEPDREREKSRGRDRDRSSERPRRRSRDRSRFRDRHRRSRERSYREDSRPRDRRRYDSRSPRSLRYSSVRRSRDRPRR 80
                           90       100
                   ....*....|....*....|.
gi 442619636   131 wgsegRSYRRDKGARDLSKSR 151
Cdd:TIGR01642   81 -----RSRSVRSIEQHRRRLR 96
MSCRAMM_ClfA  NF033609  MSCRAMM family adhesin clumping factor ClfA; Clumping factor A is an MSCRAMM (Microbial ...  10-295  3.07e-03

MSCRAMM family adhesin clumping factor ClfA; Clumping factor A is an MSCRAMM (Microbial Surface Components Recognizing Adhesive Matrix Molecules). It is heavily studied in Staphylococcus aureus both for its biological role in adhesion and for its potential for vaccination. Features of the sequence, but also of other MSCRAMM adhesins, include a long run of Ser-Asp dipeptide repeats and a C-terminal cell wall anchoring LPXTG motif.


Pssm-ID: 468110 [Multi-domain]  Cd Length: 934  Bit Score: 42.20  E-value: 3.07e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442619636   10 ESGGGAPPDTGRDSASQPAKSSTASGSGHSATCSHSPS-SKNRGHFNNRVHSNHNQRQTPYPKSNYENRGRSDTRSSNQE 88
Cdd:NF033609  581 DSGSDSTSDSGSDSASDSDSASDSDSASDSDSASDSDSaSDSDSASDSDSASDSDSASDSDSDSDSDSDSDSDSDSDSDS 660
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442619636   89 RQERDYTKRTDFRERNTRFSDErHSGRYSDYHRRNHYHRFKSWGSEGRSYRRDKGARDlSKSRYDNDASGSSLIRGASEI 168
Cdd:NF033609  661 DSDSDSDSDSDSDSDSDSDSDS-DSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSD-SDSDSDSDSDSDSDSDSDSDS 738
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442619636  169 RRKNGHNCDPNSDA-SKTENSEDIQSCINHYQTEENKIDTEQSKDQGANRTSPTEQLSDISKQSNPIAlEGDQNKKTNEL 247
Cdd:NF033609  739 DSDSDSDSDSDSDSdSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDS-DSDSDSDSDSD 817
                         250       260       270       280       290
                  ....*....|....*....|....*....|....*....|....*....|.
gi 442619636  248 KESSCSSKPSREINETSKASQFPDQRSKEQSSGEELNVSES---SDNHSGA 295
Cdd:NF033609  818 SDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSESdsnSDSESGS 868
 
WD40  COG2319  WD40 repeat [General function prediction only]  1307-1566  2.82e-07


Pssm-ID: 441893 [Multi-domain]  Cd Length: 403  Bit Score: 54.92  E-value: 2.82e-07
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442619636 1307 VIAAAEDGDIYVFHLVTHKLEQKITKHSEAITNMFLSEKDSILYTTSADGffkkssllnlervfeTVYL-----KEPLQS 1381
Cdd:COG2319   135 LASGSADGTVRLWDLATGKLLRTLTGHSGAVTSVAFSPDGKLLASGSDDG---------------TVRLwdlatGKLLRT 199
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442619636 1382 M----DVAWGLAF--------IGSRWGQISTFNVVTNKVVeKPLVSTGQSIIAIKATKEGvrKILVLGCKGNFVQMHDAG 1449
Cdd:COG2319   200 LtghtGAVRSVAFspdgkllaSGSADGTVRLWDLATGKLL-RTLTGHSGSVRSVAFSPDG--RLLASGSADGTVRLWDLA 276
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442619636 1450 NGLLLRhVFIAEGLNIYSLLL--DEGHIYCGTQKNELYQLEFVSGNLVTKFSCGNG---AVAVAAYGeRYLLVGCYDGYI 1524
Cdd:COG2319   277 TGELLR-TLTGHSGGVNSVAFspDGKLLASGSDDGTVRLWDLATGKLLRTLTGHTGavrSVAFSPDG-KTLASGSDDGTV 354
                         250       260       270       280
                  ....*....|....*....|....*....|....*....|....
gi 442619636 1525 YVLNKITGTQTGRFAGAGRMV--LALSVVGDKIVTSSKDNSLAI 1566
Cdd:COG2319   355 RLWDLATGELLRTLTGHTGAVtsVAFSPDGRTLASGSADGTVRL 398
 
Blast search parameters
Data Source: Precalculated data, version = cdd.v.3.21
Preset Options: Database: CDSEARCH/cdd; Low complexity filter: no; Composition Based Adjustment: yes; E-value threshold: 0.01
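
To make the effect of the E-value threshold concrete, the sketch below re-applies a cutoff to the hits listed on this page. The `DomainHit` class and `filter_hits` function are illustrative names, and treating the threshold as inclusive is an assumption; the intervals and E-values are copied from the hit list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DomainHit:
    name: str
    accession: str
    start: int
    end: int
    evalue: float


# Distinct hits reported on this page (best E-value per accession).
HITS = [
    DomainHit("WD40", "COG2319", 1307, 1566, 6.71e-08),
    DomainHit("PTZ00395", "PTZ00395", 5, 257, 9.25e-05),
    DomainHit("assembly_YfgL", "TIGR03300", 1468, 1568, 1.74e-04),
    DomainHit("WD40", "cd00200", 1304, 1359, 4.28e-04),
    DomainHit("U2AF_lg", "TIGR01642", 51, 151, 9.03e-04),
    DomainHit("MSCRAMM_ClfA", "NF033609", 10, 295, 3.07e-03),
]


def filter_hits(hits, threshold=0.01):
    """Keep hits with E-value <= threshold (inclusive cutoff is an assumption),
    sorted from most to least significant."""
    return sorted((h for h in hits if h.evalue <= threshold),
                  key=lambda h: h.evalue)


if __name__ == "__main__":
    for hit in filter_hits(HITS, threshold=1e-4):
        print(hit.accession, f"{hit.start}-{hit.end}", hit.evalue)
```

With the page's preset of 0.01 every listed hit passes; a stricter cutoff such as 1e-4 keeps only the COG2319 and PTZ00395 hits.
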
