Conserved domains on [gi|254588077|ref|NP_001156979|]

multimerin-1 isoform b precursor [Mus musculus]

Protein Classification

calcium-binding EGF-like domain-containing protein (domain architecture ID 13728361)

Calcium-binding epidermal growth factor (EGF)-like domain-containing protein; may play a crucial role in numerous protein-protein interactions.

List of domain hits

Name Accession Description Interval E-value
C1q super family cl23878
C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement ...
1075-1209 4.57e-40

C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system.


The actual alignment was detected with superfamily member smart00110:

Pssm-ID: 420072  Cd Length: 135  Bit Score: 144.75  E-value: 4.57e-40
                            10        20        30        40        50        60        70        80
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 254588077   1075 SYRYAPMVAFFVSHTHGMTAPG-PILFNDLSVNYGASYNPRTGKFRIPYLGVYIFKYTIESFSAHISGFFVVDGVDKLRF 1153
Cdd:smart00110    1 NYKAQPRSAFSVIRSNRPPPPGqPIRFDKVLYNQQGHYDPRTGKFTCPVPGVYYFSYHVESKGRNVKVSLMKNGIQVMST 80
                            90       100       110       120       130
                    ....*....|....*....|....*....|....*....|....*....|....*...
gi 254588077   1154 eseNADNEIHCDRVLTGDALFELNYGQEVWLRL--VKGTIPIKYPPVTTFSGYLLYRT 1209
Cdd:smart00110   81 ---YDEYQKGLYDVASGGALLQLRQGDQVWLELpdEKNGLYAGEYVDSTFSGFLLFPD 135
EMI pfam07546
EMI domain; The Pfam alignment is truncated at the C-terminus and does not include the final ...
193-262 3.05e-12

EMI domain; The Pfam alignment is truncated at the C-terminus and does not include the final cysteine defined in Callebaut et al. This is to stop the family overlapping with other domains.


Pssm-ID: 462204  Cd Length: 69  Bit Score: 62.82  E-value: 3.05e-12
                           10        20        30        40        50        60        70
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*
gi 254588077   193 KNWCAHvhtKLSPTVILD-TGSHL----PSGRGSCGWYSsgLCSRRsQKTSNAVYRMQHKIVTSLEWRCCPGYIG 262
Cdd:pfam07546    1 RNVCAY---KVVSCVVVTgTESYVqpvyKPYLTWCAGHR--RCSTY-RTTYRPAYRQVYKTVTRLEWRCCPGWGG 69
CCDC158 super family cl37899
Coiled-coil domain-containing protein 158; CCDC158 is a family of proteins found in eukaryotes. ...
305-852 2.79e-08

Coiled-coil domain-containing protein 158; CCDC158 is a family of proteins found in eukaryotes. The function is not known.


The actual alignment was detected with superfamily member pfam15921:

Pssm-ID: 464943 [Multi-domain]  Cd Length: 1112  Bit Score: 58.59  E-value: 2.79e-08
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 254588077   305 IQKLAEQLSQQERKLsllQKKVDNASLVADD----MRNAYLSLEGKVGE-DNSRQFQSFLKALKSKSIEDL---LKNIVK 376
Cdd:pfam15921   76 IERVLEEYSHQVKDL---QRRLNESNELHEKqkfyLRQSVIDLQTKLQEmQMERDAMADIRRRESQSQEDLrnqLQNTVH 152
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 254588077   377 E--QFKVFQDDMQETTAQIFKTVSSLSEDLESTRQAVLQVNQSFVSSTAQKdfAFMQENQPTwkditdlknsiMNIRQeM 454
Cdd:pfam15921  153 EleAAKCLKEDMLEDSNTQIEQLRKMMLSHEGVLQEIRSILVDFEEASGKK--IYEHDSMST-----------MHFRS-L 218
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 254588077   455 ALTCEKPVKELEAKQAHLEGALRQEHSQI-VLYHQSLNET---LSKMQEAHTQLLSV--LQVSG-TENVATEESLNSNVT 527
Cdd:pfam15921  219 GSAISKILRELDTEISYLKGRIFPVEDQLeALKSESQNKIellLQQHQDRIEQLISEheVEITGlTEKASSARSQANSIQ 298
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 254588077   528 KYISVLQETASKQGLMLLQMLSDLhvqESKISNLtillemekesaRGEceemLSKCRHDFKFQLKDTEENLHVLNQTLSE 607
Cdd:pfam15921  299 SQLEIIQEQARNQNSMYMRQLSDL---ESTVSQL-----------RSE----LREAKRMYEDKIEELEKQLVLANSELTE 360
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 254588077   608 VIFPMD---IKVDKMSEQLNDLTYDMEILQPLLEQRSSLQHQVIHKPKEATVT----RRELQNLIGAINQLNVLTKELTK 680
Cdd:pfam15921  361 ARTERDqfsQESGNLDDQLQKLLADLHKREKELSLEKEQNKRLWDRDTGNSITidhlRRELDDRNMEVQRLEALLKAMKS 440
                          410       420       430       440       450       460       470       480
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 254588077   681 --RHNLLRN--EVQSRGEAFERRISEHA-LET---------EDGLNKTMTVINNAIDFVQDNYVLKETLSAM-TYNPKVC 745
Cdd:pfam15921  441 ecQGQMERQmaAIQGKNESLEKVSSLTAqLEStkemlrkvvEELTAKKMTLESSERTVSDLTASLQEKERAIeATNAEIT 520
                          490       500       510       520       530       540       550       560
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 254588077   746 ECNQNMDNILtfvSEFQHLNDSIQTLVNNKEKYNFI-LQIAKALTAIpkdEKLNQlNFQNIYQLFNE-----TTSQVNKC 819
Cdd:pfam15921  521 KLRSRVDLKL---QELQHLKNEGDHLRNVQTECEALkLQMAEKDKVI---EILRQ-QIENMTQLVGQhgrtaGAMQVEKA 593
                          570       580       590
                   ....*....|....*....|....*....|....*..
gi 254588077   820 QQ----NMSHLEENMLSVTKTAKefETRLQGIESKVT 852
Cdd:pfam15921  594 QLekeiNDRRLELQEFKILKDKK--DAKIRELEARVS 628
EGF_CA cd00054
Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular ...
1026-1058 1.23e-07

Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements.


Pssm-ID: 238011  Cd Length: 38  Bit Score: 48.79  E-value: 1.23e-07
                          10        20        30
                  ....*....|....*....|....*....|....
gi 254588077 1026 CSSF-PCQNGGTCISGRSNFICACRHPFMGDTCT 1058
Cdd:cd00054     5 CASGnPCQNGGTCVNTVGSYRCSCPPGYTGRNCE 38
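
The summary line and the two alignment rows of each hit can be read programmatically. The following is a minimal Python sketch, assuming the text layout shown above; the names `parse_summary` and `identity` are hypothetical helpers, not part of any NCBI tool. It parses the EGF_CA (cd00054) summary line, counts identical residues in the aligned columns, and computes how much of the 38-position model the hit covers (model positions 5 to 38).

```python
import re

# Pattern for lines such as
# "Pssm-ID: 464943 [Multi-domain]  Cd Length: 1112  Bit Score: 58.59  E-value: 2.79e-08"
SUMMARY_RE = re.compile(
    r"Pssm-ID:\s*(?P<pssm_id>\d+)"
    r"(?:\s*\[(?P<flag>[^\]]+)\])?"
    r"\s+Cd Length:\s*(?P<cd_length>\d+)"
    r"\s+Bit Score:\s*(?P<bit_score>[\d.]+)"
    r"\s+E-value:\s*(?P<e_value>[\d.eE+-]+)"
)


def parse_summary(line):
    """Turn one hit summary line into a dict of typed values."""
    m = SUMMARY_RE.search(line)
    if m is None:
        raise ValueError(f"not a hit summary line: {line!r}")
    return {
        "pssm_id": int(m.group("pssm_id")),
        "multi_domain": m.group("flag") == "Multi-domain",
        "cd_length": int(m.group("cd_length")),
        "bit_score": float(m.group("bit_score")),
        "e_value": float(m.group("e_value")),
    }


def identity(query_row, model_row):
    """Return (identical columns, columns without a gap) for two aligned rows.

    Case is ignored when comparing residues; columns where either row has
    '-' are skipped. This is a simplification chosen for this sketch.
    """
    pairs = [(q, s) for q, s in zip(query_row, model_row) if "-" not in (q, s)]
    matches = sum(q.upper() == s.upper() for q, s in pairs)
    return matches, len(pairs)


summary = parse_summary(
    "Pssm-ID: 238011  Cd Length: 38  Bit Score: 48.79  E-value: 1.23e-07"
)
matches, aligned = identity(
    "CSSF-PCQNGGTCISGRSNFICACRHPFMGDTCT",  # gi 254588077, residues 1026-1058
    "CASGnPCQNGGTCVNTVGSYRCSCPPGYTGRNCE",  # Cdd:cd00054, positions 5-38
)
model_coverage = (38 - 5 + 1) / summary["cd_length"]

print(summary)
print(f"identity: {matches}/{aligned}, model coverage: {model_coverage:.2f}")
```

Percent identity computed this way is only a rough descriptor; the bit score and E-value reported on the page come from the position-specific scoring matrix, not from a simple residue count.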
 
Name Accession Description Interval E-value
C1Q smart00110
Complement component C1q domain; Globular domain found in many collagens and eponymously in ...
1075-1209 4.57e-40

Complement component C1q domain; Globular domain found in many collagens and eponymously in complement C1q. When part of full length proteins these domains form a 'bouquet' due to the multimerization of heterotrimers. The C1q fold is similar to that of tumour necrosis factor.


Pssm-ID: 128420  Cd Length: 135  Bit Score: 144.75  E-value: 4.57e-40
                            10        20        30        40        50        60        70        80
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 254588077   1075 SYRYAPMVAFFVSHTHGMTAPG-PILFNDLSVNYGASYNPRTGKFRIPYLGVYIFKYTIESFSAHISGFFVVDGVDKLRF 1153
Cdd:smart00110    1 NYKAQPRSAFSVIRSNRPPPPGqPIRFDKVLYNQQGHYDPRTGKFTCPVPGVYYFSYHVESKGRNVKVSLMKNGIQVMST 80
                            90       100       110       120       130
                    ....*....|....*....|....*....|....*....|....*....|....*...
gi 254588077   1154 eseNADNEIHCDRVLTGDALFELNYGQEVWLRL--VKGTIPIKYPPVTTFSGYLLYRT 1209
Cdd:smart00110   81 ---YDEYQKGLYDVASGGALLQLRQGDQVWLELpdEKNGLYAGEYVDSTFSGFLLFPD 135
C1q pfam00386
C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement ...
1083-1206 5.19e-31

C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system.


Pssm-ID: 395310 [Multi-domain]  Cd Length: 126  Bit Score: 118.54  E-value: 5.19e-31
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 254588077  1083 AFFVSHTHGMTAPG--PILFNDLSVNYGASYNPRTGKFRIPYLGVYIFKYTIEsfSAHISGFFV---VDGVDKLRFESEN 1157
Cdd:pfam00386    1 AFSAGRTTGLTAPNeqPVRFDKVLTNIGGHYDPATGKFTCPVPGVYYFSYHIT--TVDGKSLYVslvKNGQEVVSFYDQP 78
                           90       100       110       120       130
                   ....*....|....*....|....*....|....*....|....*....|.
gi 254588077  1158 ADNEihcDRVLTGDALFELNYGQEVWLRL--VKGTIPIKYPPVTTFSGYLL 1206
Cdd:pfam00386   79 QKGS---LDVASGSVVLELQRGDEVWLQLtgYNGLYYDGSDTDSTFSGFLL 126
EMI pfam07546
EMI domain; The Pfam alignment is truncated at the C-terminus and does not include the final ...
193-262 3.05e-12

EMI domain; The Pfam alignment is truncated at the C-terminus and does not include the final cysteine defined in Callebaut et al. This is to stop the family overlapping with other domains.


Pssm-ID: 462204  Cd Length: 69  Bit Score: 62.82  E-value: 3.05e-12
                           10        20        30        40        50        60        70
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*
gi 254588077   193 KNWCAHvhtKLSPTVILD-TGSHL----PSGRGSCGWYSsgLCSRRsQKTSNAVYRMQHKIVTSLEWRCCPGYIG 262
Cdd:pfam07546    1 RNVCAY---KVVSCVVVTgTESYVqpvyKPYLTWCAGHR--RCSTY-RTTYRPAYRQVYKTVTRLEWRCCPGWGG 69
CCDC158 pfam15921
Coiled-coil domain-containing protein 158; CCDC158 is a family of proteins found in eukaryotes. ...
305-852 2.79e-08

Coiled-coil domain-containing protein 158; CCDC158 is a family of proteins found in eukaryotes. The function is not known.


Pssm-ID: 464943 [Multi-domain]  Cd Length: 1112  Bit Score: 58.59  E-value: 2.79e-08
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 254588077   305 IQKLAEQLSQQERKLsllQKKVDNASLVADD----MRNAYLSLEGKVGE-DNSRQFQSFLKALKSKSIEDL---LKNIVK 376
Cdd:pfam15921   76 IERVLEEYSHQVKDL---QRRLNESNELHEKqkfyLRQSVIDLQTKLQEmQMERDAMADIRRRESQSQEDLrnqLQNTVH 152
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 254588077   377 E--QFKVFQDDMQETTAQIFKTVSSLSEDLESTRQAVLQVNQSFVSSTAQKdfAFMQENQPTwkditdlknsiMNIRQeM 454
Cdd:pfam15921  153 EleAAKCLKEDMLEDSNTQIEQLRKMMLSHEGVLQEIRSILVDFEEASGKK--IYEHDSMST-----------MHFRS-L 218
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 254588077   455 ALTCEKPVKELEAKQAHLEGALRQEHSQI-VLYHQSLNET---LSKMQEAHTQLLSV--LQVSG-TENVATEESLNSNVT 527
Cdd:pfam15921  219 GSAISKILRELDTEISYLKGRIFPVEDQLeALKSESQNKIellLQQHQDRIEQLISEheVEITGlTEKASSARSQANSIQ 298
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 254588077   528 KYISVLQETASKQGLMLLQMLSDLhvqESKISNLtillemekesaRGEceemLSKCRHDFKFQLKDTEENLHVLNQTLSE 607
Cdd:pfam15921  299 SQLEIIQEQARNQNSMYMRQLSDL---ESTVSQL-----------RSE----LREAKRMYEDKIEELEKQLVLANSELTE 360
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 254588077   608 VIFPMD---IKVDKMSEQLNDLTYDMEILQPLLEQRSSLQHQVIHKPKEATVT----RRELQNLIGAINQLNVLTKELTK 680
Cdd:pfam15921  361 ARTERDqfsQESGNLDDQLQKLLADLHKREKELSLEKEQNKRLWDRDTGNSITidhlRRELDDRNMEVQRLEALLKAMKS 440
                          410       420       430       440       450       460       470       480
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 254588077   681 --RHNLLRN--EVQSRGEAFERRISEHA-LET---------EDGLNKTMTVINNAIDFVQDNYVLKETLSAM-TYNPKVC 745
Cdd:pfam15921  441 ecQGQMERQmaAIQGKNESLEKVSSLTAqLEStkemlrkvvEELTAKKMTLESSERTVSDLTASLQEKERAIeATNAEIT 520
                          490       500       510       520       530       540       550       560
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 254588077   746 ECNQNMDNILtfvSEFQHLNDSIQTLVNNKEKYNFI-LQIAKALTAIpkdEKLNQlNFQNIYQLFNE-----TTSQVNKC 819
Cdd:pfam15921  521 KLRSRVDLKL---QELQHLKNEGDHLRNVQTECEALkLQMAEKDKVI---EILRQ-QIENMTQLVGQhgrtaGAMQVEKA 593
                          570       580       590
                   ....*....|....*....|....*....|....*..
gi 254588077   820 QQ----NMSHLEENMLSVTKTAKefETRLQGIESKVT 852
Cdd:pfam15921  594 QLekeiNDRRLELQEFKILKDKK--DAKIRELEARVS 628
235kDa-fam TIGR01612
reticulocyte binding/rhoptry protein; This model represents a group of paralogous families in ...
324-832 3.75e-08

reticulocyte binding/rhoptry protein; This model represents a group of paralogous families in Plasmodium species alternately annotated as reticulocyte binding protein, 235-kDa family protein and rhoptry protein. Rhoptry protein is localized on the cell surface and is extremely large (although apparently lacking in repeat structure) and is important for the process of invasion of the RBCs by the parasite. These proteins are found in P. falciparum, P. vivax and P. yoelii.


Pssm-ID: 130673 [Multi-domain]  Cd Length: 2757  Bit Score: 58.14  E-value: 3.75e-08
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 254588077   324 KKVDNASLVADDMRNAYLSLEGKVGE--DNSRQFQSFLKALKSKSIeDLLKNIVKEQFKVFQDDMQETTAQIFKTVSSLS 401
Cdd:TIGR01612 1774 ETVSKEPITYDEIKNTRINAQNEFLKiiEIEKKSKSYLDDIEAKEF-DRIINHFKKKLDHVNDKFTKEYSKINEGFDDIS 1852
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 254588077   402 EDLE----STRQAVL-----QVNQSFVSSTAQKDFAFMQENQPTWKDITDLKNSImNIRqemaLTCEKPVKELEAKQAHL 472
Cdd:TIGR01612 1853 KSIEnvknSTDENLLfdilnKTKDAYAGIIGKKYYSYKDEAEKIFINISKLANSI-NIQ----IQNNSGIDLFDNINIAI 1927
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 254588077   473 EGALRQEHSQIVLY---HQSLNETLSKMQEAHTQLLSVLQVSGTENVATEESLN---SNVTKYISVLQETASKQglmllq 546
Cdd:TIGR01612 1928 LSSLDSEKEDTLKFipsPEKEPEIYTKIRDSYDTLLDIFKKSQDLHKKEQDTLNiifENQQLYEKIQASNELKD------ 2001
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 254588077   547 MLSDLHVQESKISN-LTILLEMEKESARGEC-----EEMLSKCRHDfkfQLKDTEENLHVLNQTlseviFPMDIKVDKMS 620
Cdd:TIGR01612 2002 TLSDLKYKKEKILNdVKLLLHKFDELNKLSCdsqnyDTILELSKQD---KIKEKIDNYEKEKEK-----FGIDFDVKAME 2073
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 254588077   621 EQLNDLTYDMEILQPLLEQRSSLQHQVIHKPKEATVTRRELQNLIGAIN-QLNVLTKELTKRHNLLRNEVQSRG------ 693
Cdd:TIGR01612 2074 EKFDNDIKDIEKFENNYKHSEKDNHDFSEEKDNIIQSKKKLKELTEAFNtEIKIIEDKIIEKNDLIDKLIEMRKecllfs 2153
                          410       420       430       440       450       460       470       480
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 254588077   694 -----EAFERRISEH----------ALETEDGLNKTMTVINNAIDFVQDNYVLKETLSAMTYNP------------KVCE 746
Cdd:TIGR01612 2154 yatlvETLKSKVINHsefitsaakfSKDFFEFIEDISDSLNDDIDALQIKYNLNQTKKHMISILadatkdhnnlieKEKE 2233
                          490       500       510       520       530       540       550       560
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 254588077   747 CNQNMDNILT-FVSEFQhlNDSIQTLVNNK-EKYNFILQIAKALTAIPK-DEKLNQLNFQNIYqLFNETTSQVNKCQQNM 823
Cdd:TIGR01612 2234 ATKIINNLTElFTIDFN--NADADILHNNKiQIIYFNSELHKSIESIKKlYKKINAFKLLNIS-HINEKYFDISKEFDNI 2310

                   ....*....
gi 254588077   824 SHLEENMLS 832
Cdd:TIGR01612 2311 IQLQKHKLT 2319
EGF_CA cd00054
Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular ...
1026-1058 1.23e-07

Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements.


Pssm-ID: 238011  Cd Length: 38  Bit Score: 48.79  E-value: 1.23e-07
                          10        20        30
                  ....*....|....*....|....*....|....
gi 254588077 1026 CSSF-PCQNGGTCISGRSNFICACRHPFMGDTCT 1058
Cdd:cd00054     5 CASGnPCQNGGTCVNTVGSYRCSCPPGYTGRNCE 38
EGF pfam00008
EGF-like domain; There is no clear separation between noise and signal. pfam00053 is very ...
1026-1056 8.59e-07

EGF-like domain; There is no clear separation between noise and signal. pfam00053 is very similar, but has 8 instead of 6 conserved cysteines. Includes some cytokine receptors. The EGF domain misses the N-terminal regions of the Ca2+-binding EGF domains (this is the main reason for the discrepancy between Swiss-Prot domain start/end and Pfam). The family is hard to model due to many similar but different sub-types of EGF domains. Pfam certainly misses a number of EGF domains.


Pssm-ID: 394967  Cd Length: 31  Bit Score: 46.22  E-value: 8.59e-07
                           10        20        30
                   ....*....|....*....|....*....|.
gi 254588077  1026 CSSFPCQNGGTCISGRSNFICACRHPFMGDT 1056
Cdd:pfam00008    1 CAPNPCSNGGTCVDTPGGYTCICPEGYTGKR 31
EGF_CA smart00179
Calcium-binding EGF-like domain;
1026-1058 4.64e-05

Calcium-binding EGF-like domain;


Pssm-ID: 214542 [Multi-domain]  Cd Length: 39  Bit Score: 41.46  E-value: 4.64e-05
                            10        20        30
                    ....*....|....*....|....*....|....*
gi 254588077   1026 CSSF-PCQNGGTCISGRSNFICACRHPFM-GDTCT 1058
Cdd:smart00179    5 CASGnPCQNGGTCVNTVGSYRCECPPGYTdGRNCE 39
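
Several of the hits listed above cover the same stretch of the query (for example, three EGF-type models all start at residue 1026). The Python sketch below groups the eight hits of this list by overlapping interval and keeps the lowest E-value in each group. It is a naive illustration using the intervals and E-values copied from the rows above; it is not a description of how CD-Search itself selects representative hits, and `cluster_overlapping` is a hypothetical helper.

```python
# (accession, start, end, e_value), copied from the hit list above
hits = [
    ("smart00110", 1075, 1209, 4.57e-40),
    ("pfam00386", 1083, 1206, 5.19e-31),
    ("pfam07546", 193, 262, 3.05e-12),
    ("pfam15921", 305, 852, 2.79e-08),
    ("TIGR01612", 324, 832, 3.75e-08),
    ("cd00054", 1026, 1058, 1.23e-07),
    ("pfam00008", 1026, 1056, 8.59e-07),
    ("smart00179", 1026, 1058, 4.64e-05),
]


def cluster_overlapping(hits):
    """Group hits whose query intervals overlap, scanning in order of start."""
    clusters = []
    cluster_end = None
    for hit in sorted(hits, key=lambda h: h[1]):
        start, end = hit[1], hit[2]
        if clusters and start <= cluster_end:
            # Overlaps the current group: extend it.
            clusters[-1].append(hit)
            cluster_end = max(cluster_end, end)
        else:
            # No overlap: start a new group.
            clusters.append([hit])
            cluster_end = end
    return clusters


clusters = cluster_overlapping(hits)
best = [min(cluster, key=lambda h: h[3])[0] for cluster in clusters]

for cluster, accession in zip(clusters, best):
    members = ", ".join(h[0] for h in cluster)
    print(f"{cluster[0][1]}-{max(h[2] for h in cluster)}: {members} -> best {accession}")
```

Under these assumptions the hits fall into four regions of the query: the EMI region, the long coiled-coil region, the EGF region, and the C-terminal C1q region.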
 
Name Accession Description Interval E-value
EGF cd00053
Epidermal growth factor domain, found in epidermal growth factor (EGF) and present in a large ...
1026-1058 7.79e-06

Epidermal growth factor domain, found in epidermal growth factor (EGF) and present in a large number of proteins, mostly animal; the list of proteins currently known to contain one or more copies of an EGF-like pattern is large and varied; the functional significance of EGF-like domains in what appear to be unrelated proteins is not yet clear; a common feature is that these repeats are found in the extracellular domain of membrane-bound proteins or in proteins known to be secreted (exception: prostaglandin G/H synthase); the domain includes six cysteine residues which have been shown to be involved in disulfide bonds; the main structure is a two-stranded beta-sheet followed by a loop to a C-terminal short two-stranded sheet; subdomains between the conserved cysteines vary in length; the region between the 5th and 6th cysteine contains two conserved glycines of which at least one is present in most EGF-like domains; a subset of these bind calcium.


Pssm-ID: 238010  Cd Length: 36  Bit Score: 43.62  E-value: 7.79e-06
                          10        20        30
                  ....*....|....*....|....*....|....*
gi 254588077 1026 CSSF-PCQNGGTCISGRSNFICACRHPFMGD-TCT 1058
Cdd:cd00053     2 CAASnPCSNGGTCVNTPGSYRCVCPPGYTGDrSCE 36
sbcc TIGR00618
exonuclease SbcC; All proteins in this family for which functions are known are part of an ...
270-702 2.09e-05

exonuclease SbcC; All proteins in this family for which functions are known are part of an exonuclease complex with sbcD homologs. This complex is involved in the initiation of recombination to regulate the levels of palindromic sequences in DNA. This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University). [DNA metabolism, DNA replication, recombination, and repair]


Pssm-ID: 129705 [Multi-domain]  Cd Length: 1042  Bit Score: 48.81  E-value: 2.09e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 254588077   270 EEQQQLA-----HSNQAESHTAVDQGRAQQQKQDCgdpamiqKLAEQLSQQERKLSLL-QKKVDNASLVADDMRNAYLsl 343
Cdd:TIGR00618  470 EREQQLQtkeqiHLQETRKKAVVLARLLELQEEPC-------PLCGSCIHPNPARQDIdNPGPLTRRMQRGEQTYAQL-- 540
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 254588077   344 eGKVGEDNSRQFQSFLKALKsksiedllknIVKEQFKVFQDDMQETTAQIfktvSSLSEDLESTRQAVLQVnQSFVSSTA 423
Cdd:TIGR00618  541 -ETSEEDVYHQLTSERKQRA----------SLKEQMQEIQQSFSILTQCD----NRSKEDIPNLQNITVRL-QDLTEKLS 604
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 254588077   424 QKDFAFMQENQptwkdITDLKNSIMNIRQEMALTCEKPVKELEAKQAHLEGAL------RQEHS------------QIVL 485
Cdd:TIGR00618  605 EAEDMLACEQH-----ALLRKLQPEQDLQDVRLHLQQCSQELALKLTALHALQltltqeRVREHalsirvlpkellASRQ 679
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 254588077   486 ----YHQSLNETLSKMQEAHTQLLSVLQvSGTENVATEESLNSNVTKYISVLQETASKQGLMLLQMLSDL-HVQESKISN 560
Cdd:TIGR00618  680 lalqKMQSEKEQLTYWKEMLAQCQTLLR-ELETHIEEYDREFNEIENASSSLGSDLAAREDALNQSLKELmHQARTVLKA 758
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 254588077   561 LTilLEMEKESARGECEEM----LSKCRHDFKFQLKDTEENLHVLNQTLSEVIFPMDIKVDKMSEQLNDLTYDMEILQPL 636
Cdd:TIGR00618  759 RT--EAHFNNNEEVTAALQtgaeLSHLAAEIQFFNRLREEDTHLLKTLEAEIGQEIPSDEDILNLQCETLVQEEEQFLSR 836
                          410       420       430       440       450       460
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*.
gi 254588077   637 LEQRSSLQHQVIHKPKEATVTRRELQNLIGAINQLNVLTKELTKRHNLlrnEVQSRGEAFERRISE 702
Cdd:TIGR00618  837 LEEKSATLGEITHQLLKYEECSKQLAQLTQEQAKIIQLSDKLNGINQI---KIQFDGDALIKFLHE 899
EGF_CA smart00179
Calcium-binding EGF-like domain;
1026-1058 4.64e-05

Calcium-binding EGF-like domain;


Pssm-ID: 214542 [Multi-domain]  Cd Length: 39  Bit Score: 41.46  E-value: 4.64e-05
                            10        20        30
                    ....*....|....*....|....*....|....*
gi 254588077   1026 CSSF-PCQNGGTCISGRSNFICACRHPFM-GDTCT 1058
Cdd:smart00179    5 CASGnPCQNGGTCVNTVGSYRCECPPGYTdGRNCE 39
235kDa-fam TIGR01612
reticulocyte binding/rhoptry protein; This model represents a group of paralogous families in ...
438-831 2.44e-04

reticulocyte binding/rhoptry protein; This model represents a group of paralogous families in Plasmodium species alternately annotated as reticulocyte binding protein, 235-kDa family protein and rhoptry protein. Rhoptry protein is localized on the cell surface and is extremely large (although apparently lacking in repeat structure) and is important for the process of invasion of the RBCs by the parasite. These proteins are found in P. falciparum, P. vivax and P. yoelii.


Pssm-ID: 130673 [Multi-domain]  Cd Length: 2757  Bit Score: 45.81  E-value: 2.44e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 254588077   438 KDITDLKNSIMNIR---QEMALTCEKPVKELEAKQAHLEGALRQEHSQIvlYHQSLNETlskmqeaHTQLLSVLQ---VS 511
Cdd:TIGR01612  620 KKAIDLKKIIENNNayiDELAKISPYQVPEHLKNKDKIYSTIKSELSKI--YEDDIDAL-------YNELSSIVKenaID 690
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 254588077   512 GTENVATEESLNSNVTKYISVLQ--ETASKQglmllQMLSDLHVQESKISNltILLEMEKEsARGECEEMLSKCRHDFKF 589
Cdd:TIGR01612  691 NTEDKAKLDDLKSKIDKEYDKIQnmETATVE-----LHLSNIENKKNELLD--IIVEIKKH-IHGEINKDLNKILEDFKN 762
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 254588077   590 QLKDTEENLH-------VLNQTLSEVifpMDIKvDKMSEQLN-DLTYDMEILQPLlEQRSSLQHQVIHKPKEATVTRREL 661
Cdd:TIGR01612  763 KEKELSNKINdyakekdELNKYKSKI---SEIK-NHYNDQINiDNIKDEDAKQNY-DKSKEYIKTISIKEDEIFKIINEM 837
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 254588077   662 QNLIGAInqLNVLTKELTKRHNLlRNEVQSRGEAF-------ERRISEHALET-EDGLNKTMTVINNAIDFVQDNYVLKE 733
Cdd:TIGR01612  838 KFMKDDF--LNKVDKFINFENNC-KEKIDSEHEQFaeltnkiKAEISDDKLNDyEKKFNDSKSLINEINKSIEEEYQNIN 914
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 254588077   734 TLSAMTYNPKVCECN----QNMDNILTFVSEFqhLNDSIQTLVNN-------KEKY-NFILQIAKALTAIPKDEKLNQLN 801
Cdd:TIGR01612  915 TLKKVDEYIKICENTkesiEKFHNKQNILKEI--LNKNIDTIKESnlieksyKDKFdNTLIDKINELDKAFKDASLNDYE 992
                          410       420       430
                   ....*....|....*....|....*....|
gi 254588077   802 FQNiyqlfNETTSQVNKCQQNMSHLEENML 831
Cdd:TIGR01612  993 AKN-----NELIKYFNDLKANLGKNKENML 1017
Mplasa_alph_rch TIGR04523
helix-rich Mycoplasma protein; Members of this family occur strictly within a subset of ...
305-839 2.77e-04

helix-rich Mycoplasma protein; Members of this family occur strictly within a subset of Mycoplasma species. Members average 750 amino acids in length, including signal peptide. Sequences are predicted (Jpred 3) to be almost entirely alpha-helical. These sequences show strong periodicity (consistent with long alpha helical structures) and low complexity rich in D,E,N,Q, and K. Genes encoding these proteins are often found in tandem. The function is unknown.


Pssm-ID: 275316 [Multi-domain]  Cd Length: 745  Bit Score: 45.01  E-value: 2.77e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 254588077   305 IQKLAEQLSQQERKLSLLQKKVDNaslvaddmrnaYLSLEGKVGEdnsrqfqsflkaLKSKsiedllKNIVKEQFKVFQD 384
Cdd:TIGR04523  189 IDKIKNKLLKLELLLSNLKKKIQK-----------NKSLESQISE------------LKKQ------NNQLKDNIEKKQQ 239
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 254588077   385 DMQETTAQIFKTVSSLSEDLESTRQAV--LQVNQSFVSSTAQKdfafmqeNQPTWKDITDLKNSIMNIRQEMALTCEKPV 462
Cdd:TIGR04523  240 EINEKTTEISNTQTQLNQLKDEQNKIKkqLSEKQKELEQNNKK-------IKELEKQLNQLKSEISDLNNQKEQDWNKEL 312
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 254588077   463 KELEAKQahlEGALRQEHSQIVlyhQSlNETLSKMQEAHTQLLSVLQVSGTENVATEESLNSNVTKYISVLQETASKqgl 542
Cdd:TIGR04523  313 KSELKNQ---EKKLEEIQNQIS---QN-NKIISQLNEQISQLKKELTNSESENSEKQRELEEKQNEIEKLKKENQSY--- 382
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 254588077   543 mlLQMLSDLHVQ----ESKISNLTIL----------LEMEKESARGECEEMLSKcRHDFKFQLKDTEENLHVLNQTLSEv 608
Cdd:TIGR04523  383 --KQEIKNLESQindlESKIQNQEKLnqqkdeqikkLQQEKELLEKEIERLKET-IIKNNSEIKDLTNQDSVKELIIKN- 458
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 254588077   609 ifpMDIKVDKMSEQLNDLTYDMEILQPLLEQrsslqhqvihKPKEATVTRRELQNLIGAINQLNVLTKELTKRHNLLRNE 688
Cdd:TIGR04523  459 ---LDNTRESLETQLKVLSRSINKIKQNLEQ----------KQKELKSKEKELKKLNEEKKELEEKVKDLTKKISSLKEK 525
                          410       420       430       440       450       460       470       480
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 254588077   689 VQsrgeAFERRISEHALET---EDGLNKtmtvinnaIDFVQDNYVLKETLSAmtYNPKVCECNQNMDNILTFVSEFQHLN 765
Cdd:TIGR04523  526 IE----KLESEKKEKESKIsdlEDELNK--------DDFELKKENLEKEIDE--KNKEIEELKQTQKSLKKKQEEKQELI 591
                          490       500       510       520       530       540       550       560
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 254588077   766 D----SIQTLVNNKEKYNF-ILQIAKALTAIPKD-EKLNQL--NFQNIYQLFNETTSQVNKCQQNMSHLEENMLSVTKTA 837
Cdd:TIGR04523  592 DqkekEKKDLIKEIEEKEKkISSLEKELEKAKKEnEKLSSIikNIKSKKNKLKQEVKQIKETIKEIRNKWPEIIKKIKES 671

                   ..
gi 254588077   838 KE 839
Cdd:TIGR04523  672 KT 673
SMC_prok_B TIGR02168
chromosome segregation protein SMC, common bacterial type; SMC (structural maintenance of ...
437-690 2.79e-04

chromosome segregation protein SMC, common bacterial type; SMC (structural maintenance of chromosomes) proteins bind DNA and act in organizing and segregating chromosomes for partition. SMC proteins are found in bacteria, archaea, and eukaryotes. This family represents the SMC protein of most bacteria. The smc gene is often associated with scpB (TIGR00281) and scpA genes, where scp stands for segregation and condensation protein. SMC was shown (in Caulobacter crescentus) to be induced early in S phase but present and bound to DNA throughout the cell cycle. [Cellular processes, Cell division, DNA metabolism, Chromosome-associated proteins]


Pssm-ID: 274008 [Multi-domain]  Cd Length: 1179  Bit Score: 45.43  E-value: 2.79e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 254588077   437 WKDITDLKNSIMNIRQEMAlTCEKPVKELEAKQAHLEGALRQEHSQIVLYHQSLnETLSKMQEAHTQLLSVLQVSGTENV 516
Cdd:TIGR02168  683 EEKIEELEEKIAELEKALA-ELRKELEELEEELEQLRKELEELSRQISALRKDL-ARLEAEVEQLEERIAQLSKELTELE 760
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 254588077   517 ATEESLNSNVTKYISVLQETASKQGLM---LLQMLSDLHVQESKISNLTILLEMEKESAR--GECEEMLSKCRHDFKFQL 591
Cdd:TIGR02168  761 AEIEELEERLEEAEEELAEAEAEIEELeaqIEQLKEELKALREALDELRAELTLLNEEAAnlRERLESLERRIAATERRL 840
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 254588077   592 KDTEENLHVLNQTLSEV---IFPMDIKVDKMSEQLNDltydmeilqpLLEQRSSLQHQVIHKPKEATVTRRELQNLIGAI 668
Cdd:TIGR02168  841 EDLEEQIEELSEDIESLaaeIEELEELIEELESELEA----------LLNERASLEEALALLRSELEELSEELRELESKR 910
                          250       260
                   ....*....|....*....|..
gi 254588077   669 NQLNVLTKELTKRHNLLRNEVQ 690
Cdd:TIGR02168  911 SELRRELEELREKLAQLELRLE 932
Mplasa_alph_rch TIGR04523
helix-rich Mycoplasma protein; Members of this family occur strictly within a subset of ...
438-908 8.17e-04

helix-rich Mycoplasma protein; Members of this family occur strictly within a subset of Mycoplasma species. Members average 750 amino acids in length, including signal peptide. Sequences are predicted (Jpred 3) to be almost entirely alpha-helical. These sequences show strong periodicity (consistent with long alpha helical structures) and low complexity rich in D,E,N,Q, and K. Genes encoding these proteins are often found in tandem. The function is unknown.


Pssm-ID: 275316 [Multi-domain]  Cd Length: 745  Bit Score: 43.86  E-value: 8.17e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 254588077   438 KDITDLKNSIMNIRQEMAL---------TCEKPVKELEAKQAHLEGALRQEHSQIVLYHQSLNETLSKMQEAHTQLLSvl 508
Cdd:TIGR04523  180 KEKLNIQKNIDKIKNKLLKlelllsnlkKKIQKNKSLESQISELKKQNNQLKDNIEKKQQEINEKTTEISNTQTQLNQ-- 257
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 254588077   509 qvsgtenvaTEESLNSNVTKyisvLQEtasKQglmllqmlSDLHVQESKISNLTILL---EMEKESARGECEEMLSKcrh 585
Cdd:TIGR04523  258 ---------LKDEQNKIKKQ----LSE---KQ--------KELEQNNKKIKELEKQLnqlKSEISDLNNQKEQDWNK--- 310
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 254588077   586 DFKFQLKDTEENLHVLNQTLSE---VIFPMDIKVDKMSEQLNDLTYDMEILQPLLEQRsslQHQVIHKPKEATVTRRELQ 662
Cdd:TIGR04523  311 ELKSELKNQEKKLEEIQNQISQnnkIISQLNEQISQLKKELTNSESENSEKQRELEEK---QNEIEKLKKENQSYKQEIK 387
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 254588077   663 NLIGAINQLN------------------VLTKE---LTKRHNLLRNEVQSRGEAFERRISE-HALETE-DGLNKTMTVIN 719
Cdd:TIGR04523  388 NLESQINDLEskiqnqeklnqqkdeqikKLQQEkelLEKEIERLKETIIKNNSEIKDLTNQdSVKELIiKNLDNTRESLE 467
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 254588077   720 NAIDFVQDNY-VLKETLSAMTYN--PKVCECNQNMDNILTFVSEFQHLNDSIQTLVNNKEKYNF-ILQIAKALTAipKDE 795
Cdd:TIGR04523  468 TQLKVLSRSInKIKQNLEQKQKElkSKEKELKKLNEEKKELEEKVKDLTKKISSLKEKIEKLESeKKEKESKISD--LED 545
                          410       420       430       440       450       460       470       480
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 254588077   796 KLNQLNFQNIYQLFNEttsQVNKCQQNMSHLEENMLSVTKTAKEFETRLQGIESKVTKtlIPYYISFKKGGILSNERDVD 875
Cdd:TIGR04523  546 ELNKDDFELKKENLEK---EIDEKNKEIEELKQTQKSLKKKQEEKQELIDQKEKEKKD--LIKEIEEKEKKISSLEKELE 620
                          490       500       510
                   ....*....|....*....|....*....|....*.
gi 254588077   876 L---QLKVLNTRFKALEAKSIHLSVSFSLLNKTVRE 908
Cdd:TIGR04523  621 KakkENEKLSSIIKNIKSKKNKLKQEVKQIKETIKE 656
SCP-1 pfam05483
Synaptonemal complex protein 1 (SCP-1); Synaptonemal complex protein 1 (SCP-1) is the major ...
265-598 1.03e-03

Synaptonemal complex protein 1 (SCP-1); Synaptonemal complex protein 1 (SCP-1) is the major component of the transverse filaments of the synaptonemal complex. Synaptonemal complexes are structures that are formed between homologous chromosomes during meiotic prophase.


Pssm-ID: 114219 [Multi-domain]  Cd Length: 787  Bit Score: 43.56  E-value: 1.03e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 254588077   265 CQLKVEEQQQLAHSNQAE-SHTAVdqgRAQQQKQDCGDPAMIQKLAEQLSQQERKLSL----LQKKVDNASLVADDMRNA 339
Cdd:pfam05483  327 CQLTEEKEAQMEELNKAKaAHSFV---VTEFEATTCSLEELLRTEQQRLEKNEDQLKIitmeLQKKSSELEEMTKFKNNK 403
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 254588077   340 YLSLE---GKVGEDNS-----RQFQSFLKALKSKSIE--DLLKNIVKEQFKV-FQDDMQETTAQIF-KTVSSLSEDLEST 407
Cdd:pfam05483  404 EVELEelkKILAEDEKlldekKQFEKIAEELKGKEQEliFLLQAREKEIHDLeIQLTAIKTSEEHYlKEVEDLKTELEKE 483
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 254588077   408 RQAVLQVnqsfvssTAQKDFAFMqENQPTWKDITDLKNSIMNiRQEMALTCE-------KPVKELEAKQAHLEGALRQEH 480
Cdd:pfam05483  484 KLKNIEL-------TAHCDKLLL-ENKELTQEASDMTLELKK-HQEDIINCKkqeermlKQIENLEEKEMNLRDELESVR 554
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 254588077   481 SQIVLYHQSLNETLSKMQEAHTQLLSVLQVSGTENVATEESLNS------NVTKYISVLQETASKQGLMLLQMLSDLHVQ 554
Cdd:pfam05483  555 EEFIQKGDEVKCKLDKSEENARSIEYEVLKKEKQMKILENKCNNlkkqieNKNKNIEELHQENKALKKKGSAENKQLNAY 634
                          330       340       350       360
                   ....*....|....*....|....*....|....*....|....
gi 254588077   555 ESKISNltilLEMEKESARGECEEMLSKCRHDFKFQlKDTEENL 598
Cdd:pfam05483  635 EIKVNK----LELELASAKQKFEEIIDNYQKEIEDK-KISEEKL 673
EGF_2 pfam07974
EGF-like domain; This family contains EGF domains found in a variety of extracellular proteins.
1031-1057 3.41e-03

EGF-like domain; This family contains EGF domains found in a variety of extracellular proteins.


Pssm-ID: 400365  Cd Length: 26  Bit Score: 36.17  E-value: 3.41e-03
                           10        20
                   ....*....|....*....|....*..
gi 254588077  1031 CQNGGTCIsgRSNFICACRHPFMGDTC 1057
Cdd:pfam07974    2 CSGRGTCV--NQCGKCVCDSGYQGATC 26
hEGF pfam12661
Human growth factor-like EGF; hEGF, or human growth factor-like EGF, domains have six ...
1031-1048 4.04e-03

Human growth factor-like EGF; hEGF, or human growth factor-like EGF, domains have six conserved residues disulfide-bonded into the characteristic 'ababcc' pattern. They are involved in growth and proliferation of cells, in proteins of the Notch/Delta pathway, neuregulin and selectins. hEGFs are also found in mosaic proteins with four-disulfide laminin EGFs such as aggrecan and perlecan. The core fold of the EGF domain consists of two small beta-hairpins packed against each other. Two major structural variants have been identified based on the structural context of the C-terminal Cys residue of disulfide 'c' in the C-terminal hairpin: hEGFs and cEGFs. In hEGFs the C-terminal thiol resides in the beta-turn, resulting in shorter loop-lengths between the Cys residues of disulfide 'c', typically C[8-9]XC. These shorter loop-lengths are also typical of the four-disulfide EGF domains, laminin and integrin. Tandem hEGF domains have six linking residues between terminal cysteines of adjacent domains. hEGF domains may or may not bind calcium in the linker region. hEGF domains with the consensus motif CXD4X[F,Y]XCXC are hydroxylated exclusively in the Asp residue.


Pssm-ID: 463660  Cd Length: 22  Bit Score: 35.77  E-value: 4.04e-03
                           10
                   ....*....|....*...
gi 254588077  1031 CQNGGTCISGRSNFICAC 1048
Cdd:pfam12661    1 CQNGGTCVDGVNGYKCQC 18
EGF smart00181
Epidermal growth factor-like domain;
1026-1057 4.97e-03

Epidermal growth factor-like domain;


Pssm-ID: 214544  Cd Length: 35  Bit Score: 35.96  E-value: 4.97e-03
                            10        20        30
                    ....*....|....*....|....*....|....
gi 254588077   1026 CSSF-PCQNGgTCISGRSNFICACRHPFMGD-TC 1057
Cdd:smart00181    2 CASGgPCSNG-TCINTPGSYTCSCPPGYTGDkRC 34
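
Each hit above pairs a query row (gi 254588077) with a row from the domain model (Cdd:...). As a rough way to compare such pairs, the following Python sketch counts identical residues over the aligned columns. It is only an illustration: the function name `alignment_identity` is hypothetical and not part of any NCBI tool, the two rows are copied by hand from the pfam00008 hit (query residues 1026-1056), and the treatment of lowercase letters as unaligned residues is an assumption about the display convention.

```python
def alignment_identity(query_row, subject_row):
    """Return (identical_columns, aligned_columns) for two alignment rows.

    Both rows must have the same length. Columns containing a gap ('-') or a
    lowercase residue (assumed here to mark unaligned positions) are skipped.
    """
    if len(query_row) != len(subject_row):
        raise ValueError("alignment rows must have equal length")
    identical = 0
    aligned = 0
    for q, s in zip(query_row, subject_row):
        if q == "-" or s == "-" or q.islower() or s.islower():
            continue  # gap or unaligned column: not counted
        aligned += 1
        if q == s:
            identical += 1
    return identical, aligned


# Rows copied from the pfam00008 hit in this report.
query = "CSSFPCQNGGTCISGRSNFICACRHPFMGDT"
subject = "CAPNPCSNGGTCVDTPGGYTCICPEGYTGKR"

identical, aligned = alignment_identity(query, subject)
print(f"{identical}/{aligned} identical columns")
```

For the pfam00008 rows this counts 11 identical residues over 31 aligned columns. Identity is only a crude summary; the bit score and E-value reported with each hit come from the position-specific scoring model and are not reproduced by this count.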
 
Blast search parameters
Data Source: Precalculated data, version = cdd.v.3.21
Preset Options: Database: CDSEARCH/cdd; Low complexity filter: no; Composition Based Adjustment: yes; E-value threshold: 0.01
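
Every hit in this report carries a summary line of the form "Pssm-ID: ... Cd Length: ... Bit Score: ... E-value: ...", and the search used an E-value threshold of 0.01. The sketch below shows one possible way to parse those summary lines from saved text and re-apply the threshold in Python. The regular expression and the names `SUMMARY_RE`, `parse_summary`, and `E_VALUE_THRESHOLD` are illustrative assumptions based only on the lines shown here, not an official NCBI format specification or API.

```python
import re

# Pattern inferred from the summary lines in this report (an assumption,
# not an official format definition). "[Multi-domain]" is optional.
SUMMARY_RE = re.compile(
    r"Pssm-ID:\s*(?P<pssm_id>\d+)"
    r"(?:\s*\[(?P<flag>[^\]]+)\])?"
    r"\s+Cd Length:\s*(?P<cd_length>\d+)"
    r"\s+Bit Score:\s*(?P<bit_score>[\d.]+)"
    r"\s+E-value:\s*(?P<e_value>[\d.eE+-]+)"
)

E_VALUE_THRESHOLD = 0.01  # value listed under "Preset Options"


def parse_summary(line):
    """Parse one summary line into a dict, or return None if it does not match."""
    m = SUMMARY_RE.search(line)
    if m is None:
        return None
    return {
        "pssm_id": int(m["pssm_id"]),
        "multi_domain": m["flag"] == "Multi-domain",
        "cd_length": int(m["cd_length"]),
        "bit_score": float(m["bit_score"]),
        "e_value": float(m["e_value"]),
    }


# Two summary lines copied from this report.
lines = [
    "Pssm-ID: 394967  Cd Length: 31  Bit Score: 46.22  E-value: 8.59e-07",
    "Pssm-ID: 129705 [Multi-domain]  Cd Length: 1042  Bit Score: 48.81  E-value: 2.09e-05",
]

hits = [h for h in map(parse_summary, lines) if h is not None]
kept = [h for h in hits if h["e_value"] <= E_VALUE_THRESHOLD]
for h in kept:
    print(h["pssm_id"], h["bit_score"], h["e_value"], h["multi_domain"])
```

Because the report was produced with the same threshold, every listed hit passes the filter; lowering `E_VALUE_THRESHOLD` in the sketch is a simple way to keep only the stronger hits.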
