Conserved domains on  [gi|17539308|ref|NP_503080|]

EGF-like domain-containing protein [Caenorhabditis elegans]


List of domain hits

Name Accession Description Interval E-value
DUF5585 super family cl39316
Family of unknown function (DUF5585); This is a family of unknown function found in Chordata.
502-688 1.26e-05

The actual alignment was detected with superfamily member pfam17823:

Pssm-ID: 465521 [Multi-domain]  Cd Length: 506  Bit Score: 49.57  E-value: 1.26e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 17539308    502 ASTTTGEASSTTtgEASSTTTGEASSTTTGEASSTTTGEASSTTTGeaSSTTTGEASSTTTGEATSVAATTSSASSTVVT 581
Cdd:pfam17823  128 QSLPAAIAALPS--EAFSAPRAAACRANASAAPRAAIAAASAPHAA--SPAPRTAASSTTAASSTTAASSAPTTAASSAP 203
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 17539308    582 STEMPSS-TSQAVCTTQDNSATFLFAYAADFDPTTyGLVSRTIGSYVTANLPT----GQTLANVLTDLTTEMDIKYT-NV 655
Cdd:pfam17823  204 ATLTPARgISTAATATGHPAAGTALAAVGNSSPAA-GTVTAAVGTVTPAALATlaaaAGTVASAAGTINMGDPHARRlSP 282
                          170       180       190
                   ....*....|....*....|....*....|...
gi 17539308    656 ANDFTTNCVTDQPDATLaRSEVQASTALQTIQQ 688
Cdd:pfam17823  283 AKHMPSDTMARNPAAPM-GAQAQGPIIQVSTDQ 314
EGF_2 pfam07974
EGF-like domain; This family contains EGF domains found in a variety of extracellular proteins.
1505-1530 4.01e-04

Pssm-ID: 400365  Cd Length: 26  Bit Score: 38.87  E-value: 4.01e-04
                           10        20
                   ....*....|....*....|....*.
gi 17539308   1505 VCQNGGTPIQSDGSCLCPSGFQGSDC 1530
Cdd:pfam07974    1 ICSGRGTCVNQCGKCVCDSGYQGATC 26
 
 
Name Accession Description Interval E-value
DUF5585 pfam17823
Family of unknown function (DUF5585); This is a family of unknown function found in Chordata.
502-688 1.26e-05

Pssm-ID: 465521 [Multi-domain]  Cd Length: 506  Bit Score: 49.57  E-value: 1.26e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 17539308    502 ASTTTGEASSTTtgEASSTTTGEASSTTTGEASSTTTGEASSTTTGeaSSTTTGEASSTTTGEATSVAATTSSASSTVVT 581
Cdd:pfam17823  128 QSLPAAIAALPS--EAFSAPRAAACRANASAAPRAAIAAASAPHAA--SPAPRTAASSTTAASSTTAASSAPTTAASSAP 203
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 17539308    582 STEMPSS-TSQAVCTTQDNSATFLFAYAADFDPTTyGLVSRTIGSYVTANLPT----GQTLANVLTDLTTEMDIKYT-NV 655
Cdd:pfam17823  204 ATLTPARgISTAATATGHPAAGTALAAVGNSSPAA-GTVTAAVGTVTPAALATlaaaAGTVASAAGTINMGDPHARRlSP 282
                          170       180       190
                   ....*....|....*....|....*....|...
gi 17539308    656 ANDFTTNCVTDQPDATLaRSEVQASTALQTIQQ 688
Cdd:pfam17823  283 AKHMPSDTMARNPAAPM-GAQAQGPIIQVSTDQ 314
Epiglycanin_TR pfam05647
Tandem-repeating region of mucin, epiglycanin-like; The unusual mucin, epiglycanin, is membrane-bound at the C-terminus but has a long region of this tandem-repeat at the N-terminus. It was the first mucin identified to be associated with the malignant behaviour of carcinoma cells. Mouse Muc21/epiglycanin is thought to be a highly glycosylated molecule, which makes it likely that its function is dependent on its glycoforms. Cells expressing Muc21 are significantly less adherent to each other and to extracellular matrix components than control cells, and this loss of adhesion is mediated by the TR portion of Muc21. This family also now contains the repeat that was the C. elegans protein of unknown function (DUF801).
503-563 3.04e-05

Pssm-ID: 461702 [Multi-domain]  Cd Length: 68  Bit Score: 43.46  E-value: 3.04e-05
                           10        20        30        40        50        60
                   ....*....|....*....|....*....|....*....|....*....|....*....|.
gi 17539308    503 STTTGEASSTTTGEASSTTTGEASsTTTGEASSTTTGEASSTTTGEASSTTTGEASSTTTG 563
Cdd:pfam05647    5 SSTTSSGASTTSNTGSSTTSGGTS-TTSNTGSSTTSSGTSTATNTGSSETSSGSSTTSSTG 64
ser_rich_anae_1 NF033849
serine-rich protein; This serine-rich protein belongs to a family with large size (over 1000 amino acids), with a highly serine-rich central region that averages over 300 aa in length. Species encoding members of this family of proteins tend to be anaerobic bacteria, including Gram-positive bacteria of the human gut microbiome and Chloroflexi from marine sediments.
503-563 1.67e-04

Pssm-ID: 468206 [Multi-domain]  Cd Length: 1122  Bit Score: 46.54  E-value: 1.67e-04
                          10        20        30        40        50        60
                  ....*....|....*....|....*....|....*....|....*....|....*....|.
gi 17539308   503 STTTGEASSTTTGEASSTTTGEASSTTTGEASSTTTGEASSTTTGEASSTTTGEASSTTTG 563
Cdd:NF033849  294 SESTGQSSSVGTSESQSHGTTEGTSTTDSSSHSQSSSYNVSSGTGVSSSHSDGTSQSTSIS 354
Epiglycanin_TR pfam05647
Tandem-repeating region of mucin, epiglycanin-like; The unusual mucin, epiglycanin, is membrane-bound at the C-terminus but has a long region of this tandem-repeat at the N-terminus. It was the first mucin identified to be associated with the malignant behaviour of carcinoma cells. Mouse Muc21/epiglycanin is thought to be a highly glycosylated molecule, which makes it likely that its function is dependent on its glycoforms. Cells expressing Muc21 are significantly less adherent to each other and to extracellular matrix components than control cells, and this loss of adhesion is mediated by the TR portion of Muc21. This family also now contains the repeat that was the C. elegans protein of unknown function (DUF801).
502-561 2.61e-04

Pssm-ID: 461702 [Multi-domain]  Cd Length: 68  Bit Score: 40.77  E-value: 2.61e-04
                           10        20        30        40        50        60
                   ....*....|....*....|....*....|....*....|....*....|....*....|
gi 17539308    502 ASTTTGEASSTTTGEASsTTTGEASSTTTGeASSTTTGEASSTTTGeASSTTTGEASSTT 561
Cdd:pfam05647   12 ASTTSNTGSSTTSGGTS-TTSNTGSSTTSS-GTSTATNTGSSETSS-GSSTTSSTGTSTT 68
ser_rich_anae_1 NF033849
serine-rich protein; This serine-rich protein belongs to a family with large size (over 1000 amino acids), with a highly serine-rich central region that averages over 300 aa in length. Species encoding members of this family of proteins tend to be anaerobic bacteria, including Gram-positive bacteria of the human gut microbiome and Chloroflexi from marine sediments.
503-563 3.39e-04

Pssm-ID: 468206 [Multi-domain]  Cd Length: 1122  Bit Score: 45.38  E-value: 3.39e-04
                          10        20        30        40        50        60
                  ....*....|....*....|....*....|....*....|....*....|....*....|.
gi 17539308   503 STTTGEASSTTTGEASSTTTGEASSTTTGEASSTTTGEASSTTTGEASSTTTGEASSTTTG 563
Cdd:NF033849  278 GHGSTRGWSHTQSTSESESTGQSSSVGTSESQSHGTTEGTSTTDSSSHSQSSSYNVSSGTG 338
Epiglycanin_TR pfam05647
Tandem-repeating region of mucin, epiglycanin-like; The unusual mucin, epiglycanin, is membrane-bound at the C-terminus but has a long region of this tandem-repeat at the N-terminus. It was the first mucin identified to be associated with the malignant behaviour of carcinoma cells. Mouse Muc21/epiglycanin is thought to be a highly glycosylated molecule, which makes it likely that its function is dependent on its glycoforms. Cells expressing Muc21 are significantly less adherent to each other and to extracellular matrix components than control cells, and this loss of adhesion is mediated by the TR portion of Muc21. This family also now contains the repeat that was the C. elegans protein of unknown function (DUF801).
507-563 3.72e-04

Pssm-ID: 461702 [Multi-domain]  Cd Length: 68  Bit Score: 40.38  E-value: 3.72e-04
                           10        20        30        40        50
                   ....*....|....*....|....*....|....*....|....*....|....*..
gi 17539308    507 GEASSTTTGEASSTTTGEASSTTTGEASsTTTGEASSTTTGeASSTTTGEASSTTTG 563
Cdd:pfam05647    1 SSTESSTTSSGASTTSNTGSSTTSGGTS-TTSNTGSSTTSS-GTSTATNTGSSETSS 55
EGF_2 pfam07974
EGF-like domain; This family contains EGF domains found in a variety of extracellular proteins.
1505-1530 4.01e-04

Pssm-ID: 400365  Cd Length: 26  Bit Score: 38.87  E-value: 4.01e-04
                           10        20
                   ....*....|....*....|....*.
gi 17539308   1505 VCQNGGTPIQSDGSCLCPSGFQGSDC 1530
Cdd:pfam07974    1 ICSGRGTCVNQCGKCVCDSGYQGATC 26
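
The paired rows above can be compared column by column to get a rough percent identity for a hit. The following is a minimal sketch, not part of the CD-Search output: the function name `alignment_identity` is hypothetical, gap columns (`-`) are skipped, and residues are compared case-insensitively as a simplification. The two strings are the query and pfam07974 rows copied from the EGF_2 alignment.

```python
def alignment_identity(query_row: str, subject_row: str) -> tuple[int, int]:
    """Return (identical columns, aligned columns) for two alignment rows.

    Hypothetical helper: columns where either row has a gap ('-') are
    skipped, and letters are compared case-insensitively (a simplification).
    """
    if len(query_row) != len(subject_row):
        raise ValueError("alignment rows must have equal length")
    identical = 0
    aligned = 0
    for q, s in zip(query_row.upper(), subject_row.upper()):
        if q == "-" or s == "-":
            continue  # gap column: not counted as aligned
        aligned += 1
        if q == s:
            identical += 1
    return identical, aligned


# Rows copied from the EGF_2 (pfam07974) alignment, query residues 1505-1530.
query = "VCQNGGTPIQSDGSCLCPSGFQGSDC"
subject = "ICSGRGTCVNQCGKCVCDSGYQGATC"

identical, aligned = alignment_identity(query, subject)
print(f"{identical}/{aligned} identical ({100 * identical / aligned:.1f}%)")
```

Identity computed this way is only a quick summary of the displayed rows; the bit score and E-value reported with each hit remain the measures the search itself uses.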
MSCRAMM_ClfA NF033609
MSCRAMM family adhesin clumping factor ClfA; Clumping factor A is an MSCRAMM (Microbial Surface Components Recognizing Adhesive Matrix Molecules). It is heavily studied in Staphylococcus aureus both for its biological role in adhesion and for its potential for vaccination. Features of the sequence, but also of other MSCRAMM adhesins, include a long run of Ser-Asp dipeptide repeats and a C-terminal cell wall anchoring LPXTG motif.
500-649 4.27e-04

Pssm-ID: 468110 [Multi-domain]  Cd Length: 934  Bit Score: 45.29  E-value: 4.27e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 17539308   500 PLASTTTGEASSTTTGEaSSTTTGEASSTTTGEASSTTTGEASSTTTGEASSTTTGEASSTTTGeatsvaattssasSTV 579
Cdd:NF033609   93 PAQQETTQSASTNATTE-ETPVTGEATTTATNQANTPATTQSSNTNAEELVNQTSNETTSNDTN-------------TVS 158
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 17539308   580 VTSTEMPSSTSQAVCTTQDNS--ATFLFAYAA--DFDPTTYGLVSRTIG-----------SYVTANLP-TGQTLANVLTD 643
Cdd:NF033609  159 SVNSPQNSTNAENVSTTQDTSteATPSNNESApqSTDASNKDVVNQAVNtsaprmrafslAAVAADAPaAGTDITNQLTN 238

                  ....*.
gi 17539308   644 LTTEMD 649
Cdd:NF033609  239 VTVGID 244
ser_rich_anae_1 NF033849
serine-rich protein; This serine-rich protein belongs to a family with large size (over 1000 amino acids), with a highly serine-rich central region that averages over 300 aa in length. Species encoding members of this family of proteins tend to be anaerobic bacteria, including Gram-positive bacteria of the human gut microbiome and Chloroflexi from marine sediments.
502-563 1.44e-03

Pssm-ID: 468206 [Multi-domain]  Cd Length: 1122  Bit Score: 43.46  E-value: 1.44e-03
                          10        20        30        40        50        60        70
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|..
gi 17539308   502 ASTTTGEASSTTTGEASSTTTGEASSTTTGeaSSTTTGEASSTTTGEASS----------TTTGEASSTTTG 563
Cdd:NF033849  255 QSHSVGTSESHSVGTSQSQSHTTGHGSTRG--WSHTQSTSESESTGQSSSvgtsesqshgTTEGTSTTDSSS 324
ser_rich_anae_1 NF033849
serine-rich protein; This serine-rich protein belongs to a family with large size (over 1000 amino acids), with a highly serine-rich central region that averages over 300 aa in length. Species encoding members of this family of proteins tend to be anaerobic bacteria, including Gram-positive bacteria of the human gut microbiome and Chloroflexi from marine sediments.
503-564 2.33e-03

Pssm-ID: 468206 [Multi-domain]  Cd Length: 1122  Bit Score: 42.69  E-value: 2.33e-03
                          10        20        30        40        50        60
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*.
gi 17539308   503 STTTGEASSTTTGEAS----STTTGEASSTTTGEASSTTTGEASSTTTGEASSTTTGEASSTTTGE 564
Cdd:NF033849  266 SVGTSQSQSHTTGHGStrgwSHTQSTSESESTGQSSSVGTSESQSHGTTEGTSTTDSSSHSQSSSY 331
PHA03255 PHA03255
BDLF3; Provisional
503-633 2.35e-03

Pssm-ID: 165513 [Multi-domain]  Cd Length: 234  Bit Score: 41.43  E-value: 2.35e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 17539308   503 STTTGEASSTTTGEASSTTTGEASSTTTGEASSTTTGEASSTTTGEASSTTTGEASSTTTGEATSVAATTSSASSTVVts 582
Cdd:PHA03255   44 TTPSPSASGPSTNQSTTLTTTSAPITTTAILSTNTTTVTSTGTTVTPVPTTSNASTINVTTKVTAQNITATEAGTGTS-- 121
                          90       100       110       120       130
                  ....*....|....*....|....*....|....*....|....*....|...
gi 17539308   583 temPSSTSQAvcTTQDNSATFLFAYAADFDPTTYGLVSRTIGSY--VTANLPT 633
Cdd:PHA03255  122 ---TGVTSNV--TTRSSSTTSATTRITNATTLAPTLSSKGTSNAtkTTAELPT 169
FSA_C pfam10479
Fragile site-associated protein C-terminus; This is the conserved C-terminal half of the protein KIAA1109, which is the fragile site-associated protein FSA. Genome-wide association studies showed this protein to be linked to susceptibility to coeliac disease. The protein may also be associated with polycystic kidney disease.
499-555 3.12e-03

Pssm-ID: 463105  Cd Length: 701  Bit Score: 42.09  E-value: 3.12e-03
                           10        20        30        40        50
                   ....*....|....*....|....*....|....*....|....*....|....*..
gi 17539308    499 EPLASTTTGEASSTTTGEASSTTTGEASSTTTGEASSTTTGEASSTTTGEASSTTTG 555
Cdd:pfam10479  553 EPMQGSYTNIANSTTANTATANTTTTTTTTTAATASSTNSTPTTTTTTTSTNDSKDG 609
SP4_N cd22536
N-terminal domain of transcription factor Specificity Protein (SP) 4; Specificity Proteins (SPs) are transcription factors that are involved in many cellular processes, including cell differentiation, cell growth, apoptosis, immune responses, response to DNA damage, and chromatin remodeling. Human SP4 is a risk gene of multiple psychiatric disorders including schizophrenia, bipolar disorder, and major depression. SP4 belongs to a family of proteins, called the SP/Kruppel or Krueppel-like Factor (KLF) family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. SP factors preferentially bind GC boxes, while KLFs bind CACCC boxes. Another characteristic hallmark of SP factors is the presence of the Buttonhead (BTD) box CXCPXC, just N-terminal to the zinc fingers. The function of the BTD box is unknown, but it is thought to play an important physiological role. Another feature of most SP factors is the presence of a conserved amino acid stretch, the so-called SP box, located close to the N-terminus. SP factors may be separated into three groups based on their domain architecture and the similarity of their N-terminal transactivation domains: SP1-4, SP5, and SP6-9. The transactivation domains between the three groups are not homologous to one another. SP1-4 have similar N-terminal transactivation domains characterized by glutamine-rich regions, which, in most cases, have adjacent serine/threonine-rich regions. This model represents the N-terminal domain of SP4.
500-564 4.55e-03

Pssm-ID: 411773 [Multi-domain]  Cd Length: 623  Bit Score: 41.44  E-value: 4.55e-03
                         10        20        30        40        50        60
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*...
gi 17539308  500 PLASTTTGEASSTTTGEASSTTTGEASSTTTGEASSTT---TGEASSTTTGEASSTTTGEASSTTTGE 564
Cdd:cd22536  275 QLVSTPITTASVSTMPESPSSSTTCTTTASTSLTSSDTlvsSAETGQYASTAASSERTEEEPQTSAAE 342
PRK10905 PRK10905
cell division protein DamX; Validated
499-561 4.85e-03

Pssm-ID: 236792 [Multi-domain]  Cd Length: 328  Bit Score: 41.08  E-value: 4.85e-03
                          10        20        30        40        50        60
                  ....*....|....*....|....*....|....*....|....*....|....*....|...
gi 17539308   499 EPLASTTTGEASSTTTGEASSTTTGEASSTTTGEASSTTTGEASSTTTGEASSTTTGEASSTT 561
Cdd:PRK10905  187 EPAAPVASTKAPAATSTPAPKETATTAPVQTASPAQTTATPAAGGKTAGNVGSLKSAPSSHYT 249
SSP160 pfam06933
Special lobe-specific silk protein SSP160; This family consists of several special lobe-specific silk protein SSP160 sequences which appear to be specific to Chironomus (Midge) species.
509-563 5.58e-03

Pssm-ID: 115579 [Multi-domain]  Cd Length: 758  Bit Score: 41.30  E-value: 5.58e-03
                           10        20        30        40        50
                   ....*....|....*....|....*....|....*....|....*....|....*
gi 17539308    509 ASSTTTGEASSTTTGEASSTTTGEASSTTTGEaSSTTTGEASSTTTGEASSTTTG 563
Cdd:pfam06933  101 GSGSASGNSSSSANSTSNSNSTTSNNSTTSSN-STTTTSNSTSSSNSTSSGLTSG 154
 
Blast search parameters
Data Source: Precalculated data, version = cdd.v.3.21
Preset Options: Database: CDSEARCH/cdd; Low complexity filter: no; Composition Based Adjustment: yes; E-value threshold: 0.01
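
Each hit in the list carries a summary line of the form `Pssm-ID: … Cd Length: … Bit Score: … E-value: …`, and the preset E-value threshold here is 0.01. As a minimal sketch of how such lines could be parsed and re-filtered at a different cutoff: the names `SUMMARY_RE`, `parse_summary`, and `filter_hits` are hypothetical, the regular expression is inferred only from the lines shown on this page, and treating the threshold as "E-value less than or equal to the cutoff" is an assumption.

```python
import re

# Pattern inferred from the summary lines on this page (an assumption,
# not an official CD-Search format description).
SUMMARY_RE = re.compile(
    r"Pssm-ID:\s*(?P<pssm_id>\d+)"
    r"(?:\s*\[(?P<flag>[^\]]+)\])?"
    r"\s*Cd Length:\s*(?P<cd_length>\d+)"
    r"\s*Bit Score:\s*(?P<bit_score>\d+(?:\.\d+)?)"
    r"\s*E-value:\s*(?P<e_value>[\d.eE+-]+)"
)


def parse_summary(line):
    """Parse one summary line into a dict, or return None if it does not match."""
    m = SUMMARY_RE.search(line)
    if m is None:
        return None
    return {
        "pssm_id": int(m["pssm_id"]),
        "multi_domain": m["flag"] == "Multi-domain",
        "cd_length": int(m["cd_length"]),
        "bit_score": float(m["bit_score"]),
        "e_value": float(m["e_value"]),
    }


def filter_hits(lines, threshold=0.01):
    """Keep parsed hits whose E-value is at or below the threshold."""
    parsed = (parse_summary(line) for line in lines)
    return [hit for hit in parsed if hit is not None and hit["e_value"] <= threshold]


# Two summary lines copied from the hit list above.
lines = [
    "Pssm-ID: 465521 [Multi-domain]  Cd Length: 506  Bit Score: 49.57  E-value: 1.26e-05",
    "Pssm-ID: 400365  Cd Length: 26  Bit Score: 38.87  E-value: 4.01e-04",
]

hits = filter_hits(lines)                     # page default threshold of 0.01
strict = filter_hits(lines, threshold=1e-4)   # tighter, user-chosen cutoff
print([hit["pssm_id"] for hit in hits])
print([hit["pssm_id"] for hit in strict])
```

A tighter cutoff is one simple way to separate the DUF5585 hit from the many weaker matches to the low-complexity Ser/Thr-rich repeat region at residues 499-564.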
