Conserved domains on  [gi|1926162203|ref|XP_036870571|]

protein SON isoform X3 [Manis javanica]

List of domain hits

Name | Accession | Description | Interval | E-value
rne super family | cl35953 | ribonuclease E; Reviewed | 1345-1641 | 8.17e-08

ribonuclease E; Reviewed


The actual alignment was detected with superfamily member PRK10811:

Pssm-ID: 236766 [Multi-domain]  Cd Length: 1068  Bit Score: 58.13  E-value: 8.17e-08
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1926162203 1345 EVPALPPEESVSlpETVSQNEISEP-----------------SALLANYSVSASEPSVLTSEAAVTAPEPPLEPESSVMS 1407
Cdd:PRK10811   693 EAKALNVEEQSV--QETEQEERVQQvqprrkqrqlnqkvrieQSVAEEAVAPVVEETVAAEPVVQEVPAPRTELVKVPLP 770
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1926162203 1408 TPAESAVATEEHEIVPERpvtyiSENSM----LAEPSML----------------TSEPTIMSETAetfdsmrASGHTAS 1467
Cdd:PRK10811   771 VVAQTAPEQDEENNAENR-----DNNGMprrsRRSPRHLrvsgqrrrryrderypTQSPMPLTVAC-------ASPEMAS 838
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1926162203 1468 -EVSIS--LMEPAVTIPEPSQQSTLELPAMAVSELPVMAVPEPLAVA---------VPAPAVMAVPELPAVVVAEHPAVA 1535
Cdd:PRK10811   839 gKVWIRypVVRPQDVQVEEQREAEEVQVQPVVAEVPVAAAVEPVVSApvveavaevVEEPVVVAEPQPEEVVVVETTHPE 918
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1926162203 1536 VPEYPAVAVPeypavAVLDPPAESLLEPMAlAEPEHVTIPVPHVSALEPTVPGLEPTVSvlQPNVIVSEPSVSVQESTVT 1615
Cdd:PRK10811   919 VIAAPVTEQP-----QVITESDVAVAQEVA-EHAEPVVEPQDETADIEEAAETAEVVVA--EPEVVAQPAAPVVAEVAAE 990
                          330       340       350
                   ....*....|....*....|....*....|
gi 1926162203 1616 VLESAVTISQQTQVISTETVLE----STPM 1641
Cdd:PRK10811   991 VETVTAVEPEVAPAQVPEATVEhnhaTAPM 1020
ser_rich_anae_1 super family | cl41472 | serine-rich protein; This serine-rich protein belongs to a family with large size (over 1000 ... | 742-1038 | 8.78e-07

serine-rich protein; This serine-rich protein belongs to a family with large size (over 1000 amino acids), with a highly serine-rich central region that averages over 300 aa in length. Species encoding members of this family of proteins tend to be anaerobic bacteria, including Gram-positive bacteria of the human gut microbiome and Chloroflexi from marine sediments.


The actual alignment was detected with superfamily member NF033849:

Pssm-ID: 468206 [Multi-domain]  Cd Length: 1122  Bit Score: 54.63  E-value: 8.78e-07
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1926162203  742 MLASNTMDSQmlasSTMDSQMLATSSMDSQMLATSSMDSQMLATSSMDSQMLATSSMDSqmlaTSSMDSQMLATSSMDSQ 821
Cdd:NF033849   230 MYAANLGQSA----GTGYGESVGHSTSQGQSHSVGTSESHSVGTSQSQSHTTGHGSTRG----WSHTQSTSESESTGQSS 301
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1926162203  822 MLATSSMDSQ--MLATSTMDSQMLATSSMDSQMLATSSMDSQmlatSSMDSQMLATSSMDSQMLATSSMDSQMLATSSMD 899
Cdd:NF033849   302 SVGTSESQSHgtTEGTSTTDSSSHSQSSSYNVSSGTGVSSSH----SDGTSQSTSISHSESSSESTGTSVGHSTSSSVSS 377
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1926162203  900 SQMLATSS---MDSQM---LATSSMDSQMLATSSMDSQ-------MLATSSMDSQMLATSSMDSQmlATSSMDSQMLATS 966
Cdd:NF033849   378 SESSSRSSssgVSGGFsggIAGGGVTSEGLGASQGGSEgwgsgdsVQSVSQSYGSSSSTGTSSGH--SDSSSHSTSSGQA 455
                          250       260       270       280       290       300       310
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....
gi 1926162203  967 SMDSQMLATSSM--DSQMLATSTM----DSQMLATSTMDSQMLATSSMDSQMLASGAMDSQMLASGSMDAQML-ASGTM 1038
Cdd:NF033849   456 DSVSQGTSWSEGtgTSQGQSVGTSeswsTSQSETDSVGDSTGTSESVSQGDGRSTGRSESQGTSLGTSGGRTSgAGGSM 534
PHA03247 super family | cl33720 | large tegument protein UL36; Provisional | 249-640 | 3.25e-05

large tegument protein UL36; Provisional


The actual alignment was detected with superfamily member PHA03247:

Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 49.55  E-value: 3.25e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1926162203  249 VPTTAVVLKSSEPVVTMSveyqtksvlKSLESIPPEPSKImllEPPVAKVLEPSETLVSSEIPTEVH----PEPSTSTTM 324
Cdd:PHA03247  2568 VPPPRPAPRPSEPAVTSR---------ARRPDAPPQSARP---RAPVDDRGDPRGPAPPSPLPPDTHapdpPPPSPSPAA 2635
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1926162203  325 DFPESAATEVLRLPEQPVEVPL---------EIADSSMTRPQELLELPKTTPLELPESSVASVMELPGPPATSmlELQGP 395
Cdd:PHA03247  2636 NEPDPHPPPTVPPPERPRDDPApgrvsrprrARRLGRAAQASSPPQRPRRRAARPTVGSLTSLADPPPPPPTP--EPAPH 2713
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1926162203  396 PVTPVLELPGPSATPVPELPGPLSTPVPEllgppatAVPELPGPSVTSVPQLSQELPGLPAPSMGLEPPQEVPEPPVMAQ 475
Cdd:PHA03247  2714 ALVSATPLPPGPAAARQASPALPAAPAPP-------AVPAGPATPGGPARPARPPTTAGPPAPAPPAAPAAGPPRRLTRP 2786
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1926162203  476 ELPGLPVVTAAVELPGQPAVTVAMELTEQPVTTTELEQSVGMTTVEHPGQPEVTTATGLLGQPEAAMVLELPGQPVATTA 555
Cdd:PHA03247  2787 AVASLSESRESLPSPWDPADPPAAVLAPAAALPPAASPAGPLPPPTSAQPTAPPPPPGPPPPSLPLGGSVAPGGDVRRRP 2866
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1926162203  556 lelPGQPSVTgVPELPGLPSATR----ALELSGQPVATGALELPGQLMAAGALEFAGQSgaagalELLGQPLATGVLELP 631
Cdd:PHA03247  2867 ---PSRSPAA-KPAAPARPPVRRlarpAVSRSTESFALPPDQPERPPQPQAPPPPQPQP------QPPPPPQPQPPPPPP 2936

                   ....*....
gi 1926162203  632 GQPGAPELP 640
Cdd:PHA03247  2937 PRPQPPLAP 2945
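
Each hit above carries a statistics line of the form "Pssm-ID: ... Cd Length: ... Bit Score: ... E-value: ...". The following is a minimal Python sketch for pulling those fields out of a saved text copy of this listing. The function name `parse_stats_line` and the regular expression are illustrative assumptions based only on the line layout shown here; they are not part of any NCBI tool or API.

```python
import re

# Pattern written against the statistics lines in this listing
# (an assumption about the layout, not an official format definition).
STATS_RE = re.compile(
    r"Pssm-ID:\s*(?P<pssm_id>\d+)\s*"
    r"(?:\[(?P<flag>[^\]]+)\]\s*)?"
    r"Cd Length:\s*(?P<cd_length>\d+)\s*"
    r"Bit Score:\s*(?P<bit_score>[\d.]+)\s*"
    r"E-value:\s*(?P<e_value>[\d.eE+-]+)"
)


def parse_stats_line(line):
    """Return the numeric fields of one statistics line, or None if absent."""
    m = STATS_RE.search(line)
    if m is None:
        return None
    return {
        "pssm_id": int(m.group("pssm_id")),
        "multi_domain": m.group("flag") == "Multi-domain",
        "cd_length": int(m.group("cd_length")),
        "bit_score": float(m.group("bit_score")),
        "e_value": float(m.group("e_value")),
    }


# Statistics line of the first hit (rne, PRK10811), copied from the listing.
example = (
    "Pssm-ID: 236766 [Multi-domain]  Cd Length: 1068  "
    "Bit Score: 58.13  E-value: 8.17e-08"
)
print(parse_stats_line(example))
```

Lines that do not contain the statistics fields return `None`, so the function can be applied to every line of the listing and the non-matches filtered out.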
 
Name | Accession | Description | Interval | E-value
PspC_subgroup_2 | NF033839 | pneumococcal surface protein PspC, LPXTG-anchored form; The pneumococcal surface protein PspC, ... | 176-446 | 9.51e-05

pneumococcal surface protein PspC, LPXTG-anchored form; The pneumococcal surface protein PspC, as described in Streptococcus pneumoniae, is a repetitive and highly variable protein, recognized by a conserved N-terminal domain and also by genomic location. This form, subgroup 2, is anchored covalently after cleavage by sortase at a C-terminal LPXTG site. The other form, subgroup 1, has variable numbers of a choline-binding repeat in the C-terminal region, and is also known as choline-binding protein A.


Pssm-ID: 468202 [Multi-domain]  Cd Length: 557  Bit Score: 47.84  E-value: 9.51e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1926162203  176 LSETSESAVVLEPPILSMEiSEPHTLETLKPATKTAELSVASASVISVQSEQSVAVMLEPSTTKVLDsfaTAPVPTTAVV 255
Cdd:NF033839   113 LNKIVESTSKSQLQKLMME-SQSKVDEAVSKFEKDSSSSSSSGSSTKPETPQPENPEHQKPTTPAPD---TKPSPQPEGK 188
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1926162203  256 lKSSEPVVTMSVEYQTKSVLKSLESIPPEPSKIMLLEP------PVAKVLEPSETLVSSEIP---TEVHPEPSTSTTMDF 326
Cdd:NF033839   189 -KPSVPDINQEKEKAKLAVATYMSKILDDIQKHHLQKEkhrqivALIKELDELKKQALSEIDnvnTKVEIENTVHKIFAD 267
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1926162203  327 PESAATEVLR-----LPEQPVEVPLEIADSSMTRPQELLELPKTTPLELPESSVASVMELPGPPATSMLELQGPPVTPVL 401
Cdd:NF033839   268 MDAVVTKFKKgltqdTPKEPGNKKPSAPKPGMQPSPQPEKKEVKPEPETPKPEVKPQLEKPKPEVKPQPEKPKPEVKPQL 347
                          250       260       270       280
                   ....*....|....*....|....*....|....*....|....*
gi 1926162203  402 ELPGPSATPVPELPGPLSTPVPELLGPPATAVPELPGPSVTSVPQ 446
Cdd:NF033839   348 ETPKPEVKPQPEKPKPEVKPQPEKPKPEVKPQPETPKPEVKPQPE 392
DUF3729 | pfam12526 | Protein of unknown function (DUF3729); This family of proteins is found in viruses. Proteins ... | 367-450 | 3.95e-04

Protein of unknown function (DUF3729); This family of proteins is found in viruses. Proteins in this family are typically between 145 and 1707 amino acids in length. The family is found in association with pfam01443, pfam01661, pfam05417, pfam01660, pfam00978. There is a single completely conserved residue L that may be functionally important.


Pssm-ID: 372164 [Multi-domain]  Cd Length: 115  Bit Score: 42.37  E-value: 3.95e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1926162203  367 PLELPESSVASVMELPGPPATSMLelqgPPVTPVLELPGPSATPVPELPGPlsTPVPELLGPPATAVPELPGPSVTSVPQ 446
Cdd:pfam12526   31 PPESAHPDPPPPVGDPRPPVVDTP----PPVSAVWVLPPPSEPAAPEPDLV--PPVTGPAGPPSPLAPPAPAQKPPLPPP 104

                   ....
gi 1926162203  447 LSQE 450
Cdd:pfam12526  105 RPQR 108
Lepto_longest | TIGR04388 | putative large structural protein; Members of this family are restricted so far to the lineage ... | 646-1042 | 6.92e-04

putative large structural protein; Members of this family are restricted so far to the lineage Leptospira, where they may be the longest protein encoded by the genome. Two or three paralogs are often found. The seed alignment for this model includes sequences with significant length variability, and stops adjacent to an intein feature most full-length members of this family share. Oddly, members closely related in sequence up to the start of the intein (see TIGR01445) usually show very little sequence similarity C-terminal to the end of the intein (see TIGR01443). [Unknown function, General]


Pssm-ID: 275181 [Multi-domain]  Cd Length: 1134  Bit Score: 45.27  E-value: 6.92e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1926162203  646 TVALEISVQSVVTTELSTMTVSQSLEVPSTTALESYNTV----------------------AQELPTTLVGETSVTVGVD 703
Cdd:TIGR04388  435 TFWYSINFQVQDLNAYANATTWNGFVSQLNSELHSWNNVtpsitnwegqvaayqaqyaawhAQAQTYIDSLQQSYTTGVQ 514
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1926162203  704 PLMAQESHMLasntmethmlasNTMDSQMLASNTMDSQMLASNTMDSQMLASSTMDSQMLATSSMDSQMLATSSmDSQML 783
Cdd:TIGR04388  515 DLQSQEQSWL------------ANMGNQFQAQSSFAQASNDLDNLKTQQILDSLSPKISQVNLPSSSDVLSRNA-DAPVP 581
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1926162203  784 ATSSMDSqmlATSSMDSQMLATS--SMDSQmLATSSMDSQMLATSSMDSQMLATSTMDSQ---MLATSSMDSQMLATSSM 858
Cdd:TIGR04388  582 DQSSLNN---VLSIFQQSLMGASnlALENQ-LNNQAIDEKLNAIQGIASSLGSNAVVDSHgniVYTTVIEDGHAKLKAGG 657
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1926162203  859 DsqmlATSSMDSQmlatSSMDSQMLATSSMDSQMLATSSMdSQMLATSSMDSQMLAT-----SSMDSQMLATSSMDSQML 933
Cdd:TIGR04388  658 D----ATNASDYE----ANTTDRVVTIAAPATIAIGAAAA-GDLFQSWDTGSVVSQNftnlgSFNSSYNSTINSLNHQVA 728
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1926162203  934 ATSSMDSQMLATSSMDSQMLATSSMDSQMLAT---------SSMDSQML-ATSSMDSQMLATST-----MDSQMLatSTM 998
Cdd:TIGR04388  729 ALNSLNAKNDSSFQEDAQAKASIASLIQSLAQavllggsfgSWVKGQIQdKVNSAIATALANTTgmspdMAAQLV--SWF 806
                          410       420       430       440
                   ....*....|....*....|....*....|....*....|....*...
gi 1926162203  999 DSQMLATSSMDSQMLASGAMDSQMLASGSMD----AQMLASGTMDAQM 1042
Cdd:TIGR04388  807 EKQQAAKKAKAKARTEDITSGVVVVASIALSflagPEMLAVGQAALQA 854
half-pint | TIGR01645 | poly-U binding splicing factor, half-pint family; The proteins represented by this model ... | 363-505 | 1.42e-03

poly-U binding splicing factor, half-pint family; The proteins represented by this model contain three RNA recognition motifs (rrm: pfam00076) and have been characterized as poly-pyrimidine tract binding proteins associated with RNA splicing factors. PUF60 (GP|6176532), in complex with p54 and in the presence of U2AF, facilitates association of U2 snRNP with pre-mRNA.


Pssm-ID: 130706 [Multi-domain]  Cd Length: 612  Bit Score: 43.91  E-value: 1.42e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1926162203  363 PKTTPLELPESSVASVMELPG--PPATSMLELQG--PPVTPVLELPGPSATPVPELPGPLstPVPELLGPPATAVPELPG 438
Cdd:TIGR01645  326 PRAQSPATPSSSLPTDIGNKAvvSSAKKEAEEVPplPQAAPAVVKPGPMEIPTPVPPPGL--AIPSLVAPPGLVAPTEIN 403
                           90       100       110       120       130       140       150
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|...
gi 1926162203  439 PSVTSVPQLSQELPGLPA------PSMGLEPPQEVPEPPVMAQELPGLPVVTAAVELPGQPAVTVAMELTEQP 505
Cdd:TIGR01645  404 PSFLASPRKKMKREKLPVtfgaldDTLAWKEPSKEDQTSEDGKMLAIMGEAAAALALEPKKKKKEKEGEELQP 476
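
Several of the reported intervals cover the same stretch of the query (for example 249-640, 176-446, 367-450 and 363-505). The sketch below, in Python, shows one way to list which hits overlap and by how many residues. The interval values are copied from the Interval column of this listing; the helper names `overlap_length` and `overlapping_pairs` are hypothetical, and treating positions as 1-based and inclusive is an assumption.

```python
from itertools import combinations

# (name, start, end) copied from the Interval column of this listing;
# positions are assumed to be 1-based and inclusive.
HITS = [
    ("rne", 1345, 1641),
    ("ser_rich_anae_1", 742, 1038),
    ("PHA03247", 249, 640),
    ("PspC_subgroup_2", 176, 446),
    ("DUF3729", 367, 450),
    ("Lepto_longest", 646, 1042),
    ("half-pint", 363, 505),
]


def overlap_length(a, b):
    """Number of query residues shared by two (name, start, end) hits."""
    start = max(a[1], b[1])
    end = min(a[2], b[2])
    return max(0, end - start + 1)


def overlapping_pairs(hits):
    """Return (name_a, name_b, shared_residues) for every overlapping pair."""
    pairs = []
    for a, b in combinations(hits, 2):
        n = overlap_length(a, b)
        if n > 0:
            pairs.append((a[0], b[0], n))
    return pairs


for name_a, name_b, n in overlapping_pairs(HITS):
    print(f"{name_a} / {name_b}: {n} residues shared")
```

Overlap between hits in a repetitive, low-complexity region like this one says nothing by itself about which annotation, if any, is biologically meaningful; the sketch only makes the shared coordinates explicit.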
 
Name | Accession | Description | Interval | E-value
PRK07994 | PRK07994 | DNA polymerase III subunits gamma and tau; Validated | 1464-1598 | 9.63e-07

DNA polymerase III subunits gamma and tau; Validated


Pssm-ID: 236138 [Multi-domain]  Cd Length: 647  Bit Score: 54.10  E-value: 9.63e-07
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1926162203 1464 HTASEVSISLMEPAVTIPEPSQQSTLELPAMAVSELPVMAVPEPLAVAVPAPAVMAVPELPAVVVAEHPAVAVPEYPAVA 1543
Cdd:PRK07994   360 HPAAPLPEPEVPPQSAAPAASAQATAAPTAAVAPPQAPAVPPPPASAPQQAPAVPLPETTSQLLAARQQLQRAQGATKAK 439
                           90       100       110       120       130
                   ....*....|....*....|....*....|....*....|....*....|....*....
gi 1926162203 1544 VPEyPAVAVLDPPAESLLEPMALAEPEHVTIPVPHVSA----LEPTVPGLEPTVSVLQP 1598
Cdd:PRK07994   440 KSE-PAAASRARPVNSALERLASVRPAPSALEKAPAKKeayrWKATNPVEVKKEPVATP 497
rne | PRK10811 | ribonuclease E; Reviewed | 1500-1647 | 7.47e-06

ribonuclease E; Reviewed


Pssm-ID: 236766 [Multi-domain]  Cd Length: 1068  Bit Score: 51.58  E-value: 7.47e-06
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1926162203 1500 PVMAVPEPLAVAVPAPAVMAVPELPAVVVAEHPAVAVPEYPAVAVPEYPAVAVldPPAESLLEPMALAEPEHVTIPV--- 1576
Cdd:PRK10811   850 PQDVQVEEQREAEEVQVQPVVAEVPVAAAVEPVVSAPVVEAVAEVVEEPVVVA--EPQPEEVVVVETTHPEVIAAPVteq 927
                           90       100       110       120       130       140       150
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*
gi 1926162203 1577 PHVSAlEPTVPGLEPTVSVLQPNVIVSEPSVSVQEST----VTVLESAVTISQQTQVISTETVLESTPMILESSV 1647
Cdd:PRK10811   928 PQVIT-ESDVAVAQEVAEHAEPVVEPQDETADIEEAAetaeVVVAEPEVVAQPAAPVVAEVAAEVETVTAVEPEV 1001
PRK14951 PRK14951
DNA polymerase III subunits gamma and tau; Provisional
1477-1595 6.10e-04

DNA polymerase III subunits gamma and tau; Provisional


Pssm-ID: 237865 [Multi-domain]  Cd Length: 618  Bit Score: 45.09  E-value: 6.10e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1926162203 1477 AVTIPEPSQQSTLELPAMAvselPVMAVPEPLAVAVPAP-----AVMAVPELPAVVVAEHPAVAVPEYPAVAVPEYPAVA 1551
Cdd:PRK14951   369 AAEAAAPAEKKTPARPEAA----APAAAPVAQAAAAPAPaaapaAAASAPAAPPAAAPPAPVAAPAAAAPAAAPAAAPAA 444
                           90       100       110       120
                   ....*....|....*....|....*....|....*....|....
gi 1926162203 1552 VLDPPAESLLEPMALAEPEHVTIPVPHVSALEPTVPGLEPTVSV 1595
Cdd:PRK14951   445 VALAPAPPAQAAPETVAIPVRVAPEPAVASAAPAPAAAPAAARL 488
Lepto_longest TIGR04388
putative large structural protein; Members of this family are restricted so far to the lineage ...
646-1042 6.92e-04

putative large structural protein; Members of this family are restricted so far to the lineage Leptospira, where they may be the longest protein encoded by the genome. Two or three paralogs are often found. The seed alignment for this model includes sequences with significant length variability, and stops adjacent to an intein feature most full-length members of this family share. Oddly, members closely related in sequence up to the start of the intein (see TIGR01445) usually show very little sequence similarity C-terminal to the end of the intein (see TIGR01443). [Unknown function, General]


Pssm-ID: 275181 [Multi-domain]  Cd Length: 1134  Bit Score: 45.27  E-value: 6.92e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1926162203  646 TVALEISVQSVVTTELSTMTVSQSLEVPSTTALESYNTV----------------------AQELPTTLVGETSVTVGVD 703
Cdd:TIGR04388  435 TFWYSINFQVQDLNAYANATTWNGFVSQLNSELHSWNNVtpsitnwegqvaayqaqyaawhAQAQTYIDSLQQSYTTGVQ 514
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1926162203  704 PLMAQESHMLasntmethmlasNTMDSQMLASNTMDSQMLASNTMDSQMLASSTMDSQMLATSSMDSQMLATSSmDSQML 783
Cdd:TIGR04388  515 DLQSQEQSWL------------ANMGNQFQAQSSFAQASNDLDNLKTQQILDSLSPKISQVNLPSSSDVLSRNA-DAPVP 581
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1926162203  784 ATSSMDSqmlATSSMDSQMLATS--SMDSQmLATSSMDSQMLATSSMDSQMLATSTMDSQ---MLATSSMDSQMLATSSM 858
Cdd:TIGR04388  582 DQSSLNN---VLSIFQQSLMGASnlALENQ-LNNQAIDEKLNAIQGIASSLGSNAVVDSHgniVYTTVIEDGHAKLKAGG 657
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1926162203  859 DsqmlATSSMDSQmlatSSMDSQMLATSSMDSQMLATSSMdSQMLATSSMDSQMLAT-----SSMDSQMLATSSMDSQML 933
Cdd:TIGR04388  658 D----ATNASDYE----ANTTDRVVTIAAPATIAIGAAAA-GDLFQSWDTGSVVSQNftnlgSFNSSYNSTINSLNHQVA 728
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1926162203  934 ATSSMDSQMLATSSMDSQMLATSSMDSQMLAT---------SSMDSQML-ATSSMDSQMLATST-----MDSQMLatSTM 998
Cdd:TIGR04388  729 ALNSLNAKNDSSFQEDAQAKASIASLIQSLAQavllggsfgSWVKGQIQdKVNSAIATALANTTgmspdMAAQLV--SWF 806
                          410       420       430       440
                   ....*....|....*....|....*....|....*....|....*...
gi 1926162203  999 DSQMLATSSMDSQMLASGAMDSQMLASGSMD----AQMLASGTMDAQM 1042
Cdd:TIGR04388  807 EKQQAAKKAKAKARTEDITSGVVVVASIALSflagPEMLAVGQAALQA 854
PHA03247 PHA03247
large tegument protein UL36; Provisional
198-460 8.92e-04

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 44.93  E-value: 8.92e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1926162203  198 PHTLETLKPATKTAELSVASASVISVQSEQSVAVMLEPSTTKVLDSFATAPVPTTAVVLKSSEPVVTMSVEYQTKSVLKS 277
Cdd:PHA03247  2717 SATPLPPGPAAARQASPALPAAPAPPAVPAGPATPGGPARPARPPTTAGPPAPAPPAAPAAGPPRRLTRPAVASLSESRE 2796
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1926162203  278 LESIPPEPSkimllePPVAKVLEPSETLVSSEIPTEVHPEPSTSTtmdfPESAATevlrlPEQPVEVPLE----IADSSM 353
Cdd:PHA03247  2797 SLPSPWDPA------DPPAAVLAPAAALPPAASPAGPLPPPTSAQ----PTAPPP-----PPGPPPPSLPlggsVAPGGD 2861
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1926162203  354 TRPQELLELPKTTPLELPESSVASVMELPGPPATSMLEL--QGPPVTPVLELPGPSATPVPELPGPLSTPVPELLGPPAT 431
Cdd:PHA03247  2862 VRRRPPSRSPAAKPAAPARPPVRRLARPAVSRSTESFALppDQPERPPQPQAPPPPQPQPQPPPPPQPQPPPPPPPRPQP 2941
                          250       260
                   ....*....|....*....|....*....
gi 1926162203  432 AVPELPGPSVTSVPQlsqelPGLPAPSMG 460
Cdd:PHA03247  2942 PLAPTTDPAGAGEPS-----GAVPQPWLG 2965
Atrophin-1 pfam03154
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ...
223-587 1.08e-03

Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA OMIM:125370 is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1, that is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.


Pssm-ID: 460830 [Multi-domain]  Cd Length: 991  Bit Score: 44.37  E-value: 1.08e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1926162203  223 VQSEQSVAVMLEPSTTKVLDSFATAPVPTTAVVLKSSEPVVTMSVEYQTKSV--LKSLESIPP-EPSKIMLLEPPVAKVL 299
Cdd:pfam03154  174 LQAQSGAASPPSPPPPGTTQAATAGPTPSAPSVPPQGSPATSQPPNQTQSTAapHTLIQQTPTlHPQRLPSPHPPLQPMT 253
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1926162203  300 EPSEtlvSSEIPTEVHPEPStsttmdfpesaatevLRLPEQPVEVPLEIADSSMTRPqellelpkTTPLELPESSVASVM 379
Cdd:pfam03154  254 QPPP---PSQVSPQPLPQPS---------------LHGQMPPMPHSLQTGPSHMQHP--------VPPQPFPLTPQSSQS 307
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1926162203  380 ELPGPPATSM-LELQGPPVTPVLELPGPSATPVPELP-GPLSTPVPELLGPPATAVPELPG------PSVTSVPQLSQEL 451
Cdd:pfam03154  308 QVPPGPSPAApGQSQQRIHTPPSQSQLQSQQPPREQPlPPAPLSMPHIKPPPTTPIPQLPNpqshkhPPHLSGPSPFQMN 387
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1926162203  452 PGLPAPSmGLEPPQEVPEPPVMAQELPGLPVVTAAVELPGQPAvtvamelteQPVTTTElEQSVGMTTVEHPgqpevtTA 531
Cdd:pfam03154  388 SNLPPPP-ALKPLSSLSTHHPPSAHPPPLQLMPQSQQLPPPPA---------QPPVLTQ-SQSLPPPAASHP------PT 450
                          330       340       350       360       370
                   ....*....|....*....|....*....|....*....|....*....|....*.
gi 1926162203  532 TGLLGQPEAAMVLELPGQPVATTALELPGQPSVTGVPELPGLPSATRALELSGQPV 587
Cdd:pfam03154  451 SGLHQVPSQSPFPQHPFVPGGPPPITPPSGPPTSTSSAMPGIQPPSSASVSSSGPV 506
PRK14951 PRK14951
DNA polymerase III subunits gamma and tau; Provisional
1461-1592 1.23e-03

DNA polymerase III subunits gamma and tau; Provisional


Pssm-ID: 237865 [Multi-domain]  Cd Length: 618  Bit Score: 43.93  E-value: 1.23e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1926162203 1461 ASGHTASEVSISLMEPAVTIPEPSQQStleLPAMAvselPVMAVPEPLAVAVPAPAVMAVPELPAVVVAEHPAVAVPEYP 1540
Cdd:PRK14951   369 AAEAAAPAEKKTPARPEAAAPAAAPVA---QAAAA----PAPAAAPAAAASAPAAPPAAAPPAPVAAPAAAAPAAAPAAA 441
                           90       100       110       120       130
                   ....*....|....*....|....*....|....*....|....*....|..
gi 1926162203 1541 AVAVPEYPAVAVLDPPAESLLEPMALAEPEHVTIPVPHVSAlePTVPGLEPT 1592
Cdd:PRK14951   442 PAAVALAPAPPAQAAPETVAIPVRVAPEPAVASAAPAPAAA--PAAARLTPT 491
half-pint TIGR01645
poly-U binding splicing factor, half-pint family; The proteins represented by this model ...
363-505 1.42e-03

poly-U binding splicing factor, half-pint family; The proteins represented by this model contain three RNA recognition motifs (rrm: pfam00076) and have been characterized as poly-pyrimidine tract binding proteins associated with RNA splicing factors. In the case of PUF60 (GP|6176532), in complex with p54, and in the presence of U2AF, facilitates association of U2 snRNP with pre-mRNA.


Pssm-ID: 130706 [Multi-domain]  Cd Length: 612  Bit Score: 43.91  E-value: 1.42e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1926162203  363 PKTTPLELPESSVASVMELPG--PPATSMLELQG--PPVTPVLELPGPSATPVPELPGPLstPVPELLGPPATAVPELPG 438
Cdd:TIGR01645  326 PRAQSPATPSSSLPTDIGNKAvvSSAKKEAEEVPplPQAAPAVVKPGPMEIPTPVPPPGL--AIPSLVAPPGLVAPTEIN 403
                           90       100       110       120       130       140       150
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|...
gi 1926162203  439 PSVTSVPQLSQELPGLPA------PSMGLEPPQEVPEPPVMAQELPGLPVVTAAVELPGQPAVTVAMELTEQP 505
Cdd:TIGR01645  404 PSFLASPRKKMKREKLPVtfgaldDTLAWKEPSKEDQTSEDGKMLAIMGEAAAALALEPKKKKKEKEGEELQP 476
PRK10263 PRK10263
DNA translocase FtsK; Provisional
217-445 2.34e-03

DNA translocase FtsK; Provisional


Pssm-ID: 236669 [Multi-domain]  Cd Length: 1355  Bit Score: 43.54  E-value: 2.34e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1926162203  217 SASVISVQSEQSVAVMLEPSTTkvldsfaTAPVPTTAVVLksSEPVVtmsvEYQTKSVLKSLE-SIPPEPSKImllePPV 295
Cdd:PRK10263   321 AVAAAATTATQSWAAPVEPVTQ-------TPPVASVDVPP--AQPTV----AWQPVPGPQTGEpVIAPAPEGY----PQQ 383
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1926162203  296 AKVLEPSETLvSSEIPTEVHPEPSTSTTMDFPESAATEVLRLPEQPVEVPLEIADSSMTRPQELLELPKTTPLELPESSV 375
Cdd:PRK10263   384 SQYAQPAVQY-NEPLQQPVQPQQPYYAPAAEQPAQQPYYAPAPEQPAQQPYYAPAPEQPVAGNAWQAEEQQSTFAPQSTY 462
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1926162203  376 ASVMELPGPPATSMLELQGPPV--------TPVLELPGPSATPV-------------PELPGPLSTPVPELLGPPATAVP 434
Cdd:PRK10263   463 QTEQTYQQPAAQEPLYQQPQPVeqqpvvepEPVVEETKPARPPLyyfeeveekrareREQLAAWYQPIPEPVKEPEPIKS 542
                          250
                   ....*....|.
gi 1926162203  435 ELPGPSVTSVP 445
Cdd:PRK10263   543 SLKAPSVAAVP 553
PRK12323 PRK12323
DNA polymerase III subunit gamma/tau;
1378-1584 3.88e-03

DNA polymerase III subunit gamma/tau;


Pssm-ID: 237057 [Multi-domain]  Cd Length: 700  Bit Score: 42.56  E-value: 3.88e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1926162203 1378 VSASEPSVLTSEAAVTAPEPPLEPESSVMSTPAESAVATeeheivperpvtyisensmlAEPSMLTSEPTIMSETAETFD 1457
Cdd:PRK12323   375 ATAAAAPVAQPAPAAAAPAAAAPAPAAPPAAPAAAPAAA--------------------AAARAVAAAPARRSPAPEALA 434
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1926162203 1458 SMRASGHTASEVSISLMEPAVTIPEPSQQstlelPAMAVSELPVMAVPEPLAVAVPAPAVMAVP-------ELPAVVVAE 1530
Cdd:PRK12323   435 AARQASARGPGGAPAPAPAPAAAPAAAAR-----PAAAGPRPVAAAAAAAPARAAPAAAPAPADddpppweELPPEFASP 509
                          170       180       190       200       210
                   ....*....|....*....|....*....|....*....|....*....|....*.
gi 1926162203 1531 HPAVAVPEYPAVAVPEY--PAVAVLDPPAESLLEPMALAEPEHVTIPVPHVSALEP 1584
Cdd:PRK12323   510 APAQPDAAPAGWVAESIpdPATADPDDAFETLAPAPAAAPAPRAAAATEPVVAPRP 565
PRK10263 PRK10263
DNA translocase FtsK; Provisional
1348-1589 4.78e-03

DNA translocase FtsK; Provisional


Pssm-ID: 236669 [Multi-domain]  Cd Length: 1355  Bit Score: 42.38  E-value: 4.78e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1926162203 1348 ALPPEESVSLPETVSQNEISEPSALLANYSVSASepsVLTSEAAVTAPEPPLEPESSVMSTPAESAVateehEIVPERPV 1427
Cdd:PRK10263   286 AADPDDVLFSGNRATQPEYDEYDPLLNGAPITEP---VAVAAAATTATQSWAAPVEPVTQTPPVASV-----DVPPAQPT 357
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1926162203 1428 TyisenSMLAEPSMLTSEPTIMSEtAETFDSMRASGHTASEVSISLMEPAvtipePSQQSTLELPAMAVSELPVMAVPEP 1507
Cdd:PRK10263   358 V-----AWQPVPGPQTGEPVIAPA-PEGYPQQSQYAQPAVQYNEPLQQPV-----QPQQPYYAPAAEQPAQQPYYAPAPE 426
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1926162203 1508 LAVAVPAPAvmAVPELPAVVVAEHPAVAVPEYPAVAVPEYPAVAVLDPPAESLLEPMALAEPEHVTIPVPHVSALEPTVP 1587
Cdd:PRK10263   427 QPAQQPYYA--PAPEQPVAGNAWQAEEQQSTFAPQSTYQTEQTYQQPAAQEPLYQQPQPVEQQPVVEPEPVVEETKPARP 504

                   ..
gi 1926162203 1588 GL 1589
Cdd:PRK10263   505 PL 506
PHA03378 PHA03378
EBNA-3B; Provisional
234-457 9.67e-03

EBNA-3B; Provisional


Pssm-ID: 223065 [Multi-domain]  Cd Length: 991  Bit Score: 41.21  E-value: 9.67e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1926162203  234 EPSTTK-VLDSFATAPVPTTAVVLKSSEPVVTmsveyQTKSVLKSLESIP-PEPSKIMLLEPPVAKVLEPsETLVSSEIP 311
Cdd:PHA03378   552 EPASTEpVHDQLLPAPGLGPLQIQPLTSPTTS-----QLASSAPSYAQTPwPVPHPSQTPEPPTTQSHIP-ETSAPRQWP 625
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1926162203  312 TEVHPEPSTSTTMDfPESAATEVLRLPEQPVEVPLEIADSSMTRPQELLELPKTT--PLELPESSVASVMELPGPPATSM 389
Cdd:PHA03378   626 MPLRPIPMRPLRMQ-PITFNVLVFPTPHQPPQVEITPYKPTWTQIGHIPYQPSPTgaNTMLPIQWAPGTMQPPPRAPTPM 704
                          170       180       190       200       210       220       230
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*
gi 1926162203  390 LELQGPPV------TPVLELPGPSATP-VPELPGPLSTPVPELLGPPATAVPELPGPSVTSVPQLSqelPGLPAP 457
Cdd:PHA03378   705 RPPAAPPGraqrpaAATGRARPPAAAPgRARPPAAAPGRARPPAAAPGRARPPAAAPGRARPPAAA---PGAPTP 776
rne PRK10811
ribonuclease E; Reviewed
1329-1548 9.80e-03

ribonuclease E; Reviewed


Pssm-ID: 236766 [Multi-domain]  Cd Length: 1068  Bit Score: 41.18  E-value: 9.80e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1926162203 1329 LSTEQSALTAENTWPTEVPALPpeeSVSLPETVSQNEISEPSALLANySVSASEPSVLTSEAAVTAPEPPLEPESSVMST 1408
Cdd:PRK10811   847 VVRPQDVQVEEQREAEEVQVQP---VVAEVPVAAAVEPVVSAPVVEA-VAEVVEEPVVVAEPQPEEVVVVETTHPEVIAA 922
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1926162203 1409 PAesavaTEEHEIVPErpvtyisENSMLAEPSMLTSEPTIMSETAETFDsmrasgHTASEVsislmePAVTIPEPSQQST 1488
Cdd:PRK10811   923 PV-----TEQPQVITE-------SDVAVAQEVAEHAEPVVEPQDETADI------EEAAET------AEVVVAEPEVVAQ 978
                          170       180       190       200       210       220
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*.
gi 1926162203 1489 LELPAMAVSELPVMAVPeplavavPAPAVMAVPELPAVVVAEHPAVA------VPEYpavaVPEYP 1548
Cdd:PRK10811   979 PAAPVVAEVAAEVETVT-------AVEPEVAPAQVPEATVEHNHATApmtrapAPEY----VPEAP 1033
 
Blast search parameters
Data Source: Precalculated data, version = cdd.v.3.21
Preset Options:
Database: CDSEARCH/cdd
Low complexity filter: no
Composition Based Adjustment: yes
E-value threshold: 0.01
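
The E-value threshold above can be checked against the per-hit summary lines in this listing. The following is a minimal sketch, not part of the CD-Search output or any NCBI tool: the function names `parse_hit_line` and `passes_threshold` are hypothetical, the regular expression assumes the summary-line layout shown on this page, and treating the threshold as inclusive is an assumption.

```python
import re

# Layout of the per-hit summary line as it appears in this listing, e.g.
# "Pssm-ID: 468202 [Multi-domain]  Cd Length: 557  Bit Score: 47.84  E-value: 9.51e-05"
HIT_LINE = re.compile(
    r"Pssm-ID:\s*(?P<pssm_id>\d+)\s*\[(?P<kind>[^\]]+)\]\s*"
    r"Cd Length:\s*(?P<cd_length>\d+)\s*"
    r"Bit Score:\s*(?P<bit_score>[\d.]+)\s*"
    r"E-value:\s*(?P<e_value>[\d.eE+-]+)"
)


def parse_hit_line(line):
    """Return the fields of one summary line as a dict, or None if it does not match."""
    match = HIT_LINE.search(line)
    if match is None:
        return None
    return {
        "pssm_id": int(match.group("pssm_id")),
        "kind": match.group("kind"),
        "cd_length": int(match.group("cd_length")),
        "bit_score": float(match.group("bit_score")),
        "e_value": float(match.group("e_value")),
    }


def passes_threshold(hit, threshold=0.01):
    """Assumption: a hit is reported when its E-value is at or below the threshold."""
    return hit["e_value"] <= threshold


# Two summary lines copied from the hit list above.
sample_lines = [
    "Pssm-ID: 468202 [Multi-domain]  Cd Length: 557  Bit Score: 47.84  E-value: 9.51e-05",
    "Pssm-ID: 236766 [Multi-domain]  Cd Length: 1068  Bit Score: 41.18  E-value: 9.80e-03",
]

hits = [parse_hit_line(line) for line in sample_lines]
for hit in hits:
    print(hit["pssm_id"], hit["bit_score"], hit["e_value"], passes_threshold(hit))
```

Both sample hits fall under the 0.01 cutoff. The highest E-value in the hit list, 9.80e-03, sits just below it.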
