|
Name |
Accession |
Description |
Interval |
E-value |
| Ten_N |
pfam06484 |
Teneurin Intracellular Region; This family is found in the intracellular N-terminal region of ... |
10-374 |
0e+00 |
|
Teneurin Intracellular Region; This family is found in the intracellular N-terminal region of the Teneurin family of proteins. These proteins are 'pair-rule' genes and are involved in tissue patterning, specifically probably neural patterning. The intracellular domain is cleaved in response to homophilic interaction of the extracellular domain, and translocates to the nucleus. Here it probably carries out to some transcriptional regulatory activity. The length of this region and the conservation suggests that there may be two structural domains here (personal obs:C Yeats). :
Pssm-ID: 461932 [Multi-domain] Cd Length: 367 Bit Score: 643.57 E-value: 0e+00
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958656154 10 SLTRGRCGKECRYTSSSLDSEDCRVPTQKSYSSSETLKAYDHDSRMHYGNRVTDLVHRESDEFSRQGANFTLAELGICEP 89
Cdd:pfam06484 1 SLTKRRRDKERRYTSSSADSEECRVPTQKSYSSSETLKAFDHDSRMLYGNRVKDMVHKEADEFSRQGQNFSLRELGICEP 80
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958656154 90 SP-HRSGYCSDMGILHQGYSLSTGSDADSDTEGGMSPEHAIRLWGRGIKSRRSSGLSSRENSALTLTDSDNENKSDDDNG 168
Cdd:pfam06484 81 SPrHGLAYCTEMGLPHRGYSISTGSDADTETDGPMSPEHAVRLWGRGTKSGRSSCLSSRSNSALTLTDTEHENKSDNENG 160
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958656154 169 RPIPPTSSSSLLPsaqlpSSHNPPP---VSCQMPLLDSNTSHQIMDTNPDEEFSPNSYLLRACSGPQQASSSGPPNHHSQ 245
Cdd:pfam06484 161 PPIPPSSSSSSPV-----EQHSPPPpslNENQRPLLGNNASHPILDSDPDEEFSPNSYLVRTGSGPQSAPSEQPPNFQNH 235
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958656154 246 STLRPPLPP-PHNHT-LSHHHSSANSLNRNSLTNRRSQIHAP-APAPNDLATTPESVQLQDSWVLNSNVPLETRHFLFKT 322
Cdd:pfam06484 236 SRLRTPPPPlPPPHKqNQHHHPSINSLNRSSLTNRRNPSPAPtASLPAELQSTQESVQLQDSWVLNSNVPLETRHFLFKT 315
|
330 340 350 360 370
....*....|....*....|....*....|....*....|....*....|..
gi 1958656154 323 SSGSTPLFSSSSPGYPLTSGTVYTPPPRLLPRNTFSRKAFKLKKPSKYCSWK 374
Cdd:pfam06484 316 GTGTTPLFCTASPGYPLTSGTVYSPPPRPLPRNTFSRPAFKLKKPYKYCSWK 367
|
|
| NHL super family |
cl18310 |
NHL repeat unit of beta-propeller proteins; The NHL(NCL-1, HT2A and LIN-41)-repeat is found in ... |
1169-1499 |
1.04e-46 |
|
NHL repeat unit of beta-propeller proteins; The NHL(NCL-1, HT2A and LIN-41)-repeat is found in multiple tandem copies, typically as 6 instances. It is about 40 residues long and resembles the WD repeat and other beta-propeller structures. The repeats have a catalytic activity in Peptidyl-glycine alpha-amidating monooxygenase; proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localized to the repeats. Tripartite motif-containing protein 32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats. The actual alignment was detected with superfamily member cd14953:
Pssm-ID: 302697 [Multi-domain] Cd Length: 323 Bit Score: 171.94 E-value: 1.04e-46
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958656154 1169 PVALAVGIDGSLFVGDF--NYIRRIFPSRNVTSILELRNKEFKHSNSPGHKYY----LAVDPvTGSLYVSDTNSRRIYRV 1242
Cdd:cd14953 25 PSGVAVDAAGNLYVADRgnHRIRKITPDGVVTTVAGTGTAGFADGGGAAAQFNtpsgVAVDA-AGNLYVADTGNHRIRKI 103
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958656154 1243 kslsgakDLAGNSEVVAGTGEqclpfdeARCGDGGKAVDATLMSPRGIAVDKNGLMYFVDAT--MIRKVDQNGIISTLLG 1320
Cdd:cd14953 104 -------TPDGVVSTLAGTGT-------AGFSDDGGATAAQFNYPTGVAVDAAGNLYVADTGnhRIRKITPDGVVTTVAG 169
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958656154 1321 sndlTAVRPLSCDSSMDVAQVRleWPTDLAVNPMDNsLYVLE--NNVILRITENHQVSIIAGRPmhcqvpGIDYSLSKLA 1398
Cdd:cd14953 170 ----TGGAGYAGDGPATAAQFN--NPTGVAVDAAGN-LYVADrgNHRIRKITPDGVVTTVAGTG------TAGFSGDGGA 236
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958656154 1399 IHSALESASAIAISHTGVLYITETDEkkiNRLRQVTTNGEICLLAGAASDcdckndvncicYSGDDAYATDAILNSPSSL 1478
Cdd:cd14953 237 TAAQLNNPTGVAVDAAGNLYVADSGN---HRIRKITPAGVVTTVAGGGAG-----------FSGDGGPATSAQFNNPTGV 302
|
330 340
....*....|....*....|.
gi 1958656154 1479 AVAPDGTIYIADLGNIRIRAV 1499
Cdd:cd14953 303 AVDAAGNLYVADTGNNRIRKI 323
|
|
| Tox-GHH |
pfam15636 |
GHH signature containing HNH/Endo VII superfamily nuclease toxin; A predicted toxin of the HNH ... |
2619-2696 |
4.10e-37 |
|
GHH signature containing HNH/Endo VII superfamily nuclease toxin; A predicted toxin of the HNH/Endonuclease VII fold present in bacterial polymorphic toxin systems with a characteriztic sG[HQ]H signature motif. In bacterial polymorphic toxin systems, the toxin is exported by the type 2, type 6, type 7 or TcdB/TcaC-type secretion system. The metazoan teneurin proteins possess an inactive of this domain at their C-terminus. :
Pssm-ID: 464783 Cd Length: 78 Bit Score: 135.05 E-value: 4.10e-37
10 20 30 40 50 60 70
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*...
gi 1958656154 2619 EEKARVLDQARQRALGTAWAKEQQKARDGREGSRLWTEGEKQQLLSTGRVQGYEGYYVLPVEQYPELADSSSNIQFLR 2696
Cdd:pfam15636 1 EERKRLLEHAKKRAVREAWHRERQLLRNGLPGSRDWTDEEKEELLSTGSVPGYDGEYIHPVEQYPELADDPSNIRFRK 78
|
|
| RhsA |
COG3209 |
Uncharacterized conserved protein RhaS, contains 28 RHS repeats [General function prediction ... |
1460-2397 |
2.16e-29 |
|
Uncharacterized conserved protein RhaS, contains 28 RHS repeats [General function prediction only]; :
Pssm-ID: 442442 [Multi-domain] Cd Length: 1103 Bit Score: 129.11 E-value: 2.16e-29
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958656154 1460 YSGDDAYATDAILNSPSSLAVAPDGTIYIADLGNIRIRAVSKNKPVLNAFNQYEAASPGEQELYVFNADGIHQYTVSLVT 1539
Cdd:COG3209 113 ASAGRLVSTGAGAGGTVTAATGGTLGATAGSATTGSTDGGRGGVAVTGLAGGGASAYGLTLGGAAAGPATGVGTGAVTLA 192
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958656154 1540 GEYLYNFTYSADNDVTELIDNNGNSLKIRRDSSGMPRHLLMPDNQIITLTVGTNGGLKAVSTQNLELGLMTYDGNTGLLA 1619
Cdd:COG3209 193 TGLAGSALLALGSGAILGGLAGAYSGSATTATGTALGTPASVAATVTGSATGAAGAGAAVATAATTLGGTTGAGTGASGA 272
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958656154 1620 TKSDETGWTTFYDYDHEGRLTNVTRPTGVVTSLHREMEKSITIDIENSNRDDDVTVITNLSSVEASYTVVQDQVRNSYQL 1699
Cdd:COG3209 273 GLDASTGTGGAGGSNAAATAGGLGGAGLGSGGAGGGGTAGGTTTAAGTTGTAAVSGAADAGTTTTTGTGTGGTTTTVGGG 352
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958656154 1700 CSNGTLRVMYANGMGVSFHSEPHVLAGTITPTIGRCNISLPMENGLNSIEWRLRKEQIKGKVTIFGRKLRVHGRNLLSID 1779
Cdd:COG3209 353 GSLTLGGYGAAGGLTTSVGAGGGGSTSGSTTTVGGGGTATGSGGGSSTTGVGAGTTTTSTTGGDGGPATAAGALTAGGTA 432
|
330 340 350 360 370 380 390 400
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958656154 1780 YDRNIRTEKIYDDHRKFTLRIIYDQVGRPFLWLPSSGLAAVNVSYFFNGRLAGLQRGAMSERTDIDKQGRIVSRMFADGK 1859
Cdd:COG3209 433 TGTGTGGGGTTAGTDATTTTGGAGASGTLTTTGGAATGATTGGGTEAGTGGGTLTSGSAGATTLGTDTTLDDTLGGTTTT 512
|
410 420 430 440 450 460 470 480
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958656154 1860 VWSYSYLDKSMVLLLQSQRQYIFEYDSSDRLHAVTMPSVARHSMSTHTSIGYIRNIYNPPESNASVIFDYSDDGRILKTS 1939
Cdd:COG3209 513 TAGARGLVVTTGTTLTLGTTTTATLSATDATGTGDTTTTGTVGTGTSTGTGGTGTVTTTGDGTGGASTTTGTTGGTATTT 592
|
490 500 510 520 530 540 550 560
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958656154 1940 FLGTGRQVFYKYGKLSKLSEIVYDSTAVTFGYDETTGVLKMVNLQSGGFSCTIRYRKVGPLVDKQIYRFSEEGMINARFD 2019
Cdd:COG3209 593 TVTTTTTTSTAGTTTTTTSGYTRAGLTLTLGTGTASGLERATASTGSTTGGTTGTGVTTTGTTTTRATGTTGTGTGVTAG 672
|
570 580 590 600 610 620 630 640
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958656154 2020 YTYHDNSFRIASIKPVISETPLPVDLYRYDEISGKVEHFGKFGVIYYDINQIITTAVMTLSKHFDTHGRIKEVQYEMF-R 2098
Cdd:COG3209 673 LTTLATGGTTVGGGTGTTSTATTGATTGGTETGTTVTTLAGGTTTRLGTTTTGGGGGTTTDGTGTGGTTGTLTTTSTTtT 752
|
650 660 670 680 690 700 710 720
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958656154 2099 SLMYWMTVQYDSMGRVIKRELKLGPYANTTKYTYDYDGDGQLQSVAVNDRPTWRYSYDLNGNLH-----LLNPGNSARLM 2173
Cdd:COG3209 753 TTAGALTYTYDALGRLTSETTPGGVTQGTYTTRYTYDALGRLTSVTYPDGETVTYTYDALGRLTsvitvGSGGGTDLQDR 832
|
730 740 750 760 770 780 790 800
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958656154 2174 PLRYDLRDRITRLGDVQykidDDGYLCQRgsdiFEYNSKGLLTRAynKASGWSVQYRYDGVSRRASyKTNLGHHlQYFYS 2253
Cdd:COG3209 833 TYTYDAAGNITSITDAL----RAGTLTQT----YTYDALGRLTSA--TDPGTTESYTYDANGNLTS-RTDGGTT-TYTYD 900
|
810 820 830 840 850 860 870 880
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958656154 2254 DLHHPTRITHvynhSNSEITSLYYDLQGHlfamesssgeeyyvaSDNTGTPLAVFSINGLMIKQLQYTAYGEIYYDSNPD 2333
Cdd:COG3209 901 ALGRLVSVTK----PDGTTTTYTYDALGH---------------TDHLGSVRALTDASGQVVWRYDYDPFGNLLAETSGA 961
|
890 900 910 920 930 940
....*....|....*....|....*....|....*....|....*....|....*....|....
gi 1958656154 2334 FQMVIGFHGGLYDPLTKLVHFTQRDYDVLAGRWTSPDytmwrNVGKEPAPfNLYMFKNNNPLSN 2397
Cdd:COG3209 962 AANPLRFTGQEYDAETGLYYNGARYYDPALGRFLSPD-----PIGLAGGL-NLYAYVGNNPVNY 1019
|
|
| acid_disulf_rpt |
NF033662 |
acidic double-disulfide repeat; The acidic double-disulfide repeat is an Asp-rich repeat with ... |
772-802 |
2.33e-08 |
|
acidic double-disulfide repeat; The acidic double-disulfide repeat is an Asp-rich repeat with four nearly invariant Cys residues in a repeat length of about 35 amino acids. :
Pssm-ID: 411265 [Multi-domain] Cd Length: 32 Bit Score: 51.75 E-value: 2.33e-08
10 20 30
....*....|....*....|....*....|.
gi 1958656154 772 AMETSCADNKDNEGDGLVDCLDPDCCLQSAC 802
Cdd:NF033662 2 ATDTTCSDGIDNDGDGLTDCADPDCAGNPVC 32
|
|
| EGF_2 |
pfam07974 |
EGF-like domain; This family contains EGF domains found in a variety of extracellular proteins. |
580-602 |
3.00e-04 |
|
EGF-like domain; This family contains EGF domains found in a variety of extracellular proteins. :
Pssm-ID: 400365 Cd Length: 26 Bit Score: 40.02 E-value: 3.00e-04
|
| EGF_2 |
pfam07974 |
EGF-like domain; This family contains EGF domains found in a variety of extracellular proteins. |
645-667 |
1.07e-03 |
|
EGF-like domain; This family contains EGF domains found in a variety of extracellular proteins. :
Pssm-ID: 400365 Cd Length: 26 Bit Score: 38.48 E-value: 1.07e-03
|
| EGF_CA |
cd00054 |
Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular ... |
671-700 |
1.24e-03 |
|
Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements. :
Pssm-ID: 238011 Cd Length: 38 Bit Score: 38.39 E-value: 1.24e-03
10 20 30
....*....|....*....|....*....|....*
gi 1958656154 671 DCLDPT-CSSHGVCVNGE----CLCSPGWGGLNCE 700
Cdd:cd00054 4 ECASGNpCQNGGTCVNTVgsyrCSCPPGYTGRNCE 38
|
|
| EGF_CA |
cd00054 |
Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular ... |
736-769 |
3.96e-03 |
|
Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements. :
Pssm-ID: 238011 Cd Length: 38 Bit Score: 37.23 E-value: 3.96e-03
10 20 30
....*....|....*....|....*....|....*.
gi 1958656154 736 VDGC--PDLCNGNGRCTLGQNSWQCVCQTGWRGPGC 769
Cdd:cd00054 2 IDECasGNPCQNGGTCVNTVGSYRCSCPPGYTGRNC 37
|
|
| I-EGF_1 |
pfam18372 |
Integrin beta epidermal growth factor like domain 1; This is the I-EGF 1 domain found in ... |
611-628 |
6.04e-03 |
|
Integrin beta epidermal growth factor like domain 1; This is the I-EGF 1 domain found in several integrin betas such as integrin beta 1-7. Structural analysis reveal an epidermal growth factor-like (I-EGF) domains 1 and 2. EGF1 lacks one disulfide (C2-C4) relative to the integrin EGF 2, 3, and 4 domains, this allows the C-terminal end of EGF1 to flex remarkably relative to its N-terminal end. :
Pssm-ID: 465729 Cd Length: 29 Bit Score: 36.31 E-value: 6.04e-03
|
|
|
Name |
Accession |
Description |
Interval |
E-value |
| Ten_N |
pfam06484 |
Teneurin Intracellular Region; This family is found in the intracellular N-terminal region of ... |
10-374 |
0e+00 |
|
Teneurin Intracellular Region; This family is found in the intracellular N-terminal region of the Teneurin family of proteins. These proteins are 'pair-rule' genes and are involved in tissue patterning, specifically probably neural patterning. The intracellular domain is cleaved in response to homophilic interaction of the extracellular domain, and translocates to the nucleus. Here it probably carries out to some transcriptional regulatory activity. The length of this region and the conservation suggests that there may be two structural domains here (personal obs:C Yeats).
Pssm-ID: 461932 [Multi-domain] Cd Length: 367 Bit Score: 643.57 E-value: 0e+00
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958656154 10 SLTRGRCGKECRYTSSSLDSEDCRVPTQKSYSSSETLKAYDHDSRMHYGNRVTDLVHRESDEFSRQGANFTLAELGICEP 89
Cdd:pfam06484 1 SLTKRRRDKERRYTSSSADSEECRVPTQKSYSSSETLKAFDHDSRMLYGNRVKDMVHKEADEFSRQGQNFSLRELGICEP 80
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958656154 90 SP-HRSGYCSDMGILHQGYSLSTGSDADSDTEGGMSPEHAIRLWGRGIKSRRSSGLSSRENSALTLTDSDNENKSDDDNG 168
Cdd:pfam06484 81 SPrHGLAYCTEMGLPHRGYSISTGSDADTETDGPMSPEHAVRLWGRGTKSGRSSCLSSRSNSALTLTDTEHENKSDNENG 160
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958656154 169 RPIPPTSSSSLLPsaqlpSSHNPPP---VSCQMPLLDSNTSHQIMDTNPDEEFSPNSYLLRACSGPQQASSSGPPNHHSQ 245
Cdd:pfam06484 161 PPIPPSSSSSSPV-----EQHSPPPpslNENQRPLLGNNASHPILDSDPDEEFSPNSYLVRTGSGPQSAPSEQPPNFQNH 235
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958656154 246 STLRPPLPP-PHNHT-LSHHHSSANSLNRNSLTNRRSQIHAP-APAPNDLATTPESVQLQDSWVLNSNVPLETRHFLFKT 322
Cdd:pfam06484 236 SRLRTPPPPlPPPHKqNQHHHPSINSLNRSSLTNRRNPSPAPtASLPAELQSTQESVQLQDSWVLNSNVPLETRHFLFKT 315
|
330 340 350 360 370
....*....|....*....|....*....|....*....|....*....|..
gi 1958656154 323 SSGSTPLFSSSSPGYPLTSGTVYTPPPRLLPRNTFSRKAFKLKKPSKYCSWK 374
Cdd:pfam06484 316 GTGTTPLFCTASPGYPLTSGTVYSPPPRPLPRNTFSRPAFKLKKPYKYCSWK 367
|
|
| NHL_like_1 |
cd14953 |
Uncharacterized NHL-repeat domain in bacterial proteins; This bacterial family of NHL-repeat ... |
1169-1499 |
1.04e-46 |
|
Uncharacterized NHL-repeat domain in bacterial proteins; This bacterial family of NHL-repeat domains is found in a variety of domain architectures. The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies, typically as 6 instances. It is about 40 residues long and resembles the WD repeat and other beta-propeller structures.
Pssm-ID: 271323 [Multi-domain] Cd Length: 323 Bit Score: 171.94 E-value: 1.04e-46
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958656154 1169 PVALAVGIDGSLFVGDF--NYIRRIFPSRNVTSILELRNKEFKHSNSPGHKYY----LAVDPvTGSLYVSDTNSRRIYRV 1242
Cdd:cd14953 25 PSGVAVDAAGNLYVADRgnHRIRKITPDGVVTTVAGTGTAGFADGGGAAAQFNtpsgVAVDA-AGNLYVADTGNHRIRKI 103
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958656154 1243 kslsgakDLAGNSEVVAGTGEqclpfdeARCGDGGKAVDATLMSPRGIAVDKNGLMYFVDAT--MIRKVDQNGIISTLLG 1320
Cdd:cd14953 104 -------TPDGVVSTLAGTGT-------AGFSDDGGATAAQFNYPTGVAVDAAGNLYVADTGnhRIRKITPDGVVTTVAG 169
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958656154 1321 sndlTAVRPLSCDSSMDVAQVRleWPTDLAVNPMDNsLYVLE--NNVILRITENHQVSIIAGRPmhcqvpGIDYSLSKLA 1398
Cdd:cd14953 170 ----TGGAGYAGDGPATAAQFN--NPTGVAVDAAGN-LYVADrgNHRIRKITPDGVVTTVAGTG------TAGFSGDGGA 236
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958656154 1399 IHSALESASAIAISHTGVLYITETDEkkiNRLRQVTTNGEICLLAGAASDcdckndvncicYSGDDAYATDAILNSPSSL 1478
Cdd:cd14953 237 TAAQLNNPTGVAVDAAGNLYVADSGN---HRIRKITPAGVVTTVAGGGAG-----------FSGDGGPATSAQFNNPTGV 302
|
330 340
....*....|....*....|.
gi 1958656154 1479 AVAPDGTIYIADLGNIRIRAV 1499
Cdd:cd14953 303 AVDAAGNLYVADTGNNRIRKI 323
|
|
| Tox-GHH |
pfam15636 |
GHH signature containing HNH/Endo VII superfamily nuclease toxin; A predicted toxin of the HNH ... |
2619-2696 |
4.10e-37 |
|
GHH signature containing HNH/Endo VII superfamily nuclease toxin; A predicted toxin of the HNH/Endonuclease VII fold present in bacterial polymorphic toxin systems with a characteriztic sG[HQ]H signature motif. In bacterial polymorphic toxin systems, the toxin is exported by the type 2, type 6, type 7 or TcdB/TcaC-type secretion system. The metazoan teneurin proteins possess an inactive of this domain at their C-terminus.
Pssm-ID: 464783 Cd Length: 78 Bit Score: 135.05 E-value: 4.10e-37
10 20 30 40 50 60 70
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*...
gi 1958656154 2619 EEKARVLDQARQRALGTAWAKEQQKARDGREGSRLWTEGEKQQLLSTGRVQGYEGYYVLPVEQYPELADSSSNIQFLR 2696
Cdd:pfam15636 1 EERKRLLEHAKKRAVREAWHRERQLLRNGLPGSRDWTDEEKEELLSTGSVPGYDGEYIHPVEQYPELADDPSNIRFRK 78
|
|
| RhsA |
COG3209 |
Uncharacterized conserved protein RhaS, contains 28 RHS repeats [General function prediction ... |
1460-2397 |
2.16e-29 |
|
Uncharacterized conserved protein RhaS, contains 28 RHS repeats [General function prediction only];
Pssm-ID: 442442 [Multi-domain] Cd Length: 1103 Bit Score: 129.11 E-value: 2.16e-29
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958656154 1460 YSGDDAYATDAILNSPSSLAVAPDGTIYIADLGNIRIRAVSKNKPVLNAFNQYEAASPGEQELYVFNADGIHQYTVSLVT 1539
Cdd:COG3209 113 ASAGRLVSTGAGAGGTVTAATGGTLGATAGSATTGSTDGGRGGVAVTGLAGGGASAYGLTLGGAAAGPATGVGTGAVTLA 192
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958656154 1540 GEYLYNFTYSADNDVTELIDNNGNSLKIRRDSSGMPRHLLMPDNQIITLTVGTNGGLKAVSTQNLELGLMTYDGNTGLLA 1619
Cdd:COG3209 193 TGLAGSALLALGSGAILGGLAGAYSGSATTATGTALGTPASVAATVTGSATGAAGAGAAVATAATTLGGTTGAGTGASGA 272
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958656154 1620 TKSDETGWTTFYDYDHEGRLTNVTRPTGVVTSLHREMEKSITIDIENSNRDDDVTVITNLSSVEASYTVVQDQVRNSYQL 1699
Cdd:COG3209 273 GLDASTGTGGAGGSNAAATAGGLGGAGLGSGGAGGGGTAGGTTTAAGTTGTAAVSGAADAGTTTTTGTGTGGTTTTVGGG 352
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958656154 1700 CSNGTLRVMYANGMGVSFHSEPHVLAGTITPTIGRCNISLPMENGLNSIEWRLRKEQIKGKVTIFGRKLRVHGRNLLSID 1779
Cdd:COG3209 353 GSLTLGGYGAAGGLTTSVGAGGGGSTSGSTTTVGGGGTATGSGGGSSTTGVGAGTTTTSTTGGDGGPATAAGALTAGGTA 432
|
330 340 350 360 370 380 390 400
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958656154 1780 YDRNIRTEKIYDDHRKFTLRIIYDQVGRPFLWLPSSGLAAVNVSYFFNGRLAGLQRGAMSERTDIDKQGRIVSRMFADGK 1859
Cdd:COG3209 433 TGTGTGGGGTTAGTDATTTTGGAGASGTLTTTGGAATGATTGGGTEAGTGGGTLTSGSAGATTLGTDTTLDDTLGGTTTT 512
|
410 420 430 440 450 460 470 480
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958656154 1860 VWSYSYLDKSMVLLLQSQRQYIFEYDSSDRLHAVTMPSVARHSMSTHTSIGYIRNIYNPPESNASVIFDYSDDGRILKTS 1939
Cdd:COG3209 513 TAGARGLVVTTGTTLTLGTTTTATLSATDATGTGDTTTTGTVGTGTSTGTGGTGTVTTTGDGTGGASTTTGTTGGTATTT 592
|
490 500 510 520 530 540 550 560
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958656154 1940 FLGTGRQVFYKYGKLSKLSEIVYDSTAVTFGYDETTGVLKMVNLQSGGFSCTIRYRKVGPLVDKQIYRFSEEGMINARFD 2019
Cdd:COG3209 593 TVTTTTTTSTAGTTTTTTSGYTRAGLTLTLGTGTASGLERATASTGSTTGGTTGTGVTTTGTTTTRATGTTGTGTGVTAG 672
|
570 580 590 600 610 620 630 640
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958656154 2020 YTYHDNSFRIASIKPVISETPLPVDLYRYDEISGKVEHFGKFGVIYYDINQIITTAVMTLSKHFDTHGRIKEVQYEMF-R 2098
Cdd:COG3209 673 LTTLATGGTTVGGGTGTTSTATTGATTGGTETGTTVTTLAGGTTTRLGTTTTGGGGGTTTDGTGTGGTTGTLTTTSTTtT 752
|
650 660 670 680 690 700 710 720
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958656154 2099 SLMYWMTVQYDSMGRVIKRELKLGPYANTTKYTYDYDGDGQLQSVAVNDRPTWRYSYDLNGNLH-----LLNPGNSARLM 2173
Cdd:COG3209 753 TTAGALTYTYDALGRLTSETTPGGVTQGTYTTRYTYDALGRLTSVTYPDGETVTYTYDALGRLTsvitvGSGGGTDLQDR 832
|
730 740 750 760 770 780 790 800
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958656154 2174 PLRYDLRDRITRLGDVQykidDDGYLCQRgsdiFEYNSKGLLTRAynKASGWSVQYRYDGVSRRASyKTNLGHHlQYFYS 2253
Cdd:COG3209 833 TYTYDAAGNITSITDAL----RAGTLTQT----YTYDALGRLTSA--TDPGTTESYTYDANGNLTS-RTDGGTT-TYTYD 900
|
810 820 830 840 850 860 870 880
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958656154 2254 DLHHPTRITHvynhSNSEITSLYYDLQGHlfamesssgeeyyvaSDNTGTPLAVFSINGLMIKQLQYTAYGEIYYDSNPD 2333
Cdd:COG3209 901 ALGRLVSVTK----PDGTTTTYTYDALGH---------------TDHLGSVRALTDASGQVVWRYDYDPFGNLLAETSGA 961
|
890 900 910 920 930 940
....*....|....*....|....*....|....*....|....*....|....*....|....
gi 1958656154 2334 FQMVIGFHGGLYDPLTKLVHFTQRDYDVLAGRWTSPDytmwrNVGKEPAPfNLYMFKNNNPLSN 2397
Cdd:COG3209 962 AANPLRFTGQEYDAETGLYYNGARYYDPALGRFLSPD-----PIGLAGGL-NLYAYVGNNPVNY 1019
|
|
| Vgb |
COG4257 |
Streptogramin lyase [Defense mechanisms]; |
1169-1499 |
3.70e-12 |
|
Streptogramin lyase [Defense mechanisms];
Pssm-ID: 443399 [Multi-domain] Cd Length: 270 Bit Score: 69.28 E-value: 3.70e-12
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958656154 1169 PVALAVGIDGSLFVGDF--NYIRRIFPsrnvtsilelRNKEFK-HSNSPGHKYY-LAVDPvTGSLYVSDTNSRRIYRVks 1244
Cdd:COG4257 19 PRDVAVDPDGAVWFTDQggGRIGRLDP----------ATGEFTeYPLGGGSGPHgIAVDP-DGNLWFTDNGNNRIGRI-- 85
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958656154 1245 lsGAKDlaGNSEVVAGTGEQCLPFdearcgdggkavdatlmsprGIAVDKNGLMYFVDAT--MIRKVD-QNGIISTLlgs 1321
Cdd:COG4257 86 --DPKT--GEITTFALPGGGSNPH--------------------GIAFDPDGNLWFTDQGgnRIGRLDpATGEVTEF--- 138
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958656154 1322 ndltavrPLSCDSSMdvaqvrlewPTDLAVNPmDNSLYV--LENNVILRI-TENHQVSIIAGrpmhcqvpgidyslskla 1398
Cdd:COG4257 139 -------PLPTGGAG---------PYGIAVDP-DGNLWVtdFGANAIGRIdPDTGTLTEYAL------------------ 183
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958656154 1399 iHSALESASAIAISHTGVLYITETDEKKINRLRqvTTNGEIcllagaasdcdckndvncicysgdDAYATDAILNSPSSL 1478
Cdd:COG4257 184 -PTPGAGPRGLAVDPDGNLWVADTGSGRIGRFD--PKTGTV------------------------TEYPLPGGGARPYGV 236
|
330 340
....*....|....*....|.
gi 1958656154 1479 AVAPDGTIYIADLGNIRIRAV 1499
Cdd:COG4257 237 AVDGDGRVWFAESGANRIVRF 257
|
|
| Rhs_assc_core |
TIGR03696 |
RHS repeat-associated core domain; This model represents a conserved unique core sequence ... |
2320-2397 |
8.60e-10 |
|
RHS repeat-associated core domain; This model represents a conserved unique core sequence shared by large numbers of proteins. It is occasional in the Archaea Methanosarcina barkeri) but common in bacteria and eukaryotes. Most fall into two large classes. One class consists of long proteins in which two classes of repeats are abundant: an FG-GAP repeat (pfam01839) class, and an RHS repeat (pfam05593) or YD repeat (TIGR01643). This class includes secreted bacterial insecticidal toxins and intercellular signalling proteins such as the teneurins in animals. The other class consists of uncharacterized proteins shorter than 400 amino acids, where this core domain of about 75 amino acids tends to occur in the N-terminal half. Over twenty such proteins are found in Pseudomonas putida alone; little sequence similarity or repeat structure is found among these proteins outside the region modeled by this domain.
Pssm-ID: 274730 [Multi-domain] Cd Length: 77 Bit Score: 57.12 E-value: 8.60e-10
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958656154 2320 YTAYGEIYYDSNPDFQmVIGFHGGLYDPLTKLVHFTQRDYDVLAGRWTSPDytmwrnvgkePA----PFNLYMFKNNNPL 2395
Cdd:TIGR03696 1 YDPYGEVLSESGAAPN-PLRFTGQYYDAETGLYYNGARYYDPELGRFLSPD----------PIglggGLNLYAYVGNNPV 69
|
..
gi 1958656154 2396 SN 2397
Cdd:TIGR03696 70 NW 71
|
|
| acid_disulf_rpt |
NF033662 |
acidic double-disulfide repeat; The acidic double-disulfide repeat is an Asp-rich repeat with ... |
772-802 |
2.33e-08 |
|
acidic double-disulfide repeat; The acidic double-disulfide repeat is an Asp-rich repeat with four nearly invariant Cys residues in a repeat length of about 35 amino acids.
Pssm-ID: 411265 [Multi-domain] Cd Length: 32 Bit Score: 51.75 E-value: 2.33e-08
10 20 30
....*....|....*....|....*....|.
gi 1958656154 772 AMETSCADNKDNEGDGLVDCLDPDCCLQSAC 802
Cdd:NF033662 2 ATDTTCSDGIDNDGDGLTDCADPDCAGNPVC 32
|
|
| RHS_repeat |
pfam05593 |
RHS Repeat; RHS proteins contain extended repeat regions. These repeats often appear to be ... |
1611-1647 |
7.06e-05 |
|
RHS Repeat; RHS proteins contain extended repeat regions. These repeats often appear to be involved in ligand binding. Note that this model may not find all the repeats in a protein and that it covers two RHS repeats. The 3D structure of an RHS-repeat-containing protein (the B and C components of an ABC toxin complex) has been determined. The RHS repeats form an extended strip of beta-sheet that spirals around to form a hollow shell, encapsulating the variable C-terminal domain.
Pssm-ID: 461685 [Multi-domain] Cd Length: 37 Bit Score: 42.20 E-value: 7.06e-05
10 20 30
....*....|....*....|....*....|....*..
gi 1958656154 1611 YDGNtGLLATKSDETGWTTFYDYDHEGRLTNVTRPTG 1647
Cdd:pfam05593 1 YDAA-GRLTSVTDPDGRVTTYTYDAAGRLTAVTDPDG 36
|
|
| EGF_2 |
pfam07974 |
EGF-like domain; This family contains EGF domains found in a variety of extracellular proteins. |
580-602 |
3.00e-04 |
|
EGF-like domain; This family contains EGF domains found in a variety of extracellular proteins.
Pssm-ID: 400365 Cd Length: 26 Bit Score: 40.02 E-value: 3.00e-04
|
| EGF_2 |
pfam07974 |
EGF-like domain; This family contains EGF domains found in a variety of extracellular proteins. |
645-667 |
1.07e-03 |
|
EGF-like domain; This family contains EGF domains found in a variety of extracellular proteins.
Pssm-ID: 400365 Cd Length: 26 Bit Score: 38.48 E-value: 1.07e-03
|
| EGF_CA |
cd00054 |
Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular ... |
671-700 |
1.24e-03 |
|
Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements.
Pssm-ID: 238011 Cd Length: 38 Bit Score: 38.39 E-value: 1.24e-03
10 20 30
....*....|....*....|....*....|....*
gi 1958656154 671 DCLDPT-CSSHGVCVNGE----CLCSPGWGGLNCE 700
Cdd:cd00054 4 ECASGNpCQNGGTCVNTVgsyrCSCPPGYTGRNCE 38
|
|
| EGF_CA |
cd00054 |
Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular ... |
638-668 |
1.73e-03 |
|
Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements.
Pssm-ID: 238011 Cd Length: 38 Bit Score: 38.00 E-value: 1.73e-03
10 20 30
....*....|....*....|....*....|....*.
gi 1958656154 638 NQCIDPS-CGGHGSCIDG----NCVCAAGYKGEHCE 668
Cdd:cd00054 3 DECASGNpCQNGGTCVNTvgsyRCSCPPGYTGRNCE 38
|
|
| C_rich_MXAN6577 |
NF041328 |
MXAN_6577-like cysteine-rich domain; |
583-693 |
2.21e-03 |
|
MXAN_6577-like cysteine-rich domain;
Pssm-ID: 469225 [Multi-domain] Cd Length: 145 Bit Score: 40.90 E-value: 2.21e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958656154 583 NGECVSglchcfpgfLGADCAK-AACPVLCSGNGQYSKGTCQCYSGwkGAECDvpmNQCI----DP-SCGGHGSCIDGNC 656
Cdd:NF041328 29 GGACVD---------LRSDPSNcGACGVACGAGQTCVAGACGCGPG--TVACG---GACVdtasDPaHCGACGAACAPGQ 94
|
90 100 110 120
....*....|....*....|....*....|....*....|....*....
gi 1958656154 657 VCAAGYKGEHCEE--VDCldptcssHGVCVN--------GEC--LCSPG 693
Cdd:NF041328 95 VCEGGACREACSEglTRC-------GGACVDlatdplhcGACgvACDPG 136
|
|
| EGF_CA |
cd00054 |
Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular ... |
736-769 |
3.96e-03 |
|
Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements.
Pssm-ID: 238011 Cd Length: 38 Bit Score: 37.23 E-value: 3.96e-03
10 20 30
....*....|....*....|....*....|....*.
gi 1958656154 736 VDGC--PDLCNGNGRCTLGQNSWQCVCQTGWRGPGC 769
Cdd:cd00054 2 IDECasGNPCQNGGTCVNTVGSYRCSCPPGYTGRNC 37
|
|
| I-EGF_1 |
pfam18372 |
Integrin beta epidermal growth factor like domain 1; This is the I-EGF 1 domain found in ... |
611-628 |
6.04e-03 |
|
Integrin beta epidermal growth factor like domain 1; This is the I-EGF 1 domain found in several integrin betas such as integrin beta 1-7. Structural analysis reveal an epidermal growth factor-like (I-EGF) domains 1 and 2. EGF1 lacks one disulfide (C2-C4) relative to the integrin EGF 2, 3, and 4 domains, this allows the C-terminal end of EGF1 to flex remarkably relative to its N-terminal end.
Pssm-ID: 465729 Cd Length: 29 Bit Score: 36.31 E-value: 6.04e-03
|
| EGF |
pfam00008 |
EGF-like domain; There is no clear separation between noise and signal. pfam00053 is very ... |
739-767 |
6.73e-03 |
|
EGF-like domain; There is no clear separation between noise and signal. pfam00053 is very similar, but has 8 instead of 6 conserved cysteines. Includes some cytokine receptors. The EGF domain misses the N-terminus regions of the Ca2+ binding EGF domains (this is the main reason of discrepancy between swiss-prot domain start/end and Pfam). The family is hard to model due to many similar but different sub-types of EGF domains. Pfam certainly misses a number of EGF domains.
Pssm-ID: 394967 Cd Length: 31 Bit Score: 36.21 E-value: 6.73e-03
10 20 30
....*....|....*....|....*....|
gi 1958656154 739 CPDL-CNGNGRCTLGQNSWQCVCQTGWRGP 767
Cdd:pfam00008 1 CAPNpCSNGGTCVDTPGGYTCICPEGYTGK 30
|
|
|
|
Name |
Accession |
Description |
Interval |
E-value |
| Ten_N |
pfam06484 |
Teneurin Intracellular Region; This family is found in the intracellular N-terminal region of ... |
10-374 |
0e+00 |
|
Teneurin Intracellular Region; This family is found in the intracellular N-terminal region of the Teneurin family of proteins. These proteins are 'pair-rule' genes and are involved in tissue patterning, specifically probably neural patterning. The intracellular domain is cleaved in response to homophilic interaction of the extracellular domain, and translocates to the nucleus. Here it probably carries out to some transcriptional regulatory activity. The length of this region and the conservation suggests that there may be two structural domains here (personal obs:C Yeats).
Pssm-ID: 461932 [Multi-domain] Cd Length: 367 Bit Score: 643.57 E-value: 0e+00
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958656154 10 SLTRGRCGKECRYTSSSLDSEDCRVPTQKSYSSSETLKAYDHDSRMHYGNRVTDLVHRESDEFSRQGANFTLAELGICEP 89
Cdd:pfam06484 1 SLTKRRRDKERRYTSSSADSEECRVPTQKSYSSSETLKAFDHDSRMLYGNRVKDMVHKEADEFSRQGQNFSLRELGICEP 80
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958656154 90 SP-HRSGYCSDMGILHQGYSLSTGSDADSDTEGGMSPEHAIRLWGRGIKSRRSSGLSSRENSALTLTDSDNENKSDDDNG 168
Cdd:pfam06484 81 SPrHGLAYCTEMGLPHRGYSISTGSDADTETDGPMSPEHAVRLWGRGTKSGRSSCLSSRSNSALTLTDTEHENKSDNENG 160
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958656154 169 RPIPPTSSSSLLPsaqlpSSHNPPP---VSCQMPLLDSNTSHQIMDTNPDEEFSPNSYLLRACSGPQQASSSGPPNHHSQ 245
Cdd:pfam06484 161 PPIPPSSSSSSPV-----EQHSPPPpslNENQRPLLGNNASHPILDSDPDEEFSPNSYLVRTGSGPQSAPSEQPPNFQNH 235
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958656154 246 STLRPPLPP-PHNHT-LSHHHSSANSLNRNSLTNRRSQIHAP-APAPNDLATTPESVQLQDSWVLNSNVPLETRHFLFKT 322
Cdd:pfam06484 236 SRLRTPPPPlPPPHKqNQHHHPSINSLNRSSLTNRRNPSPAPtASLPAELQSTQESVQLQDSWVLNSNVPLETRHFLFKT 315
|
330 340 350 360 370
....*....|....*....|....*....|....*....|....*....|..
gi 1958656154 323 SSGSTPLFSSSSPGYPLTSGTVYTPPPRLLPRNTFSRKAFKLKKPSKYCSWK 374
Cdd:pfam06484 316 GTGTTPLFCTASPGYPLTSGTVYSPPPRPLPRNTFSRPAFKLKKPYKYCSWK 367
|
|
| NHL_like_1 |
cd14953 |
Uncharacterized NHL-repeat domain in bacterial proteins; This bacterial family of NHL-repeat ... |
1169-1499 |
1.04e-46 |
|
Uncharacterized NHL-repeat domain in bacterial proteins; This bacterial family of NHL-repeat domains is found in a variety of domain architectures. The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies, typically as 6 instances. It is about 40 residues long and resembles the WD repeat and other beta-propeller structures.
Pssm-ID: 271323 [Multi-domain] Cd Length: 323 Bit Score: 171.94 E-value: 1.04e-46
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958656154 1169 PVALAVGIDGSLFVGDF--NYIRRIFPSRNVTSILELRNKEFKHSNSPGHKYY----LAVDPvTGSLYVSDTNSRRIYRV 1242
Cdd:cd14953 25 PSGVAVDAAGNLYVADRgnHRIRKITPDGVVTTVAGTGTAGFADGGGAAAQFNtpsgVAVDA-AGNLYVADTGNHRIRKI 103
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958656154 1243 kslsgakDLAGNSEVVAGTGEqclpfdeARCGDGGKAVDATLMSPRGIAVDKNGLMYFVDAT--MIRKVDQNGIISTLLG 1320
Cdd:cd14953 104 -------TPDGVVSTLAGTGT-------AGFSDDGGATAAQFNYPTGVAVDAAGNLYVADTGnhRIRKITPDGVVTTVAG 169
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958656154 1321 sndlTAVRPLSCDSSMDVAQVRleWPTDLAVNPMDNsLYVLE--NNVILRITENHQVSIIAGRPmhcqvpGIDYSLSKLA 1398
Cdd:cd14953 170 ----TGGAGYAGDGPATAAQFN--NPTGVAVDAAGN-LYVADrgNHRIRKITPDGVVTTVAGTG------TAGFSGDGGA 236
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958656154 1399 IHSALESASAIAISHTGVLYITETDEkkiNRLRQVTTNGEICLLAGAASDcdckndvncicYSGDDAYATDAILNSPSSL 1478
Cdd:cd14953 237 TAAQLNNPTGVAVDAAGNLYVADSGN---HRIRKITPAGVVTTVAGGGAG-----------FSGDGGPATSAQFNNPTGV 302
|
330 340
....*....|....*....|.
gi 1958656154 1479 AVAPDGTIYIADLGNIRIRAV 1499
Cdd:cd14953 303 AVDAAGNLYVADTGNNRIRKI 323
|
|
| NHL_like_1 |
cd14953 |
Uncharacterized NHL-repeat domain in bacterial proteins; This bacterial family of NHL-repeat ... |
1220-1500 |
2.18e-40 |
|
Uncharacterized NHL-repeat domain in bacterial proteins; This bacterial family of NHL-repeat domains is found in a variety of domain architectures. The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies, typically as 6 instances. It is about 40 residues long and resembles the WD repeat and other beta-propeller structures.
Pssm-ID: 271323 [Multi-domain] Cd Length: 323 Bit Score: 153.45 E-value: 2.18e-40
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958656154 1220 LAVDPvTGSLYVSDTNSRRIYRVkslsgakDLAGNSEVVAGTGEqclpfdEARCGDGGKAvdATLMSPRGIAVDKNGLMY 1299
Cdd:cd14953 28 VAVDA-AGNLYVADRGNHRIRKI-------TPDGVVTTVAGTGT------AGFADGGGAA--AQFNTPSGVAVDAAGNLY 91
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958656154 1300 FVDAT--MIRKVDQNGIISTLLGsndlTAVRPLSCDSSMDVAQvrLEWPTDLAVNPMDNsLYVLE--NNVILRITENHQV 1375
Cdd:cd14953 92 VADTGnhRIRKITPDGVVSTLAG----TGTAGFSDDGGATAAQ--FNYPTGVAVDAAGN-LYVADtgNHRIRKITPDGVV 164
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958656154 1376 SIIAGRPmhcqVPGidYSLSKLAIHSALESASAIAISHTGVLYITETDEkkiNRLRQVTTNGEICLLAGAASDcdckndv 1455
Cdd:cd14953 165 TTVAGTG----GAG--YAGDGPATAAQFNNPTGVAVDAAGNLYVADRGN---HRIRKITPDGVVTTVAGTGTA------- 228
|
250 260 270 280
....*....|....*....|....*....|....*....|....*
gi 1958656154 1456 ncicYSGDDAYATDAILNSPSSLAVAPDGTIYIADLGNIRIRAVS 1500
Cdd:cd14953 229 ----GFSGDGGATAAQLNNPTGVAVDAAGNLYVADSGNHRIRKIT 269
|
|
| Tox-GHH |
pfam15636 |
GHH signature containing HNH/Endo VII superfamily nuclease toxin; A predicted toxin of the HNH ... |
2619-2696 |
4.10e-37 |
|
GHH signature containing HNH/Endo VII superfamily nuclease toxin; A predicted toxin of the HNH/Endonuclease VII fold present in bacterial polymorphic toxin systems with a characteriztic sG[HQ]H signature motif. In bacterial polymorphic toxin systems, the toxin is exported by the type 2, type 6, type 7 or TcdB/TcaC-type secretion system. The metazoan teneurin proteins possess an inactive of this domain at their C-terminus.
Pssm-ID: 464783 Cd Length: 78 Bit Score: 135.05 E-value: 4.10e-37
10 20 30 40 50 60 70
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*...
gi 1958656154 2619 EEKARVLDQARQRALGTAWAKEQQKARDGREGSRLWTEGEKQQLLSTGRVQGYEGYYVLPVEQYPELADSSSNIQFLR 2696
Cdd:pfam15636 1 EERKRLLEHAKKRAVREAWHRERQLLRNGLPGSRDWTDEEKEELLSTGSVPGYDGEYIHPVEQYPELADDPSNIRFRK 78
|
|
| NHL_like_1 |
cd14953 |
Uncharacterized NHL-repeat domain in bacterial proteins; This bacterial family of NHL-repeat ... |
1257-1500 |
6.66e-32 |
|
Uncharacterized NHL-repeat domain in bacterial proteins; This bacterial family of NHL-repeat domains is found in a variety of domain architectures. The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies, typically as 6 instances. It is about 40 residues long and resembles the WD repeat and other beta-propeller structures.
Pssm-ID: 271323 [Multi-domain] Cd Length: 323 Bit Score: 128.80 E-value: 6.66e-32
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958656154 1257 VVAGTGeqclpfdeARCGDGGKAVDATLMSPRGIAVDKNGLMYFVDAT--MIRKVDQNGIISTLL-----GSNDLTAvrp 1329
Cdd:cd14953 3 TVAGSG--------TAGFSGGGGTAARFNSPSGVAVDAAGNLYVADRGnhRIRKITPDGVVTTVAgtgtaGFADGGG--- 71
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958656154 1330 lscdssmdvAQVRLEWPTDLAVNPMDNsLYV--LENNVILRITENHQVSIIAGRPmhcqVPGidYSLSKLAIHSALESAS 1407
Cdd:cd14953 72 ---------AAAQFNTPSGVAVDAAGN-LYVadTGNHRIRKITPDGVVSTLAGTG----TAG--FSDDGGATAAQFNYPT 135
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958656154 1408 AIAISHTGVLYITETDEkkiNRLRQVTTNGEICLLAGAASDcdckndvncicYSGDDAYATDAILNSPSSLAVAPDGTIY 1487
Cdd:cd14953 136 GVAVDAAGNLYVADTGN---HRIRKITPDGVVTTVAGTGGA-----------GYAGDGPATAAQFNNPTGVAVDAAGNLY 201
|
250
....*....|...
gi 1958656154 1488 IADLGNIRIRAVS 1500
Cdd:cd14953 202 VADRGNHRIRKIT 214
|
|
| RhsA |
COG3209 |
Uncharacterized conserved protein RhaS, contains 28 RHS repeats [General function prediction ... |
1460-2397 |
2.16e-29 |
|
Uncharacterized conserved protein RhaS, contains 28 RHS repeats [General function prediction only];
Pssm-ID: 442442 [Multi-domain] Cd Length: 1103 Bit Score: 129.11 E-value: 2.16e-29
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958656154 1460 YSGDDAYATDAILNSPSSLAVAPDGTIYIADLGNIRIRAVSKNKPVLNAFNQYEAASPGEQELYVFNADGIHQYTVSLVT 1539
Cdd:COG3209 113 ASAGRLVSTGAGAGGTVTAATGGTLGATAGSATTGSTDGGRGGVAVTGLAGGGASAYGLTLGGAAAGPATGVGTGAVTLA 192
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958656154 1540 GEYLYNFTYSADNDVTELIDNNGNSLKIRRDSSGMPRHLLMPDNQIITLTVGTNGGLKAVSTQNLELGLMTYDGNTGLLA 1619
Cdd:COG3209 193 TGLAGSALLALGSGAILGGLAGAYSGSATTATGTALGTPASVAATVTGSATGAAGAGAAVATAATTLGGTTGAGTGASGA 272
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958656154 1620 TKSDETGWTTFYDYDHEGRLTNVTRPTGVVTSLHREMEKSITIDIENSNRDDDVTVITNLSSVEASYTVVQDQVRNSYQL 1699
Cdd:COG3209 273 GLDASTGTGGAGGSNAAATAGGLGGAGLGSGGAGGGGTAGGTTTAAGTTGTAAVSGAADAGTTTTTGTGTGGTTTTVGGG 352
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958656154 1700 CSNGTLRVMYANGMGVSFHSEPHVLAGTITPTIGRCNISLPMENGLNSIEWRLRKEQIKGKVTIFGRKLRVHGRNLLSID 1779
Cdd:COG3209 353 GSLTLGGYGAAGGLTTSVGAGGGGSTSGSTTTVGGGGTATGSGGGSSTTGVGAGTTTTSTTGGDGGPATAAGALTAGGTA 432
|
330 340 350 360 370 380 390 400
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958656154 1780 YDRNIRTEKIYDDHRKFTLRIIYDQVGRPFLWLPSSGLAAVNVSYFFNGRLAGLQRGAMSERTDIDKQGRIVSRMFADGK 1859
Cdd:COG3209 433 TGTGTGGGGTTAGTDATTTTGGAGASGTLTTTGGAATGATTGGGTEAGTGGGTLTSGSAGATTLGTDTTLDDTLGGTTTT 512
|
410 420 430 440 450 460 470 480
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958656154 1860 VWSYSYLDKSMVLLLQSQRQYIFEYDSSDRLHAVTMPSVARHSMSTHTSIGYIRNIYNPPESNASVIFDYSDDGRILKTS 1939
Cdd:COG3209 513 TAGARGLVVTTGTTLTLGTTTTATLSATDATGTGDTTTTGTVGTGTSTGTGGTGTVTTTGDGTGGASTTTGTTGGTATTT 592
|
490 500 510 520 530 540 550 560
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958656154 1940 FLGTGRQVFYKYGKLSKLSEIVYDSTAVTFGYDETTGVLKMVNLQSGGFSCTIRYRKVGPLVDKQIYRFSEEGMINARFD 2019
Cdd:COG3209 593 TVTTTTTTSTAGTTTTTTSGYTRAGLTLTLGTGTASGLERATASTGSTTGGTTGTGVTTTGTTTTRATGTTGTGTGVTAG 672
|
570 580 590 600 610 620 630 640
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958656154 2020 YTYHDNSFRIASIKPVISETPLPVDLYRYDEISGKVEHFGKFGVIYYDINQIITTAVMTLSKHFDTHGRIKEVQYEMF-R 2098
Cdd:COG3209 673 LTTLATGGTTVGGGTGTTSTATTGATTGGTETGTTVTTLAGGTTTRLGTTTTGGGGGTTTDGTGTGGTTGTLTTTSTTtT 752
|
650 660 670 680 690 700 710 720
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958656154 2099 SLMYWMTVQYDSMGRVIKRELKLGPYANTTKYTYDYDGDGQLQSVAVNDRPTWRYSYDLNGNLH-----LLNPGNSARLM 2173
Cdd:COG3209 753 TTAGALTYTYDALGRLTSETTPGGVTQGTYTTRYTYDALGRLTSVTYPDGETVTYTYDALGRLTsvitvGSGGGTDLQDR 832
|
730 740 750 760 770 780 790 800
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958656154 2174 PLRYDLRDRITRLGDVQykidDDGYLCQRgsdiFEYNSKGLLTRAynKASGWSVQYRYDGVSRRASyKTNLGHHlQYFYS 2253
Cdd:COG3209 833 TYTYDAAGNITSITDAL----RAGTLTQT----YTYDALGRLTSA--TDPGTTESYTYDANGNLTS-RTDGGTT-TYTYD 900
|
810 820 830 840 850 860 870 880
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958656154 2254 DLHHPTRITHvynhSNSEITSLYYDLQGHlfamesssgeeyyvaSDNTGTPLAVFSINGLMIKQLQYTAYGEIYYDSNPD 2333
Cdd:COG3209 901 ALGRLVSVTK----PDGTTTTYTYDALGH---------------TDHLGSVRALTDASGQVVWRYDYDPFGNLLAETSGA 961
|
890 900 910 920 930 940
....*....|....*....|....*....|....*....|....*....|....*....|....
gi 1958656154 2334 FQMVIGFHGGLYDPLTKLVHFTQRDYDVLAGRWTSPDytmwrNVGKEPAPfNLYMFKNNNPLSN 2397
Cdd:COG3209 962 AANPLRFTGQEYDAETGLYYNGARYYDPALGRFLSPD-----PIGLAGGL-NLYAYVGNNPVNY 1019
|
|
| NHL |
cd05819 |
NHL repeat unit of beta-propeller proteins; The NHL(NCL-1, HT2A and LIN-41)-repeat is found in ... |
1220-1518 |
1.60e-17 |
|
NHL repeat unit of beta-propeller proteins; The NHL(NCL-1, HT2A and LIN-41)-repeat is found in multiple tandem copies, typically as 6 instances. It is about 40 residues long and resembles the WD repeat and other beta-propeller structures. The repeats have a catalytic activity in Peptidyl-glycine alpha-amidating monooxygenase; proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localized to the repeats. Tripartite motif-containing protein 32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats.
Pssm-ID: 271320 [Multi-domain] Cd Length: 269 Bit Score: 85.06 E-value: 1.60e-17
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958656154 1220 LAVDPvTGSLYVSDTNSRRIYRVkslsgakDLAGNSEVVAGTGeqclpfdearcGDGgkavDATLMSPRGIAVDKNGLMY 1299
Cdd:cd05819 13 IAVDS-SGNIYVADTGNNRIQVF-------DPDGNFITSFGSF-----------GSG----DGQFNEPAGVAVDSDGNLY 69
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958656154 1300 FVDAT--MIRKVDQNGIISTLLGSNDLTavrplscdssmdvaQVRLEWPTDLAVNPMDNsLYVL--ENNVILRITENHQV 1375
Cdd:cd05819 70 VADTGnhRIQKFDPDGNFLASFGGSGDG--------------DGEFNGPRGIAVDSSGN-IYVAdtGNHRIQKFDPDGEF 134
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958656154 1376 SIIAGrpmhcqvpgidyslSKLAIHSALESASAIAISHTGVLYITETDEkkiNRLRQVTTNGEICLLAGaasdcdckndv 1455
Cdd:cd05819 135 LTTFG--------------SGGSGPGQFNGPTGVAVDSDGNIYVADTGN---HRIQVFDPDGNFLTTFG----------- 186
|
250 260 270 280 290 300
....*....|....*....|....*....|....*....|....*....|....*....|...
gi 1958656154 1456 ncicysgdDAYATDAILNSPSSLAVAPDGTIYIADLGNIRIRAVSKNKPVLNAFNQYEAASPG 1518
Cdd:cd05819 187 --------STGTGPGQFNYPTGIAVDSDGNIYVADSGNNRVQVFDPDGAGFGGNGNFLGSDGQ 241
|
|
| NHL_like_1 |
cd14953 |
Uncharacterized NHL-repeat domain in bacterial proteins; This bacterial family of NHL-repeat ... |
1139-1309 |
2.25e-17 |
|
Uncharacterized NHL-repeat domain in bacterial proteins; This bacterial family of NHL-repeat domains is found in a variety of domain architectures. The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies, typically as 6 instances. It is about 40 residues long and resembles the WD repeat and other beta-propeller structures.
Pssm-ID: 271323 [Multi-domain] Cd Length: 323 Bit Score: 86.04 E-value: 2.25e-17
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958656154 1139 IITSIMGNGRRRSiscpSCNGLAEGNKLLAPVALAVGIDGSLFVGDF--NYIRRIFPSRNVTSILELRNKEFKHS----- 1211
Cdd:cd14953 163 VVTTVAGTGGAGY----AGDGPATAAQFNNPTGVAVDAAGNLYVADRgnHRIRKITPDGVVTTVAGTGTAGFSGDggata 238
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958656154 1212 ---NSPghkYYLAVDPvTGSLYVSDTNSRRIYRVkslsgakDLAGNSEVVAGTGeQCLPfdearcGDGGKAVDATLMSPR 1288
Cdd:cd14953 239 aqlNNP---TGVAVDA-AGNLYVADSGNHRIRKI-------TPAGVVTTVAGGG-AGFS------GDGGPATSAQFNNPT 300
|
170 180
....*....|....*....|...
gi 1958656154 1289 GIAVDKNGLMYFVDAT--MIRKV 1309
Cdd:cd14953 301 GVAVDAAGNLYVADTGnnRIRKI 323
|
|
| NHL |
cd05819 |
NHL repeat unit of beta-propeller proteins; The NHL(NCL-1, HT2A and LIN-41)-repeat is found in ... |
1164-1497 |
3.13e-16 |
|
NHL repeat unit of beta-propeller proteins; The NHL(NCL-1, HT2A and LIN-41)-repeat is found in multiple tandem copies, typically as 6 instances. It is about 40 residues long and resembles the WD repeat and other beta-propeller structures. The repeats have a catalytic activity in Peptidyl-glycine alpha-amidating monooxygenase; proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localized to the repeats. Tripartite motif-containing protein 32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats.
Pssm-ID: 271320 [Multi-domain] Cd Length: 269 Bit Score: 81.21 E-value: 3.13e-16
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958656154 1164 NKLLAPVALAVGIDGSLFVGDF--NYIRRIFPSRNVTSILELRNKEFKHSNSPghkYYLAVDPvTGSLYVSDTNSRRIYR 1241
Cdd:cd05819 5 GELNNPQGIAVDSSGNIYVADTgnNRIQVFDPDGNFITSFGSFGSGDGQFNEP---AGVAVDS-DGNLYVADTGNHRIQK 80
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958656154 1242 VkslsgakDLAGNSEVVAGTGeqclpfdearcGDGgkavDATLMSPRGIAVDKNGLMYFVDAT--MIRKVDQNGIISTLL 1319
Cdd:cd05819 81 F-------DPDGNFLASFGGS-----------GDG----DGEFNGPRGIAVDSSGNIYVADTGnhRIQKFDPDGEFLTTF 138
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958656154 1320 GSNdltavrplscdsSMDVAQvrLEWPTDLAVNPmDNSLYVLE--NNVILRITENHQVSIIAGRPmhCQVPGidyslskl 1397
Cdd:cd05819 139 GSG------------GSGPGQ--FNGPTGVAVDS-DGNIYVADtgNHRIQVFDPDGNFLTTFGST--GTGPG-------- 193
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958656154 1398 aihsALESASAIAISHTGVLYITETDEkkiNRLRQVTTNGEICLLAGaasdcdckndvncicysgdDAYATDAILNSPSS 1477
Cdd:cd05819 194 ----QFNYPTGIAVDSDGNIYVADSGN---NRVQVFDPDGAGFGGNG-------------------NFLGSDGQFNRPSG 247
|
330 340
....*....|....*....|
gi 1958656154 1478 LAVAPDGTIYIADLGNIRIR 1497
Cdd:cd05819 248 LAVDSDGNLYVADTGNNRIQ 267
|
|
| NHL |
cd05819 |
NHL repeat unit of beta-propeller proteins; The NHL(NCL-1, HT2A and LIN-41)-repeat is found in ... |
1158-1369 |
4.19e-15 |
|
NHL repeat unit of beta-propeller proteins; The NHL(NCL-1, HT2A and LIN-41)-repeat is found in multiple tandem copies, typically as 6 instances. It is about 40 residues long and resembles the WD repeat and other beta-propeller structures. The repeats have a catalytic activity in Peptidyl-glycine alpha-amidating monooxygenase; proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localized to the repeats. Tripartite motif-containing protein 32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats.
Pssm-ID: 271320 [Multi-domain] Cd Length: 269 Bit Score: 78.13 E-value: 4.19e-15
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958656154 1158 NGLAEGNkLLAPVALAVGIDGSLFVGDF--NYIRRIFPSRNVTSIL---ELRNKEFkhsNSPghkYYLAVDPvTGSLYVS 1232
Cdd:cd05819 94 SGDGDGE-FNGPRGIAVDSSGNIYVADTgnHRIQKFDPDGEFLTTFgsgGSGPGQF---NGP---TGVAVDS-DGNIYVA 165
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958656154 1233 DTNSRRIYRVKSlsgakdlagNSEVVAGTGEQClpfdearcgdggkAVDATLMSPRGIAVDKNGLMYFVDATM--IRKVD 1310
Cdd:cd05819 166 DTGNHRIQVFDP---------DGNFLTTFGSTG-------------TGPGQFNYPTGIAVDSDGNIYVADSGNnrVQVFD 223
|
170 180 190 200 210 220
....*....|....*....|....*....|....*....|....*....|....*....|.
gi 1958656154 1311 QNGIISTLLGSNdltavrplscdssmDVAQVRLEWPTDLAVNPmDNSLYVLE--NNVILRI 1369
Cdd:cd05819 224 PDGAGFGGNGNF--------------LGSDGQFNRPSGLAVDS-DGNLYVADtgNNRIQVF 269
|
|
| NHL |
cd05819 |
NHL repeat unit of beta-propeller proteins; The NHL(NCL-1, HT2A and LIN-41)-repeat is found in ... |
1281-1512 |
9.72e-15 |
|
NHL repeat unit of beta-propeller proteins; The NHL(NCL-1, HT2A and LIN-41)-repeat is found in multiple tandem copies, typically as 6 instances. It is about 40 residues long and resembles the WD repeat and other beta-propeller structures. The repeats have a catalytic activity in Peptidyl-glycine alpha-amidating monooxygenase; proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localized to the repeats. Tripartite motif-containing protein 32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats.
Pssm-ID: 271320 [Multi-domain] Cd Length: 269 Bit Score: 76.97 E-value: 9.72e-15
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958656154 1281 DATLMSPRGIAVDKNGLMYFVDATM--IRKVDQNGIISTLLGSNDltavrplscdssmdVAQVRLEWPTDLAVNPmDNSL 1358
Cdd:cd05819 4 PGELNNPQGIAVDSSGNIYVADTGNnrIQVFDPDGNFITSFGSFG--------------SGDGQFNEPAGVAVDS-DGNL 68
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958656154 1359 YVL--ENNVILRITENHQVSIIAGRPmhcqvpGIDYSlsklaihsALESASAIAISHTGVLYITETDEkkiNRLRQVTTN 1436
Cdd:cd05819 69 YVAdtGNHRIQKFDPDGNFLASFGGS------GDGDG--------EFNGPRGIAVDSSGNIYVADTGN---HRIQKFDPD 131
|
170 180 190 200 210 220 230
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*.
gi 1958656154 1437 GEICLLAGAASDCDCKndvncicysgddayatdaiLNSPSSLAVAPDGTIYIADLGNIRIRAVSKNKPVLNAFNQY 1512
Cdd:cd05819 132 GEFLTTFGSGGSGPGQ-------------------FNGPTGVAVDSDGNIYVADTGNHRIQVFDPDGNFLTTFGST 188
|
|
| NHL |
cd05819 |
NHL repeat unit of beta-propeller proteins; The NHL(NCL-1, HT2A and LIN-41)-repeat is found in ... |
1165-1430 |
1.39e-13 |
|
NHL repeat unit of beta-propeller proteins; The NHL(NCL-1, HT2A and LIN-41)-repeat is found in multiple tandem copies, typically as 6 instances. It is about 40 residues long and resembles the WD repeat and other beta-propeller structures. The repeats have a catalytic activity in Peptidyl-glycine alpha-amidating monooxygenase; proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localized to the repeats. Tripartite motif-containing protein 32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats.
Pssm-ID: 271320 [Multi-domain] Cd Length: 269 Bit Score: 73.51 E-value: 1.39e-13
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958656154 1165 KLLAPVALAVGIDGSLFVGDFNYIR-RIFPS----RNVTSILELRNKEFkhsNSPghkYYLAVDPvTGSLYVSDTNSRRI 1239
Cdd:cd05819 53 QFNEPAGVAVDSDGNLYVADTGNHRiQKFDPdgnfLASFGGSGDGDGEF---NGP---RGIAVDS-SGNIYVADTGNHRI 125
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958656154 1240 YRVkslsgakDLAGNSEVVAGtgeqclpfdearcgdGGKAVDATLMSPRGIAVDKNGLMYFVDAT--MIRKVDQNGIIST 1317
Cdd:cd05819 126 QKF-------DPDGEFLTTFG---------------SGGSGPGQFNGPTGVAVDSDGNIYVADTGnhRIQVFDPDGNFLT 183
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958656154 1318 LLGSNdltavrplscdssmDVAQVRLEWPTDLAVNPMDNsLYVLE--NNVILRITENHQVSIIAGrpmhcqvpgidyslS 1395
Cdd:cd05819 184 TFGST--------------GTGPGQFNYPTGIAVDSDGN-IYVADsgNNRVQVFDPDGAGFGGNG--------------N 234
|
250 260 270
....*....|....*....|....*....|....*
gi 1958656154 1396 KLAIHSALESASAIAISHTGVLYITETDEKKINRL 1430
Cdd:cd05819 235 FLGSDGQFNRPSGLAVDSDGNLYVADTGNNRIQVF 269
|
|
| Vgb |
COG4257 |
Streptogramin lyase [Defense mechanisms]; |
1169-1499 |
3.70e-12 |
|
Streptogramin lyase [Defense mechanisms];
Pssm-ID: 443399 [Multi-domain] Cd Length: 270 Bit Score: 69.28 E-value: 3.70e-12
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958656154 1169 PVALAVGIDGSLFVGDF--NYIRRIFPsrnvtsilelRNKEFK-HSNSPGHKYY-LAVDPvTGSLYVSDTNSRRIYRVks 1244
Cdd:COG4257 19 PRDVAVDPDGAVWFTDQggGRIGRLDP----------ATGEFTeYPLGGGSGPHgIAVDP-DGNLWFTDNGNNRIGRI-- 85
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958656154 1245 lsGAKDlaGNSEVVAGTGEQCLPFdearcgdggkavdatlmsprGIAVDKNGLMYFVDAT--MIRKVD-QNGIISTLlgs 1321
Cdd:COG4257 86 --DPKT--GEITTFALPGGGSNPH--------------------GIAFDPDGNLWFTDQGgnRIGRLDpATGEVTEF--- 138
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958656154 1322 ndltavrPLSCDSSMdvaqvrlewPTDLAVNPmDNSLYV--LENNVILRI-TENHQVSIIAGrpmhcqvpgidyslskla 1398
Cdd:COG4257 139 -------PLPTGGAG---------PYGIAVDP-DGNLWVtdFGANAIGRIdPDTGTLTEYAL------------------ 183
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958656154 1399 iHSALESASAIAISHTGVLYITETDEKKINRLRqvTTNGEIcllagaasdcdckndvncicysgdDAYATDAILNSPSSL 1478
Cdd:COG4257 184 -PTPGAGPRGLAVDPDGNLWVADTGSGRIGRFD--PKTGTV------------------------TEYPLPGGGARPYGV 236
|
330 340
....*....|....*....|.
gi 1958656154 1479 AVAPDGTIYIADLGNIRIRAV 1499
Cdd:COG4257 237 AVDGDGRVWFAESGANRIVRF 257
|
|
| Rhs_assc_core |
TIGR03696 |
RHS repeat-associated core domain; This model represents a conserved unique core sequence ... |
2320-2397 |
8.60e-10 |
|
RHS repeat-associated core domain; This model represents a conserved unique core sequence shared by large numbers of proteins. It is occasional in the Archaea Methanosarcina barkeri) but common in bacteria and eukaryotes. Most fall into two large classes. One class consists of long proteins in which two classes of repeats are abundant: an FG-GAP repeat (pfam01839) class, and an RHS repeat (pfam05593) or YD repeat (TIGR01643). This class includes secreted bacterial insecticidal toxins and intercellular signalling proteins such as the teneurins in animals. The other class consists of uncharacterized proteins shorter than 400 amino acids, where this core domain of about 75 amino acids tends to occur in the N-terminal half. Over twenty such proteins are found in Pseudomonas putida alone; little sequence similarity or repeat structure is found among these proteins outside the region modeled by this domain.
Pssm-ID: 274730 [Multi-domain] Cd Length: 77 Bit Score: 57.12 E-value: 8.60e-10
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958656154 2320 YTAYGEIYYDSNPDFQmVIGFHGGLYDPLTKLVHFTQRDYDVLAGRWTSPDytmwrnvgkePA----PFNLYMFKNNNPL 2395
Cdd:TIGR03696 1 YDPYGEVLSESGAAPN-PLRFTGQYYDAETGLYYNGARYYDPELGRFLSPD----------PIglggGLNLYAYVGNNPV 69
|
..
gi 1958656154 2396 SN 2397
Cdd:TIGR03696 70 NW 71
|
|
| Vgb |
COG4257 |
Streptogramin lyase [Defense mechanisms]; |
1206-1529 |
2.45e-09 |
|
Streptogramin lyase [Defense mechanisms];
Pssm-ID: 443399 [Multi-domain] Cd Length: 270 Bit Score: 60.80 E-value: 2.45e-09
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958656154 1206 KEFKHSNSPGHKYYLAVDPvTGSLYVSDTNSRRIYRVkslsgakDLAgnsevvagTGEqclpFDEARCGDGGkavdatlm 1285
Cdd:COG4257 8 TEYPVPAPGSGPRDVAVDP-DGAVWFTDQGGGRIGRL-------DPA--------TGE----FTEYPLGGGS-------- 59
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958656154 1286 SPRGIAVDKNGLMYFVDAT--MIRKVD-QNGIISTLLGSNDLTAvrplscdssmdvaqvrlewPTDLAVNPmDNSLYV-- 1360
Cdd:COG4257 60 GPHGIAVDPDGNLWFTDNGnnRIGRIDpKTGEITTFALPGGGSN-------------------PHGIAFDP-DGNLWFtd 119
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958656154 1361 LENNVILRIT-ENHQVSIIAGRPMHCQvpgidyslsklaihsalesASAIAISHTGVLYITETdekKINRLRQVTT-NGE 1438
Cdd:COG4257 120 QGGNRIGRLDpATGEVTEFPLPTGGAG-------------------PYGIAVDPDGNLWVTDF---GANAIGRIDPdTGT 177
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958656154 1439 IcllagaasdcdckndvncicysgdDAYATDAILNSPSSLAVAPDGTIYIADLGNIRIRAVSknkPVLNAFNQYeAASPG 1518
Cdd:COG4257 178 L------------------------TEYALPTPGAGPRGLAVDPDGNLWVADTGSGRIGRFD---PKTGTVTEY-PLPGG 229
|
330
....*....|...
gi 1958656154 1519 EQELY--VFNADG 1529
Cdd:COG4257 230 GARPYgvAVDGDG 242
|
|
| acid_disulf_rpt |
NF033662 |
acidic double-disulfide repeat; The acidic double-disulfide repeat is an Asp-rich repeat with ... |
772-802 |
2.33e-08 |
|
acidic double-disulfide repeat; The acidic double-disulfide repeat is an Asp-rich repeat with four nearly invariant Cys residues in a repeat length of about 35 amino acids.
Pssm-ID: 411265 [Multi-domain] Cd Length: 32 Bit Score: 51.75 E-value: 2.33e-08
10 20 30
....*....|....*....|....*....|.
gi 1958656154 772 AMETSCADNKDNEGDGLVDCLDPDCCLQSAC 802
Cdd:NF033662 2 ATDTTCSDGIDNDGDGLTDCADPDCAGNPVC 32
|
|
| Vgb |
COG4257 |
Streptogramin lyase [Defense mechanisms]; |
1164-1439 |
2.40e-08 |
|
Streptogramin lyase [Defense mechanisms];
Pssm-ID: 443399 [Multi-domain] Cd Length: 270 Bit Score: 57.72 E-value: 2.40e-08
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958656154 1164 NKLLAPVALAVGIDGSLFVGD--FNYIRRIFPSRNVTSILELRNKEfkhsNSPghkYYLAVDPvTGSLYVSDTNSRRIYR 1241
Cdd:COG4257 56 GGGSGPHGIAVDPDGNLWFTDngNNRIGRIDPKTGEITTFALPGGG----SNP---HGIAFDP-DGNLWFTDQGGNRIGR 127
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958656154 1242 VkslsgakDLAGNsEVVAGTgeqcLPFDEARcgdggkavdatlmsPRGIAVDKNGLMYFVD--ATMIRKVD-QNGIISTL 1318
Cdd:COG4257 128 L-------DPATG-EVTEFP----LPTGGAG--------------PYGIAVDPDGNLWVTDfgANAIGRIDpDTGTLTEY 181
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958656154 1319 LGSNDLTAvrplscdssmdvaqvrlewPTDLAVNPmDNSLYVLE--NNVILRITENhqvsiiagrpmhcqvpgiDYSLSK 1396
Cdd:COG4257 182 ALPTPGAG-------------------PRGLAVDP-DGNLWVADtgSGRIGRFDPK------------------TGTVTE 223
|
250 260 270 280
....*....|....*....|....*....|....*....|...
gi 1958656154 1397 LAIHSALESASAIAISHTGVLYITETDekkINRLRQVTTNGEI 1439
Cdd:COG4257 224 YPLPGGGARPYGVAVDGDGRVWFAESG---ANRIVRFDPDTEL 263
|
|
| NHL_PKND_like |
cd14952 |
NHL repeat domain of the protein kinase PknD; PknD is a mycobacterial transmembrane protein ... |
1220-1496 |
1.05e-07 |
|
NHL repeat domain of the protein kinase PknD; PknD is a mycobacterial transmembrane protein with a cytosolic kinase domain and an extracellular sensor domain that contains NHL repeats. It plays a key role in the development of central nervous system tuberculosis, by mediating the invasion of host brain endothelia. The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies, typically as 6 instances. It is about 40 residues long and resembles the WD repeat and other beta-propeller structures.
Pssm-ID: 271322 [Multi-domain] Cd Length: 247 Bit Score: 55.68 E-value: 1.05e-07
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958656154 1220 LAVDPvTGSLYVSDTNSRRIYRvkslsgakdLAgnsevvAGTGEQC-LPFDEarcgdggkavdatLMSPRGIAVDKNGLM 1298
Cdd:cd14952 15 VAVDA-AGNVYVADSGNNRVLK---------LA------AGSTTQTvLPFTG-------------LYQPQGVAVDAAGTV 65
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958656154 1299 YFVDAtmirkvDQNGIISTLLGSNDLTAVrPLScdssmdvaqvRLEWPTDLAVNPMDNsLYVLE--NNVILRITenhqvs 1376
Cdd:cd14952 66 YVTDF------GNNRVLKLAAGSTTQTVL-PFT----------GLNDPTGVAVDAAGN-VYVADtgNNRVLKLA------ 121
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958656154 1377 iiAGRPMHCQVPGIDyslsklaihsaLESASAIAISHTGVLYITETDEkkiNRLRQvttngeicLLAGAASdcdckndvn 1456
Cdd:cd14952 122 --AGSNTQTVLPFTG-----------LSNPDGVAVDGAGNVYVTDTGN---NRVLK--------LAAGSTT--------- 168
|
250 260 270 280
....*....|....*....|....*....|....*....|...
gi 1958656154 1457 cicysgddayATD---AILNSPSSLAVAPDGTIYIADLGNIRI 1496
Cdd:cd14952 169 ----------QTVlpfTGLNSPSGVAVDTAGNVYVTDHGNNRV 201
|
|
| NHL_like_2 |
cd14957 |
Uncharacterized NHL-repeat domain in bacterial and archaeal proteins; The NHL (NCL-1, HT2A and ... |
1284-1563 |
9.39e-06 |
|
Uncharacterized NHL-repeat domain in bacterial and archaeal proteins; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies, typically as 6 instances. It is about 40 residues long and resembles the WD repeat and other beta-propeller structures.
Pssm-ID: 271327 [Multi-domain] Cd Length: 280 Bit Score: 49.96 E-value: 9.39e-06
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958656154 1284 LMSPRGIAVDKNGLMYFVDA--TMIRKVDQNGIISTLLGSNDLTavrplscdssmdvaQVRLEWPTDLAVNPMDNsLYVL 1361
Cdd:cd14957 17 FNTPRGIAVDSAGNIYVADTgnNRIQVFTSSGVYSYSIGSGGTG--------------SGQFNSPYGIAVDSNGN-IYVA 81
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958656154 1362 EnnvilriTENHQVSII--AGrpmhcqvpGIDYSL-SKLAIHSALESASAIAISHTGVLYITETDEkkiNRLRQVTTNGE 1438
Cdd:cd14957 82 D-------TDNNRIQVFnsSG--------VYQYSIgTGGSGDGQFNGPYGIAVDSNGNIYVADTGN---HRIQVFTSSGT 143
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958656154 1439 ICllagaasdcdckndvncicYSGDDAYATDAILNSPSSLAVAPDGTIYIADLGNIRIRavsknkpvlnafnqyeaaspg 1518
Cdd:cd14957 144 FS-------------------YSIGSGGTGPGQFNGPQGIAVDSDGNIYVADTGNHRIQ--------------------- 183
|
250 260 270 280
....*....|....*....|....*....|....*....|....*.
gi 1958656154 1519 eqelyVFNADGIHQYTV-SLVTGEYLYNFTYsadnDVTelIDNNGN 1563
Cdd:cd14957 184 -----VFTSSGTFQYTFgSSGSGPGQFSDPY----GIA--VDSDGN 218
|
|
| NHL_like_1 |
cd14953 |
Uncharacterized NHL-repeat domain in bacterial proteins; This bacterial family of NHL-repeat ... |
1461-1502 |
2.02e-05 |
|
Uncharacterized NHL-repeat domain in bacterial proteins; This bacterial family of NHL-repeat domains is found in a variety of domain architectures. The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies, typically as 6 instances. It is about 40 residues long and resembles the WD repeat and other beta-propeller structures.
Pssm-ID: 271323 [Multi-domain] Cd Length: 323 Bit Score: 49.45 E-value: 2.02e-05
10 20 30 40
....*....|....*....|....*....|....*....|..
gi 1958656154 1461 SGDDAYATDAILNSPSSLAVAPDGTIYIADLGNIRIRAVSKN 1502
Cdd:cd14953 11 GFSGGGGTAARFNSPSGVAVDAAGNLYVADRGNHRIRKITPD 52
|
|
| NHL_like_2 |
cd14957 |
Uncharacterized NHL-repeat domain in bacterial and archaeal proteins; The NHL (NCL-1, HT2A and ... |
1169-1497 |
3.09e-05 |
|
Uncharacterized NHL-repeat domain in bacterial and archaeal proteins; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies, typically as 6 instances. It is about 40 residues long and resembles the WD repeat and other beta-propeller structures.
Pssm-ID: 271327 [Multi-domain] Cd Length: 280 Bit Score: 48.42 E-value: 3.09e-05
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958656154 1169 PVALAVGIDGSLFVGDFNYIR-RIF-PSRNVTSIL---ELRNKEFkhsNSPghkYYLAVDPvTGSLYVSDTNSRRIyRVK 1243
Cdd:cd14957 20 PRGIAVDSAGNIYVADTGNNRiQVFtSSGVYSYSIgsgGTGSGQF---NSP---YGIAVDS-NGNIYVADTDNNRI-QVF 91
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958656154 1244 SLSGAKDLAgnsevVAGTGEQCLPFDEarcgdggkavdatlmsPRGIAVDKNGLMYFVDA--TMIRKVDQNGIISTLLGS 1321
Cdd:cd14957 92 NSSGVYQYS-----IGTGGSGDGQFNG----------------PYGIAVDSNGNIYVADTgnHRIQVFTSSGTFSYSIGS 150
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958656154 1322 ndltavrplscdSSMDVAQVRLewPTDLAVNPMDNsLYVLENNvilriteNHQVSII--AGRPmhcqvpgiDYSL-SKLA 1398
Cdd:cd14957 151 ------------GGTGPGQFNG--PQGIAVDSDGN-IYVADTG-------NHRIQVFtsSGTF--------QYTFgSSGS 200
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958656154 1399 IHSALESASAIAISHTGVLYITETDEKKInrlrQVTTNgeicllagaasdcdckndvncicySGDDAYA------TDAIL 1472
Cdd:cd14957 201 GPGQFSDPYGIAVDSDGNIYVADTGNHRI----QVFTS------------------------SGAYQYSigtsgsGNGQF 252
|
330 340
....*....|....*....|....*
gi 1958656154 1473 NSPSSLAVAPDGTIYIADLGNIRIR 1497
Cdd:cd14957 253 NYPYGIAVDNDGKIYVADSNNNRIQ 277
|
|
| RHS_repeat |
pfam05593 |
RHS Repeat; RHS proteins contain extended repeat regions. These repeats often appear to be ... |
1611-1647 |
7.06e-05 |
|
RHS Repeat; RHS proteins contain extended repeat regions. These repeats often appear to be involved in ligand binding. Note that this model may not find all the repeats in a protein and that it covers two RHS repeats. The 3D structure of an RHS-repeat-containing protein (the B and C components of an ABC toxin complex) has been determined. The RHS repeats form an extended strip of beta-sheet that spirals around to form a hollow shell, encapsulating the variable C-terminal domain.
Pssm-ID: 461685 [Multi-domain] Cd Length: 37 Bit Score: 42.20 E-value: 7.06e-05
10 20 30
....*....|....*....|....*....|....*..
gi 1958656154 1611 YDGNtGLLATKSDETGWTTFYDYDHEGRLTNVTRPTG 1647
Cdd:pfam05593 1 YDAA-GRLTSVTDPDGRVTTYTYDAAGRLTAVTDPDG 36
|
|
| NHL_PKND_like |
cd14952 |
NHL repeat domain of the protein kinase PknD; PknD is a mycobacterial transmembrane protein ... |
1166-1360 |
1.05e-04 |
|
NHL repeat domain of the protein kinase PknD; PknD is a mycobacterial transmembrane protein with a cytosolic kinase domain and an extracellular sensor domain that contains NHL repeats. It plays a key role in the development of central nervous system tuberculosis, by mediating the invasion of host brain endothelia. The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies, typically as 6 instances. It is about 40 residues long and resembles the WD repeat and other beta-propeller structures.
Pssm-ID: 271322 [Multi-domain] Cd Length: 247 Bit Score: 46.43 E-value: 1.05e-04
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958656154 1166 LLAPVALAVGIDGSLFVGDFNYIR--RIFPSRNVTSILELRNKEFKHSnspghkyyLAVDPVtGSLYVSDTNSRRIYRVK 1243
Cdd:cd14952 51 LYQPQGVAVDAAGTVYVTDFGNNRvlKLAAGSTTQTVLPFTGLNDPTG--------VAVDAA-GNVYVADTGNNRVLKLA 121
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958656154 1244 S------------LSGAKDLA------------GNSEVV---AGTGEQC-LPFDEarcgdggkavdatLMSPRGIAVDKN 1295
Cdd:cd14952 122 AgsntqtvlpftgLSNPDGVAvdgagnvyvtdtGNNRVLklaAGSTTQTvLPFTG-------------LNSPSGVAVDTA 188
|
170 180 190 200 210 220
....*....|....*....|....*....|....*....|....*....|....*....|....*
gi 1958656154 1296 GLMYFVDAtmirkvDQNGIISTLLGSNDLTAVrPLScdssmdvaqvRLEWPTDLAVNPmDNSLYV 1360
Cdd:cd14952 189 GNVYVTDH------GNNRVLKLAAGSTTPTVL-PFT----------GLNGPLGVAVDA-AGNVYV 235
|
|
| EGF_2 |
pfam07974 |
EGF-like domain; This family contains EGF domains found in a variety of extracellular proteins. |
580-602 |
3.00e-04 |
|
EGF-like domain; This family contains EGF domains found in a variety of extracellular proteins.
Pssm-ID: 400365 Cd Length: 26 Bit Score: 40.02 E-value: 3.00e-04
|
| YvrE |
COG3386 |
Sugar lactone lactonase YvrE [Carbohydrate transport and metabolism]; Sugar lactone lactonase ... |
1172-1332 |
3.52e-04 |
|
Sugar lactone lactonase YvrE [Carbohydrate transport and metabolism]; Sugar lactone lactonase YvrE is part of the Pathway/BioSystem: Non-phosphorylated Entner-Doudoroff pathway
Pssm-ID: 442613 [Multi-domain] Cd Length: 266 Bit Score: 44.88 E-value: 3.52e-04
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958656154 1172 LAVGIDGSLFVGDFNY------IRRIFPSRNVTSILElrnkEFKHSNSpghkyyLAVDPVTGSLYVSDTNSRRIYRVksl 1245
Cdd:COG3386 98 GVVDPDGRLYFTDMGEylptgaLYRVDPDGSLRVLAD----GLTFPNG------IAFSPDGRTLYVADTGAGRIYRF--- 164
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958656154 1246 sgakDLAGNSEVVAGTgeqclPFDEARCGDGGkavdatlmsPRGIAVDKNGLMY--FVDATMIRKVDQNGiisTLLGSND 1323
Cdd:COG3386 165 ----DLDADGTLGNRR-----VFADLPDGPGG---------PDGLAVDADGNLWvaLWGGGGVVRFDPDG---ELLGRIE 223
|
....*....
gi 1958656154 1324 LTAVRPLSC 1332
Cdd:COG3386 224 LPERRPTNV 232
|
|
| EGF_2 |
pfam07974 |
EGF-like domain; This family contains EGF domains found in a variety of extracellular proteins. |
645-667 |
1.07e-03 |
|
EGF-like domain; This family contains EGF domains found in a variety of extracellular proteins.
Pssm-ID: 400365 Cd Length: 26 Bit Score: 38.48 E-value: 1.07e-03
|
| EGF_CA |
cd00054 |
Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular ... |
671-700 |
1.24e-03 |
|
Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements.
Pssm-ID: 238011 Cd Length: 38 Bit Score: 38.39 E-value: 1.24e-03
10 20 30
....*....|....*....|....*....|....*
gi 1958656154 671 DCLDPT-CSSHGVCVNGE----CLCSPGWGGLNCE 700
Cdd:cd00054 4 ECASGNpCQNGGTCVNTVgsyrCSCPPGYTGRNCE 38
|
|
| EGF_CA |
cd00054 |
Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular ... |
638-668 |
1.73e-03 |
|
Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements.
Pssm-ID: 238011 Cd Length: 38 Bit Score: 38.00 E-value: 1.73e-03
10 20 30
....*....|....*....|....*....|....*.
gi 1958656154 638 NQCIDPS-CGGHGSCIDG----NCVCAAGYKGEHCE 668
Cdd:cd00054 3 DECASGNpCQNGGTCVNTvgsyRCSCPPGYTGRNCE 38
|
|
| C_rich_MXAN6577 |
NF041328 |
MXAN_6577-like cysteine-rich domain; |
583-693 |
2.21e-03 |
|
MXAN_6577-like cysteine-rich domain;
Pssm-ID: 469225 [Multi-domain] Cd Length: 145 Bit Score: 40.90 E-value: 2.21e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958656154 583 NGECVSglchcfpgfLGADCAK-AACPVLCSGNGQYSKGTCQCYSGwkGAECDvpmNQCI----DP-SCGGHGSCIDGNC 656
Cdd:NF041328 29 GGACVD---------LRSDPSNcGACGVACGAGQTCVAGACGCGPG--TVACG---GACVdtasDPaHCGACGAACAPGQ 94
|
90 100 110 120
....*....|....*....|....*....|....*....|....*....
gi 1958656154 657 VCAAGYKGEHCEE--VDCldptcssHGVCVN--------GEC--LCSPG 693
Cdd:NF041328 95 VCEGGACREACSEglTRC-------GGACVDlatdplhcGACgvACDPG 136
|
|
| NHL_like_6 |
cd14962 |
Uncharacterized NHL-repeat domain in bacterial proteins; The NHL (NCL-1, HT2A and LIN-41) ... |
1218-1422 |
3.07e-03 |
|
Uncharacterized NHL-repeat domain in bacterial proteins; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies, typically as 6 instances. It is about 40 residues long and resembles the WD repeat and other beta-propeller structures.
Pssm-ID: 271332 [Multi-domain] Cd Length: 271 Bit Score: 42.19 E-value: 3.07e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958656154 1218 YYLAVDPvTGSLYVSDTNSRRIYRvkslsgakdlagnsevvagtgeqclpFDEARcgdG-----GKAVDATLMSPRGIAV 1292
Cdd:cd14962 15 YGVAADG-RGRIYVADTGRGAVFV--------------------------FDLPN---GkvfviGNAGPNRFVSPIGVAI 64
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958656154 1293 DKNGLMYFVDAT--MIRKVDQNGIISTLLGSNDLtavrplscdssmdvaQVRlewPTDLAVNPMDNSLYVLEnnvilriT 1370
Cdd:cd14962 65 DANGNLYVSDAElgKVFVFDRDGKFLRAIGAGAL---------------FKR---PTGIAVDPAGKRLYVVD-------T 119
|
170 180 190 200 210
....*....|....*....|....*....|....*....|....*....|..
gi 1958656154 1371 ENHQVSIIAGRPMHCQVPGIDYSlsklaIHSALESASAIAISHTGVLYITET 1422
Cdd:cd14962 120 LAHKVKVFDLDGRLLFDIGKRGS-----GPGEFNLPTDLAVDRDGNLYVTDT 166
|
|
| NHL_like_5 |
cd14963 |
Uncharacterized NHL-repeat domain in bacterial proteins; The NHL (NCL-1, HT2A and LIN-41) ... |
1159-1323 |
3.28e-03 |
|
Uncharacterized NHL-repeat domain in bacterial proteins; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies, typically as 6 instances. It is about 40 residues long and resembles the WD repeat and other beta-propeller structures.
Pssm-ID: 271333 [Multi-domain] Cd Length: 268 Bit Score: 41.89 E-value: 3.28e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958656154 1159 GLAEGnKLLAPVALAVGIDGSLFVGDFnYIRRI------------FPSRnvtsilelrnKEFKHSNSPGHkyyLAVDpvT 1226
Cdd:cd14963 49 GTGPG-EFKYPYGIAVDSDGNIYVADL-YNGRIqvfdpdgkflkyFPEK----------KDRVKLISPAG---LAID--D 111
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958656154 1227 GSLYVSDTNSRRIYrvkslsgakdlagnseVVAGTGEQCLPFDEARCGDGgkavdaTLMSPRGIAVDKNGLMYFVDATMI 1306
Cdd:cd14963 112 GKLYVSDVKKHKVI----------------VFDLEGKLLLEFGKPGSEPG------ELSYPNGIAVDEDGNIYVADSGNG 169
|
170 180
....*....|....*....|
gi 1958656154 1307 R-KV-DQNG-IISTLLGSND 1323
Cdd:cd14963 170 RiQVfDKNGkFIKELNGSPD 189
|
|
| YD_repeat_2x |
TIGR01643 |
YD repeat (two copies); This model describes two tandem copies of a 21-residue extracellular ... |
1611-1653 |
3.76e-03 |
|
YD repeat (two copies); This model describes two tandem copies of a 21-residue extracellular repeat found in Gram-negative, Gram-positive, and animal proteins. The repeat is named for a YD dipeptide, the most strongly conserved motif of the repeat. These repeats appear in general to be involved in binding carbohydrate; the chicken teneurin-1 YD-repeat region has been shown to bind heparin.
Pssm-ID: 273728 [Multi-domain] Cd Length: 42 Bit Score: 37.18 E-value: 3.76e-03
10 20 30 40
....*....|....*....|....*....|....*....|...
gi 1958656154 1611 YDGNtGLLATKSDETGWTTFYDYDHEGRLTNVTRPTGVVTSLH 1653
Cdd:TIGR01643 1 YDAA-GRLTGSTDADGTTTRYTYDAAGRLVEITDADGGSTRYE 42
|
|
| EGF_CA |
cd00054 |
Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular ... |
736-769 |
3.96e-03 |
|
Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements.
Pssm-ID: 238011 Cd Length: 38 Bit Score: 37.23 E-value: 3.96e-03
10 20 30
....*....|....*....|....*....|....*.
gi 1958656154 736 VDGC--PDLCNGNGRCTLGQNSWQCVCQTGWRGPGC 769
Cdd:cd00054 2 IDECasGNPCQNGGTCVNTVGSYRCSCPPGYTGRNC 37
|
|
| SOBP |
pfam15279 |
Sine oculis-binding protein; SOBP is associated with syndromic and nonsyndromic intellectual ... |
185-349 |
5.63e-03 |
|
Sine oculis-binding protein; SOBP is associated with syndromic and nonsyndromic intellectual disability. It carries a zinc-finger of the zf-C2H2 type at the N-terminus, and a highly characteriztic C-terminal PhPhPhPhPhPh motif. The deduced 873-amino acid protein contains an N-terminal nuclear localization signal (NLS), followed by 2 FCS-type zinc finger motifs, a proline-rich region (PR1), a putative RNA-binding motif region, and a C-terminal NLS embedded in a second proline-rich motif. SOBP is expressed in various human tissues, including developing mouse brain at embryonic day 14. In postnatal and adult mouse brain SOBP is expressed in all neurons, with intense staining in the limbic system. Highest expression is in layer V cortical neurons, hippocampus, pyriform cortex, dorsomedial nucleus of thalamus, amygdala, and hypothalamus. Postnatal expression of SOBP in the limbic system corresponds to a time of active synaptogenesis. the family is also referred to as Jackson circler, JXC1. In seven affected siblings from a consanguineous Israeli Arab family with mental retardation, anterior maxillary protrusion, and strabismus mutations were found in this protein.
Pssm-ID: 464609 [Multi-domain] Cd Length: 325 Bit Score: 41.72 E-value: 5.63e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958656154 185 LPSSHNPPPVScQMPLLDSNTSHQIMDTNPDEEF----SPNSYLLRACSGPQQ---ASSSGPPNHHSQSTLRPPLPPPhn 257
Cdd:pfam15279 127 APKPHEPPSLP-PPPLPPKKGRRHRPGLHPPLGRppgsPPMSMTPRGLLGKPQqhpPPSPLPAFMEPSSMPPPFLRPP-- 203
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958656154 258 htlsHHHSSANSlnrnSLTNRRSQIHAPAPAP-NDLATTPEsvqlqdswvlnsnvPLEtRHFLFKTSSGSTPLFSSSSPG 336
Cdd:pfam15279 204 ----PSIPQPNS----PLSNPMLPGIGPPPKPpRNLGPPSN--------------PMH-RPPFSPHHPPPPPTPPGPPPG 260
|
170
....*....|...
gi 1958656154 337 YPLTSGTVYTPPP 349
Cdd:pfam15279 261 LPPPPPRGFTPPF 273
|
|
| I-EGF_1 |
pfam18372 |
Integrin beta epidermal growth factor like domain 1; This is the I-EGF 1 domain found in ... |
611-628 |
6.04e-03 |
|
Integrin beta epidermal growth factor like domain 1; This is the I-EGF 1 domain found in several integrin betas such as integrin beta 1-7. Structural analysis reveal an epidermal growth factor-like (I-EGF) domains 1 and 2. EGF1 lacks one disulfide (C2-C4) relative to the integrin EGF 2, 3, and 4 domains, this allows the C-terminal end of EGF1 to flex remarkably relative to its N-terminal end.
Pssm-ID: 465729 Cd Length: 29 Bit Score: 36.31 E-value: 6.04e-03
|
| EGF |
pfam00008 |
EGF-like domain; There is no clear separation between noise and signal. pfam00053 is very ... |
739-767 |
6.73e-03 |
|
EGF-like domain; There is no clear separation between noise and signal. pfam00053 is very similar, but has 8 instead of 6 conserved cysteines. Includes some cytokine receptors. The EGF domain misses the N-terminus regions of the Ca2+ binding EGF domains (this is the main reason of discrepancy between swiss-prot domain start/end and Pfam). The family is hard to model due to many similar but different sub-types of EGF domains. Pfam certainly misses a number of EGF domains.
Pssm-ID: 394967 Cd Length: 31 Bit Score: 36.21 E-value: 6.73e-03
10 20 30
....*....|....*....|....*....|
gi 1958656154 739 CPDL-CNGNGRCTLGQNSWQCVCQTGWRGP 767
Cdd:pfam00008 1 CAPNpCSNGGTCVDTPGGYTCICPEGYTGK 30
|
|
| NHL_PKND_like |
cd14952 |
NHL repeat domain of the protein kinase PknD; PknD is a mycobacterial transmembrane protein ... |
1169-1302 |
7.94e-03 |
|
NHL repeat domain of the protein kinase PknD; PknD is a mycobacterial transmembrane protein with a cytosolic kinase domain and an extracellular sensor domain that contains NHL repeats. It plays a key role in the development of central nervous system tuberculosis, by mediating the invasion of host brain endothelia. The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies, typically as 6 instances. It is about 40 residues long and resembles the WD repeat and other beta-propeller structures.
Pssm-ID: 271322 [Multi-domain] Cd Length: 247 Bit Score: 40.65 E-value: 7.94e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958656154 1169 PVALAVGIDGSLFVGDF--NYIRRIFPSRNVTSILElrnkeFKHSNSPGHkyyLAVDPvTGSLYVSDTNSRRIYRvksls 1246
Cdd:cd14952 138 PDGVAVDGAGNVYVTDTgnNRVLKLAAGSTTQTVLP-----FTGLNSPSG---VAVDT-AGNVYVTDHGNNRVLK----- 203
|
90 100 110 120 130
....*....|....*....|....*....|....*....|....*....|....*..
gi 1958656154 1247 gakdLAgnsevvAGTGEQC-LPFDEarcgdggkavdatLMSPRGIAVDKNGLMYFVD 1302
Cdd:cd14952 204 ----LA------AGSTTPTvLPFTG-------------LNGPLGVAVDAAGNVYVAD 237
|
|
| NHL-2_like |
cd14951 |
NHL repeat domain of NHL repeat-containing protein 2 and similar proteins; NHL ... |
1414-1499 |
8.10e-03 |
|
NHL repeat domain of NHL repeat-containing protein 2 and similar proteins; NHL repeat-containing protein 2 (NHLRC2) and related bacterial proteins; members of this eukaryotic and bacterial family are uncharacterized, the NHL repeat domain is found C-terminally of a thioredoxin domain. The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies, typically as 6 instances. It is about 40 residues long and resembles the WD repeat and other beta-propeller structures.
Pssm-ID: 271321 [Multi-domain] Cd Length: 334 Bit Score: 41.02 E-value: 8.10e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958656154 1414 TGVLYITETDEKKINRL----RQVTTngeiclLAGaasdcdckndvncicySGDDAYA-TDAILNSPSSLAVAPDGTIYI 1488
Cdd:cd14951 206 DGSVYVADTYNHKIKRVdpatGEVST------LAG----------------TGKAGYKdLEAQFSEPSGLVVDGDGRLYV 263
|
90
....*....|.
gi 1958656154 1489 ADLGNIRIRAV 1499
Cdd:cd14951 264 ADTNNHRIRRL 274
|
|
|