Conserved domains on  [gi|1092545404|ref|WP_070528148|]

family 20 glycosylhydrolase [Streptococcus sp. HMSC073F11]

Protein Classification

Graphical summary


List of domain hits

Name Accession Description Interval E-value
GH20_DspB_LnbB-like cd06564
Glycosyl hydrolase family 20 (GH20) catalytic domain of dispersin B (DspB), lacto-N-biosidase ...
190-548 3.28e-102

Glycosyl hydrolase family 20 (GH20) catalytic domain of dispersin B (DspB), lacto-N-biosidase (LnbB) and related proteins. Dispersin B is a soluble beta-N-acetylglucosaminidase found in bacteria that hydrolyzes the beta-1,6-linkages of PGA (poly-beta-(1,6)-N-acetylglucosamine), a major component of the extracellular polysaccharide matrix. Lacto-N-biosidase hydrolyzes lacto-N-biose (LNB) type I oligosaccharides at the nonreducing terminus to produce lacto-N-biose as part of the GNB/LNB (galacto-N-biose/lacto-N-biose I) degradation pathway. The lacto-N-biosidase from Bifidobacterium bifidum has this GH20 domain, a carbohydrate binding module 32, and a bacterial immunoglobulin-like domain 2, as well as a YSIRK signal peptide and a G5 membrane anchor at the N and C termini, respectively. The GH20 hexosaminidases are thought to act via a catalytic mechanism in which the catalytic nucleophile is not provided by solvent or the enzyme, but by the substrate itself.



Pssm-ID: 119334  Cd Length: 326  Bit Score: 328.48  E-value: 3.28e-102
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404  190 KRKIVSIDAGRKYFSPEQLKEIIDKAKEYGYTDLHLLVgNDGLRFMLDDMSMKVGDKTYSSDDVKRaienGTNAYYDDPN 269
Cdd:cd06564      1 EVRGFMLDVGRKYYSMDFLKDIIKTMSWYKMNDLQLHL-NDNLIFNLDDMSTTVNNATYASDDVKS----GNNYYNLTAN 75
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404  270 GNHLTESQMTDLINYAKDKGIGVIPTVNSPGHMDAMLHAMKELGIENPNFDYfgkkSERTVDLNNKQAVDFTKTLIDKYA 349
Cdd:cd06564     76 DGYYTKEEFKELIAYAKDRGVNIIPEIDSPGHSLAFTKAMPELGLKNPFSKY----DKDTLDISNPEAVKFVKALFDEYL 151
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404  350 NYFSKKSEIFNIGLDEYANDATDakgwsvlqadkyypnegypekgYEKFISYANDLARIVKSHGLKPMAFNDGIYYNSDT 429
Cdd:cd06564    152 DGFNPKSDTVHIGADEYAGDAGY----------------------AEAFRAYVNDLAKYVKDKGKTPRVWGDGIYYKGDT 209
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404  430 sfGSFDKDIIVSMWTGGWggydvASSKLLAEKGHQILNTNDAWYY-VLGRNADGQGWYNLDQGLNGIKNTPITSVPKTEG 508
Cdd:cd06564    210 --TVLSKDVIINYWSYGW-----ADPKELLNKGYKIINTNDGYLYiVPGAGYYGDYLNTEDIYNNWTPNKFGGTNATLPE 282
                          330       340       350       360
                   ....*....|....*....|....*....|....*....|....
gi 1092545404  509 ADVPIIGGMVAAWADTPSARYSPS----RLFKLMRHFANANAEY 548
Cdd:cd06564    283 GDPQILGGMFAIWNDDSDAGISEVdiydRIFPALPAFAEKTWGG 326
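
The summary line and interval reported for this hit can be turned into typed values with a few lines of Python. The following is a minimal sketch, not part of the CD-Search output: the function names `parse_summary` and `span_length` are hypothetical, and the regular expression assumes the field order shown in the summary lines on this page. The interval (190-548) and the model positions (1-326) are copied from the alignment above.

```python
import re

# Summary line copied from the first GH20_DspB_LnbB-like hit on this page.
SUMMARY = "Pssm-ID: 119334  Cd Length: 326  Bit Score: 328.48  E-value: 3.28e-102"

# Assumed field order: Pssm-ID, optional [tag], Cd Length, Bit Score, E-value.
SUMMARY_RE = re.compile(
    r"Pssm-ID:\s*(?P<pssm_id>\d+)"
    r"(?:\s*\[(?P<tag>[^\]]+)\])?"
    r"\s*Cd Length:\s*(?P<cd_length>\d+)"
    r"\s*Bit Score:\s*(?P<bit_score>[\d.]+)"
    r"\s*E-value:\s*(?P<e_value>[\d.eE+-]+)"
)


def parse_summary(line: str) -> dict:
    """Split a 'Pssm-ID: ... E-value: ...' line into typed fields."""
    m = SUMMARY_RE.search(line)
    if m is None:
        raise ValueError(f"not a summary line: {line!r}")
    return {
        "pssm_id": int(m.group("pssm_id")),
        "tag": m.group("tag"),  # e.g. "Multi-domain", or None when absent
        "cd_length": int(m.group("cd_length")),
        "bit_score": float(m.group("bit_score")),
        "e_value": float(m.group("e_value")),
    }


def span_length(start: int, end: int) -> int:
    """Number of positions in an inclusive 1-based interval."""
    return end - start + 1


hit = parse_summary(SUMMARY)
query_span = span_length(190, 548)      # residues of the query covered by the hit
model_span = span_length(1, 326)        # positions of the cd06564 model aligned
model_coverage = model_span / hit["cd_length"]

print(hit)
print(f"query span: {query_span} residues, model coverage: {model_coverage:.0%}")
```

For this hit the aligned model positions run from 1 to 326 of a model of length 326, so the alignment covers the whole domain model.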
GH20_DspB_LnbB-like cd06564
Glycosyl hydrolase family 20 (GH20) catalytic domain of dispersin B (DspB), lacto-N-biosidase ...
635-982 4.65e-101

Glycosyl hydrolase family 20 (GH20) catalytic domain of dispersin B (DspB), lacto-N-biosidase (LnbB) and related proteins. Dispersin B is a soluble beta-N-acetylglucosaminidase found in bacteria that hydrolyzes the beta-1,6-linkages of PGA (poly-beta-(1,6)-N-acetylglucosamine), a major component of the extracellular polysaccharide matrix. Lacto-N-biosidase hydrolyzes lacto-N-biose (LNB) type I oligosaccharides at the nonreducing terminus to produce lacto-N-biose as part of the GNB/LNB (galacto-N-biose/lacto-N-biose I) degradation pathway. The lacto-N-biosidase from Bifidobacterium bifidum has this GH20 domain, a carbohydrate binding module 32, and a bacterial immunoglobulin-like domain 2, as well as a YSIRK signal peptide and a G5 membrane anchor at the N and C termini, respectively. The GH20 hexosaminidases are thought to act via a catalytic mechanism in which the catalytic nucleophile is not provided by solvent or the enzyme, but by the substrate itself.



Pssm-ID: 119334  Cd Length: 326  Bit Score: 325.01  E-value: 4.65e-101
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404  635 KNKVISIDAGRKYFSLDQLKRIVDKASELGYSDAHLLLgNDGLRFLLDDMTITANGKTYASDDVKkaiiEGTKAYYDDPN 714
Cdd:cd06564      1 EVRGFMLDVGRKYYSMDFLKDIIKTMSWYKMNDLQLHL-NDNLIFNLDDMSTTVNNATYASDDVK----SGNNYYNLTAN 75
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404  715 GTALTQAEVTELAKYAKEKGIGLIPAINSPGHMDAMLVAMEKLGIANPqanFDKVSKTTMDLENQEALNFTKALIGKYMD 794
Cdd:cd06564     76 DGYYTKEEFKELIAYAKDRGVNIIPEIDSPGHSLAFTKAMPELGLKNP---FSKYDKDTLDISNPEAVKFVKALFDEYLD 152
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404  795 YFADKSKIFNYGTDEYANDATNaqgwyylkwyglYNKFADYSNSLAAMAKERGLQPMAFNDGFYYEDkDDAEFDKDVLIS 874
Cdd:cd06564    153 GFNPKSDTVHIGADEYAGDAGY------------AEAFRAYVNDLAKYVKDKGKTPRVWGDGIYYKG-DTTVLSKDVIIN 219
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404  875 YWSKGWwgynlASPQYLASKGYKFLNTNGDWYYILGQKPEDGGGF-LKKAIENTGKTPFNQlASTKYPEVDLPTVGSMLS 953
Cdd:cd06564    220 YWSYGW-----ADPKELLNKGYKIINTNDGYLYIVPGAGYYGDYLnTEDIYNNWTPNKFGG-TNATLPEGDPQILGGMFA 293
                          330       340       350
                   ....*....|....*....|....*....|...
gi 1092545404  954 IWADRPSAEYKEEEIFE----LMTAFADHNKDY 982
Cdd:cd06564    294 IWNDDSDAGISEVDIYDrifpALPAFAEKTWGG 326
PspC_relate_1 super family cl41464
PspC-related protein choline-binding protein 1; Members of this family share C-terminal ...
1066-1341 3.63e-58

PspC-related protein choline-binding protein 1; Members of this family share C-terminal homology to the choline-binding form of the pneumococcal surface antigen PspC, but not to its allelic LPXTG-anchored forms because they lack the choline-binding repeat region. Members of this family should not be confused with PspC itself, whose identity and function reflect regions N-terminal to the choline-binding region. See Iannelli, et al. (PMID: 11891047) for information about the different allelic forms of PspC.


The actual alignment was detected with superfamily member NF033840:

Pssm-ID: 411409 [Multi-domain]  Cd Length: 648  Bit Score: 213.41  E-value: 3.63e-58
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404 1066 TKPELVVktESIPFKVIRKENPNLPAGQEkVVKAGVLGE--RTSYVSVLTENGKASETVlDSQVTKEPVdqivefgapit 1143
Cdd:NF033840   397 TKPQVLV--QVIPIETEYLDDPTLDKGQE-VEEAGEIGEitLTTIYTVDERDGTIEETT-SRQITKEMV----------- 461
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404 1144 hvgdenglapiaeeKPRLDIPKEEPSRTETPFKVVVESGP-KAESNSANGITHQLADGKNSKPGWKLIESKWYYYDHADK 1222
Cdd:NF033840   462 --------------KRRIRRGTREPEKVVVPKKSSIPSYPvSVTSNQGTDAAVEPAKPVAPTTGWKQENGMWYFYNTDGS 527
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404 1223 VKTGWVK-DGAWYYLDESGVMKTGWQKVNGTWYYLDNSGAMQTGWIDQGGSWYYLNDSGAMQTGWVNQGDTWYYLDNSGT 1301
Cdd:NF033840   528 MATGWVQvNGSWYYLNSNGSMATGWVQVNGSWYYLNSNGSMATGWVQVDGSWYYLNDNGSMETGWLQNNGSWYYLNSNGS 607
                          250       260       270       280
                   ....*....|....*....|....*....|....*....|.
gi 1092545404 1302 MKTG-WFQVEDKWYYSYPSGALAVNTTIDGYAVNADGEWIQ 1341
Cdd:NF033840   608 MKANqWFQVGSKWYYVNASGELAVNTSIDGYRVNDNGEWVR 648
PspC_subgroup_2 super family cl41463
pneumococcal surface protein PspC, LPXTG-anchored form; The pneumococcal surface protein PspC, ...
35-187 5.71e-08

pneumococcal surface protein PspC, LPXTG-anchored form; The pneumococcal surface protein PspC, as described in Streptococcus pneumoniae, is a repetitive and highly variable protein, recognized by a conserved N-terminal domain and also by genomic location. This form, subgroup 2, is anchored covalently after cleavage by sortase at a C-terminal LPXTG site. The other form, subgroup 1, has variable numbers of a choline-binding repeat in the C-terminal region, and is also known as choline-binding protein A.


The actual alignment was detected with superfamily member NF033839:

Pssm-ID: 468202 [Multi-domain]  Cd Length: 557  Bit Score: 57.09  E-value: 5.71e-08
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404   35 DGVTPTTENQPTIHTV----------------SDSPQPSENRTEETPKAELQPE-----TPKTVETETPSTDKVASL--P 91
Cdd:NF033839   249 DNVNTKVEIENTVHKIfadmdavvtkfkkgltQDTPKEPGNKKPSAPKPGMQPSpqpekKEVKPEPETPKPEVKPQLekP 328
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404   92 KTEEKTQEEV----------SSTPSDKEEVVTPTSVEKEAADKKAEEASPKKEEQKEANSKESDTDKTD-KSEADKDKPA 160
Cdd:NF033839   329 KPEVKPQPEKpkpevkpqleTPKPEVKPQPEKPKPEVKPQPEKPKPEVKPQPETPKPEVKPQPEKPKPEvKPQPEKPKPE 408
                          170       180
                   ....*....|....*....|....*....
gi 1092545404  161 KKD--ETKAEADKPATEAGKERAATENEK 187
Cdd:NF033839   409 VKPqpEKPKPEVKPQPEKPKPEVKPQPEK 437
YSIRK_signal TIGR01168
Gram-positive signal peptide, YSIRK family; Many surface proteins found in Streptococcus, ...
1-24 2.03e-07

Gram-positive signal peptide, YSIRK family; Many surface proteins found in Streptococcus, Staphylococcus, and related lineages share apparently homologous signal sequences. A motif resembling [YF]SIRKxxxGxxS[VIA] appears at the start of the transmembrane domain. The GxxS motif appears perfectly conserved, suggesting a specific function and not just homology. There is a strong correlation between proteins carrying this region at the N-terminus and those carrying the Gram-positive anchor domain with the LPXTG sortase processing site at the C-terminus.



Pssm-ID: 273479 [Multi-domain]  Cd Length: 39  Bit Score: 48.25  E-value: 2.03e-07
                           10        20
                   ....*....|....*....|....
gi 1092545404    1 MKQEKQQRFSIRKYAVGAASVLIG 24
Cdd:TIGR01168    3 KFNEKQQKYSIRKLSVGVASVLVA 26
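
The pairwise alignments on this page can be summarized as a percent identity. The sketch below is illustrative only: `alignment_identity` is a hypothetical helper, and it assumes that `-` marks a gap and that lowercase letters mark residues left unaligned opposite a gap, as the alignments above appear to use them. The two strings are the query and TIGR01168 rows of the alignment directly above.

```python
# Alignment rows copied from the YSIRK_signal hit above.
QUERY = "MKQEKQQRFSIRKYAVGAASVLIG"    # gi 1092545404, residues 1-24
SUBJECT = "KFNEKQQKYSIRKLSVGVASVLVA"  # TIGR01168, positions 3-26


def alignment_identity(query: str, subject: str) -> tuple[int, int]:
    """Return (identical, aligned) column counts for two alignment rows.

    Columns containing a gap ('-') or a lowercase (assumed unaligned)
    residue are skipped, so only aligned residue pairs are counted.
    """
    if len(query) != len(subject):
        raise ValueError("alignment rows must have the same number of columns")
    identical = aligned = 0
    for q, s in zip(query, subject):
        if q == "-" or s == "-" or q.islower() or s.islower():
            continue
        aligned += 1
        if q == s:
            identical += 1
    return identical, aligned


identical, aligned = alignment_identity(QUERY, SUBJECT)
print(f"{identical}/{aligned} identical ({100 * identical / aligned:.1f}%)")
```

The same function can be applied to the longer alignments if the rows of each block are concatenated first, since the page wraps every alignment at 80 columns.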
 
Name Accession Description Interval E-value
GH20_DspB_LnbB-like cd06564
Glycosyl hydrolase family 20 (GH20) catalytic domain of dispersin B (DspB), lacto-N-biosidase ...
190-548 3.28e-102

GH20_DspB_LnbB-like cd06564
Glycosyl hydrolase family 20 (GH20) catalytic domain of dispersin B (DspB), lacto-N-biosidase ...
635-982 4.65e-101

PspC_relate_1 NF033840
PspC-related protein choline-binding protein 1; Members of this family share C-terminal ...
1066-1341 3.63e-58

COG5263 COG5263
Glucan-binding domain (YG repeat) [Carbohydrate transport and metabolism];
1206-1341 4.11e-52

Glucan-binding domain (YG repeat) [Carbohydrate transport and metabolism];


Pssm-ID: 444077 [Multi-domain]  Cd Length: 486  Bit Score: 191.62  E-value: 4.11e-52
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404 1206 GWKLIESKWYYYDHADKVKTGWVK-DGAWYYLDESGVMKTGWQKVNGTWYYLDNSGAMQTGWIDQGGSWYYLNDSGAMQT 1284
Cdd:COG5263    347 GWVTDDGKWYYLGSDGAMATGWQKiDGKWYYFDSNGAMATGWVKVDGKWYYFDSSGAMATGWLKIDGKWYYFDSDGAMAT 426
                           90       100       110       120       130       140
                   ....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404 1285 GWVNQGDTWYYLDNSGTMKTGWFQVEDKWYYSYPSGALAVNT-TIDG--YAVNADGEWIQ 1341
Cdd:COG5263    427 GWQKIGGKWYYFDSNGAMATGWVKVDGKWYYFDSDGAMATGWqTIDGktYYFDSNGAWVG 486
pneumo_PspA NF033930
pneumococcal surface protein A; The pneumococcal surface proteins, found in ...
1206-1340 4.98e-49

pneumococcal surface protein A; The pneumococcal surface proteins, found in Streptococcus pneumoniae, are repetitive, with patterns of localized high sequence identity across pairs of proteins given different specific names, such that recombination may be presumed. This protein, PspA, has an N-terminal region that lacks a cross-wall-targeting YSIRK type extended signal peptide, in contrast to the closely related choline-binding protein CbpA which has a similar C-terminus but a YSIRK-containing region at the N-terminus.


Pssm-ID: 468251 [Multi-domain]  Cd Length: 660  Bit Score: 186.27  E-value: 4.98e-49
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404 1206 GWKLIESKWYYYDHADKVKTGWVKD-GAWYYLDESGVMKTGWQKVNGTWYYLDNSGAMQTGWIDQGGSWYYLNDSGAMQT 1284
Cdd:NF033930   504 GWAKVNGSWYYLNANGAMATGWLQYnGSWYYLNANGAMATGWLKYNGSWYYLNANGAMATGWLQYNGSWYYLNANGAMAT 583
                           90       100       110       120       130       140       150
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*..
gi 1092545404 1285 G--------------------WVNQGDTWYYLDNSGTMKTG-WFQVEDKWYYSYPSGALAVNTTIDGYAVNADGEWI 1340
Cdd:NF033930   584 GwakvngswyylnangsmatgWVKDGDTWYYLEASGAMKASqWFKVSDKWYYVNGLGALAVNTTVDGYTVNANGEWV 660
PspC_subgroup_1 NF033838
pneumococcal surface protein PspC, choline-binding form; The pneumococcal surface protein PspC, ...
1148-1340 1.13e-47

pneumococcal surface protein PspC, choline-binding form; The pneumococcal surface protein PspC, as described in Streptococcus pneumoniae, is a repetitive and highly variable protein, recognized by a conserved N-terminal domain and also by genomic location. This form, subgroup 1, has variable numbers of a choline-binding repeat in the C-terminal region, and is also known as choline-binding protein A. The other form, subgroup 2, is anchored covalently after cleavage by sortase at a C-terminal LPXTG site.


Pssm-ID: 468201 [Multi-domain]  Cd Length: 684  Bit Score: 182.91  E-value: 1.13e-47
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404 1148 ENGLAPIAEEKPRLdIPKEEPSRTETPFKvvvESGPKAESNSANGITHQLADGKNSKPGWKLIESKWYYYDHADKVKTGW 1227
Cdd:NF033838   453 EEDYARRSEEEYNR-LTQQQPPKTEKPAQ---PSTPKTGWKQENGMWYFYNTDGSMATGWLQNNGSWYYLNANGAMATGW 528
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404 1228 VK-DGAWYYLDESGVMKTGWQKVNGTWYYLDNSGAMQTGWIDQGGSWYYLNDSGAMQTG--------------------- 1285
Cdd:NF033838   529 LQnNGSWYYLNANGSMATGWLQNNGSWYYLNANGAMATGWLQYNGSWYYLNANGDMATGwlqyngswyylnangdmatgw 608
                          170       180       190       200       210       220       230
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*
gi 1092545404 1286 -------------------WVNQGDTWYYLDNSGTMKTG-WFQVEDKWYYSYPSGALAVNTTIDGYAVNADGEWI 1340
Cdd:NF033838   609 lqyngswyylnangsmatgWVKDGDTWYYLEASGAMKASqWFKVSDKWYYVNGSGALAVNTTVDGYGVNANGEWV 683
Glyco_hydro_20 pfam00728
Glycosyl hydrolase family 20, catalytic domain; This domain has a TIM barrel fold.
196-524 6.64e-22

Glycosyl hydrolase family 20, catalytic domain; This domain has a TIM barrel fold.


Pssm-ID: 425840  Cd Length: 344  Bit Score: 98.91  E-value: 6.64e-22
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404  196 IDAGRKYFSPEQLKEIIDKAKEYGYTDLHL-LVGNDGLRFMLDDMSMKVGDKTYSSDDVkraIENGTNAYYddpngnhlT 274
Cdd:pfam00728    8 LDVARHFLPVDDIKRTIDAMAAYKLNVLHWhLTDDQGWRLEIKKYPKLTEKGAYRPSDL---DGTPYGGFY--------T 76
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404  275 ESQMTDLINYAKDKGIGVIPTVNSPGHMDAMLHAMKELG----IENPNFDYFGKKSERTVDLNNKQAVDFTKTLIDKYAN 350
Cdd:pfam00728   77 QEDIREIVAYAAARGIRVIPEIDMPGHARAALAAYPELGcgcgADSPWVSVQWGPPEGQLNPGNEKTYTFLDNVFDEVAD 156
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404  351 YFSkkSEIFNIGLDEYAndatdAKGWsvlQAD----KYYPNEGYpeKGYEKFISY-ANDLARIVKSHGLKPMAFNDGIYY 425
Cdd:pfam00728  157 LFP--SDYIHIGGDEVP-----KGCW---EKSpecqARMKEEGL--KSLHELQQYfIKRASKIVSSKGRRLIGWDEILDG 224
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404  426 NSDTSfgsfDKDIIVSMWTGGWGGydvasSKLLAEKGHQ-ILNTNDAWYYVLGRNADGQGWYNLDQGLNGIKNT----PI 500
Cdd:pfam00728  225 GVPLL----PKNTTVQSWRGGDEA-----AQKAAKQGYDvIMSPGDFLYLDCGQGGNPTEEPYYWGGFVPLEDVynwdPV 295
                          330       340
                   ....*....|....*....|....
gi 1092545404  501 TSVPKTEGADVPIIGGMVAAWADT 524
Cdd:pfam00728  296 PDTWNDPEQAKHVLGGQANLWTEQ 319
G5 pfam07501
G5 domain; This domain is found in a wide range of extracellular proteins. It is found ...
1069-1139 1.48e-19

G5 domain; This domain is found in a wide range of extracellular proteins. It is found tandemly repeated in up to 8 copies. It is found in the N-terminus of peptidases belonging to the M26 family which cleave human IgA. The domain is also found in proteins involved in metabolism of bacterial cell walls suggesting this domain may have an adhesive function.


Pssm-ID: 462185 [Multi-domain]  Cd Length: 75  Bit Score: 84.14  E-value: 1.48e-19
                           10        20        30        40        50        60        70
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|..
gi 1092545404 1069 ELVVKTESIPFKVIRKENPNLPAGQEKVVKAGVLGERTSYVSVLTENGK-ASETVLDSQVTKEPVDQIVEFG 1139
Cdd:pfam07501    2 KTVTEEEEIPFETVTKEDPSLPKGEEKVVQEGKPGEKEVTYKVTYVNGKeVSREVVSEEVTKEPVDEVVAVG 73
Glyco_hydro_20 pfam00728
Glycosyl hydrolase family 20, catalytic domain; This domain has a TIM barrel fold.
641-908 2.12e-14

Glycosyl hydrolase family 20, catalytic domain; This domain has a TIM barrel fold.


Pssm-ID: 425840  Cd Length: 344  Bit Score: 76.18  E-value: 2.12e-14
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404  641 IDAGRKYFSLDQLKRIVDKASELGYSDAHLLLGND-GLRFLLDDMTITANGKTYASDDVKKAIIEGtkaYYddpngtalT 719
Cdd:pfam00728    8 LDVARHFLPVDDIKRTIDAMAAYKLNVLHWHLTDDqGWRLEIKKYPKLTEKGAYRPSDLDGTPYGG---FY--------T 76
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404  720 QAEVTELAKYAKEKGIGLIPAINSPGHMDAMLVAMEKLGIANPQANFD-----KVSKTTMDLENQEALNFTKALIGKYMD 794
Cdd:pfam00728   77 QEDIREIVAYAAARGIRVIPEIDMPGHARAALAAYPELGCGCGADSPWvsvqwGPPEGQLNPGNEKTYTFLDNVFDEVAD 156
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404  795 YFadKSKIFNYGTDEYANDATNA----QGWY----YLKWYGLYNKFADYsnsLAAMAKERGLQPMAFNDGFyyeDKDDAE 866
Cdd:pfam00728  157 LF--PSDYIHIGGDEVPKGCWEKspecQARMkeegLKSLHELQQYFIKR---ASKIVSSKGRRLIGWDEIL---DGGVPL 228
                          250       260       270       280
                   ....*....|....*....|....*....|....*....|..
gi 1092545404  867 FDKDVLISYWsKGWWGYNLAspqyLASKGYKFLNTNGDWYYI 908
Cdd:pfam00728  229 LPKNTTVQSW-RGGDEAAQK----AAKQGYDVIMSPGDFLYL 265
Chb COG3525
N-acetyl-beta-hexosaminidase [Carbohydrate transport and metabolism];
257-474 7.04e-13

N-acetyl-beta-hexosaminidase [Carbohydrate transport and metabolism];


Pssm-ID: 442747 [Multi-domain]  Cd Length: 578  Bit Score: 72.97  E-value: 7.04e-13
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404  257 IENGTNAYYDDPNGNHLTESQMTDLINYAKDKGIGVIPTVNSPGHMDAMLHAMKELGIENPNFDYFGKK--SERTVDLNN 334
Cdd:COG3525    223 IGHDPQPFDGKPYGGFYTQEDIREIVAYAAARGITVIPEIDMPGHARAAIAAYPELGCTGKPYSVRSVWgvFDNVLNPGK 302
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404  335 KQAVDFTKTLIDKYANYFskKSEIFNIGLDEYANDA----------------TDAKGwsvLQAdkYypnegypekgyekF 398
Cdd:COG3525    303 ESTYTFLEDVLDEVAALF--PSPYIHIGGDEVPKGQwekspacqalmkelglKDEHE---LQS--Y-------------F 362
                          170       180       190       200       210       220       230
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*.
gi 1092545404  399 IsyaNDLARIVKSHGLKPMAFNDGIyynsdtsFGSFDKDIIVSMWTGGWGGYDvassklLAEKGHQILNTNDAWYY 474
Cdd:COG3525    363 I---RRVEKILASKGRKMIGWDEIL-------EGGLAPNATVMSWRGEDGGIE------AAKAGHDVVMSPGSYLY 422
Chb COG3525
N-acetyl-beta-hexosaminidase [Carbohydrate transport and metabolism];
719-908 2.03e-09

N-acetyl-beta-hexosaminidase [Carbohydrate transport and metabolism];


Pssm-ID: 442747 [Multi-domain]  Cd Length: 578  Bit Score: 61.80  E-value: 2.03e-09
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404  719 TQAEVTELAKYAKEKGIGLIPAINSPGHMDAMLVAMEKLGianpqaNFDK---------VSKTTMDLENQEALNFTKALI 789
Cdd:COG3525    240 TQEDIREIVAYAAARGITVIPEIDMPGHARAAIAAYPELG------CTGKpysvrsvwgVFDNVLNPGKESTYTFLEDVL 313
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404  790 GKYMDYFadKSKIFNYGTDEYANDA----------------TNA---QGWYYlkwyglynkfadysNSLAAMAKERGLQP 850
Cdd:COG3525    314 DEVAALF--PSPYIHIGGDEVPKGQwekspacqalmkelglKDEhelQSYFI--------------RRVEKILASKGRKM 377
                          170       180       190       200       210
                   ....*....|....*....|....*....|....*....|....*....|....*...
gi 1092545404  851 MAFNDGFyyedkdDAEFDKDVLISYWSkgwwgyNLASPQYLASKGYKFLNTNGDWYYI 908
Cdd:COG3525    378 IGWDEIL------EGGLAPNATVMSWR------GEDGGIEAAKAGHDVVMSPGSYLYF 423
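Each hit in this listing carries a one-line summary of the form "Pssm-ID: ... Cd Length: ... Bit Score: ... E-value: ...". The following is a minimal sketch of how such a line could be pulled apart for downstream filtering. It assumes the layout seen in this listing, including the optional "[Multi-domain]" tag; the names `SUMMARY_RE` and `parse_summary` are hypothetical and are not part of any NCBI tool.

```python
import re

# Pattern for the per-hit summary line as it appears in this listing.
# The "[Multi-domain]" tag is optional (it is absent for cd06564, for example).
SUMMARY_RE = re.compile(
    r"Pssm-ID:\s*(?P<pssm_id>\d+)"
    r"(?P<multi>\s*\[Multi-domain\])?"
    r"\s+Cd Length:\s*(?P<cd_length>\d+)"
    r"\s+Bit Score:\s*(?P<bit_score>[\d.]+)"
    r"\s+E-value:\s*(?P<e_value>\S+)"
)


def parse_summary(line):
    """Return the summary fields as a dict, or None if the line does not match."""
    m = SUMMARY_RE.search(line)
    if m is None:
        return None
    return {
        "pssm_id": int(m.group("pssm_id")),
        "multi_domain": m.group("multi") is not None,
        "cd_length": int(m.group("cd_length")),
        "bit_score": float(m.group("bit_score")),
        "e_value": float(m.group("e_value")),
    }


if __name__ == "__main__":
    line = "Pssm-ID: 442747 [Multi-domain]  Cd Length: 578  Bit Score: 61.80  E-value: 2.03e-09"
    print(parse_summary(line))
```

Parsed this way, hits can be sorted or thresholded by E-value without relying on column positions.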
PspC_subgroup_2 NF033839
pneumococcal surface protein PspC, LPXTG-anchored form; The pneumococcal surface protein PspC, ...
35-187 5.71e-08

pneumococcal surface protein PspC, LPXTG-anchored form; The pneumococcal surface protein PspC, as described in Streptococcus pneumoniae, is a repetitive and highly variable protein, recognized by a conserved N-terminal domain and also by genomic location. This form, subgroup 2, is anchored covalently after cleavage by sortase at a C-terminal LPXTG site. The other form, subgroup 1, has variable numbers of a choline-binding repeat in the C-terminal region, and is also known as choline-binding protein A.


Pssm-ID: 468202 [Multi-domain]  Cd Length: 557  Bit Score: 57.09  E-value: 5.71e-08
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404   35 DGVTPTTENQPTIHTV----------------SDSPQPSENRTEETPKAELQPE-----TPKTVETETPSTDKVASL--P 91
Cdd:NF033839   249 DNVNTKVEIENTVHKIfadmdavvtkfkkgltQDTPKEPGNKKPSAPKPGMQPSpqpekKEVKPEPETPKPEVKPQLekP 328
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404   92 KTEEKTQEEV----------SSTPSDKEEVVTPTSVEKEAADKKAEEASPKKEEQKEANSKESDTDKTD-KSEADKDKPA 160
Cdd:NF033839   329 KPEVKPQPEKpkpevkpqleTPKPEVKPQPEKPKPEVKPQPEKPKPEVKPQPETPKPEVKPQPEKPKPEvKPQPEKPKPE 408
                          170       180
                   ....*....|....*....|....*....
gi 1092545404  161 KKD--ETKAEADKPATEAGKERAATENEK 187
Cdd:NF033839   409 VKPqpEKPKPEVKPQPEKPKPEVKPQPEK 437
YSIRK_signal TIGR01168
Gram-positive signal peptide, YSIRK family; Many surface proteins found in Streptococcus, ...
1-24 2.03e-07

Gram-positive signal peptide, YSIRK family; Many surface proteins found in Streptococcus, Staphylococcus, and related lineages share apparently homologous signal sequences. A motif resembling [YF]SIRKxxxGxxS[VIA] appears at the start of the transmembrane domain. The GxxS motif appears perfectly conserved, suggesting a specific function and not just homology. There is a strong correlation between proteins carrying this region at the N-terminus and those carrying the Gram-positive anchor domain with the LPXTG sortase processing site at the C-terminus.


Pssm-ID: 273479 [Multi-domain]  Cd Length: 39  Bit Score: 48.25  E-value: 2.03e-07
                           10        20
                   ....*....|....*....|....
gi 1092545404    1 MKQEKQQRFSIRKYAVGAASVLIG 24
Cdd:TIGR01168    3 KFNEKQQKYSIRKLSVGVASVLVA 26
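The motif quoted in the description, [YF]SIRKxxxGxxS[VIA], can be checked against a sequence with a regular expression. The following is a minimal sketch that assumes each "x" stands for any single residue; the names `YSIRK_MOTIF` and `find_ysirk` are hypothetical. A pattern match of this kind is only a quick screen and does not replace the profile-based scoring behind the hit reported above.

```python
import re

# [YF]SIRKxxxGxxS[VIA], with each "x" read as any single residue (assumption).
YSIRK_MOTIF = re.compile(r"[YF]SIRK.{3}G.{2}S[VIA]")


def find_ysirk(seq):
    """Return the 1-based inclusive (start, end) of the first motif match, or None."""
    m = YSIRK_MOTIF.search(seq)
    if m is None:
        return None
    # m.start() is 0-based and m.end() is exclusive, so convert to 1-based inclusive.
    return (m.start() + 1, m.end())


if __name__ == "__main__":
    # Residues 1-24 of the query, as shown in the alignment above.
    query = "MKQEKQQRFSIRKYAVGAASVLIG"
    print(find_ysirk(query))
```

On the query's first 24 residues the pattern matches FSIRKYAVGAASV, which lies within the 1-24 interval reported for this hit.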
YSIRK_signal pfam04650
YSIRK type signal peptide; Many surface proteins found in Streptococcus, Staphylococcus, and ...
4-24 3.57e-07

YSIRK type signal peptide; Many surface proteins found in Streptococcus, Staphylococcus, and related lineages share apparently homologous signal sequences. A motif resembling [YF]SIRKxxxGxxS[VIA] appears at the start of the transmembrane domain. The GxxS motif appears perfectly conserved, suggesting a specific function and not just homology. There is a strong correlation between proteins carrying this region at the N-terminus and those carrying the Gram-positive anchor domain with the LPXTG sortase processing site at the C-terminus.


Pssm-ID: 428049 [Multi-domain]  Cd Length: 26  Bit Score: 47.38  E-value: 3.57e-07
                           10        20
                   ....*....|....*....|.
gi 1092545404    4 EKQQRFSIRKYAVGAASVLIG 24
Cdd:pfam04650    1 EKKQRYSIRKLSVGVASVLIG 21
DUF4045 pfam13254
Domain of unknown function (DUF4045); This presumed domain is functionally uncharacterized. ...
5-216 4.45e-07

Domain of unknown function (DUF4045); This presumed domain is functionally uncharacterized. This domain family is found in bacteria and eukaryotes, and is typically between 384 and 430 amino acids in length.


Pssm-ID: 433066 [Multi-domain]  Cd Length: 415  Bit Score: 54.02  E-value: 4.45e-07
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404    5 KQQRfsirkyavgaASVLIGFAFQAqavaaDGVTPTTENQPTiHTVSDSPQPSENrteeTPKAELQPETPK-TVETETPS 83
Cdd:pfam13254  186 RQSR----------ASVDLGRPNSF-----KEVTPVGLMRSP-APGGHSKSPSVS----GISADSSPTKEEpSEEADTLS 245
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404   84 TDKvASLPkteekTQEEVSSTPSDKEEVvtPTSVEKEAADKKAEEASPKKEEQKEANSKESDTDKTD------KSEADKD 157
Cdd:pfam13254  246 TDK-EQSP-----APTSASEPPPKTKEL--PKDSEEPAAPSKSAEASTEKKEPDTESSPETSSEKSApsllspVSKASID 317
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404  158 KPAKKDETKAEADKPAT--------------EAGKERAAT-ENE------KLAKRKIvsidagRKYFSPEQLKEIIDKAK 216
Cdd:pfam13254  318 KPLSSPDRDPLSPKPKPqsppkdfranlrsrEVPKDKSKKdEPEfknvfgKLRKAET------KNYVAPDELKDNILRGK 391
Agg_substance NF033875
LPXTG-anchored aggregation substance; Aggregation substances, as described in Enterococcus, ...
1-217 9.58e-07

LPXTG-anchored aggregation substance; Aggregation substances, as described in Enterococcus, are LPXTG-anchored large surface proteins that contribute to virulence. Several closely related paralogs may be found in a single strain.


Pssm-ID: 411439 [Multi-domain]  Cd Length: 1306  Bit Score: 53.56  E-value: 9.58e-07
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404    1 MKQ--EKQQRFSI---RKYAVGAASVLIGFAFQAQAVAADGVTPTTENQPTIHTVS-DSPQPSENRteETPKAELQPETp 74
Cdd:NF033875     1 MKQqtEVKKRFKMykaKKHWVVAPILFLGVLGVVGLATDNVQAAELDTQPGTTTVQpDNPDPQSGS--ETPKTAVSEEA- 77
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404   75 kTVETETPSTDKVASLPKTEEKTQEEVSSTPSDKEEVVTPT-SVEKEAADK-------------KAEEASPKKEEQKEAN 140
Cdd:NF033875    78 -TVQKDTTSQPTKVEEVASEKNGAEQSSATPNDTTNAQQPTvGAEKSAQEQpvvspettneplgQPTEVAPAENEANKST 156
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404  141 S--KESDTDKTDKSEADkdkpAKKDETKAEADKPATEAG----KERAATENEklakrkivsIDAGRKyfspEQLKEIIDK 214
Cdd:NF033875   157 SipKEFETPDVDKAVDE----AKKDPNITVVEKPAEDLGnvssKDLAAKEKE---------VDQLQK----EQAKKIAQQ 219

                   ...
gi 1092545404  215 AKE 217
Cdd:NF033875   220 AAE 222
PspC_subgroup_2 NF033839
pneumococcal surface protein PspC, LPXTG-anchored form; The pneumococcal surface protein PspC, ...
38-216 1.21e-06

pneumococcal surface protein PspC, LPXTG-anchored form; The pneumococcal surface protein PspC, as described in Streptococcus pneumoniae, is a repetitive and highly variable protein, recognized by a conserved N-terminal domain and also by genomic location. This form, subgroup 2, is anchored covalently after cleavage by sortase at a C-terminal LPXTG site. The other form, subgroup 1, has variable numbers of a choline-binding repeat in the C-terminal region, and is also known as choline-binding protein A.


Pssm-ID: 468202 [Multi-domain]  Cd Length: 557  Bit Score: 52.85  E-value: 1.21e-06
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404   38 TPTTENQPTIHTvsdsPQPSENRTEETPKAEL--QPETPKtvetetPSTDKVASLPKTEEKTQEEvSSTPSDKEEVVTPT 115
Cdd:NF033839   316 TPKPEVKPQLEK----PKPEVKPQPEKPKPEVkpQLETPK------PEVKPQPEKPKPEVKPQPE-KPKPEVKPQPETPK 384
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404  116 SVEKEAADKKAEEASPKKEEQKEANSKESDTDKTD-KSEADKDKPAKKD--ETKAEADKPATEAGKERAATENEKLAKRK 192
Cdd:NF033839   385 PEVKPQPEKPKPEVKPQPEKPKPEVKPQPEKPKPEvKPQPEKPKPEVKPqpEKPKPEVKPQPEKPKPEVKPQPETPKPEV 464
                          170       180
                   ....*....|....*....|....
gi 1092545404  193 IVSIDAGRKYFSPEQLKEIIDKAK 216
Cdd:NF033839   465 KPQPEKPKPEVKPQPEKPKPDNSK 488
PTZ00121 PTZ00121
MAEBL; Provisional
58-217 1.29e-06

MAEBL; Provisional


Pssm-ID: 173412 [Multi-domain]  Cd Length: 2084  Bit Score: 53.22  E-value: 1.29e-06
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404   58 ENRTEETPKaelQPETPKTVETETPSTDKVASlpKTEEKTQEEVSSTPSDKEEVVTPTSVEK---EAADKKAEEASPKKE 134
Cdd:PTZ00121  1321 KKKAEEAKK---KADAAKKKAEEAKKAAEAAK--AEAEAAADEAEAAEEKAEAAEKKKEEAKkkaDAAKKKAEEKKKADE 1395
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404  135 EQKEANSKESDTDKTDKSEADKDKpAKKDETKAEADKPATEAGKE----RAATENEKLAKRKIVSIDAGRKYFSPEQLKE 210
Cdd:PTZ00121  1396 AKKKAEEDKKKADELKKAAAAKKK-ADEAKKKAEEKKKADEAKKKaeeaKKADEAKKKAEEAKKAEEAKKKAEEAKKADE 1474

                   ....*..
gi 1092545404  211 IIDKAKE 217
Cdd:PTZ00121  1475 AKKKAEE 1481
PspC_subgroup_2 NF033839
pneumococcal surface protein PspC, LPXTG-anchored form; The pneumococcal surface protein PspC, ...
38-172 2.10e-06

pneumococcal surface protein PspC, LPXTG-anchored form; The pneumococcal surface protein PspC, as described in Streptococcus pneumoniae, is a repetitive and highly variable protein, recognized by a conserved N-terminal domain and also by genomic location. This form, subgroup 2, is anchored covalently after cleavage by sortase at a C-terminal LPXTG site. The other form, subgroup 1, has variable numbers of a choline-binding repeat in the C-terminal region, and is also known as choline-binding protein A.


Pssm-ID: 468202 [Multi-domain]  Cd Length: 557  Bit Score: 52.08  E-value: 2.10e-06
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404   38 TPTTENQPTIHTvsdsPQPSENRTEETPKAEL--QPETPK---TVETETPSTDKVASL--PKTEEKTQEEvSSTPSDKEE 110
Cdd:NF033839   382 TPKPEVKPQPEK----PKPEVKPQPEKPKPEVkpQPEKPKpevKPQPEKPKPEVKPQPekPKPEVKPQPE-KPKPEVKPQ 456
                           90       100       110       120       130       140
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*.
gi 1092545404  111 VVTPTSVEKEAADKKAEEASPKKEEQKEANSKESDTDK--TDKSEADKDK-PAKKDETKAEA-DKP 172
Cdd:NF033839   457 PETPKPEVKPQPEKPKPEVKPQPEKPKPDNSKPQADDKkpSTPNNLSKDKqPSNQASTNEKAtNKP 522
glucan_65_rpt TIGR04035
glucan-binding repeat; This model describes a region of about 63 amino acids that is composed ...
1234-1284 4.69e-06

glucan-binding repeat; This model describes a region of about 63 amino acids that is composed of three repeats of a more broadly distributed family of shorter repeats modeled by pfam01473. While the shorter repeats are often associated with choline binding (and therefore with cell wall binding), the longer repeat described here represents a subgroup of repeat sequences associated with glucan binding, as found in a number of glycosylhydrolases. Shah, et al. describe a repeat consensus, WYYFDANGKAVTGAQTINGQTLYFDQDGKQVKG, that corresponds to half of the repeat as modeled here and one and a half copies of the repeat as modeled by pfam01473.


Pssm-ID: 274933 [Multi-domain]  Cd Length: 62  Bit Score: 45.20  E-value: 4.69e-06
                           10        20        30        40        50
                   ....*....|....*....|....*....|....*....|....*....|...
gi 1092545404 1234 YYLDESGVMKTGWQKVNGTWYYLDNSGAMQTGWI--DQGGSWYYLNDSGAMQT 1284
Cdd:TIGR04035    1 YYFDADGKAVTGAQTIDGVTYYFDENGKQVKGDFvtNGGGTYYYDKDSGALVT 53
MSCRAMM_ClfA NF033609
MSCRAMM family adhesin clumping factor ClfA; Clumping factor A is an MSCRAMM (Microbial ...
1-181 9.62e-06

MSCRAMM family adhesin clumping factor ClfA; Clumping factor A is an MSCRAMM (Microbial Surface Components Recognizing Adhesive Matrix Molecules). It is heavily studied in Staphylococcus aureus both for its biological role in adhesion and for its potential for vaccination. Features of the sequence, but also of other MSCRAMM adhesins, include a long run of Ser-Asp dipeptide repeats and a C-terminal cell wall anchoring LPXTG motif.


Pssm-ID: 468110 [Multi-domain]  Cd Length: 934  Bit Score: 50.29  E-value: 9.62e-06
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404    1 MKQEKQQRFSIRKYAVGAASVLIGF-------AFQAQAVAADGVT--------------------PTTENQptihTVSDS 53
Cdd:NF033609     1 MNMKKKEKHAIRKKSIGVASVLVGTligfgllSSKEADASENSVTqsdsasnesksndsssvsaaPKTDDT----NVSDT 76
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404   54 PQPSENRTEETPKAE--LQPETPKTVET-----ETPSTDKVASLPKTEEKTQEEVSSTPSDKEEVVTPTSVEKEAADKKA 126
Cdd:NF033609    77 KTSSNTNNGETSVAQnpAQQETTQSASTnatteETPVTGEATTTATNQANTPATTQSSNTNAEELVNQTSNETTSNDTNT 156
                          170       180       190       200       210
                   ....*....|....*....|....*....|....*....|....*....|....*....
gi 1092545404  127 EEA--SPKKEEQKEANSKESDTDKTDKSEADKDKPAKKDETKAEADKPA--TEAGKERA 181
Cdd:NF033609   157 VSSvnSPQNSTNAENVSTTQDTSTEATPSNNESAPQSTDASNKDVVNQAvnTSAPRMRA 215
PspC_subgroup_1 NF033838
pneumococcal surface protein PspC, choline-binding form; The pneumococcal surface protein PspC, ...
2-218 6.44e-04

pneumococcal surface protein PspC, choline-binding form; The pneumococcal surface protein PspC, as described in Streptococcus pneumoniae, is a repetitive and highly variable protein, recognized by a conserved N-terminal domain and also by genomic location. This form, subgroup 1, has variable numbers of a choline-binding repeat in the C-terminal region, and is also known as choline-binding protein A. The other form, subgroup 2, is anchored covalently after cleavage by sortase at a C-terminal LPXTG site.


Pssm-ID: 468201 [Multi-domain]  Cd Length: 684  Bit Score: 44.23  E-value: 6.44e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404    2 KQEKQQRFSIRKYAVGAASVLIGFAFQAqavaadGVTPTTENQPT-IHTVSDSPQPSENRTEETPKA-------ELQPET 73
Cdd:NF033838     5 KSERKVHYSIRKFSIGVASVVVASLFLG------GVVHAEEVRGGnNPTVTSSGNESQKEHAKEVEShlekilsEIQKSL 78
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404   74 PKTVETETPSTDKVASLPKTE---------EKTQEEVSSTPSD---------KEEVVTPTSVEKEaADKKAEEASPKKEE 135
Cdd:NF033838    79 DKRKHTQNVALNKKLSDIKTEylyelnvlkEKSEAELTSKTKKeldaafeqfKKDTLEPGKKVAE-ATKKVEEAEKKAKD 157
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404  136 QKEANSKESDTD-----KTDKSEAD-KDKPAKKDETKAEADKPATEA--GKERAATENEKLAKRKIVSIDAGRKYFSPEQ 207
Cdd:NF033838   158 QKEEDRRNYPTNtyktlELEIAESDvEVKKAELELVKEEAKEPRDEEkiKQAKAKVESKKAEATRLEKIKTDREKAEEEA 237
                          250
                   ....*....|.
gi 1092545404  208 LKEIIDKAKEY 218
Cdd:NF033838   238 KRRADAKLKEA 248
tolA_full TIGR02794
TolA protein; TolA couples the inner membrane complex of itself with TolQ and TolR to the ...
119-201 1.39e-03

TolA protein; TolA couples the inner membrane complex of itself with TolQ and TolR to the outer membrane complex of TolB and OprL (also called Pal). Most of the length of the protein consists of low-complexity sequence that may differ in both length and composition from one species to another, complicating efforts to discriminate TolA (the most divergent gene in the tol-pal system) from paralogs such as TonB. Selection of members of the seed alignment and criteria for setting scoring cutoffs are based largely on conserved operon structure. The Tol-Pal complex is required for maintaining outer membrane integrity. It is also involved in transport (uptake) of colicins and filamentous DNA, and is implicated in pathogenesis. Transport is energized by the proton motive force. TolA is an inner membrane protein that interacts with periplasmic TolB and with outer membrane porins ompC, phoE and lamB. [Transport and binding proteins, Other, Cellular processes, Pathogenesis]


Pssm-ID: 274303 [Multi-domain]  Cd Length: 346  Bit Score: 42.53  E-value: 1.39e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404  119 KEAADKKAEEASPKK---EEQKEANSKESDTDKTDKSEADKDKPAKKDETKAEAdkpatEAGKERAATENEKLA---KRK 192
Cdd:TIGR02794  145 KEEAAKQAEEEAKAKaaaEAKKKAEEAKKKAEAEAKAKAEAEAKAKAEEAKAKA-----EAAKAKAAAEAAAKAeaeAAA 219

                   ....*....
gi 1092545404  193 IVSIDAGRK 201
Cdd:TIGR02794  220 AAAAEAERK 228
PspC_subgroup_2 NF033839
pneumococcal surface protein PspC, LPXTG-anchored form; The pneumococcal surface protein PspC, ...
2-186 4.66e-03

pneumococcal surface protein PspC, LPXTG-anchored form; The pneumococcal surface protein PspC, as described in Streptococcus pneumoniae, is a repetitive and highly variable protein, recognized by a conserved N-terminal domain and also by genomic location. This form, subgroup 2, is anchored covalently after cleavage by sortase at a C-terminal LPXTG site. The other form, subgroup 1, has variable numbers of a choline-binding repeat in the C-terminal region, and is also known as choline-binding protein A.


Pssm-ID: 468202 [Multi-domain]  Cd Length: 557  Bit Score: 41.29  E-value: 4.66e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404    2 KQEKQQRFSIRKYAVGAASVLIGFAFQAQavaadgVTPTTENQPTIHTVSDSPQPSENRTEETPKAELQPETPKTVETET 81
Cdd:NF033839     5 NHERKMRYSIRKFSIGVASVAVASLFMGS------VVHATEKEGSTQAATSSNRGNESQAEQRKELDLERDKAKKAVSEY 78
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404   82 pSTDKVASLPKTEEKTQEevsstpSDKEEVVTPTSVEKEAADKKAEEASPKKEEQKEANSKESDTDK----------TDK 151
Cdd:NF033839    79 -KEKKVKEIYKKSTKERH------KNTVDLVNKLQNIKNEYLNKIVESTSKSQLQKLMMESQSKVDEavskfekdssSSS 151
                          170       180       190
                   ....*....|....*....|....*....|....*
gi 1092545404  152 SEADKDKPAKKDETKAEADKPATEAGKERAATENE 186
Cdd:NF033839   152 SSGSSTKPETPQPENPEHQKPTTPAPDTKPSPQPE 186
PspC_subgroup_1 NF033838
pneumococcal surface protein PspC, choline-binding form; The pneumococcal surface protein PspC, ...
57-217 9.00e-03

pneumococcal surface protein PspC, choline-binding form; The pneumococcal surface protein PspC, as described in Streptococcus pneumoniae, is a repetitive and highly variable protein, recognized by a conserved N-terminal domain and also by genomic location. This form, subgroup 1, has variable numbers of a choline-binding repeat in the C-terminal region, and is also known as choline-binding protein A. The other form, subgroup 2, is anchored covalently after cleavage by sortase at a C-terminal LPXTG site.


Pssm-ID: 468201 [Multi-domain]  Cd Length: 684  Bit Score: 40.38  E-value: 9.00e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404   57 SENRTEETPKAELQPETPKTVETETPSTDK------VASLPKTEEKTQEEV-SSTPSDKEEVVTPTSVEKEA----ADKK 125
Cdd:NF033838   233 AEEEAKRRADAKLKEAVEKNVATSEQDKPKrrakrgVLGEPATPDKKENDAkSSDSSVGEETLPSPSLKPEKkvaeAEKK 312
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404  126 AEEASPKKEEQKEANSKE--SDTDKT---DKSEAD-KDKPAKKDETKAEADKPATEagkeraatENEKLAKRKIVSIDAG 199
Cdd:NF033838   313 VEEAKKKAKDQKEEDRRNypTNTYKTlelEIAESDvKVKEAELELVKEEAKEPRNE--------EKIKQAKAKVESKKAE 384
                          170
                   ....*....|....*...
gi 1092545404  200 RKYFspEQLKEIIDKAKE 217
Cdd:NF033838   385 ATRL--EKIKTDRKKAEE 400
 
Name Accession Description Interval E-value
GH20_DspB_LnbB-like cd06564
Glycosyl hydrolase family 20 (GH20) catalytic domain of dispersin B (DspB), lacto-N-biosidase ...
190-548 3.28e-102

Glycosyl hydrolase family 20 (GH20) catalytic domain of dispersin B (DspB), lacto-N-biosidase (LnbB) and related proteins. Dispersin B is a soluble beta-N-acetylglucosamidase found in bacteria that hydrolyzes the beta-1,6-linkages of PGA (poly-beta-(1,6)-N-acetylglucosamine), a major component of the extracellular polysaccharide matrix. Lacto-N-biosidase hydrolyzes lacto-N-biose (LNB) type I oligosaccharides at the nonreducing terminus to produce lacto-N-biose as part of the GNB/LNB (galacto-N-biose/lacto-N-biose I) degradation pathway. The lacto-N-biosidase from Bifidobacterium bifidum has this GH20 domain, a carbohydrate binding module 32, and a bacterial immunoglobulin-like domain 2, as well as a YSIRK signal peptide and a G5 membrane anchor at the N and C termini, respectively. The GH20 hexosaminidases are thought to act via a catalytic mechanism in which the catalytic nucleophile is not provided by solvent or the enzyme, but by the substrate itself.


Pssm-ID: 119334  Cd Length: 326  Bit Score: 328.48  E-value: 3.28e-102
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404  190 KRKIVSIDAGRKYFSPEQLKEIIDKAKEYGYTDLHLLVgNDGLRFMLDDMSMKVGDKTYSSDDVKRaienGTNAYYDDPN 269
Cdd:cd06564      1 EVRGFMLDVGRKYYSMDFLKDIIKTMSWYKMNDLQLHL-NDNLIFNLDDMSTTVNNATYASDDVKS----GNNYYNLTAN 75
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404  270 GNHLTESQMTDLINYAKDKGIGVIPTVNSPGHMDAMLHAMKELGIENPNFDYfgkkSERTVDLNNKQAVDFTKTLIDKYA 349
Cdd:cd06564     76 DGYYTKEEFKELIAYAKDRGVNIIPEIDSPGHSLAFTKAMPELGLKNPFSKY----DKDTLDISNPEAVKFVKALFDEYL 151
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404  350 NYFSKKSEIFNIGLDEYANDATDakgwsvlqadkyypnegypekgYEKFISYANDLARIVKSHGLKPMAFNDGIYYNSDT 429
Cdd:cd06564    152 DGFNPKSDTVHIGADEYAGDAGY----------------------AEAFRAYVNDLAKYVKDKGKTPRVWGDGIYYKGDT 209
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404  430 sfGSFDKDIIVSMWTGGWggydvASSKLLAEKGHQILNTNDAWYY-VLGRNADGQGWYNLDQGLNGIKNTPITSVPKTEG 508
Cdd:cd06564    210 --TVLSKDVIINYWSYGW-----ADPKELLNKGYKIINTNDGYLYiVPGAGYYGDYLNTEDIYNNWTPNKFGGTNATLPE 282
                          330       340       350       360
                   ....*....|....*....|....*....|....*....|....
gi 1092545404  509 ADVPIIGGMVAAWADTPSARYSPS----RLFKLMRHFANANAEY 548
Cdd:cd06564    283 GDPQILGGMFAIWNDDSDAGISEVdiydRIFPALPAFAEKTWGG 326
GH20_DspB_LnbB-like cd06564
Glycosyl hydrolase family 20 (GH20) catalytic domain of dispersin B (DspB), lacto-N-biosidase ...
635-982 4.65e-101

Glycosyl hydrolase family 20 (GH20) catalytic domain of dispersin B (DspB), lacto-N-biosidase (LnbB) and related proteins. Dispersin B is a soluble beta-N-acetylglucosamidase found in bacteria that hydrolyzes the beta-1,6-linkages of PGA (poly-beta-(1,6)-N-acetylglucosamine), a major component of the extracellular polysaccharide matrix. Lacto-N-biosidase hydrolyzes lacto-N-biose (LNB) type I oligosaccharides at the nonreducing terminus to produce lacto-N-biose as part of the GNB/LNB (galacto-N-biose/lacto-N-biose I) degradation pathway. The lacto-N-biosidase from Bifidobacterium bifidum has this GH20 domain, a carbohydrate binding module 32, and a bacterial immunoglobulin-like domain 2, as well as a YSIRK signal peptide and a G5 membrane anchor at the N and C termini, respectively. The GH20 hexosaminidases are thought to act via a catalytic mechanism in which the catalytic nucleophile is not provided by solvent or the enzyme, but by the substrate itself.


Pssm-ID: 119334  Cd Length: 326  Bit Score: 325.01  E-value: 4.65e-101
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404  635 KNKVISIDAGRKYFSLDQLKRIVDKASELGYSDAHLLLgNDGLRFLLDDMTITANGKTYASDDVKkaiiEGTKAYYDDPN 714
Cdd:cd06564      1 EVRGFMLDVGRKYYSMDFLKDIIKTMSWYKMNDLQLHL-NDNLIFNLDDMSTTVNNATYASDDVK----SGNNYYNLTAN 75
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404  715 GTALTQAEVTELAKYAKEKGIGLIPAINSPGHMDAMLVAMEKLGIANPqanFDKVSKTTMDLENQEALNFTKALIGKYMD 794
Cdd:cd06564     76 DGYYTKEEFKELIAYAKDRGVNIIPEIDSPGHSLAFTKAMPELGLKNP---FSKYDKDTLDISNPEAVKFVKALFDEYLD 152
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404  795 YFADKSKIFNYGTDEYANDATNaqgwyylkwyglYNKFADYSNSLAAMAKERGLQPMAFNDGFYYEDkDDAEFDKDVLIS 874
Cdd:cd06564    153 GFNPKSDTVHIGADEYAGDAGY------------AEAFRAYVNDLAKYVKDKGKTPRVWGDGIYYKG-DTTVLSKDVIIN 219
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404  875 YWSKGWwgynlASPQYLASKGYKFLNTNGDWYYILGQKPEDGGGF-LKKAIENTGKTPFNQlASTKYPEVDLPTVGSMLS 953
Cdd:cd06564    220 YWSYGW-----ADPKELLNKGYKIINTNDGYLYIVPGAGYYGDYLnTEDIYNNWTPNKFGG-TNATLPEGDPQILGGMFA 293
                          330       340       350
                   ....*....|....*....|....*....|...
gi 1092545404  954 IWADRPSAEYKEEEIFE----LMTAFADHNKDY 982
Cdd:cd06564    294 IWNDDSDAGISEVDIYDrifpALPAFAEKTWGG 326
PspC_relate_1 NF033840
PspC-related protein choline-binding protein 1; Members of this family share C-terminal ...
1066-1341 3.63e-58

PspC-related protein choline-binding protein 1; Members of this family share C-terminal homology to the choline-binding form of the pneumococcal surface antigen PspC, but not to its allelic LPXTG-anchored forms because they lack the choline-binding repeat region. Members of this family should not be confused with PspC itself, whose identity and function reflect regions N-terminal to the choline-binding region. See Iannelli, et al. (PMID: 11891047) for information about the different allelic forms of PspC.


Pssm-ID: 411409 [Multi-domain]  Cd Length: 648  Bit Score: 213.41  E-value: 3.63e-58
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404 1066 TKPELVVktESIPFKVIRKENPNLPAGQEkVVKAGVLGE--RTSYVSVLTENGKASETVlDSQVTKEPVdqivefgapit 1143
Cdd:NF033840   397 TKPQVLV--QVIPIETEYLDDPTLDKGQE-VEEAGEIGEitLTTIYTVDERDGTIEETT-SRQITKEMV----------- 461
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404 1144 hvgdenglapiaeeKPRLDIPKEEPSRTETPFKVVVESGP-KAESNSANGITHQLADGKNSKPGWKLIESKWYYYDHADK 1222
Cdd:NF033840   462 --------------KRRIRRGTREPEKVVVPKKSSIPSYPvSVTSNQGTDAAVEPAKPVAPTTGWKQENGMWYFYNTDGS 527
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404 1223 VKTGWVK-DGAWYYLDESGVMKTGWQKVNGTWYYLDNSGAMQTGWIDQGGSWYYLNDSGAMQTGWVNQGDTWYYLDNSGT 1301
Cdd:NF033840   528 MATGWVQvNGSWYYLNSNGSMATGWVQVNGSWYYLNSNGSMATGWVQVDGSWYYLNDNGSMETGWLQNNGSWYYLNSNGS 607
                          250       260       270       280
                   ....*....|....*....|....*....|....*....|.
gi 1092545404 1302 MKTG-WFQVEDKWYYSYPSGALAVNTTIDGYAVNADGEWIQ 1341
Cdd:NF033840   608 MKANqWFQVGSKWYYVNASGELAVNTSIDGYRVNDNGEWVR 648
COG5263 COG5263
Glucan-binding domain (YG repeat) [Carbohydrate transport and metabolism];
1206-1341 4.11e-52

Glucan-binding domain (YG repeat) [Carbohydrate transport and metabolism];


Pssm-ID: 444077 [Multi-domain]  Cd Length: 486  Bit Score: 191.62  E-value: 4.11e-52
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404 1206 GWKLIESKWYYYDHADKVKTGWVK-DGAWYYLDESGVMKTGWQKVNGTWYYLDNSGAMQTGWIDQGGSWYYLNDSGAMQT 1284
Cdd:COG5263    347 GWVTDDGKWYYLGSDGAMATGWQKiDGKWYYFDSNGAMATGWVKVDGKWYYFDSSGAMATGWLKIDGKWYYFDSDGAMAT 426
                           90       100       110       120       130       140
                   ....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404 1285 GWVNQGDTWYYLDNSGTMKTGWFQVEDKWYYSYPSGALAVNT-TIDG--YAVNADGEWIQ 1341
Cdd:COG5263    427 GWQKIGGKWYYFDSNGAMATGWVKVDGKWYYFDSDGAMATGWqTIDGktYYFDSNGAWVG 486
pneumo_PspA NF033930
pneumococcal surface protein A; The pneumococcal surface proteins, found in ...
1206-1340 4.98e-49

pneumococcal surface protein A; The pneumococcal surface proteins, found in Streptococcus pneumoniae, are repetitive, with patterns of localized high sequence identity across pairs of proteins given different specific names, such that recombination may be presumed. This protein, PspA, has an N-terminal region that lacks a cross-wall-targeting YSIRK type extended signal peptide, in contrast to the closely related choline-binding protein CbpA, which has a similar C-terminus but a YSIRK-containing region at the N-terminus.


Pssm-ID: 468251 [Multi-domain]  Cd Length: 660  Bit Score: 186.27  E-value: 4.98e-49
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404 1206 GWKLIESKWYYYDHADKVKTGWVKD-GAWYYLDESGVMKTGWQKVNGTWYYLDNSGAMQTGWIDQGGSWYYLNDSGAMQT 1284
Cdd:NF033930   504 GWAKVNGSWYYLNANGAMATGWLQYnGSWYYLNANGAMATGWLKYNGSWYYLNANGAMATGWLQYNGSWYYLNANGAMAT 583
                           90       100       110       120       130       140       150
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*..
gi 1092545404 1285 G--------------------WVNQGDTWYYLDNSGTMKTG-WFQVEDKWYYSYPSGALAVNTTIDGYAVNADGEWI 1340
Cdd:NF033930   584 GwakvngswyylnangsmatgWVKDGDTWYYLEASGAMKASqWFKVSDKWYYVNGLGALAVNTTVDGYTVNANGEWV 660
PspC_subgroup_1 NF033838
pneumococcal surface protein PspC, choline-binding form; The pneumococcal surface protein PspC, ...
1148-1340 1.13e-47

pneumococcal surface protein PspC, choline-binding form; The pneumococcal surface protein PspC, as described in Streptococcus pneumoniae, is a repetitive and highly variable protein, recognized by a conserved N-terminal domain and also by genomic location. This form, subgroup 1, has variable numbers of a choline-binding repeat in the C-terminal region, and is also known as choline-binding protein A. The other form, subgroup 2, is anchored covalently after cleavage by sortase at a C-terminal LPXTG site.


Pssm-ID: 468201 [Multi-domain]  Cd Length: 684  Bit Score: 182.91  E-value: 1.13e-47
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404 1148 ENGLAPIAEEKPRLdIPKEEPSRTETPFKvvvESGPKAESNSANGITHQLADGKNSKPGWKLIESKWYYYDHADKVKTGW 1227
Cdd:NF033838   453 EEDYARRSEEEYNR-LTQQQPPKTEKPAQ---PSTPKTGWKQENGMWYFYNTDGSMATGWLQNNGSWYYLNANGAMATGW 528
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404 1228 VK-DGAWYYLDESGVMKTGWQKVNGTWYYLDNSGAMQTGWIDQGGSWYYLNDSGAMQTG--------------------- 1285
Cdd:NF033838   529 LQnNGSWYYLNANGSMATGWLQNNGSWYYLNANGAMATGWLQYNGSWYYLNANGDMATGwlqyngswyylnangdmatgw 608
                          170       180       190       200       210       220       230
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*
gi 1092545404 1286 -------------------WVNQGDTWYYLDNSGTMKTG-WFQVEDKWYYSYPSGALAVNTTIDGYAVNADGEWI 1340
Cdd:NF033838   609 lqyngswyylnangsmatgWVKDGDTWYYLEASGAMKASqWFKVSDKWYYVNGSGALAVNTTVDGYGVNANGEWV 683
COG5263 COG5263
Glucan-binding domain (YG repeat) [Carbohydrate transport and metabolism];
1169-1323 5.94e-40


Pssm-ID: 444077 [Multi-domain]  Cd Length: 486  Bit Score: 155.80  E-value: 5.94e-40
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404 1169 SRTETPFKVVVESGPKAESNSANGITHQLADGKNSKPGWK------LIESKWYYYDhADKVKTGWVK-DGAWYYLDESGV 1241
Cdd:COG5263    245 LSSLGGSSNALESGGENNQSLAGNGTSYDDAGAAGVDGTGttgtvgWVDGKWYYFD-AGKMVTGWQTiNGKWYYFDSDGA 323
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404 1242 MKTGWQKVNGTWYYLDNSGAMQTGWIDQGGSWYYLNDSGAMQTGWVNQGDTWYYLDNSGTMKTGWFQVEDKWYYSYPSGA 1321
Cdd:COG5263    324 MATGWQKINGKWYYFDEDGAMATGWVTDDGKWYYLGSDGAMATGWQKIDGKWYYFDSNGAMATGWVKVDGKWYYFDSSGA 403

                   ..
gi 1092545404 1322 LA 1323
Cdd:COG5263    404 MA 405
GH20_hexosaminidase cd02742
Beta-N-acetylhexosaminidases of glycosyl hydrolase family 20 (GH20) catalyze the removal of ...
196-524 3.78e-24

Beta-N-acetylhexosaminidases of glycosyl hydrolase family 20 (GH20) catalyze the removal of beta-1,4-linked N-acetyl-D-hexosamine residues from the non-reducing ends of N-acetyl-beta-D-hexosaminides including N-acetylglucosides and N-acetylgalactosides. These enzymes are broadly distributed in microorganisms, plants and animals, and play roles in various key physiological and pathological processes. These processes include cell structural integrity, energy storage, cellular signaling, fertilization, pathogen defense, viral penetration, the development of carcinomas, inflammatory events and lysosomal storage disorders. The GH20 enzymes include the eukaryotic beta-N-acetylhexosaminidases A and B, the bacterial chitobiases, dispersin B, and lacto-N-biosidase. The GH20 hexosaminidases are thought to act via a catalytic mechanism in which the catalytic nucleophile is not provided by the solvent or the enzyme, but by the substrate itself.


Pssm-ID: 119331 [Multi-domain]  Cd Length: 303  Bit Score: 104.44  E-value: 3.78e-24
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404  196 IDAGRKYFSPEQLKEIIDKAKEYGYTDLHLlvgndglrFMLDDMSMkvgdkTYSSDDVKRAIENGTNAYYDDPNGnHLTE 275
Cdd:cd02742      6 LDVSRHFLSVESIKRTIDVLARYKINTFHW--------HLTDDQAW-----RIESKKFPELAEKGGQINPRSPGG-FYTY 71
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404  276 SQMTDLINYAKDKGIGVIPTVNSPGHMDAMLHAMKELGIENPNFDYFgKKSERTVDLNNKQAVDFTKTLIDKYANYFSkk 355
Cdd:cd02742     72 AQLKDIIEYAAARGIEVIPEIDMPGHSTAFVKSFPKLLTECYAGLKL-RDVFDPLDPTLPKGYDFLDDLFGEIAELFP-- 148
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404  356 SEIFNIGLDEyANDATDAKgwsvlqadkyypnegypekgyEKFISYANDLARIVKSHGLKPMAFNDGIYYNSdtsfgSFD 435
Cdd:cd02742    149 DRYLHIGGDE-AHFKQDRK---------------------HLMSQFIQRVLDIVKKKGKKVIVWQDGFDKKM-----KLK 201
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404  436 KDIIVSMWTGGWGGYDVaSSKLLAEKGHQILNTNDAWYYVLGrnADGQGWYNLdqglngIKNTPITsVPKTEGADVpIIG 515
Cdd:cd02742    202 EDVIVQYWDYDGDKYNV-ELPEAAAKGFPVILSNGYYLDIFI--DGALDARKV------YKNDPLA-VPTPQQKDL-VLG 270

                   ....*....
gi 1092545404  516 GMVAAWADT 524
Cdd:cd02742    271 VIACLWGET 279
GH20_hexosaminidase cd02742
Beta-N-acetylhexosaminidases of glycosyl hydrolase family 20 (GH20) catalyze the removal of ...
639-908 1.07e-23

Beta-N-acetylhexosaminidases of glycosyl hydrolase family 20 (GH20) catalyze the removal of beta-1,4-linked N-acetyl-D-hexosamine residues from the non-reducing ends of N-acetyl-beta-D-hexosaminides including N-acetylglucosides and N-acetylgalactosides. These enzymes are broadly distributed in microorganisms, plants and animals, and play roles in various key physiological and pathological processes. These processes include cell structural integrity, energy storage, cellular signaling, fertilization, pathogen defense, viral penetration, the development of carcinomas, inflammatory events and lysosomal storage disorders. The GH20 enzymes include the eukaryotic beta-N-acetylhexosaminidases A and B, the bacterial chitobiases, dispersin B, and lacto-N-biosidase. The GH20 hexosaminidases are thought to act via a catalytic mechanism in which the catalytic nucleophile is not provided by the solvent or the enzyme, but by the substrate itself.


Pssm-ID: 119331 [Multi-domain]  Cd Length: 303  Bit Score: 103.28  E-value: 1.07e-23
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404  639 ISIDAGRKYFSLDQLKRIVDKASELGYSDAHLLLGNDglrfllddmtitaNGKTYASDDVKKAIIEGTKAYYDDPNGTaL 718
Cdd:cd02742      4 IMLDVSRHFLSVESIKRTIDVLARYKINTFHWHLTDD-------------QAWRIESKKFPELAEKGGQINPRSPGGF-Y 69
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404  719 TQAEVTELAKYAKEKGIGLIPAINSPGHMDAMLVAMEKLGIANPQANFDKVSKTTMDLENQEALNFTKALIGKYMDYFAD 798
Cdd:cd02742     70 TYAQLKDIIEYAAARGIEVIPEIDMPGHSTAFVKSFPKLLTECYAGLKLRDVFDPLDPTLPKGYDFLDDLFGEIAELFPD 149
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404  799 KSkiFNYGTDEYANDATNAQgwyylkwyglynKFADYSNSLAAMAKERGLQPMAFNDGFyyedKDDAEFDKDVLISYWSK 878
Cdd:cd02742    150 RY--LHIGGDEAHFKQDRKH------------LMSQFIQRVLDIVKKKGKKVIVWQDGF----DKKMKLKEDVIVQYWDY 211
                          250       260       270
                   ....*....|....*....|....*....|
gi 1092545404  879 GWWGYNlASPQYLASKGYKFLNTNGDWYYI 908
Cdd:cd02742    212 DGDKYN-VELPEAAAKGFPVILSNGYYLDI 240
COG5263 COG5263
Glucan-binding domain (YG repeat) [Carbohydrate transport and metabolism];
1185-1338 7.08e-23


Pssm-ID: 444077 [Multi-domain]  Cd Length: 486  Bit Score: 104.18  E-value: 7.08e-23
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404 1185 AESNSANGITHQLADGKNSKPGWKLIESKWYYYDHADKVKTGWVKDGAWYYLDESGVMKTGWQK-------VNGTWYYLD 1257
Cdd:COG5263    221 KKTGSTAGASGTAYGDSGGTAGSGLSSLGGSSNALESGGENNQSLAGNGTSYDDAGAAGVDGTGttgtvgwVDGKWYYFD 300
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404 1258 NsGAMQTGWIDQGGSWYYLNDSGAMQTGWVNQGDTWYYLDNSGTMKTGWFQVEDKWYYSYPSGALAVN-TTIDG--YAVN 1334
Cdd:COG5263    301 A-GKMVTGWQTINGKWYYFDSDGAMATGWQKINGKWYYFDEDGAMATGWVTDDGKWYYLGSDGAMATGwQKIDGkwYYFD 379

                   ....
gi 1092545404 1335 ADGE 1338
Cdd:COG5263    380 SNGA 383
Glyco_hydro_20 pfam00728
Glycosyl hydrolase family 20, catalytic domain; This domain has a TIM barrel fold.
196-524 6.64e-22


Pssm-ID: 425840  Cd Length: 344  Bit Score: 98.91  E-value: 6.64e-22
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404  196 IDAGRKYFSPEQLKEIIDKAKEYGYTDLHL-LVGNDGLRFMLDDMSMKVGDKTYSSDDVkraIENGTNAYYddpngnhlT 274
Cdd:pfam00728    8 LDVARHFLPVDDIKRTIDAMAAYKLNVLHWhLTDDQGWRLEIKKYPKLTEKGAYRPSDL---DGTPYGGFY--------T 76
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404  275 ESQMTDLINYAKDKGIGVIPTVNSPGHMDAMLHAMKELG----IENPNFDYFGKKSERTVDLNNKQAVDFTKTLIDKYAN 350
Cdd:pfam00728   77 QEDIREIVAYAAARGIRVIPEIDMPGHARAALAAYPELGcgcgADSPWVSVQWGPPEGQLNPGNEKTYTFLDNVFDEVAD 156
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404  351 YFSkkSEIFNIGLDEYAndatdAKGWsvlQAD----KYYPNEGYpeKGYEKFISY-ANDLARIVKSHGLKPMAFNDGIYY 425
Cdd:pfam00728  157 LFP--SDYIHIGGDEVP-----KGCW---EKSpecqARMKEEGL--KSLHELQQYfIKRASKIVSSKGRRLIGWDEILDG 224
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404  426 NSDTSfgsfDKDIIVSMWTGGWGGydvasSKLLAEKGHQ-ILNTNDAWYYVLGRNADGQGWYNLDQGLNGIKNT----PI 500
Cdd:pfam00728  225 GVPLL----PKNTTVQSWRGGDEA-----AQKAAKQGYDvIMSPGDFLYLDCGQGGNPTEEPYYWGGFVPLEDVynwdPV 295
                          330       340
                   ....*....|....*....|....
gi 1092545404  501 TSVPKTEGADVPIIGGMVAAWADT 524
Cdd:pfam00728  296 PDTWNDPEQAKHVLGGQANLWTEQ 319
G5 pfam07501
G5 domain; This domain is found in a wide range of extracellular proteins. It is found ...
1069-1139 1.48e-19

G5 domain; This domain is found in a wide range of extracellular proteins. It is found tandemly repeated in up to 8 copies. It is found in the N-terminus of peptidases belonging to the M26 family which cleave human IgA. The domain is also found in proteins involved in metabolism of bacterial cell walls suggesting this domain may have an adhesive function.


Pssm-ID: 462185 [Multi-domain]  Cd Length: 75  Bit Score: 84.14  E-value: 1.48e-19
                           10        20        30        40        50        60        70
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|..
gi 1092545404 1069 ELVVKTESIPFKVIRKENPNLPAGQEKVVKAGVLGERTSYVSVLTENGK-ASETVLDSQVTKEPVDQIVEFG 1139
Cdd:pfam07501    2 KTVTEEEEIPFETVTKEDPSLPKGEEKVVQEGKPGEKEVTYKVTYVNGKeVSREVVSEEVTKEPVDEVVAVG 73
YabE COG3583
Uncharacterized conserved protein YabE, contains G5 and tandem DUF348 domains [Function ...
1069-1139 3.69e-19

Uncharacterized conserved protein YabE, contains G5 and tandem DUF348 domains [Function unknown];


Pssm-ID: 442802 [Multi-domain]  Cd Length: 335  Bit Score: 90.31  E-value: 3.69e-19
                           10        20        30        40        50        60        70
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|..
gi 1092545404 1069 ELVVKTESIPFKVIRKENPNLPAGQEKVVKAGVLGERTSYVSVLTENGK-ASETVLDSQVTKEPVDQIVEFG 1139
Cdd:COG3583    152 KTVTEEEPIPFETVRKEDPSLPKGETKVVQEGVPGVKEVTYRVTYENGKeVSREVVSEKVTKEPVDEVVAVG 223
GH20_chitobiase-like cd06563
The chitobiase of Serratia marcescens is a beta-N-1,4-acetylhexosaminidase with a glycosyl ...
196-524 3.17e-15

The chitobiase of Serratia marcescens is a beta-N-1,4-acetylhexosaminidase with a glycosyl hydrolase family 20 (GH20) domain that hydrolyzes the beta-1,4-glycosidic linkages in oligomers derived from chitin. Chitin is degraded by a two-step process: i) a chitinase hydrolyzes the chitin to oligosaccharides and disaccharides such as di-N-acetyl-D-glucosamine and chitobiose, ii) chitobiase then further degrades these oligomers into monomers. This GH20 domain family includes an N-acetylglucosamidase (GlcNAcase A) from Pseudoalteromonas piscicida and an N-acetylhexosaminidase (SpHex) from Streptomyces plicatus. SpHex lacks the C-terminal PKD (polycystic kidney disease I)-like domain found in the chitobiases. The GH20 hexosaminidases are thought to act via a catalytic mechanism in which the catalytic nucleophile is not provided by solvent or the enzyme, but by the substrate itself.


Pssm-ID: 119333  Cd Length: 357  Bit Score: 78.77  E-value: 3.17e-15
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404  196 IDAGRKYFSPEQLKEIIDKAKEYGYTDLHL-LVGNDGLRfmlddMSMK-------VGDKTYSSDDVKRAIENGTNAYydd 267
Cdd:cd06563      8 LDVSRHFFPVDEVKRFIDLMALYKLNVFHWhLTDDQGWR-----IEIKkypklteVGAWRGPTEIGLPQGGGDGTPY--- 79
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404  268 pnGNHLTESQMTDLINYAKDKGIGVIPTVNSPGHMDAMLHAMKELGIENPNFDY--FGKKSERTVDLNNKQAVDFTKTLI 345
Cdd:cd06563     80 --GGFYTQEEIREIVAYAAERGITVIPEIDMPGHALAALAAYPELGCTGGPGSVvsVQGVVSNVLCPGKPETYTFLEDVL 157
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404  346 DKYANYFskKSEIFNIGLDEYANDAtdakgWSVLQAD-KYYPNEGYpeKGYEKFISY-ANDLARIVKSHGLKPMAFNDGI 423
Cdd:cd06563    158 DEVAELF--PSPYIHIGGDEVPKGQ-----WEKSPACqARMKEEGL--KDEHELQSYfIKRVEKILASKGKKMIGWDEIL 228
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404  424 YynsdtsfGSFDKDIIVsMWtggWGGYDVAssKLLAEKGHQILNTNDAWYYVLGRNADGQGWYNLDQGLNGIKNT----P 499
Cdd:cd06563    229 E-------GGLPPNATV-MS---WRGEDGG--IKAAKQGYDVIMSPGQYLYLDYAQSKGPDEPASWAGFNTLEKVysfeP 295
                          330       340
                   ....*....|....*....|....*
gi 1092545404  500 ITSVPKTEGADVpIIGGMVAAWADT 524
Cdd:cd06563    296 VPGGLTPEQAKR-ILGVQANLWTEY 319
Glyco_hydro_20 pfam00728
Glycosyl hydrolase family 20, catalytic domain; This domain has a TIM barrel fold.
641-908 2.12e-14

Glycosyl hydrolase family 20, catalytic domain; This domain has a TIM barrel fold.


Pssm-ID: 425840  Cd Length: 344  Bit Score: 76.18  E-value: 2.12e-14
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404  641 IDAGRKYFSLDQLKRIVDKASELGYSDAHLLLGND-GLRFLLDDMTITANGKTYASDDVKKAIIEGtkaYYddpngtalT 719
Cdd:pfam00728    8 LDVARHFLPVDDIKRTIDAMAAYKLNVLHWHLTDDqGWRLEIKKYPKLTEKGAYRPSDLDGTPYGG---FY--------T 76
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404  720 QAEVTELAKYAKEKGIGLIPAINSPGHMDAMLVAMEKLGIANPQANFD-----KVSKTTMDLENQEALNFTKALIGKYMD 794
Cdd:pfam00728   77 QEDIREIVAYAAARGIRVIPEIDMPGHARAALAAYPELGCGCGADSPWvsvqwGPPEGQLNPGNEKTYTFLDNVFDEVAD 156
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404  795 YFadKSKIFNYGTDEYANDATNA----QGWY----YLKWYGLYNKFADYsnsLAAMAKERGLQPMAFNDGFyyeDKDDAE 866
Cdd:pfam00728  157 LF--PSDYIHIGGDEVPKGCWEKspecQARMkeegLKSLHELQQYFIKR---ASKIVSSKGRRLIGWDEIL---DGGVPL 228
                          250       260       270       280
                   ....*....|....*....|....*....|....*....|..
gi 1092545404  867 FDKDVLISYWsKGWWGYNLAspqyLASKGYKFLNTNGDWYYI 908
Cdd:pfam00728  229 LPKNTTVQSW-RGGDEAAQK----AAKQGYDVIMSPGDFLYL 265
Chb COG3525
N-acetyl-beta-hexosaminidase [Carbohydrate transport and metabolism];
257-474 7.04e-13


Pssm-ID: 442747 [Multi-domain]  Cd Length: 578  Bit Score: 72.97  E-value: 7.04e-13
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404  257 IENGTNAYYDDPNGNHLTESQMTDLINYAKDKGIGVIPTVNSPGHMDAMLHAMKELGIENPNFDYFGKK--SERTVDLNN 334
Cdd:COG3525    223 IGHDPQPFDGKPYGGFYTQEDIREIVAYAAARGITVIPEIDMPGHARAAIAAYPELGCTGKPYSVRSVWgvFDNVLNPGK 302
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404  335 KQAVDFTKTLIDKYANYFskKSEIFNIGLDEYANDA----------------TDAKGwsvLQAdkYypnegypekgyekF 398
Cdd:COG3525    303 ESTYTFLEDVLDEVAALF--PSPYIHIGGDEVPKGQwekspacqalmkelglKDEHE---LQS--Y-------------F 362
                          170       180       190       200       210       220       230
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*.
gi 1092545404  399 IsyaNDLARIVKSHGLKPMAFNDGIyynsdtsFGSFDKDIIVSMWTGGWGGYDvassklLAEKGHQILNTNDAWYY 474
Cdd:COG3525    363 I---RRVEKILASKGRKMIGWDEIL-------EGGLAPNATVMSWRGEDGGIE------AAKAGHDVVMSPGSYLY 422
GH20_SpHex_like cd06568
A subgroup of the Glycosyl hydrolase family 20 (GH20) catalytic domain found in proteins ...
195-420 2.84e-11

A subgroup of the Glycosyl hydrolase family 20 (GH20) catalytic domain found in proteins similar to the N-acetylhexosaminidase from Streptomyces plicatus (SpHex). SpHex catalyzes the hydrolysis of N-acetyl-beta-hexosaminides. An Asp residue within the active site plays a critical role in substrate-assisted catalysis by orienting the 2-acetamido group and stabilizing the transition state. The GH20 hexosaminidases are thought to act via a catalytic mechanism in which the catalytic nucleophile is not provided by solvent or the enzyme, but by the substrate itself. Proteins belonging to this subgroup lack the C-terminal PKD (polycystic kidney disease I)-like domain found in the chitobiases.


Pssm-ID: 119336  Cd Length: 329  Bit Score: 66.59  E-value: 2.84e-11
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404  195 SIDAGRKYFSPEQLKEIIDKAKEYGYTDLHL-LVGNDGLRFMLDDmSMKVGdktysSDDVKRAIENGTNAYYddpngnhl 273
Cdd:cd06568      7 MLDVARHFFTVAEVKRYIDLLALYKLNVLHLhLTDDQGWRIEIKS-WPKLT-----EIGGSTEVGGGPGGYY-------- 72
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404  274 TESQMTDLINYAKDKGIGVIPTVNSPGHMDAMLHAMKELgieNPNfdyfGKKSertvDLNNKQAVDFTKTLIDKYANY-- 351
Cdd:cd06568     73 TQEDYKDIVAYAAERHITVVPEIDMPGHTNAALAAYPEL---NCD----GKAK----PLYTGIEVGFSSLDVDKPTTYef 141
                          170       180       190       200       210       220       230
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*...
gi 1092545404  352 ----FSKKSEI-----FNIGLDEyaNDATDAkgwsvlqadkyypnegypekgyEKFISYANDLARIVKSHGLKPMAFN 420
Cdd:cd06568    142 vddvFRELAALtpgpyIHIGGDE--AHSTPH----------------------DDYAYFVNRVRAIVAKYGKTPVGWQ 195
Choline_bind_3 pfam19127
Choline-binding repeat; Pair of presumed choline-binding repeats often found adjacent to ...
1244-1289 3.26e-11

Choline-binding repeat; Pair of presumed choline-binding repeats often found adjacent to pfam01473.


Pssm-ID: 465978 [Multi-domain]  Cd Length: 47  Bit Score: 59.48  E-value: 3.26e-11
                           10        20        30        40
                   ....*....|....*....|....*....|....*....|....*...
gi 1092545404 1244 TGWQKVNGTWYYLDNSGAMQTGW-IDQGGSWYYLN-DSGAMqtgWVNQ 1289
Cdd:pfam19127    2 TGWQTINGQTLYFDSDGKQVKGWvVTIDGKWYYFDaDSGEM---VTNR 46
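The two sequence rows of an alignment block such as the one above can be compared column by column to get a rough identity count. A minimal Python sketch, under the assumption that lowercase letters mark unaligned residues and `-` marks gaps in this display, so only columns where both rows carry an uppercase letter are counted; the function name `count_identities` is hypothetical:

```python
def count_identities(query, subject):
    """Count identical and aligned columns between two equal-length alignment rows.

    Assumption: only columns where both rows hold an uppercase residue are
    treated as aligned; lowercase residues and '-' gaps are skipped.
    """
    if len(query) != len(subject):
        raise ValueError("alignment rows must have the same length")
    aligned = 0
    identical = 0
    for q, s in zip(query, subject):
        if q.isupper() and s.isupper():
            aligned += 1
            if q == s:
                identical += 1
    return identical, aligned


# Sequence rows copied from the pfam19127 alignment above (coordinates stripped).
query = "TGWQKVNGTWYYLDNSGAMQTGW-IDQGGSWYYLN-DSGAMqtgWVNQ"
subject = "TGWQTINGQTLYFDSDGKQVKGWvVTIDGKWYYFDaDSGEM---VTNR"
identical, aligned = count_identities(query, subject)
percent_identity = 100.0 * identical / aligned
```

This is a simple tally, not the scoring used by the search itself: the bit score and E-value reported for each hit come from a position-specific scoring matrix, so percent identity is only a coarse companion figure.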
GH20_HexA_HexB-like cd06562
Beta-N-acetylhexosaminidases catalyze the removal of beta-1,4-linked N-acetyl-D-hexosamine ...
274-524 7.51e-11

Beta-N-acetylhexosaminidases catalyze the removal of beta-1,4-linked N-acetyl-D-hexosamine residues from the non-reducing ends of N-acetyl-beta-D-hexosaminides including N-acetylglucosides and N-acetylgalactosides. The hexA and hexB genes encode the alpha- and beta-subunits of the two major beta-N-acetylhexosaminidase isoenzymes, N-acetyl-beta-D-hexosaminidase A (HexA) and beta-N-acetylhexosaminidase B (HexB). Both the alpha and the beta catalytic subunits have a TIM-barrel fold and belong to the glycosyl hydrolase family 20 (GH20). The HexA enzyme is a heterodimer containing one alpha and one beta subunit while the HexB enzyme is a homodimer containing two beta-subunits. Hexosaminidase mutations cause an inability to properly hydrolyze certain sphingolipids which accumulate in lysosomes within the brain, resulting in the lipid storage disorders Tay-Sachs and Sandhoff disease. Mutations in the alpha subunit cause a deficiency in the HexA enzyme and result in Tay-Sachs disease; mutations in the beta-subunit cause a deficiency in both HexA and HexB enzymes and result in Sandhoff disease. In both disorders GM(2) gangliosides accumulate in lysosomes. The GH20 hexosaminidases are thought to act via a catalytic mechanism in which the catalytic nucleophile is not provided by solvent or the enzyme, but by the substrate itself.


Pssm-ID: 119332 [Multi-domain]  Cd Length: 348  Bit Score: 65.31  E-value: 7.51e-11
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404  274 TESQMTDLINYAKDKGIGVIPTVNSPGHMDAMLHAMKELGIENPNFDYFGKKSERT--VDLNNKQAVDFTKTLIDKYANY 351
Cdd:cd06562     68 TPEDVKEIVEYARLRGIRVIPEIDTPGHTGSWGQGYPELLTGCYAVWRKYCPEPPCgqLNPTNPKTYDFLKTLFKEVSEL 147
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404  352 FSkkSEIFNIGLDEYANDAtdakgWsvlQADKYYPN--EGYPEKGYEKFISYAND-LARIVKSHGLKPMafndgIYYNSD 428
Cdd:cd06562    148 FP--DKYFHLGGDEVNFNC-----W---NSNPEIQKfmKKNNGTDYSDLESYFIQrALDIVRSLGKTPI-----VWEEVF 212
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404  429 TSFGSF-DKDIIVSMWTGGWggydvaSSKLLAEKGHQ-ILNTNDAWY---YVLGRNADGQGW---YNLDQglngikNTPI 500
Cdd:cd06562    213 DNGVYLlPKDTIVQVWGGSD------ELKNVLAAGYKvILSSYDFWYldcGFGGWVGPGNDWcdpYKNWP------RIYS 280
                          250       260
                   ....*....|....*....|....
gi 1092545404  501 TSVPKTEGadvpIIGGMVAAWADT 524
Cdd:cd06562    281 GTPEQKKL----VLGGEACMWGEQ 300
GH20_chitobiase-like_1 cd06570
A functionally uncharacterized subgroup of the Glycosyl hydrolase family 20 (GH20) catalytic ...
269-475 7.76e-11

A functionally uncharacterized subgroup of the Glycosyl hydrolase family 20 (GH20) catalytic domain found in proteins similar to the chitobiase of Serratia marcescens, a beta-N-1,4-acetylhexosaminidase that hydrolyzes the beta-1,4-glycosidic linkages in oligomers derived from chitin. Chitin is degraded by a two-step process: i) a chitinase hydrolyzes the chitin to oligosaccharides and disaccharides such as di-N-acetyl-D-glucosamine and chitobiose, ii) chitobiase then further degrades these oligomers into monomers. This subgroup lacks the C-terminal PKD (polycystic kidney disease I)-like domain found in the chitobiases. The GH20 hexosaminidases are thought to act via a catalytic mechanism in which the catalytic nucleophile is not provided by solvent or the enzyme, but by the substrate itself.


Pssm-ID: 119338  Cd Length: 311  Bit Score: 64.74  E-value: 7.76e-11
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404  269 NGNHLTESQMTDLINYAKDKGIGVIPTVNSPGHMDAMLHAMKELGIEnPNFDYFGKK---SERTVDLNNKQAVDFTKTLI 345
Cdd:cd06570     61 DGLYYTQEQIREVVAYARDRGIRVVPEIDVPGHASAIAVAYPELASG-PGPYVIERGwgvFEPLLDPTNEETYTFLDNLF 139
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404  346 DKYANYFSkkSEIFNIGLDEyandaTDAKGWSVLQADKYYPNEGYPEKGYEKFISYANDLARIVKSHGLKPMAFnDGIYY 425
Cdd:cd06570    140 GEMAELFP--DEYFHIGGDE-----VDPKQWNENPRIQAFMKEHGLKDAAALQAYFNQRVEKILSKHGKKMIGW-DEVLH 211
                          170       180       190       200       210
                   ....*....|....*....|....*....|....*....|....*....|.
gi 1092545404  426 nsdtsfGSFDKDIIVSMWTGGwggydvASSKLLAEKGHQ-ILNTNdawYYV 475
Cdd:cd06570    212 ------PDLPKNVVIQSWRGH------DSLGEAAKAGYQgILSTG---YYI 247
GH20_GcnA-like cd06565
Glycosyl hydrolase family 20 (GH20) catalytic domain of N-acetyl-beta-D-glucosaminidase (GcnA, ...
269-421 1.21e-10

Glycosyl hydrolase family 20 (GH20) catalytic domain of N-acetyl-beta-D-glucosaminidase (GcnA, also known as BhsA) and related proteins. GcnA is an exoglucosidase which cleaves N-acetyl-beta-D-glucosamine (NAG) and N-acetyl-beta-D-galactosamine residues from 4-methylumbelliferylated (4MU) substrates, as well as cleaving NAG from chito-oligosaccharides (i.e. NAG polymers). In contrast, sulfated forms of the substrate cannot be cleaved and act instead as mild competitive inhibitors. Additionally, the enzyme is known to be poisoned by several first-row transition metals as well as by mercury. GcnA forms a homodimer with subunits composed of three domains: an N-terminal zincin-like domain, this central catalytic GH20 domain, and a C-terminal alpha helical domain. The GH20 hexosaminidases are thought to act via a catalytic mechanism in which the catalytic nucleophile is not provided by solvent or the enzyme, but by the substrate itself.


Pssm-ID: 119335  Cd Length: 301  Bit Score: 64.15  E-value: 1.21e-10
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404  269 NGNHLTESQMTDLINYAKDKGIGVIPTVNSPGHMDAML--HAMKELGiENPnfdyfgkKSERTVDLNNKQAVDFTKTLID 346
Cdd:cd06565     53 MRGAYTKEEIREIDDYAAELGIEVIPLIQTLGHLEFILkhPEFRHLR-EVD-------DPPQTLCPGEPKTYDFIEEMIR 124
                           90       100       110       120       130       140       150
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*.
gi 1092545404  347 KYANYFSkkSEIFNIGLDE-YAndatdakgwsvLQADKYYPNEGYPEKGyEKFISYANDLARIVKSHGLKPMAFND 421
Cdd:cd06565    125 QVLELHP--SKYIHIGMDEaYD-----------LGRGRSLRKHGNLGRG-ELYLEHLKKVLKIIKKRGPKPMMWDD 186
GH20_SpHex_like cd06568
A subgroup of the Glycosyl hydrolase family 20 (GH20) catalytic domain found in proteins ...
640-790 3.91e-10

A subgroup of the Glycosyl hydrolase family 20 (GH20) catalytic domain found in proteins similar to the N-acetylhexosaminidase from Streptomyces plicatus (SpHex). SpHex catalyzes the hydrolysis of N-acetyl-beta-hexosaminides. An Asp residue within the active site plays a critical role in substrate-assisted catalysis by orienting the 2-acetamido group and stabilizing the transition state. The GH20 hexosaminidases are thought to act via a catalytic mechanism in which the catalytic nucleophile is not provided by solvent or the enzyme, but by the substrate itself. Proteins belonging to this subgroup lack the C-terminal PKD (polycystic kidney disease I)-like domain found in the chitobiases.


Pssm-ID: 119336  Cd Length: 329  Bit Score: 63.12  E-value: 3.91e-10
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404  640 SIDAGRKYFSLDQLKRIVDKASELGYSDAHLLLGND-GLRfllddMTITANGKTyASDDVKKAIIEGTKAYYddpngtal 718
Cdd:cd06568      7 MLDVARHFFTVAEVKRYIDLLALYKLNVLHLHLTDDqGWR-----IEIKSWPKL-TEIGGSTEVGGGPGGYY-------- 72
                           90       100       110       120       130       140       150
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*
gi 1092545404  719 TQAEVTELAKYAKEKGIGLIPAINSPGHMDAMLVAMEKL---GIANPQANFDKVSKTTMDLENQEALNFTKALIG 790
Cdd:cd06568     73 TQEDYKDIVAYAAERHITVVPEIDMPGHTNAALAAYPELncdGKAKPLYTGIEVGFSSLDVDKPTTYEFVDDVFR 147
GH20_GcnA-like cd06565
Glycosyl hydrolase family 20 (GH20) catalytic domain of N-acetyl-beta-D-glucosaminidase (GcnA, ...
650-876 4.87e-10

Glycosyl hydrolase family 20 (GH20) catalytic domain of N-acetyl-beta-D-glucosaminidase (GcnA, also known as BhsA) and related proteins. GcnA is an exoglucosidase which cleaves N-acetyl-beta-D-glucosamine (NAG) and N-acetyl-beta-D-galactosamine residues from 4-methylumbelliferylated (4MU) substrates, as well as cleaving NAG from chito-oligosaccharides (i.e. NAG polymers). In contrast, sulfated forms of the substrate cannot be cleaved and act instead as mild competitive inhibitors. Additionally, the enzyme is known to be poisoned by several first-row transition metals as well as by mercury. GcnA forms a homodimer with subunits composed of three domains: an N-terminal zincin-like domain, this central catalytic GH20 domain, and a C-terminal alpha helical domain. The GH20 hexosaminidases are thought to act via a catalytic mechanism in which the catalytic nucleophile is not provided by solvent or the enzyme, but by the substrate itself.


Pssm-ID: 119335  Cd Length: 301  Bit Score: 62.22  E-value: 4.87e-10
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404  650 LDQLKRIVDKASELGYSdaHLLLgndglrfllddmtitangktYasddvkkaiIEGTKAYYDDP----NGTALTQAEVTE 725
Cdd:cd06565     16 VSYLKKLLRLLALLGAN--GLLL--------------------Y---------YEDTFPYEGEPevgrMRGAYTKEEIRE 64
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404  726 LAKYAKEKGIGLIPAINSPGHMDAMLVAMEKLGIANpqanfDKVSKTTMDLENQEALNFTKALIGKYMDYFadKSKIFNY 805
Cdd:cd06565     65 IDDYAAELGIEVIPLIQTLGHLEFILKHPEFRHLRE-----VDDPPQTLCPGEPKTYDFIEEMIRQVLELH--PSKYIHI 137
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404  806 GTDE--------YANDatnaqgWYYLKWYGLYnkfADYSNSLAAMAKERGLQPMAFNDGFY----YEDKDDaEFDKDVLI 873
Cdd:cd06565    138 GMDEaydlgrgrSLRK------HGNLGRGELY---LEHLKKVLKIIKKRGPKPMMWDDMLRklsiEPEALS-GLPKLVTP 207

                   ...
gi 1092545404  874 SYW 876
Cdd:cd06565    208 VVW 210
Chb COG3525
N-acetyl-beta-hexosaminidase [Carbohydrate transport and metabolism];
719-908 2.03e-09

N-acetyl-beta-hexosaminidase [Carbohydrate transport and metabolism];


Pssm-ID: 442747 [Multi-domain]  Cd Length: 578  Bit Score: 61.80  E-value: 2.03e-09
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404  719 TQAEVTELAKYAKEKGIGLIPAINSPGHMDAMLVAMEKLGianpqaNFDK---------VSKTTMDLENQEALNFTKALI 789
Cdd:COG3525    240 TQEDIREIVAYAAARGITVIPEIDMPGHARAAIAAYPELG------CTGKpysvrsvwgVFDNVLNPGKESTYTFLEDVL 313
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404  790 GKYMDYFadKSKIFNYGTDEYANDA----------------TNA---QGWYYlkwyglynkfadysNSLAAMAKERGLQP 850
Cdd:COG3525    314 DEVAALF--PSPYIHIGGDEVPKGQwekspacqalmkelglKDEhelQSYFI--------------RRVEKILASKGRKM 377
                          170       180       190       200       210
                   ....*....|....*....|....*....|....*....|....*....|....*...
gi 1092545404  851 MAFNDGFyyedkdDAEFDKDVLISYWSkgwwgyNLASPQYLASKGYKFLNTNGDWYYI 908
Cdd:COG3525    378 IGWDEIL------EGGLAPNATVMSWR------GEDGGIEAAKAGHDVVMSPGSYLYF 423
GH20_chitobiase-like cd06563
The chitobiase of Serratia marcescens is a beta-N-1,4-acetylhexosaminidase with a glycosyl ...
641-908 2.33e-09

The chitobiase of Serratia marcescens is a beta-N-1,4-acetylhexosaminidase with a glycosyl hydrolase family 20 (GH20) domain that hydrolyzes the beta-1,4-glycosidic linkages in oligomers derived from chitin. Chitin is degraded by a two step process: i) a chitinase hydrolyzes the chitin to oligosaccharides and disaccharides such as di-N-acetyl-D-glucosamine and chitobiose, ii) chitobiase then further degrades these oligomers into monomers. This GH20 domain family includes an N-acetylglucosamidase (GlcNAcase A) from Pseudoalteromonas piscicida and an N-acetylhexosaminidase (SpHex) from Streptomyces plicatus. SpHex lacks the C-terminal PKD (polycystic kidney disease I)-like domain found in the chitobiases. The GH20 hexosaminidases are thought to act via a catalytic mechanism in which the catalytic nucleophile is not provided by solvent or the enzyme, but by the substrate itself.


Pssm-ID: 119333  Cd Length: 357  Bit Score: 60.67  E-value: 2.33e-09
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404  641 IDAGRKYFSLDQLKRIVDKASELGYSDAHLLLGND-GLRFLLD---DMTitangkTYASDDVKKAIIEGTKAYYDDPNGT 716
Cdd:cd06563      8 LDVSRHFFPVDEVKRFIDLMALYKLNVFHWHLTDDqGWRIEIKkypKLT------EVGAWRGPTEIGLPQGGGDGTPYGG 81
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404  717 ALTQAEVTELAKYAKEKGIGLIPAINSPGHMDAMLVAMEKLGI---ANPQANFDKVSKTTMDLENQEALNFTKALIGKYM 793
Cdd:cd06563     82 FYTQEEIREIVAYAAERGITVIPEIDMPGHALAALAAYPELGCtggPGSVVSVQGVVSNVLCPGKPETYTFLEDVLDEVA 161
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404  794 DYFadKSKIFNYGTDEYANDAtnaqgwyylkWYGLY--------NKFADYS-------NSLAAMAKERGLQPMAFNDGFy 858
Cdd:cd06563    162 ELF--PSPYIHIGGDEVPKGQ----------WEKSPacqarmkeEGLKDEHelqsyfiKRVEKILASKGKKMIGWDEIL- 228
                          250       260       270       280       290
                   ....*....|....*....|....*....|....*....|....*....|
gi 1092545404  859 yedkdDAEFDKDVLISYWsKGWwgynlASPQYLASKGYKFLNTNGDWYYI 908
Cdd:cd06563    229 -----EGGLPPNATVMSW-RGE-----DGGIKAAKQGYDVIMSPGQYLYL 267
Choline_bind_3 pfam19127
Choline-binding repeat; Pair of presumed choline-binding repeats often found adjacent to ...
1225-1262 4.79e-09

Choline-binding repeat; Pair of presumed choline-binding repeats often found adjacent to pfam01473.


Pssm-ID: 465978 [Multi-domain]  Cd Length: 47  Bit Score: 53.31  E-value: 4.79e-09
                           10        20        30        40
                   ....*....|....*....|....*....|....*....|.
gi 1092545404 1225 TGWVK-DGAWYYLDESGVMKTGWQKV-NGTWYYLD-NSGAM 1262
Cdd:pfam19127    2 TGWQTiNGQTLYFDSDGKQVKGWVVTiDGKWYYFDaDSGEM 42
Choline_bind_3 pfam19127
Choline-binding repeat; Pair of presumed choline-binding repeats often found adjacent to ...
1264-1307 2.39e-08

Choline-binding repeat; Pair of presumed choline-binding repeats often found adjacent to pfam01473.


Pssm-ID: 465978 [Multi-domain]  Cd Length: 47  Bit Score: 51.39  E-value: 2.39e-08
                           10        20        30        40
                   ....*....|....*....|....*....|....*....|....*.
gi 1092545404 1264 TGWIDQGGSWYYLNDSGAMQTGW-VNQGDTWYYLD-NSGTMKTGWF 1307
Cdd:pfam19127    2 TGWQTINGQTLYFDSDGKQVKGWvVTIDGKWYYFDaDSGEMVTNRF 47
PspC_subgroup_2 NF033839
pneumococcal surface protein PspC, LPXTG-anchored form; The pneumococcal surface protein PspC, ...
35-187 5.71e-08

pneumococcal surface protein PspC, LPXTG-anchored form; The pneumococcal surface protein PspC, as described in Streptococcus pneumoniae, is a repetitive and highly variable protein, recognized by a conserved N-terminal domain and also by genomic location. This form, subgroup 2, is anchored covalently after cleavage by sortase at a C-terminal LPXTG site. The other form, subgroup 1, has variable numbers of a choline-binding repeat in the C-terminal region, and is also known as choline-binding protein A.


Pssm-ID: 468202 [Multi-domain]  Cd Length: 557  Bit Score: 57.09  E-value: 5.71e-08
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404   35 DGVTPTTENQPTIHTV----------------SDSPQPSENRTEETPKAELQPE-----TPKTVETETPSTDKVASL--P 91
Cdd:NF033839   249 DNVNTKVEIENTVHKIfadmdavvtkfkkgltQDTPKEPGNKKPSAPKPGMQPSpqpekKEVKPEPETPKPEVKPQLekP 328
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404   92 KTEEKTQEEV----------SSTPSDKEEVVTPTSVEKEAADKKAEEASPKKEEQKEANSKESDTDKTD-KSEADKDKPA 160
Cdd:NF033839   329 KPEVKPQPEKpkpevkpqleTPKPEVKPQPEKPKPEVKPQPEKPKPEVKPQPETPKPEVKPQPEKPKPEvKPQPEKPKPE 408
                          170       180
                   ....*....|....*....|....*....
gi 1092545404  161 KKD--ETKAEADKPATEAGKERAATENEK 187
Cdd:NF033839   409 VKPqpEKPKPEVKPQPEKPKPEVKPQPEK 437
GH20_chitobiase-like_1 cd06570
A functionally uncharacterized subgroup of the Glycosyl hydrolase family 20 (GH20) catalytic ...
639-820 2.00e-07

A functionally uncharacterized subgroup of the Glycosyl hydrolase family 20 (GH20) catalytic domain found in proteins similar to the chitobiase of Serratia marcescens, a beta-N-1,4-acetylhexosaminidase that hydrolyzes the beta-1,4-glycosidic linkages in oligomers derived from chitin. Chitin is degraded by a two step process: i) a chitinase hydrolyzes the chitin to oligosaccharides and disaccharides such as di-N-acetyl-D-glucosamine and chitobiose, ii) chitobiase then further degrades these oligomers into monomers. This subgroup lacks the C-terminal PKD (polycystic kidney disease I)-like domain found in the chitobiases. The GH20 hexosaminidases are thought to act via a catalytic mechanism in which the catalytic nucleophile is not provided by solvent or the enzyme, but by the substrate itself.


Pssm-ID: 119338  Cd Length: 311  Bit Score: 54.34  E-value: 2.00e-07
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404  639 ISIDAGRKYFSLDQLKRIVDKASELGYSDAHLllgndglrFLLDDMTITANGKTY------ASDDVkkaiiegtkaYYdd 712
Cdd:cd06570      6 LLIDVSRHFIPVAVIKRQLDAMASVKLNVFHW--------HLTDDQGFRIESKKYpklqqkASDGL----------YY-- 65
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404  713 pngtalTQAEVTELAKYAKEKGIGLIPAINSPGHMDAMLVAMEKLGIA-NPQANFDK--VSKTTMDLENQEALNFTKALI 789
Cdd:cd06570     66 ------TQEQIREVVAYARDRGIRVVPEIDVPGHASAIAVAYPELASGpGPYVIERGwgVFEPLLDPTNEETYTFLDNLF 139
                          170       180       190
                   ....*....|....*....|....*....|.
gi 1092545404  790 GKYMDYFADksKIFNYGTDEyandaTNAQGW 820
Cdd:cd06570    140 GEMAELFPD--EYFHIGGDE-----VDPKQW 163
YSIRK_signal TIGR01168
Gram-positive signal peptide, YSIRK family; Many surface proteins found in Streptococcus, ...
1-24 2.03e-07

Gram-positive signal peptide, YSIRK family; Many surface proteins found in Streptococcus, Staphylococcus, and related lineages share apparently homologous signal sequences. A motif resembling [YF]SIRKxxxGxxS[VIA] appears at the start of the transmembrane domain. The GxxS motif appears perfectly conserved, suggesting a specific function and not just homology. There is a strong correlation between proteins carrying this region at the N-terminus and those carrying the Gram-positive anchor domain with the LPXTG sortase processing site at the C-terminus.


Pssm-ID: 273479 [Multi-domain]  Cd Length: 39  Bit Score: 48.25  E-value: 2.03e-07
                           10        20
                   ....*....|....*....|....
gi 1092545404    1 MKQEKQQRFSIRKYAVGAASVLIG 24
Cdd:TIGR01168    3 KFNEKQQKYSIRKLSVGVASVLVA 26
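
The motif quoted in the description, [YF]SIRKxxxGxxS[VIA], can be checked directly against the aligned residues shown above. The following is a minimal Python sketch that assumes "x" stands for any single residue and translates the motif into a regular expression. The function name `find_ysirk` is hypothetical and chosen only for illustration.

```python
import re

# [YF]SIRKxxxGxxS[VIA], reading each "x" as any single residue (an assumption).
YSIRK_RE = re.compile(r"[YF]SIRK.{3}G.{2}S[VIA]")


def find_ysirk(seq):
    """Return (start, end, matched_text) with 1-based inclusive residue
    positions, or None if the motif is not found."""
    m = YSIRK_RE.search(seq)
    if m is None:
        return None
    return (m.start() + 1, m.end(), m.group())


if __name__ == "__main__":
    # Residues 1-24 of the query, copied from the alignment above.
    print(find_ysirk("MKQEKQQRFSIRKYAVGAASVLIG"))
    # Model residues 3-26, copied from the Cdd line above.
    print(find_ysirk("KFNEKQQKYSIRKLSVGVASVLVA"))
```

Both the query segment and the model segment contain the motif, with the perfectly conserved GxxS portion falling at the same alignment columns.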
YSIRK_signal pfam04650
YSIRK type signal peptide; Many surface proteins found in Streptococcus, Staphylococcus, and ...
4-24 3.57e-07

YSIRK type signal peptide; Many surface proteins found in Streptococcus, Staphylococcus, and related lineages share apparently homologous signal sequences. A motif resembling [YF]SIRKxxxGxxS[VIA] appears at the start of the transmembrane domain. The GxxS motif appears perfectly conserved, suggesting a specific function and not just homology. There is a strong correlation between proteins carrying this region at the N-terminus and those carrying the Gram-positive anchor domain with the LPXTG sortase processing site at the C-terminus.


Pssm-ID: 428049 [Multi-domain]  Cd Length: 26  Bit Score: 47.38  E-value: 3.57e-07
                           10        20
                   ....*....|....*....|.
gi 1092545404    4 EKQQRFSIRKYAVGAASVLIG 24
Cdd:pfam04650    1 EKKQRYSIRKLSVGVASVLIG 21
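
The paired "gi" and "Cdd" rows in each alignment block can be compared column by column. The following is a minimal Python sketch of an identity count, assuming that "-" marks a gap and that lowercase letters mark residues outside the aligned region (the usual reading of this display, but stated here as an assumption). The function name `identity_counts` is hypothetical.

```python
def identity_counts(query, subject):
    """Count identical and aligned columns between two equal-length rows.

    Columns containing a gap ("-") or a lowercase (unaligned) residue in
    either row are skipped. Returns (identical, aligned).
    """
    if len(query) != len(subject):
        raise ValueError("alignment rows must have equal length")
    identical = 0
    aligned = 0
    for q, s in zip(query, subject):
        if q == "-" or s == "-" or q.islower() or s.islower():
            continue
        aligned += 1
        if q == s:
            identical += 1
    return identical, aligned


if __name__ == "__main__":
    # Rows copied from the pfam04650 alignment above.
    ident, total = identity_counts("EKQQRFSIRKYAVGAASVLIG",
                                   "EKKQRYSIRKLSVGVASVLIG")
    print(f"{ident}/{total} identical columns")
```

For longer hits that span several blocks, the rows of each block would need to be concatenated before counting.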
Choline_bind_3 pfam19127
Choline-binding repeat; Pair of presumed choline-binding repeats often found adjacent to ...
1284-1326 4.34e-07

Choline-binding repeat; Pair of presumed choline-binding repeats often found adjacent to pfam01473.


Pssm-ID: 465978 [Multi-domain]  Cd Length: 47  Bit Score: 47.54  E-value: 4.34e-07
                           10        20        30        40
                   ....*....|....*....|....*....|....*....|....*
gi 1092545404 1284 TGWVNQGDTWYYLDNSGTMKTGWFQVED-KWYYSYP-SGALAVNT 1326
Cdd:pfam19127    2 TGWQTINGQTLYFDSDGKQVKGWVVTIDgKWYYFDAdSGEMVTNR 46
DUF4045 pfam13254
Domain of unknown function (DUF4045); This presumed domain is functionally uncharacterized. ...
5-216 4.45e-07

Domain of unknown function (DUF4045); This presumed domain is functionally uncharacterized. This domain family is found in bacteria and eukaryotes, and is typically between 384 and 430 amino acids in length.


Pssm-ID: 433066 [Multi-domain]  Cd Length: 415  Bit Score: 54.02  E-value: 4.45e-07
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404    5 KQQRfsirkyavgaASVLIGFAFQAqavaaDGVTPTTENQPTiHTVSDSPQPSENrteeTPKAELQPETPK-TVETETPS 83
Cdd:pfam13254  186 RQSR----------ASVDLGRPNSF-----KEVTPVGLMRSP-APGGHSKSPSVS----GISADSSPTKEEpSEEADTLS 245
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404   84 TDKvASLPkteekTQEEVSSTPSDKEEVvtPTSVEKEAADKKAEEASPKKEEQKEANSKESDTDKTD------KSEADKD 157
Cdd:pfam13254  246 TDK-EQSP-----APTSASEPPPKTKEL--PKDSEEPAAPSKSAEASTEKKEPDTESSPETSSEKSApsllspVSKASID 317
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404  158 KPAKKDETKAEADKPAT--------------EAGKERAAT-ENE------KLAKRKIvsidagRKYFSPEQLKEIIDKAK 216
Cdd:pfam13254  318 KPLSSPDRDPLSPKPKPqsppkdfranlrsrEVPKDKSKKdEPEfknvfgKLRKAET------KNYVAPDELKDNILRGK 391
Agg_substance NF033875
LPXTG-anchored aggregation substance; Aggregation substances, as described in Enterococcus, ...
1-217 9.58e-07

LPXTG-anchored aggregation substance; Aggregation substances, as described in Enterococcus, are LPXTG-anchored large surface proteins that contribute to virulence. Several closely related paralogs may be found in a single strain.


Pssm-ID: 411439 [Multi-domain]  Cd Length: 1306  Bit Score: 53.56  E-value: 9.58e-07
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404    1 MKQ--EKQQRFSI---RKYAVGAASVLIGFAFQAQAVAADGVTPTTENQPTIHTVS-DSPQPSENRteETPKAELQPETp 74
Cdd:NF033875     1 MKQqtEVKKRFKMykaKKHWVVAPILFLGVLGVVGLATDNVQAAELDTQPGTTTVQpDNPDPQSGS--ETPKTAVSEEA- 77
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404   75 kTVETETPSTDKVASLPKTEEKTQEEVSSTPSDKEEVVTPT-SVEKEAADK-------------KAEEASPKKEEQKEAN 140
Cdd:NF033875    78 -TVQKDTTSQPTKVEEVASEKNGAEQSSATPNDTTNAQQPTvGAEKSAQEQpvvspettneplgQPTEVAPAENEANKST 156
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404  141 S--KESDTDKTDKSEADkdkpAKKDETKAEADKPATEAG----KERAATENEklakrkivsIDAGRKyfspEQLKEIIDK 214
Cdd:NF033875   157 SipKEFETPDVDKAVDE----AKKDPNITVVEKPAEDLGnvssKDLAAKEKE---------VDQLQK----EQAKKIAQQ 219

                   ...
gi 1092545404  215 AKE 217
Cdd:NF033875   220 AAE 222
GH20_Sm-chitobiase-like cd06569
The chitobiase of Serratia marcescens is a beta-N-1,4-acetylhexosaminidase with a glycosyl ...
267-474 1.11e-06

The chitobiase of Serratia marcescens is a beta-N-1,4-acetylhexosaminidase with a glycosyl hydrolase family 20 (GH20) domain that hydrolyzes the beta-1,4-glycosidic linkages in oligomers derived from chitin. Chitin is degraded by a two step process: i) a chitinase hydrolyzes the chitin to oligosaccharides and disaccharides such as di-N-acetyl-D-glucosamine and chitobiose, ii) chitobiase then further degrades these oligomers into monomers. The GH20 hexosaminidases are thought to act via a catalytic mechanism in which the catalytic nucleophile is not provided by solvent or the enzyme, but by the substrate itself.


Pssm-ID: 119337  Cd Length: 445  Bit Score: 52.68  E-value: 1.11e-06
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404  267 DPNGN-HLTESQMTDLINYAKDKGIGVIPTVNSPGH-------MDAMLHAMKELGIENPNFDY----FGKKSE-RTVDLN 333
Cdd:cd06569     87 NNSGSgYYSRADYIEILKYAKARHIEVIPEIDMPGHaraaikaMEARYRKLMAAGKPAEAEEYrlsdPADTSQyLSVQFY 166
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404  334 NKQ----AVDFTKTLIDKYAnyfskkSEI-------------FNIGLDEYANDATDAkgwSVLQADKYYPNEGYPEKGYE 396
Cdd:cd06569    167 TDNvinpCMPSTYRFVDKVI------DEIarmhqeagqplttIHFGGDEVPEGAWGG---SPACKAQLFAKEGSVKDVED 237
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404  397 KFISYANDLARIVKSHGLKPMAFNDGIYYNSDTSFGSFDKD-IIVSMW-TGGWGGYDVASSklLAEKGHQILNTNDAWYY 474
Cdd:cd06569    238 LKDYFFERVSKILKAHGITLAGWEDGLLGKDTTNVDGFATPyVWNNVWgWGYWGGEDRAYK--LANKGYDVVLSNATNLY 315
PspC_subgroup_2 NF033839
pneumococcal surface protein PspC, LPXTG-anchored form; The pneumococcal surface protein PspC, ...
38-216 1.21e-06

pneumococcal surface protein PspC, LPXTG-anchored form; The pneumococcal surface protein PspC, as described in Streptococcus pneumoniae, is a repetitive and highly variable protein, recognized by a conserved N-terminal domain and also by genomic location. This form, subgroup 2, is anchored covalently after cleavage by sortase at a C-terminal LPXTG site. The other form, subgroup 1, has variable numbers of a choline-binding repeat in the C-terminal region, and is also known as choline-binding protein A.


Pssm-ID: 468202 [Multi-domain]  Cd Length: 557  Bit Score: 52.85  E-value: 1.21e-06
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404   38 TPTTENQPTIHTvsdsPQPSENRTEETPKAEL--QPETPKtvetetPSTDKVASLPKTEEKTQEEvSSTPSDKEEVVTPT 115
Cdd:NF033839   316 TPKPEVKPQLEK----PKPEVKPQPEKPKPEVkpQLETPK------PEVKPQPEKPKPEVKPQPE-KPKPEVKPQPETPK 384
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404  116 SVEKEAADKKAEEASPKKEEQKEANSKESDTDKTD-KSEADKDKPAKKD--ETKAEADKPATEAGKERAATENEKLAKRK 192
Cdd:NF033839   385 PEVKPQPEKPKPEVKPQPEKPKPEVKPQPEKPKPEvKPQPEKPKPEVKPqpEKPKPEVKPQPEKPKPEVKPQPETPKPEV 464
                          170       180
                   ....*....|....*....|....
gi 1092545404  193 IVSIDAGRKYFSPEQLKEIIDKAK 216
Cdd:NF033839   465 KPQPEKPKPEVKPQPEKPKPDNSK 488
PTZ00121 PTZ00121
MAEBL; Provisional
58-217 1.29e-06

MAEBL; Provisional


Pssm-ID: 173412 [Multi-domain]  Cd Length: 2084  Bit Score: 53.22  E-value: 1.29e-06
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404   58 ENRTEETPKaelQPETPKTVETETPSTDKVASlpKTEEKTQEEVSSTPSDKEEVVTPTSVEK---EAADKKAEEASPKKE 134
Cdd:PTZ00121  1321 KKKAEEAKK---KADAAKKKAEEAKKAAEAAK--AEAEAAADEAEAAEEKAEAAEKKKEEAKkkaDAAKKKAEEKKKADE 1395
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404  135 EQKEANSKESDTDKTDKSEADKDKpAKKDETKAEADKPATEAGKE----RAATENEKLAKRKIVSIDAGRKYFSPEQLKE 210
Cdd:PTZ00121  1396 AKKKAEEDKKKADELKKAAAAKKK-ADEAKKKAEEKKKADEAKKKaeeaKKADEAKKKAEEAKKAEEAKKKAEEAKKADE 1474

                   ....*..
gi 1092545404  211 IIDKAKE 217
Cdd:PTZ00121  1475 AKKKAEE 1481
GH20_HexA_HexB-like cd06562
Beta-N-acetylhexosaminidases catalyze the removal of beta-1,4-linked N-acetyl-D-hexosamine ...
719-908 1.71e-06

Beta-N-acetylhexosaminidases catalyze the removal of beta-1,4-linked N-acetyl-D-hexosamine residues from the non-reducing ends of N-acetyl-beta-D-hexosaminides including N-acetylglucosides and N-acetylgalactosides. The hexA and hexB genes encode the alpha- and beta-subunits of the two major beta-N-acetylhexosaminidase isoenzymes, N-acetyl-beta-D-hexosaminidase A (HexA) and beta-N-acetylhexosaminidase B (HexB). Both the alpha and the beta catalytic subunits have a TIM-barrel fold and belong to the glycosyl hydrolase family 20 (GH20). The HexA enzyme is a heterodimer containing one alpha and one beta subunit while the HexB enzyme is a homodimer containing two beta-subunits. Hexosaminidase mutations cause an inability to properly hydrolyze certain sphingolipids which accumulate in lysosomes within the brain, resulting in the lipid storage disorders Tay-Sachs and Sandhoff. Mutations in the alpha subunit cause a deficiency in the HexA enzyme and result in Tay-Sachs disease; mutations in the beta-subunit cause a deficiency in both HexA and HexB enzymes and result in Sandhoff disease. In both disorders GM(2) gangliosides accumulate in lysosomes. The GH20 hexosaminidases are thought to act via a catalytic mechanism in which the catalytic nucleophile is not provided by solvent or the enzyme, but by the substrate itself.


Pssm-ID: 119332 [Multi-domain]  Cd Length: 348  Bit Score: 51.83  E-value: 1.71e-06
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404  719 TQAEVTELAKYAKEKGIGLIPAINSPGHMDAMLVAMEKLGIANPQA-NFDKVSKTT--MDLENQEALNFTKALIGKYMDY 795
Cdd:cd06562     68 TPEDVKEIVEYARLRGIRVIPEIDTPGHTGSWGQGYPELLTGCYAVwRKYCPEPPCgqLNPTNPKTYDFLKTLFKEVSEL 147
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404  796 FADksKIFNYGTDE-----YANDA------TNAQGWYYLKWYGLYNKFADysnslaAMAKERGLQPMafndgFYYEDKDD 864
Cdd:cd06562    148 FPD--KYFHLGGDEvnfncWNSNPeiqkfmKKNNGTDYSDLESYFIQRAL------DIVRSLGKTPI-----VWEEVFDN 214
                          170       180       190       200
                   ....*....|....*....|....*....|....*....|....*.
gi 1092545404  865 AEF--DKDVLISywskgWWGyNLASPQYLASKGYKFLNTNGDWYYI 908
Cdd:cd06562    215 GVYllPKDTIVQ-----VWG-GSDELKNVLAAGYKVILSSYDFWYL 254
PTZ00121 PTZ00121
MAEBL; Provisional
58-217 1.72e-06

MAEBL; Provisional


Pssm-ID: 173412 [Multi-domain]  Cd Length: 2084  Bit Score: 52.84  E-value: 1.72e-06
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404   58 ENRTEETPKAElqpETPKTVETETPSTDKVASLPKTEEKTQEEVSSTPSDKEEVVTPTSVEKEAADKKAEEASPKKEE-Q 136
Cdd:PTZ00121  1308 KKKAEEAKKAD---EAKKKAEEAKKKADAAKKKAEEAKKAAEAAKAEAEAAADEAEAAEEKAEAAEKKKEEAKKKADAaK 1384
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404  137 KEANSKESDTDKTDKSEADKDKP---AKKDETKAEADKPATEAGKERAATENEKLAKRKIVSIDAGRKYFSPEQLKEIID 213
Cdd:PTZ00121  1385 KKAEEKKKADEAKKKAEEDKKKAdelKKAAAAKKKADEAKKKAEEKKKADEAKKKAEEAKKADEAKKKAEEAKKAEEAKK 1464

                   ....
gi 1092545404  214 KAKE 217
Cdd:PTZ00121  1465 KAEE 1468
PspC_subgroup_2 NF033839
pneumococcal surface protein PspC, LPXTG-anchored form; The pneumococcal surface protein PspC, ...
38-172 2.10e-06

pneumococcal surface protein PspC, LPXTG-anchored form; The pneumococcal surface protein PspC, as described in Streptococcus pneumoniae, is a repetitive and highly variable protein, recognized by a conserved N-terminal domain and also by genomic location. This form, subgroup 2, is anchored covalently after cleavage by sortase at a C-terminal LPXTG site. The other form, subgroup 1, has variable numbers of a choline-binding repeat in the C-terminal region, and is also known as choline-binding protein A.


Pssm-ID: 468202 [Multi-domain]  Cd Length: 557  Bit Score: 52.08  E-value: 2.10e-06
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404   38 TPTTENQPTIHTvsdsPQPSENRTEETPKAEL--QPETPK---TVETETPSTDKVASL--PKTEEKTQEEvSSTPSDKEE 110
Cdd:NF033839   382 TPKPEVKPQPEK----PKPEVKPQPEKPKPEVkpQPEKPKpevKPQPEKPKPEVKPQPekPKPEVKPQPE-KPKPEVKPQ 456
                           90       100       110       120       130       140
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*.
gi 1092545404  111 VVTPTSVEKEAADKKAEEASPKKEEQKEANSKESDTDK--TDKSEADKDK-PAKKDETKAEA-DKP 172
Cdd:NF033839   457 PETPKPEVKPQPEKPKPEVKPQPEKPKPDNSKPQADDKkpSTPNNLSKDKqPSNQASTNEKAtNKP 522
PTZ00121 PTZ00121
MAEBL; Provisional
57-217 3.87e-06

MAEBL; Provisional


Pssm-ID: 173412 [Multi-domain]  Cd Length: 2084  Bit Score: 51.68  E-value: 3.87e-06
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404   57 SENRTEETPKAELQPETPKtVETETPSTDKVASLPKTEEKTQEEVSSTPSDKEEVVTPTSVEKEAADKKAEEASPKKEEQ 136
Cdd:PTZ00121  1352 AEAAADEAEAAEEKAEAAE-KKKEEAKKKADAAKKKAEEKKKADEAKKKAEEDKKKADELKKAAAAKKKADEAKKKAEEK 1430
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404  137 K---EANSKESDTDKTD--KSEADKDKPAKKDETKAEADKPATEAGKE----RAATENEKLAKRKIVSIDAGRKYFSPEQ 207
Cdd:PTZ00121  1431 KkadEAKKKAEEAKKADeaKKKAEEAKKAEEAKKKAEEAKKADEAKKKaeeaKKADEAKKKAEEAKKKADEAKKAAEAKK 1510
                          170
                   ....*....|
gi 1092545404  208 LKEIIDKAKE 217
Cdd:PTZ00121  1511 KADEAKKAEE 1520
PTZ00121 PTZ00121
MAEBL; Provisional
62-217 4.35e-06

MAEBL; Provisional


Pssm-ID: 173412 [Multi-domain]  Cd Length: 2084  Bit Score: 51.68  E-value: 4.35e-06
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404   62 EETPKAE---LQPETPKTVETETPSTDKVASLPKTEEKTQ-EEVSSTpsdkEEVVTPTSVEKEAADKKAEEASPKKEEQK 137
Cdd:PTZ00121  1240 EEAKKAEeerNNEEIRKFEEARMAHFARRQAAIKAEEARKaDELKKA----EEKKKADEAKKAEEKKKADEAKKKAEEAK 1315
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404  138 EANS---------KESDTDKTDKSEADKDKPAKKDETKAEADKpaTEAGKERAATENEKLAKRKIVSIDAGRKYFSPEQL 208
Cdd:PTZ00121  1316 KADEakkkaeeakKKADAAKKKAEEAKKAAEAAKAEAEAAADE--AEAAEEKAEAAEKKKEEAKKKADAAKKKAEEKKKA 1393

                   ....*....
gi 1092545404  209 KEIIDKAKE 217
Cdd:PTZ00121  1394 DEAKKKAEE 1402
glucan_65_rpt TIGR04035
glucan-binding repeat; This model describes a region of about 63 amino acids that is composed ...
1234-1284 4.69e-06

glucan-binding repeat; This model describes a region of about 63 amino acids that is composed of three repeats of a more broadly distributed family of shorter repeats modeled by pfam01473. While the shorter repeats are often associated with choline binding (and therefore with cell wall binding), the longer repeat described here represents a subgroup of repeat sequences associated with glucan binding, as found in a number of glycosylhydrolases. Shah, et al. describe a repeat consensus, WYYFDANGKAVTGAQTINGQTLYFDQDGKQVKG, that corresponds to half of the repeat as modeled here and one and a half copies of the repeat as modeled by pfam01473.


Pssm-ID: 274933 [Multi-domain]  Cd Length: 62  Bit Score: 45.20  E-value: 4.69e-06
                           10        20        30        40        50
                   ....*....|....*....|....*....|....*....|....*....|...
gi 1092545404 1234 YYLDESGVMKTGWQKVNGTWYYLDNSGAMQTGWI--DQGGSWYYLNDSGAMQT 1284
Cdd:TIGR04035    1 YYFDADGKAVTGAQTIDGVTYYFDENGKQVKGDFvtNGGGTYYYDKDSGALVT 53
GH20_Sm-chitobiase-like cd06569
The chitobiase of Serratia marcescens is a beta-N-1,4-acetylhexosaminidase with a glycosyl ...
712-907 8.60e-06

The chitobiase of Serratia marcescens is a beta-N-1,4-acetylhexosaminidase with a glycosyl hydrolase family 20 (GH20) domain that hydrolyzes the beta-1,4-glycosidic linkages in oligomers derived from chitin. Chitin is degraded by a two step process: i) a chitinase hydrolyzes the chitin to oligosaccharides and disaccharides such as di-N-acetyl-D-glucosamine and chitobiose, ii) chitobiase then further degrades these oligomers into monomers. The GH20 hexosaminidases are thought to act via a catalytic mechanism in which the catalytic nucleophile is not provided by solvent or the enzyme, but by the substrate itself.


Pssm-ID: 119337  Cd Length: 445  Bit Score: 49.98  E-value: 8.60e-06
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404  712 DPNGTA-LTQAEVTELAKYAKEKGIGLIPAINSPGHMDAMLVAME-------KLGIANPQANF------DKVSKTTMDLE 777
Cdd:cd06569     87 NNSGSGyYSRADYIEILKYAKARHIEVIPEIDMPGHARAAIKAMEaryrklmAAGKPAEAEEYrlsdpaDTSQYLSVQFY 166
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404  778 NQEALN--------FTKALIGKYMDYFAD---KSKIFNYGTDEYANDATNaqGWYYLKW------------YGLYNKFAd 834
Cdd:cd06569    167 TDNVINpcmpstyrFVDKVIDEIARMHQEagqPLTTIHFGGDEVPEGAWG--GSPACKAqlfakegsvkdvEDLKDYFF- 243
                          170       180       190       200       210       220       230
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*..
gi 1092545404  835 ysNSLAAMAKERGLQPMAFNDGFYYEDKDdaeFDKDVLISY-WSKGW-WGYNLASPQY--LASKGYKFLNTNGDWYY 907
Cdd:cd06569    244 --ERVSKILKAHGITLAGWEDGLLGKDTT---NVDGFATPYvWNNVWgWGYWGGEDRAykLANKGYDVVLSNATNLY 315
MSCRAMM_ClfA NF033609
MSCRAMM family adhesin clumping factor ClfA; Clumping factor A is an MSCRAMM (Microbial ...
1-181 9.62e-06

MSCRAMM family adhesin clumping factor ClfA; Clumping factor A is an MSCRAMM (Microbial Surface Components Recognizing Adhesive Matrix Molecules). It is heavily studied in Staphylococcus aureus both for its biological role in adhesion and for its potential for vaccination. Features of the sequence, but also of other MSCRAMM adhesins, include a long run of Ser-Asp dipeptide repeats and a C-terminal cell wall anchoring LPXTG motif.


Pssm-ID: 468110 [Multi-domain]  Cd Length: 934  Bit Score: 50.29  E-value: 9.62e-06
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404    1 MKQEKQQRFSIRKYAVGAASVLIGF-------AFQAQAVAADGVT--------------------PTTENQptihTVSDS 53
Cdd:NF033609     1 MNMKKKEKHAIRKKSIGVASVLVGTligfgllSSKEADASENSVTqsdsasnesksndsssvsaaPKTDDT----NVSDT 76
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404   54 PQPSENRTEETPKAE--LQPETPKTVET-----ETPSTDKVASLPKTEEKTQEEVSSTPSDKEEVVTPTSVEKEAADKKA 126
Cdd:NF033609    77 KTSSNTNNGETSVAQnpAQQETTQSASTnatteETPVTGEATTTATNQANTPATTQSSNTNAEELVNQTSNETTSNDTNT 156
                          170       180       190       200       210
                   ....*....|....*....|....*....|....*....|....*....|....*....
gi 1092545404  127 EEA--SPKKEEQKEANSKESDTDKTDKSEADKDKPAKKDETKAEADKPA--TEAGKERA 181
Cdd:NF033609   157 VSSvnSPQNSTNAENVSTTQDTSTEATPSNNESAPQSTDASNKDVVNQAvnTSAPRMRA 215
Choline_bind_1 pfam01473
Putative cell wall binding repeat; These repeats are characterized by conserved aromatic ...
1244-1262 9.77e-06

Putative cell wall binding repeat; These repeats, characterized by conserved aromatic residues and glycines, are found in multiple tandem copies in a number of proteins. The CW repeat is 20 amino acid residues long. The exact domain boundaries may not be correct. It has been suggested that these repeats in Swiss:P15057 might be responsible for the specific recognition of choline-containing cell walls. Similar but longer repeats are found in the glucosyltransferases and glucan-binding proteins of oral streptococci, where they have been shown to be involved in glucan binding, as well as in the related dextransucrases of Leuconostoc mesenteroides. Repeats also occur in toxins of Clostridium difficile and other clostridia, though the ligands are not always known.


Pssm-ID: 366661 [Multi-domain]  Cd Length: 19  Bit Score: 43.14  E-value: 9.77e-06
                           10
                   ....*....|....*....
gi 1092545404 1244 TGWQKVNGTWYYLDNSGAM 1262
Cdd:pfam01473    1 TGWVKINGNWYYFDSNGVM 19
Caldesmon pfam02029
Caldesmon;
50-226 3.62e-05

Caldesmon;


Pssm-ID: 460421 [Multi-domain]  Cd Length: 495  Bit Score: 47.94  E-value: 3.62e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404   50 VSDSPQPSENRTEETPKaELQPETPKTVETETPSTDKVAslpKTEEKTQEEVSSTpSDKEEVVTPT------SVEKEAAD 123
Cdd:pfam02029   32 VTESVEPNEHNSYEEDS-ELKPSGQGGLDEEEAFLDRTA---KREERRQKRLQEA-LERQKEFDPTiadekeSVAERKEN 106
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404  124 KKAEEASP-KKEEQKEANSKESDTDKTDKSEADKDKPAKKDETKAEADKpatEAGKERAATENEKLAKrkivsidagRKY 202
Cdd:pfam02029  107 NEEEENSSwEKEEKRDSRLGRYKEEETEIREKEYQENKWSTEVRQAEEE---GEEEEDKSEEAEEVPT---------ENF 174
                          170       180
                   ....*....|....*....|....
gi 1092545404  203 FSPEQLKEIIDKAKEYGYTDLHLL 226
Cdd:pfam02029  175 AKEEVKDEKIKKEKKVKYESKVFL 198
PTZ00121 PTZ00121
MAEBL; Provisional
60-217 4.06e-05

MAEBL; Provisional


Pssm-ID: 173412 [Multi-domain]  Cd Length: 2084  Bit Score: 48.21  E-value: 4.06e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404   60 RTEETPKAElqpETPKTVEtETPSTDKVASLPKTEEKTQEEVSSTPSDKEEVVTPTSVEKEAADKKAEEASPKKEEQKEA 139
Cdd:PTZ00121  1297 KAEEKKKAD---EAKKKAE-EAKKADEAKKKAEEAKKKADAAKKKAEEAKKAAEAAKAEAEAAADEAEAAEEKAEAAEKK 1372
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404  140 NSKESDTDKTDKSEADKDKPAKKDETKAEADKPATEAGKERA-----ATENEKLAKRKIVSIDAGRKYFSPEQLKEIIDK 214
Cdd:PTZ00121  1373 KEEAKKKADAAKKKAEEKKKADEAKKKAEEDKKKADELKKAAaakkkADEAKKKAEEKKKADEAKKKAEEAKKADEAKKK 1452

                   ...
gi 1092545404  215 AKE 217
Cdd:PTZ00121  1453 AEE 1455
PTZ00121 PTZ00121
MAEBL; Provisional
42-213 6.51e-05

MAEBL; Provisional


Pssm-ID: 173412 [Multi-domain]  Cd Length: 2084  Bit Score: 47.83  E-value: 6.51e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404   42 ENQPTIHTVSDSPQPSENRTEETPKAELQPETPKTVETETPSTDK--VASLPKTEEKTQEEVSSTPSDKEEVVTPTSVEK 119
Cdd:PTZ00121  1630 EEKKKVEQLKKKEAEEKKKAEELKKAEEENKIKAAEEAKKAEEDKkkAEEAKKAEEDEKKAAEALKKEAEEAKKAEELKK 1709
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404  120 EAAD--KKAEEASPKKEEQK---EANSKESDTDKTDKSEADKDKPAKKD--ETKAEADKPATEAGKERAATENEKLAKRk 192
Cdd:PTZ00121  1710 KEAEekKKAEELKKAEEENKikaEEAKKEAEEDKKKAEEAKKDEEEKKKiaHLKKEEEKKAEEIRKEKEAVIEEELDEE- 1788
                          170       180
                   ....*....|....*....|.
gi 1092545404  193 ivsiDAGRKYFSPEQLKEIID 213
Cdd:PTZ00121  1789 ----DEKRRMEVDKKIKDIFD 1805
PTZ00121 PTZ00121
MAEBL; Provisional
58-217 7.57e-05

MAEBL; Provisional


Pssm-ID: 173412 [Multi-domain]  Cd Length: 2084  Bit Score: 47.44  E-value: 7.57e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404   58 ENRTEETPKAElqpETPKTVETETPSTDkvaSLPKTEEKTQEEVSSTPSDKEEVVTPTSVEKEAADKKAEEASPKKEEQK 137
Cdd:PTZ00121  1437 KKKAEEAKKAD---EAKKKAEEAKKAEE---AKKKAEEAKKADEAKKKAEEAKKADEAKKKAEEAKKKADEAKKAAEAKK 1510
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404  138 EANSKESDTDKTDKSEADKDKPAKKDET--KAEADKPATEAGKE---RAATENEKLAKRKIVSIDAGRKYFSPEQLKEII 212
Cdd:PTZ00121  1511 KADEAKKAEEAKKADEAKKAEEAKKADEakKAEEKKKADELKKAeelKKAEEKKKAEEAKKAEEDKNMALRKAEEAKKAE 1590

                   ....*
gi 1092545404  213 DKAKE 217
Cdd:PTZ00121  1591 EARIE 1595
PTZ00121 PTZ00121
MAEBL; Provisional
58-216 1.31e-04

MAEBL; Provisional


Pssm-ID: 173412 [Multi-domain]  Cd Length: 2084  Bit Score: 46.67  E-value: 1.31e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404   58 ENRTEETPKAElqpETPKTVETETPSTDKVaslpKTEEKTQEEVSSTPSDKEEVVTPTSVEKEAAD-KKAEEASPKKEEQ 136
Cdd:PTZ00121  1384 KKKAEEKKKAD---EAKKKAEEDKKKADEL----KKAAAAKKKADEAKKKAEEKKKADEAKKKAEEaKKADEAKKKAEEA 1456
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404  137 KEANSKESDTDKTDKSEADKDKpAKKDETKAEADKPATEAGKERAATENEKLAKRKIVSIDAGRKYFSPEQLKEIIDKAK 216
Cdd:PTZ00121  1457 KKAEEAKKKAEEAKKADEAKKK-AEEAKKADEAKKKAEEAKKKADEAKKAAEAKKKADEAKKAEEAKKADEAKKAEEAKK 1535
PRK08691 PRK08691
DNA polymerase III subunits gamma and tau; Validated
51-186 1.42e-04

DNA polymerase III subunits gamma and tau; Validated


Pssm-ID: 236333 [Multi-domain]  Cd Length: 709  Bit Score: 46.24  E-value: 1.42e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404   51 SDSPQPSEnrtEETPKAELQPETPKTV-ETETPSTDKVASLPKTEEKTQEEVSSTPSDKEEVVTPTSVEKEAADKKAEEA 129
Cdd:PRK08691   437 EDAPDEAQ---TAAGTAQTSAKSIQTAsEAETPPENQVSKNKAADNETDAPLSEVPSENPIQATPNDEAVETETFAHEAP 513
                           90       100       110       120       130
                   ....*....|....*....|....*....|....*....|....*....|....*..
gi 1092545404  130 SPKKEEQKEANSKESDTDKTDKSEADKDKPAKKDETKAEADKPATEAGKERAATENE 186
Cdd:PRK08691   514 AEPFYGYGFPDNDCPPEDGAEIPPPDWEHAAPADTAGGGADEEAEAGGIGGNNTPSA 570
PTZ00121 PTZ00121
MAEBL; Provisional
60-192 2.59e-04

MAEBL; Provisional


Pssm-ID: 173412 [Multi-domain]  Cd Length: 2084  Bit Score: 45.90  E-value: 2.59e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404   60 RTEETPKAELQPETPKTVETETPSTDKVASLPKTEEK--TQEEVSSTPSDKEEVVTPTSVEKEAAdKKAEEASPKKEEQK 137
Cdd:PTZ00121  1582 KAEEAKKAEEARIEEVMKLYEEEKKMKAEEAKKAEEAkiKAEELKKAEEEKKKVEQLKKKEAEEK-KKAEELKKAEEENK 1660
                           90       100       110       120       130
                   ....*....|....*....|....*....|....*....|....*....|....*..
gi 1092545404  138 EANSKESDTDKTDKSEADKDKPAKKDETKAE--ADKPATEAGKERAATENEKLAKRK 192
Cdd:PTZ00121  1661 IKAAEEAKKAEEDKKKAEEAKKAEEDEKKAAeaLKKEAEEAKKAEELKKKEAEEKKK 1717
PTZ00121 PTZ00121
MAEBL; Provisional
55-217 3.09e-04

MAEBL; Provisional


Pssm-ID: 173412 [Multi-domain]  Cd Length: 2084  Bit Score: 45.52  E-value: 3.09e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404   55 QPSENRTEETPKAELQPETPKTVETETPSTDKVASLPKTEEKTQEEVSSTPSDKEEVVTPTSVEKEAADKKAEEASPKKE 134
Cdd:PTZ00121  1401 EEDKKKADELKKAAAAKKKADEAKKKAEEKKKADEAKKKAEEAKKADEAKKKAEEAKKAEEAKKKAEEAKKADEAKKKAE 1480
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404  135 EQKEANS--KESDTDKTDKSEADKDKPAKK--DET-KAEADKPATEAGKERAATENEKLAK----RKIVSIDAGRKYFSP 205
Cdd:PTZ00121  1481 EAKKADEakKKAEEAKKKADEAKKAAEAKKkaDEAkKAEEAKKADEAKKAEEAKKADEAKKaeekKKADELKKAEELKKA 1560
                          170
                   ....*....|..
gi 1092545404  206 EQLKEIIDKAKE 217
Cdd:PTZ00121  1561 EEKKKAEEAKKA 1572
Borrelia_P83 pfam05262
Borrelia P83/100 protein; This family consists of several Borrelia P83/P100 antigen proteins.
77-223 3.13e-04

Borrelia P83/100 protein; This family consists of several Borrelia P83/P100 antigen proteins.


Pssm-ID: 114011 [Multi-domain]  Cd Length: 489  Bit Score: 44.99  E-value: 3.13e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404   77 VETETPSTDKVAslpkteEKTQEEVSSTPSDKEEVVTPTSVEKEAADKKAEEAspKKEEQKEANSKESDTDKTDKSEADK 156
Cdd:pfam05262  172 VDTDSISDKKVV------EALREDNEKGVNFRRDMTDLKERESQEDAKRAQQL--KEELDKKQIDADKAQQKADFAQDNA 243
                           90       100       110       120       130       140
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*..
gi 1092545404  157 DKPAKKDETKAEADKPATEAGKERAATENEKLAKRKIVSIDAGRKyfSPEQLKEIIDKAKEYGYTDL 223
Cdd:pfam05262  244 DKQRDEVRQKQQEAKNLPKPADTSSPKEDKQVAENQKREIEKAQI--EIKKNDEEALKAKDHKAFDL 308
RCSD pfam05177
RCSD region; Proteins containing this region include C. elegans UNC-89. This region is found ...
91-187 3.66e-04

RCSD region; Proteins containing this region include C. elegans UNC-89. This region is found repeated in UNC-89 and shows conservation of prolines, lysines and glutamic acids. Proteins with RCSD are involved in muscle M-line assembly, but the function of the RCSD region itself is not clear.


Pssm-ID: 428350 [Multi-domain]  Cd Length: 101  Bit Score: 41.18  E-value: 3.66e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404   91 PKTEEKTQEEVSSTPSDKEEVVTPTSVEKEAADKKA----EEA--SPKKEEQKEANSKESDTDKTDKSEADKDKPAKKDE 164
Cdd:pfam05177    3 PGKKEKPPLRRTSSRTEKQEEKGRAPEEAEHSPKAVggseEEKpkSPAKEEAVEAQASSPEAANGCGSPTEEKKAGEKVE 82
                           90       100
                   ....*....|....*....|...
gi 1092545404  165 TkaeadKPATEAGKERAATENEK 187
Cdd:pfam05177   83 E-----KKSSEVKEERAENEEDS 100
COG5263 COG5263
Glucan-binding domain (YG repeat) [Carbohydrate transport and metabolism];
1214-1323 4.34e-04

Glucan-binding domain (YG repeat) [Carbohydrate transport and metabolism];


Pssm-ID: 444077 [Multi-domain]  Cd Length: 486  Bit Score: 44.48  E-value: 4.34e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404 1214 WYYYDHADKVKTGWVKDGAWYYLDESGVMKTGWQKVNGTWYYLDNSGAMQTGWIDQGGSWYYLNDSGAMQTGWVNQ---- 1289
Cdd:COG5263    213 AAGSGAGAKKTGSTAGASGTAYGDSGGTAGSGLSSLGGSSNALESGGENNQSLAGNGTSYDDAGAAGVDGTGTTGTvgwv 292
                           90       100       110
                   ....*....|....*....|....*....|....
gi 1092545404 1290 GDTWYYLDNsGTMKTGWFQVEDKWYYSYPSGALA 1323
Cdd:COG5263    293 DGKWYYFDA-GKMVTGWQTINGKWYYFDSDGAMA 325
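The pfam01473 description elsewhere in this listing states that the cell wall binding (CW) repeat is 20 residues long. As a rough sanity check, the sketch below takes the query rows from the COG5263 alignment above (residues 1214-1323), removes gap characters, and reports where the short motif TGW recurs. The choice of TGW as a repeat marker, and the helper names ungapped and motif_positions, are assumptions made for illustration; this is not how CD-Search detects repeats.

```python
import re

# Query rows copied from the COG5263 alignment above (residues 1214-1323).
QUERY_ROWS = [
    "WYYYDHADKVKTGWVKDGAWYYLDESGVMKTGWQKVNGTWYYLDNSGAMQTGWIDQGGSWYYLNDSGAMQTGWVNQ----",
    "GDTWYYLDNsGTMKTGWFQVEDKWYYSYPSGALA",
]
FIRST_RESIDUE = 1214  # residue number of the first query character


def ungapped(rows):
    """Join alignment rows, drop gap characters, and normalise case."""
    return "".join(rows).replace("-", "").upper()


def motif_positions(seq, motif, first_residue):
    """Return 1-based residue numbers at which the motif starts."""
    return [first_residue + m.start() for m in re.finditer(motif, seq)]


segment = ungapped(QUERY_ROWS)
starts = motif_positions(segment, "TGW", FIRST_RESIDUE)
# Distance between consecutive motif starts approximates the repeat period.
spacings = [b - a for a, b in zip(starts, starts[1:])]
print(starts)
print(spacings)
```

The reported start positions include 1225, 1244, 1264 and 1284, which are the interval starts of the four pfam01473 hits in this listing, and the spacings are close to the stated 20-residue repeat length.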
PRK05901 PRK05901
RNA polymerase sigma factor; Provisional
83-252 4.65e-04

RNA polymerase sigma factor; Provisional


Pssm-ID: 235640 [Multi-domain]  Cd Length: 509  Bit Score: 44.60  E-value: 4.65e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404   83 STDKVASLPKTEEKTQEEVSSTPSDKEEVVTPTSVEKeaadKKAEEASPKKEEQKEANSKESDTDKTDKSEA-DKDKPAK 161
Cdd:PRK05901     2 TTASTKAELAAEEEAKKKLKKLAAKSKSKGFITKEEI----KEALESKKKTPEQIDQVLIFLSGMVKDTDDAtESDIPKK 77
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404  162 KDETKAEADKPATEAGKERAATENEKLAKRKIVSIDAGRKyFSPEQLKEIIDKAKEYGYTDLHLLVGNDGLRFMLDDMSM 241
Cdd:PRK05901    78 KTKTAAKAAAAKAPAKKKLKDELDSSKKAEKKNALDKDDD-LNYVKDIDVLNQADDDDDDDDDDDLDDDDIDDDDDDEDD 156
                          170
                   ....*....|.
gi 1092545404  242 KVGDKTYSSDD 252
Cdd:PRK05901   157 DEDDDDDDVDD 167
tolA PRK09510
cell envelope integrity inner membrane protein TolA; Provisional
92-189 5.26e-04

cell envelope integrity inner membrane protein TolA; Provisional


Pssm-ID: 236545 [Multi-domain]  Cd Length: 387  Bit Score: 44.03  E-value: 5.26e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404   92 KTEEKTQEEVSSTPSDKEEVVTPTSVEKEAADKKAEEASPKKEEQKEANSKESDTDKTDKsEADKDKPAKKDETKAEADK 171
Cdd:PRK09510   162 KAAAEAKKKAEAEAAKKAAAEAKKKAEAEAAAKAAAEAKKKAEAEAKKKAAAEAKKKAAA-EAKAAAAKAAAEAKAAAEK 240
                           90
                   ....*....|....*...
gi 1092545404  172 PATEAGKERAATENEKLA 189
Cdd:PRK09510   241 AAAAKAAEKAAAAKAAAE 258
Choline_bind_1 pfam01473
Putative cell wall binding repeat; These repeats are characterized by conserved aromatic ...
1284-1302 5.35e-04

Putative cell wall binding repeat; These repeats, characterized by conserved aromatic residues and glycines, are found in multiple tandem copies in a number of proteins. The CW repeat is 20 amino acid residues long. The exact domain boundaries may not be correct. It has been suggested that these repeats in Swiss:P15057 might be responsible for the specific recognition of choline-containing cell walls. Similar but longer repeats are found in the glucosyltransferases and glucan-binding proteins of oral streptococci, where they have been shown to be involved in glucan binding, as well as in the related dextransucrases of Leuconostoc mesenteroides. Repeats also occur in toxins of Clostridium difficile and other clostridia, though the ligands are not always known.


Pssm-ID: 366661 [Multi-domain]  Cd Length: 19  Bit Score: 38.52  E-value: 5.35e-04
                           10
                   ....*....|....*....
gi 1092545404 1284 TGWVNQGDTWYYLDNSGTM 1302
Cdd:pfam01473    1 TGWVKINGNWYYFDSNGVM 19
PTZ00121 PTZ00121
MAEBL; Provisional
60-262 5.35e-04

MAEBL; Provisional


Pssm-ID: 173412 [Multi-domain]  Cd Length: 2084  Bit Score: 44.75  E-value: 5.35e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404   60 RTEETPKAELQPETPKTVETETPSTDKVASLPKTEEKTQEEVSSTPSDKEEvvtpTSVEKEAADKKAEEASPKKEEQKEA 139
Cdd:PTZ00121  1608 KAEEAKKAEEAKIKAEELKKAEEEKKKVEQLKKKEAEEKKKAEELKKAEEE----NKIKAAEEAKKAEEDKKKAEEAKKA 1683
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404  140 NSKESDTDKTDKSEADKDKPA----KKDET---KAEADKPATEAGKERA--ATENEKLAKRKI--VSIDAGRKYFSPEQL 208
Cdd:PTZ00121  1684 EEDEKKAAEALKKEAEEAKKAeelkKKEAEekkKAEELKKAEEENKIKAeeAKKEAEEDKKKAeeAKKDEEEKKKIAHLK 1763
                          170       180       190       200       210
                   ....*....|....*....|....*....|....*....|....*....|....
gi 1092545404  209 KEIIDKAKEYGYTDLHLLvgNDGLRFMLDDMSMKVGDKTYSSDDVKRAIENGTN 262
Cdd:PTZ00121  1764 KEEEKKAEEIRKEKEAVI--EEELDEEDEKRRMEVDKKIKDIFDNFANIIEGGK 1815
PspC_subgroup_1 NF033838
pneumococcal surface protein PspC, choline-binding form; The pneumococcal surface protein PspC, ...
2-218 6.44e-04

pneumococcal surface protein PspC, choline-binding form; The pneumococcal surface protein PspC, as described in Streptococcus pneumoniae, is a repetitive and highly variable protein, recognized by a conserved N-terminal domain and also by genomic location. This form, subgroup 1, has variable numbers of a choline-binding repeat in the C-terminal region, and is also known as choline-binding protein A. The other form, subgroup 2, is anchored covalently after cleavage by sortase at a C-terminal LPXTG site.


Pssm-ID: 468201 [Multi-domain]  Cd Length: 684  Bit Score: 44.23  E-value: 6.44e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404    2 KQEKQQRFSIRKYAVGAASVLIGFAFQAqavaadGVTPTTENQPT-IHTVSDSPQPSENRTEETPKA-------ELQPET 73
Cdd:NF033838     5 KSERKVHYSIRKFSIGVASVVVASLFLG------GVVHAEEVRGGnNPTVTSSGNESQKEHAKEVEShlekilsEIQKSL 78
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404   74 PKTVETETPSTDKVASLPKTE---------EKTQEEVSSTPSD---------KEEVVTPTSVEKEaADKKAEEASPKKEE 135
Cdd:NF033838    79 DKRKHTQNVALNKKLSDIKTEylyelnvlkEKSEAELTSKTKKeldaafeqfKKDTLEPGKKVAE-ATKKVEEAEKKAKD 157
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404  136 QKEANSKESDTD-----KTDKSEAD-KDKPAKKDETKAEADKPATEA--GKERAATENEKLAKRKIVSIDAGRKYFSPEQ 207
Cdd:NF033838   158 QKEEDRRNYPTNtyktlELEIAESDvEVKKAELELVKEEAKEPRDEEkiKQAKAKVESKKAEATRLEKIKTDREKAEEEA 237
                          250
                   ....*....|.
gi 1092545404  208 LKEIIDKAKEY 218
Cdd:NF033838   238 KRRADAKLKEA 248
tolA PRK09510
cell envelope integrity inner membrane protein TolA; Provisional
55-192 8.25e-04

cell envelope integrity inner membrane protein TolA; Provisional


Pssm-ID: 236545 [Multi-domain]  Cd Length: 387  Bit Score: 43.26  E-value: 8.25e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404   55 QPSENRTEETPKAELQPETPKTVETETPSTDKVASLPKTEEKTQEEVSSTPSDKEEVVTPTSVEKEAADKKAEEASPKKE 134
Cdd:PRK09510    71 QKSAKRAEEQRKKKEQQQAEELQQKQAAEQERLKQLEKERLAAQEQKKQAEEAAKQAALKQKQAEEAAAKAAAAAKAKAE 150
                           90       100       110       120       130       140
                   ....*....|....*....|....*....|....*....|....*....|....*....|.
gi 1092545404  135 EQK---EANSKESDTDKTDKSEADKDKPAKKdETKAEADKPATEAGKERAATENEKLAKRK 192
Cdd:PRK09510   151 AEAkraAAAAKKAAAEAKKKAEAEAAKKAAA-EAKKKAEAEAAAKAAAEAKKKAEAEAKKK 210
Choline_bind_1 pfam01473
Putative cell wall binding repeat; These repeats are characterized by conserved aromatic ...
1264-1282 9.71e-04

Putative cell wall binding repeat; These repeats, characterized by conserved aromatic residues and glycines, are found in multiple tandem copies in a number of proteins. The CW repeat is 20 amino acid residues long. The exact domain boundaries may not be correct. It has been suggested that these repeats in Swiss:P15057 might be responsible for the specific recognition of choline-containing cell walls. Similar but longer repeats are found in the glucosyltransferases and glucan-binding proteins of oral streptococci, where they have been shown to be involved in glucan binding, as well as in the related dextransucrases of Leuconostoc mesenteroides. Repeats also occur in toxins of Clostridium difficile and other clostridia, though the ligands are not always known.


Pssm-ID: 366661 [Multi-domain]  Cd Length: 19  Bit Score: 37.75  E-value: 9.71e-04
                           10
                   ....*....|....*....
gi 1092545404 1264 TGWIDQGGSWYYLNDSGAM 1282
Cdd:pfam01473    1 TGWVKINGNWYYFDSNGVM 19
tolA_full TIGR02794
TolA protein; TolA couples the inner membrane complex of itself with TolQ and TolR to the ...
119-201 1.39e-03

TolA protein; TolA couples the inner membrane complex of itself with TolQ and TolR to the outer membrane complex of TolB and OprL (also called Pal). Most of the length of the protein consists of low-complexity sequence that may differ in both length and composition from one species to another, complicating efforts to discriminate TolA (the most divergent gene in the tol-pal system) from paralogs such as TonB. Selection of members of the seed alignment and criteria for setting scoring cutoffs are based largely on conserved operon structure. The Tol-Pal complex is required for maintaining outer membrane integrity. It is also involved in transport (uptake) of colicins and filamentous DNA, and is implicated in pathogenesis. Transport is energized by the proton motive force. TolA is an inner membrane protein that interacts with periplasmic TolB and with outer membrane porins ompC, phoE and lamB. [Transport and binding proteins, Other, Cellular processes, Pathogenesis]


Pssm-ID: 274303 [Multi-domain]  Cd Length: 346  Bit Score: 42.53  E-value: 1.39e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404  119 KEAADKKAEEASPKK---EEQKEANSKESDTDKTDKSEADKDKPAKKDETKAEAdkpatEAGKERAATENEKLA---KRK 192
Cdd:TIGR02794  145 KEEAAKQAEEEAKAKaaaEAKKKAEEAKKKAEAEAKAKAEAEAKAKAEEAKAKA-----EAAKAKAAAEAAAKAeaeAAA 219

                   ....*....
gi 1092545404  193 IVSIDAGRK 201
Cdd:TIGR02794  220 AAAAEAERK 228
glucan_65_rpt TIGR04035
glucan-binding repeat; This model describes a region of about 63 amino acids that is composed ...
1274-1329 1.81e-03

glucan-binding repeat; This model describes a region of about 63 amino acids that is composed of three repeats of a more broadly distributed family of shorter repeats modeled by pfam01473. While the shorter repeats are often associated with choline binding (and therefore with cell wall binding), the longer repeat described here represents a subgroup of repeat sequences associated with glucan binding, as found in a number of glycosyl hydrolases. Shah, et al. describe a repeat consensus, WYYFDANGKAVTGAQTINGQTLYFDQDGKQVKG, that corresponds to half of the repeat as modeled here and one and a half copies of the repeat as modeled by pfam01473.


Pssm-ID: 274933 [Multi-domain]  Cd Length: 62  Bit Score: 37.88  E-value: 1.81e-03
                           10        20        30        40        50
                   ....*....|....*....|....*....|....*....|....*....|....*...
gi 1092545404 1274 YYLNDSGAMQTGWVNQGDTWYYLDNSGTM-KTGWFQVEDKWYYSYP-SGALAVNTTID 1329
Cdd:TIGR04035    1 YYFDADGKAVTGAQTIDGVTYYFDENGKQvKGDFVTNGGGTYYYDKdSGALVTNRFVT 58
RNA_polI_A34 pfam08208
DNA-directed RNA polymerase I subunit RPA34.5; This is a family of proteins conserved from ...
83-166 2.92e-03

DNA-directed RNA polymerase I subunit RPA34.5; This is a family of proteins conserved from yeasts to humans. Subunit A34.5 of RNA polymerase I is a non-essential subunit that is thought to help Pol I overcome topological constraints imposed on ribosomal DNA during transcription.


Pssm-ID: 462395  Cd Length: 205  Bit Score: 40.44  E-value: 2.92e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404   83 STDKVASLPKTEEKTQEEVSSTPSDKEEVVTPTSVEKEAADKKAEEASPKKEEQKEANSKESDTDKTDKSEADKDKPAKK 162
Cdd:pfam08208  122 SGPGPPGTIGSSSEESEGEEEKRKVPAAKKKSKKKKKKKAETEEEEEVPKKKKKKKKKKKEKKEPEKKEKKEKKSKKSKK 201

                   ....
gi 1092545404  163 DETK 166
Cdd:pfam08208  202 EKKK 205
rplD PRK14907
50S ribosomal protein L4; Provisional
57-187 3.04e-03

50S ribosomal protein L4; Provisional


Pssm-ID: 184900 [Multi-domain]  Cd Length: 295  Bit Score: 41.09  E-value: 3.04e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404   57 SENRTEETPKAELQPETPKTVETETPSTDKVASLPKTEEKTQEEVSSTPSDKEEVVTPTSVEKEAADKKA-EEASPKKEE 135
Cdd:PRK14907    13 TEEKKPAAKKATTSKETAKTKKTAKTTSTKAAKKAAKVKKTKSVKTTTKKVTVKFEKTESVKKESVAKKTvKKEAVSAEV 92
                           90       100       110       120       130
                   ....*....|....*....|....*....|....*....|....*....|..
gi 1092545404  136 QKEANSKESDTDKTDKSEADKDKPakkdETKAEADKPATEAGKERAATENEK 187
Cdd:PRK14907    93 FEASNKLFKNTSKLPKKLFASEKI----YSQAIFDTILSERASRRQGTHKVK 140
valS PRK14900
valyl-tRNA synthetase; Provisional
40-181 3.17e-03

valyl-tRNA synthetase; Provisional


Pssm-ID: 237855 [Multi-domain]  Cd Length: 1052  Bit Score: 41.90  E-value: 3.17e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404   40 TTENQPTIHTVSDSPQPSENRTEETPKAElqpETPKTVET-----ETPSTDKVASLPKTEEKTQEEVSStpSDKEEVVTP 114
Cdd:PRK14900   915 TMEIQNEQKPTQDGPAAEAQPAQENTVVE---SAEKAVAAvseaaQQAATAVASGIEKVAEAVRKTVRR--SVKKAAATR 989
                           90       100       110       120       130       140
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*..
gi 1092545404  115 TSVEKEAADKkaeeASPKKEEQKEANSKESDTDKTDKSEADKDKPAKKDETKAEADKPATEAGKERA 181
Cdd:PRK14900   990 AAMKKKVAKK----APAKKAAAKKAAAKKAAAKKKVAKKAPAKKVARKPAAKKAAKKPARKAAGRKA 1052
PTZ00121 PTZ00121
MAEBL; Provisional
60-210 3.38e-03

MAEBL; Provisional


Pssm-ID: 173412 [Multi-domain]  Cd Length: 2084  Bit Score: 42.05  E-value: 3.38e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404   60 RTEETPKAElqpETPKTVETETPSTDKVASLPKTEEKTQEEVSSTPSD------------KEEVVTPTSVEKEAADKKAE 127
Cdd:PTZ00121  1559 KAEEKKKAE---EAKKAEEDKNMALRKAEEAKKAEEARIEEVMKLYEEekkmkaeeakkaEEAKIKAEELKKAEEEKKKV 1635
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404  128 EASPKKEEQK----EANSKESDTDKTDKSE----ADKDKPAKKDETKAEADKPATEAGKERAATENEKLAKRKIVSIDAG 199
Cdd:PTZ00121  1636 EQLKKKEAEEkkkaEELKKAEEENKIKAAEeakkAEEDKKKAEEAKKAEEDEKKAAEALKKEAEEAKKAEELKKKEAEEK 1715
                          170
                   ....*....|.
gi 1092545404  200 RKyfsPEQLKE 210
Cdd:PTZ00121  1716 KK---AEELKK 1723
Choline_bind_1 pfam01473
Putative cell wall binding repeat; These repeats are characterized by conserved aromatic ...
1225-1242 3.98e-03

Putative cell wall binding repeat; These repeats, characterized by conserved aromatic residues and glycines, are found in multiple tandem copies in a number of proteins. The CW repeat is 20 amino acid residues long. The exact domain boundaries may not be correct. It has been suggested that these repeats in Swiss:P15057 might be responsible for the specific recognition of choline-containing cell walls. Similar but longer repeats are found in the glucosyltransferases and glucan-binding proteins of oral streptococci, where they have been shown to be involved in glucan binding, as well as in the related dextransucrases of Leuconostoc mesenteroides. Repeats also occur in toxins of Clostridium difficile and other clostridia, though the ligands are not always known.


Pssm-ID: 366661 [Multi-domain]  Cd Length: 19  Bit Score: 35.82  E-value: 3.98e-03
                           10
                   ....*....|....*....
gi 1092545404 1225 TGWVKD-GAWYYLDESGVM 1242
Cdd:pfam01473    1 TGWVKInGNWYYFDSNGVM 19
PspC_subgroup_2 NF033839
pneumococcal surface protein PspC, LPXTG-anchored form; The pneumococcal surface protein PspC, ...
2-186 4.66e-03

pneumococcal surface protein PspC, LPXTG-anchored form; The pneumococcal surface protein PspC, as described in Streptococcus pneumoniae, is a repetitive and highly variable protein, recognized by a conserved N-terminal domain and also by genomic location. This form, subgroup 2, is anchored covalently after cleavage by sortase at a C-terminal LPXTG site. The other form, subgroup 1, has variable numbers of a choline-binding repeat in the C-terminal region, and is also known as choline-binding protein A.


Pssm-ID: 468202 [Multi-domain]  Cd Length: 557  Bit Score: 41.29  E-value: 4.66e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404    2 KQEKQQRFSIRKYAVGAASVLIGFAFQAQavaadgVTPTTENQPTIHTVSDSPQPSENRTEETPKAELQPETPKTVETET 81
Cdd:NF033839     5 NHERKMRYSIRKFSIGVASVAVASLFMGS------VVHATEKEGSTQAATSSNRGNESQAEQRKELDLERDKAKKAVSEY 78
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404   82 pSTDKVASLPKTEEKTQEevsstpSDKEEVVTPTSVEKEAADKKAEEASPKKEEQKEANSKESDTDK----------TDK 151
Cdd:NF033839    79 -KEKKVKEIYKKSTKERH------KNTVDLVNKLQNIKNEYLNKIVESTSKSQLQKLMMESQSKVDEavskfekdssSSS 151
                          170       180       190
                   ....*....|....*....|....*....|....*
gi 1092545404  152 SEADKDKPAKKDETKAEADKPATEAGKERAATENE 186
Cdd:NF033839   152 SSGSSTKPETPQPENPEHQKPTTPAPDTKPSPQPE 186
PRK13108 PRK13108
prolipoprotein diacylglyceryl transferase; Reviewed
41-162 4.96e-03


Pssm-ID: 237284 [Multi-domain]  Cd Length: 460  Bit Score: 41.12  E-value: 4.96e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404   41 TENQPTIHTVSDSPQPSENRTEETPKAELQPETPKTVETETPSTDKvaslpktEEKTQEEVSSTPSDKEEVVTPTSVEKE 120
Cdd:PRK13108   343 EVAAESVVQVADRDGESTPAVEETSEADIEREQPGDLAGQAPAAHQ-------VDAEAASAAPEEPAALASEAHDETEPE 415
                           90       100       110       120
                   ....*....|....*....|....*....|....*....|..
gi 1092545404  121 AADKKAEEASPKKEEQKEANSKESDTDKTDKSEADKDKPAKK 162
Cdd:PRK13108   416 VPEKAAPIPDPAKPDELAVAGPGDDPAEPDGIRRQDDFSSRR 457
PRK05035 PRK05035
electron transport complex protein RnfC; Provisional
66-192 5.41e-03


Pssm-ID: 235334 [Multi-domain]  Cd Length: 695  Bit Score: 41.09  E-value: 5.41e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404   66 KAELQPETPKTVETETPSTDKVA---SLPKTEEKTQEEVSSTPSDKEEVVTPTSVEKEAADKKAEeasPKKEEQKEANSK 142
Cdd:PRK05035   561 KAAQQAANAEAEEEVDPKKAAVAaaiARAKAKKAAQQAASAEPEEQVAEVDPKKAAVAAAIARAK---AKKAEQQANAEP 637
                           90       100       110       120       130
                   ....*....|....*....|....*....|....*....|....*....|.
gi 1092545404  143 ESDTD-KTDKSEADKDKPAKKDETKAEADKPATEAGKERAATENEKLAKRK 192
Cdd:PRK05035   638 EEPVDpRKAAVAAAIARAKARKAAQQQANAEPEEAEDPKKAAVAAAIARAK 688
CAF-1_p150 pfam11600
Chromatin assembly factor 1 complex p150 subunit, N-terminal; CAF-1_p150 is a polypeptide ...
107-198 8.47e-03

Chromatin assembly factor 1 complex p150 subunit, N-terminal; CAF-1_p150 is a polypeptide subunit of CAF-1, which functions in depositing newly synthesized and acetylated histones H3/H4 into chromatin during DNA replication and repair. CAF-1_p150 includes the HP1 interaction site, the PEST, KER and ED interacting sites. CAF-1_p150 interacts directly with newly synthesized and acetylated histones through the acidic KER and ED domains. The PEST domain is associated with proteins that undergo rapid proteolysis.


Pssm-ID: 402959 [Multi-domain]  Cd Length: 164  Bit Score: 38.52  E-value: 8.47e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404  107 DKEEVVTPTSVEKEAADKKAEEASPKKEEQKEANSKEsdtdKTDKSEADKDKPAKKDETKAEADKPATEAG----KERAA 182
Cdd:pfam11600   34 EKEEKERLKEEAKAEKERAKEEARRKKEEEKELKEKE----RREKKEKDEKEKAEKLRLKEEKRKEKQEALeaklEEKRK 109
                           90
                   ....*....|....*.
gi 1092545404  183 TENEKLAKRKIVSIDA 198
Cdd:pfam11600  110 KEEEKRLKEEEKRIKA 125
PspC_subgroup_1 NF033838
pneumococcal surface protein PspC, choline-binding form; The pneumococcal surface protein PspC, ...
57-217 9.00e-03

pneumococcal surface protein PspC, choline-binding form; The pneumococcal surface protein PspC, as described in Streptococcus pneumoniae, is a repetitive and highly variable protein, recognized by a conserved N-terminal domain and also by genomic location. This form, subgroup 1, has variable numbers of a choline-binding repeat in the C-terminal region, and is also known as choline-binding protein A. The other form, subgroup 2, is anchored covalently after cleavage by sortase at a C-terminal LPXTG site.


Pssm-ID: 468201 [Multi-domain]  Cd Length: 684  Bit Score: 40.38  E-value: 9.00e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404   57 SENRTEETPKAELQPETPKTVETETPSTDK------VASLPKTEEKTQEEV-SSTPSDKEEVVTPTSVEKEA----ADKK 125
Cdd:NF033838   233 AEEEAKRRADAKLKEAVEKNVATSEQDKPKrrakrgVLGEPATPDKKENDAkSSDSSVGEETLPSPSLKPEKkvaeAEKK 312
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1092545404  126 AEEASPKKEEQKEANSKE--SDTDKT---DKSEAD-KDKPAKKDETKAEADKPATEagkeraatENEKLAKRKIVSIDAG 199
Cdd:NF033838   313 VEEAKKKAKDQKEEDRRNypTNTYKTlelEIAESDvKVKEAELELVKEEAKEPRNE--------EKIKQAKAKVESKKAE 384
                          170
                   ....*....|....*...
gi 1092545404  200 RKYFspEQLKEIIDKAKE 217
Cdd:NF033838   385 ATRL--EKIKTDRKKAEE 400
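The pairwise rows in the hits above can be summarized by counting identical residues over the aligned columns. The following is a minimal sketch, not part of CD-Search itself. It assumes that `-` marks a gap and that lowercase letters mark residues not aligned to the domain model, so only columns with an uppercase residue in both rows are counted. The function name `alignment_identity` is hypothetical.

```python
from typing import Tuple


def alignment_identity(query: str, subject: str) -> Tuple[int, int]:
    """Return (identical, aligned) column counts for two aligned rows.

    Assumption: a column is "aligned" only when both rows carry an
    uppercase residue; gaps ('-') and lowercase residues are skipped.
    """
    if len(query) != len(subject):
        raise ValueError("aligned rows must have the same length")

    identical = 0
    aligned = 0
    for q, s in zip(query, subject):
        if q.isupper() and s.isupper():
            aligned += 1
            if q == s:
                identical += 1
    return identical, aligned


# Last block of the NF033838 hit (query 200-217 against model 385-400).
identical, aligned = alignment_identity("RKYFspEQLKEIIDKAKE",
                                        "ATRL--EKIKTDRKKAEE")
print(f"{identical}/{aligned} identical over aligned columns")
```

Counting only uppercase columns keeps insertions in the query from inflating the denominator. If lowercase residues should count as aligned, compare `q.upper()` and `s.upper()` instead.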
 
Blast search parameters
Data Source: Precalculated data, version = cdd.v.3.21
Preset Options: Database: CDSEARCH/cdd; Low complexity filter: no; Composition Based Adjustment: yes; E-value threshold: 0.01
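Each hit on this page carries a one-line summary (Pssm-ID, Cd Length, Bit Score, E-value), and the search reports hits against an E-value threshold of 0.01. The sketch below shows one way to parse such a summary line from the text output and check it against that threshold. The regular expression, the function names `parse_summary` and `passes_threshold`, and the use of `<=` for the comparison are assumptions made for illustration, not part of any NCBI tool.

```python
import re
from typing import Optional

# Pattern for the per-hit summary line as it appears in this text dump.
# The bracketed flag (e.g. "[Multi-domain]") is optional.
SUMMARY_RE = re.compile(
    r"Pssm-ID:\s*(?P<pssm_id>\d+)"
    r"(?:\s*\[(?P<flag>[^\]]+)\])?"
    r"\s+Cd Length:\s*(?P<cd_length>\d+)"
    r"\s+Bit Score:\s*(?P<bit_score>[\d.]+)"
    r"\s+E-value:\s*(?P<e_value>[\d.eE+-]+)"
)


def parse_summary(line: str) -> Optional[dict]:
    """Parse one summary line into a dict, or return None if it does not match."""
    m = SUMMARY_RE.search(line)
    if m is None:
        return None
    return {
        "pssm_id": int(m.group("pssm_id")),
        "multi_domain": m.group("flag") == "Multi-domain",
        "cd_length": int(m.group("cd_length")),
        "bit_score": float(m.group("bit_score")),
        "e_value": float(m.group("e_value")),
    }


def passes_threshold(hit: dict, threshold: float = 0.01) -> bool:
    """Assumed rule: keep a hit when its E-value does not exceed the threshold."""
    return hit["e_value"] <= threshold


line = ("Pssm-ID: 468202 [Multi-domain]  Cd Length: 557  "
        "Bit Score: 41.29  E-value: 4.66e-03")
hit = parse_summary(line)
print(hit, passes_threshold(hit))
```

A stricter cutoff can be applied by passing a smaller `threshold`, which would drop the weak hits listed last on this page.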

References:

  • Wang J et al. (2023), "The conserved domain database in 2023", Nucleic Acids Res. 51(D1):D384-D388.
  • Lu S et al. (2020), "The conserved domain database in 2020", Nucleic Acids Res. 48(D1):D265-D268.
  • Marchler-Bauer A et al. (2017), "CDD/SPARCLE: functional classification of proteins via subfamily domain architectures", Nucleic Acids Res. 45(D1):D200-D203.