Conserved domains on  [gi|1815986075|ref|WP_163132957|]

PKD domain-containing protein [Agarivorans sp. Alg241-V36]

Protein Classification

List of domain hits

Name Accession Description Interval E-value
GH38-57_N_LamB_YdjC_SF super family cl49606
Catalytic domain of glycoside hydrolase (GH) families 38 and 57, lactam utilization protein ...
77-441 0e+00

Catalytic domain of glycoside hydrolase (GH) families 38 and 57, lactam utilization protein LamB/YcsF family proteins, YdjC-family proteins, and similar proteins; The superfamily possesses strong sequence similarities across a wide range of all three kingdoms of life. It mainly includes four families, glycoside hydrolases family 38 (GH38), heat stable retaining glycoside hydrolases family 57 (GH57), lactam utilization protein LamB/YcsF family, and YdjC-family. The GH38 family corresponds to class II alpha-mannosidases (alphaMII, EC 3.2.1.24), which contain intermediate Golgi alpha-mannosidases II, acidic lysosomal alpha-mannosidases, animal sperm and epididymal alpha-mannosidases, neutral ER/cytosolic alpha-mannosidases, and some putative prokaryotic alpha-mannosidases. AlphaMII possesses alpha-1,3, alpha-1,6, and alpha-1,2 hydrolytic activity and catalyzes the degradation of N-linked oligosaccharides by employing a two-step mechanism involving the formation of a covalent glycosyl enzyme complex. GH57 is a purely prokaryotic family with the majority of thermostable enzymes from extremophiles (many of them are archaeal hyperthermophiles), which exhibit the enzyme specificities of alpha-amylase (EC 3.2.1.1), 4-alpha-glucanotransferase (EC 2.4.1.25), amylopullulanase (EC 3.2.1.1/41), and alpha-galactosidase (EC 3.2.1.22). This family also includes many hypothetical proteins with uncharacterized activity and specificity. GH57 cleaves alpha-glycosidic bonds by employing a retaining mechanism, which involves a glycosyl-enzyme intermediate, allowing transglycosylation. Although the exact molecular function of the LamB/YcsF family and the YdjC-family remains unclear, they show high sequence and structure homology to the members of GH38 and GH57. Their catalytic domains adopt a similar parallel 7-stranded beta/alpha barrel, which is remotely related to the catalytic NodB homology domain of the carbohydrate esterase 4 superfamily.


The actual alignment was detected with superfamily member cd11663:

Pssm-ID: 483946  Cd Length: 363  Bit Score: 587.69  E-value: 0e+00
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1815986075   77 SGAPMPHDDLVSYYQHHAKKGAYLSWPMDTARNNNGNHPQSQTHVTMSASVINNVQSFGELGNLDGY-NLGWGAYWRDTQ 155
Cdd:cd11663      1 SGAPMPHDDLVSYYSHHAKTGAYLTWPWSVAQTLRTNHPQAQMHVTMSGAVVNNVNDLSTLGNVSGYsNTNWGAPWKNAY 80
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1815986075  156 NGTKTSGGYNALDTIHFSGHHTMGPLVGNDYFLKDLIYQNVTLAQDYFLGDSFKSSKGFFPTELGFSERIIPVLTKLGIE 235
Cdd:cd11663     81 NTLKTPAGNRTLDLIHFTGHHSMGPLVGNDYMLKDLIYQNATLAQPYFLGSDFQSSKGFFPTELGFSERIIPVLEKLGIQ 160
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1815986075  236 WSVLGNVHYSRTLRDYPYLNDPGKDTLISPPNRADLQNESNVGAWTELHMFNEQQVTYNKFPFASIPHWVQYIDPETGEQ 315
Cdd:cd11663    161 WSVIGNNHFSRTLRDYPLLNDPGSDTMVSPPNRADLQNVSTVGSWVSQPMFNEQQVVRNKYPFASTPHWVRYVDPATGAE 240
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1815986075  316 HKVAGIPVEQASSWEEGYQGSITADVLKAFEGDAAalgRTQYFTIAHDGDNSSGRAGDGGTWANSGNVTYADSSVRGMGV 395
Cdd:cd11663    241 SRVVGVPVAQAESWEEGYQGQVTADALKPFEGLVP---QKQFFVIAHDGDNSSGRAGSEETWRNAGNVTYADSGVKGMGI 317
                          330       340       350       360
                   ....*....|....*....|....*....|....*....|....*.
gi 1815986075  396 DEYLKAYPIPADDIVHVQDGSWIDTRDSSADPTWYHWHIPMGVWRG 441
Cdd:cd11663    318 DEYLRTNTPAASDVVHVQDGSWIDTRDSSSDPQWHHWKLPFGIWKG 363
myxo_dep_M36 super family cl45606
myxosortase-dependent M36 family metallopeptidase; Members of this bacterial protein family ...
974-1320 1.86e-28

myxosortase-dependent M36 family metallopeptidase; Members of this bacterial protein family have an M36 family metallopeptidase domain, like fungalysin (see PF02128), and a C-terminal MYXO-CTERM domain (see TIGR03901), suggesting processing and surface-anchoring by the still-unknown putative transpeptidase, myxosortase. Members of this family include MXAN_3564 (mepA), part of the effector cargo of outer membrane vesicles that the species produces in large numbers during predation on other microbes.


The actual alignment was detected with superfamily member NF038112:

Pssm-ID: 468355 [Multi-domain]  Cd Length: 1597  Bit Score: 125.15  E-value: 1.86e-28
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1815986075  974 NNAPVAVISPaSQSVAKGTVVTLSGAGSSDSDGSIASYSWS---------TGESTESIS-----VTVNDTQTISLTVTDN 1039
Cdd:NF038112  1185 NRRPVANAGP-DQTVLERTTVTLNGSGSFDPDGDPLTYAWTqvsgpavtlTGADTATPSftapeVTADTVLTFQLVVSDG 1263
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1815986075 1040 QGKTATSSVTLTVI-PNKVPVASISPAnQTVAAGTTVTLDGAGsSDEDGSIASYLWS---------TGATTSSIS----- 1104
Cdd:NF038112  1264 TKTSAPDTVTVLVRnVNRAPVAVAGAP-ATVDERSTVTLDGSG-TDADGDALTYAWTqtsgpavtlTGATTATATftape 1341
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1815986075 1105 VVVNASQTISLTVTDNegaAATAEAVLSVesdeKAKNFNQlyfRGTANGWATTAMDLVADNTWQAV-IDFDGQAeqrfkl 1183
Cdd:NF038112  1342 VTADTQLTFTLTVSDG---TASATDTVTV----TVRNVNR---APVANAGADQTVDERSTVTLSGSaTDPDGDA------ 1405
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1815986075 1184 dVSGDWTQNYGD----TNSD-GVLEQTGGDIFTDVVGSYLLEVND-----QTLAYSITELNANQAPNAIIGASTTqVDIG 1253
Cdd:NF038112  1406 -LTYAWTQTAGPtvtlTGADtATASFTAPEVAADTELTFQLTVSAdgqasADVTVTVTVRNVNRAPVAHAGESIT-VDEG 1483
                          330       340       350       360       370       380       390
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....
gi 1815986075 1254 QSITYSAAGsSDSDGVIASYLWSNGD------TSETTTVTYNTAGSNSIG------LTVTDDGGKTAQASVVVEVIDPN 1320
Cdd:NF038112  1484 STVTLDASA-TDPDGDTLTYAWTQVAgpsvtlTGADSAKLTFTAPEVSADttltfsLTVTDGSGSSGPVVVTVTVKNVN 1561
CBM_21 super family cl23798
Carbohydrate/starch-binding module (family 21); This family consists of several eukaryotic ...
876-949 3.05e-14

Carbohydrate/starch-binding module (family 21); This family consists of several eukaryotic proteins that are thought to be involved in the regulation of glycogen metabolism. For instance, the mouse PTG protein has been shown to interact with glycogen synthase, phosphorylase kinase, and phosphorylase a; these three enzymes have key roles in the regulation of glycogen metabolism. PTG also binds the catalytic subunit of protein phosphatase 1 (PP1C) and localizes it to glycogen. Subsets of similar interactions have been observed with several other members of this family, such as the yeast PIG1, PIG2, GAC1 and GIP2 proteins. While the precise function of these proteins is not known, they may serve a scaffold function, bringing together the key enzymes in glycogen metabolism. This family is a carbohydrate binding domain.


The actual alignment was detected with superfamily member smart01066:

Pssm-ID: 474061 [Multi-domain]  Cd Length: 83  Bit Score: 69.31  E-value: 3.05e-14
                            10        20        30        40        50        60        70
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*..
gi 1815986075   876 TSTLYYQ---DTNGWGQVCLHYSVDGTvTWTTAPGEPMQSLGDNWYSLTVDLEDGNQLEFVTNNCSGAWDNNGGQNY 949
Cdd:smart01066    3 TVTVYYNgllATSGAKNVYLHYGFGEN-NWTDVPDVRMEKTGEGWVKATIPVKEAYKLNFCFKDGAGNWDNNGGANY 78
CBM_SusE-F_like super family cl28889
carbohydrate-binding modules from Bacteroides thetaiotaomicron SusE, SusF and similar proteins; ...
1348-1416 4.84e-05

carbohydrate-binding modules from Bacteroides thetaiotaomicron SusE, SusF and similar proteins; This group includes five starch-specific CBMs (carbohydrate-binding modules) of SusE and SusF, two cell surface lipoproteins within the Sus (starch-utilization system) of the human gut symbiont Bacteroides thetaiotaomicron. These CBMs have no enzymatic activity. The precise mechanistic roles of SusE and SusF in starch metabolism are unclear. Both proteins contain an N-terminal domain which may belong to the immunoglobulin superfamily (IgSF), followed by two or three tandem starch-binding CBMs. SusF has three CBMs (CBM-Fa, -Fb, and -Fc; F denotes SusF, and they are labeled alphabetically from the N- to C- terminus). SusE has two CBMs (CBM-Eb and -Ec, corresponding to CBM-Fb and -Fc). Each starch-binding site contains an arc of aromatic amino acids for hydrophobic stacking with glucose, and hydrogen-bonding acceptors and donors for interacting with the O-2 and O-3 of glucose. These five CBMs differ in their affinity for various starch oligosaccharides, and they also contribute differently to binding insoluble starch. CBM-Fa (the CBM unique to SusF) does not bind insoluble starch; CBM-Fb and CBM-Fc both do, and deletion of either one decreases the overall affinity of SusF for starch. Both CBM-Eb and CBM-Ec are needed for SusE to bind tightly to starch. CBM-Ec has an additional starch-binding loop that may mediate interactions with partially unwound single helical forms of starch or small starch-breakdown products. Proteins in this group are present in the species of the Gram-negative Bacteroidetes phylum.


The actual alignment was detected with superfamily member cd12956:

Pssm-ID: 475118  Cd Length: 93  Bit Score: 43.50  E-value: 4.84e-05
                           10        20        30        40        50        60        70
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1815986075 1348 ANNTWQVSVDFDGqnNQRFKFDLNGDWSTNYGDNNNDGT-LEQTGGDIFTGIVGSYVVEVNDATLQYRII 1416
Cdd:cd12956     26 TDGTFVSYATLAG--DGEIKFRPNNDWGENYGDDGDDGTfLSSGGDNIAVSAGGTYKITLNLNNNTYTIE 93
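
Each hit above carries a one-line summary of the form "Pssm-ID: ...  Cd Length: ...  Bit Score: ...  E-value: ...". The following is a minimal sketch of how such a line could be read programmatically, assuming the layout is exactly as printed in this report. The names SUMMARY_RE and parse_summary are hypothetical helpers introduced for illustration and are not part of any NCBI tool.

```python
import re

# Matches the per-hit summary line as it appears in this report, e.g.
# "Pssm-ID: 468355 [Multi-domain]  Cd Length: 1597  Bit Score: 125.15  E-value: 1.86e-28"
# The "[Multi-domain]" flag is optional. (Hypothetical helper, not an NCBI API.)
SUMMARY_RE = re.compile(
    r"Pssm-ID:\s*(?P<pssm_id>\d+)"
    r"(?:\s*\[(?P<flag>[^\]]+)\])?"
    r"\s+Cd Length:\s*(?P<cd_length>\d+)"
    r"\s+Bit Score:\s*(?P<bit_score>[\d.]+)"
    r"\s+E-value:\s*(?P<e_value>\S+)"
)


def parse_summary(line):
    """Return the fields of one summary line as a dict, or None if no match."""
    m = SUMMARY_RE.search(line)
    if m is None:
        return None
    return {
        "pssm_id": int(m.group("pssm_id")),
        "multi_domain": m.group("flag") == "Multi-domain",
        "cd_length": int(m.group("cd_length")),
        "bit_score": float(m.group("bit_score")),
        # "0e+00" in the report parses to 0.0
        "e_value": float(m.group("e_value")),
    }


if __name__ == "__main__":
    print(parse_summary(
        "Pssm-ID: 468355 [Multi-domain]  Cd Length: 1597  Bit Score: 125.15  E-value: 1.86e-28"
    ))
```

Lines that do not follow this layout return None, so alignment rows and description text can be passed through the same function and skipped.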
 
Name Accession Description Interval E-value
GH119_BcIgtZ-like cd11663
putative catalytic domain of glycoside hydrolase family 119 (GH119); The prokaryotic subgroup ...
77-441 0e+00

putative catalytic domain of glycoside hydrolase family 119 (GH119); The prokaryotic subgroup is represented by IgtZ, an alpha-amylase from a Bacillus circulans strain. The GH119 family is related to GH57, a chiefly prokaryotic family with the majority of thermostable enzymes coming from extremophiles (many of these are archaeal hyperthermophiles), which exhibit the enzyme specificities of alpha-amylase (EC 3.2.1.1), 4-alpha-glucanotransferase (EC 2.4.1.25), amylopullulanase (EC 3.2.1.1/41), and alpha-galactosidase (EC 3.2.1.22). GH57s cleave alpha-glycosidic bonds by employing a retaining mechanism, which involves a glycosyl-enzyme intermediate, allowing transglycosylation.


Pssm-ID: 212128  Cd Length: 363  Bit Score: 587.69  E-value: 0e+00
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1815986075   77 SGAPMPHDDLVSYYQHHAKKGAYLSWPMDTARNNNGNHPQSQTHVTMSASVINNVQSFGELGNLDGY-NLGWGAYWRDTQ 155
Cdd:cd11663      1 SGAPMPHDDLVSYYSHHAKTGAYLTWPWSVAQTLRTNHPQAQMHVTMSGAVVNNVNDLSTLGNVSGYsNTNWGAPWKNAY 80
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1815986075  156 NGTKTSGGYNALDTIHFSGHHTMGPLVGNDYFLKDLIYQNVTLAQDYFLGDSFKSSKGFFPTELGFSERIIPVLTKLGIE 235
Cdd:cd11663     81 NTLKTPAGNRTLDLIHFTGHHSMGPLVGNDYMLKDLIYQNATLAQPYFLGSDFQSSKGFFPTELGFSERIIPVLEKLGIQ 160
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1815986075  236 WSVLGNVHYSRTLRDYPYLNDPGKDTLISPPNRADLQNESNVGAWTELHMFNEQQVTYNKFPFASIPHWVQYIDPETGEQ 315
Cdd:cd11663    161 WSVIGNNHFSRTLRDYPLLNDPGSDTMVSPPNRADLQNVSTVGSWVSQPMFNEQQVVRNKYPFASTPHWVRYVDPATGAE 240
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1815986075  316 HKVAGIPVEQASSWEEGYQGSITADVLKAFEGDAAalgRTQYFTIAHDGDNSSGRAGDGGTWANSGNVTYADSSVRGMGV 395
Cdd:cd11663    241 SRVVGVPVAQAESWEEGYQGQVTADALKPFEGLVP---QKQFFVIAHDGDNSSGRAGSEETWRNAGNVTYADSGVKGMGI 317
                          330       340       350       360
                   ....*....|....*....|....*....|....*....|....*.
gi 1815986075  396 DEYLKAYPIPADDIVHVQDGSWIDTRDSSADPTWYHWHIPMGVWRG 441
Cdd:cd11663    318 DEYLRTNTPAASDVVHVQDGSWIDTRDSSSDPQWHHWKLPFGIWKG 363
myxo_dep_M36 NF038112
myxosortase-dependent M36 family metallopeptidase; Members of this bacterial protein family ...
974-1320 1.86e-28

myxosortase-dependent M36 family metallopeptidase; Members of this bacterial protein family have an M36 family metallopeptidase domain, like fungalysin (see PF02128), and a C-terminal MYXO-CTERM domain (see TIGR03901), suggesting processing and surface-anchoring by the still-unknown putative transpeptidase, myxosortase. Members of this family include MXAN_3564 (mepA), part of the effector cargo of outer membrane vesicles that the species produces in large numbers during predation on other microbes.


Pssm-ID: 468355 [Multi-domain]  Cd Length: 1597  Bit Score: 125.15  E-value: 1.86e-28
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1815986075  974 NNAPVAVISPaSQSVAKGTVVTLSGAGSSDSDGSIASYSWS---------TGESTESIS-----VTVNDTQTISLTVTDN 1039
Cdd:NF038112  1185 NRRPVANAGP-DQTVLERTTVTLNGSGSFDPDGDPLTYAWTqvsgpavtlTGADTATPSftapeVTADTVLTFQLVVSDG 1263
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1815986075 1040 QGKTATSSVTLTVI-PNKVPVASISPAnQTVAAGTTVTLDGAGsSDEDGSIASYLWS---------TGATTSSIS----- 1104
Cdd:NF038112  1264 TKTSAPDTVTVLVRnVNRAPVAVAGAP-ATVDERSTVTLDGSG-TDADGDALTYAWTqtsgpavtlTGATTATATftape 1341
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1815986075 1105 VVVNASQTISLTVTDNegaAATAEAVLSVesdeKAKNFNQlyfRGTANGWATTAMDLVADNTWQAV-IDFDGQAeqrfkl 1183
Cdd:NF038112  1342 VTADTQLTFTLTVSDG---TASATDTVTV----TVRNVNR---APVANAGADQTVDERSTVTLSGSaTDPDGDA------ 1405
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1815986075 1184 dVSGDWTQNYGD----TNSD-GVLEQTGGDIFTDVVGSYLLEVND-----QTLAYSITELNANQAPNAIIGASTTqVDIG 1253
Cdd:NF038112  1406 -LTYAWTQTAGPtvtlTGADtATASFTAPEVAADTELTFQLTVSAdgqasADVTVTVTVRNVNRAPVAHAGESIT-VDEG 1483
                          330       340       350       360       370       380       390
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....
gi 1815986075 1254 QSITYSAAGsSDSDGVIASYLWSNGD------TSETTTVTYNTAGSNSIG------LTVTDDGGKTAQASVVVEVIDPN 1320
Cdd:NF038112  1484 STVTLDASA-TDPDGDTLTYAWTQVAgpsvtlTGADSAKLTFTAPEVSADttltfsLTVTDGSGSSGPVVVTVTVKNVN 1561
CBM_25 smart01066
Carbohydrate binding domain;
876-949 3.05e-14

Carbohydrate binding domain;


Pssm-ID: 198134 [Multi-domain]  Cd Length: 83  Bit Score: 69.31  E-value: 3.05e-14
                            10        20        30        40        50        60        70
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*..
gi 1815986075   876 TSTLYYQ---DTNGWGQVCLHYSVDGTvTWTTAPGEPMQSLGDNWYSLTVDLEDGNQLEFVTNNCSGAWDNNGGQNY 949
Cdd:smart01066    3 TVTVYYNgllATSGAKNVYLHYGFGEN-NWTDVPDVRMEKTGEGWVKATIPVKEAYKLNFCFKDGAGNWDNNGGANY 78
myxo_dep_M36 NF038112
myxosortase-dependent M36 family metallopeptidase; Members of this bacterial protein family ...
785-1057 5.34e-10

myxosortase-dependent M36 family metallopeptidase; Members of this bacterial protein family have an M36 family metallopeptidase domain, like fungalysin (see PF02128), and a C-terminal MYXO-CTERM domain (see TIGR03901), suggesting processing and surface-anchoring by the still-unknown putative transpeptidase, myxosortase. Members of this family include MXAN_3564 (mepA), part of the effector cargo of outer membrane vesicles that the species produces in large numbers during predation on other microbes.


Pssm-ID: 468355 [Multi-domain]  Cd Length: 1597  Bit Score: 64.29  E-value: 5.34e-10
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1815986075  785 TDNVVVG----DKRPTALIDASGgEVELGTVVTLSAAGSsDEEGPIASYLWS---------TGETSPSIT-----VTLNE 846
Cdd:NF038112  1269 PDTVTVLvrnvNRAPVAVAGAPA-TVDERSTVTLDGSGT-DADGDALTYAWTqtsgpavtlTGATTATATftapeVTADT 1346
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1815986075  847 RTTISVVVTD-SVGQQASTSVTYRIIGQS-VTSTLYYQDTNGWGQVCLHYS---VDG---TVTWTTAPGEPMQSLGDNWY 918
Cdd:NF038112  1347 QLTFTLTVSDgTASATDTVTVTVRNVNRApVANAGADQTVDERSTVTLSGSatdPDGdalTYAWTQTAGPTVTLTGADTA 1426
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1815986075  919 SLTVDLEDGNQLEFVTNNcsgAWDNNGGQNyqidegdwnvagGAIVAGIPDGLDGNNAPVAViSPASQSVAKGTVVTLSG 998
Cdd:NF038112  1427 TASFTAPEVAADTELTFQ---LTVSADGQA------------SADVTVTVTVRNVNRAPVAH-AGESITVDEGSTVTLDA 1490
                          250       260       270       280       290       300       310
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|...
gi 1815986075  999 AGSsDSDGSIASYSWS---------TGESTESISVTVNDTQ-----TISLTVTDNQGKTATSSVTLTVIPNKV 1057
Cdd:NF038112  1491 SAT-DPDGDTLTYAWTqvagpsvtlTGADSAKLTFTAPEVSadttlTFSLTVTDGSGSSGPVVVTVTVKNVNR 1562
PKD_4 pfam18911
PKD domain; This entry is composed of PKD domains found in bacterial surface proteins.
1236-1316 6.69e-10

PKD domain; This entry is composed of PKD domains found in bacterial surface proteins.


Pssm-ID: 436824 [Multi-domain]  Cd Length: 85  Bit Score: 56.90  E-value: 6.69e-10
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1815986075 1236 NQAPNAIIGASTTqVDIGQSITYSAAGSSDSDGVIASYLWSNGD----TSETTTVTYNTAGSNSIGLTVTDD-GGKTAQA 1310
Cdd:pfam18911    1 NAAPVADAGGDRI-VAEGETVTFDASASDDPDGDILSYRWDFGDgttaTGANVSHTYAAPGTYTVTLTVTDDsGASNSTA 79

                   ....*.
gi 1815986075 1311 SVVVEV 1316
Cdd:pfam18911   80 TDTVTV 85
CBM53 pfam16760
Starch/carbohydrate-binding module (family 53);
878-952 6.74e-09

Starch/carbohydrate-binding module (family 53);


Pssm-ID: 465261 [Multi-domain]  Cd Length: 76  Bit Score: 53.84  E-value: 6.74e-09
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1815986075  878 TLYYqDTNGWGQVCLHYSVDGtvtWTTAPGEPMQSL----GDNWYSLTVDL-EDGNQLEFVTNNCSGAWDNNGGQNYQID 952
Cdd:pfam16760    1 NIYY-NGSLAKEVYIHGGFNG---WKNVQDVPMEKLpptgGGDWFSATVPVpEDAYVLDFVFKDGAGNWDNNNGQNYHIP 76
COG4625 COG4625
Uncharacterized conserved protein, contains a C-terminal beta-barrel porin domain [Function ...
806-1325 3.17e-07

Uncharacterized conserved protein, contains a C-terminal beta-barrel porin domain [Function unknown];


Pssm-ID: 443664 [Multi-domain]  Cd Length: 900  Bit Score: 55.17  E-value: 3.17e-07
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1815986075  806 VELGTVVTLSAAGSSDEEGPIASYLWSTGETSPSITVTLNERTTISVVVTDSVGQQASTSVTYRIIGQSVTSTLYYQDTN 885
Cdd:COG4625      1 GGGGGGGGGGGGGGGGTGGGGAGGGGGAGGGAGGGGAGGGGGGGGGGGGAGGGGGGGGTGGGGGGGGGGGGGGAGGGGGG 80
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1815986075  886 GwgqvclHYSVDGTVTWTTAPGEPMQSLGDNWYSLTVDLEDGNQLEFVTNNCSGAWDNNGGQNYQIDEGDWNVAGGAIVA 965
Cdd:COG4625     81 G------GGGGGGGGTGGVGGGGGGGGGGGGGGGGGGGGGGGGSAGGGGGGAGGAGGGGGGGAGGGGGGGGGGGAGGGGG 154
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1815986075  966 GIPDGLDGNNAPVAVISPASQSVAKGTVVTLSGAGSSDSDGSIASYSWSTGESTESISVTVNDTQTISLTVTDNQGKTAT 1045
Cdd:COG4625    155 GGAGGAGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGNGGGGGGGGGGGGGGGGGGGGAGGGGGGGGGGGGGGGGGGGG 234
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1815986075 1046 SSVTLTVIPNKVPVASISPANQTVAAGTTVTLDGAGSSDEDGSIASYLWSTGATTSSISVVVNASQTISLTVTDNEGAAA 1125
Cdd:COG4625    235 GGGGGGGGGGGGGAGGGGGGGGGNGGGGGAGGGGGGGGGGSGGGGGGGGGGGSGGGGGGGGGGGGGGGGGGGGGGGGGGG 314
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1815986075 1126 TAEAVLSVESDEKAKNFNQLYFRGTANGWATTAMDLVADNTWQAVIDFDGQAEQRFKLDVSGDWTQNYGDTNSDGVLEQT 1205
Cdd:COG4625    315 GGGGGGGGGGGGGGGGGGGAGGGGGSGGAGAGGGGAGGGGAGGGGGGGTGGGGGGGGGGGGGSGGGGAGGGGGSGGGGGG 394
                          410       420       430       440       450       460       470       480
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1815986075 1206 GGDIFTDVVGSYLLEVNDQTLAYSITELNANQAPNAIIGASTTQVDIGQSITYSAAGSSDSDGVIASYLWSNGDTSETTT 1285
Cdd:COG4625    395 GAGGGGGGGGAGGTGGGGAGGGGGAAGGGGGGTGAGGGGGGGGTGAGGGGATGGGGGGGGGAGGSGGGAGAGGGSGSGAG 474
                          490       500       510       520
                   ....*....|....*....|....*....|....*....|
gi 1815986075 1286 VTYNTAGSNSIGLTVTDDGGKTAQASVVVEVIDPNANFTS 1325
Cdd:COG4625    475 TLTLTGNNTYTGTTTVNGGGNYTQSAGSTLAVEVDAANSD 514
CBM_SusE-F_like_u1 cd12967
Uncharacterized subgroup of the CBM-SusE-F_like superfamily; The CBM SusE-F_like superfamily ...
1145-1231 6.91e-07

Uncharacterized subgroup of the CBM-SusE-F_like superfamily; The CBM SusE-F_like superfamily includes starch-specific CBMs (carbohydrate-binding modules) of SusE and SusF, two cell surface lipoproteins within the Sus (starch-utilization system) of the human gut symbiont Bacteroides thetaiotaomicron. These CBMs have no enzymatic activity. The precise mechanistic roles of SusE and SusF in starch metabolism are unclear. Both proteins have an N-terminal domain which may belong to the immunoglobulin superfamily (IgSF), followed by two or three tandem starch-binding CBMs. SusF has three CBMs (CBM-Fa, -Fb, and -Fc; F denotes SusF, and they are labeled alphabetically from the N- to C- terminus). SusE has two CBMs (CBM-Eb and -Ec, corresponding to CBM-Fb and -Fc). Each starch-binding site contains an arc of aromatic amino acids for hydrophobic stacking with glucose, and hydrogen-bonding acceptors and donors for interacting with the O-2 and O-3 of glucose. These five CBMs differ in their affinity for various starch oligosaccharides, and they also contribute differently to binding insoluble starch. Proteins in this group are present in the species of the Gram-negative Bacteroidetes phylum.


Pssm-ID: 240566  Cd Length: 91  Bit Score: 48.44  E-value: 6.91e-07
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1815986075 1145 LYFRGTANGWATTAMDLVA----DNTWQAVIDFDGQAEqrFKLDVSGDWTQ-NYGDTNSDGVLEQTGGDIFTDVVGSYLL 1219
Cdd:cd12967      2 LYVPGNYQGWNPATAPALYspngDGKYEGYVYLPGNFE--FKFTTAPNWDGdYGGDGGGGGLLDGGGGNIKAPEAGYYKV 79
                           90
                   ....*....|..
gi 1815986075 1220 EVNDQTLAYSIT 1231
Cdd:cd12967     80 TVDLNDLTYSLT 91
ser_rich_anae_1 NF033849
serine-rich protein; This serine-rich protein belongs to a family with large size (over 1000 ...
831-1115 2.66e-06

serine-rich protein; This serine-rich protein belongs to a family of large proteins (over 1000 amino acids) with a highly serine-rich central region that averages over 300 aa in length. Species encoding members of this family of proteins tend to be anaerobic bacteria, including Gram-positive bacteria of the human gut microbiome and Chloroflexi from marine sediments.


Pssm-ID: 468206 [Multi-domain]  Cd Length: 1122  Bit Score: 52.31  E-value: 2.66e-06
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1815986075  831 WSTGEtSPSITVTLNERTTISVVVTDSVGQQASTSVTYriiGQSVTSTLYYQDTNGWGQVCLHySVDGTVTWTTAPGEPM 910
Cdd:NF033849   285 WSHTQ-STSESESTGQSSSVGTSESQSHGTTEGTSTTD---SSSHSQSSSYNVSSGTGVSSSH-SDGTSQSTSISHSESS 359
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1815986075  911 QSLGDNWYSLTVDLEDGNQLEFVTNNCSGawdNNGGqnyqidegdwnvAGGAIVAGipdGLDGNNAPVAVisPASQSVAK 990
Cdd:NF033849   360 SESTGTSVGHSTSSSVSSSESSSRSSSSG---VSGG------------FSGGIAGG---GVTSEGLGASQ--GGSEGWGS 419
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1815986075  991 GTVVTLSGAGSSDSDGSIASYSWSTGeSTESISVTVNDTQTISLTVTDNQGKTATSSVTltvipNKVPVASISPANQTVA 1070
Cdd:NF033849   420 GDSVQSVSQSYGSSSSTGTSSGHSDS-SSHSTSSGQADSVSQGTSWSEGTGTSQGQSVG-----TSESWSTSQSETDSVG 493
                          250       260       270       280
                   ....*....|....*....|....*....|....*....|....*....
gi 1815986075 1071 --AGTTVTL-DGAGSSDEDGSIASYLWST-GATTSSISVVVNASQTISL 1115
Cdd:NF033849   494 dsTGTSESVsQGDGRSTGRSESQGTSLGTsGGRTSGAGGSMGLGPSISL 542
PKD smart00089
Repeats in polycystic kidney disease 1 (PKD1) and other proteins; Polycystic kidney disease 1 ...
1240-1315 1.27e-05

Repeats in polycystic kidney disease 1 (PKD1) and other proteins; The polycystic kidney disease 1 protein contains 14 of these repeats, which are also present elsewhere, such as in microbial collagenases.


Pssm-ID: 214510 [Multi-domain]  Cd Length: 79  Bit Score: 44.75  E-value: 1.27e-05
                            10        20        30        40        50        60        70        80
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1815986075  1240 NAIIGASTTQVDIGQSITYSAagSSDSDGVIASYLWSNGD----TSETTTVTYNTAGSNSIGLTVTDDGGK-TAQASVVV 1314
Cdd:smart00089    1 VADVSASPTVGVAGESVTFTA--TSSDDGSIVSYTWDFGDgtssTGPTVTHTYTKPGTYTVTLTVTNAVGSaSATVTVVV 78

                    .
gi 1815986075  1315 E 1315
Cdd:smart00089   79 Q 79
CBM_SusE-F_like cd12956
carbohydrate-binding modules from Bacteroides thetaiotaomicron SusE, SusF and similar proteins; ...
1348-1416 4.84e-05

carbohydrate-binding modules from Bacteroides thetaiotaomicron SusE, SusF and similar proteins; This group includes five starch-specific CBMs (carbohydrate-binding modules) of SusE and SusF, two cell surface lipoproteins within the Sus (starch-utilization system) of the human gut symbiont Bacteroides thetaiotaomicron. These CBMs have no enzymatic activity. The precise mechanistic roles of SusE and SusF in starch metabolism are unclear. Both proteins contain an N-terminal domain which may belong to the immunoglobulin superfamily (IgSF), followed by two or three tandem starch-binding CBMs. SusF has three CBMs (CBM-Fa, -Fb, and -Fc; F denotes SusF, and they are labeled alphabetically from the N- to C- terminus). SusE has two CBMs (CBM-Eb and -Ec, corresponding to CBM-Fb and -Fc). Each starch-binding site contains an arc of aromatic amino acids for hydrophobic stacking with glucose, and hydrogen-bonding acceptors and donors for interacting with the O-2 and O-3 of glucose. These five CBMs differ in their affinity for various starch oligosaccharides, and they also contribute differently to binding insoluble starch. CBM-Fa (the CBM unique to SusF) does not bind insoluble starch; CBM-Fb and CBM-Fc both do, and deletion of either one decreases the overall affinity of SusF for starch. Both CBM-Eb and CBM-Ec are needed for SusE to bind tightly to starch. CBM-Ec has an additional starch-binding loop that may mediate interactions with partially unwound single helical forms of starch or small starch-breakdown products. Proteins in this group are present in the species of the Gram-negative Bacteroidetes phylum.


Pssm-ID: 240562  Cd Length: 93  Bit Score: 43.50  E-value: 4.84e-05
                           10        20        30        40        50        60        70
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1815986075 1348 ANNTWQVSVDFDGqnNQRFKFDLNGDWSTNYGDNNNDGT-LEQTGGDIFTGIVGSYVVEVNDATLQYRII 1416
Cdd:cd12956     26 TDGTFVSYATLAG--DGEIKFRPNNDWGENYGDDGDDGTfLSSGGDNIAVSAGGTYKITLNLNNNTYTIE 93
tand_rpt_95 NF012211
tandem-95 repeat; This 95-amino acid repeat occurs in tandem in proteins that may be several ...
987-1060 2.04e-04

tandem-95 repeat; This 95-amino acid repeat occurs in tandem in proteins that may be several thousand amino acids long.


Pssm-ID: 333740 [Multi-domain]  Cd Length: 98  Bit Score: 41.85  E-value: 2.04e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1815986075  987 SVAKGTVVTLSG----AGSSDSDGSIASYSWSTGESTESISVTVNDTQTISLT--------------VTDNQGKTATSSV 1048
Cdd:NF012211     1 STDEDGSITITQeqllANASDVDGDDLTVSNVSYSGPTNGTVTDNGDGTYTYTpnenfngddsftytVSDGTGATATATV 80
                           90
                   ....*....|...
gi 1815986075 1049 TLTVIP-NKVPVA 1060
Cdd:NF012211    81 FVTVNPvNDAPVA 93
PHA03255 PHA03255
BDLF3; Provisional
992-1138 5.43e-04

BDLF3; Provisional


Pssm-ID: 165513 [Multi-domain]  Cd Length: 234  Bit Score: 43.35  E-value: 5.43e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1815986075  992 TVVTLSGAGSSDSDGSIASYSWSTGESTESISVTVNDTQTISLTVTDNQGKTATSSVTLT---VIPNKVPVASISPANQT 1068
Cdd:PHA03255    25 TSSGSSTASAGNVTGTTAVTTPSPSASGPSTNQSTTLTTTSAPITTTAILSTNTTTVTSTgttVTPVPTTSNASTINVTT 104
                           90       100       110       120       130       140       150
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|..
gi 1815986075 1069 VAAGTTVTLDGAGSSDEDGSIASYLWSTGATTSSISVVVNAsqTISLTVTDNEGAAATAE--AVLSVESDEK 1138
Cdd:PHA03255   105 KVTAQNITATEAGTGTSTGVTSNVTTRSSSTTSATTRITNA--TTLAPTLSSKGTSNATKttAELPTVPDER 174
COG4625 COG4625
Uncharacterized conserved protein, contains a C-terminal beta-barrel porin domain [Function ...
758-1120 7.51e-04

Uncharacterized conserved protein, contains a C-terminal beta-barrel porin domain [Function unknown];


Pssm-ID: 443664 [Multi-domain]  Cd Length: 900  Bit Score: 44.00  E-value: 7.51e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1815986075  758 GAGQFSSVGGKTVEDVNGAIAGEAMFFTDNVVVGDKRPTALIDASGGEVELGTVVTLSAAGSSDEEGPIASYLWSTGETS 837
Cdd:COG4625    163 GGGGGGGGGGGGGGGGGGGGGGGGGGGGGGNGGGGGGGGGGGGGGGGGGGGAGGGGGGGGGGGGGGGGGGGGGGGGGGGG 242
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1815986075  838 PSITVTLNERTTISVVVTDSVGQQASTSVTYRIIGQSVTSTLYYQDTNGWGQVCLHYSVDGTVTWTTAPGepmqSLGDNW 917
Cdd:COG4625    243 GGGGGAGGGGGGGGGNGGGGGAGGGGGGGGGGSGGGGGGGGGGGSGGGGGGGGGGGGGGGGGGGGGGGGG----GGGGGG 318
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1815986075  918 YSLTVDLEDGNQLEFVTNNCSGAWDNNGGQNYQIDEGDWNVAGGAIVAGIPDGLDGNNAPVAVISPASQSVAKGTVVTLS 997
Cdd:COG4625    319 GGGGGGGGGGGGGGGAGGGGGSGGAGAGGGGAGGGGAGGGGGGGTGGGGGGGGGGGGGSGGGGAGGGGGSGGGGGGGAGG 398
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1815986075  998 GAGSSDSDGSIASYSWSTGESTESISVTVNDTQTISLTVTDNQGKTATSSVTLTVIPNKVPVASISPANQTVAAGTTVTL 1077
Cdd:COG4625    399 GGGGGGAGGTGGGGAGGGGGAAGGGGGGTGAGGGGGGGGTGAGGGGATGGGGGGGGGAGGSGGGAGAGGGSGSGAGTLTL 478
                          330       340       350       360
                   ....*....|....*....|....*....|....*....|...
gi 1815986075 1078 DGAGSSDEDGSIASYLWSTGATTSSISVVVNASQTISLTVTDN 1120
Cdd:COG4625    479 TGNNTYTGTTTVNGGGNYTQSAGSTLAVEVDAANSDRLVVTGT 521
ser_rich_anae_1 NF033849
serine-rich protein; This serine-rich protein belongs to a family with large size (over 1000 ...
985-1207 6.41e-03

serine-rich protein; This serine-rich protein belongs to a family of large proteins (over 1000 amino acids) with a highly serine-rich central region that averages over 300 aa in length. Species encoding members of this family of proteins tend to be anaerobic bacteria, including Gram-positive bacteria of the human gut microbiome and Chloroflexi from marine sediments.


Pssm-ID: 468206 [Multi-domain]  Cd Length: 1122  Bit Score: 41.14  E-value: 6.41e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1815986075  985 SQSVAKGTVVTLS---GAGSSDSDGSIASYSWSTGESTeSISVTVNDTQTISLTVTDNQGKTATSSVTLTV-IPNKVPVA 1060
Cdd:NF033849   322 SSSHSQSSSYNVSsgtGVSSSHSDGTSQSTSISHSESS-SESTGTSVGHSTSSSVSSSESSSRSSSSGVSGgFSGGIAGG 400
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1815986075 1061 SISPANQTVAAGTTVtldGAGSSDEDGSIASYLWSTGATTSSISVVVNASQTISLTVTDNEGAAATAEAVLSVESDEKAk 1140
Cdd:NF033849   401 GVTSEGLGASQGGSE---GWGSGDSVQSVSQSYGSSSSTGTSSGHSDSSSHSTSSGQADSVSQGTSWSEGTGTSQGQSV- 476
                          170       180       190       200       210       220       230
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1815986075 1141 nfnqlyfrGTANGWATTAM--DLVADNTwqavidfdGQAEQrfkldVSGDWTQNYGDTNSDGV-LEQTGG 1207
Cdd:NF033849   477 --------GTSESWSTSQSetDSVGDST--------GTSES-----VSQGDGRSTGRSESQGTsLGTSGG 525
 
Name Accession Description Interval E-value
GH119_BcIgtZ-like cd11663
putative catalytic domain of glycoside hydrolase family 119 (GH119); The prokaryotic subgroup ...
77-441 0e+00

putative catalytic domain of glycoside hydrolase family 119 (GH119); The prokaryotic subgroup is represented by IgtZ, an alpha-amylase from a Bacillus circulans strain. The GH119 family is related to GH57, a chiefly prokaryotic family with the majority of thermostable enzymes coming from extremophiles (many of these are archaeal hyperthermophiles), which exhibit the enzyme specificities of alpha-amylase (EC 3.2.1.1), 4-alpha-glucanotransferase (EC 2.4.1.25), amylopullulanase (EC 3.2.1.1/41), and alpha-galactosidase (EC 3.2.1.22). GH57s cleave alpha-glycosidic bonds by employing a retaining mechanism, which involves a glycosyl-enzyme intermediate, allowing transglycosylation.


Pssm-ID: 212128  Cd Length: 363  Bit Score: 587.69  E-value: 0e+00
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1815986075   77 SGAPMPHDDLVSYYQHHAKKGAYLSWPMDTARNNNGNHPQSQTHVTMSASVINNVQSFGELGNLDGY-NLGWGAYWRDTQ 155
Cdd:cd11663      1 SGAPMPHDDLVSYYSHHAKTGAYLTWPWSVAQTLRTNHPQAQMHVTMSGAVVNNVNDLSTLGNVSGYsNTNWGAPWKNAY 80
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1815986075  156 NGTKTSGGYNALDTIHFSGHHTMGPLVGNDYFLKDLIYQNVTLAQDYFLGDSFKSSKGFFPTELGFSERIIPVLTKLGIE 235
Cdd:cd11663     81 NTLKTPAGNRTLDLIHFTGHHSMGPLVGNDYMLKDLIYQNATLAQPYFLGSDFQSSKGFFPTELGFSERIIPVLEKLGIQ 160
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1815986075  236 WSVLGNVHYSRTLRDYPYLNDPGKDTLISPPNRADLQNESNVGAWTELHMFNEQQVTYNKFPFASIPHWVQYIDPETGEQ 315
Cdd:cd11663    161 WSVIGNNHFSRTLRDYPLLNDPGSDTMVSPPNRADLQNVSTVGSWVSQPMFNEQQVVRNKYPFASTPHWVRYVDPATGAE 240
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1815986075  316 HKVAGIPVEQASSWEEGYQGSITADVLKAFEGDAAalgRTQYFTIAHDGDNSSGRAGDGGTWANSGNVTYADSSVRGMGV 395
Cdd:cd11663    241 SRVVGVPVAQAESWEEGYQGQVTADALKPFEGLVP---QKQFFVIAHDGDNSSGRAGSEETWRNAGNVTYADSGVKGMGI 317
                          330       340       350       360
                   ....*....|....*....|....*....|....*....|....*.
gi 1815986075  396 DEYLKAYPIPADDIVHVQDGSWIDTRDSSADPTWYHWHIPMGVWRG 441
Cdd:cd11663    318 DEYLRTNTPAASDVVHVQDGSWIDTRDSSSDPQWHHWKLPFGIWKG 363
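
The paired rows above (query on the "gi" line, domain model on the "Cdd:" line) can be compared column by column. The sketch below estimates percent identity for one pair of rows. It assumes that the last three whitespace-separated fields of a row are the start coordinate, the residues, and the end coordinate, and that "-" marks a gap while lowercase letters mark residues not aligned to the model. Both function names are hypothetical.

```python
def parse_alignment_row(row):
    """Split one alignment row into (start, residues, end).

    Assumes the last three whitespace-separated fields are the start
    coordinate, the aligned residues, and the end coordinate.
    """
    fields = row.split()
    return int(fields[-3]), fields[-2], int(fields[-1])

def percent_identity(query, subject):
    """Percent identity over columns where both rows have an uppercase residue."""
    if len(query) != len(subject):
        raise ValueError("aligned rows must have equal length")
    compared = matches = 0
    for q, s in zip(query, subject):
        if not (q.isupper() and s.isupper()):
            continue  # skip gaps ('-') and lowercase (treated as unaligned)
        compared += 1
        matches += q == s
    return 100.0 * matches / compared if compared else 0.0

q_start, q_seq, q_end = parse_alignment_row(
    "gi 1815986075  396 DEYLKAYPIPADDIVHVQDGSWIDTRDSSADPTWYHWHIPMGVWRG 441")
s_start, s_seq, s_end = parse_alignment_row(
    "Cdd:cd11663    318 DEYLRTNTPAASDVVHVQDGSWIDTRDSSSDPQWHHWKLPFGIWKG 363")
print(f"{percent_identity(q_seq, s_seq):.1f}% identity over query {q_start}-{q_end}")
```

Identity computed this way covers a single displayed block; a figure for the whole hit would need the residues from every block of the alignment concatenated first.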
GH57N_like cd01022
N-terminal catalytic domain of heat stable retaining glycoside hydrolase family 57; Glycoside ...
83-425 4.10e-44

N-terminal catalytic domain of heat stable retaining glycoside hydrolase family 57; Glycoside hydrolase family 57 (GH57) is a chiefly prokaryotic family with the majority of thermostable enzymes coming from extremophiles (many of these are archaeal hyperthermophiles), which exhibit the enzyme specificities of alpha-amylase (EC 3.2.1.1), 4-alpha-glucanotransferase (EC 2.4.1.25), amylopullulanase (EC 3.2.1.1/41), and alpha-galactosidase (EC 3.2.1.22). This family also includes many hypothetical proteins with uncharacterized activity and specificity. GH57s cleave alpha-glycosidic bonds by employing a retaining mechanism, which involves a glycosyl-enzyme intermediate, allowing transglycosylation.


Pssm-ID: 212096 [Multi-domain]  Cd Length: 313  Bit Score: 162.99  E-value: 4.10e-44
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1815986075   83 HDDLVSYYQHHAKKGAYLSWPMDTARNNNGNhPQSQTHVTMSASVINNVQSFGELGNLDG-YNLGWGAYWRDTQngtktS 161
Cdd:cd01022     16 DQPLGEEWLHEAIAGCYIPLLELLEDLVDEG-PDPKVALTISGVLLEQLADPVVQKGFTSrYNDNVLDALKELV-----D 89
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1815986075  162 GGYnaLDTIHFSGHHTMGPLVGNdyflKDLIYQNVTLAQDYFLGDSFKSSKGFFPTELGFSERIIPVLTKLGIEWSVLGN 241
Cdd:cd01022     90 TGQ--VELLGCGYTHAYLPLLGP----KEDVRAQIEAGLDTFERLFGRRPKGVWLPECAYRPGLEKVLREAGIEYFVVDP 163
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1815986075  242 VHYSRTlrdypylndPGKDTLISPPNRADLQNESnvgawteLHMFNEQQVTYNKFPFASIPHWVQYidpetgeqhkvagi 321
Cdd:cd01022    164 DHFSRA---------GDGETAPHRPYWLPLGGRG-------LIIFARDQGLSQKIWFRSLGYPGDP-------------- 213
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1815986075  322 PVEQASSWEEGYQGSITADVLKAFegdaaalGRTQYFTIAHDGDNSSGRAGDG-GTWANSGNVTYADSSVRGMGVDEYLK 400
Cdd:cd01022    214 AMEQAEEHAADFAGYLERVLRELF-------GRPAVVVIALDGENFGHRWFEGvGFLRELLELLTSSEKLKLVTPSEYLE 286
                          330       340
                   ....*....|....*....|....*
gi 1815986075  401 AYPiPADDIVHVQDGSWIDTRDSSA 425
Cdd:cd01022    287 ALE-PRGGVVELADGSWGAGGDFSI 310
myxo_dep_M36 NF038112
myxosortase-dependent M36 family metallopeptidase; Members of this bacterial protein family ...
974-1320 1.86e-28

myxosortase-dependent M36 family metallopeptidase; Members of this bacterial protein family have an M36 family metallopeptidase domain, like fungalysin (see PF02128), and a C-terminal MYXO-CTERM domain (see TIGR03901), suggesting processing and surface-anchoring by the still-unknown putative transpeptidase, myxosortase. Members of this family include MXAN_3564 (mepA), part of the effector cargo of outer membrane vesicles that the species produces in large numbers during predation on other microbes.


Pssm-ID: 468355 [Multi-domain]  Cd Length: 1597  Bit Score: 125.15  E-value: 1.86e-28
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1815986075  974 NNAPVAVISPaSQSVAKGTVVTLSGAGSSDSDGSIASYSWS---------TGESTESIS-----VTVNDTQTISLTVTDN 1039
Cdd:NF038112  1185 NRRPVANAGP-DQTVLERTTVTLNGSGSFDPDGDPLTYAWTqvsgpavtlTGADTATPSftapeVTADTVLTFQLVVSDG 1263
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1815986075 1040 QGKTATSSVTLTVI-PNKVPVASISPAnQTVAAGTTVTLDGAGsSDEDGSIASYLWS---------TGATTSSIS----- 1104
Cdd:NF038112  1264 TKTSAPDTVTVLVRnVNRAPVAVAGAP-ATVDERSTVTLDGSG-TDADGDALTYAWTqtsgpavtlTGATTATATftape 1341
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1815986075 1105 VVVNASQTISLTVTDNegaAATAEAVLSVesdeKAKNFNQlyfRGTANGWATTAMDLVADNTWQAV-IDFDGQAeqrfkl 1183
Cdd:NF038112  1342 VTADTQLTFTLTVSDG---TASATDTVTV----TVRNVNR---APVANAGADQTVDERSTVTLSGSaTDPDGDA------ 1405
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1815986075 1184 dVSGDWTQNYGD----TNSD-GVLEQTGGDIFTDVVGSYLLEVND-----QTLAYSITELNANQAPNAIIGASTTqVDIG 1253
Cdd:NF038112  1406 -LTYAWTQTAGPtvtlTGADtATASFTAPEVAADTELTFQLTVSAdgqasADVTVTVTVRNVNRAPVAHAGESIT-VDEG 1483
                          330       340       350       360       370       380       390
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....
gi 1815986075 1254 QSITYSAAGsSDSDGVIASYLWSNGD------TSETTTVTYNTAGSNSIG------LTVTDDGGKTAQASVVVEVIDPN 1320
Cdd:NF038112  1484 STVTLDASA-TDPDGDTLTYAWTQVAgpsvtlTGADSAKLTFTAPEVSADttltfsLTVTDGSGSSGPVVVTVTVKNVN 1561
CBM_25 smart01066
Carbohydrate binding domain;
876-949 3.05e-14

Carbohydrate binding domain;


Pssm-ID: 198134 [Multi-domain]  Cd Length: 83  Bit Score: 69.31  E-value: 3.05e-14
                            10        20        30        40        50        60        70
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*..
gi 1815986075   876 TSTLYYQ---DTNGWGQVCLHYSVDGTvTWTTAPGEPMQSLGDNWYSLTVDLEDGNQLEFVTNNCSGAWDNNGGQNY 949
Cdd:smart01066    3 TVTVYYNgllATSGAKNVYLHYGFGEN-NWTDVPDVRMEKTGEGWVKATIPVKEAYKLNFCFKDGAGNWDNNGGANY 78
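
The "Cd Length" value and the model coordinates on the "Cdd:" row together indicate how much of a domain model a hit spans. As a rough illustration, the smart01066 alignment above runs from model position 3 to 78 of an 83-residue model. The function name model_coverage is hypothetical, and inclusive coordinates are assumed.

```python
def model_coverage(model_start, model_end, cd_length):
    """Fraction of the domain model spanned by the aligned region.

    Coordinates are assumed to be 1-based and inclusive, as displayed
    on the "Cdd:" alignment rows.
    """
    if not (1 <= model_start <= model_end <= cd_length):
        raise ValueError("coordinates must satisfy 1 <= start <= end <= cd_length")
    return (model_end - model_start + 1) / cd_length

# CBM_25 (smart01066): model positions 3-78, Cd Length 83
coverage = model_coverage(3, 78, 83)
print(f"{coverage:.1%} of the model is spanned")
```

A span-based figure like this ignores internal gaps, so it is an upper bound on the fraction of model positions that are actually aligned.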
myxo_dep_M36 NF038112
myxosortase-dependent M36 family metallopeptidase; Members of this bacterial protein family ...
785-1057 5.34e-10

myxosortase-dependent M36 family metallopeptidase; Members of this bacterial protein family have an M36 family metallopeptidase domain, like fungalysin (see PF02128), and a C-terminal MYXO-CTERM domain (see TIGR03901), suggesting processing and surface-anchoring by the still-unknown putative transpeptidase, myxosortase. Members of this family include MXAN_3564 (mepA), part of the effector cargo of outer membrane vesicles that the species produces in large numbers during predation on other microbes.


Pssm-ID: 468355 [Multi-domain]  Cd Length: 1597  Bit Score: 64.29  E-value: 5.34e-10
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1815986075  785 TDNVVVG----DKRPTALIDASGgEVELGTVVTLSAAGSsDEEGPIASYLWS---------TGETSPSIT-----VTLNE 846
Cdd:NF038112  1269 PDTVTVLvrnvNRAPVAVAGAPA-TVDERSTVTLDGSGT-DADGDALTYAWTqtsgpavtlTGATTATATftapeVTADT 1346
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1815986075  847 RTTISVVVTD-SVGQQASTSVTYRIIGQS-VTSTLYYQDTNGWGQVCLHYS---VDG---TVTWTTAPGEPMQSLGDNWY 918
Cdd:NF038112  1347 QLTFTLTVSDgTASATDTVTVTVRNVNRApVANAGADQTVDERSTVTLSGSatdPDGdalTYAWTQTAGPTVTLTGADTA 1426
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1815986075  919 SLTVDLEDGNQLEFVTNNcsgAWDNNGGQNyqidegdwnvagGAIVAGIPDGLDGNNAPVAViSPASQSVAKGTVVTLSG 998
Cdd:NF038112  1427 TASFTAPEVAADTELTFQ---LTVSADGQA------------SADVTVTVTVRNVNRAPVAH-AGESITVDEGSTVTLDA 1490
                          250       260       270       280       290       300       310
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|...
gi 1815986075  999 AGSsDSDGSIASYSWS---------TGESTESISVTVNDTQ-----TISLTVTDNQGKTATSSVTLTVIPNKV 1057
Cdd:NF038112  1491 SAT-DPDGDTLTYAWTqvagpsvtlTGADSAKLTFTAPEVSadttlTFSLTVTDGSGSSGPVVVTVTVKNVNR 1562
PKD_4 pfam18911
PKD domain; This entry is composed of PKD domains found in bacterial surface proteins.
1236-1316 6.69e-10

PKD domain; This entry is composed of PKD domains found in bacterial surface proteins.


Pssm-ID: 436824 [Multi-domain]  Cd Length: 85  Bit Score: 56.90  E-value: 6.69e-10
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1815986075 1236 NQAPNAIIGASTTqVDIGQSITYSAAGSSDSDGVIASYLWSNGD----TSETTTVTYNTAGSNSIGLTVTDD-GGKTAQA 1310
Cdd:pfam18911    1 NAAPVADAGGDRI-VAEGETVTFDASASDDPDGDILSYRWDFGDgttaTGANVSHTYAAPGTYTVTLTVTDDsGASNSTA 79

                   ....*.
gi 1815986075 1311 SVVVEV 1316
Cdd:pfam18911   80 TDTVTV 85
PKD_4 pfam18911
PKD domain; This entry is composed of PKD domains found in bacterial surface proteins.
974-1052 1.60e-09

PKD domain; This entry is composed of PKD domains found in bacterial surface proteins.


Pssm-ID: 436824 [Multi-domain]  Cd Length: 85  Bit Score: 55.74  E-value: 1.60e-09
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1815986075  974 NNAPVAVISPaSQSVAKGTVVTLSGAGSSDSDGSIASYSWSTGESTESISVTV------NDTQTISLTVTDNQG-KTATS 1046
Cdd:pfam18911    1 NAAPVADAGG-DRIVAEGETVTFDASASDDPDGDILSYRWDFGDGTTATGANVshtyaaPGTYTVTLTVTDDSGaSNSTA 79

                   ....*.
gi 1815986075 1047 SVTLTV 1052
Cdd:pfam18911   80 TDTVTV 85
CBM53 pfam16760
Starch/carbohydrate-binding module (family 53);
878-952 6.74e-09

Starch/carbohydrate-binding module (family 53);


Pssm-ID: 465261 [Multi-domain]  Cd Length: 76  Bit Score: 53.84  E-value: 6.74e-09
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1815986075  878 TLYYqDTNGWGQVCLHYSVDGtvtWTTAPGEPMQSL----GDNWYSLTVDL-EDGNQLEFVTNNCSGAWDNNGGQNYQID 952
Cdd:pfam16760    1 NIYY-NGSLAKEVYIHGGFNG---WKNVQDVPMEKLpptgGGDWFSATVPVpEDAYVLDFVFKDGAGNWDNNNGQNYHIP 76
COG4625 COG4625
Uncharacterized conserved protein, contains a C-terminal beta-barrel porin domain [Function ...
806-1325 3.17e-07

Uncharacterized conserved protein, contains a C-terminal beta-barrel porin domain [Function unknown];


Pssm-ID: 443664 [Multi-domain]  Cd Length: 900  Bit Score: 55.17  E-value: 3.17e-07
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1815986075  806 VELGTVVTLSAAGSSDEEGPIASYLWSTGETSPSITVTLNERTTISVVVTDSVGQQASTSVTYRIIGQSVTSTLYYQDTN 885
Cdd:COG4625      1 GGGGGGGGGGGGGGGGTGGGGAGGGGGAGGGAGGGGAGGGGGGGGGGGGAGGGGGGGGTGGGGGGGGGGGGGGAGGGGGG 80
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1815986075  886 GwgqvclHYSVDGTVTWTTAPGEPMQSLGDNWYSLTVDLEDGNQLEFVTNNCSGAWDNNGGQNYQIDEGDWNVAGGAIVA 965
Cdd:COG4625     81 G------GGGGGGGGTGGVGGGGGGGGGGGGGGGGGGGGGGGGSAGGGGGGAGGAGGGGGGGAGGGGGGGGGGGAGGGGG 154
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1815986075  966 GIPDGLDGNNAPVAVISPASQSVAKGTVVTLSGAGSSDSDGSIASYSWSTGESTESISVTVNDTQTISLTVTDNQGKTAT 1045
Cdd:COG4625    155 GGAGGAGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGNGGGGGGGGGGGGGGGGGGGGAGGGGGGGGGGGGGGGGGGGG 234
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1815986075 1046 SSVTLTVIPNKVPVASISPANQTVAAGTTVTLDGAGSSDEDGSIASYLWSTGATTSSISVVVNASQTISLTVTDNEGAAA 1125
Cdd:COG4625    235 GGGGGGGGGGGGGAGGGGGGGGGNGGGGGAGGGGGGGGGGSGGGGGGGGGGGSGGGGGGGGGGGGGGGGGGGGGGGGGGG 314
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1815986075 1126 TAEAVLSVESDEKAKNFNQLYFRGTANGWATTAMDLVADNTWQAVIDFDGQAEQRFKLDVSGDWTQNYGDTNSDGVLEQT 1205
Cdd:COG4625    315 GGGGGGGGGGGGGGGGGGGAGGGGGSGGAGAGGGGAGGGGAGGGGGGGTGGGGGGGGGGGGGSGGGGAGGGGGSGGGGGG 394
                          410       420       430       440       450       460       470       480
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1815986075 1206 GGDIFTDVVGSYLLEVNDQTLAYSITELNANQAPNAIIGASTTQVDIGQSITYSAAGSSDSDGVIASYLWSNGDTSETTT 1285
Cdd:COG4625    395 GAGGGGGGGGAGGTGGGGAGGGGGAAGGGGGGTGAGGGGGGGGTGAGGGGATGGGGGGGGGAGGSGGGAGAGGGSGSGAG 474
                          490       500       510       520
                   ....*....|....*....|....*....|....*....|
gi 1815986075 1286 VTYNTAGSNSIGLTVTDDGGKTAQASVVVEVIDPNANFTS 1325
Cdd:COG4625    475 TLTLTGNNTYTGTTTVNGGGNYTQSAGSTLAVEVDAANSD 514
AidA COG3468
Autotransporter adhesin AidA [Cell wall/membrane/envelope biogenesis, Intracellular ...
991-1403 3.37e-07

Autotransporter adhesin AidA [Cell wall/membrane/envelope biogenesis, Intracellular trafficking, secretion, and vesicular transport];


Pssm-ID: 442691 [Multi-domain]  Cd Length: 846  Bit Score: 54.95  E-value: 3.37e-07
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1815986075  991 GTVVTLSGAGSSDSDGSIASYSWSTGESTESISVTVNDTQTISLTVTDNQGKTATSSVTLTVIPNKVPVASISPANQTVA 1070
Cdd:COG3468     18 GGGGGLGGTGGGNAGLGIGNGGGGGAASGSGAGGVAGNGGGGGGGAGGGGGGAGSGGGLAGAGSGGTGGNSTGGGGGNSG 97
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1815986075 1071 AGTTVTLDGAGSSDEDGSIASYLWSTGATTSSISVVVNASQTISLTVTDNEGAAATAEAVLSVESDEKAKNFNQLYFRGT 1150
Cdd:COG3468     98 TGGTGGGGGGGGSGNGGGGGGGGGGGGTGGGGGGGTGSAGGGGGGGGGGTGVGGTGAAAAGGGTGSGGGGSGGGGGAGGG 177
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1815986075 1151 ANGWATTAMDLVADNTWQAVIDFDGQAEQRFKLDVSGDWTQNYGDTNSDGVLEQTGGDIFTDVVGSYLLEVNDQTLAYSI 1230
Cdd:COG3468    178 GGGGAGGSGGAGSTGSGAGGGGGGSGGGGGAAGTGGGGGGGGGAGGATGGAGSGGNTGGGVGGGGGSAGGTGGGGLTGGG 257
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1815986075 1231 TELNANQAPNAIIGASTTQVDIGQSITYSAAGSSDSDGVIASYLWSNGDTSETTTVTYNTAGSNSIGLTVTDDGGKTAQA 1310
Cdd:COG3468    258 AAGTGGGGGGTGTGSGGGGGGGANGGGSGGGGGASGTGGGGTASTGGGGGGGGGNGGGGGGGSNAGGGSGGGGGGGGGGG 337
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1815986075 1311 SVVVEVIDPNANFTSNFEQLYFRGTANAWQTSAMSLVANNTWQVSVDFDGQNNQRFKFDLNGDWSTNYGDNNNDGTLEQT 1390
Cdd:COG3468    338 GGGTTLNGAGSAGGGTGAALAGTGGSGSGGGGGGGSGGGGGAGGGGANTGSDGVGTGLTTGGTGNNGGGGVGGGGGGGLT 417
                          410
                   ....*....|...
gi 1815986075 1391 GGDIFTGIVGSYV 1403
Cdd:COG3468    418 LTGGTLTVNGNYT 430
CBM_SusE-F_like_u1 cd12967
Uncharacterized subgroup of the CBM-SusE-F_like superfamily; The CBM SusE-F_like superfamily ...
1145-1231 6.91e-07

Uncharacterized subgroup of the CBM-SusE-F_like superfamily; The CBM SusE-F_like superfamily includes starch-specific CBMs (carbohydrate-binding modules) of SusE and SusF, two cell surface lipoproteins within the Sus (starch-utilization system) of the human gut symbiont Bacteroides thetaiotaomicron. These CBMs have no enzymatic activity. The precise mechanistic roles of SusE and SusF in starch metabolism are unclear. Both proteins have an N-terminal domain which may belong to the immunoglobulin superfamily (IgSF), followed by two or three tandem starch-binding CBMs. SusF has three CBMs (CBM-Fa, -Fb, and -Fc; F denotes SusF, and they are labeled alphabetically from the N- to the C-terminus). SusE has two CBMs (CBM-Eb and -Ec, corresponding to CBM-Fb and -Fc). Each starch-binding site contains an arc of aromatic amino acids for hydrophobic stacking with glucose, and hydrogen-bonding acceptors and donors for interacting with the O-2 and O-3 of glucose. These five CBMs differ in their affinity for different starch oligosaccharides, and they also contribute differently to binding insoluble starch. Proteins in this group are present in species of the Gram-negative phylum Bacteroidetes.


Pssm-ID: 240566  Cd Length: 91  Bit Score: 48.44  E-value: 6.91e-07
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1815986075 1145 LYFRGTANGWATTAMDLVA----DNTWQAVIDFDGQAEqrFKLDVSGDWTQ-NYGDTNSDGVLEQTGGDIFTDVVGSYLL 1219
Cdd:cd12967      2 LYVPGNYQGWNPATAPALYspngDGKYEGYVYLPGNFE--FKFTTAPNWDGdYGGDGGGGGLLDGGGGNIKAPEAGYYKV 79
                           90
                   ....*....|..
gi 1815986075 1220 EVNDQTLAYSIT 1231
Cdd:cd12967     80 TVDLNDLTYSLT 91
AidA COG3468
Autotransporter adhesin AidA [Cell wall/membrane/envelope biogenesis, Intracellular ...
942-1373 1.01e-06

Autotransporter adhesin AidA [Cell wall/membrane/envelope biogenesis, Intracellular trafficking, secretion, and vesicular transport];


Pssm-ID: 442691 [Multi-domain]  Cd Length: 846  Bit Score: 53.41  E-value: 1.01e-06
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1815986075  942 DNNGGQNYQIDEGDWNVAGGAIVAGIPDGLDGNNAPVAVISPASQSVAKGTVVTLSGAGSSDSDGSIASYSWSTGESTES 1021
Cdd:COG3468     10 TGLGGGGTGGGGGLGGTGGGNAGLGIGNGGGGGAASGSGAGGVAGNGGGGGGGAGGGGGGAGSGGGLAGAGSGGTGGNST 89
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1815986075 1022 ISVTVNDTQTISLTVTDNQGKTATSSVTLTVIPNKVPVASISPANQTVAAGTTVTLDGAGSSDEDGSIASYLWSTGATTS 1101
Cdd:COG3468     90 GGGGGNSGTGGTGGGGGGGGSGNGGGGGGGGGGGGTGGGGGGGTGSAGGGGGGGGGGTGVGGTGAAAAGGGTGSGGGGSG 169
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1815986075 1102 SISVVVNASQTISLTVTDNEGAA--ATAEAVLSVESDEKAKNFNQLYFRGTANGWATTAMDLVADNTWQAVIDFDGQAEQ 1179
Cdd:COG3468    170 GGGGAGGGGGGGAGGSGGAGSTGsgAGGGGGGSGGGGGAAGTGGGGGGGGGAGGATGGAGSGGNTGGGVGGGGGSAGGTG 249
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1815986075 1180 RFKLDVSGDWTQNYGDTNSDGVLEQTGGDIFTDVVGSYLLEVNDQTLAYSITELNANQAPNAIIGASTTQVDIGQSITYS 1259
Cdd:COG3468    250 GGGLTGGGAAGTGGGGGGTGTGSGGGGGGGANGGGSGGGGGASGTGGGGTASTGGGGGGGGGNGGGGGGGSNAGGGSGGG 329
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1815986075 1260 AAGSSDSDGVIASYLWSNGDTSETT-TVTYNTAGSNSIGLTVTDDGGKTAQASVVVEVIDPNANFTSNFEQLYFRGTANA 1338
Cdd:COG3468    330 GGGGGGGGGGGTTLNGAGSAGGGTGaALAGTGGSGSGGGGGGGSGGGGGAGGGGANTGSDGVGTGLTTGGTGNNGGGGVG 409
                          410       420       430
                   ....*....|....*....|....*....|....*.
gi 1815986075 1339 WQTSAMSLVANNTWQVSVDFDGQN-NQRFKFDLNGD 1373
Cdd:COG3468    410 GGGGGGLTLTGGTLTVNGNYTGNNgTLVLNTVLGDD 445
COG4625 COG4625
Uncharacterized conserved protein, contains a C-terminal beta-barrel porin domain [Function ...
957-1410 1.46e-06

Uncharacterized conserved protein, contains a C-terminal beta-barrel porin domain [Function unknown];


Pssm-ID: 443664 [Multi-domain]  Cd Length: 900  Bit Score: 52.86  E-value: 1.46e-06
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1815986075  957 NVAGGAIVAGIPDGLDGNNAPVAVISPASQSVAKGTVVTLSGAGSSDSDGSIASYSWSTGESTESISVTVNDTQTISLTV 1036
Cdd:COG4625     57 GGTGGGGGGGGGGGGGGAGGGGGGGGGGGGGGGTGGVGGGGGGGGGGGGGGGGGGGGGGGGSAGGGGGGAGGAGGGGGGG 136
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1815986075 1037 TDNQGKTATSSVTLTVIPNKVPVASISPANQTVAAGTTVTLDGAGSSDEDGSIASYLWSTGATTSSISVVVNASQ--TIS 1114
Cdd:COG4625    137 AGGGGGGGGGGGAGGGGGGGAGGAGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGNGGGGGGGGGGGGGGGGGGggAGG 216
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1815986075 1115 LTVTDNEGAAATAEAVLSVESDEKAKNFNQLYFRGTANGWATTAMDLVADNTWQAVIDFDGQAEQRFKLDVSGDWTQNYG 1194
Cdd:COG4625    217 GGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGAGGGGGGGGGNGGGGGAGGGGGGGGGGSGGGGGGGGGGGSGGGGGGGGG 296
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1815986075 1195 DTNSDGVLEQTGGDIFTDVVGSYLLEVNDQTLAYSITELNANQAPNAIIGASTTQVDIGQSITYSAAGSSDSDGVIASYL 1274
Cdd:COG4625    297 GGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGAGGGGGSGGAGAGGGGAGGGGAGGGGGGGTGGGGGGGGGGGGG 376
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1815986075 1275 WSNGDTSETTTVTYNTAGSNSIGLTVTDDGGKTAQASVVVEVIDPNANFTSNFEQLYFRGTANAWQTSAMSLVANNTWQV 1354
Cdd:COG4625    377 SGGGGAGGGGGSGGGGGGGAGGGGGGGGAGGTGGGGAGGGGGAAGGGGGGTGAGGGGGGGGTGAGGGGATGGGGGGGGGA 456
                          410       420       430       440       450
                   ....*....|....*....|....*....|....*....|....*....|....*.
gi 1815986075 1355 SVDFDGQNNQRFKFDLNGDWSTNYGDNNNDGTLEQTGGDIFTGIVGSYVVEVNDAT 1410
Cdd:COG4625    457 GGSGGGAGAGGGSGSGAGTLTLTGNNTYTGTTTVNGGGNYTQSAGSTLAVEVDAAN 512
CBM_25 pfam03423
Carbohydrate binding domain (family 25);
877-968 1.61e-06

Carbohydrate binding domain (family 25);


Pssm-ID: 367491  Cd Length: 101  Bit Score: 47.87  E-value: 1.61e-06
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1815986075  877 STLYYqdTNGWGQVCLHYSVDGTvTWTTAPGEPMQS-LGDNWYSLTVDLEDGNQLEFVTNNCSGAWDNNGGQNYQIDEGD 955
Cdd:pfam03423    6 VTIYY--KKGFNSPYIHYRPAGG-SWTAAPGVKMQDaEISGYAKITVDIGSASQLEAAFNDGNNNWDSNNTKNYSFSTGT 82
                           90
                   ....*....|...
gi 1815986075  956 WNVAGGAIVAGIP 968
Cdd:pfam03423   83 STYTPGNSGNAGT 95
AidA COG3468
Autotransporter adhesin AidA [Cell wall/membrane/envelope biogenesis, Intracellular ...
911-1326 2.62e-06

Autotransporter adhesin AidA [Cell wall/membrane/envelope biogenesis, Intracellular trafficking, secretion, and vesicular transport];


Pssm-ID: 442691 [Multi-domain]  Cd Length: 846  Bit Score: 52.26  E-value: 2.62e-06
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1815986075  911 QSLGDNWYSLTVDLEDGNQLEFVTNNCSGAWDNNGGQNYQIDEGDWNVAGGAIVAGIPDGLDGNNAPVAVISPASQSVAK 990
Cdd:COG3468     42 GAASGSGAGGVAGNGGGGGGGAGGGGGGAGSGGGLAGAGSGGTGGNSTGGGGGNSGTGGTGGGGGGGGSGNGGGGGGGGG 121
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1815986075  991 GTVVTLSGAGSSDSDGSIASYSwstgestesISVTVNDTQTISLTVTDNQGKTATSSVTLTVIPNKVPVASISPANQTVA 1070
Cdd:COG3468    122 GGGTGGGGGGGTGSAGGGGGGG---------GGGTGVGGTGAAAAGGGTGSGGGGSGGGGGAGGGGGGGAGGSGGAGSTG 192
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1815986075 1071 AGTTVTLDGAGSSDEDGSIASYLWSTGATTSSISVVVNASQTISLTVTDNEGAAATAEAVLSVESDEKAKNFNQLYFRGT 1150
Cdd:COG3468    193 SGAGGGGGGSGGGGGAAGTGGGGGGGGGAGGATGGAGSGGNTGGGVGGGGGSAGGTGGGGLTGGGAAGTGGGGGGTGTGS 272
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1815986075 1151 ANGWATTAMDLVADNTWQAVIDFDGqaeqrfkldvsgdwtqNYGDTNSDGVLEQTGGDIFTDVVGSYLLEVNDQTLAYSI 1230
Cdd:COG3468    273 GGGGGGGANGGGSGGGGGASGTGGG----------------GTASTGGGGGGGGGNGGGGGGGSNAGGGSGGGGGGGGGG 336
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1815986075 1231 TELNANQAPNAIIGASTTQVDIGQSITYSAAGSSDSDGVIASYLWSNGDTSETTTVTYNTAGSNSIGLTVTDDGGKTAQA 1310
Cdd:COG3468    337 GGGGTTLNGAGSAGGGTGAALAGTGGSGSGGGGGGGSGGGGGAGGGGANTGSDGVGTGLTTGGTGNNGGGGVGGGGGGGL 416
                          410
                   ....*....|....*.
gi 1815986075 1311 SVVVEVIDPNANFTSN 1326
Cdd:COG3468    417 TLTGGTLTVNGNYTGN 432
ser_rich_anae_1 NF033849
serine-rich protein; This serine-rich protein belongs to a family with large size (over 1000 ...
831-1115 2.66e-06

serine-rich protein; This serine-rich protein belongs to a family of large proteins (over 1000 amino acids) with a highly serine-rich central region that averages over 300 aa in length. Species encoding members of this family of proteins tend to be anaerobic bacteria, including Gram-positive bacteria of the human gut microbiome and Chloroflexi from marine sediments.


Pssm-ID: 468206 [Multi-domain]  Cd Length: 1122  Bit Score: 52.31  E-value: 2.66e-06
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1815986075  831 WSTGEtSPSITVTLNERTTISVVVTDSVGQQASTSVTYriiGQSVTSTLYYQDTNGWGQVCLHySVDGTVTWTTAPGEPM 910
Cdd:NF033849   285 WSHTQ-STSESESTGQSSSVGTSESQSHGTTEGTSTTD---SSSHSQSSSYNVSSGTGVSSSH-SDGTSQSTSISHSESS 359
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1815986075  911 QSLGDNWYSLTVDLEDGNQLEFVTNNCSGawdNNGGqnyqidegdwnvAGGAIVAGipdGLDGNNAPVAVisPASQSVAK 990
Cdd:NF033849   360 SESTGTSVGHSTSSSVSSSESSSRSSSSG---VSGG------------FSGGIAGG---GVTSEGLGASQ--GGSEGWGS 419
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1815986075  991 GTVVTLSGAGSSDSDGSIASYSWSTGeSTESISVTVNDTQTISLTVTDNQGKTATSSVTltvipNKVPVASISPANQTVA 1070
Cdd:NF033849   420 GDSVQSVSQSYGSSSSTGTSSGHSDS-SSHSTSSGQADSVSQGTSWSEGTGTSQGQSVG-----TSESWSTSQSETDSVG 493
                          250       260       270       280
                   ....*....|....*....|....*....|....*....|....*....
gi 1815986075 1071 --AGTTVTL-DGAGSSDEDGSIASYLWST-GATTSSISVVVNASQTISL 1115
Cdd:NF033849   494 dsTGTSESVsQGDGRSTGRSESQGTSLGTsGGRTSGAGGSMGLGPSISL 542
CBM26 pfam16738
Starch-binding module 26; CBM26 is a carbohydrate-binding module that binds starch.
880-948 7.27e-06

Starch-binding module 26; CBM26 is a carbohydrate-binding module that binds starch.


Pssm-ID: 407007  Cd Length: 68  Bit Score: 44.96  E-value: 7.27e-06
                           10        20        30        40        50        60        70
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....
gi 1815986075  880 YYQDTNGWGQVCLHY-----SVDGTVTWttaPGEPMQSLGDNWYSLTVDLEDGNQLEFvtnncsgawdNNGGQN 948
Cdd:pfam16738    2 YFKNPSGWGTPYIYYwddspGSSVGLAW---PGVAMTDEGNGWYSYTLPGATSAKVIF----------NNGGGN 62
AidA COG3468
Autotransporter adhesin AidA [Cell wall/membrane/envelope biogenesis, Intracellular ...
848-1296 7.64e-06

Autotransporter adhesin AidA [Cell wall/membrane/envelope biogenesis, Intracellular trafficking, secretion, and vesicular transport];


Pssm-ID: 442691 [Multi-domain]  Cd Length: 846  Bit Score: 50.72  E-value: 7.64e-06
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1815986075  848 TTISVVVTDSVGQQASTSVTYRIIGQSVTSTLYYQDTNGwgqvclhYSVDGTVTWTTAPGEPMQSLGDNWYSLTVDLEDG 927
Cdd:COG3468      3 SGGGGGATGLGGGGTGGGGGLGGTGGGNAGLGIGNGGGG-------GAASGSGAGGVAGNGGGGGGGAGGGGGGAGSGGG 75
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1815986075  928 NQLEFVTNNCSGAWDNNGGQNYqiDEGDWNVAGGAIVAGIPDGLDGNNAPVAVISPASQSVAKGTVVTLSGAGSSDSDGS 1007
Cdd:COG3468     76 LAGAGSGGTGGNSTGGGGGNSG--TGGTGGGGGGGGSGNGGGGGGGGGGGGTGGGGGGGTGSAGGGGGGGGGGTGVGGTG 153
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1815986075 1008 IASYSWSTGESTESISVTVNDTQTISLTVTDNQGKTATSSVTLTVipnkvpvASISPANQTVAAGTTVTLDGAGSSDEDG 1087
Cdd:COG3468    154 AAAAGGGTGSGGGGSGGGGGAGGGGGGGAGGSGGAGSTGSGAGGG-------GGGSGGGGGAAGTGGGGGGGGGAGGATG 226
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1815986075 1088 SIASYLWSTGATTSSISVVVNASQTISLTVTDNEGAAATAEAVLSVESDEKAKNFNQLYFRGTANGWATTAMDLVADNTW 1167
Cdd:COG3468    227 GAGSGGNTGGGVGGGGGSAGGTGGGGLTGGGAAGTGGGGGGTGTGSGGGGGGGANGGGSGGGGGASGTGGGGTASTGGGG 306
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1815986075 1168 QAVIDFDGQAEQRFKLDVSGDWTQNYGDTNSDGVLEQTGGDIFTDVVGSYLLEVNDQTLAYSITELNANQAPNAIIGAST 1247
Cdd:COG3468    307 GGGGGNGGGGGGGSNAGGGSGGGGGGGGGGGGGGTTLNGAGSAGGGTGAALAGTGGSGSGGGGGGGSGGGGGAGGGGANT 386
                          410       420       430       440
                   ....*....|....*....|....*....|....*....|....*....
gi 1815986075 1248 TQVDIGQSITYSAAGSSDSDGVIASYLWSNGDTSETTTVTYNTAGSNSI 1296
Cdd:COG3468    387 GSDGVGTGLTTGGTGNNGGGGVGGGGGGGLTLTGGTLTVNGNYTGNNGT 435
PKD smart00089
Repeats in polycystic kidney disease 1 (PKD1) and other proteins; Polycystic kidney disease 1 ...
1240-1315 1.27e-05

Repeats in polycystic kidney disease 1 (PKD1) and other proteins; Polycystic kidney disease 1 protein contains 14 repeats, which are also present elsewhere, such as in microbial collagenases.


Pssm-ID: 214510 [Multi-domain]  Cd Length: 79  Bit Score: 44.75  E-value: 1.27e-05
                            10        20        30        40        50        60        70        80
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1815986075  1240 NAIIGASTTQVDIGQSITYSAagSSDSDGVIASYLWSNGD----TSETTTVTYNTAGSNSIGLTVTDDGGK-TAQASVVV 1314
Cdd:smart00089    1 VADVSASPTVGVAGESVTFTA--TSSDDGSIVSYTWDFGDgtssTGPTVTHTYTKPGTYTVTLTVTNAVGSaSATVTVVV 78

                    .
gi 1815986075  1315 E 1315
Cdd:smart00089   79 Q 79
CBM_SusE-F_like cd12956
carbohydrate-binding modules from Bacteroides thetaiotaomicron SusE, SusF and similar proteins; ...
1348-1416 4.84e-05

carbohydrate-binding modules from Bacteroides thetaiotaomicron SusE, SusF and similar proteins; This group includes five starch-specific CBMs (carbohydrate-binding modules) of SusE and SusF, two cell surface lipoproteins within the starch-utilization system (Sus) of the human gut symbiont Bacteroides thetaiotaomicron. These CBMs have no enzymatic activity. The precise mechanistic roles of SusE and SusF in starch metabolism are unclear. Both proteins contain an N-terminal domain which may belong to the immunoglobulin superfamily (IgSF), followed by two or three tandem starch-binding CBMs. SusF has three CBMs (CBM-Fa, -Fb, and -Fc; F denotes SusF, and they are labeled alphabetically from the N- to C-terminus). SusE has two CBMs (CBM-Eb and -Ec, corresponding to CBM-Fb and -Fc). Each starch-binding site contains an arc of aromatic amino acids for hydrophobic stacking with glucose, and hydrogen-bonding acceptors and donors for interacting with the O-2 and O-3 of glucose. These five CBMs differ in their affinity for various starch oligosaccharides, and they also contribute differently to binding insoluble starch. CBM-Fa (the CBM unique to SusF) does not bind insoluble starch; CBM-Fb and CBM-Fc both do, and deletion of either one decreases the overall affinity of SusF for starch. Both CBM-Eb and CBM-Ec are needed for SusE to bind tightly to starch. CBM-Ec has an additional starch-binding loop that may mediate interactions with partially unwound single helical forms of starch or small starch-breakdown products. Proteins in this group are present in species of the Gram-negative Bacteroidetes phylum.


Pssm-ID: 240562  Cd Length: 93  Bit Score: 43.50  E-value: 4.84e-05
                           10        20        30        40        50        60        70
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1815986075 1348 ANNTWQVSVDFDGqnNQRFKFDLNGDWSTNYGDNNNDGT-LEQTGGDIFTGIVGSYVVEVNDATLQYRII 1416
Cdd:cd12956     26 TDGTFVSYATLAG--DGEIKFRPNNDWGENYGDDGDDGTfLSSGGDNIAVSAGGTYKITLNLNNNTYTIE 93
tand_rpt_95 NF012211
tandem-95 repeat; This 95-amino acid repeat occurs in tandem in proteins that may be several ...
987-1060 2.04e-04

tandem-95 repeat; This 95-amino acid repeat occurs in tandem in proteins that may be several thousand amino acids long.


Pssm-ID: 333740 [Multi-domain]  Cd Length: 98  Bit Score: 41.85  E-value: 2.04e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1815986075  987 SVAKGTVVTLSG----AGSSDSDGSIASYSWSTGESTESISVTVNDTQTISLT--------------VTDNQGKTATSSV 1048
Cdd:NF012211     1 STDEDGSITITQeqllANASDVDGDDLTVSNVSYSGPTNGTVTDNGDGTYTYTpnenfngddsftytVSDGTGATATATV 80
                           90
                   ....*....|...
gi 1815986075 1049 TLTVIP-NKVPVA 1060
Cdd:NF012211    81 FVTVNPvNDAPVA 93
PKD cd00146
polycystic kidney disease I (PKD) domain; similar to other cell-surface modules, with an ...
1250-1316 2.67e-04

polycystic kidney disease I (PKD) domain; similar to other cell-surface modules, with an IG-like fold; domain probably functions as a ligand binding site in protein-protein or protein-carbohydrate interactions; a single instance of the repeat is presented here. The domain is also found in microbial collagenases and chitinases.


Pssm-ID: 238084 [Multi-domain]  Cd Length: 81  Bit Score: 40.94  E-value: 2.67e-04
                           10        20        30        40        50        60        70
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|...
gi 1815986075 1250 VDIGQSITYSAagSSDSDGVIASYLWSNGD------TSETTTVTYNTAGSNSIGLTVTDDGGKTAQASVVVEV 1316
Cdd:cd00146     11 AELGASVTFSA--SDSSGGSIVSYKWDFGDgevsssGEPTVTHTYTKPGTYTVTLTVTNAVGSSSTKTTTVVV 81
PHA03255 PHA03255
BDLF3; Provisional
992-1138 5.43e-04

BDLF3; Provisional


Pssm-ID: 165513 [Multi-domain]  Cd Length: 234  Bit Score: 43.35  E-value: 5.43e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1815986075  992 TVVTLSGAGSSDSDGSIASYSWSTGESTESISVTVNDTQTISLTVTDNQGKTATSSVTLT---VIPNKVPVASISPANQT 1068
Cdd:PHA03255    25 TSSGSSTASAGNVTGTTAVTTPSPSASGPSTNQSTTLTTTSAPITTTAILSTNTTTVTSTgttVTPVPTTSNASTINVTT 104
                           90       100       110       120       130       140       150
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|..
gi 1815986075 1069 VAAGTTVTLDGAGSSDEDGSIASYLWSTGATTSSISVVVNAsqTISLTVTDNEGAAATAE--AVLSVESDEK 1138
Cdd:PHA03255   105 KVTAQNITATEAGTGTSTGVTSNVTTRSSSTTSATTRITNA--TTLAPTLSSKGTSNATKttAELPTVPDER 174
COG4625 COG4625
Uncharacterized conserved protein, contains a C-terminal beta-barrel porin domain [Function ...
758-1120 7.51e-04

Uncharacterized conserved protein, contains a C-terminal beta-barrel porin domain [Function unknown];


Pssm-ID: 443664 [Multi-domain]  Cd Length: 900  Bit Score: 44.00  E-value: 7.51e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1815986075  758 GAGQFSSVGGKTVEDVNGAIAGEAMFFTDNVVVGDKRPTALIDASGGEVELGTVVTLSAAGSSDEEGPIASYLWSTGETS 837
Cdd:COG4625    163 GGGGGGGGGGGGGGGGGGGGGGGGGGGGGGNGGGGGGGGGGGGGGGGGGGGAGGGGGGGGGGGGGGGGGGGGGGGGGGGG 242
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1815986075  838 PSITVTLNERTTISVVVTDSVGQQASTSVTYRIIGQSVTSTLYYQDTNGWGQVCLHYSVDGTVTWTTAPGepmqSLGDNW 917
Cdd:COG4625    243 GGGGGAGGGGGGGGGNGGGGGAGGGGGGGGGGSGGGGGGGGGGGSGGGGGGGGGGGGGGGGGGGGGGGGG----GGGGGG 318
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1815986075  918 YSLTVDLEDGNQLEFVTNNCSGAWDNNGGQNYQIDEGDWNVAGGAIVAGIPDGLDGNNAPVAVISPASQSVAKGTVVTLS 997
Cdd:COG4625    319 GGGGGGGGGGGGGGGAGGGGGSGGAGAGGGGAGGGGAGGGGGGGTGGGGGGGGGGGGGSGGGGAGGGGGSGGGGGGGAGG 398
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1815986075  998 GAGSSDSDGSIASYSWSTGESTESISVTVNDTQTISLTVTDNQGKTATSSVTLTVIPNKVPVASISPANQTVAAGTTVTL 1077
Cdd:COG4625    399 GGGGGGAGGTGGGGAGGGGGAAGGGGGGTGAGGGGGGGGTGAGGGGATGGGGGGGGGAGGSGGGAGAGGGSGSGAGTLTL 478
                          330       340       350       360
                   ....*....|....*....|....*....|....*....|...
gi 1815986075 1078 DGAGSSDEDGSIASYLWSTGATTSSISVVVNASQTISLTVTDN 1120
Cdd:COG4625    479 TGNNTYTGTTTVNGGGNYTQSAGSTLAVEVDAANSDRLVVTGT 521
PKD cd00146
polycystic kidney disease I (PKD) domain; similar to other cell-surface modules, with an ...
980-1052 7.78e-04

polycystic kidney disease I (PKD) domain; similar to other cell-surface modules, with an IG-like fold; domain probably functions as a ligand binding site in protein-protein or protein-carbohydrate interactions; a single instance of the repeat is presented here. The domain is also found in microbial collagenases and chitinases.


Pssm-ID: 238084 [Multi-domain]  Cd Length: 81  Bit Score: 39.79  E-value: 7.78e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1815986075  980 VISPASQSVAKGTVVTLSGAGSSDSDGSIASYSW--------STGESTESISVTVNDTQTISLTVTDNQGKTATSSVTLT 1051
Cdd:cd00146      1 PTASVSAPPVAELGASVTFSASDSSGGSIVSYKWdfgdgevsSSGEPTVTHTYTKPGTYTVTLTVTNAVGSSSTKTTTVV 80

                   .
gi 1815986075 1052 V 1052
Cdd:cd00146     81 V 81
CBM_SusE-F_like cd12956
carbohydrate-binding modules from Bacteroides thetaiotaomicron SusE, SusF and similar proteins; ...
1149-1231 1.04e-03

carbohydrate-binding modules from Bacteroides thetaiotaomicron SusE, SusF and similar proteins; This group includes five starch-specific CBMs (carbohydrate-binding modules) of SusE and SusF, two cell surface lipoproteins within the starch-utilization system (Sus) of the human gut symbiont Bacteroides thetaiotaomicron. These CBMs have no enzymatic activity. The precise mechanistic roles of SusE and SusF in starch metabolism are unclear. Both proteins contain an N-terminal domain which may belong to the immunoglobulin superfamily (IgSF), followed by two or three tandem starch-binding CBMs. SusF has three CBMs (CBM-Fa, -Fb, and -Fc; F denotes SusF, and they are labeled alphabetically from the N- to C-terminus). SusE has two CBMs (CBM-Eb and -Ec, corresponding to CBM-Fb and -Fc). Each starch-binding site contains an arc of aromatic amino acids for hydrophobic stacking with glucose, and hydrogen-bonding acceptors and donors for interacting with the O-2 and O-3 of glucose. These five CBMs differ in their affinity for various starch oligosaccharides, and they also contribute differently to binding insoluble starch. CBM-Fa (the CBM unique to SusF) does not bind insoluble starch; CBM-Fb and CBM-Fc both do, and deletion of either one decreases the overall affinity of SusF for starch. Both CBM-Eb and CBM-Ec are needed for SusE to bind tightly to starch. CBM-Ec has an additional starch-binding loop that may mediate interactions with partially unwound single helical forms of starch or small starch-breakdown products. Proteins in this group are present in species of the Gram-negative Bacteroidetes phylum.


Pssm-ID: 240562  Cd Length: 93  Bit Score: 39.65  E-value: 1.04e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1815986075 1149 GTANGWATTAMDLVA----DNTWQAVIDFDGQAEQRFKLDvsGDWTQNYGDTNSDG-VLEQTGGDIFTDVVGSYLLEVND 1223
Cdd:cd12956      8 ATPNGWDGPPDKPFTydatDGTFVSYATLAGDGEIKFRPN--NDWGENYGDDGDDGtFLSSGGDNIAVSAGGTYKITLNL 85

                   ....*...
gi 1815986075 1224 QTLAYSIT 1231
Cdd:cd12956     86 NNNTYTIE 93
PKD smart00089
Repeats in polycystic kidney disease 1 (PKD1) and other proteins; Polycystic kidney disease 1 ...
978-1052 1.16e-03

Repeats in polycystic kidney disease 1 (PKD1) and other proteins; Polycystic kidney disease 1 protein contains 14 repeats, which are also present elsewhere, such as in microbial collagenases.


Pssm-ID: 214510 [Multi-domain]  Cd Length: 79  Bit Score: 38.97  E-value: 1.16e-03
                            10        20        30        40        50        60        70        80
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1815986075   978 VAVISPASQSVAKGTVVTLSGagSSDSDGSIASYSWSTGESTESISVTVN------DTQTISLTVTDNQGkTATSSVTLT 1051
Cdd:smart00089    1 VADVSASPTVGVAGESVTFTA--TSSDDGSIVSYTWDFGDGTSSTGPTVThtytkpGTYTVTLTVTNAVG-SASATVTVV 77

                    .
gi 1815986075  1052 V 1052
Cdd:smart00089   78 V 78
CBM_SusE-F_like_u1 cd12967
Uncharacterized subgroup of the CBM-SusE-F_like superfamily; The CBM SusE-F_like superfamily ...
1330-1415 1.27e-03

Uncharacterized subgroup of the CBM-SusE-F_like superfamily; The CBM SusE-F_like superfamily includes starch-specific CBMs (carbohydrate-binding modules) of SusE and SusF, two cell surface lipoproteins within the starch-utilization system (Sus) of the human gut symbiont Bacteroides thetaiotaomicron. These CBMs have no enzymatic activity. The precise mechanistic roles of SusE and SusF in starch metabolism are unclear. Both proteins have an N-terminal domain which may belong to the immunoglobulin superfamily (IgSF), followed by two or three tandem starch-binding CBMs. SusF has three CBMs (CBM-Fa, -Fb, and -Fc; F denotes SusF, and they are labeled alphabetically from the N- to C-terminus). SusE has two CBMs (CBM-Eb and -Ec, corresponding to CBM-Fb and -Fc). Each starch-binding site contains an arc of aromatic amino acids for hydrophobic stacking with glucose, and hydrogen-bonding acceptors and donors for interacting with the O-2 and O-3 of glucose. These five CBMs differ in their affinity for various starch oligosaccharides, and they also contribute differently to binding insoluble starch. Proteins in this group are present in species of the Gram-negative Bacteroidetes phylum.


Pssm-ID: 240566  Cd Length: 91  Bit Score: 39.20  E-value: 1.27e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1815986075 1330 LYFRGTANAWQTsamslvANNTWQVSVDFDGQ--------NNQRFKFDLNGDWS-TNYGDNNNDGTLEQTGGDIFTGIVG 1400
Cdd:cd12967      2 LYVPGNYQGWNP------ATAPALYSPNGDGKyegyvylpGNFEFKFTTAPNWDgDYGGDGGGGGLLDGGGGNIKAPEAG 75
                           90
                   ....*....|....*
gi 1815986075 1401 SYVVEVNDATLQYRI 1415
Cdd:cd12967     76 YYKVTVDLNDLTYSL 90
CADG smart00736
Dystroglycan-type cadherin-like domains; Cadherin-homologous domains present in metazoan ...
1253-1320 2.15e-03

Dystroglycan-type cadherin-like domains; Cadherin-homologous domains present in metazoan dystroglycans and alpha/epsilon sarcoglycans, yeast Axl2p and in a very large protein from magnetotactic bacteria. Likely to bind calcium ions.


Pssm-ID: 214795 [Multi-domain]  Cd Length: 97  Bit Score: 38.86  E-value: 2.15e-03
                            10        20        30        40        50        60
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....
gi 1815986075  1253 GQSITYSAAGSSDSDgvIASYLWSNGDTSE-TTTVTYNTAGSNSIGLTVTDDGGKTAQASVVVEVIDPN 1320
Cdd:smart00736   29 GDTLTYSATLSDGSA--LPSWLSFDSDTGTlSGTPTNSDVGSLSLKVTATDSSGASASDTFTITVVNTN 95
FhaB COG3210
Large exoprotein involved in heme utilization or adhesion [Intracellular trafficking, ...
740-1304 2.83e-03

Large exoprotein involved in heme utilization or adhesion [Intracellular trafficking, secretion, and vesicular transport];


Pssm-ID: 442443 [Multi-domain]  Cd Length: 1698  Bit Score: 42.45  E-value: 2.83e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1815986075  740 ATDAKGNVSRSEIQQVYVGAGQFSSVGGKTVEDVNGAIAGEAMFFTDNVVVGDKRPTALIDASGGEVELGTVVTLSAAGS 819
Cdd:COG3210    225 STLTGGVVAAGTGAGVISTGGTDISSLSVAAGAGTGGAGGTGNAGNTTIGTTVTGTNATGSNTAGASSGDTTTNGTSSVT 304
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1815986075  820 SDEEGPIASYLWSTGETSPSITVTLNERTTISVVVTDSVGQQASTSVTYRIIGQSVTSTLYYQDTNGWGQVCLHYSVDGT 899
Cdd:COG3210    305 GAGGTGVLGGGTAAGITTTNTVGGNGDGNNTTANSGAGLVSGGTGGNNGTTGTGAGSGLTGTGNGGGLTTAGAGTVASTV 384
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1815986075  900 VTWTTAPGEPMQSLGDNWYSLTVDLEDGNQLEFVTNNCSGAWDNNGGQNYQIDEGDWNVAGGAIVAGIPDGLDGNNAPVA 979
Cdd:COG3210    385 GTATASTGNASSTTVLGSGSLATGNTGTTIAGNGGSANAGGFTTTGGVLGITGNGTVTGGTIGGLTGSGTTNGAGLSGNT 464
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1815986075  980 VISPASQSVAKGTVVTLSGAGSSDSDGSIAS--YSWSTGESTESISVTVNDTQTISLTVTDNQGKTATSSVTLTVIPNKV 1057
Cdd:COG3210    465 DVSGTGTVTNSAGNTTSATTLAGGGIGTVTTnaTISNNAGGDANGIATGLTGITAGGGGGGNATSGGTGGDGTTLSGSGL 544
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1815986075 1058 PVASISPANQTVAAGTTVTLDGAGSSDEDGSIASYLWSTGATTSSISVVVNASQTISLTVTDNEGAAATAEAVLSVESDE 1137
Cdd:COG3210    545 TTTVSGGASGTTAASGSNTANTLGVLAATGGTSNATTAGNSTSATGGTGTNSGGTVLSIGTGSAGATGTITLGAGTSGAG 624
                          410       420       430       440       450       460       470       480
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1815986075 1138 KAKNFNQLYFRGTANGWATTAMDLVADNTWQAVIDFDGQAEQRFKLDVSGDWTQNYGDTNSDGVLEQTGGDIFTDVVGSY 1217
Cdd:COG3210    625 ANATGGGAGLTGSAVGAALSGTGSGTTGTASANGSNTTGVNTAGGTGGGTTGTVTSGATGGTTGTTLNAATGGTLNNAGN 704
                          490       500       510       520       530       540       550       560
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1815986075 1218 LLEVNDQTLAYSITELNANQAPNAIIGASTTQVDIGQSITYSAAGSSDSDGVIASYLWSNGDTSETTTVTYNTAGSNSIG 1297
Cdd:COG3210    705 TLTISTGSITVTGQIGALANANGDTVTFGNLGTGATLTLNAGVTITSGNAGTLSIGLTANTTASGTTLTLANANGNTSAG 784

                   ....*..
gi 1815986075 1298 LTVTDDG 1304
Cdd:COG3210    785 ATLDNAG 791
PKD smart00089
Repeats in polycystic kidney disease 1 (PKD1) and other proteins; Polycystic kidney disease 1 ...
1059-1124 3.93e-03

Repeats in polycystic kidney disease 1 (PKD1) and other proteins; Polycystic kidney disease 1 protein contains 14 repeats, which are also present elsewhere, such as in microbial collagenases.


Pssm-ID: 214510 [Multi-domain]  Cd Length: 79  Bit Score: 37.82  E-value: 3.93e-03
                            10        20        30        40        50        60        70
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|..
gi 1815986075  1059 VASISPANQTVAAGTTVTLDGagSSDEDGSIASYLWSTG----ATTSSISVVVNASQ--TISLTVTDNEGAA 1124
Cdd:smart00089    1 VADVSASPTVGVAGESVTFTA--TSSDDGSIVSYTWDFGdgtsSTGPTVTHTYTKPGtyTVTLTVTNAVGSA 70
PKD pfam00801
PKD domain; This domain was first identified in the Polycystic kidney disease protein PKD1. ...
1243-1305 4.50e-03

PKD domain; This domain was first identified in the Polycystic kidney disease protein PKD1. This domain has been predicted to contain an Ig-like fold.


Pssm-ID: 395646 [Multi-domain]  Cd Length: 70  Bit Score: 37.37  E-value: 4.50e-03
                           10        20        30        40        50        60
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....
gi 1815986075 1243 IGASTTQVDIGQSITYSAAgssDSDGVIASYLWSNGDTSETT------TVTYNTAGSNSIGLTVTDDGG 1305
Cdd:pfam00801    1 VSASGTVVAAGQPVTFTAT---LADGSNVTYTWDFGDSPGTSgsgptvTHTYLSPGTYTVTLTASNAVG 66
ser_rich_anae_1 NF033849
serine-rich protein; This serine-rich protein belongs to a family with large size (over 1000 ...
985-1207 6.41e-03

serine-rich protein; This serine-rich protein belongs to a family of large proteins (over 1000 amino acids) with a highly serine-rich central region that averages over 300 aa in length. Species encoding members of this family of proteins tend to be anaerobic bacteria, including Gram-positive bacteria of the human gut microbiome and Chloroflexi from marine sediments.


Pssm-ID: 468206 [Multi-domain]  Cd Length: 1122  Bit Score: 41.14  E-value: 6.41e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1815986075  985 SQSVAKGTVVTLS---GAGSSDSDGSIASYSWSTGESTeSISVTVNDTQTISLTVTDNQGKTATSSVTLTV-IPNKVPVA 1060
Cdd:NF033849   322 SSSHSQSSSYNVSsgtGVSSSHSDGTSQSTSISHSESS-SESTGTSVGHSTSSSVSSSESSSRSSSSGVSGgFSGGIAGG 400
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1815986075 1061 SISPANQTVAAGTTVtldGAGSSDEDGSIASYLWSTGATTSSISVVVNASQTISLTVTDNEGAAATAEAVLSVESDEKAk 1140
Cdd:NF033849   401 GVTSEGLGASQGGSE---GWGSGDSVQSVSQSYGSSSSTGTSSGHSDSSSHSTSSGQADSVSQGTSWSEGTGTSQGQSV- 476
                          170       180       190       200       210       220       230
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1815986075 1141 nfnqlyfrGTANGWATTAM--DLVADNTwqavidfdGQAEQrfkldVSGDWTQNYGDTNSDGV-LEQTGG 1207
Cdd:NF033849   477 --------GTSESWSTSQSetDSVGDST--------GTSES-----VSQGDGRSTGRSESQGTsLGTSGG 525
PKD pfam00801
PKD domain; This domain was first identified in the Polycystic kidney disease protein PKD1. ...
1062-1126 8.78e-03

PKD domain; This domain was first identified in the Polycystic kidney disease protein PKD1. This domain has been predicted to contain an Ig-like fold.


Pssm-ID: 395646 [Multi-domain]  Cd Length: 70  Bit Score: 36.21  E-value: 8.78e-03
                           10        20        30        40        50        60        70
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|...
gi 1815986075 1062 ISPANQTVAAGTTVTLDGAgssDEDGSIASYLWSTGATTSSISVVVNASQT--------ISLTVTDNEGAAAT 1126
Cdd:pfam00801    1 VSASGTVVAAGQPVTFTAT---LADGSNVTYTWDFGDSPGTSGSGPTVTHTylspgtytVTLTASNAVGSANA 70
 
Blast search parameters
Data Source: Precalculated data, version = cdd.v.3.21
Preset Options: Database: CDSEARCH/cdd   Low complexity filter: no   Composition Based Adjustment: yes   E-value threshold: 0.01
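
Each hit in this report carries a summary line of the form "Pssm-ID: ...  Cd Length: ...  Bit Score: ...  E-value: ...", and the search used an E-value threshold of 0.01. The following is a minimal sketch, assuming that line layout holds for every hit, of how such lines could be parsed and compared against the threshold. The names `SUMMARY_RE`, `parse_summary`, and `passes_threshold` are hypothetical helpers for illustration, not part of any NCBI tool, and treating the threshold as inclusive is an assumption.

```python
import re

# Assumed layout of the per-hit summary lines in this report, e.g.
# "Pssm-ID: 367491  Cd Length: 101  Bit Score: 47.87  E-value: 1.61e-06"
# An optional bracketed flag such as "[Multi-domain]" may follow the Pssm-ID.
SUMMARY_RE = re.compile(
    r"Pssm-ID:\s*(?P<pssm_id>\d+)"
    r"(?:\s*\[(?P<flag>[^\]]+)\])?"
    r"\s+Cd Length:\s*(?P<cd_length>\d+)"
    r"\s+Bit Score:\s*(?P<bit_score>[\d.]+)"
    r"\s+E-value:\s*(?P<e_value>[\d.eE+-]+)"
)


def parse_summary(line):
    """Return the fields of one summary line as a dict, or None if no match."""
    m = SUMMARY_RE.search(line)
    if m is None:
        return None
    return {
        "pssm_id": int(m.group("pssm_id")),
        "multi_domain": m.group("flag") == "Multi-domain",
        "cd_length": int(m.group("cd_length")),
        "bit_score": float(m.group("bit_score")),
        "e_value": float(m.group("e_value")),
    }


def passes_threshold(hit, threshold=0.01):
    """Assumption: a hit is reported when its E-value is at or below the threshold."""
    return hit["e_value"] <= threshold


if __name__ == "__main__":
    # Two summary lines copied from the hits listed above.
    lines = [
        "Pssm-ID: 367491  Cd Length: 101  Bit Score: 47.87  E-value: 1.61e-06",
        "Pssm-ID: 442691 [Multi-domain]  Cd Length: 846  Bit Score: 52.26  E-value: 2.62e-06",
    ]
    for line in lines:
        hit = parse_summary(line)
        print(hit["pssm_id"], hit["multi_domain"], passes_threshold(hit))
```

Lines that do not match the assumed layout return `None`, so a caller can skip alignment rows and description text without special handling.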

References:

  • Wang J et al. (2023), "The conserved domain database in 2023", Nucleic Acids Res. 51(D):384-8.
  • Lu S et al. (2020), "The conserved domain database in 2020", Nucleic Acids Res. 48(D):265-8.
  • Marchler-Bauer A et al. (2017), "CDD/SPARCLE: functional classification of proteins via subfamily domain architectures", Nucleic Acids Res. 45(D):200-3.