Conserved domains on  [gi|1487552165|gb|AYG20375|]
RHS element protein [Escherichia coli str. K-12 substr. MG1655]

Protein Classification

List of domain hits

Name Accession Description Interval E-value
RHS_core NF041261
RHS element core protein;
1-1258 0e+00

RHS element core protein;



Pssm-ID: 469161 [Multi-domain]  Cd Length: 1261  Bit Score: 2631.99  E-value: 0e+00
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1487552165    1 MSGKPAARQGDMTQYGGPIVQGSAGVRIGAPTGVACSVCPGGMTSGNPVNPLLGAKVLPGETDLALPGPLPFILSRTYSS 80
Cdd:NF041261     1 MSGKPAARQGDMTQYGGPIVQGSAGVRIGAPTGVACSVCPGGMTSGNPVNPLLGAKVLPGETDIALPGPLPFILSRTYSS 80
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1487552165   81 YRTKTPAPVGVFGPGWKAPSDIRLQLRDDGLILNDNGGRSIHFEPLLPGEAVYSRSESMWLVRGGKAAQPDGHTLARLWG 160
Cdd:NF041261    81 YRTRTPAPVGVFGPGWKAPSDIRLQLRDDGLILNDNGGRSIHFEPLFPGEAVYSRSESLWLVRGGVAAQPDGHTLAALWQ 160
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1487552165  161 ALPPDIRLSPHLYLATNSAQGPWWILGWSERVPGAEDVLPAPLPPYRVLTGMADRFGRTLTYRREAAGDLAGEITGVTDG 240
Cdd:NF041261   161 ALPEDIRLSPHLYLATNSAQGPWWILGWSERVPGADEVLPAPLPPYRVLTGMVDRFGRTLTFHREAAGDLAGEITGVTDG 240
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1487552165  241 AGREFRLVLTTQAQRAEEAR---TSSLSSSDSSRPLSASAFPDTLPG-TEYGPDRGIRLSAVWLMHDPAYPESLPAAPLV 316
Cdd:NF041261   241 AGREFRLVLTTQAQRAEEARkqrTSSLSSPDGPRPLSSSAFPDTLPGgTEYGPDNGIRLSAVWLTHDPAYPESLPAAPLV 320
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1487552165  317 RYTYTEAGELLAVYDRSNTQVRAFTYDAQHPGRMVAHRYAGRPEMRYRYDDTGRVVEQLNPAGLSYRYLYEQDRITVTDS 396
Cdd:NF041261   321 RYTYTEAGELLAVYDRSNTQVRAFTYDAQHPGRMVAHRYAGRPEMCYRYDDTGRVTEQLNPAGLSYRYQYEQDRITITDS 400
                          410       420       430       440       450       460       470       480
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1487552165  397 LNRREVLHTEGGAGLKRVVKKELADGSVTRSGYDAAGRLTAQTDAAGRRTEYGLNVVSGDITDITTPDGRETKFYYNDGN 476
Cdd:NF041261   401 LNRREVLHTEGEGGLKRVVKKEHADGSVTRSGYDAAGRLTAQTDAAGRRTEYSLNVVSGDITDITTPDGRETKFYYNDGN 480
                          490       500       510       520       530       540       550       560
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1487552165  477 QLTAVVSPDGLESRREYDEPGRLVSETSRSGETVRYRYDDAHSELPATTTDATGSTRQMTWSRYGQLLAFTDCSGYQTRY 556
Cdd:NF041261   481 QLTSVTSPDGLESRREYDEPGRLVSETSRSGETTRYRYDDPHSELPATTTDATGSTKQMTWSRYGQLLAFTDCSGYQTRY 560
                          570       580       590       600       610       620       630       640
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1487552165  557 EYDRFGQMTAVHREEGISLYRRYDNRGRLTSVKDAQGRETRYEYNAAGDLTAVITPDGNRSETQYDAWGKAVSTTQGGLT 636
Cdd:NF041261   561 EYDRFGQMTAVHREEGISTYRRYDNRGQLTSVKDAQGRETRYEYNAAGDLTAVITPDGNRSETQYDAWGKAVSTTQGGLT 640
                          650       660       670       680       690       700       710       720
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1487552165  637 RSMEYDAAGRVISLTNENGSHSVFSYDALDRLVQQGGFDGRTQRYHYDLTGKLTQSEDEGLVILWYYDESDRITHRTVNG 716
Cdd:NF041261   641 RSMEYDAAGRITTLTNENGSHSTFLYDALDRLVQQRGFDGRTQRYHYDLTGKLTQSEDEGLVTLWHYDESDRITHRTVNG 720
                          730       740       750       760       770       780       790       800
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1487552165  717 EPAEQWQYDGHGWLTDISHLSEGHRVAVHYGYDDKGRLTGECQTVENPETGELLWQHETKHAYNEQGLANRVTPDSLPPV 796
Cdd:NF041261   721 EPAEQWQYDEHGWLTDISHLSEGHRVAVHYGYDDKGRLTGERQTVENPETGELLWQHETGHAYNEQGLANRVTPDSLPPV 800
                          810       820       830       840       850       860       870       880
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1487552165  797 EWLTYGSGYLAGMKLGGTPLVEYTRDRLHRETVRSFGsMAGSNAAYELTSTYTPAGQLQSQHLNSLVYDRDYGWSDNGDL 876
Cdd:NF041261   801 EWLTYGSGYLAGMKLGGTPLVEYTRDRLHRETVRSFG-GAGSNAAYELTTAYTPAGQLQSQHLNSLVYDRDYTWNDNGDL 879
                          890       900       910       920       930       940       950       960
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1487552165  877 VRISGPRQTREYGYSATGRLESVRTLAPDLDIRIPYATDPAGNRLPDPELHPDSTLTVWPDNRIAEDAHYVYRHDEYGRL 956
Cdd:NF041261   880 VRISGPRQTREYGYSATGRLTGVHTTAANLDIRIPYATDPAGNRLPDPELHPDSTLTAWPDNRIAEDAHYVYRYDEYGRL 959
                          970       980       990      1000      1010      1020      1030      1040
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1487552165  957 TEKTDRIPAGVIRTDDERTHHYHYDSQHRLVFYTRIQHGEPLVESRYLYDPLGRRMAKRVWRRERDLTGWMSLSRKPEVT 1036
Cdd:NF041261   960 TEKTDRIPEGVIRTDDERTHHYHYDSQHRLVFYTRIQHGEPLVESRYLYDPLGRRMAKRVWRRERDLTGWMSLSRKPEVT 1039
                         1050      1060      1070      1080      1090      1100      1110      1120
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1487552165 1037 WYGWDGDRLTTVQTDTTRIQTVYEPGSFTPLIRVETENGEREKAQRRSLAETLQQEGSENGHGVVFPAELVRLLDRLEEE 1116
Cdd:NF041261  1040 WYGWDGDRLTTVQTDTTRIQTVYQPGSFTPLIRVETENGERAKAQRRSLAETLQQEGSENGHGVVFPAELVRMLDRLEEE 1119
                         1130      1140      1150      1160      1170      1180      1190      1200
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1487552165 1117 IRADRVSSESRAWLAQCGLTVEQLARQVEPEYTPARKAHLYHCDHRGLPLALISEDGNTAWSAEYDEWGNQLNEENPHHV 1196
Cdd:NF041261  1120 IRADRVSEESRAWLAQCGLTVEQMARQVEPEYTPARKLHLYHCDHRGLPLALISEEGNTAWQGEYDEWGNLLNEENPHHL 1199
                         1210      1220      1230      1240      1250      1260
                   ....*....|....*....|....*....|....*....|....*....|....*....|..
gi 1487552165 1197 YQPYRLPGQQHDEESGLYYNRHRYYDPLQGRYITQDPMGLKGGWNLYQYPLNPLQQIDPMGL 1258
Cdd:NF041261  1200 QQPYRLPGQQYDEESGLYYNRNRYYDPLQGRYITQDPIGLKGGWNLYQYPLNPIRFIDPLGL 1261
DUF4329 pfam14220
Domain of unknown function (DUF4329); This domain is functionally uncharacterized. It is found ...
1292-1409 1.26e-38

Domain of unknown function (DUF4329); This domain is functionally uncharacterized. It is found in bacteria and eukaryotes, and is approximately 130 amino acids in length. It is often found in association with pfam05593 and pfam03527. There is a single completely conserved residue D and a highly conserved HTH motif which may be functionally important.



Pssm-ID: 433783  Cd Length: 114  Bit Score: 139.82  E-value: 1.26e-38
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1487552165 1292 DAALDALKETQNRSLCNDMEYSGIVCKDTNGKYFASkAETDNLRKESypLKRKCPTGTDRVAAYHTHGADSHgDYVDEFF 1371
Cdd:pfam14220    1 DAAKDALEEYNGRSIRENREYCGFILTDDEGKYVYT-APTRGGEASS--GNPPVPNGQTVVASYHTHGAYDS-NYDSEVF 76
                           90       100       110
                   ....*....|....*....|....*....|....*...
gi 1487552165 1372 SSSDKNLVRSKDNNLEAFYLATPDGRFEALNNKGEYIF 1409
Cdd:pfam14220   77 SVQDKKIVLSDMQNGVNGYVATPGGRLWYIDPSRSYAR 114
 
RhsA COG3209
Uncharacterized conserved protein RhsA, contains 28 RHS repeats [General function prediction ...
151-1297 5.47e-30

Uncharacterized conserved protein RhsA, contains 28 RHS repeats [General function prediction only];


Pssm-ID: 442442 [Multi-domain]  Cd Length: 1103  Bit Score: 129.49  E-value: 5.47e-30
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1487552165  151 DGHTLARLWGALPPDIRLSPHLYLATNSAQGPWWILGWSERVPGAEDVLPAPLPPYRVLTGMADRFGRTLTYRREAAGDL 230
Cdd:COG3209     33 GSTVLLAKGGLSTAAAAGGAATLTARSASTTDVVGTLTGAGGTSAGGVTALGDASAAGGGYVGGAAAGGGATLTGLAAAT 112
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1487552165  231 AGEITGVTDGAGREFRLVLTTQAQRAEEARTSSLSSSDSSRPLSASAFPDTLPGTEYGPDRGIRLSAVWLMHDPAYPESL 310
Cdd:COG3209    113 ASAGRLVSTGAGAGGTVTAATGGTLGATAGSATTGSTDGGRGGVAVTGLAGGGASAYGLTLGGAAAGPATGVGTGAVTLA 192
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1487552165  311 PAAPLVRYTYTEAGELLAVYDRSNTQVRAFTYDAQHPGRMVAHRyagrpemRYRYDDTGRVVEQLNPAGLSYRYLYEQDR 390
Cdd:COG3209    193 TGLAGSALLALGSGAILGGLAGAYSGSATTATGTALGTPASVAA-------TVTGSATGAAGAGAAVATAATTLGGTTGA 265
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1487552165  391 ITVTDSLNRREVLHTEGGAGLKRVVKKELADGSVTRSGYDAAGRLTAQTDAAGRRTEYGLNVVSGDITDITTPDGRETKF 470
Cdd:COG3209    266 GTGASGAGLDASTGTGGAGGSNAAATAGGLGGAGLGSGGAGGGGTAGGTTTAAGTTGTAAVSGAADAGTTTTTGTGTGGT 345
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1487552165  471 YYNDGNQLTAVVSPDGLESRREYDEPGRLVSETSRSGETVRYRYDDAHSELPATTTDATGSTRQMTWSRYGQLLAFTDCS 550
Cdd:COG3209    346 TTTVGGGGSLTLGGYGAAGGLTTSVGAGGGGSTSGSTTTVGGGGTATGSGGGSSTTGVGAGTTTTSTTGGDGGPATAAGA 425
                          410       420       430       440       450       460       470       480
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1487552165  551 GYQTRYEYDRFGQMTAVHREEGISLYRRYDNRGRLTSVKDAQGRETRYEYNAAGDLTAVITPDGNRSETQYDAWGKAVST 630
Cdd:COG3209    426 LTAGGTATGTGTGGGGTTAGTDATTTTGGAGASGTLTTTGGAATGATTGGGTEAGTGGGTLTSGSAGATTLGTDTTLDDT 505
                          490       500       510       520       530       540       550       560
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1487552165  631 TQGGLTRSMEYDAAGRVISLTNENGSHSVFSYDALDRLVQQGGFDGRTQRYHYDLTGKLTQSEDEGLVILWYYDESDRIT 710
Cdd:COG3209    506 LGGTTTTTAGARGLVVTTGTTLTLGTTTTATLSATDATGTGDTTTTGTVGTGTSTGTGGTGTVTTTGDGTGGASTTTGTT 585
                          570       580       590       600       610       620       630       640
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1487552165  711 HRTVNGEPAEQWQYDGHGWLTDISHlSEGHRVAVHYGYDDKGRLTGECQTVENPETGELLWQHETKHAYNEQGLANRVTP 790
Cdd:COG3209    586 GGTATTTTVTTTTTTSTAGTTTTTT-SGYTRAGLTLTLGTGTASGLERATASTGSTTGGTTGTGVTTTGTTTTRATGTTG 664
                          650       660       670       680       690       700       710       720
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1487552165  791 DSLPPVEWLTYGSGYLAGmkLGGTPLVEYTRDRLHRETVRSFGSMAGSNAAYELTSTYTPAGQLQSQHLNSLVYDRDYGW 870
Cdd:COG3209    665 TGTGVTAGLTTLATGGTT--VGGGTGTTSTATTGATTGGTETGTTVTTLAGGTTTRLGTTTTGGGGGTTTDGTGTGGTTG 742
                          730       740       750       760       770       780       790       800
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1487552165  871 SDNGDLVRISGPRQTREYGYSATGRLESVRT--LAPDLDIRIPYATDPAGNRlpdpelhpdsTLTVWPDNRIAEdahyvY 948
Cdd:COG3209    743 TLTTTSTTTTTTAGALTYTYDALGRLTSETTpgGVTQGTYTTRYTYDALGRL----------TSVTYPDGETVT-----Y 807
                          810       820       830       840       850       860       870       880
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1487552165  949 RHDEYGRLTEKtdrIPAGVIRTDDERTHHYHYDSQHRLVFYTRIQHGEPLVEsRYLYDPLGR-RMAKRVWRRER---DLT 1024
Cdd:COG3209    808 TYDALGRLTSV---ITVGSGGGTDLQDRTYTYDAAGNITSITDALRAGTLTQ-TYTYDALGRlTSATDPGTTESytyDAN 883
                          890       900       910       920       930       940       950       960
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1487552165 1025 GWMSLSRKPEVTWYGWDG-DRLTTVQT-DTTRIQTVYEPgsftplirvetengerekaqrrslaetlqqegsenghgvvf 1102
Cdd:COG3209    884 GNLTSRTDGGTTTYTYDAlGRLVSVTKpDGTTTTYTYDA----------------------------------------- 922
                          970       980       990      1000      1010      1020      1030      1040
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1487552165 1103 paelvrlldrleeeiradrvssesrawlaqcgltveqlarqvepeytparkahLYHCDHRGLPLALISEDGNTAWSAEYD 1182
Cdd:COG3209    923 -----------------------------------------------------LGHTDHLGSVRALTDASGQVVWRYDYD 949
                         1050      1060      1070      1080      1090      1100      1110      1120
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1487552165 1183 EWGNQLNEENPHhVYQPYRLPGQQHDEESGLYYNRHRYYDPLQGRYITQDPMGLKGGWNLYQYPL-NPLQQIDPMGLLQT 1261
Cdd:COG3209    950 PFGNLLAETSGA-AANPLRFTGQEYDAETGLYYNGARYYDPALGRFLSPDPIGLAGGLNLYAYVGnNPVNYVDPLGLAAL 1028
                         1130      1140      1150
                   ....*....|....*....|....*....|....*.
gi 1487552165 1262 WDDARSGACTGGVCGVLSRIIGPSKFDSTADAALDA 1297
Cdd:COG3209   1029 LGTTGLGGGAGVGAGAAGGGAAAAGGSAGAGAAGGG 1064
Rhs_assc_core TIGR03696
RHS repeat-associated core domain; This model represents a conserved unique core sequence ...
1181-1258 2.19e-29

RHS repeat-associated core domain; This model represents a conserved unique core sequence shared by large numbers of proteins. It is occasional in the Archaea (Methanosarcina barkeri) but common in bacteria and eukaryotes. Most fall into two large classes. One class consists of long proteins in which two classes of repeats are abundant: an FG-GAP repeat (pfam01839) class, and an RHS repeat (pfam05593) or YD repeat (TIGR01643). This class includes secreted bacterial insecticidal toxins and intercellular signalling proteins such as the teneurins in animals. The other class consists of uncharacterized proteins shorter than 400 amino acids, where this core domain of about 75 amino acids tends to occur in the N-terminal half. Over twenty such proteins are found in Pseudomonas putida alone; little sequence similarity or repeat structure is found among these proteins outside the region modeled by this domain.


Pssm-ID: 274730 [Multi-domain]  Cd Length: 77  Bit Score: 112.21  E-value: 2.19e-29
                           10        20        30        40        50        60        70
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....
gi 1487552165 1181 YDEWGNQLNEENphHVYQPYRLPGQQHDEESGLYYNRHRYYDPLQGRYITQDPMGLKGGWNLYQY-PLNPLQQIDPMGL 1258
Cdd:TIGR03696    1 YDPYGEVLSESG--AAPNPLRFTGQYYDAETGLYYNGARYYDPELGRFLSPDPIGLGGGLNLYAYvGNNPVNWVDPLGL 77
DUF6531 pfam20148
Domain of unknown function (DUF6531); This putative domain is found in a range of RHS proteins.
46-123 1.47e-18

Domain of unknown function (DUF6531); This putative domain is found in a range of RHS proteins.


Pssm-ID: 466309 [Multi-domain]  Cd Length: 74  Bit Score: 81.04  E-value: 1.47e-18
                           10        20        30        40        50        60        70
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....
gi 1487552165   46 GNPVNPLLGAKVLPgETDLALPGPLPFILSRTYSSYRTKTpapvGVFGPGWKAPSDIRLQLRDDG-LILNDNGGRSIHF 123
Cdd:pfam20148    1 GDPVNVATGNKVLE-ETDFSLPGPLPLVWTRTYNSSSERD----GPLGPGWSHPYDQRLELEGDGgVVYIDADGREVTF 74
PAAR_2 cd14738
proline-alanine-alanine-arginine (PAAR) domain; This domain is found in the PAAR ...
1-29 7.59e-08

proline-alanine-alanine-arginine (PAAR) domain; This domain is found in the PAAR (proline-alanine-alanine-arginine) repeat family, where it forms a sharp conical extension on the VgrG spike, a trimeric protein complex of the bacterial type VI secretion system (T6SS). The T6SS is responsible for translocation of a wide variety of toxic effector molecules, allowing predatory cells to kill prokaryotic as well as eukaryotic prey cells. The pointed tip of the PAAR domain is stabilized by a zinc atom positioned close to the cone's vertex and is likely to be important for its integrity during penetration of the target cell envelope. VgrG proteins are orthologous to the central baseplate spikes of bacteriophages with contractile tails, and genes encoding proteins with PAAR motifs have been frequently found immediately downstream from vgrG-like genes. It has been shown that PAAR proteins are essential for T6SS-mediated secretion and target cell killing by Vibrio cholerae (encodes two PAAR proteins) and Acinetobacter baylyi (encodes three PAAR proteins); inactivation of all these PAAR genes results in inactivation of Hcp secretion as well as T6SS-dependent killing of E. coli.


Pssm-ID: 269823  Cd Length: 94  Bit Score: 51.48  E-value: 7.59e-08
                           10        20
                   ....*....|....*....|....*....
gi 1487552165    1 MSGKPAARQGDMTQYGGPIVQGSAGVRIG 29
Cdd:cd14738     66 IGGKPAARMGDSTAHGGVIVSGVPTVLIG 94
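
The pairwise alignments above can be summarized as a simple identity count. The following is a minimal Python sketch, not part of the CD-Search output: the function name alignment_identity is hypothetical, and it assumes that "-" marks a gap and that residues may be compared case-insensitively, which is a simplification. It is applied here to the PAAR_2 (cd14738) alignment rows copied from the listing.

```python
def alignment_identity(query: str, subject: str) -> tuple[int, int]:
    """Return (identical, aligned) counts for two equal-length alignment rows.

    Columns where either row holds a gap character "-" are skipped.
    Case is ignored when comparing residues (a simplifying assumption).
    """
    if len(query) != len(subject):
        raise ValueError("aligned rows must have equal length")
    identical = 0
    aligned = 0
    for q, s in zip(query, subject):
        if q == "-" or s == "-":
            continue  # gap column: not counted as aligned
        aligned += 1
        if q.upper() == s.upper():
            identical += 1
    return identical, aligned


# Rows copied from the PAAR_2 (cd14738) alignment: query 1-29, model 66-94.
query = "MSGKPAARQGDMTQYGGPIVQGSAGVRIG"
subject = "IGGKPAARMGDSTAHGGVIVSGVPTVLIG"

identical, aligned = alignment_identity(query, subject)
print(f"{identical}/{aligned} identical ({100 * identical / aligned:.1f}%)")
```

Percent identity computed this way is only a rough descriptor. The E-values and bit scores reported in the listing come from position-specific scoring against each domain model, so they need not track identity closely.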
 
Name Accession Description Interval E-value
RHS_core NF041261
RHS element core protein;
1-1258 0e+00

RHS element core protein;


Pssm-ID: 469161 [Multi-domain]  Cd Length: 1261  Bit Score: 2631.99  E-value: 0e+00
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1487552165    1 MSGKPAARQGDMTQYGGPIVQGSAGVRIGAPTGVACSVCPGGMTSGNPVNPLLGAKVLPGETDLALPGPLPFILSRTYSS 80
Cdd:NF041261     1 MSGKPAARQGDMTQYGGPIVQGSAGVRIGAPTGVACSVCPGGMTSGNPVNPLLGAKVLPGETDIALPGPLPFILSRTYSS 80
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1487552165   81 YRTKTPAPVGVFGPGWKAPSDIRLQLRDDGLILNDNGGRSIHFEPLLPGEAVYSRSESMWLVRGGKAAQPDGHTLARLWG 160
Cdd:NF041261    81 YRTRTPAPVGVFGPGWKAPSDIRLQLRDDGLILNDNGGRSIHFEPLFPGEAVYSRSESLWLVRGGVAAQPDGHTLAALWQ 160
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1487552165  161 ALPPDIRLSPHLYLATNSAQGPWWILGWSERVPGAEDVLPAPLPPYRVLTGMADRFGRTLTYRREAAGDLAGEITGVTDG 240
Cdd:NF041261   161 ALPEDIRLSPHLYLATNSAQGPWWILGWSERVPGADEVLPAPLPPYRVLTGMVDRFGRTLTFHREAAGDLAGEITGVTDG 240
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1487552165  241 AGREFRLVLTTQAQRAEEAR---TSSLSSSDSSRPLSASAFPDTLPG-TEYGPDRGIRLSAVWLMHDPAYPESLPAAPLV 316
Cdd:NF041261   241 AGREFRLVLTTQAQRAEEARkqrTSSLSSPDGPRPLSSSAFPDTLPGgTEYGPDNGIRLSAVWLTHDPAYPESLPAAPLV 320
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1487552165  317 RYTYTEAGELLAVYDRSNTQVRAFTYDAQHPGRMVAHRYAGRPEMRYRYDDTGRVVEQLNPAGLSYRYLYEQDRITVTDS 396
Cdd:NF041261   321 RYTYTEAGELLAVYDRSNTQVRAFTYDAQHPGRMVAHRYAGRPEMCYRYDDTGRVTEQLNPAGLSYRYQYEQDRITITDS 400
                          410       420       430       440       450       460       470       480
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1487552165  397 LNRREVLHTEGGAGLKRVVKKELADGSVTRSGYDAAGRLTAQTDAAGRRTEYGLNVVSGDITDITTPDGRETKFYYNDGN 476
Cdd:NF041261   401 LNRREVLHTEGEGGLKRVVKKEHADGSVTRSGYDAAGRLTAQTDAAGRRTEYSLNVVSGDITDITTPDGRETKFYYNDGN 480
                          490       500       510       520       530       540       550       560
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1487552165  477 QLTAVVSPDGLESRREYDEPGRLVSETSRSGETVRYRYDDAHSELPATTTDATGSTRQMTWSRYGQLLAFTDCSGYQTRY 556
Cdd:NF041261   481 QLTSVTSPDGLESRREYDEPGRLVSETSRSGETTRYRYDDPHSELPATTTDATGSTKQMTWSRYGQLLAFTDCSGYQTRY 560
                          570       580       590       600       610       620       630       640
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1487552165  557 EYDRFGQMTAVHREEGISLYRRYDNRGRLTSVKDAQGRETRYEYNAAGDLTAVITPDGNRSETQYDAWGKAVSTTQGGLT 636
Cdd:NF041261   561 EYDRFGQMTAVHREEGISTYRRYDNRGQLTSVKDAQGRETRYEYNAAGDLTAVITPDGNRSETQYDAWGKAVSTTQGGLT 640
                          650       660       670       680       690       700       710       720
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1487552165  637 RSMEYDAAGRVISLTNENGSHSVFSYDALDRLVQQGGFDGRTQRYHYDLTGKLTQSEDEGLVILWYYDESDRITHRTVNG 716
Cdd:NF041261   641 RSMEYDAAGRITTLTNENGSHSTFLYDALDRLVQQRGFDGRTQRYHYDLTGKLTQSEDEGLVTLWHYDESDRITHRTVNG 720
                          730       740       750       760       770       780       790       800
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1487552165  717 EPAEQWQYDGHGWLTDISHLSEGHRVAVHYGYDDKGRLTGECQTVENPETGELLWQHETKHAYNEQGLANRVTPDSLPPV 796
Cdd:NF041261   721 EPAEQWQYDEHGWLTDISHLSEGHRVAVHYGYDDKGRLTGERQTVENPETGELLWQHETGHAYNEQGLANRVTPDSLPPV 800
                          810       820       830       840       850       860       870       880
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1487552165  797 EWLTYGSGYLAGMKLGGTPLVEYTRDRLHRETVRSFGsMAGSNAAYELTSTYTPAGQLQSQHLNSLVYDRDYGWSDNGDL 876
Cdd:NF041261   801 EWLTYGSGYLAGMKLGGTPLVEYTRDRLHRETVRSFG-GAGSNAAYELTTAYTPAGQLQSQHLNSLVYDRDYTWNDNGDL 879
                          890       900       910       920       930       940       950       960
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1487552165  877 VRISGPRQTREYGYSATGRLESVRTLAPDLDIRIPYATDPAGNRLPDPELHPDSTLTVWPDNRIAEDAHYVYRHDEYGRL 956
Cdd:NF041261   880 VRISGPRQTREYGYSATGRLTGVHTTAANLDIRIPYATDPAGNRLPDPELHPDSTLTAWPDNRIAEDAHYVYRYDEYGRL 959
                          970       980       990      1000      1010      1020      1030      1040
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1487552165  957 TEKTDRIPAGVIRTDDERTHHYHYDSQHRLVFYTRIQHGEPLVESRYLYDPLGRRMAKRVWRRERDLTGWMSLSRKPEVT 1036
Cdd:NF041261   960 TEKTDRIPEGVIRTDDERTHHYHYDSQHRLVFYTRIQHGEPLVESRYLYDPLGRRMAKRVWRRERDLTGWMSLSRKPEVT 1039
                         1050      1060      1070      1080      1090      1100      1110      1120
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1487552165 1037 WYGWDGDRLTTVQTDTTRIQTVYEPGSFTPLIRVETENGEREKAQRRSLAETLQQEGSENGHGVVFPAELVRLLDRLEEE 1116
Cdd:NF041261  1040 WYGWDGDRLTTVQTDTTRIQTVYQPGSFTPLIRVETENGERAKAQRRSLAETLQQEGSENGHGVVFPAELVRMLDRLEEE 1119
                         1130      1140      1150      1160      1170      1180      1190      1200
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1487552165 1117 IRADRVSSESRAWLAQCGLTVEQLARQVEPEYTPARKAHLYHCDHRGLPLALISEDGNTAWSAEYDEWGNQLNEENPHHV 1196
Cdd:NF041261  1120 IRADRVSEESRAWLAQCGLTVEQMARQVEPEYTPARKLHLYHCDHRGLPLALISEEGNTAWQGEYDEWGNLLNEENPHHL 1199
                         1210      1220      1230      1240      1250      1260
                   ....*....|....*....|....*....|....*....|....*....|....*....|..
gi 1487552165 1197 YQPYRLPGQQHDEESGLYYNRHRYYDPLQGRYITQDPMGLKGGWNLYQYPLNPLQQIDPMGL 1258
Cdd:NF041261  1200 QQPYRLPGQQYDEESGLYYNRNRYYDPLQGRYITQDPIGLKGGWNLYQYPLNPIRFIDPLGL 1261
DUF4329 pfam14220
1292-1409 1.26e-38

Domain of unknown function (DUF4329); This domain is functionally uncharacterized. It is found in bacteria and eukaryotes, and is approximately 130 amino acids in length. It is often found in association with pfam05593 and pfam03527. There is a single completely conserved aspartate (D) residue and a highly conserved HTH motif which may be functionally important.


Pssm-ID: 433783  Cd Length: 114  Bit Score: 139.82  E-value: 1.26e-38
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1487552165 1292 DAALDALKETQNRSLCNDMEYSGIVCKDTNGKYFASkAETDNLRKESypLKRKCPTGTDRVAAYHTHGADSHgDYVDEFF 1371
Cdd:pfam14220    1 DAAKDALEEYNGRSIRENREYCGFILTDDEGKYVYT-APTRGGEASS--GNPPVPNGQTVVASYHTHGAYDS-NYDSEVF 76
                           90       100       110
                   ....*....|....*....|....*....|....*...
gi 1487552165 1372 SSSDKNLVRSKDNNLEAFYLATPDGRFEALNNKGEYIF 1409
Cdd:pfam14220   77 SVQDKKIVLSDMQNGVNGYVATPGGRLWYIDPSRSYAR 114
RhsA COG3209
151-1297 5.47e-30

Uncharacterized conserved protein RhsA, contains 28 RHS repeats [General function prediction only];


Pssm-ID: 442442 [Multi-domain]  Cd Length: 1103  Bit Score: 129.49  E-value: 5.47e-30
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1487552165  151 DGHTLARLWGALPPDIRLSPHLYLATNSAQGPWWILGWSERVPGAEDVLPAPLPPYRVLTGMADRFGRTLTYRREAAGDL 230
Cdd:COG3209     33 GSTVLLAKGGLSTAAAAGGAATLTARSASTTDVVGTLTGAGGTSAGGVTALGDASAAGGGYVGGAAAGGGATLTGLAAAT 112
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1487552165  231 AGEITGVTDGAGREFRLVLTTQAQRAEEARTSSLSSSDSSRPLSASAFPDTLPGTEYGPDRGIRLSAVWLMHDPAYPESL 310
Cdd:COG3209    113 ASAGRLVSTGAGAGGTVTAATGGTLGATAGSATTGSTDGGRGGVAVTGLAGGGASAYGLTLGGAAAGPATGVGTGAVTLA 192
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1487552165  311 PAAPLVRYTYTEAGELLAVYDRSNTQVRAFTYDAQHPGRMVAHRyagrpemRYRYDDTGRVVEQLNPAGLSYRYLYEQDR 390
Cdd:COG3209    193 TGLAGSALLALGSGAILGGLAGAYSGSATTATGTALGTPASVAA-------TVTGSATGAAGAGAAVATAATTLGGTTGA 265
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1487552165  391 ITVTDSLNRREVLHTEGGAGLKRVVKKELADGSVTRSGYDAAGRLTAQTDAAGRRTEYGLNVVSGDITDITTPDGRETKF 470
Cdd:COG3209    266 GTGASGAGLDASTGTGGAGGSNAAATAGGLGGAGLGSGGAGGGGTAGGTTTAAGTTGTAAVSGAADAGTTTTTGTGTGGT 345
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1487552165  471 YYNDGNQLTAVVSPDGLESRREYDEPGRLVSETSRSGETVRYRYDDAHSELPATTTDATGSTRQMTWSRYGQLLAFTDCS 550
Cdd:COG3209    346 TTTVGGGGSLTLGGYGAAGGLTTSVGAGGGGSTSGSTTTVGGGGTATGSGGGSSTTGVGAGTTTTSTTGGDGGPATAAGA 425
                          410       420       430       440       450       460       470       480
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1487552165  551 GYQTRYEYDRFGQMTAVHREEGISLYRRYDNRGRLTSVKDAQGRETRYEYNAAGDLTAVITPDGNRSETQYDAWGKAVST 630
Cdd:COG3209    426 LTAGGTATGTGTGGGGTTAGTDATTTTGGAGASGTLTTTGGAATGATTGGGTEAGTGGGTLTSGSAGATTLGTDTTLDDT 505
                          490       500       510       520       530       540       550       560
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1487552165  631 TQGGLTRSMEYDAAGRVISLTNENGSHSVFSYDALDRLVQQGGFDGRTQRYHYDLTGKLTQSEDEGLVILWYYDESDRIT 710
Cdd:COG3209    506 LGGTTTTTAGARGLVVTTGTTLTLGTTTTATLSATDATGTGDTTTTGTVGTGTSTGTGGTGTVTTTGDGTGGASTTTGTT 585
                          570       580       590       600       610       620       630       640
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1487552165  711 HRTVNGEPAEQWQYDGHGWLTDISHlSEGHRVAVHYGYDDKGRLTGECQTVENPETGELLWQHETKHAYNEQGLANRVTP 790
Cdd:COG3209    586 GGTATTTTVTTTTTTSTAGTTTTTT-SGYTRAGLTLTLGTGTASGLERATASTGSTTGGTTGTGVTTTGTTTTRATGTTG 664
                          650       660       670       680       690       700       710       720
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1487552165  791 DSLPPVEWLTYGSGYLAGmkLGGTPLVEYTRDRLHRETVRSFGSMAGSNAAYELTSTYTPAGQLQSQHLNSLVYDRDYGW 870
Cdd:COG3209    665 TGTGVTAGLTTLATGGTT--VGGGTGTTSTATTGATTGGTETGTTVTTLAGGTTTRLGTTTTGGGGGTTTDGTGTGGTTG 742
                          730       740       750       760       770       780       790       800
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1487552165  871 SDNGDLVRISGPRQTREYGYSATGRLESVRT--LAPDLDIRIPYATDPAGNRlpdpelhpdsTLTVWPDNRIAEdahyvY 948
Cdd:COG3209    743 TLTTTSTTTTTTAGALTYTYDALGRLTSETTpgGVTQGTYTTRYTYDALGRL----------TSVTYPDGETVT-----Y 807
                          810       820       830       840       850       860       870       880
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1487552165  949 RHDEYGRLTEKtdrIPAGVIRTDDERTHHYHYDSQHRLVFYTRIQHGEPLVEsRYLYDPLGR-RMAKRVWRRER---DLT 1024
Cdd:COG3209    808 TYDALGRLTSV---ITVGSGGGTDLQDRTYTYDAAGNITSITDALRAGTLTQ-TYTYDALGRlTSATDPGTTESytyDAN 883
                          890       900       910       920       930       940       950       960
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1487552165 1025 GWMSLSRKPEVTWYGWDG-DRLTTVQT-DTTRIQTVYEPgsftplirvetengerekaqrrslaetlqqegsenghgvvf 1102
Cdd:COG3209    884 GNLTSRTDGGTTTYTYDAlGRLVSVTKpDGTTTTYTYDA----------------------------------------- 922
                          970       980       990      1000      1010      1020      1030      1040
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1487552165 1103 paelvrlldrleeeiradrvssesrawlaqcgltveqlarqvepeytparkahLYHCDHRGLPLALISEDGNTAWSAEYD 1182
Cdd:COG3209    923 -----------------------------------------------------LGHTDHLGSVRALTDASGQVVWRYDYD 949
                         1050      1060      1070      1080      1090      1100      1110      1120
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1487552165 1183 EWGNQLNEENPHhVYQPYRLPGQQHDEESGLYYNRHRYYDPLQGRYITQDPMGLKGGWNLYQYPL-NPLQQIDPMGLLQT 1261
Cdd:COG3209    950 PFGNLLAETSGA-AANPLRFTGQEYDAETGLYYNGARYYDPALGRFLSPDPIGLAGGLNLYAYVGnNPVNYVDPLGLAAL 1028
                         1130      1140      1150
                   ....*....|....*....|....*....|....*.
gi 1487552165 1262 WDDARSGACTGGVCGVLSRIIGPSKFDSTADAALDA 1297
Cdd:COG3209   1029 LGTTGLGGGAGVGAGAAGGGAAAAGGSAGAGAAGGG 1064
Rhs_assc_core TIGR03696
1181-1258 2.19e-29

RHS repeat-associated core domain; This model represents a conserved unique core sequence shared by large numbers of proteins. It is occasional in the Archaea (Methanosarcina barkeri) but common in bacteria and eukaryotes. Most fall into two large classes. One class consists of long proteins in which two classes of repeats are abundant: an FG-GAP repeat (pfam01839) class, and an RHS repeat (pfam05593) or YD repeat (TIGR01643). This class includes secreted bacterial insecticidal toxins and intercellular signalling proteins such as the teneurins in animals. The other class consists of uncharacterized proteins shorter than 400 amino acids, where this core domain of about 75 amino acids tends to occur in the N-terminal half. Over twenty such proteins are found in Pseudomonas putida alone; little sequence similarity or repeat structure is found among these proteins outside the region modeled by this domain.


Pssm-ID: 274730 [Multi-domain]  Cd Length: 77  Bit Score: 112.21  E-value: 2.19e-29
                           10        20        30        40        50        60        70
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....
gi 1487552165 1181 YDEWGNQLNEENphHVYQPYRLPGQQHDEESGLYYNRHRYYDPLQGRYITQDPMGLKGGWNLYQY-PLNPLQQIDPMGL 1258
Cdd:TIGR03696    1 YDPYGEVLSESG--AAPNPLRFTGQYYDAETGLYYNGARYYDPELGRFLSPDPIGLGGGLNLYAYvGNNPVNWVDPLGL 77
DUF6531 pfam20148
46-123 1.47e-18

Domain of unknown function (DUF6531); This putative domain is found in a range of RHS proteins.


Pssm-ID: 466309 [Multi-domain]  Cd Length: 74  Bit Score: 81.04  E-value: 1.47e-18
                           10        20        30        40        50        60        70
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....
gi 1487552165   46 GNPVNPLLGAKVLPgETDLALPGPLPFILSRTYSSYRTKTpapvGVFGPGWKAPSDIRLQLRDDG-LILNDNGGRSIHF 123
Cdd:pfam20148    1 GDPVNVATGNKVLE-ETDFSLPGPLPLVWTRTYNSSSERD----GPLGPGWSHPYDQRLELEGDGgVVYIDADGREVTF 74
RhsA COG3209
59-755 5.96e-16

Uncharacterized conserved protein RhsA, contains 28 RHS repeats [General function prediction only];


Pssm-ID: 442442 [Multi-domain]  Cd Length: 1103  Bit Score: 83.65  E-value: 5.96e-16
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1487552165   59 PGETDLALPGPLPFILSRTYSSYRTKTPAPVGVFGPGWKAPSDIRLQLRDDGLILNDNGGRSIHFEPLLPGEAVYSRSES 138
Cdd:COG3209    180 PATGVGTGAVTLATGLAGSALLALGSGAILGGLAGAYSGSATTATGTALGTPASVAATVTGSATGAAGAGAAVATAATTL 259
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1487552165  139 MWLVRGGKAAQPDGHTLARLWGALPPDIRLSPHLYLATNSAQGPWWILGWSERVPGAEDVLPAPLPPYRVLTGMADRFGR 218
Cdd:COG3209    260 GGTTGAGTGASGAGLDASTGTGGAGGSNAAATAGGLGGAGLGSGGAGGGGTAGGTTTAAGTTGTAAVSGAADAGTTTTTG 339
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1487552165  219 TLTYRREAAGDLAGEITGVTDGAGREFRLVLTTQAQRAEEARTSSLSSSDSSRPLSASAFPDTLPGTEYGPDRGIRLSAV 298
Cdd:COG3209    340 TGTGGTTTTVGGGGSLTLGGYGAAGGLTTSVGAGGGGSTSGSTTTVGGGGTATGSGGGSSTTGVGAGTTTTSTTGGDGGP 419
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1487552165  299 WLMHDPAYPESLPAAPLVRYTYTEAGELLAVYDRSNTQVRAFTYDAQHPGRMVAHRYAGRPEMRYRYDDTGRVVEQLNPA 378
Cdd:COG3209    420 ATAAGALTAGGTATGTGTGGGGTTAGTDATTTTGGAGASGTLTTTGGAATGATTGGGTEAGTGGGTLTSGSAGATTLGTD 499
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1487552165  379 GLSYRYLYEQDRITVTDSLNRREVLHTEGGAGLKRVVKKELADGSVTRSGYDAAGRLTAQTDAAGRRTEYGLNVVSGDIT 458
Cdd:COG3209    500 TTLDDTLGGTTTTTAGARGLVVTTGTTLTLGTTTTATLSATDATGTGDTTTTGTVGTGTSTGTGGTGTVTTTGDGTGGAS 579
                          410       420       430       440       450       460       470       480
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1487552165  459 DITTPDGRETKFYYNDGNQLTAVVSPDGLESRREYDEPGRLVSETSRSGETVRYRYDDAHSELPATTTDATGSTRQMTWS 538
Cdd:COG3209    580 TTTGTTGGTATTTTVTTTTTTSTAGTTTTTTSGYTRAGLTLTLGTGTASGLERATASTGSTTGGTTGTGVTTTGTTTTRA 659
                          490       500       510       520       530       540       550       560
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1487552165  539 RYGQLLAFTDCSGYQTRYEYDRFGQMTAVHREEGISLYRRYDNRGRLTSVKDAQGRETRYEYNAAGDLTAVITPDGNRSE 618
Cdd:COG3209    660 TGTTGTGTGVTAGLTTLATGGTTVGGGTGTTSTATTGATTGGTETGTTVTTLAGGTTTRLGTTTTGGGGGTTTDGTGTGG 739
                          570       580       590       600       610       620       630       640
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1487552165  619 TQYDAWGKAVSTTQGGLTRSMEYDAAGRVISLTNENGSHSV-----FSYDALDRLVQQGGFDGRTQRYHYDLTGKLTQ-- 691
Cdd:COG3209    740 TTGTLTTTSTTTTTTAGALTYTYDALGRLTSETTPGGVTQGtyttrYTYDALGRLTSVTYPDGETVTYTYDALGRLTSvi 819
                          650       660       670       680       690       700       710
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|..
gi 1487552165  692 -----SEDEGLVILWYYDESDRITHRT---VNGEPAEQWQYDGHGWLTDISHLSEGHRvavhYGYDDKGRLT 755
Cdd:COG3209    820 tvgsgGGTDLQDRTYTYDAAGNITSITdalRAGTLTQTYTYDALGRLTSATDPGTTES----YTYDANGNLT 887
RHS pfam03527
1155-1191 1.78e-12

RHS protein;


Pssm-ID: 427349 [Multi-domain]  Cd Length: 38  Bit Score: 62.71  E-value: 1.78e-12
                           10        20        30
                   ....*....|....*....|....*....|....*..
gi 1487552165 1155 HLYHCDHRGLPLALISEDGNTAWSAEYDEWGNQLNEE 1191
Cdd:pfam03527    2 YYYHTDHLGTPEELTDEAGEIVWSAEYDAWGNVTEER 38
RHS_repeat pfam05593
579-615 1.96e-08

RHS Repeat; RHS proteins contain extended repeat regions. These repeats often appear to be involved in ligand binding. Note that this model may not find all the repeats in a protein and that it covers two RHS repeats. The 3D structure of an RHS-repeat-containing protein (the B and C components of an ABC toxin complex) has been determined. The RHS repeats form an extended strip of beta-sheet that spirals around to form a hollow shell, encapsulating the variable C-terminal domain.


Pssm-ID: 461685 [Multi-domain]  Cd Length: 37  Bit Score: 51.45  E-value: 1.96e-08
                           10        20        30
                   ....*....|....*....|....*....|....*..
gi 1487552165  579 YDNRGRLTSVKDAQGRETRYEYNAAGDLTAVITPDGN 615
Cdd:pfam05593    1 YDAAGRLTSVTDPDGRVTTYTYDAAGRLTAVTDPDGT 37
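
As a rough reading aid for the pairwise rows above, percent identity can be estimated by comparing the query row (gi 1487552165) with the model row (Cdd:pfam05593) column by column. The sketch below is illustrative only: the function name `percent_identity` and the two constants are hypothetical, and counting identity over columns where neither row has a gap is one common convention, not necessarily the one CD-Search itself applies.

```python
# Illustrative sketch: percent identity between two aligned rows.
# Assumption: both rows are the same length and use "-" for gaps.

QUERY = "YDNRGRLTSVKDAQGRETRYEYNAAGDLTAVITPDGN"      # gi 1487552165, residues 579-615
CONSENSUS = "YDAAGRLTSVTDPDGRVTTYTYDAAGRLTAVTDPDGT"  # Cdd:pfam05593, positions 1-37


def percent_identity(query: str, subject: str) -> tuple[int, int, float]:
    """Return (identical columns, aligned columns, percent identity)."""
    if len(query) != len(subject):
        raise ValueError("aligned rows must have equal length")
    matches = 0
    aligned = 0
    for q, s in zip(query, subject):
        if q == "-" or s == "-":
            continue  # skip columns where either row has a gap
        aligned += 1
        if q.upper() == s.upper():
            matches += 1
    pct = 100.0 * matches / aligned if aligned else 0.0
    return matches, aligned, pct


if __name__ == "__main__":
    m, n, pct = percent_identity(QUERY, CONSENSUS)
    print(f"{m}/{n} identical columns ({pct:.1f}%)")
```

For the 579-615 hit this counts 24 identical positions out of 37 aligned columns, roughly 64.9%.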
PAAR_2 cd14738
1-29 7.59e-08

proline-alanine-alanine-arginine (PAAR) domain; This domain is found in the PAAR (proline-alanine-alanine-arginine) repeat family, where it forms a sharp conical extension on the VgrG spike, a trimeric protein complex of the bacterial type VI secretion system (T6SS). The T6SS is responsible for translocation of a wide variety of toxic effector molecules, allowing predatory cells to kill prokaryotic as well as eukaryotic prey cells. The pointed tip of the PAAR domain is stabilized by a zinc atom positioned close to the cone's vertex and is likely to be important for its integrity during penetration of the target cell envelope. VgrG proteins are orthologous to the central baseplate spikes of bacteriophages with contractile tails, and genes encoding proteins with PAAR motifs have been frequently found immediately downstream from vgrG-like genes. It has been shown that PAAR proteins are essential for T6SS-mediated secretion and target cell killing by Vibrio cholerae (encodes two PAAR proteins) and Acinetobacter baylyi (encodes three PAAR proteins); inactivation of all these PAAR genes results in inactivation of Hcp secretion as well as T6SS-dependent killing of E. coli.


Pssm-ID: 269823  Cd Length: 94  Bit Score: 51.48  E-value: 7.59e-08
                           10        20
                   ....*....|....*....|....*....
gi 1487552165    1 MSGKPAARQGDMTQYGGPIVQGSAGVRIG 29
Cdd:cd14738     66 IGGKPAARMGDSTAHGGVIVSGVPTVLIG 94
YD_repeat_2x TIGR01643
579-620 1.86e-07

YD repeat (two copies); This model describes two tandem copies of a 21-residue extracellular repeat found in Gram-negative, Gram-positive, and animal proteins. The repeat is named for a YD dipeptide, the most strongly conserved motif of the repeat. These repeats appear in general to be involved in binding carbohydrate; the chicken teneurin-1 YD-repeat region has been shown to bind heparin.


Pssm-ID: 273728 [Multi-domain]  Cd Length: 42  Bit Score: 48.74  E-value: 1.86e-07
                           10        20        30        40
                   ....*....|....*....|....*....|....*....|..
gi 1487552165  579 YDNRGRLTSVKDAQGRETRYEYNAAGDLTAVITPDGNRSETQ 620
Cdd:TIGR01643    1 YDAAGRLTGSTDADGTTTRYTYDAAGRLVEITDADGGSTRYE 42
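
Each hit in this list carries a summary line of the form "Pssm-ID: ... Cd Length: ... Bit Score: ... E-value: ...", with an optional "[Multi-domain]" tag. A minimal sketch for pulling those fields out of the plain text follows. The names `HitSummary`, `SUMMARY_RE`, and `parse_summary` are hypothetical, and the pattern is only assumed to cover the layout seen on this page, not every CD-Search output format.

```python
# Illustrative sketch: parse the per-hit summary line from this page's text.
import re
from dataclasses import dataclass
from typing import Optional

SUMMARY_RE = re.compile(
    r"Pssm-ID:\s*(?P<pssm_id>\d+)"
    r"(?:\s*\[(?P<tag>[^\]]+)\])?"          # optional tag such as [Multi-domain]
    r"\s+Cd Length:\s*(?P<cd_length>\d+)"
    r"\s+Bit Score:\s*(?P<bit_score>[\d.]+)"
    r"\s+E-value:\s*(?P<evalue>\S+)"
)


@dataclass
class HitSummary:
    pssm_id: int
    multi_domain: bool
    cd_length: int
    bit_score: float
    evalue: float


def parse_summary(line: str) -> Optional[HitSummary]:
    """Return the parsed fields, or None if the line is not a summary line."""
    m = SUMMARY_RE.search(line)
    if m is None:
        return None
    return HitSummary(
        pssm_id=int(m.group("pssm_id")),
        multi_domain=(m.group("tag") == "Multi-domain"),
        cd_length=int(m.group("cd_length")),
        bit_score=float(m.group("bit_score")),
        evalue=float(m.group("evalue")),  # "0e+00" parses as 0.0
    )


if __name__ == "__main__":
    line = "Pssm-ID: 273728 [Multi-domain]  Cd Length: 42  Bit Score: 48.74  E-value: 1.86e-07"
    print(parse_summary(line))
```

Parsed records like these could then be filtered by an E-value threshold or sorted by bit score; any such cutoff is a choice of the reader and is not stated on the page.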
PAAR COG4104
3-48 1.51e-05

Zn-binding Pro-Ala-Ala-Arg (PAAR) domain, involved in Type VI secretion [Intracellular trafficking, secretion, and vesicular transport];


Pssm-ID: 443280  Cd Length: 87  Bit Score: 44.81  E-value: 1.51e-05
                           10        20        30        40        50
                   ....*....|....*....|....*....|....*....|....*....|...
gi 1487552165    3 GKPAARQGDMTQYGGPIVQGSAGVRIG----APTG--VACSVC-PGGMTSGNP 48
Cdd:COG4104      2 PKPAARLGDKTSHGGPVISGSPTVLIGgrpaARVGdkVSCPKHgPDTIAEGSP 54
RHS_repeat pfam05593
429-466 6.46e-05

RHS Repeat; RHS proteins contain extended repeat regions. These repeats often appear to be involved in ligand binding. Note that this model may not find all the repeats in a protein and that it covers two RHS repeats. The 3D structure of an RHS-repeat-containing protein (the B and C components of an ABC toxin complex) has been determined. The RHS repeats form an extended strip of beta-sheet that spirals around to form a hollow shell, encapsulating the variable C-terminal domain.


Pssm-ID: 461685 [Multi-domain]  Cd Length: 37  Bit Score: 41.43  E-value: 6.46e-05
                           10        20        30
                   ....*....|....*....|....*....|....*...
gi 1487552165  429 YDAAGRLTAQTDAAGRRTEYGLNvVSGDITDITTPDGR 466
Cdd:pfam05593    1 YDAAGRLTSVTDPDGRVTTYTYD-AAGRLTAVTDPDGT 37
PAAR COG4104
3-29 7.53e-05

Zn-binding Pro-Ala-Ala-Arg (PAAR) domain, involved in Type VI secretion [Intracellular trafficking, secretion, and vesicular transport];


Pssm-ID: 443280  Cd Length: 87  Bit Score: 42.88  E-value: 7.53e-05
                           10        20
                   ....*....|....*....|....*..
gi 1487552165    3 GKPAARQGDMTQYGGPIVQGSAGVRIG 29
Cdd:COG4104     60 GKPAARVGDKTACGGTIISGSPTVLIG 86
RHS_repeat pfam05593
641-677 1.47e-04

RHS Repeat; RHS proteins contain extended repeat regions. These repeats often appear to be involved in ligand binding. Note that this model may not find all the repeats in a protein and that it covers two RHS repeats. The 3D structure of an RHS-repeat-containing protein (the B and C components of an ABC toxin complex) has been determined. The RHS repeats form an extended strip of beta-sheet that spirals around to form a hollow shell, encapsulating the variable C-terminal domain.


Pssm-ID: 461685 [Multi-domain]  Cd Length: 37  Bit Score: 40.27  E-value: 1.47e-04
                           10        20        30
                   ....*....|....*....|....*....|....*..
gi 1487552165  641 YDAAGRVISLTNENGSHSVFSYDALDRLVQQGGFDGR 677
Cdd:pfam05593    1 YDAAGRLTSVTDPDGRVTTYTYDAAGRLTAVTDPDGT 37
RHS_repeat pfam05593
537-572 2.10e-04

RHS Repeat; RHS proteins contain extended repeat regions. These repeats often appear to be involved in ligand binding. Note that this model may not find all the repeats in a protein and that it covers two RHS repeats. The 3D structure of an RHS-repeat-containing protein (the B and C components of an ABC toxin complex) has been determined. The RHS repeats form an extended strip of beta-sheet that spirals around to form a hollow shell, encapsulating the variable C-terminal domain.


Pssm-ID: 461685 [Multi-domain]  Cd Length: 37  Bit Score: 39.89  E-value: 2.10e-04
                           10        20        30
                   ....*....|....*....|....*....|....*.
gi 1487552165  537 WSRYGQLLAFTDCSGYQTRYEYDRFGQMTAVHREEG 572
Cdd:pfam05593    1 YDAAGRLTSVTDPDGRVTTYTYDAAGRLTAVTDPDG 36
RHS_repeat pfam05593
413-444 3.33e-04

RHS Repeat; RHS proteins contain extended repeat regions. These repeats often appear to be involved in ligand binding. Note that this model may not find all the repeats in a protein and that it covers two RHS repeats. The 3D structure of an RHS-repeat-containing protein (the B and C components of an ABC toxin complex) has been determined. The RHS repeats form an extended strip of beta-sheet that spirals around to form a hollow shell, encapsulating the variable C-terminal domain.


Pssm-ID: 461685 [Multi-domain]  Cd Length: 37  Bit Score: 39.50  E-value: 3.33e-04
                           10        20        30
                   ....*....|....*....|....*....|..
gi 1487552165  413 RVVKKELADGSVTRSGYDAAGRLTAQTDAAGR 444
Cdd:pfam05593    6 RLTSVTDPDGRVTTYTYDAAGRLTAVTDPDGT 37
YD_repeat_2x TIGR01643
413-448 9.38e-04

YD repeat (two copies); This model describes two tandem copies of a 21-residue extracellular repeat found in Gram-negative, Gram-positive, and animal proteins. The repeat is named for a YD dipeptide, the most strongly conserved motif of the repeat. These repeats appear in general to be involved in binding carbohydrate; the chicken teneurin-1 YD-repeat region has been shown to bind heparin.


Pssm-ID: 273728 [Multi-domain]  Cd Length: 42  Bit Score: 38.34  E-value: 9.38e-04
                           10        20        30
                   ....*....|....*....|....*....|....*.
gi 1487552165  413 RVVKKELADGSVTRSGYDAAGRLTAQTDAAGRRTEY 448
Cdd:TIGR01643    6 RLTGSTDADGTTTRYTYDAAGRLVEITDADGGSTRY 41
RHS_repeat pfam05593
493-531 1.52e-03

RHS Repeat; RHS proteins contain extended repeat regions. These repeats often appear to be involved in ligand binding. Note that this model may not find all the repeats in a protein and that it covers two RHS repeats. The 3D structure of an RHS-repeat-containing protein (the B and C components of an ABC toxin complex) has been determined. The RHS repeats form an extended strip of beta-sheet that spirals around to form a hollow shell, encapsulating the variable C-terminal domain.


Pssm-ID: 461685 [Multi-domain]  Cd Length: 37  Bit Score: 37.58  E-value: 1.52e-03
                           10        20        30
                   ....*....|....*....|....*....|....*....
gi 1487552165  493 YDEPGRLVSETSRSGETVRYRYDDAHseLPATTTDATGS 531
Cdd:pfam05593    1 YDAAGRLTSVTDPDGRVTTYTYDAAG--RLTAVTDPDGT 37
PAAR_2 cd14738
3-29 1.86e-03

proline-alanine-alanine-arginine (PAAR) domain; This domain is found in the PAAR (proline-alanine-alanine-arginine) repeat family, where it forms a sharp conical extension on the VgrG spike, a trimeric protein complex of the bacterial type VI secretion system (T6SS). The T6SS is responsible for translocation of a wide variety of toxic effector molecules, allowing predatory cells to kill prokaryotic as well as eukaryotic prey cells. The pointed tip of the PAAR domain is stabilized by a zinc atom positioned close to the cone's vertex and is likely to be important for its integrity during penetration of the target cell envelope. VgrG proteins are orthologous to the central baseplate spikes of bacteriophages with contractile tails, and genes encoding proteins with PAAR motifs have been frequently found immediately downstream from vgrG-like genes. It has been shown that PAAR proteins are essential for T6SS-mediated secretion and target cell killing by Vibrio cholerae (encodes two PAAR proteins) and Acinetobacter baylyi (encodes three PAAR proteins); inactivation of all these PAAR genes results in inactivation of Hcp secretion as well as T6SS-dependent killing of E. coli.


Pssm-ID: 269823  Cd Length: 94  Bit Score: 39.15  E-value: 1.86e-03
                           10        20        30
                   ....*....|....*....|....*....|
gi 1487552165    3 GKPAARQGDMTQYGGP---IVQGSAGVRIG 29
Cdd:cd14738     38 GLPAARVGDMCVCVGPpdtIVQGSSTVLIG 67
YD_repeat_2x TIGR01643
472-512 2.21e-03

YD repeat (two copies); This model describes two tandem copies of a 21-residue extracellular repeat found in Gram-negative, Gram-positive, and animal proteins. The repeat is named for a YD dipeptide, the most strongly conserved motif of the repeat. These repeats appear in general to be involved in binding carbohydrate; the chicken teneurin-1 YD-repeat region has been shown to bind heparin.


Pssm-ID: 273728 [Multi-domain]  Cd Length: 42  Bit Score: 37.18  E-value: 2.21e-03
                           10        20        30        40
                   ....*....|....*....|....*....|....*....|.
gi 1487552165  472 YNDGNQLTAVVSPDGLESRREYDEPGRLVSETSRSGETVRY 512
Cdd:TIGR01643    1 YDAAGRLTGSTDADGTTTRYTYDAAGRLVEITDADGGSTRY 41
RHS_repeat pfam05593
558-594 3.11e-03

RHS Repeat; RHS proteins contain extended repeat regions. These repeats often appear to be involved in ligand binding. Note that this model may not find all the repeats in a protein and that it covers two RHS repeats. The 3D structure of an RHS-repeat-containing protein (the B and C components of an ABC toxin complex) has been determined. The RHS repeats form an extended strip of beta-sheet that spirals around to form a hollow shell, encapsulating the variable C-terminal domain.


Pssm-ID: 461685 [Multi-domain]  Cd Length: 37  Bit Score: 36.42  E-value: 3.11e-03
                           10        20        30
                   ....*....|....*....|....*....|....*..
gi 1487552165  558 YDRFGQMTAVHREEGISLYRRYDNRGRLTSVKDAQGR 594
Cdd:pfam05593    1 YDAAGRLTSVTDPDGRVTTYTYDAAGRLTAVTDPDGT 37
RHS_repeat pfam05593
472-508 3.50e-03

RHS Repeat; RHS proteins contain extended repeat regions. These repeats often appear to be involved in ligand binding. Note that this model may not find all the repeats in a protein and that it covers two RHS repeats. The 3D structure of an RHS-repeat-containing protein (the B and C components of an ABC toxin complex) has been determined. The RHS repeats form an extended strip of beta-sheet that spirals around to form a hollow shell, encapsulating the variable C-terminal domain.


Pssm-ID: 461685 [Multi-domain]  Cd Length: 37  Bit Score: 36.42  E-value: 3.50e-03
                           10        20        30
                   ....*....|....*....|....*....|....*..
gi 1487552165  472 YNDGNQLTAVVSPDGLESRREYDEPGRLVSETSRSGE 508
Cdd:pfam05593    1 YDAAGRLTSVTDPDGRVTTYTYDAAGRLTAVTDPDGT 37
PAAR_like cd14671
5-48 4.07e-03

proline-alanine-alanine-arginine (PAAR) repeat superfamily; This domain is found in the PAAR (proline-alanine-alanine-arginine) repeat superfamily, where it forms a sharp conical extension on the VgrG spike, a trimeric protein complex of the bacterial type VI secretion system (T6SS). The T6SS is responsible for translocation of a wide variety of toxic effector molecules, allowing predatory cells to kill prokaryotic as well as eukaryotic prey cells. The pointed tip of the PAAR domain is stabilized by a zinc atom positioned close to the cone's vertex and is likely to be important for its integrity during penetration of the target cell envelope. The PAAR-repeat proteins form a diverse superfamily with several subgroups extended both N- and C-terminally by domains with various predicted functions; the termini are exposed to solution, and do not distort the VgrG binding site. VgrG proteins are orthologous to the central baseplate spikes of bacteriophages with contractile tails, and genes encoding proteins with PAAR motifs have been frequently found immediately downstream from vgrG-like genes. It has been shown that PAAR proteins are essential for T6SS-mediated secretion and target cell killing by Vibrio cholerae (encodes two PAAR proteins) and Acinetobacter baylyi (encodes three PAAR proteins); inactivation of all these PAAR genes results in inactivation of Hcp secretion as well as T6SS-dependent killing of E. coli.


Pssm-ID: 269821  Cd Length: 77  Bit Score: 37.69  E-value: 4.07e-03
                           10        20        30        40        50
                   ....*....|....*....|....*....|....*....|....*....|..
gi 1487552165    5 PAARQGDMTQ--YGGPIVQGSAGVRIG---APTGVACSVCPGG---MTSGNP 48
Cdd:cd14671      1 PAARVGDPTAhtPGGPVISGSPNVFINgrpAARVGDVGDHPGGgnaIVSGSG 52
PAAR_CT_1 cd14743
2-39 4.96e-03

proline-alanine-alanine-arginine (PAAR) domain with C-terminal extension; This domain is found in the PAAR (proline-alanine-alanine-arginine) repeat family of mostly gamma-proteobacteria, and forms a sharp conical extension on the VgrG spike, a trimeric protein complex of the bacterial type VI secretion system (T6SS). Some members contain C-terminal domain extensions corresponding to Rearrangement hotspot (Rhs) protein repeats and conserved Rhs repeat-associated unique core sequences as well as uncharacterized domains. However, these terminal domains are exposed to solution, and do not distort the binding site of VgrG. Rhs and related YD-peptide repeat proteins are widely distributed in bacteria. Rhs shares similar architecture with distantly related WapA proteins of Bacillus and Listeria species, suggesting intercellular growth inhibition as its primary function. Additionally, a plasmid-encoded Rhs protein has been implicated in bacteriocin production in Pseudomonas savastanoi. The pointed tip of the PAAR domain is stabilized by a zinc atom positioned close to the cone's vertex and is likely to be important for its integrity during penetration of the target cell envelope. VgrG proteins are orthologous to the central baseplate spikes of bacteriophages with contractile tails, and genes encoding proteins with PAAR motifs have been frequently found immediately downstream from vgrG-like genes.


Pssm-ID: 269828  Cd Length: 78  Bit Score: 37.28  E-value: 4.96e-03
                           10        20        30
                   ....*....|....*....|....*....|....*....
gi 1487552165    2 SGKPAARQGDMTQYGGPIVQGSAGVRI-GAPTGVACSVC 39
Cdd:cd14743     30 DGLPAARVGDKTSCGATIVSGSINVLInGKPAAVLGSTT 68
YD_repeat_2x TIGR01643
YD repeat (two copies); This model describes two tandem copies of a 21-residue extracellular ...
600-637 5.05e-03

YD repeat (two copies); This model describes two tandem copies of a 21-residue extracellular repeat found in Gram-negative, Gram-positive, and animal proteins. The repeat is named for a YD dipeptide, the most strongly conserved motif of the repeat. These repeats appear in general to be involved in binding carbohydrate; the chicken teneurin-1 YD-repeat region has been shown to bind heparin.


Pssm-ID: 273728 [Multi-domain]  Cd Length: 42  Bit Score: 36.03  E-value: 5.05e-03
                           10        20        30        40
                   ....*....|....*....|....*....|....*....|
gi 1487552165  600 YNAAGDLTAVITPDGNRSETQYDAWGKAVSTT--QGGLTR 637
Cdd:TIGR01643    1 YDAAGRLTGSTDADGTTTRYTYDAAGRLVEITdaDGGSTR 40
PAAR_RHS cd14742
proline-alanine-alanine-arginine (PAAR) domain, also containing C-terminal Rearrangement ...
3-29 7.17e-03

proline-alanine-alanine-arginine (PAAR) domain, also containing C-terminal Rearrangement hotspot (Rhs) extensions; This PAAR (proline-alanine-alanine-arginine) repeat subfamily, which forms a sharp conical extension on the VgrG spike, a trimeric protein complex of the bacterial type VI secretion system (T6SS), contains C- and N-terminal domain extensions. These include Rearrangement hotspot (Rhs) protein repeats and conserved Rhs repeat-associated unique core sequences at the C-terminus, and domains with various predicted functions in the N- and C-terminal extensions. However, these terminal domains are exposed to solution, and do not distort the binding site of VgrG. Rhs and related YD-peptide repeat proteins are widely distributed in bacteria. Rhs shares similar architecture with distantly related WapA proteins of Bacillus and Listeria species, suggesting intercellular growth inhibition as its primary function. Additionally, a plasmid-encoded Rhs protein has been implicated in bacteriocin production in Pseudomonas savastanoi. The pointed tip of the PAAR domain is stabilized by a zinc atom positioned close to the cone's vertex and is likely to be important for its integrity during penetration of the target cell envelope. VgrG proteins are orthologous to the central baseplate spikes of bacteriophages with contractile tails, and genes encoding proteins with PAAR motifs have been frequently found immediately downstream from vgrG-like genes.


Pssm-ID: 269827  Cd Length: 86  Bit Score: 37.18  E-value: 7.17e-03
                           10        20
                   ....*....|....*....|....*..
gi 1487552165    3 GKPAARQGDMTQYGGPIVQGSAGVRIG 29
Cdd:cd14742     60 GQPAARKGDKTTCSAVISEGSPNVFIG 86
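
One way to read the pairwise alignment rows above is to count identical residues across the aligned columns. The sketch below is illustrative only: the function name `percent_identity` is hypothetical, skipping columns that contain a gap character (`-`) is an assumed convention, and this count is not how CD-Search derives its bit score or E-value.

```python
from typing import Tuple


def percent_identity(query: str, subject: str) -> Tuple[int, int, float]:
    """Return (identical, compared, percent) over gap-free aligned columns."""
    if len(query) != len(subject):
        raise ValueError("aligned rows must have equal length")
    identical = 0
    compared = 0
    for q, s in zip(query, subject):
        # Assumption: columns with a gap in either row are not compared.
        if q == "-" or s == "-":
            continue
        compared += 1
        if q == s:
            identical += 1
    percent = 100.0 * identical / compared if compared else 0.0
    return identical, compared, percent


# Rows copied from the cd14742 (PAAR_RHS) alignment, query residues 3-29.
query_row = "GKPAARQGDMTQYGGPIVQGSAGVRIG"
cdd_row = "GQPAARKGDKTTCSAVISEGSPNVFIG"

identical, compared, percent = percent_identity(query_row, cdd_row)
print(f"{identical}/{compared} identical columns ({percent:.1f}%)")
```

The same function can be applied to the other alignment blocks on this page, since the lowercase residues in the Cdd rows sit opposite gaps in the query and are therefore skipped.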
 
Blast search parameters
Data Source: Precalculated data, version = cdd.v.3.21
Preset Options: Database: CDSEARCH/cdd   Low complexity filter: no   Composition Based Adjustment: yes   E-value threshold: 0.01
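
The E-value threshold in the search parameters above can be checked against the hit summary lines on this page. The following is a minimal sketch, not part of CD-Search itself: the regular expression, the `parse_summary` helper, and the use of `<=` for the cutoff comparison are all assumptions made for illustration, and the summary lines are copied from the four hits listed above.

```python
import re

# Hypothetical pattern for the "Pssm-ID ... E-value" summary lines on this page.
SUMMARY_RE = re.compile(
    r"Pssm-ID:\s*(?P<pssm_id>\d+).*?"
    r"Cd Length:\s*(?P<cd_length>\d+)\s+"
    r"Bit Score:\s*(?P<bit_score>[\d.]+)\s+"
    r"E-value:\s*(?P<e_value>\S+)"
)


def parse_summary(line: str) -> dict:
    """Extract the numeric fields from one hit summary line."""
    match = SUMMARY_RE.search(line)
    if match is None:
        raise ValueError(f"unrecognized summary line: {line!r}")
    return {
        "pssm_id": int(match.group("pssm_id")),
        "cd_length": int(match.group("cd_length")),
        "bit_score": float(match.group("bit_score")),
        "e_value": float(match.group("e_value")),
    }


summary_lines = [
    "Pssm-ID: 269821  Cd Length: 77  Bit Score: 37.69  E-value: 4.07e-03",
    "Pssm-ID: 269828  Cd Length: 78  Bit Score: 37.28  E-value: 4.96e-03",
    "Pssm-ID: 273728 [Multi-domain]  Cd Length: 42  Bit Score: 36.03  E-value: 5.05e-03",
    "Pssm-ID: 269827  Cd Length: 86  Bit Score: 37.18  E-value: 7.17e-03",
]

E_VALUE_THRESHOLD = 0.01  # from "E-value threshold: 0.01"

hits = [parse_summary(line) for line in summary_lines]
# Assumption: a hit is reported when its E-value does not exceed the threshold.
reported = [hit for hit in hits if hit["e_value"] <= E_VALUE_THRESHOLD]

for hit in reported:
    print(hit["pssm_id"], hit["bit_score"], hit["e_value"])
```

All four low-scoring hits fall under the 0.01 cutoff, which is consistent with their appearing in the list at all.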

References:

  • Wang J et al. (2023), "The conserved domain database in 2023", Nucleic Acids Res. 51(D1):D384-D388.
  • Lu S et al. (2020), "The conserved domain database in 2020", Nucleic Acids Res. 48(D1):D265-D268.
  • Marchler-Bauer A et al. (2017), "CDD/SPARCLE: functional classification of proteins via subfamily domain architectures", Nucleic Acids Res. 45(D1):D200-D203.