Conserved domains on  [gi|556195944|ref|WP_023279427|]

MULTISPECIES: phage tail protein [Klebsiella]

List of domain hits

Name Accession Description Interval E-value
COG4733 COG4733
Phage-related protein, tail protein J [Mobilome: prophages, transposons];
244-1066 8.18e-177

Pssm-ID: 443767 [Multi-domain]  Cd Length: 978  Bit Score: 573.43  E-value: 8.18e-177
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944  244 LVGLKVNSEQFGSSMPSRSYLIRGLKIRVPSNYDENtntynGVWDGTFKLLSSSNPAWILFDLLTNARYGLGKFVSESMI 323
Cdd:COG4733   148 LVGLRFDAEQFNGSIPNVNALVRGRKIRVPSNYDPS-----GVWDGTFKWAWTNNPAWVFYDLLTGDRYGLGRRLTAADI 222
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944  324 DLGQLYQIGRYCDEEVDDGFGGKEKRFAINTQITSRQDAYRLIQDIAGAFRGMVFWAGGMVNIMQDSPSD-PVMLFTNAN 402
Cdd:COG4733   223 DKWSLYAIAQYCDQKVPDGGGGTEPRFTCNVYIQSQASAWDVLRDIAAAFRGMPYWDGGKLGVVADRPRDpPVATFTPAN 302
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944  403 VKDGLFTYKGSARKDRPSVALITYNNKQDGYKQNVEYVEDQEAMARYGERKTEAVAFGCTSRGQAHRVGLWLLYTARMES 482
Cdd:COG4733   303 VVDGSFTYSYSSRKERPNAALVSFSDPDNGYQQAEEPVEDPDLIARYGVNQTELTAPGCTSRGQAQREGRWALLTNRYRT 382
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944  483 DMITFTAGLDASFLMPGETVLIQNKYRAGKRNSGRIVSFTKNSITLDAPVSLKKSGSFIRIINQEGKIVERDIneTGDNI 562
Cdd:COG4733   383 RTVTFSVGLDGLVATPGDVIAVADDVLAGRRIGGRVSSVDGRVVTLDRPVTMEAGDRYLRVRLPDGTSVARTV--QSVAG 460
                         330       340       350       360       370       380       390       400
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944  563 TKVTFKTALATAdqPVANGVWTITEPDLVPMRARVVAIAQGEtPGSFDITVVQNNASKYQAIDNGAALVPenttvldPTY 642
Cdd:COG4733   461 RTLTVSTAYSET--PEAGAVWAFGPDELETQLFRVVSIEENE-DGTYTITAVQHAPEKYAAIDAGAFDDV-------PPQ 530
                         410       420       430       440       450       460       470       480
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944  643 SKPSNLVISEGTYLSSPGNLSVKLMLAWEGK--SPEYWVSWRRSDegnvSNWQSARATEEQYEIVNVAENGRYDFQLYSV 720
Cdd:COG4733   531 WPPVNVTTSESLSVVAQGTAVTTLTVSWDAPagAVAYEVEWRRDD----GNWVSVPRTSGTSFEVPGIYAGDYEVRVRAI 606
                         490       500       510       520       530       540       550       560
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944  721 SFGGKKSEIITAV-YQVKGTMTPPGAPTSLTAVGDYRNVVLNWVNPDSVDLAQINVYASKTNKLDTATLI-AQAATTTFT 798
Cdd:COG4733   607 NALGVSSAWAASSeTTVTGKTAPPPAPTGLTATGGLGGITLSWSFPVDADTLRTEIRYSTTGDWASATVAqALYPGNTYT 686
                         570       580       590       600       610       620       630       640
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944  799 HAGLGDNETWYYWIRAVNKRGMVGQPNSNLGTEATTRDVLSFLKDKITSSELGKElLDEIDSKATQEAVDNAIGEVQNSV 878
Cdd:COG4733   687 LAGLKAGQTYYYRARAVDRSGNVSAWWVSGQASADAAGILDAITGQILETELGQE-LDAIIQNATVAEVVAATVTDVTAQ 765
                         650       660       670       680       690       700       710       720
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944  879 NESIQQVENDLAQTSSEIKAQVDSVnqslkedidTVNQTIVDNIDTVNQTINTNISNVNSQIEAAKQSIKDGDAALSQEI 958
Cdd:COG4733   766 IDTAVLFAGVATAAAIGAEARVAAT---------VAESATAAAATGTAADAAGDASGGVTAGTSGTTGAGDTAASTTRVA 836
                         730       740       750       760       770       780       790       800
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944  959 KKAQSSLTTSLSQTSKDLTAAIQKETNDRIADVNDAAKQAADQLLSAkNELKTSIDSLSEVVTSGDENLARQISQIAAGT 1038
Cdd:COG4733   837 AAVVLAGVVVYGDAIIESGNTGDIVATGDIASAAAGAVATTVSGTTA-ADVSAVADSTAASLTAIVIAATTIIDAIGDGT 915
                         810       820       830
                  ....*....|....*....|....*....|.
gi 556195944 1039 GEQFDSLK---IWYFDQDAEGWTEDDNGYTP 1066
Cdd:COG4733   916 TREPAGDIgasGGAQGFAVTIVGSFDGAGAV 946
DUF1983 pfam09327
Domain of unknown function (DUF1983); Members of this family of functionally uncharacterized domains are found in various bacteriophage host specificity proteins.
3763-3833 1.10e-16

Pssm-ID: 430529  Cd Length: 75  Bit Score: 77.34  E-value: 1.10e-16
                           10        20        30        40        50        60        70
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|...
gi 556195944  3763 QELKKTVVENGNVNSTWMVKMETNSNGKKYAAGIALGIDGKN--LQSQFLVQADRFGLINTSNGNTTTPFVVE 3833
Cdd:pfam09327    3 QKSTAVADLDGKLSAMYSIKAQVKANGQKYVAGIALGAESGGgvTTSQVLFMADRFAIVNPANGNVTPPFVVQ 75
ILEI pfam15711
Interleukin-like EMT inducer; ILEI is a family of proteins found in vertebrates. It is heavily involved in the process of the transition from epithelial to mesenchymal tissue - EMT - during all of embryonic development, cancer progression, metastasis, and chronic inflammation/fibrosis. ILEI is upregulated exclusively at the level of translation, and abnormal ILEI expression, i.e. cytoplasmic over-expression instead of vesicular localization, is associated with EMT in human cancerous tissue. In order to induce and maintain the EMT of hepatocytes in a TGF-beta-independent fashion, ILEI needs the cooperation of oncogenic Ras.
3059-3140 8.35e-09

Pssm-ID: 464817  Cd Length: 89  Bit Score: 55.34  E-value: 8.35e-09
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944  3059 DGSTV-IATSKTYDVFgSANNGATMSADIEALASGTYVCVLTFDEPTGN-RGKVLSALESLGgtSEVVNSLPYRGAYILL 3136
Cdd:pfam15711    9 DACTGkVLDSKSFDTY-SYSDSSRLANFLKSIPDGSIVLIATKDEASSKlSDEARKALESLG--SSKIDNLGFRDSWAFI 85

                   ....
gi 556195944  3137 GRKG 3140
Cdd:pfam15711   86 GFKG 89
DUF3672 super family cl13808
Fibronectin type III protein; This domain family is found in bacteria and viruses, and is typically between 126 and 146 amino acids in length. The family is found in association with pfam09327, pfam00041. There are two completely conserved G residues that may be functionally important. Many of the proteins in this family are annotated as fibronectin type III; however, there is little accompanying literature to confirm this.
3872-3948 1.14e-05

The actual alignment was detected with superfamily member pfam12421:

Pssm-ID: 289206  Cd Length: 133  Bit Score: 47.65  E-value: 1.14e-05
                           10        20        30        40        50        60        70
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*..
gi 556195944  3872 VGKDGSSQFHNVIVRGHVEAESGSFKGTIDATDGVFRGTVQASRFVGDICSAGVFKQGVRPDITHYDSGSVGGTKTY 3948
Cdd:pfam12421    1 LTPDGHLTAKNGDFRGSINANSGTLNNVTIAENCTISGTLRAEKILGDIVKAGVWEFPYVREPASSNHRYFSGTLTV 77
CBM_4_9 super family cl19911
Carbohydrate binding domain; This family includes diverse carbohydrate binding domains.
3235-3345 2.29e-05

The actual alignment was detected with superfamily member pfam02018:

Pssm-ID: 418717  Cd Length: 134  Bit Score: 47.06  E-value: 2.29e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944  3235 GNLIVNPSFE-RGTEGYTGWSGIATVVTLQVPHLGTKAAKLAAGGSAGVGQ----KISFKKDRSYKIGIWAKQDpnttiq 3309
Cdd:pfam02018    1 GNLIKNGTFEdGGLDGWKARGGSGKATVDVTSYNGTYSLKVSGRTATWDGQiidiTIRLEKGTTYTVSFWVKAS------ 74
                           90       100       110
                   ....*....|....*....|....*....|....*.
gi 556195944  3310 STDNTKFRVAEGnGLIASKAYGPFTSNWQEVSWTWK 3345
Cdd:pfam02018   75 SGPPQTVSVTLQ-ITDASGNYDTVADEKVVLTGEWT 109
FhaB super family cl27105
Large exoprotein involved in heme utilization or adhesion [Intracellular trafficking, secretion, and vesicular transport];
1874-3465 2.62e-05

The actual alignment was detected with superfamily member COG3210:

Pssm-ID: 442443 [Multi-domain]  Cd Length: 1698  Bit Score: 50.54  E-value: 2.62e-05
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944 1874 VNADTTITMAPGSRVFDDTGAVYVNGVRVAFGNASWNTVSFDLKAGWSTVEFLVNQWTGQAYINLGFKLSEKVAQLNSAL 1953
Cdd:COG3210    77 STGGIGAAAANTAGTLETGLTSNIGGGSVNGSNSTGNGTLTTTAASATTGNNTGGTTTSSTNTVTTLGGTTTGNTVLSTS 156
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944 1954 GMNALSNAISAVTSNVSTVGDRVTSTSQSVTDLRNSLEQTNANLENKADAQALSTLQNTVSKQRDTISSQGNSITNLNNT 2033
Cdd:COG3210   157 GAGNNTNTNNSSSGTNIGNSIPTTGGSLNVVAANPTGVTGVGGALINATAGVLANAGGGTAGGVASANSTLTGGVVAAGT 236
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944 2034 LTAARNAGDNLIPNYDFLQGATAWDTQYPAGVTFGNFGDGKAGVKLNRTTTTSPGIFSNNNKPLPLNGQRK--------- 2104
Cdd:COG3210   237 GAGVISTGGTDISSLSVAAGAGTGGAGGTGNAGNTTIGTTVTGTNATGSNTAGASSGDTTTNGTSSVTGAGgtgvlgggt 316
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944 2105 -YRVVVKAKGVSGAMNMLIRRQNKIGQTDSNYEDKNVTLTSEWQTITWETGLTASNADGQNFKLYAHPANAEIWVDTFKV 2183
Cdd:COG3210   317 aAGITTTNTVGGNGDGNNTTANSGAGLVSGGTGGNNGTTGTGAGSGLTGTGNGGGLTTAGAGTVASTVGTATASTGNASS 396
                         330       340       350       360       370       380       390       400
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944 2184 FDITDEVKIKANSDALSTLSSTVTQQGDKITSQGNSITKLTNDLEAADANIAKKADQSAVTTLTGRVEKTESGLTAANSN 2263
Cdd:COG3210   397 TTVLGSGSLATGNTGTTIAGNGGSANAGGFTTTGGVLGITGNGTVTGGTIGGLTGSGTTNGAGLSGNTDVSGTGTVTNSA 476
                         410       420       430       440       450       460       470       480
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944 2264 ITSLSSSLNQQSKRGANLLPDGTFESYAVGHNLSNNRVIVTTDDSHGGNKCIRVTRPNDYNANATDNSDNHIFSGFQVRD 2343
Cdd:COG3210   477 GNTTSATTLAGGGIGTVTTNATISNNAGGDANGIATGLTGITAGGGGGGNATSGGTGGDGTTLSGSGLTTTVSGGASGTT 556
                         490       500       510       520       530       540       550       560
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944 2344 NAVFYMecWVKLDAKSTAMAENAQISIGLSLQYQDNSWQWPAVTKAAKDLSTAQWTKVSGYLKSTKSGIKQAMVRISIPN 2423
Cdd:COG3210   557 AASGSN--TANTLGVLAATGGTSNATTAGNSTSATGGTGTNSGGTVLSIGTGSAGATGTITLGAGTSGAGANATGGGAGL 634
                         570       580       590       600       610       620       630       640
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944 2424 VSSVKAGNSFLIDDLVITEVTDAYNAQSTADANANAISTLDSTVSQQGDQITSQGNSITKLTNDLSTTNSNVAKKADGAA 2503
Cdd:COG3210   635 TGSAVGAALSGTGSGTTGTASANGSNTTGVNTAGGTGGGTTGTVTSGATGGTTGTTLNAATGGTLNNAGNTLTISTGSIT 714
                         650       660       670       680       690       700       710       720
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944 2504 VTALTnrvteaeGNISSQSNQLATLSNSLAEGSLICNGGLNVDASFWEDSGPGSAFTYDANEKAIRTTTGSirvANLTRI 2583
Cdd:COG3210   715 VTGQI-------GALANANGDTVTFGNLGTGATLTLNAGVTITSGNAGTLSIGLTANTTASGTTLTLANAN---GNTSAG 784
                         730       740       750       760       770       780       790       800
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944 2584 PVEAETTLTISFEMKASETISNVSSDSVGVIADLATPTNWISSVSPWLGGVSTNWQTKSVELTIPANFIGKYVYLRFAAG 2663
Cdd:COG3210   785 ATLDNAGAEISIDITADGTITAAGTTAINVTGSGGTITINTATTGLTGTGDTTSGAGGSNTTDTTTGTTSDGASGGGTAG 864
                         810       820       830       840       850       860       870       880
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944 2664 GWTPSNSARLYIRKVDVFSSTGVAKKANATAVSDLTSRVDSTEGKLASQSQALTKLQNDLATTNNNVSKKADANALTALT 2743
Cdd:COG3210   865 ANSGSLAATAASITVGSGGVATSTGTANAGTLTNLGTTTNAASGNGAVLATVTATGTGGGGLTGGNAAAGGTGAGNGTTA 944
                         890       900       910       920       930       940       950       960
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944 2744 NRVTQTEKDINSTSSSVTNLNNKVDAISVGGTNLIKNSGDMTGWSNVVSDTYRGNAVIGATVKAGSGYRDLREITLESPV 2823
Cdd:COG3210   945 LSGTQGNAGLSAASASDGAGDTGASSAAGSSAVGTSANSAGSTGGVIAATGILVAGNSGTTASTTGGSGAIVAGGNGVTG 1024
                         970       980       990      1000      1010      1020      1030      1040
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944 2824 DAGEYVYSFYAKGGVAGQTMTAFFYNPNTTTSIETSQGAKGNNSDGRAQFTLTTSWARYWVKWKQTPTTGTKRLILCRIE 2903
Cdd:COG3210  1025 TTGTASATGTGTAATAGGQNGVGVNASGISGGNAAALTASGTAGTTGGTAASNGGGGTAQASGAGTTHTLGGITNGGATG 1104
                        1050      1060      1070      1080      1090      1100      1110      1120
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944 2904 SNTSKDQTVYINSPKFEVGNVVSDWNESPSDSASASAVDSLTTKVNQQGTSISSIGNRTTSLENGLSTAQNNIAKKADAS 2983
Cdd:COG3210  1105 TSGGTTTSTGGVTASKVGGTTTVGATGTSTASTEAAGAGTLTGLVAVSAVAGGASSASAGDTTAVAAATTTTTGSAINGG 1184
                        1130      1140      1150      1160      1170      1180      1190      1200
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944 2984 ALQDLRNTVTSQGDDLTAANSSITSLQASMNRRTVFTVTARGNGNSVTHGVFDESGKNLFTPGRSWALVTFAKHSDGSTV 3063
Cdd:COG3210  1185 ADSAATEGTAGTDLKGGDSTGGSTTTIGTTNVTTTTTLTASDTGNTTATGGSSAGQTGSFVAAGSASGTGDATTGATAGA 1264
                        1210      1220      1230      1240      1250      1260      1270      1280
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944 3064 IATSKTYDVFGSANNGATMSADIEALASGTYVCVLTFDEPTGNRGKVLSALESLGGTSEVVNSLPYRGAYILLGRKGMKP 3143
Cdd:COG3210  1265 VSNGATSTVAGNAGATATGSTVDIGSTSATSAGGSLDTTGNTAGANGATVGTGIGGTTATGTAVAAVNSGGVNAGGGTIN 1344
                        1290      1300      1310      1320      1330      1340      1350      1360
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944 3144 GDGLELRAPTGGDATAHISTSVEFVNGVMMGLGAAGGVMMKADANASAITTLQNTVKTQGDNIDSLSSSTTALENSLASS 3223
Cdd:COG3210  1345 TTAANTGLNGGNGATDSAAGAGSGGAAGSLAATAGAGTVLTGAGNNTGAEGTNAGRDGGVTTSGTGVGNNGGVSGTTVAG 1424
                        1370      1380      1390      1400      1410      1420      1430      1440
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944 3224 NASVDAASQIPGNLIVNPSFERGTEGYTGWSGIATVVTLQVPHLGTKAAKLAAGGSAGVGQKISFKKDRSYKIGIWAKQD 3303
Cdd:COG3210  1425 TTGSSATTGTGGTGNTTGTSVAGAGGGNADASAINTGNASSLGAGGSTAGNAVGGAVIGGTTTGGNGAGVAGATASNGGT 1504
                        1450      1460      1470      1480      1490      1500      1510      1520
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944 3304 PNTTIQSTDNTKFRVAEGNGLIASKAYGPFTSNWQEVSWTWKATKDVVADVQFTAFLSAGAMYFDDFYVVDVTDSVETQA 3383
Cdd:COG3210  1505 STGAGGTAGGTTAEVAKASLEGGEGTYGGSSVAEAGTGGGILGAVSGAGSEGGAAGGVTGSVGVGGTDGAGGDTGGADDT 1584
                        1530      1540      1550      1560      1570      1580      1590      1600
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944 3384 NSSAITNLDSRVTKTENDITSQGSQVTQLKNDLATTNTNVSKKADAAALTALTNRVTQNEKEIETQSSQTTSLKNSLSTV 3463
Cdd:COG3210  1585 GAQAPTAGNTATLTLSLAEGTNAEYGGTTNVTSGTAGNAGATGANSNTVVTTNGGEGVLALVAGGNTTNGTTLSGAVNGA 1664

                  ..
gi 556195944 3464 QA 3465
Cdd:COG3210  1665 GN 1666
CALCOCO1 super family cl37761
Calcium binding and coiled-coil domain (CALCOCO1) like; Proteins found in this family are similar to the coiled-coil transcriptional coactivator protein coexpressed by Mus musculus (CoCoA/CALCOCO1). This protein binds to a highly conserved N-terminal domain of p160 coactivators, such as GRIP1, and thus enhances transcriptional activation by a number of nuclear receptors. CALCOCO1 has a central coiled-coil region with three leucine zipper motifs, which is required for its interaction with GRIP1 and may regulate the autonomous transcriptional activation activity of the C-terminal region.
1196-1378 4.17e-04

The actual alignment was detected with superfamily member pfam07888:

Pssm-ID: 462303 [Multi-domain]  Cd Length: 488  Bit Score: 46.04  E-value: 4.17e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944  1196 LQEEQQARANADTA-EAQARSTLAAQIRGSSESGNLDDIRSGLiyQEKNARITADAAEASAR----ESLQTEFNRNKASV 1270
Cdd:pfam07888   36 LEECLQERAELLQAqEAANRQREKEKERYKRDREQWERQRREL--ESRVAELKEELRQSREKheelEEKYKELSASSEEL 113
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944  1271 AEELHTLSTEQASQASKITGLQ---TSLGQKADASAVQ--TISQKVEEQGNTLKSQGAALSTLDNRVGSVESGVSANSKA 1345
Cdd:pfam07888  114 SEEKDALLAQRAAHEARIRELEediKTLTQRVLERETEleRMKERAKKAGAQRKEEEAERKQLQAKLQQTEEELRSLSKE 193
                          170       180       190
                   ....*....|....*....|....*....|...
gi 556195944  1346 ITGLQSTVTQQDKTLSSQSESITTLNNSLSDIQ 1378
Cdd:pfam07888  194 FQELRNSLAQRDTQVLQLQDTITTLTQKLTTAH 226
COG4372 COG4372
Uncharacterized protein, contains DUF3084 domain [Function unknown];
3629-3771 6.47e-04

Pssm-ID: 443500 [Multi-domain]  Cd Length: 370  Bit Score: 45.28  E-value: 6.47e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944 3629 ITDAKEAQDSANANASA----LTSLTSRVTNVEGLVTSQASQLSSLTSQVNDASSKVDQmaqtitnnekTQSSLNTSLQS 3704
Cdd:COG4372    26 IAALSEQLRKALFELDKlqeeLEQLREELEQAREELEQLEEELEQARSELEQLEEELEE----------LNEQLQAAQAE 95
                          90       100       110       120       130       140
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*..
gi 556195944 3705 QIDAQASANIKNQtELNNATTSLAAIKSTQQTQATTISALSQQQTNLTAQVGGQSAELQELKKTVVE 3771
Cdd:COG4372    96 LAQAQEELESLQE-EAEELQEELEELQKERQDLEQQRKQLEAQIAELQSEIAEREEELKELEEQLES 161
CBM_4_9 super family cl19911
Carbohydrate binding domain; This family includes diverse carbohydrate binding domains.
1389-1507 1.18e-03

The actual alignment was detected with superfamily member pfam02018:

Pssm-ID: 418717  Cd Length: 134  Bit Score: 42.05  E-value: 1.18e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944  1389 SNLLVNASFE-RDLAGWSAGNSVSSIIKASAPHSGSKILVCAAGT-----VQITQSVSVVEGRTYKLSSFVRcttdavIS 1462
Cdd:pfam02018    1 GNLIKNGTFEdGGLDGWKARGGSGKATVDVTSYNGTYSLKVSGRTatwdgQIIDITIRLEKGTTYTVSFWVK------AS 74
                           90       100       110       120       130
                   ....*....|....*....|....*....|....*....|....*....|
gi 556195944  1463 SPGNNKLRI-----GAATLLKEIPIRPENLPKDetWKEVSDTWKATLTGK 1507
Cdd:pfam02018   75 SGPPQTVSVtlqitDASGNYDTVADEKVVLTGE--WTKLEGTFTIPKTAS 122
 
Name Accession Description Interval E-value
Phage-tail_3 pfam13550
Putative phage tail protein; This putative domain is found in the large gene transfer agent protein. These produce defective phage like particles. This domain is similar to other phage-tail protein families.
357-518 5.20e-32

Pssm-ID: 433300 [Multi-domain]  Cd Length: 163  Bit Score: 124.35  E-value: 5.20e-32
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944   357 TSRQDAYRLIQDIAGAFRGMVFWAGGMVNIMQDSPsDPVMLFTNANVKDG----LFTYKGSARKDRPSVALITYNNKQDG 432
Cdd:pfam13550    1 DEQMSARDALEPLARAFGFDAVESGGTLRFRPRGV-APVATLTDDDLVDGsdgdPVERTRAAEAELPNAVRLTYTDPAND 79
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944   433 YKQNVEYVEDQEAMaryGERKTEAVAFGCTSRGQAHRVGLWLLYTARMESDMITFTAGLDASFLMPGETVLIQNKYRAGK 512
Cdd:pfam13550   80 YQPATVEARDAAGI---GERVSTVELPLVLSAGQAQRVAQRLLQEARAERETVTFSLPPSYLALEPGDVVELTDDGRAGR 156

                   ....*.
gi 556195944   513 RNSGRI 518
Cdd:pfam13550  157 WRIDRI 162
attach_TipJ_rel NF040662
host specificity factor TipJ family phage tail protein; Members of this family form a family related to that of host specificity protein J of phage lambda, a tail tip protein that mediates attachment to LamB on the surface of E. coli. Binding of the phage tail to the LamB receptor triggers the injection of phage DNA into host cells. Proteins with this domain are likely also to be phage tail proteins.
151-532 1.52e-20

Pssm-ID: 468628  Cd Length: 473  Bit Score: 98.51  E-value: 1.52e-20
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944  151 TSVQFKFQLANGNGsfydVIATGESSSDVTLTAKKTGVYYRSYEIQLPKPGRaYKVRVLRLSADSNDQY--LFNDT-WVD 227
Cdd:NF040662   22 VSVEVEYRKVDDNG----VPGGAWTSLYGVGTAATRDTLGRTRRIKLPPPGR-GEVRVRRRRTRDNNSNsrARDEVkWYG 96
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944  228 SIGEIVDTPMNYPNSVLVGLKVNSEQFGSSMPSRsylirglKIRVPSNydENTNTYNGVwDGTFKLLSSSNPAWILFDLL 307
Cdd:NF040662   97 LRAYLPRSPTVYPNVTLLAVRVRATDNLSSQSER-------KLNCIAT--RKLPVYNGG-GGWSDPTPTRSIAFALADLA 166
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944  308 TNARYGLGkfvSESMIDLGQLYQIgrycDEEVDDGFGGkEKRFAINTQITSRQDAyrlIQDIAGAFRGMVFWAGGMVNIM 387
Cdd:NF040662  167 RDPVIGRG---LPDEIDLDTLYAL----DDEVWTGRGD-EFDYTFDDESVSFEEA---LQMIANAGRAEPYRDGGLLSFV 235
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944  388 QDSPSD-PVMLFTNANVKDGLF--TYKGSARKDRPSVALiTYNNKQDGYKQNVEYVEDQEAMarygeRKTEAV-AFGCTS 463
Cdd:NF040662  236 RDEPRTvPGALFNPRNIVPDSFkrSYTMPVEDDYDGVEV-EYVDPDTWKKETVRCRLPGSAG-----RNPKKIeLDGIRN 309
                         330       340       350       360       370       380
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....
gi 556195944  464 RGQAHRVGLWLLYTARMESDMITFTAGLDASFLMPGETVLIQNKYRAGKRnSGRIVSFTKNSITLDAPV 532
Cdd:NF040662  310 RDQAWRRAMREARKLRYQRRSVSFTTELDGLLVNYGDRVAVADDIPGWTQ-SGEVTARDGLTLTTSEPL 377
ApoLp-III_like cd13769
Apolipophorin-III and similar insect proteins; Exchangeable apolipoproteins play vital roles ...
855-1013 2.75e-07

Apolipophorin-III and similar insect proteins; Exchangeable apolipoproteins play vital roles in the transport of lipids and lipoprotein metabolism. Apolipophorin III (apoLp-III) assists in the loading of diacylglycerol, generated from triacylglycerol stores in the fat body through the action of adipokinetic hormone, into lipophorin, the hemolymph lipoprotein. ApoLp-III increases the lipid carrying capacity of lipophorin by covering the expanding hydrophobic surface resulting from diacylglycerol uptake. It plays a critical role in the transport of lipids during insect flight, and may also play a role in defense mechanisms and innate immunity.


Pssm-ID: 259842 [Multi-domain]  Cd Length: 158  Bit Score: 53.10  E-value: 2.75e-07
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944  855 LDEIDSKAtQEAVDNAIGEVQNSVNESIQQ-VENDLAQTSSEIKAQVDSVNQSLKEDIDTVNQTIVDNIDTVNQTINTNI 933
Cdd:cd13769     3 LSELIQKA-QEAINNLAQQVQKQLGLQNPEeVVNTLKEQSDNFANNLQEVSSSLKEEAKKKQGEVEEAWNEFKTKLSETV 81
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944  934 SNVNSQIEAAKQSikdgdaalsQEIK-KAQSSLTTSLSQtSKDLTAAIQKETNDRIADVNDAAKQAADQLLSAKNELKTS 1012
Cdd:cd13769    82 PELRKSLPVEEKA---------QELQaKLQSGLQTLVTE-SQKLAKAISENSQKAQEELQKATKQAYDIAVEAAQNLQNQ 151

                  .
gi 556195944 1013 I 1013
Cdd:cd13769   152 L 152
DUF3672 pfam12421
Fibronectin type III protein; This domain family is found in bacteria and viruses, and is ...
3872-3948 1.14e-05

Fibronectin type III protein; This domain family is found in bacteria and viruses, and is typically between 126 and 146 amino acids in length. The family is found in association with pfam09327 and pfam00041. There are two completely conserved G residues that may be functionally important. Many of the proteins in this family are annotated as fibronectin type III; however, there is little accompanying literature to confirm this.


Pssm-ID: 289206  Cd Length: 133  Bit Score: 47.65  E-value: 1.14e-05
                           10        20        30        40        50        60        70
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*..
gi 556195944  3872 VGKDGSSQFHNVIVRGHVEAESGSFKGTIDATDGVFRGTVQASRFVGDICSAGVFKQGVRPDITHYDSGSVGGTKTY 3948
Cdd:pfam12421    1 LTPDGHLTAKNGDFRGSINANSGTLNNVTIAENCTISGTLRAEKILGDIVKAGVWEFPYVREPASSNHRYFSGTLTV 77
CBM_4_9 pfam02018
Carbohydrate binding domain; This family includes diverse carbohydrate binding domains.
3235-3345 2.29e-05

Carbohydrate binding domain; This family includes diverse carbohydrate binding domains.


Pssm-ID: 396553  Cd Length: 134  Bit Score: 47.06  E-value: 2.29e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944  3235 GNLIVNPSFE-RGTEGYTGWSGIATVVTLQVPHLGTKAAKLAAGGSAGVGQ----KISFKKDRSYKIGIWAKQDpnttiq 3309
Cdd:pfam02018    1 GNLIKNGTFEdGGLDGWKARGGSGKATVDVTSYNGTYSLKVSGRTATWDGQiidiTIRLEKGTTYTVSFWVKAS------ 74
                           90       100       110
                   ....*....|....*....|....*....|....*.
gi 556195944  3310 STDNTKFRVAEGnGLIASKAYGPFTSNWQEVSWTWK 3345
Cdd:pfam02018   75 SGPPQTVSVTLQ-ITDASGNYDTVADEKVVLTGEWT 109
FhaB COG3210
Large exoprotein involved in heme utilization or adhesion [Intracellular trafficking, ...
1874-3465 2.62e-05

Large exoprotein involved in heme utilization or adhesion [Intracellular trafficking, secretion, and vesicular transport];


Pssm-ID: 442443 [Multi-domain]  Cd Length: 1698  Bit Score: 50.54  E-value: 2.62e-05
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944 1874 VNADTTITMAPGSRVFDDTGAVYVNGVRVAFGNASWNTVSFDLKAGWSTVEFLVNQWTGQAYINLGFKLSEKVAQLNSAL 1953
Cdd:COG3210    77 STGGIGAAAANTAGTLETGLTSNIGGGSVNGSNSTGNGTLTTTAASATTGNNTGGTTTSSTNTVTTLGGTTTGNTVLSTS 156
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944 1954 GMNALSNAISAVTSNVSTVGDRVTSTSQSVTDLRNSLEQTNANLENKADAQALSTLQNTVSKQRDTISSQGNSITNLNNT 2033
Cdd:COG3210   157 GAGNNTNTNNSSSGTNIGNSIPTTGGSLNVVAANPTGVTGVGGALINATAGVLANAGGGTAGGVASANSTLTGGVVAAGT 236
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944 2034 LTAARNAGDNLIPNYDFLQGATAWDTQYPAGVTFGNFGDGKAGVKLNRTTTTSPGIFSNNNKPLPLNGQRK--------- 2104
Cdd:COG3210   237 GAGVISTGGTDISSLSVAAGAGTGGAGGTGNAGNTTIGTTVTGTNATGSNTAGASSGDTTTNGTSSVTGAGgtgvlgggt 316
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944 2105 -YRVVVKAKGVSGAMNMLIRRQNKIGQTDSNYEDKNVTLTSEWQTITWETGLTASNADGQNFKLYAHPANAEIWVDTFKV 2183
Cdd:COG3210   317 aAGITTTNTVGGNGDGNNTTANSGAGLVSGGTGGNNGTTGTGAGSGLTGTGNGGGLTTAGAGTVASTVGTATASTGNASS 396
                         330       340       350       360       370       380       390       400
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944 2184 FDITDEVKIKANSDALSTLSSTVTQQGDKITSQGNSITKLTNDLEAADANIAKKADQSAVTTLTGRVEKTESGLTAANSN 2263
Cdd:COG3210   397 TTVLGSGSLATGNTGTTIAGNGGSANAGGFTTTGGVLGITGNGTVTGGTIGGLTGSGTTNGAGLSGNTDVSGTGTVTNSA 476
                         410       420       430       440       450       460       470       480
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944 2264 ITSLSSSLNQQSKRGANLLPDGTFESYAVGHNLSNNRVIVTTDDSHGGNKCIRVTRPNDYNANATDNSDNHIFSGFQVRD 2343
Cdd:COG3210   477 GNTTSATTLAGGGIGTVTTNATISNNAGGDANGIATGLTGITAGGGGGGNATSGGTGGDGTTLSGSGLTTTVSGGASGTT 556
                         490       500       510       520       530       540       550       560
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944 2344 NAVFYMecWVKLDAKSTAMAENAQISIGLSLQYQDNSWQWPAVTKAAKDLSTAQWTKVSGYLKSTKSGIKQAMVRISIPN 2423
Cdd:COG3210   557 AASGSN--TANTLGVLAATGGTSNATTAGNSTSATGGTGTNSGGTVLSIGTGSAGATGTITLGAGTSGAGANATGGGAGL 634
                         570       580       590       600       610       620       630       640
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944 2424 VSSVKAGNSFLIDDLVITEVTDAYNAQSTADANANAISTLDSTVSQQGDQITSQGNSITKLTNDLSTTNSNVAKKADGAA 2503
Cdd:COG3210   635 TGSAVGAALSGTGSGTTGTASANGSNTTGVNTAGGTGGGTTGTVTSGATGGTTGTTLNAATGGTLNNAGNTLTISTGSIT 714
                         650       660       670       680       690       700       710       720
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944 2504 VTALTnrvteaeGNISSQSNQLATLSNSLAEGSLICNGGLNVDASFWEDSGPGSAFTYDANEKAIRTTTGSirvANLTRI 2583
Cdd:COG3210   715 VTGQI-------GALANANGDTVTFGNLGTGATLTLNAGVTITSGNAGTLSIGLTANTTASGTTLTLANAN---GNTSAG 784
                         730       740       750       760       770       780       790       800
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944 2584 PVEAETTLTISFEMKASETISNVSSDSVGVIADLATPTNWISSVSPWLGGVSTNWQTKSVELTIPANFIGKYVYLRFAAG 2663
Cdd:COG3210   785 ATLDNAGAEISIDITADGTITAAGTTAINVTGSGGTITINTATTGLTGTGDTTSGAGGSNTTDTTTGTTSDGASGGGTAG 864
                         810       820       830       840       850       860       870       880
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944 2664 GWTPSNSARLYIRKVDVFSSTGVAKKANATAVSDLTSRVDSTEGKLASQSQALTKLQNDLATTNNNVSKKADANALTALT 2743
Cdd:COG3210   865 ANSGSLAATAASITVGSGGVATSTGTANAGTLTNLGTTTNAASGNGAVLATVTATGTGGGGLTGGNAAAGGTGAGNGTTA 944
                         890       900       910       920       930       940       950       960
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944 2744 NRVTQTEKDINSTSSSVTNLNNKVDAISVGGTNLIKNSGDMTGWSNVVSDTYRGNAVIGATVKAGSGYRDLREITLESPV 2823
Cdd:COG3210   945 LSGTQGNAGLSAASASDGAGDTGASSAAGSSAVGTSANSAGSTGGVIAATGILVAGNSGTTASTTGGSGAIVAGGNGVTG 1024
                         970       980       990      1000      1010      1020      1030      1040
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944 2824 DAGEYVYSFYAKGGVAGQTMTAFFYNPNTTTSIETSQGAKGNNSDGRAQFTLTTSWARYWVKWKQTPTTGTKRLILCRIE 2903
Cdd:COG3210  1025 TTGTASATGTGTAATAGGQNGVGVNASGISGGNAAALTASGTAGTTGGTAASNGGGGTAQASGAGTTHTLGGITNGGATG 1104
                        1050      1060      1070      1080      1090      1100      1110      1120
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944 2904 SNTSKDQTVYINSPKFEVGNVVSDWNESPSDSASASAVDSLTTKVNQQGTSISSIGNRTTSLENGLSTAQNNIAKKADAS 2983
Cdd:COG3210  1105 TSGGTTTSTGGVTASKVGGTTTVGATGTSTASTEAAGAGTLTGLVAVSAVAGGASSASAGDTTAVAAATTTTTGSAINGG 1184
                        1130      1140      1150      1160      1170      1180      1190      1200
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944 2984 ALQDLRNTVTSQGDDLTAANSSITSLQASMNRRTVFTVTARGNGNSVTHGVFDESGKNLFTPGRSWALVTFAKHSDGSTV 3063
Cdd:COG3210  1185 ADSAATEGTAGTDLKGGDSTGGSTTTIGTTNVTTTTTLTASDTGNTTATGGSSAGQTGSFVAAGSASGTGDATTGATAGA 1264
                        1210      1220      1230      1240      1250      1260      1270      1280
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944 3064 IATSKTYDVFGSANNGATMSADIEALASGTYVCVLTFDEPTGNRGKVLSALESLGGTSEVVNSLPYRGAYILLGRKGMKP 3143
Cdd:COG3210  1265 VSNGATSTVAGNAGATATGSTVDIGSTSATSAGGSLDTTGNTAGANGATVGTGIGGTTATGTAVAAVNSGGVNAGGGTIN 1344
                        1290      1300      1310      1320      1330      1340      1350      1360
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944 3144 GDGLELRAPTGGDATAHISTSVEFVNGVMMGLGAAGGVMMKADANASAITTLQNTVKTQGDNIDSLSSSTTALENSLASS 3223
Cdd:COG3210  1345 TTAANTGLNGGNGATDSAAGAGSGGAAGSLAATAGAGTVLTGAGNNTGAEGTNAGRDGGVTTSGTGVGNNGGVSGTTVAG 1424
                        1370      1380      1390      1400      1410      1420      1430      1440
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944 3224 NASVDAASQIPGNLIVNPSFERGTEGYTGWSGIATVVTLQVPHLGTKAAKLAAGGSAGVGQKISFKKDRSYKIGIWAKQD 3303
Cdd:COG3210  1425 TTGSSATTGTGGTGNTTGTSVAGAGGGNADASAINTGNASSLGAGGSTAGNAVGGAVIGGTTTGGNGAGVAGATASNGGT 1504
                        1450      1460      1470      1480      1490      1500      1510      1520
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944 3304 PNTTIQSTDNTKFRVAEGNGLIASKAYGPFTSNWQEVSWTWKATKDVVADVQFTAFLSAGAMYFDDFYVVDVTDSVETQA 3383
Cdd:COG3210  1505 STGAGGTAGGTTAEVAKASLEGGEGTYGGSSVAEAGTGGGILGAVSGAGSEGGAAGGVTGSVGVGGTDGAGGDTGGADDT 1584
                        1530      1540      1550      1560      1570      1580      1590      1600
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944 3384 NSSAITNLDSRVTKTENDITSQGSQVTQLKNDLATTNTNVSKKADAAALTALTNRVTQNEKEIETQSSQTTSLKNSLSTV 3463
Cdd:COG3210  1585 GAQAPTAGNTATLTLSLAEGTNAEYGGTTNVTSGTAGNAGATGANSNTVVTTNGGEGVLALVAGGNTTNGTTLSGAVNGA 1664

                  ..
gi 556195944 3464 QA 3465
Cdd:COG3210  1665 GN 1666
Mplasa_alph_rch TIGR04523
helix-rich Mycoplasma protein; Members of this family occur strictly within a subset of ...
842-1027 1.10e-04

helix-rich Mycoplasma protein; Members of this family occur strictly within a subset of Mycoplasma species. Members average 750 amino acids in length, including signal peptide. Sequences are predicted (Jpred 3) to be almost entirely alpha-helical. These sequences show strong periodicity (consistent with long alpha helical structures) and low complexity rich in D,E,N,Q, and K. Genes encoding these proteins are often found in tandem. The function is unknown.


Pssm-ID: 275316 [Multi-domain]  Cd Length: 745  Bit Score: 48.48  E-value: 1.10e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944   842 KDKITSSELGKELLDEIDSKATQEA--VDNAIGEVQNSVNESIQQVENDLAQTS---SEIKaqvdsvnqSLKEDiDTVNQ 916
Cdd:TIGR04523  383 KQEIKNLESQINDLESKIQNQEKLNqqKDEQIKKLQQEKELLEKEIERLKETIIknnSEIK--------DLTNQ-DSVKE 453
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944   917 TIVDNIDTVNQTINTNISNVNSQIEAAKQSIKDgdaaLSQEIKKAQSSL------TTSLSQTSKDLTAAI------QKET 984
Cdd:TIGR04523  454 LIIKNLDNTRESLETQLKVLSRSINKIKQNLEQ----KQKELKSKEKELkklneeKKELEEKVKDLTKKIsslkekIEKL 529
                          170       180       190       200
                   ....*....|....*....|....*....|....*....|...
gi 556195944   985 NDRIADVNDAAKQAADQLLSAKNELKTsiDSLSEVVTSGDENL 1027
Cdd:TIGR04523  530 ESEKKEKESKISDLEDELNKDDFELKK--ENLEKEIDEKNKEI 570
MA smart00283
Methyl-accepting chemotaxis-like domains (chemotaxis sensory transducer); Thought to undergo ...
853-1045 3.69e-04

Methyl-accepting chemotaxis-like domains (chemotaxis sensory transducer); Thought to undergo reversible methylation in response to attractants or repellants during bacterial chemotaxis.


Pssm-ID: 214599 [Multi-domain]  Cd Length: 262  Bit Score: 45.35  E-value: 3.69e-04
                            10        20        30        40        50        60        70        80
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944    853 ELLDEIDSKATQ-EAVDNAIGEVQNSVNESIQQVE--NDLA----QTSSEIKAQVDSVNQSLKEDIDTVNQTI--VDNID 923
Cdd:smart00283   15 EQAEELEELAERmEELSASIEEVAANADEIAATAQsaAEAAeegrEAVEDAITAMDQIREVVEEAVSAVEELEesSDEIG 94
                            90       100       110       120       130       140       150       160
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944    924 TVNQTIN-----TNISNVNSQIEAAKQSIK-DGDAALSQEIKKaqsslttsLSQTSKDLTAAIQ---KETNDRIADVNDA 994
Cdd:smart00283   95 EIVSVIDdiadqTNLLALNAAIEAARAGEAgRGFAVVADEVRK--------LAERSAESAKEIEsliKEIQEETNEAVAA 166
                           170       180       190       200       210
                    ....*....|....*....|....*....|....*....|....*....|.
gi 556195944    995 AKQAADQLLSAKNELKTSIDSLSEVVTSGDENlARQISQIAAGTGEQFDSL 1045
Cdd:smart00283  167 MEESSSEVEEGVELVEETGDALEEIVDSVEEI-ADLVQEIAAATDEQAAGS 216
CALCOCO1 pfam07888
Calcium binding and coiled-coil domain (CALCOCO1) like; Proteins found in this family are ...
1196-1378 4.17e-04

Calcium binding and coiled-coil domain (CALCOCO1) like; Proteins found in this family are similar to the coiled-coil transcriptional coactivator protein coexpressed by Mus musculus (CoCoA/CALCOCO1). This protein binds to a highly conserved N-terminal domain of p160 coactivators, such as GRIP1, and thus enhances transcriptional activation by a number of nuclear receptors. CALCOCO1 has a central coiled-coil region with three leucine zipper motifs, which is required for its interaction with GRIP1 and may regulate the autonomous transcriptional activation activity of the C-terminal region.


Pssm-ID: 462303 [Multi-domain]  Cd Length: 488  Bit Score: 46.04  E-value: 4.17e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944  1196 LQEEQQARANADTA-EAQARSTLAAQIRGSSESGNLDDIRSGLiyQEKNARITADAAEASAR----ESLQTEFNRNKASV 1270
Cdd:pfam07888   36 LEECLQERAELLQAqEAANRQREKEKERYKRDREQWERQRREL--ESRVAELKEELRQSREKheelEEKYKELSASSEEL 113
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944  1271 AEELHTLSTEQASQASKITGLQ---TSLGQKADASAVQ--TISQKVEEQGNTLKSQGAALSTLDNRVGSVESGVSANSKA 1345
Cdd:pfam07888  114 SEEKDALLAQRAAHEARIRELEediKTLTQRVLERETEleRMKERAKKAGAQRKEEEAERKQLQAKLQQTEEELRSLSKE 193
                          170       180       190
                   ....*....|....*....|....*....|...
gi 556195944  1346 ITGLQSTVTQQDKTLSSQSESITTLNNSLSDIQ 1378
Cdd:pfam07888  194 FQELRNSLAQRDTQVLQLQDTITTLTQKLTTAH 226
COG4372 COG4372
Uncharacterized protein, contains DUF3084 domain [Function unknown];
3629-3771 6.47e-04

Uncharacterized protein, contains DUF3084 domain [Function unknown];


Pssm-ID: 443500 [Multi-domain]  Cd Length: 370  Bit Score: 45.28  E-value: 6.47e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944 3629 ITDAKEAQDSANANASA----LTSLTSRVTNVEGLVTSQASQLSSLTSQVNDASSKVDQmaqtitnnekTQSSLNTSLQS 3704
Cdd:COG4372    26 IAALSEQLRKALFELDKlqeeLEQLREELEQAREELEQLEEELEQARSELEQLEEELEE----------LNEQLQAAQAE 95
                          90       100       110       120       130       140
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*..
gi 556195944 3705 QIDAQASANIKNQtELNNATTSLAAIKSTQQTQATTISALSQQQTNLTAQVGGQSAELQELKKTVVE 3771
Cdd:COG4372    96 LAQAQEELESLQE-EAEELQEELEELQKERQDLEQQRKQLEAQIAELQSEIAEREEELKELEEQLES 161
CBM_4_9 pfam02018
Carbohydrate binding domain; This family includes diverse carbohydrate binding domains.
1389-1507 1.18e-03

Carbohydrate binding domain; This family includes diverse carbohydrate binding domains.


Pssm-ID: 396553  Cd Length: 134  Bit Score: 42.05  E-value: 1.18e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944  1389 SNLLVNASFE-RDLAGWSAGNSVSSIIKASAPHSGSKILVCAAGT-----VQITQSVSVVEGRTYKLSSFVRcttdavIS 1462
Cdd:pfam02018    1 GNLIKNGTFEdGGLDGWKARGGSGKATVDVTSYNGTYSLKVSGRTatwdgQIIDITIRLEKGTTYTVSFWVK------AS 74
                           90       100       110       120       130
                   ....*....|....*....|....*....|....*....|....*....|
gi 556195944  1463 SPGNNKLRI-----GAATLLKEIPIRPENLPKDetWKEVSDTWKATLTGK 1507
Cdd:pfam02018   75 SGPPQTVSVtlqitDASGNYDTVADEKVVLTGE--WTKLEGTFTIPKTAS 122
NTTRR-F1 NF033675
NTTRR-F1 domain; NTTRR-F1 (N-terminal To Repetitive Region - Firmicutes 1) is a homology ...
3236-3277 7.09e-03

NTTRR-F1 domain; NTTRR-F1 (N-terminal To Repetitive Region - Firmicutes 1) is a homology domain found strictly as the N-terminal non-repetitive region of otherwise highly repetitive proteins of various Firmicutes. The repetitive region that follows typically is collagen-like, with every third residue a glycine.


Pssm-ID: 468135  Cd Length: 155  Bit Score: 40.23  E-value: 7.09e-03
                          10        20        30        40
                  ....*....|....*....|....*....|....*....|..
gi 556195944 3236 NLIVNPSFERGTegYTGWSGIATVVTLQVPHLGTKAAKLAAG 3277
Cdd:NF033675    4 NLIVNGGFETGS--LTPWSGVNASITSQFSHSGFYSARLLGG 43
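
Each hit in the list above carries a summary line of the form "Pssm-ID: ...  Cd Length: ...  Bit Score: ...  E-value: ...", followed by paired gi/Cdd alignment rows. The following is a minimal Python sketch, not part of the NCBI output, for reading those lines programmatically. The names parse_hit_summary and aligned_identity are hypothetical, and the identity calculation assumes that lowercase letters and dashes mark columns that are not aligned.

```python
import re

# Pattern for the per-hit summary line; the "[Multi-domain]" flag is optional.
SUMMARY_RE = re.compile(
    r"Pssm-ID:\s*(?P<pssm_id>\d+)"
    r"(?:\s*\[(?P<flag>[^\]]+)\])?"
    r"\s+Cd Length:\s*(?P<cd_length>\d+)"
    r"\s+Bit Score:\s*(?P<bit_score>[\d.]+)"
    r"\s+E-value:\s*(?P<e_value>[\d.eE+-]+)"
)


def parse_hit_summary(line):
    """Return the summary fields as a dict, or None if the line does not match."""
    m = SUMMARY_RE.search(line)
    if m is None:
        return None
    return {
        "pssm_id": int(m.group("pssm_id")),
        "multi_domain": m.group("flag") == "Multi-domain",
        "cd_length": int(m.group("cd_length")),
        "bit_score": float(m.group("bit_score")),
        "e_value": float(m.group("e_value")),
    }


def aligned_identity(query_row, cdd_row):
    """Fraction of identical residues over columns where both rows are uppercase.

    Assumption: lowercase letters and '-' denote unaligned or gapped columns,
    so they are skipped rather than counted as mismatches.
    """
    aligned = identical = 0
    for q, c in zip(query_row, cdd_row):
        if q.isupper() and c.isupper():
            aligned += 1
            identical += q == c
    return identical / aligned if aligned else 0.0


if __name__ == "__main__":
    hit = parse_hit_summary(
        "Pssm-ID: 430529  Cd Length: 75  Bit Score: 77.34  E-value: 1.10e-16"
    )
    print(hit)
```

Only the sequence portion of each gi/Cdd row (without the identifier and the start/end coordinates) should be passed to aligned_identity; splitting those fields off is left to the caller.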
 
Name Accession Description Interval E-value
Phage-tail_3 pfam13550
Putative phage tail protein; This putative domain is found in the large gene transfer agent ...
357-518 5.20e-32

Putative phage tail protein; This putative domain is found in the large gene transfer agent protein. These produce defective phage like particles. This domain is similar to other phage-tail protein families.


Pssm-ID: 433300 [Multi-domain]  Cd Length: 163  Bit Score: 124.35  E-value: 5.20e-32
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944   357 TSRQDAYRLIQDIAGAFRGMVFWAGGMVNIMQDSPsDPVMLFTNANVKDG----LFTYKGSARKDRPSVALITYNNKQDG 432
Cdd:pfam13550    1 DEQMSARDALEPLARAFGFDAVESGGTLRFRPRGV-APVATLTDDDLVDGsdgdPVERTRAAEAELPNAVRLTYTDPAND 79
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944   433 YKQNVEYVEDQEAMaryGERKTEAVAFGCTSRGQAHRVGLWLLYTARMESDMITFTAGLDASFLMPGETVLIQNKYRAGK 512
Cdd:pfam13550   80 YQPATVEARDAAGI---GERVSTVELPLVLSAGQAQRVAQRLLQEARAERETVTFSLPPSYLALEPGDVVELTDDGRAGR 156

                   ....*.
gi 556195944   513 RNSGRI 518
Cdd:pfam13550  157 WRIDRI 162
attach_TipJ_rel NF040662
host specificity factor TipJ family phage tail protein; Members of this family form a family ...
151-532 1.52e-20

host specificity factor TipJ family phage tail protein; Members of this family form a family related to that of host specificity protein J of phage lambda, a tail tip protein that mediates attachment to LamB on the surface of E. coli. Binding of the phage tail to the LamB receptor triggers the injection of phage DNA into host cells. Proteins with this domain are likely also to be phage tail proteins.


Pssm-ID: 468628  Cd Length: 473  Bit Score: 98.51  E-value: 1.52e-20
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944  151 TSVQFKFQLANGNGsfydVIATGESSSDVTLTAKKTGVYYRSYEIQLPKPGRaYKVRVLRLSADSNDQY--LFNDT-WVD 227
Cdd:NF040662   22 VSVEVEYRKVDDNG----VPGGAWTSLYGVGTAATRDTLGRTRRIKLPPPGR-GEVRVRRRRTRDNNSNsrARDEVkWYG 96
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944  228 SIGEIVDTPMNYPNSVLVGLKVNSEQFGSSMPSRsylirglKIRVPSNydENTNTYNGVwDGTFKLLSSSNPAWILFDLL 307
Cdd:NF040662   97 LRAYLPRSPTVYPNVTLLAVRVRATDNLSSQSER-------KLNCIAT--RKLPVYNGG-GGWSDPTPTRSIAFALADLA 166
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944  308 TNARYGLGkfvSESMIDLGQLYQIgrycDEEVDDGFGGkEKRFAINTQITSRQDAyrlIQDIAGAFRGMVFWAGGMVNIM 387
Cdd:NF040662  167 RDPVIGRG---LPDEIDLDTLYAL----DDEVWTGRGD-EFDYTFDDESVSFEEA---LQMIANAGRAEPYRDGGLLSFV 235
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944  388 QDSPSD-PVMLFTNANVKDGLF--TYKGSARKDRPSVALiTYNNKQDGYKQNVEYVEDQEAMarygeRKTEAV-AFGCTS 463
Cdd:NF040662  236 RDEPRTvPGALFNPRNIVPDSFkrSYTMPVEDDYDGVEV-EYVDPDTWKKETVRCRLPGSAG-----RNPKKIeLDGIRN 309
                         330       340       350       360       370       380
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....
gi 556195944  464 RGQAHRVGLWLLYTARMESDMITFTAGLDASFLMPGETVLIQNKYRAGKRnSGRIVSFTKNSITLDAPV 532
Cdd:NF040662  310 RDQAWRRAMREARKLRYQRRSVSFTTELDGLLVNYGDRVAVADDIPGWTQ-SGEVTARDGLTLTTSEPL 377
FN3 COG3401
Fibronectin type 3 domain [General function prediction only];
558-834 5.90e-08

Fibronectin type 3 domain [General function prediction only];


Pssm-ID: 442628 [Multi-domain]  Cd Length: 603  Bit Score: 58.86  E-value: 5.90e-08
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944  558 TGDNITKVTFKTALATADQPVANGVWTITEPDLVPMRARVVAIAQGETPGSFDITVVQNNASKYQAIDNGAALVPENTTV 637
Cdd:COG3401   133 GGAATAGTYALGAGLYGVDGANASGTTASSVAGAGVVVSPDTSATAAVATTSLTVTSTTLVDGGGDIEPGTTYYYRVAAT 212
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944  638 LDPTYSKPSN--LVISEGTYLSSPGNLSVKLM------LAWEGKSPEYWVSWR--RSDEGNVSNWQSARATEEQYEIVNV 707
Cdd:COG3401   213 DTGGESAPSNevSVTTPTTPPSAPTGLTATADtpgsvtLSWDPVTESDATGYRvyRSNSGDGPFTKVATVTTTSYTDTGL 292
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944  708 AENGRYDFQLYSVSFGGKKSEIITAVyQVKGTMTPPGAPTSLTAVGDY-RNVVLNWVNPDSVDLAQINVYASKTNKLDTA 786
Cdd:COG3401   293 TNGTTYYYRVTAVDAAGNESAPSNVV-SVTTDLTPPAAPSGLTATAVGsSSITLSWTASSDADVTGYNVYRSTSGGGTYT 371
                         250       260       270       280
                  ....*....|....*....|....*....|....*....|....*...
gi 556195944  787 TLIAQAATTTFTHAGLGDNETWYYWIRAVNKRGMVGQPNSNLGTEATT 834
Cdd:COG3401   372 KIAETVTTTSYTDTGLTPGTTYYYKVTAVDAAGNESAPSEEVSATTAS 419
HEC1 COG5185
Chromosome segregation protein NDC80, interacts with SMC proteins [Cell cycle control, cell ...
835-1083 9.49e-08

Chromosome segregation protein NDC80, interacts with SMC proteins [Cell cycle control, cell division, chromosome partitioning];


Pssm-ID: 444066 [Multi-domain]  Cd Length: 594  Bit Score: 58.05  E-value: 9.49e-08
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944  835 RDVLSFLKDKITSSELGKELLDEIdsKATQEAVDNAIGEV---QNSVNESIQQVENDLAQTS-----SEIKAQVDSVNQ- 905
Cdd:COG5185   308 KKATESLEEQLAAAEAEQELEESK--RETETGIQNLTAEIeqgQESLTENLEAIKEEIENIVgevelSKSSEELDSFKDt 385
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944  906 --SLKEDIDTVNQTIVDNIDTVNQTINTNISNVNSQIEAAKQSIKDGD---AALSQEIKKAQSSLTTSLSQTSKDLTAAI 980
Cdd:COG5185   386 ieSTKESLDEIPQNQRGYAQEILATLEDTLKAADRQIEELQRQIEQATssnEEVSKLLNELISELNKVMREADEESQSRL 465
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944  981 QKETND-------RIADVNDAAKQAADQLLSAKNELKTSIDSLSEVVTSGDENLARQISQI--AAGTGEQFDSLKIWYFD 1051
Cdd:COG5185   466 EEAYDEinrsvrsKKEDLNEELTQIESRVSTLKATLEKLRAKLERQLEGVRSKLDQVAESLkdFMRARGYAHILALENLI 545
                         250       260       270
                  ....*....|....*....|....*....|....
gi 556195944 1052 QDAEGWT--EDDNGYTPMSVTSDGWLKANNPTST 1083
Cdd:COG5185   546 PASELIQasNAKTDGQAANLRTAVIDELTQYLST 579
ApoLp-III_like cd13769
Apolipophorin-III and similar insect proteins; Exchangeable apolipoproteins play vital roles ...
855-1013 2.75e-07

Apolipophorin-III and similar insect proteins; Exchangeable apolipoproteins play vital roles in the transport of lipids and lipoprotein metabolism. Apolipophorin III (apoLp-III) assists in the loading of diacylglycerol, generated from triacylglycerol stores in the fat body through the action of adipokinetic hormone, into lipophorin, the hemolymph lipoprotein. ApoLp-III increases the lipid carrying capacity of lipophorin by covering the expanding hydrophobic surface resulting from diacylglycerol uptake. It plays a critical role in the transport of lipids during insect flight, and may also play a role in defense mechanisms and innate immunity.


Pssm-ID: 259842 [Multi-domain]  Cd Length: 158  Bit Score: 53.10  E-value: 2.75e-07
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944  855 LDEIDSKAtQEAVDNAIGEVQNSVNESIQQ-VENDLAQTSSEIKAQVDSVNQSLKEDIDTVNQTIVDNIDTVNQTINTNI 933
Cdd:cd13769     3 LSELIQKA-QEAINNLAQQVQKQLGLQNPEeVVNTLKEQSDNFANNLQEVSSSLKEEAKKKQGEVEEAWNEFKTKLSETV 81
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944  934 SNVNSQIEAAKQSikdgdaalsQEIK-KAQSSLTTSLSQtSKDLTAAIQKETNDRIADVNDAAKQAADQLLSAKNELKTS 1012
Cdd:cd13769    82 PELRKSLPVEEKA---------QELQaKLQSGLQTLVTE-SQKLAKAISENSQKAQEELQKATKQAYDIAVEAAQNLQNQ 151

                  .
gi 556195944 1013 I 1013
Cdd:cd13769   152 L 152
MCP_signal cd11386
Methyl-accepting chemotaxis protein (MCP), signaling domain; Methyl-accepting chemotaxis ...
860-1041 1.67e-06

Methyl-accepting chemotaxis protein (MCP), signaling domain; Methyl-accepting chemotaxis proteins (MCPs or chemotaxis receptors) are an integral part of the transmembrane protein complex that controls bacterial chemotaxis, together with the histidine kinase CheA, the receptor-coupling protein CheW, receptor-modification enzymes, and localized phosphatases. MCPs contain a four-helix transmembrane region, an N-terminal periplasmic ligand binding domain, and a C-terminal HAMP domain followed by a cytoplasmic signaling domain. This C-terminal signaling domain dimerizes into a four-helix bundle and interacts with CheA through the adaptor protein CheW.


Pssm-ID: 206779 [Multi-domain]  Cd Length: 200  Bit Score: 51.85  E-value: 1.67e-06
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944  860 SKATQEAVDNAigevqNSVNESIQQVE---NDLAQTSSEIKAQVDSVNQSLKEDIDTVNQ--TIVDNIDTVNQTIN---- 930
Cdd:cd11386     4 SASIEEVAASA-----DQVAETSQQAAelaEKGREAAEDAINQMNQIDESVDEAVSAVEEleESSAEIGEIVEVIDdiae 78
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944  931 -TNISNVNSQIEAA---KQSikDGDAALSQEIKKaqsslttsLSQTSKDLTAAIQK---ETNDRIADVNDAAKQAADQLL 1003
Cdd:cd11386    79 qTNLLALNAAIEAAragEAG--RGFAVVADEVRK--------LAEESAEAAKEIEElieEIQEQTEEAVEAMEETSEEVE 148
                         170       180       190
                  ....*....|....*....|....*....|....*...
gi 556195944 1004 SAKNELKTSIDSLSEVVTSGDEnLARQISQIAAGTGEQ 1041
Cdd:cd11386   149 EGVELVEETGRAFEEIVASVEE-VADGIQEISAATQEQ 185
FN3 COG3401
Fibronectin type 3 domain [General function prediction only];
518-824 1.92e-06

Fibronectin type 3 domain [General function prediction only];


Pssm-ID: 442628 [Multi-domain]  Cd Length: 603  Bit Score: 53.85  E-value: 1.92e-06
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944  518 IVSFTKNSITLDAPVSLKKSGSFIRIINQEGKIVERDINETGDNITKVTFKTALATADQPVANGVWTITEPDLVPMRARV 597
Cdd:COG3401     3 SSYLTSLDAGIAASAAANTAVNALSKAGGSGKTILVYLAVVLSVTTKESPGTLLVAAGLSSGGGLGTGGRAGTTSGVAAV 82
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944  598 VAIAQGETPGSFDITVVQNN-----ASKYQAIDNGAALVPENTTVLDPTYSKPSNLVISEGTYLSSPGNLSVKLMLAWEG 672
Cdd:COG3401    83 AVAAAPPTATGLTTLTGSGSvggatNTGLTSSDEVPSPAVGTATTATAVAGGAATAGTYALGAGLYGVDGANASGTTASS 162
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944  673 KSPEYWVSWRRSDEGNVSNWQSARATEEQYEIVNVAENGRYDFQLYSVSFGGKKSEIITAVYQVKGTMTPPGAPTSLTAV 752
Cdd:COG3401   163 VAGAGVVVSPDTSATAAVATTSLTVTSTTLVDGGGDIEPGTTYYYRVAATDTGGESAPSNEVSVTTPTTPPSAPTGLTAT 242
                         250       260       270       280       290       300       310
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|...
gi 556195944  753 GDYRN-VVLNWVNPDSVDLAQINVYASKTNKlDTATLIAQAATTTFTHAGLGDNETWYYWIRAVNKRGMVGQP 824
Cdd:COG3401   243 ADTPGsVTLSWDPVTESDATGYRVYRSNSGD-GPFTKVATVTTTSYTDTGLTNGTTYYYRVTAVDAAGNESAP 314
SF-assemblin pfam06705
SF-assemblin/beta giardin; This family consists of several eukaryotic SF-assemblin and related ...
848-1006 1.02e-05

SF-assemblin/beta giardin; This family consists of several eukaryotic SF-assemblin and related beta giardin proteins. During mitosis the SF-assemblin-based cytoskeleton is reorganized; it divides in prophase and is reduced to two dot-like structures at each spindle pole in metaphase. During anaphase, the two dots present at each pole are connected again. In telophase there is an asymmetrical outgrowth of new fibres. It has been suggested that SF-assemblin is involved in re-establishing the microtubular root system characteristic of interphase cells after mitosis.


Pssm-ID: 284187 [Multi-domain]  Cd Length: 247  Bit Score: 49.93  E-value: 1.02e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944   848 SELGKELLDEIDSKATQEA-----VDNAIGEVQNSVNESIQQVENDLAQTSSEIKAQVDSVNQSLKEDIDTVNQTIVDNI 922
Cdd:pfam06705   15 SGFHDKMENEIEVKRVDEDtrvkmIKEAIAHLEKLIQTESKKRQESFEDIQEEFKKEIDNMQETIKEEIDDMAANFRKAL 94
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944   923 DTVNQTINTNISNVNSQIEAAKQSIK-----------DGDAALSQEI---KKAQSSLTTSLSQTSKDLTAAIQKETNDRI 988
Cdd:pfam06705   95 AELNDTINNVETNLQNEIAIHNDAIEalrkealkslnDLETGIATENaerKKMYDQLNKKVAEGFARISAAIDTEKNARD 174
                          170
                   ....*....|....*...
gi 556195944   989 ADVNDAAKQAADQLLSAK 1006
Cdd:pfam06705  175 SAVSAATTELTNTKLVEK 192
DUF3672 pfam12421
Fibronectin type III protein; This domain family is found in bacteria and viruses, and is ...
3872-3948 1.14e-05

Fibronectin type III protein; This domain family is found in bacteria and viruses, and is typically between 126 and 146 amino acids in length. The family is found in association with pfam09327 and pfam00041. There are two completely conserved G residues that may be functionally important. Many of the proteins in this family are annotated as fibronectin type III; however, there is little accompanying literature to confirm this.


Pssm-ID: 289206  Cd Length: 133  Bit Score: 47.65  E-value: 1.14e-05
                           10        20        30        40        50        60        70
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*..
gi 556195944  3872 VGKDGSSQFHNVIVRGHVEAESGSFKGTIDATDGVFRGTVQASRFVGDICSAGVFKQGVRPDITHYDSGSVGGTKTY 3948
Cdd:pfam12421    1 LTPDGHLTAKNGDFRGSINANSGTLNNVTIAENCTISGTLRAEKILGDIVKAGVWEFPYVREPASSNHRYFSGTLTV 77
EnvC COG4942
Septal ring factor EnvC, activator of murein hydrolases AmiA and AmiB [Cell cycle control, ...
876-1036 2.09e-05

Septal ring factor EnvC, activator of murein hydrolases AmiA and AmiB [Cell cycle control, cell division, chromosome partitioning];


Pssm-ID: 443969 [Multi-domain]  Cd Length: 377  Bit Score: 50.15  E-value: 2.09e-05
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944  876 NSVNESIQQVENDLAQTSSEIKAQVDSVnQSLKEDIDTVNQTIvDNIDTVNQTINTNISNVNSQIEAAKQSIKDGDAALS 955
Cdd:COG4942    30 EQLQQEIAELEKELAALKKEEKALLKQL-AALERRIAALARRI-RALEQELAALEAELAELEKEIAELRAELEAQKEELA 107
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944  956 QEIKKAQ-----SSLTTSLSQTS-----------KDLTAAIQ------KETNDRIADVNDAAKQAADQLLSAKNELKTSI 1013
Cdd:COG4942   108 ELLRALYrlgrqPPLALLLSPEDfldavrrlqylKYLAPARReqaeelRADLAELAALRAELEAERAELEALLAELEEER 187
                         170       180
                  ....*....|....*....|...
gi 556195944 1014 DSLSEVVTSGDENLARQISQIAA 1036
Cdd:COG4942   188 AALEALKAERQKLLARLEKELAE 210
Apolipoprotein pfam01442
Apolipoprotein A1/A4/E domain; These proteins contain several 22 residue repeats which form a ...
865-1032 2.10e-05

Apolipoprotein A1/A4/E domain; These proteins contain several 22-residue repeats that form a pair of alpha helices. This family includes apolipoprotein A-I, apolipoprotein A-IV, and apolipoprotein E.


Pssm-ID: 460211 [Multi-domain]  Cd Length: 175  Bit Score: 48.03  E-value: 2.10e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944   865 EAVDNAIGEVQNSVN----ESIQQVENDLAQTSSEIKAQVDSVNQSLKEDIDTVNQTIVDNIDTVNQTINTNISNVNSQI 940
Cdd:pfam01442    7 DELSTYAEELQEQLGpvaqELVDRLEKETEALRERLQKDLEEVRAKLEPYLEELQAKLGQNVEELRQRLEPYTEELRKRL 86
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944   941 EAAKQSIKDGDAALSQEIKKAQSSLTTSLSQTSKDLTAAIQKETNDRIADVNDAAKQAADQLlsaKNELKTSIDSLSEVV 1020
Cdd:pfam01442   87 NADAEELQEKLAPYGEELRERLEQNVDALRARLAPYAEELRQKLAERLEELKESLAPYAEEV---QAQLSQRLQELREKL 163
                          170
                   ....*....|..
gi 556195944  1021 TSGDENLARQIS 1032
Cdd:pfam01442  164 EPQAEDLREKLD 175
CBM_4_9 pfam02018
Carbohydrate binding domain; This family includes diverse carbohydrate binding domains.
3235-3345 2.29e-05

Carbohydrate binding domain; This family includes diverse carbohydrate binding domains.


Pssm-ID: 396553  Cd Length: 134  Bit Score: 47.06  E-value: 2.29e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944  3235 GNLIVNPSFE-RGTEGYTGWSGIATVVTLQVPHLGTKAAKLAAGGSAGVGQ----KISFKKDRSYKIGIWAKQDpnttiq 3309
Cdd:pfam02018    1 GNLIKNGTFEdGGLDGWKARGGSGKATVDVTSYNGTYSLKVSGRTATWDGQiidiTIRLEKGTTYTVSFWVKAS------ 74
                           90       100       110
                   ....*....|....*....|....*....|....*.
gi 556195944  3310 STDNTKFRVAEGnGLIASKAYGPFTSNWQEVSWTWK 3345
Cdd:pfam02018   75 SGPPQTVSVTLQ-ITDASGNYDTVADEKVVLTGEWT 109
FhaB COG3210
Large exoprotein involved in heme utilization or adhesion [Intracellular trafficking, ...
1874-3465 2.62e-05

Large exoprotein involved in heme utilization or adhesion [Intracellular trafficking, secretion, and vesicular transport];


Pssm-ID: 442443 [Multi-domain]  Cd Length: 1698  Bit Score: 50.54  E-value: 2.62e-05
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944 1874 VNADTTITMAPGSRVFDDTGAVYVNGVRVAFGNASWNTVSFDLKAGWSTVEFLVNQWTGQAYINLGFKLSEKVAQLNSAL 1953
Cdd:COG3210    77 STGGIGAAAANTAGTLETGLTSNIGGGSVNGSNSTGNGTLTTTAASATTGNNTGGTTTSSTNTVTTLGGTTTGNTVLSTS 156
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944 1954 GMNALSNAISAVTSNVSTVGDRVTSTSQSVTDLRNSLEQTNANLENKADAQALSTLQNTVSKQRDTISSQGNSITNLNNT 2033
Cdd:COG3210   157 GAGNNTNTNNSSSGTNIGNSIPTTGGSLNVVAANPTGVTGVGGALINATAGVLANAGGGTAGGVASANSTLTGGVVAAGT 236
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944 2034 LTAARNAGDNLIPNYDFLQGATAWDTQYPAGVTFGNFGDGKAGVKLNRTTTTSPGIFSNNNKPLPLNGQRK--------- 2104
Cdd:COG3210   237 GAGVISTGGTDISSLSVAAGAGTGGAGGTGNAGNTTIGTTVTGTNATGSNTAGASSGDTTTNGTSSVTGAGgtgvlgggt 316
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944 2105 -YRVVVKAKGVSGAMNMLIRRQNKIGQTDSNYEDKNVTLTSEWQTITWETGLTASNADGQNFKLYAHPANAEIWVDTFKV 2183
Cdd:COG3210   317 aAGITTTNTVGGNGDGNNTTANSGAGLVSGGTGGNNGTTGTGAGSGLTGTGNGGGLTTAGAGTVASTVGTATASTGNASS 396
                         330       340       350       360       370       380       390       400
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944 2184 FDITDEVKIKANSDALSTLSSTVTQQGDKITSQGNSITKLTNDLEAADANIAKKADQSAVTTLTGRVEKTESGLTAANSN 2263
Cdd:COG3210   397 TTVLGSGSLATGNTGTTIAGNGGSANAGGFTTTGGVLGITGNGTVTGGTIGGLTGSGTTNGAGLSGNTDVSGTGTVTNSA 476
                         410       420       430       440       450       460       470       480
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944 2264 ITSLSSSLNQQSKRGANLLPDGTFESYAVGHNLSNNRVIVTTDDSHGGNKCIRVTRPNDYNANATDNSDNHIFSGFQVRD 2343
Cdd:COG3210   477 GNTTSATTLAGGGIGTVTTNATISNNAGGDANGIATGLTGITAGGGGGGNATSGGTGGDGTTLSGSGLTTTVSGGASGTT 556
                         490       500       510       520       530       540       550       560
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944 2344 NAVFYMecWVKLDAKSTAMAENAQISIGLSLQYQDNSWQWPAVTKAAKDLSTAQWTKVSGYLKSTKSGIKQAMVRISIPN 2423
Cdd:COG3210   557 AASGSN--TANTLGVLAATGGTSNATTAGNSTSATGGTGTNSGGTVLSIGTGSAGATGTITLGAGTSGAGANATGGGAGL 634
                         570       580       590       600       610       620       630       640
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944 2424 VSSVKAGNSFLIDDLVITEVTDAYNAQSTADANANAISTLDSTVSQQGDQITSQGNSITKLTNDLSTTNSNVAKKADGAA 2503
Cdd:COG3210   635 TGSAVGAALSGTGSGTTGTASANGSNTTGVNTAGGTGGGTTGTVTSGATGGTTGTTLNAATGGTLNNAGNTLTISTGSIT 714
                         650       660       670       680       690       700       710       720
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944 2504 VTALTnrvteaeGNISSQSNQLATLSNSLAEGSLICNGGLNVDASFWEDSGPGSAFTYDANEKAIRTTTGSirvANLTRI 2583
Cdd:COG3210   715 VTGQI-------GALANANGDTVTFGNLGTGATLTLNAGVTITSGNAGTLSIGLTANTTASGTTLTLANAN---GNTSAG 784
                         730       740       750       760       770       780       790       800
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944 2584 PVEAETTLTISFEMKASETISNVSSDSVGVIADLATPTNWISSVSPWLGGVSTNWQTKSVELTIPANFIGKYVYLRFAAG 2663
Cdd:COG3210   785 ATLDNAGAEISIDITADGTITAAGTTAINVTGSGGTITINTATTGLTGTGDTTSGAGGSNTTDTTTGTTSDGASGGGTAG 864
                         810       820       830       840       850       860       870       880
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944 2664 GWTPSNSARLYIRKVDVFSSTGVAKKANATAVSDLTSRVDSTEGKLASQSQALTKLQNDLATTNNNVSKKADANALTALT 2743
Cdd:COG3210   865 ANSGSLAATAASITVGSGGVATSTGTANAGTLTNLGTTTNAASGNGAVLATVTATGTGGGGLTGGNAAAGGTGAGNGTTA 944
                         890       900       910       920       930       940       950       960
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944 2744 NRVTQTEKDINSTSSSVTNLNNKVDAISVGGTNLIKNSGDMTGWSNVVSDTYRGNAVIGATVKAGSGYRDLREITLESPV 2823
Cdd:COG3210   945 LSGTQGNAGLSAASASDGAGDTGASSAAGSSAVGTSANSAGSTGGVIAATGILVAGNSGTTASTTGGSGAIVAGGNGVTG 1024
                         970       980       990      1000      1010      1020      1030      1040
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944 2824 DAGEYVYSFYAKGGVAGQTMTAFFYNPNTTTSIETSQGAKGNNSDGRAQFTLTTSWARYWVKWKQTPTTGTKRLILCRIE 2903
Cdd:COG3210  1025 TTGTASATGTGTAATAGGQNGVGVNASGISGGNAAALTASGTAGTTGGTAASNGGGGTAQASGAGTTHTLGGITNGGATG 1104
                        1050      1060      1070      1080      1090      1100      1110      1120
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944 2904 SNTSKDQTVYINSPKFEVGNVVSDWNESPSDSASASAVDSLTTKVNQQGTSISSIGNRTTSLENGLSTAQNNIAKKADAS 2983
Cdd:COG3210  1105 TSGGTTTSTGGVTASKVGGTTTVGATGTSTASTEAAGAGTLTGLVAVSAVAGGASSASAGDTTAVAAATTTTTGSAINGG 1184
                        1130      1140      1150      1160      1170      1180      1190      1200
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944 2984 ALQDLRNTVTSQGDDLTAANSSITSLQASMNRRTVFTVTARGNGNSVTHGVFDESGKNLFTPGRSWALVTFAKHSDGSTV 3063
Cdd:COG3210  1185 ADSAATEGTAGTDLKGGDSTGGSTTTIGTTNVTTTTTLTASDTGNTTATGGSSAGQTGSFVAAGSASGTGDATTGATAGA 1264
                        1210      1220      1230      1240      1250      1260      1270      1280
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944 3064 IATSKTYDVFGSANNGATMSADIEALASGTYVCVLTFDEPTGNRGKVLSALESLGGTSEVVNSLPYRGAYILLGRKGMKP 3143
Cdd:COG3210  1265 VSNGATSTVAGNAGATATGSTVDIGSTSATSAGGSLDTTGNTAGANGATVGTGIGGTTATGTAVAAVNSGGVNAGGGTIN 1344
                        1290      1300      1310      1320      1330      1340      1350      1360
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944 3144 GDGLELRAPTGGDATAHISTSVEFVNGVMMGLGAAGGVMMKADANASAITTLQNTVKTQGDNIDSLSSSTTALENSLASS 3223
Cdd:COG3210  1345 TTAANTGLNGGNGATDSAAGAGSGGAAGSLAATAGAGTVLTGAGNNTGAEGTNAGRDGGVTTSGTGVGNNGGVSGTTVAG 1424
                        1370      1380      1390      1400      1410      1420      1430      1440
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944 3224 NASVDAASQIPGNLIVNPSFERGTEGYTGWSGIATVVTLQVPHLGTKAAKLAAGGSAGVGQKISFKKDRSYKIGIWAKQD 3303
Cdd:COG3210  1425 TTGSSATTGTGGTGNTTGTSVAGAGGGNADASAINTGNASSLGAGGSTAGNAVGGAVIGGTTTGGNGAGVAGATASNGGT 1504
                        1450      1460      1470      1480      1490      1500      1510      1520
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944 3304 PNTTIQSTDNTKFRVAEGNGLIASKAYGPFTSNWQEVSWTWKATKDVVADVQFTAFLSAGAMYFDDFYVVDVTDSVETQA 3383
Cdd:COG3210  1505 STGAGGTAGGTTAEVAKASLEGGEGTYGGSSVAEAGTGGGILGAVSGAGSEGGAAGGVTGSVGVGGTDGAGGDTGGADDT 1584
                        1530      1540      1550      1560      1570      1580      1590      1600
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944 3384 NSSAITNLDSRVTKTENDITSQGSQVTQLKNDLATTNTNVSKKADAAALTALTNRVTQNEKEIETQSSQTTSLKNSLSTV 3463
Cdd:COG3210  1585 GAQAPTAGNTATLTLSLAEGTNAEYGGTTNVTSGTAGNAGATGANSNTVVTTNGGEGVLALVAGGNTTNGTTLSGAVNGA 1664

                  ..
gi 556195944 3464 QA 3465
Cdd:COG3210  1665 GN 1666
ApoLp-III pfam07464
Apolipophorin-III precursor (apoLp-III); This family consists of several insect ...
848-1002 5.28e-05

Apolipophorin-III precursor (apoLp-III); This family consists of several insect apolipoprotein-III sequences. Exchangeable apolipoproteins constitute a functionally important family of proteins that play critical roles in lipid transport and lipoprotein metabolism. Apolipophorin III (apoLp-III) is a prototypical exchangeable apolipoprotein found in many insect species that functions in transport of diacylglycerol (DAG) from the fat body lipid storage depot to flight muscles in the adult life stage.


Pssm-ID: 462172 [Multi-domain]  Cd Length: 143  Bit Score: 46.21  E-value: 5.28e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944   848 SELGKELLDEIDSKATQEAVDnaigEVQNSVNESIQQVENDLAQTSSEIKAQVDSVNQSLKEDIDTVNQTIvdnidtvnQ 927
Cdd:pfam07464    3 EELQQSVQKQLGLPSQQEVVE----TIKENTENLVDQLKQVQKSLQEELKKASGEAEEALKELNTKIVETA--------D 70
                           90       100       110       120       130       140       150
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*
gi 556195944   928 TINTNISNVNSQIEAAKQSIKDGDAALSQEIKKAQSSLTTSLSQTSKDLTAAIQKETNdriaDVNDAAKQAADQL 1002
Cdd:pfam07464   71 KLSEANPEVVQKANELQEKFQSGVQSLVTESQKLAKSISENSQGATEKLQKATKQAYD----DAVQAAQKLANQL 141
Mplasa_alph_rch TIGR04523
helix-rich Mycoplasma protein; Members of this family occur strictly within a subset of ...
842-1027 1.10e-04

helix-rich Mycoplasma protein; Members of this family occur strictly within a subset of Mycoplasma species. Members average 750 amino acids in length, including signal peptide. Sequences are predicted (Jpred 3) to be almost entirely alpha-helical. These sequences show strong periodicity (consistent with long alpha-helical structures) and low complexity, rich in D, E, N, Q, and K. Genes encoding these proteins are often found in tandem. The function is unknown.


Pssm-ID: 275316 [Multi-domain]  Cd Length: 745  Bit Score: 48.48  E-value: 1.10e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944   842 KDKITSSELGKELLDEIDSKATQEA--VDNAIGEVQNSVNESIQQVENDLAQTS---SEIKaqvdsvnqSLKEDiDTVNQ 916
Cdd:TIGR04523  383 KQEIKNLESQINDLESKIQNQEKLNqqKDEQIKKLQQEKELLEKEIERLKETIIknnSEIK--------DLTNQ-DSVKE 453
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944   917 TIVDNIDTVNQTINTNISNVNSQIEAAKQSIKDgdaaLSQEIKKAQSSL------TTSLSQTSKDLTAAI------QKET 984
Cdd:TIGR04523  454 LIIKNLDNTRESLETQLKVLSRSINKIKQNLEQ----KQKELKSKEKELkklneeKKELEEKVKDLTKKIsslkekIEKL 529
                          170       180       190       200
                   ....*....|....*....|....*....|....*....|...
gi 556195944   985 NDRIADVNDAAKQAADQLLSAKNELKTsiDSLSEVVTSGDENL 1027
Cdd:TIGR04523  530 ESEKKEKESKISDLEDELNKDDFELKK--ENLEKEIDEKNKEI 570
Mplasa_alph_rch TIGR04523
helix-rich Mycoplasma protein; Members of this family occur strictly within a subset of ...
825-1027 3.24e-04

helix-rich Mycoplasma protein; Members of this family occur strictly within a subset of Mycoplasma species. Members average 750 amino acids in length, including signal peptide. Sequences are predicted (Jpred 3) to be almost entirely alpha-helical. These sequences show strong periodicity (consistent with long alpha-helical structures) and low complexity, rich in D, E, N, Q, and K. Genes encoding these proteins are often found in tandem. The function is unknown.


Pssm-ID: 275316 [Multi-domain]  Cd Length: 745  Bit Score: 46.94  E-value: 3.24e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944   825 NSNLGTEATTRDV-LSFLKDKITSSElgKELLDEIDSKA-TQEAVDNAIGEVQNSvNESIQQVENDLaqtsSEIKAQVDS 902
Cdd:TIGR04523  227 NNQLKDNIEKKQQeINEKTTEISNTQ--TQLNQLKDEQNkIKKQLSEKQKELEQN-NKKIKELEKQL----NQLKSEISD 299
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944   903 VNQSLKEDIDTVNQTIVDNIDTVNQTINTNISNVNSQIEAAKQSIKDgdaalsqeIKKAQSSLTTSLSQTSKDLtaaiqK 982
Cdd:TIGR04523  300 LNNQKEQDWNKELKSELKNQEKKLEEIQNQISQNNKIISQLNEQISQ--------LKKELTNSESENSEKQREL-----E 366
                          170       180       190       200
                   ....*....|....*....|....*....|....*....|....*...
gi 556195944   983 ETNDRIADV---NDAAKQAADQLLSAKNELKTSIDSLSEVVTSGDENL 1027
Cdd:TIGR04523  367 EKQNEIEKLkkeNQSYKQEIKNLESQINDLESKIQNQEKLNQQKDEQI 414
ApoLp-III_like cd13769
Apolipophorin-III and similar insect proteins; Exchangeable apolipoproteins play vital roles ...
906-1041 3.48e-04

Apolipophorin-III and similar insect proteins; Exchangeable apolipoproteins play vital roles in the transport of lipids and lipoprotein metabolism. Apolipophorin III (apoLp-III) assists in the loading of diacylglycerol, generated from triacylglycerol stores in the fat body through the action of adipokinetic hormone, into lipophorin, the hemolymph lipoprotein. ApoLp-III increases the lipid carrying capacity of lipophorin by covering the expanding hydrophobic surface resulting from diacylglycerol uptake. It plays a critical role in the transport of lipids during insect flight, and may also play a role in defense mechanisms and innate immunity.


Pssm-ID: 259842 [Multi-domain]  Cd Length: 158  Bit Score: 43.85  E-value: 3.48e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944  906 SLKEDIDTVNQTIVD------------NIDTVNQTINTNISNVNSQIEAAKQSIKDgdaalsqEIKKAQSSLTTSLSQTS 973
Cdd:cd13769     2 QLSELIQKAQEAINNlaqqvqkqlglqNPEEVVNTLKEQSDNFANNLQEVSSSLKE-------EAKKKQGEVEEAWNEFK 74
                          90       100       110       120       130       140
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*...
gi 556195944  974 KDLTaaiqkETNDRIADVNDAAKQAADQllsaKNELKTSIDSLSEVVtsgdENLARQISQIAAGTGEQ 1041
Cdd:cd13769    75 TKLS-----ETVPELRKSLPVEEKAQEL----QAKLQSGLQTLVTES----QKLAKAISENSQKAQEE 129
MA smart00283
Methyl-accepting chemotaxis-like domains (chemotaxis sensory transducer); Thought to undergo ...
853-1045 3.69e-04

Methyl-accepting chemotaxis-like domains (chemotaxis sensory transducer); Thought to undergo reversible methylation in response to attractants or repellants during bacterial chemotaxis.


Pssm-ID: 214599 [Multi-domain]  Cd Length: 262  Bit Score: 45.35  E-value: 3.69e-04
                            10        20        30        40        50        60        70        80
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944    853 ELLDEIDSKATQ-EAVDNAIGEVQNSVNESIQQVE--NDLA----QTSSEIKAQVDSVNQSLKEDIDTVNQTI--VDNID 923
Cdd:smart00283   15 EQAEELEELAERmEELSASIEEVAANADEIAATAQsaAEAAeegrEAVEDAITAMDQIREVVEEAVSAVEELEesSDEIG 94
                            90       100       110       120       130       140       150       160
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944    924 TVNQTIN-----TNISNVNSQIEAAKQSIK-DGDAALSQEIKKaqsslttsLSQTSKDLTAAIQ---KETNDRIADVNDA 994
Cdd:smart00283   95 EIVSVIDdiadqTNLLALNAAIEAARAGEAgRGFAVVADEVRK--------LAERSAESAKEIEsliKEIQEETNEAVAA 166
                           170       180       190       200       210
                    ....*....|....*....|....*....|....*....|....*....|.
gi 556195944    995 AKQAADQLLSAKNELKTSIDSLSEVVTSGDENlARQISQIAAGTGEQFDSL 1045
Cdd:smart00283  167 MEESSSEVEEGVELVEETGDALEEIVDSVEEI-ADLVQEIAAATDEQAAGS 216
COG4372 COG4372
Uncharacterized protein, contains DUF3084 domain [Function unknown];
846-1062 4.09e-04

Uncharacterized protein, contains DUF3084 domain [Function unknown];


Pssm-ID: 443500 [Multi-domain]  Cd Length: 370  Bit Score: 46.05  E-value: 4.09e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944  846 TSSELGKELLDEIDskATQEAVDNAIGEVQNSVNEsIQQVENDLAQTSSEIKA---QVDSVNQSLKEDIDTVNQTiVDNI 922
Cdd:COG4372    28 ALSEQLRKALFELD--KLQEELEQLREELEQAREE-LEQLEEELEQARSELEQleeELEELNEQLQAAQAELAQA-QEEL 103
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944  923 DTVNQ----------TINTNISNVNSQIEAAKQSIKDGDAALSQ---EIKKAQSSLTtslsqtskDLTAAIQKETNDRIA 989
Cdd:COG4372   104 ESLQEeaeelqeeleELQKERQDLEQQRKQLEAQIAELQSEIAEreeELKELEEQLE--------SLQEELAALEQELQA 175
                         170       180       190       200       210       220       230
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|...
gi 556195944  990 DVNDAAKQAADQLLSAKNELKTSIDSLSEvvtsgDENLARQISQIAAGTGEQFDSLKIWYFDQDAEGWTEDDN 1062
Cdd:COG4372   176 LSEAEAEQALDELLKEANRNAEKEEELAE-----AEKLIESLPRELAEELLEAKDSLEAKLGLALSALLDALE 243
CALCOCO1 pfam07888
1196-1378 4.17e-04

Calcium binding and coiled-coil domain (CALCOCO1) like; Proteins found in this family are similar to the coiled-coil transcriptional coactivator protein coexpressed by Mus musculus (CoCoA/CALCOCO1). This protein binds to a highly conserved N-terminal domain of p160 coactivators, such as GRIP1, and thus enhances transcriptional activation by a number of nuclear receptors. CALCOCO1 has a central coiled-coil region with three leucine zipper motifs, which is required for its interaction with GRIP1 and may regulate the autonomous transcriptional activation activity of the C-terminal region.


Pssm-ID: 462303 [Multi-domain]  Cd Length: 488  Bit Score: 46.04  E-value: 4.17e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944  1196 LQEEQQARANADTA-EAQARSTLAAQIRGSSESGNLDDIRSGLiyQEKNARITADAAEASAR----ESLQTEFNRNKASV 1270
Cdd:pfam07888   36 LEECLQERAELLQAqEAANRQREKEKERYKRDREQWERQRREL--ESRVAELKEELRQSREKheelEEKYKELSASSEEL 113
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944  1271 AEELHTLSTEQASQASKITGLQ---TSLGQKADASAVQ--TISQKVEEQGNTLKSQGAALSTLDNRVGSVESGVSANSKA 1345
Cdd:pfam07888  114 SEEKDALLAQRAAHEARIRELEediKTLTQRVLERETEleRMKERAKKAGAQRKEEEAERKQLQAKLQQTEEELRSLSKE 193
                          170       180       190
                   ....*....|....*....|....*....|...
gi 556195944  1346 ITGLQSTVTQQDKTLSSQSESITTLNNSLSDIQ 1378
Cdd:pfam07888  194 FQELRNSLAQRDTQVLQLQDTITTLTQKLTTAH 226
COG4372 COG4372
3629-3771 6.47e-04

Uncharacterized protein, contains DUF3084 domain [Function unknown];


Pssm-ID: 443500 [Multi-domain]  Cd Length: 370  Bit Score: 45.28  E-value: 6.47e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944 3629 ITDAKEAQDSANANASA----LTSLTSRVTNVEGLVTSQASQLSSLTSQVNDASSKVDQmaqtitnnekTQSSLNTSLQS 3704
Cdd:COG4372    26 IAALSEQLRKALFELDKlqeeLEQLREELEQAREELEQLEEELEQARSELEQLEEELEE----------LNEQLQAAQAE 95
                          90       100       110       120       130       140
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*..
gi 556195944 3705 QIDAQASANIKNQtELNNATTSLAAIKSTQQTQATTISALSQQQTNLTAQVGGQSAELQELKKTVVE 3771
Cdd:COG4372    96 LAQAQEELESLQE-EAEELQEELEELQKERQDLEQQRKQLEAQIAELQSEIAEREEELKELEEQLES 161
ATG17_like pfam04108
832-1022 1.06e-03

Autophagy protein ATG17-like domain; This domain is found in the autophagy-related proteins ATG17 and ATG11, conserved across eukaryotes. ATG17 forms a complex with ATG29 and ATG31, critical for both PAS (preautophagosomal structure) formation and autophagy. Together with ATG13, it is required for ATG1 kinase activation. ATG11 is a scaffold protein required for the cytoplasm-to-vacuole targeting (Cvt) pathway during starvation and to recruit ATG proteins to the pre-autophagosome. It is also required for ATG1 kinase activation. In many eukaryotes, ATG11 (the orthologue in mammals is RB1-inducible coiled-coil protein 1 (RB1CC1) and in S. pombe is Taz1-interacting factor 1 (taf1)) is essential for bulk autophagy, except in S. cerevisiae. ATG17 and ATG11 are large, similar proteins, both predicted to be almost entirely helical, containing conserved coiled-coil regions and lacking obvious functional motifs.


Pssm-ID: 427715 [Multi-domain]  Cd Length: 360  Bit Score: 44.69  E-value: 1.06e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944   832 ATTRDVLSFLKDKITSS----ELGKELLDEIDS--KATQEAVDNAigevqNSVNESIQQVENDLAQTSSEIKAQVDSVNQ 905
Cdd:pfam04108    3 SSAQDLCRWANELLTDArsllEELVVLLAKIAFlrRGLSVQLANL-----EKVREGLEKVLNELKKDFKQLLKDLDAALE 77
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944   906 SLKEDIDTVNQTIVDNIDTVNQTINTNISN-VN-SQIEAAKQSIKDgdaaLSQEIKKAQSSLTTSLSQTSKDLTaAIQKE 983
Cdd:pfam04108   78 RLEETLDKLRNTPVEPALPPGEEKQKTLLDfIDeDSVEILRDALKE----LIDELQAAQESLDSDLKRFDDDLR-DLQKE 152
                          170       180       190
                   ....*....|....*....|....*....|....*....
gi 556195944   984 TNDriADVNDAAKQAADQLLSAKNELKTSIDSLSEVVTS 1022
Cdd:pfam04108  153 LES--LSSPSESISLIPTLLKELESLEEEMASLLESLTN 189
CBM_4_9 pfam02018
1389-1507 1.18e-03

Carbohydrate binding domain; This family includes diverse carbohydrate binding domains.


Pssm-ID: 396553  Cd Length: 134  Bit Score: 42.05  E-value: 1.18e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944  1389 SNLLVNASFE-RDLAGWSAGNSVSSIIKASAPHSGSKILVCAAGT-----VQITQSVSVVEGRTYKLSSFVRcttdavIS 1462
Cdd:pfam02018    1 GNLIKNGTFEdGGLDGWKARGGSGKATVDVTSYNGTYSLKVSGRTatwdgQIIDITIRLEKGTTYTVSFWVK------AS 74
                           90       100       110       120       130
                   ....*....|....*....|....*....|....*....|....*....|
gi 556195944  1463 SPGNNKLRI-----GAATLLKEIPIRPENLPKDetWKEVSDTWKATLTGK 1507
Cdd:pfam02018   75 SGPPQTVSVtlqitDASGNYDTVADEKVVLTGE--WTKLEGTFTIPKTAS 122
ClyA_Cry6Aa-like cd22656
856-997 1.35e-03

Bacillus thuringiensis crystal 6Aa (Cry6Aa) toxin, and similar proteins; This model includes pesticidal Cry6Aa toxin from Bacillus thuringiensis, one of the many parasporal crystal (Cry) toxins produced during the sporulation phase of growth. Many of these proteins are toxic to numerous insect species and have been effectively used as proteinaceous insecticides to directly kill insect pests; some have been used to control insect growth on transgenic agricultural plants. Cry6Aa exists as a protoxin, which is activated by cleavage using trypsin. Structure studies for Cry6Aa support a mechanism of action by pore formation, similar to cytolysin A (ClyA)-type alpha pore-forming toxins (alpha-PFTs) such as HblB, and bioassay and mutation studies show that Cry6Aa is an active pore-forming toxin. Cry6Aa shows atypical features compared to other members of alpha-PFTs, including internal repeat sequences and small loop regions within major alpha helices.


Pssm-ID: 439154 [Multi-domain]  Cd Length: 309  Bit Score: 43.90  E-value: 1.35e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944  856 DEIDSKATQEAVDNAIG-EVQNSVNESIQQVENDLAQTSSEIKAQVDSVNQSLKEDIDTVNQTIVDNIDTVN--QTINTN 932
Cdd:cd22656   154 DQTALETLEKALKDLLTdEGGAIARKEIKDLQKELEKLNEEYAAKLKAKIDELKALIADDEAKLAAALRLIAdlTAADTD 233
                          90       100       110       120       130       140
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*..
gi 556195944  933 ISNVNSQIEAAKQSIkdgdaalsQEIKKAQSSLTTSLSQTSKDLTAAIQKETNDRIA--DVNDAAKQ 997
Cdd:cd22656   234 LDNLLALIGPAIPAL--------EKLQGAWQAIATDLDSLKDLLEDDISKIPAAILAklELEKAIEK 292
Mplasa_alph_rch TIGR04523
841-1034 1.46e-03

helix-rich Mycoplasma protein; Members of this family occur strictly within a subset of Mycoplasma species. Members average 750 amino acids in length, including signal peptide. Sequences are predicted (Jpred 3) to be almost entirely alpha-helical. These sequences show strong periodicity (consistent with long alpha helical structures) and low complexity rich in D,E,N,Q, and K. Genes encoding these proteins are often found in tandem. The function is unknown.


Pssm-ID: 275316 [Multi-domain]  Cd Length: 745  Bit Score: 44.63  E-value: 1.46e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944   841 LKDKITSSELG-KELLDEIDSKATQ-EAVDNAIgevqNSVNESIQQVENDLAQTSSEIKaQVDSVNQSLKEDIDTVNQTI 918
Cdd:TIGR04523  445 LTNQDSVKELIiKNLDNTRESLETQlKVLSRSI----NKIKQNLEQKQKELKSKEKELK-KLNEEKKELEEKVKDLTKKI 519
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944   919 vdnidtvnQTINTNISNVNSQIEAAKQSIKDgdaaLSQEIKKAQSSLTTSLSQTSKD-LTAAIQ--KETNDRIADVNDAA 995
Cdd:TIGR04523  520 --------SSLKEKIEKLESEKKEKESKISD----LEDELNKDDFELKKENLEKEIDeKNKEIEelKQTQKSLKKKQEEK 587
                          170       180       190
                   ....*....|....*....|....*....|....*....
gi 556195944   996 KQAADQLLSAKNELKTSIDSLSEVVtsgdENLARQISQI 1034
Cdd:TIGR04523  588 QELIDQKEKEKKDLIKEIEEKEKKI----SSLEKELEKA 622
DR0291 COG1579
898-1036 1.56e-03

Predicted nucleic acid-binding protein DR0291, contains C4-type Zn-ribbon domain [General function prediction only];


Pssm-ID: 441187 [Multi-domain]  Cd Length: 236  Bit Score: 43.37  E-value: 1.56e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944  898 AQVDSVNQSLKEDIDTVNQTIVDNIDTVnQTINTNISNVNSQIEAAKQSIKDGDAALSQE---IKKAQSSLTTSlsQTSK 974
Cdd:COG1579    13 QELDSELDRLEHRLKELPAELAELEDEL-AALEARLEAAKTELEDLEKEIKRLELEIEEVearIKKYEEQLGNV--RNNK 89
                          90       100       110       120       130       140       150
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|..
gi 556195944  975 DLTAaIQKE----------TNDRIADVNDAAKQAADQLLSAKNELKTSIDSLSEVVTSGDENLARQISQIAA 1036
Cdd:COG1579    90 EYEA-LQKEieslkrrisdLEDEILELMERIEELEEELAELEAELAELEAELEEKKAELDEELAELEAELEE 160
Tar COG0840
770-1045 2.08e-03

Methyl-accepting chemotaxis protein (MCP) [Signal transduction mechanisms];


Pssm-ID: 440602 [Multi-domain]  Cd Length: 533  Bit Score: 43.86  E-value: 2.08e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944  770 LAQINVYASKTNKLDTATLIAQAATTTFTHAGLGDNETWYYWIRAVNKRGMVGQPNSNLGTEATTRDVLSFLKDKITSSE 849
Cdd:COG0840    16 LLALSLLALLAAALLILLALLLAALTALALLLLLSLLALLLLLLLLALALLLVLLALLLLLALVVLLALLLALLLLLLAL 95
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944  850 LGKELLDEIDSKATQEAVDNAIGEVQNSVNESIQQVENDLAQTSSEIKAQVDSVNQSLKEDIDTVNQTIVDNIDTVNQTI 929
Cdd:COG0840    96 LALALAALALLAALAALLALLELLLAALLAALAIALLALAALLALAALALALLALALLAAAAAAAAALAALLEAAALALA 175
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944  930 NTNISNVNSQIEAAKQSIKDGDAA--LSQEIKKAQSSLTTSLSQTSK-DLTAAIQKETNDRIADVNDAAKQAADQLlsak 1006
Cdd:COG0840   176 AAALALALLAAALLALVALAIILAllLSRSITRPLRELLEVLERIAEgDLTVRIDVDSKDEIGQLADAFNRMIENL---- 251
                         250       260       270
                  ....*....|....*....|....*....|....*....
gi 556195944 1007 NELKTSIDSLSEVVTSGDENLARQISQIAAGTGEQFDSL 1045
Cdd:COG0840   252 RELVGQVRESAEQVASASEELAASAEELAAGAEEQAASL 290
ClyA_NheA-like cd22654
869-1019 2.70e-03

Bacillus cereus non-hemolytic enterotoxin (Nhe) component A (NheA), and similar proteins; This model contains Bacillus cereus tripartite non-hemolytic enterotoxin (Nhe) component A (NheA), a member of the cytolysin A (ClyA) family of alpha pore-forming toxins (alpha-PFTs). Non-hemolytic enterotoxin (Nhe), despite its name, is hemolytic and able to lyse erythrocytes from various mammalian organisms. It consists of three proteins, NheA, NheB and NheC, encoded by one operon containing three genes nheA, nheB and nheC, respectively. Separately, these three proteins show no toxicity; maximal activity is seen only when all three components are present. The NheB and NheC components are able to bind to cell membranes while NheA is not; NheC primes the host cell for the formation of ion permeable NheB/C pores. Binding of NheA to NheB/NheC is thought to be the final stage of pore formation. The structure of NheA shows an elongated, almost entirely alpha-helical protein with an enlarged "head" domain compared with other cytolysins, displaying on its surface an enlarged beta-tongue which is of amphipathic rather than hydrophobic nature. It has been proposed that NheA could even form beta-barrel pores.


Pssm-ID: 439152 [Multi-domain]  Cd Length: 333  Bit Score: 43.02  E-value: 2.70e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944  869 NAIGEVQNSVnesiQQVENDLAQTSSEIKA---QVDSVNQSLKEDID-------TVNQTIV---DNIDTVNQTINTNISN 935
Cdd:cd22654    97 NGINKLQSQL----QTIQNSMEQTSSNLNRfktLLDADSKNFSTDAKkaidslsGSNGEIAqlrTQIKTINDEIQEELTK 172
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944  936 -VNSQIEAAKQSIKDGDAALSqeikkaqssLTTSLSQTSKDLTAAIQKETNDRIADVNDAAKQAADQLLSAKNELKTSID 1014
Cdd:cd22654   173 iLNRPIEVGDGSINIGKQVFT---------ITITTATTKTVDVTSIGGLINGIGNASDDEVKEAANKIQQKQKELVDLIK 243

                  ....*
gi 556195944 1015 SLSEV 1019
Cdd:cd22654   244 KLSDA 248
ClyA-like cd21116
859-1022 2.84e-03

family of the cytolysin A (ClyA) family alpha pore-forming toxins (alpha-PFT) including Bacillus cereus HblB, Aeromonas hydrophila AhlB, Bacillus thuringiensis Cry6Aa and similar proteins; This family belongs to the ClyA family of alpha-PFT bacterial toxins. PFTs form the major group of virulence factors in many pathogenic bacteria and in general are critical components of the molecular offensive and defensive machinery of cells in all kingdoms of life. Bacterial PFTs facilitate the takeover of host resources by puncturing holes in the membrane. PFTs can be classified as alpha-PFTs and beta-PFTs depending on the secondary structures of their membrane component. Alpha-PFTs use a ring of amphipathic helices while beta-PFTs use a beta-barrel to construct the pore. Members of this family include the toxins: Bacillus cereus hemolysin binding component B (HblB or HBL-B) of the diarrheal enterotoxin hemolysin BL, Aeromonas hydrophila hemolytic (Ahl) component B (AhlB) of the tripartite AhlABC toxin, Vibrio cholerae cytotoxin motility associated killing factor A (MakA) cytotoxin, Xenorhabdus nematophila alpha-xenorhabdolysin (XaxA), Bacillus thuringiensis crystal 6Aa (Cry6Aa) parasporal crystal (Cry) toxin, and Bacillus cereus non-hemolytic enterotoxin (Nhe) component A (NheA) of the non-hemolytic enterotoxin Nhe, which, despite its name, is hemolytic, among others. In solution, ClyA proteins have an elongated, almost entirely alpha-helical structure, except for a short hydrophobic beta-hairpin known as the beta-tongue. Pore formation by ClyA requires circular oligomerization of the toxin by a sequential mechanism. This, in turn, concentrates the amphipathic helices in the center of the ring-like structure, forming a helical barrel that inserts into the membrane by a wedge-like mechanism. Compared with ClyA, NheA is almost entirely alpha-helical with an enlarged "head" domain, and an enlarged beta-tongue; it has been proposed that NheA could even form beta-barrel pores. 
Alpha-PFTs with similar structures are increasingly being found in eukaryotes, in particular as components of the immune systems of animals. This family may be distantly related to Escherichia coli alpha-PFT hemolysin E (HlyE, also known as ClyA or SheA).


Pssm-ID: 439149 [Multi-domain]  Cd Length: 224  Bit Score: 42.40  E-value: 2.84e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944  859 DSKATQEAVDNAIGEVQNSVNESIQQVEN--DLAQTSSEIKAQVDSVNQSLKEDIDTVNQTIVDNIDTVNQ---TINTNI 933
Cdd:cd21116     3 DLGAASALVQAYVTAILNQPNINLIPLDLlpSLNTHQALARAHALEWLNEIKPKLLSLPNDIIGYNNTFQSyypDLIELA 82
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944  934 SNVNSQIEAAKQSIKDGDAALSQEIKKAQSSLT---TSLSQTSKDLTAAIQKETNdriaDVNDAAKQAAdQLLSAKNELK 1010
Cdd:cd21116    83 DNLIKGDQGAKQQLLQGLEALQSQVTKKQTSVTsfiNELTTFKNDLDDDSRNLQT----DATKAQAQVA-VLNALKNQLN 157
                         170
                  ....*....|..
gi 556195944 1011 TSIDSLSEVVTS 1022
Cdd:cd21116   158 SLAEQIDAAIDA 169
ApoLp-III pfam07464
921-1041 3.60e-03

Apolipophorin-III precursor (apoLp-III); This family consists of several insect apolipoprotein-III sequences. Exchangeable apolipoproteins constitute a functionally important family of proteins that play critical roles in lipid transport and lipoprotein metabolism. Apolipophorin III (apoLp-III) is a prototypical exchangeable apolipoprotein found in many insect species that functions in transport of diacylglycerol (DAG) from the fat body lipid storage depot to flight muscles in the adult life stage.


Pssm-ID: 462172 [Multi-domain]  Cd Length: 143  Bit Score: 40.81  E-value: 3.60e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944   921 NIDTVNQTINTNISNVNSQIEAAKQSIKDgdaalsqEIKKAQSSLTTSLSQTSKDLTaaiqkETNDRIADVNDAAKQAAD 1000
Cdd:pfam07464   17 SQQEVVETIKENTENLVDQLKQVQKSLQE-------ELKKASGEAEEALKELNTKIV-----ETADKLSEANPEVVQKAN 84
                           90       100       110       120
                   ....*....|....*....|....*....|....*....|.
gi 556195944  1001 QLlsaKNELKTSIDSLSEVVtsgdENLARQISQIAAGTGEQ 1041
Cdd:pfam07464   85 EL---QEKFQSGVQSLVTES----QKLAKSISENSQGATEK 118
EzrA pfam06160
841-1009 4.35e-03

Septation ring formation regulator, EzrA; During the bacterial cell cycle, the tubulin-like cell-division protein FtsZ polymerizes into a ring structure that establishes the location of the nascent division site. EzrA modulates the frequency and position of FtsZ ring formation. The structure contains 5 spectrin-like alpha-helical repeats.


Pssm-ID: 428797 [Multi-domain]  Cd Length: 542  Bit Score: 42.92  E-value: 4.35e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944   841 LKDKITSSElgkELLDEIDSKATQEAVDNAIGEVQNSVN------ESIQQVENDLAQTSSEIKaQVDSVNQSLKEDIDTV 914
Cdd:pfam06160  242 LEEQLEENL---ALLENLELDEAEEALEEIEERIDQLYDllekevDAKKYVEKNLPEIEDYLE-HAEEQNKELKEELERV 317
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944   915 NQT-IVDNIDTVN-QTINTNISNVNSQIEAAKQSIKDGDAALSqEIKKAQSSLTTSLSQTSKDltaaiQKETNDRIADVN 992
Cdd:pfam06160  318 QQSyTLNENELERvRGLEKQLEELEKRYDEIVERLEEKEVAYS-ELQEELEEILEQLEEIEEE-----QEEFKESLQSLR 391
                          170
                   ....*....|....*..
gi 556195944   993 DAAKQAADQLLSAKNEL 1009
Cdd:pfam06160  392 KDELEAREKLDEFKLEL 408
Tar COG0840
865-1041 5.26e-03

Methyl-accepting chemotaxis protein (MCP) [Signal transduction mechanisms];


Pssm-ID: 440602 [Multi-domain]  Cd Length: 533  Bit Score: 42.70  E-value: 5.26e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944  865 EAVDNAIGEVQNSVNESIQQVEN--DLAQTSSEIKAQVDSVNQSLKEDIDTVNQTI------VDNIDTVNQTIN-----T 931
Cdd:COG0840   298 EELSATVQEVAENAQQAAELAEEasELAEEGGEVVEEAVEGIEEIRESVEETAETIeelgesSQEIGEIVDVIDdiaeqT 377
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944  932 NISNVNSQIEAAKQsikdGD-----AALSQEIKK-AQSSlTTSLSQTsKDLTAAIQKETNDRIADVNDAAKQAADQLLSA 1005
Cdd:COG0840   378 NLLALNAAIEAARA----GEagrgfAVVADEVRKlAERS-AEATKEI-EELIEEIQSETEEAVEAMEEGSEEVEEGVELV 451
                         170       180       190
                  ....*....|....*....|....*....|....*.
gi 556195944 1006 kNELKTSIDSLSEVVtsgdENLARQISQIAAGTGEQ 1041
Cdd:COG0840   452 -EEAGEALEEIVEAV----EEVSDLIQEIAAASEEQ 482
NTTRR-F1 NF033675
3236-3277 7.09e-03

NTTRR-F1 domain; NTTRR-F1 (N-terminal To Repetitive Region - Firmicutes 1) is a homology domain found strictly as the N-terminal non-repetitive region of otherwise highly repetitive proteins of various Firmicutes. The repetitive region that follows typically is collagen-like, with every third residue a glycine.


Pssm-ID: 468135  Cd Length: 155  Bit Score: 40.23  E-value: 7.09e-03
                          10        20        30        40
                  ....*....|....*....|....*....|....*....|..
gi 556195944 3236 NLIVNPSFERGTegYTGWSGIATVVTLQVPHLGTKAAKLAAG 3277
Cdd:NF033675    4 NLIVNGGFETGS--LTPWSGVNASITSQFSHSGFYSARLLGG 43
HEC1 COG5185
834-1020 8.94e-03

Chromosome segregation protein NDC80, interacts with SMC proteins [Cell cycle control, cell division, chromosome partitioning];


Pssm-ID: 444066 [Multi-domain]  Cd Length: 594  Bit Score: 41.87  E-value: 8.94e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944  834 TRDVLSFLKDKITSSE-LGKELLDEIDSKATQ-EAVDNAIGEVQNSVNESIQQVENDLAQTSSEIKAQVDSVNQSLKEDI 911
Cdd:COG5185   404 AQEILATLEDTLKAADrQIEELQRQIEQATSSnEEVSKLLNELISELNKVMREADEESQSRLEEAYDEINRSVRSKKEDL 483
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944  912 DTVNQTIVDNIDTVNQTINTNISNVNSQIEAAKQSIKDGdaalsqeikkaQSSLTTSLSQTSKDLTAAIQKETNDRIA-- 989
Cdd:COG5185   484 NEELTQIESRVSTLKATLEKLRAKLERQLEGVRSKLDQV-----------AESLKDFMRARGYAHILALENLIPASELiq 552
                         170       180       190
                  ....*....|....*....|....*....|.
gi 556195944  990 DVNDAAKQAADQLLSAKNELKTSIDSLSEVV 1020
Cdd:COG5185   553 ASNAKTDGQAANLRTAVIDELTQYLSTIESQ 583
Tektin pfam03148
861-965 9.65e-03

Tektin family; Tektins are cytoskeletal proteins. They have been demonstrated in such cellular sites as centrioles, basal bodies, and along ciliary and flagellar doublet microtubules. Tektins form unique protofilaments, organized as longitudinal polymers of tektin heterodimers with axial periodicity matching tubulin. Tektin polypeptides consist of several alpha-helical regions that are predicted to form coiled coils. Indeed, tektins share considerable structural similarities with intermediate filament proteins. Possible functional roles for tektins are: stabilization of tubulin protofilaments; attachment of A and B-tubules in ciliary/flagellar microtubule doublets and C-tubules in centrioles; binding of axonemal components.


Pssm-ID: 460827 [Multi-domain]  Cd Length: 383  Bit Score: 41.77  E-value: 9.65e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 556195944   861 KATQEAVDNAIGEVQNSVN--ESIQQVendLAQTSSEIKAQVDSVNQSLKEDIDTVNQTIvDNIDTVNQTINTNISNVNS 938
Cdd:pfam03148  204 KFTQDNIERAEKERAASAQlrELIDSI---LEQTANDLRAQADAVNFALRKRIEETEDAK-NKLEWQLKKTLQEIAELEK 279
                           90       100
                   ....*....|....*....|....*..
gi 556195944   939 QIEAAKQSIKDGDAALsqeiKKAQSSL 965
Cdd:pfam03148  280 NIEALEKAIRDKEAPL----KLAQTRL 302
 
Blast search parameters
Data Source: Precalculated data, version = cdd.v.3.21
Preset Options: Database: CDSEARCH/cdd; Low complexity filter: no; Composition Based Adjustment: yes; E-value threshold: 0.01
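
The per-hit summary lines above ("Pssm-ID: ... Cd Length: ... Bit Score: ... E-value: ...") can be read programmatically when this listing is saved as plain text. The following is a minimal Python sketch, not an NCBI tool: the names SUMMARY_RE, parse_summary and filter_hits are hypothetical, the pattern is matched only against the line layout seen in this listing, and treating the E-value threshold as inclusive is an assumption.

```python
import re

# Pattern for the per-hit summary line as it appears in this text listing.
# The "[Multi-domain]" tag is optional (some hits do not carry it).
SUMMARY_RE = re.compile(
    r"Pssm-ID:\s*(?P<pssm_id>\d+)"
    r"(?:\s*\[(?P<kind>[^\]]+)\])?"
    r"\s+Cd Length:\s*(?P<cd_length>\d+)"
    r"\s+Bit Score:\s*(?P<bit_score>[\d.]+)"
    r"\s+E-value:\s*(?P<e_value>[\d.eE+-]+)"
)


def parse_summary(line):
    """Return a dict for one summary line, or None if the line is not one."""
    m = SUMMARY_RE.search(line)
    if m is None:
        return None
    return {
        "pssm_id": int(m["pssm_id"]),
        "multi_domain": m["kind"] == "Multi-domain",
        "cd_length": int(m["cd_length"]),
        "bit_score": float(m["bit_score"]),
        "e_value": float(m["e_value"]),
    }


def filter_hits(lines, threshold=0.01):
    """Keep hits whose E-value is at or below the threshold.

    Assumption: the cutoff is inclusive; the page does not say either way.
    """
    hits = [h for h in map(parse_summary, lines) if h is not None]
    return [h for h in hits if h["e_value"] <= threshold]
```

For example, calling filter_hits(open("cdd_hits.txt")) on a saved copy of this listing (the file name is hypothetical) would return one dict per hit at the default 0.01 threshold, and a stricter threshold such as 1e-3 would drop the weaker coiled-coil matches.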

References:

  • Wang J et al. (2023), "The conserved domain database in 2023", Nucleic Acids Res. 51(D1):D384-D388.
  • Lu S et al. (2020), "The conserved domain database in 2020", Nucleic Acids Res. 48(D1):D265-D268.
  • Marchler-Bauer A et al. (2017), "CDD/SPARCLE: functional classification of proteins via subfamily domain architectures", Nucleic Acids Res. 45(D1):D200-D203.