Conserved domains on  [gi|1904527047|gb|KAF7469176|]

transcription elongation factor SPT5 [Marmota monax]

List of domain hits

Name Accession Description Interval E-value
NGN_Euk cd09888
Eukaryotic N-Utilization Substance G (NusG) N-terminal (NGN) domain, including plant KTF1 (KOW ...
210-289 1.23e-37

Eukaryotic N-Utilization Substance G (NusG) N-terminal (NGN) domain, including plant KTF1 (KOW domain-containing Transcription Factor 1); The N-Utilization Substance G (NusG) protein and its eukaryotic homolog, Spt5, are involved in transcription elongation and termination. NusG contains an NGN domain at its N-terminus and Kyrpides Ouzounis and Woese (KOW) repeats at its C-terminus. Spt5 forms an Spt4-Spt5 complex that is an essential RNA polymerase II elongation factor. NusG was originally discovered as an N-dependent antitermination enhancing activity in Escherichia coli, and has a variety of functions such as its involvement in RNA polymerase elongation and Rho-termination in bacteria. Orthologs of the NusG gene exist in all bacteria, but their functions and requirements are different. Spt5-like is homologous to the Spt5 proteins present in all eukaryotes, which is unique as it encodes a protein with an additional long carboxy-terminal extension that contains WG/GW motifs. Spt5-like, or KTF1 (KOW domain-containing Transcription Factor 1), is an RNA-directed DNA methylation (RdDM) pathway effector in plants.


Pssm-ID: 193577 [Multi-domain]  Cd Length: 86  Bit Score: 135.74  E-value: 1.23e-37
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527047  210 IGEERATAISLMRKFIAYQFTDTPLQIKSVVAPEHVKGYIYVEAYKQTHVKQAIEGVGNLRLgyWNQQMVPIKEMTDVLK 289
Cdd:cd09888      9 PGKEREIVISLMRKFLDLQRTGNPLGIKSVFARDGLKGYIYIEARKEAHVKDAIEGLRGVYL--NTIKLVPIKEMPDVLS 86
CTD smart01104
Spt5 C-terminal nonapeptide repeat binding Spt4; The C-terminal domain of the transcription ...
796-913 2.70e-29

Spt5 C-terminal nonapeptide repeat binding Spt4; The C-terminal domain of the transcription elongation factor protein Spt5 is necessary for binding to Spt4 to form the functional complex that regulates early transcription elongation by RNA polymerase II. The complex may be involved in pre-mRNA processing through its association with mRNA capping enzymes. This CTD domain carries a regular nonapeptide repeat that can be present in up to 18 copies, as in S. pombe. The repeat has a characteristic TPA motif.


Pssm-ID: 215026 [Multi-domain]  Cd Length: 121  Bit Score: 113.39  E-value: 2.70e-29
                            10        20        30        40        50        60        70        80
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527047   796 GSQTPMYG-SGSRTPMYGSQTP----LQDGSRTPHYGSQTPLHDG--SRTPAQSGAWdPNNPNTPSRAEEEYEYAFDDEP 868
Cdd:smart01104    1 GGRTPAWGaSGSKTPAWGSRTPgtaaGGAPTARGGSGSRTPAWGGagSRTPAWGGAG-PTGSRTPAWGGASAWGNKSSEG 79
                            90       100       110       120
                    ....*....|....*....|....*....|....*....|....*..
gi 1904527047   869 TPSPQA--YGGTPNPQTPGYpdpssPQVNPQYNPQTPGTPAMYNTDQ 913
Cdd:smart01104   80 SASSWAagPGGAYGAPTPGY-----GGTPSAYGPATPGGGAMAGSAS 121
KOW_Spt5_3 cd06083
KOW domain of Spt5, repeat 3; Spt5, a eukaryotic ortholog of NusG, contains multiple KOW ...
495-545 2.62e-27

KOW domain of Spt5, repeat 3; Spt5, a eukaryotic ortholog of NusG, contains multiple KOW motifs at its C-terminus. Spt5 is involved in transcription elongation and termination. The KOW domain is known as an RNA-binding motif that is shared so far among some families of ribosomal proteins, the essential bacterial transcriptional elongation factor NusG, the eukaryotic chromatin elongation factor Spt5, the higher eukaryotic KIN17 proteins and Mtr4. KOW_Spt5 domains play critical roles in recruitment of multiple other eukaryotic transcription elongation and RNA biogenesis factors and additionally are involved in the binding of the eukaryotic Spt5 proteins to RNA polymerases.


Pssm-ID: 240507  Cd Length: 51  Bit Score: 104.91  E-value: 2.62e-27
                           10        20        30        40        50
                   ....*....|....*....|....*....|....*....|....*....|.
gi 1904527047  495 YFKMGDHVKVIAGRFEGDTGLIVRVEENFIILFSDLTMHELKVLPRDLQLC 545
Cdd:cd06083      1 HFKVGDHVKVISGRHEGETGLVVKVEDDVVTVFSDLTMRELKVFPRDLQLS 51
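
An aligned pair such as the KOW_Spt5_3 hit above can be summarized as a percent identity. The following is a minimal sketch, not part of the CDD output: the function name `alignment_identity` is hypothetical, and it assumes that gaps are written as `-` and that gap columns are excluded from the aligned length.

```python
def alignment_identity(query: str, subject: str) -> tuple[int, int]:
    """Return (identical columns, aligned columns) for two gapped rows.

    Assumption: both rows have equal length and use '-' for gaps;
    columns with a gap in either row are skipped.
    """
    if len(query) != len(subject):
        raise ValueError("aligned rows must have equal length")
    identical = 0
    aligned = 0
    for q, s in zip(query, subject):
        if q == "-" or s == "-":
            continue  # gap column: not counted as aligned
        aligned += 1
        if q == s:
            identical += 1
    return identical, aligned


# Rows copied from the KOW_Spt5_3 (cd06083) alignment, residues 495-545.
query = "YFKMGDHVKVIAGRFEGDTGLIVRVEENFIILFSDLTMHELKVLPRDLQLC"
subject = "HFKVGDHVKVISGRHEGETGLVVKVEDDVVTVFSDLTMRELKVFPRDLQLS"

identical, aligned = alignment_identity(query, subject)
print(f"{identical}/{aligned} identical ({100 * identical / aligned:.1f}%)")
```

Percent identity is only a rough summary; the bit score and E-value reported by CDD come from a position-specific scoring matrix and are not derived from this count.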
KOW_Spt5_6 cd06086
KOW domain of Spt5, repeat 6; Spt5, a eukaryotic ortholog of NusG, contains multiple KOW ...
1052-1108 7.20e-27

KOW domain of Spt5, repeat 6; Spt5, a eukaryotic ortholog of NusG, contains multiple KOW motifs at its C-terminus. Spt5 is involved in transcription elongation and termination. The KOW domain is known as an RNA-binding motif that is shared so far among some families of ribosomal proteins, the essential bacterial transcriptional elongation factor NusG, the eukaryotic chromatin elongation factor Spt5, the higher eukaryotic KIN17 proteins and Mtr4. KOW_Spt5 domains play critical roles in recruitment of multiple other eukaryotic transcription elongation and RNA biogenesis factors and additionally are involved in the binding of the eukaryotic Spt5 proteins to RNA polymerases.


Pssm-ID: 240510  Cd Length: 58  Bit Score: 104.14  E-value: 7.20e-27
                           10        20        30        40        50
                   ....*....|....*....|....*....|....*....|....*....|....*..
gi 1904527047 1052 EHLEPITPTKNNKVKVILGEDREATGVLLSIDGEDGIVRMDLDEQLKILNLRFLGKL 1108
Cdd:cd06086      1 EHLEPVPPEKGDRVKVIKGEDRGSTGELISIDGADGIVKMDSDGDIKILPMNFLAKL 57
KOW_Spt5_2 cd06082
KOW domain of Spt5, repeat 2; Spt5, a eukaryotic ortholog of NusG, contains multiple KOW ...
444-494 1.31e-26

KOW domain of Spt5, repeat 2; Spt5, a eukaryotic ortholog of NusG, contains multiple KOW motifs at its C-terminus. Spt5 is involved in transcription elongation and termination. The KOW domain is known as an RNA-binding motif that is shared so far among some families of ribosomal proteins, the essential bacterial transcriptional elongation factor NusG, the eukaryotic chromatin elongation factor Spt5, the higher eukaryotic KIN17 proteins and Mtr4. KOW_Spt5 domains play critical roles in recruitment of multiple other eukaryotic transcription elongation and RNA biogenesis factors and additionally are involved in the binding of the eukaryotic Spt5 proteins to RNA polymerases.


Pssm-ID: 240506  Cd Length: 51  Bit Score: 102.97  E-value: 1.31e-26
                           10        20        30        40        50
                   ....*....|....*....|....*....|....*....|....*....|.
gi 1904527047  444 FQPGDNVEVCEGELINLQGKVLSVDGNKITIMPKHEDLKDMLEFPAQELRK 494
Cdd:cd06082      1 FQPGDNVEVIEGELKGLQGKVESVDGDIVTIMPKHEDLKEPLEFPAKELRK 51
KOW_Spt5_5 cd06085
KOW domain of Spt5, repeat 5; Spt5, a eukaryotic ortholog of NusG, contains multiple KOW ...
726-775 1.38e-25

KOW domain of Spt5, repeat 5; Spt5, a eukaryotic ortholog of NusG, contains multiple KOW motifs at its C-terminus. Spt5 is involved in transcription elongation and termination. The KOW domain is known as an RNA-binding motif that is shared so far among some families of ribosomal proteins, the essential bacterial transcriptional elongation factor NusG, the eukaryotic chromatin elongation factor Spt5, the higher eukaryotic KIN17 proteins and Mtr4. KOW_Spt5 domains play critical roles in recruitment of multiple other eukaryotic transcription elongation and RNA biogenesis factors and additionally are involved in the binding of the eukaryotic Spt5 proteins to RNA polymerases.


Pssm-ID: 240509  Cd Length: 52  Bit Score: 100.25  E-value: 1.38e-25
                           10        20        30        40        50
                   ....*....|....*....|....*....|....*....|....*....|
gi 1904527047  726 DNELIGQTVRISQGPYKGYIGVVKDATESTARVELHSTCQTISVDRQRLT 775
Cdd:cd06085      2 RDPLIGKTVRIRKGPYKGYIGIVKDATGTTARVELHSKNKTITVDRSRLA 51
KOW_Spt5_1 cd06081
KOW domain of Spt5, repeat 1; Spt5, a eukaryotic ortholog of NusG, contains multiple KOW ...
300-337 6.87e-18

KOW domain of Spt5, repeat 1; Spt5, a eukaryotic ortholog of NusG, contains multiple KOW motifs at its C-terminus. Spt5 is involved in transcription elongation and termination. The KOW domain is known as an RNA-binding motif that is shared so far among some families of ribosomal proteins, the essential bacterial transcriptional elongation factor NusG, the eukaryotic chromatin elongation factor Spt5, the higher eukaryotic KIN17 proteins and Mtr4. KOW_Spt5 domains play critical roles in recruitment of multiple other eukaryotic transcription elongation and RNA biogenesis factors and additionally are involved in the binding of the eukaryotic Spt5 proteins to RNA polymerases.


Pssm-ID: 240505  Cd Length: 38  Bit Score: 77.89  E-value: 6.87e-18
                           10        20        30
                   ....*....|....*....|....*....|....*...
gi 1904527047  300 KSWVRLKRGIYKDDIAQVDYVEPSQNTISLKMIPRIDY 337
Cdd:cd06081      1 GSWVRIKRGIYKGDLAQVDEVDENGNRVVVKLIPRIDY 38
KOW_Spt5_4 cd06084
KOW domain of Spt5, repeat 4; Spt5, a eukaryotic ortholog of NusG, contains multiple KOW ...
621-663 6.88e-18

KOW domain of Spt5, repeat 4; Spt5, a eukaryotic ortholog of NusG, contains multiple KOW motifs at its C-terminus. Spt5 is involved in transcription elongation and termination. The KOW domain is known as an RNA-binding motif that is shared so far among some families of ribosomal proteins, the essential bacterial transcriptional elongation factor NusG, the eukaryotic chromatin elongation factor Spt5, the higher eukaryotic KIN17 proteins and Mtr4. KOW_Spt5 domains play critical roles in recruitment of multiple other eukaryotic transcription elongation and RNA biogenesis factors and additionally are involved in the binding of the eukaryotic Spt5 proteins to RNA polymerases.


Pssm-ID: 240508  Cd Length: 43  Bit Score: 77.95  E-value: 6.88e-18
                           10        20        30        40
                   ....*....|....*....|....*....|....*....|...
gi 1904527047  621 KDIVKVIDGPHSDREGEIRHLYHSFAFLHCKKLVENGGMFVCK 663
Cdd:cd06084      1 GDTVKVVDGPYKGRQGTVLHIYRGTLFLHSREVTENGGIFVVR 43
PHA03269 super family cl29788
envelope glycoprotein C; Provisional
849-994 7.39e-09

envelope glycoprotein C; Provisional


The actual alignment was detected with superfamily member PHA03269:

Pssm-ID: 165527 [Multi-domain]  Cd Length: 566  Bit Score: 59.74  E-value: 7.39e-09
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527047  849 NPNTPSRAEEEYEYAFDDEPTPSPQayggtPNPQTPGYPDPS-SPQVNPQYNPQtpgtpamyntdqfsPYAAPSPQGSYQ 927
Cdd:PHA03269    21 NLNTNIPIPELHTSAATQKPDPAPA-----PHQAASRAPDPAvAPTSAASRKPD--------------LAQAPTPAASEK 81
                           90       100       110       120       130       140       150
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....
gi 1904527047  928 PSPSPQSYHQV--APSPAGYQNTHSPasyhPTPSPM-----AYQASPSPSPVGYSPMTPgAPSPGGYNPHTPGS 994
Cdd:PHA03269    82 FDPAPAPHQAAsrAPDPAVAPQLAAA----PKPDAAeaftsAAQAHEAPADAGTSAASK-KPDPAAHTQHSPPP 150
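
Each hit in this listing carries a one-line statistics record (Pssm-ID, optional [Multi-domain] flag, Cd Length, Bit Score, E-value). The following is a minimal sketch of pulling those fields out of the text as it appears on this page. The regular expression and the function name `parse_stats` are assumptions based only on the lines shown here, not an official CDD format specification.

```python
import re

# Pattern inferred from the statistics lines on this page (an assumption).
STATS_RE = re.compile(
    r"Pssm-ID:\s*(?P<pssm_id>\d+)"
    r"(?:\s*\[(?P<flag>[^\]]+)\])?"          # optional "[Multi-domain]"
    r"\s+Cd Length:\s*(?P<cd_length>\d+)"
    r"\s+Bit Score:\s*(?P<bit_score>[\d.]+)"
    r"\s+E-value:\s*(?P<e_value>[\d.eE+-]+)"
)


def parse_stats(line: str):
    """Return a dict of hit statistics, or None if the line does not match."""
    match = STATS_RE.search(line)
    if match is None:
        return None
    return {
        "pssm_id": int(match.group("pssm_id")),
        "multi_domain": match.group("flag") == "Multi-domain",
        "cd_length": int(match.group("cd_length")),
        "bit_score": float(match.group("bit_score")),
        "e_value": float(match.group("e_value")),
    }


# Two lines copied from the listing: one with the flag, one without.
ngn = parse_stats(
    "Pssm-ID: 193577 [Multi-domain]  Cd Length: 86  Bit Score: 135.74  E-value: 1.23e-37"
)
kow3 = parse_stats("Pssm-ID: 240507  Cd Length: 51  Bit Score: 104.91  E-value: 2.62e-27")
print(ngn)
print(kow3)
```

Parsed records can then be sorted by `e_value` or filtered on `multi_domain`, for example to separate the specific Spt5 domain hits from the low-complexity matches such as PHA03269.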
 
Spt5-NGN pfam03439
Early transcription elongation factor of RNA pol II, NGN section; Spt5p and prokaryotic NusG ...
210-288 1.79e-25

Early transcription elongation factor of RNA pol II, NGN section; Spt5p and prokaryotic NusG are shown to contain a novel 'NGN' domain. The combined NGN and KOW motif regions of Spt5 form the binding domain with Spt4. Spt5 complexes with Spt4 as a 1:1 heterodimer, and this Spt5-Spt4 complex regulates early transcription elongation by RNA polymerase II and has an imputed role in pre-mRNA processing via its physical association with mRNA capping enzymes. The Schizosaccharomyces pombe core Spt5-Spt4 complex is a heterodimer bearing a trypsin-resistant Spt4-binding domain within the Spt5 subunit.


Pssm-ID: 397481  Cd Length: 84  Bit Score: 101.12  E-value: 1.79e-25
                           10        20        30        40        50        60        70
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....
gi 1904527047  210 IGEERATAISLMRKFIAYQfTDTPLQIKSVVAPEHVKGYIYVEAYKQTHVKQAIEGVGNLRLGywNQQMVPIKEMTDVL 288
Cdd:pfam03439    9 PGQEREVALSLMRKILALA-KTNNLGIYSVFAPDGLKGYIYVEADRQAAVKRALEGIPNVRGL--VPGLVPIKEMEHLL 84
CTD pfam12815
Spt5 C-terminal nonapeptide repeat binding Spt4; The C-terminal domain of the transcription ...
796-854 1.25e-15

Spt5 C-terminal nonapeptide repeat binding Spt4; The C-terminal domain of the transcription elongation factor protein Spt5 is necessary for binding to Spt4 to form the functional complex that regulates early transcription elongation by RNA polymerase II. The complex may be involved in pre-mRNA processing through its association with mRNA capping enzymes. This CTD domain carries a regular nonapeptide repeat that can be present in up to 18 copies, as in S. pombe. The repeat has a characteristic TPA motif.


Pssm-ID: 372327 [Multi-domain]  Cd Length: 71  Bit Score: 72.48  E-value: 1.25e-15
                           10        20        30        40        50        60        70
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527047  796 GSQTPMYGS--GSRTPMY---GSQTPL--QDGSRTPHY--GSQTPLHD--GSRTPAQSGAWDPnnPNTPS 854
Cdd:pfam12815    1 GSRTPAYNSagGSRTPAWgadGSRTPAygGAGGRTPAYnqGGKTPAWGgaGSRTPAYYGAWGG--SRTPA 68
nusG PRK08559
transcription antitermination protein NusG; Validated
211-340 9.81e-11

transcription antitermination protein NusG; Validated


Pssm-ID: 181467 [Multi-domain]  Cd Length: 153  Bit Score: 61.04  E-value: 9.81e-11
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527047  211 GEERATAISLMRKFIAYQftdtpLQIKSVVAPEHVKGYIYVEAYKQTHVKQAIEGVGNLRlGYwNQQMVPIKEMTDVLKV 290
Cdd:PRK08559    16 GQERNVALMLAMRAKKEN-----LPIYAILAPPELKGYVLVEAESKGAVEEAIRGIPHVR-GV-VPGEISFEEVEHFLKP 88
                           90       100       110       120       130
                   ....*....|....*....|....*....|....*....|....*....|....*....
gi 1904527047  291 VKEVANLKPKSWVRLKRGIYKDDIAQVDYVEPSQNTISLKM------IP---RIDYDRI 340
Cdd:PRK08559    89 KPIVEGIKEGDIVELIAGPFKGEKARVVRVDESKEEVTVELleaavpIPvtvRGDQVRV 147
PHA03269 PHA03269
envelope glycoprotein C; Provisional
849-994 7.39e-09

envelope glycoprotein C; Provisional


Pssm-ID: 165527 [Multi-domain]  Cd Length: 566  Bit Score: 59.74  E-value: 7.39e-09
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527047  849 NPNTPSRAEEEYEYAFDDEPTPSPQayggtPNPQTPGYPDPS-SPQVNPQYNPQtpgtpamyntdqfsPYAAPSPQGSYQ 927
Cdd:PHA03269    21 NLNTNIPIPELHTSAATQKPDPAPA-----PHQAASRAPDPAvAPTSAASRKPD--------------LAQAPTPAASEK 81
                           90       100       110       120       130       140       150
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....
gi 1904527047  928 PSPSPQSYHQV--APSPAGYQNTHSPasyhPTPSPM-----AYQASPSPSPVGYSPMTPgAPSPGGYNPHTPGS 994
Cdd:PHA03269    82 FDPAPAPHQAAsrAPDPAVAPQLAAA----PKPDAAeaftsAAQAHEAPADAGTSAASK-KPDPAAHTQHSPPP 150
PHA03247 PHA03247
large tegument protein UL36; Provisional
769-986 1.10e-08

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 59.95  E-value: 1.10e-08
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527047  769 VDRQRLTTVGSRRPGGMTTTYGRTPMYGSQTPMYGSGSRTPMYGSQTPlqdGSRTPHYGSQTPLHDGSRTPAQSGAWDPN 848
Cdd:PHA03247  2661 VSRPRRARRLGRAAQASSPPQRPRRRAARPTVGSLTSLADPPPPPPTP---EPAPHALVSATPLPPGPAAARQASPALPA 2737
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527047  849 NPNTPsraeeeyeyafddePTPSPQAYGGTPN----PQTPGYPDPSSPQVNPQYNPQTPGTPAMYNTDQFSPYAAPSPqg 924
Cdd:PHA03247  2738 APAPP--------------AVPAGPATPGGPArparPPTTAGPPAPAPPAAPAAGPPRRLTRPAVASLSESRESLPSP-- 2801
                          170       180       190       200       210       220
                   ....*....|....*....|....*....|....*....|....*....|....*....|...
gi 1904527047  925 syqPSPSPQSYHQVAPSPAgYQNTHSPASYHPTPSPMAYQASPSPSPVGYSPMTP-GAPSPGG 986
Cdd:PHA03247  2802 ---WDPADPPAAVLAPAAA-LPPAASPAGPLPPPTSAQPTAPPPPPGPPPPSLPLgGSVAPGG 2860
Atrophin-1 pfam03154
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ...
868-994 5.93e-08

Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA OMIM:125370 is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1, that is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.


Pssm-ID: 460830 [Multi-domain]  Cd Length: 991  Bit Score: 57.08  E-value: 5.93e-08
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527047  868 PTPSPQAYGGTPNPQTPGYPDPSSPQVN-PQYNPQTPGTP--AMYNTDQFSPYAAPSPQGSYQPSPSPQSYHQVAPSPAG 944
Cdd:pfam03154  188 PPGTTQAATAGPTPSAPSVPPQGSPATSqPPNQTQSTAAPhtLIQQTPTLHPQRLPSPHPPLQPMTQPPPPSQVSPQPLP 267
                           90       100       110       120       130       140
                   ....*....|....*....|....*....|....*....|....*....|....*....|...
gi 1904527047  945 YQNTHSPASYHP---------TPSPMAYQASPSPSPVGYSPMTPG----APSPGGYNPHTPGS 994
Cdd:pfam03154  268 QPSLHGQMPPMPhslqtgpshMQHPVPPQPFPLTPQSSQSQVPPGpspaAPGQSQQRIHTPPS 330
KOW_elon_Spt5 TIGR00405
transcription elongation factor Spt5; This protein contains a KOW domain, shared by bacterial ...
210-332 8.33e-08

transcription elongation factor Spt5; This protein contains a KOW domain, shared by bacterial NusG and the uL24 (previously L24p/L26e) family of ribosomal proteins. The most recent papers and crystal structures make this a transcription elongation factor rather than a ribosomal protein.


Pssm-ID: 129499 [Multi-domain]  Cd Length: 145  Bit Score: 52.59  E-value: 8.33e-08
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527047  210 IGEERATAislmrKFIAYQFTDTPLQIKSVVAPEHVKGYIYVEAYKQTHVKQAIEGVGNLRlgywnqQMVP----IKEMT 285
Cdd:TIGR00405    7 VGQEKNVA-----RLMARKARKSGLEVYSILAPESLKGYILVEAETKIDMRNPIIGVPHVR------GVVEgeidFEEIE 75
                           90       100       110       120
                   ....*....|....*....|....*....|....*....|....*..
gi 1904527047  286 DVLKVVKEVANLKPKSWVRLKRGIYKDDIAQVDYVEPSQNTISLKMI 332
Cdd:TIGR00405   76 RFLTPKKIIESIKKGDIVEIISGPFKGERAKVIRVDESKEEVTLELI 122
NGN smart00738
In Spt5p, this domain may confer affinity for Spt4p. It possesses an RNP-like fold; In Spt5p, ...
210-290 1.21e-07

In Spt5p, this domain may confer affinity for Spt4p. It possesses an RNP-like fold; In Spt5p, this domain may confer affinity for Spt4p.


Pssm-ID: 197850 [Multi-domain]  Cd Length: 106  Bit Score: 50.83  E-value: 1.21e-07
                            10        20        30        40        50        60        70        80
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527047   210 IGEERATAISLMRKFIAYQFTDtplQIKSVVAP-EHVK----------------GYIYVEAYKQTHVKQAIEGV----GN 268
Cdd:smart00738    9 SGQEKRVAENLERKAEALGLED---KIVSILVPtEEVKeirrgkkkvverklfpGYIFVEADLEDEVWTAIRGTpgvrGF 85
                            90       100
                    ....*....|....*....|..
gi 1904527047   269 LRLGYWnQQMVPIKEMTDVLKV 290
Cdd:smart00738   86 VGGGGK-PTPVPDDEIEKILKP 106
PBP1 COG5180
PAB1-binding protein, interacts with poly(A)-binding protein [RNA processing and modification]; ...
813-1001 3.03e-07

PAB1-binding protein, interacts with poly(A)-binding protein [RNA processing and modification];


Pssm-ID: 444064 [Multi-domain]  Cd Length: 548  Bit Score: 54.30  E-value: 3.03e-07
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527047  813 SQTPL---QDGSRTPHYGSQTPLHDGSRTPaQSGAWDPNNPNTPSRAEEEYEYAFDD-EPTPSPQAYGGTPNP----QTP 884
Cdd:COG5180    195 SPEKLdrpKVEVKDEAQEEPPDLTGGADHP-RPEAASSPKVDPPSTSEARSRPATVDaQPEMRPPADAKERRRaaigDTP 273
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527047  885 GYPDPSSPQVNPQYNPQT--PGTPAMYNTDQFSPYAAPSPQGSYQPSPS-----PQSYHQVAPSPAGYQNTHSPASYHPT 957
Cdd:COG5180    274 AAEPPGLPVLEAGSEPQSdaPEAETARPIDVKGVASAPPATRPVRPPGGardpgTPRPGQPTERPAGVPEAASDAGQPPS 353
                          170       180       190       200
                   ....*....|....*....|....*....|....*....|....*...
gi 1904527047  958 PSPMAYQASPSpspvgySPMTPGAPSPG--GYN--PHTPGSGIEQNSS 1001
Cdd:COG5180    354 AYPPAEEAVPG------KPLEQGAPRPGssGGDgaPFQPPNGAPQPGL 395
KOW smart00739
KOW (Kyrpides, Ouzounis, Woese) motif; Motif in ribosomal proteins, NusG, Spt5p, KIN17 and T54.
495-522 2.22e-05

KOW (Kyrpides, Ouzounis, Woese) motif; Motif in ribosomal proteins, NusG, Spt5p, KIN17 and T54.


Pssm-ID: 128978  Cd Length: 28  Bit Score: 41.93  E-value: 2.22e-05
                            10        20
                    ....*....|....*....|....*...
gi 1904527047   495 YFKMGDHVKVIAGRFEGDTGLIVRVEEN 522
Cdd:smart00739    1 KFEVGDTVRVIAGPFKGKVGKVLEVDGE 28
SP7_N cd22542
N-terminal domain of transcription factor Specificity Protein (SP) 7; Specificity Proteins ...
826-994 3.93e-05

N-terminal domain of transcription factor Specificity Protein (SP) 7; Specificity Proteins (SPs) are transcription factors that are involved in many cellular processes, including cell differentiation, cell growth, apoptosis, immune responses, response to DNA damage, and chromatin remodeling. SP7, also called Osterix (Osx) in humans, is highly conserved among bone-forming vertebrates. It plays a major role, along with Runx2 and Dlx5, in driving the differentiation of mesenchymal precursor cells into osteoblasts and eventually osteocytes. SP7 also plays a regulatory role by inhibiting chondrocyte differentiation, maintaining the balance between differentiation of mesenchymal precursor cells into ossified bone or cartilage. Mutations of this gene have been associated with multiple dysfunctional bone phenotypes in vertebrates. SP7 is thought to play a role in diseases such as Osteogenesis imperfecta. SP7 belongs to a family of proteins, called the SP/Kruppel or Krueppel-like Factor (KLF) family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. SP factors preferentially bind GC boxes, while KLFs bind CACCC boxes. Another characteristic hallmark of SP factors is the presence of the Buttonhead (BTD) box CXCPXC, just N-terminal to the zinc fingers. The function of the BTD box is unknown, but it is thought to play an important physiological role. Another feature of most SP factors is the presence of a conserved amino acid stretch, the so-called SP box, located close to the N-terminus. 
SP factors may be separated into three groups based on their domain architecture and the similarity of their N-terminal transactivation domains: SP1-4, SP5, and SP6-9. The transactivation domains between the three groups are not homologous to one another. This model represents the N-terminal domain of SP7.


Pssm-ID: 411691 [Multi-domain]  Cd Length: 297  Bit Score: 46.82  E-value: 3.93e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527047  826 YGSQTPLHDgSRTPAQSGAWDPNNPNTP--------SRAEE----EYEYAFDD-----EPTPSPQA---YGGTPNPQTPG 885
Cdd:cd22542     26 FGGSSPIRD-SATPGKPGNNPGKKPYSLgsdlssakSRSSElmgdSYTATFSSgnglmSPSGSPQAsttYGNDYNPFSHS 104
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527047  886 YPDPSSPQ----VNPQYNPQTPGTPAMYNT-DQFSPY-----AAPSPQGSYQPSPSPQSYHQVAPSPAGYQNTHSPASYH 955
Cdd:cd22542    105 FPTSSGSQdpslLVSKGHPSADCLPSVYTSlDMAHPYgswykTGIHPGISSSSTNATASWWDMHSNTNWLSAQGQPDGLQ 184
                          170       180       190
                   ....*....|....*....|....*....|....*....
gi 1904527047  956 PTPSPMAYQASPSPSPVGYSPMTPgaPSPGGYNPHTPGS 994
Cdd:cd22542    185 ASLQPVPAQTPLNPQLPSYTEFTT--LNPAPYPAVGISS 221
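The SP7_N description above writes its motifs in degenerate notation: the DNA consensus NNRCRCCYY and the protein BTD box CXCPXC. As a minimal sketch, such patterns can be expanded into regular expressions in Python. The helper names `iupac_to_regex` and `find_sites` are hypothetical, only the ambiguity codes named in the description (N, R, Y) are covered, and X in the BTD box is assumed to mean any residue.

```python
import re

# Ambiguity codes used in the description: R is A/G, Y is C/T, N is any base.
IUPAC_DNA = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "N": "[ACGT]",
}


def iupac_to_regex(motif: str) -> re.Pattern:
    """Expand a degenerate DNA motif into a compiled regular expression."""
    # Raises KeyError for codes outside the small table above.
    return re.compile("".join(IUPAC_DNA[base] for base in motif.upper()))


def find_sites(pattern: re.Pattern, sequence: str) -> list[int]:
    """Return 0-based start positions of all matches, including overlapping ones."""
    # The lookahead lets the scan advance one position at a time.
    scanner = re.compile(f"(?=({pattern.pattern}))")
    return [m.start() for m in scanner.finditer(sequence.upper())]


# Loose SP/KLF binding consensus quoted in the description.
SP_KLF_SITE = iupac_to_regex("NNRCRCCYY")

# Buttonhead box CXCPXC; X is assumed to stand for any amino acid.
BTD_BOX = re.compile(r"C.CP.C")


if __name__ == "__main__":
    print(SP_KLF_SITE.pattern)
    print(find_sites(SP_KLF_SITE, "ttAAGCGCCCTtt"))
    print(find_sites(BTD_BOX, "MKCACPHCQ"))
```

The description calls the consensus loose, so a strict match like this will miss real binding sites that deviate from it. Scanning the reverse strand as well would need a separate reverse-complement step.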
KOW pfam00467
KOW motif; This family has been extended to coincide with ref. The KOW (Kyrpides, Ouzounis, ...
730-761 4.02e-05

KOW motif; This family has been extended to coincide with ref. The KOW (Kyrpides, Ouzounis, Woese) motif is found in a variety of ribosomal proteins and NusG.


Pssm-ID: 425698 [Multi-domain]  Cd Length: 32  Bit Score: 41.60  E-value: 4.02e-05
                           10        20        30
                   ....*....|....*....|....*....|..
gi 1904527047  730 IGQTVRISQGPYKGYIGVVKDATESTARVELH 761
Cdd:pfam00467    1 KGDVVRVIAGPFKGKVGKVVEVDDKKNRVLVE 32
PspC_subgroup_2 NF033839
pneumococcal surface protein PspC, LPXTG-anchored form; The pneumococcal surface protein PspC, ...
848-984 4.79e-05

pneumococcal surface protein PspC, LPXTG-anchored form; The pneumococcal surface protein PspC, as described in Streptococcus pneumoniae, is a repetitive and highly variable protein, recognized by a conserved N-terminal domain and also by genomic location. This form, subgroup 2, is anchored covalently after cleavage by sortase at a C-terminal LPXTG site. The other form, subgroup 1, has variable numbers of a choline-binding repeat in the C-terminal region, and is also known as choline-binding protein A.


Pssm-ID: 468202 [Multi-domain]  Cd Length: 557  Bit Score: 47.46  E-value: 4.79e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527047  848 NNPNTPSRAEEEYEYAFDD-EPTPSPQAYGGTPN-PQTPGYPDPSSPQVNPQYNPQTPGTPAMYNTDQFSPYAAPSPQG- 924
Cdd:NF033839   249 DNVNTKVEIENTVHKIFADmDAVVTKFKKGLTQDtPKEPGNKKPSAPKPGMQPSPQPEKKEVKPEPETPKPEVKPQLEKp 328
                           90       100       110       120       130       140
                   ....*....|....*....|....*....|....*....|....*....|....*....|.
gi 1904527047  925 SYQPSPSPQSYH-QVAPSPAGYQNTHSPASYHPTPSPMAYQASPSPSpvgySPMTPGAPSP 984
Cdd:NF033839   329 KPEVKPQPEKPKpEVKPQLETPKPEVKPQPEKPKPEVKPQPEKPKPE----VKPQPETPKP 385
KOW pfam00467
KOW motif; This family has been extended to coincide with ref. The KOW (Kyrpides, Ouzounis, ...
499-527 2.22e-04

KOW motif; This family has been extended to coincide with ref. The KOW (Kyrpides, Ouzounis, Woese) motif is found in a variety of ribosomal proteins and NusG.


Pssm-ID: 425698 [Multi-domain]  Cd Length: 32  Bit Score: 39.29  E-value: 2.22e-04
                           10        20        30
                   ....*....|....*....|....*....|.
gi 1904527047  499 GDHVKVIAGRFEGDTGLIVRVEE--NFIILF 527
Cdd:pfam00467    2 GDVVRVIAGPFKGKVGKVVEVDDkkNRVLVE 32
PspC_subgroup_2 NF033839
pneumococcal surface protein PspC, LPXTG-anchored form; The pneumococcal surface protein PspC, ...
866-1007 7.86e-04

pneumococcal surface protein PspC, LPXTG-anchored form; The pneumococcal surface protein PspC, as described in Streptococcus pneumoniae, is a repetitive and highly variable protein, recognized by a conserved N-terminal domain and also by genomic location. This form, subgroup 2, is anchored covalently after cleavage by sortase at a C-terminal LPXTG site. The other form, subgroup 1, has variable numbers of a choline-binding repeat in the C-terminal region, and is also known as choline-binding protein A.


Pssm-ID: 468202 [Multi-domain]  Cd Length: 557  Bit Score: 43.60  E-value: 7.86e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527047  866 DEPTPSPQAYGGTPNPQTPGYPDPSSPQVNPQynPQTPGTPAMYNTDQFSPYAAPSPQ-GSYQPSPSPQSYH-QVAPSPA 943
Cdd:NF033839   370 EKPKPEVKPQPETPKPEVKPQPEKPKPEVKPQ--PEKPKPEVKPQPEKPKPEVKPQPEkPKPEVKPQPEKPKpEVKPQPE 447
                           90       100       110       120       130       140       150
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527047  944 GYQNTHSPASYHPTPSPMAYQASPSPSpVGYSPMTP----GAPSPGGYNPHTPG--SGIEQNSSDWVTTD 1007
Cdd:NF033839   448 KPKPEVKPQPETPKPEVKPQPEKPKPE-VKPQPEKPkpdnSKPQADDKKPSTPNnlSKDKQPSNQASTNE 516
KLF1_2_4_N cd21972
N-terminal domain of Kruppel-like factor (KLF) 1, KLF2, KLF4, and similar proteins; Kruppel ...
853-992 2.72e-03

N-terminal domain of Kruppel-like factor (KLF) 1, KLF2, KLF4, and similar proteins; Kruppel/Krueppel-like transcription factors (KLFs) belong to a family of proteins called the Specificity Protein (SP)/KLF family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. Members of the KLF family can act as activators or repressors of transcription depending on cell and promoter context. KLFs regulate various cellular functions, such as proliferation, differentiation, and apoptosis, as well as the development and homeostasis of several types of tissue. In addition to the C-terminal DNA-binding domain, each KLF also has a unique N-terminal activation/repression domain that confers specificity and allows it to bind specifically to a certain partner, leading to distinct activities in vivo. This model represents the related N-terminal domains of KLF1, KLF2, KLF4, and similar proteins.


Pssm-ID: 409230 [Multi-domain]  Cd Length: 194  Bit Score: 40.35  E-value: 2.72e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527047  853 PSRAEEEYEYAFDDEPTPSPQAYG----GTPNPQTPGYPDPSSPQVNPQYN-PQTPGTPAMYNTDQFSPYAAPSPQGSYQ 927
Cdd:cd21972     22 LDLEFILSNTVTSDNDNPPPPDPAypppESPESCSTVYDSDGCHPTPNAYCgPNGPGLPGHFLLAGNSPNLGPKIKTENQ 101
                           90       100       110       120       130       140       150
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*...
gi 1904527047  928 PS-------PSPQSYHQVAPS------PAGYQNTHSPASYHPTPSPMAYQASPSPSPVGYSPmtPGAPSPGGYNPHTP 992
Cdd:cd21972    102 EQacmpvagYSGHYGPREPQRvppappPPQYAGHFQYHGHFNMFSPPLRANHPGMSTVMLTP--LSTPPLGFLSPEEA 177
KOW smart00739
KOW (Kyrpides, Ouzounis, Woese) motif; Motif in ribosomal proteins, NusG, Spt5p, KIN17 and T54.
443-470 9.20e-03

KOW (Kyrpides, Ouzounis, Woese) motif; Motif in ribosomal proteins, NusG, Spt5p, KIN17 and T54.


Pssm-ID: 128978  Cd Length: 28  Bit Score: 34.61  E-value: 9.20e-03
                            10        20
                    ....*....|....*....|....*...
gi 1904527047   443 NFQPGDNVEVCEGELINLQGKVLSVDGN 470
Cdd:smart00739    1 KFEVGDTVRVIAGPFKGKVGKVLEVDGE 28
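Several hits in this list report overlapping intervals on the query; for example, PBP1 (813-1001) and SP7_N (826-994) cover the same C-terminal stretch. The following minimal Python sketch groups hits by overlap so that such clusters can be reviewed together. The `Hit` record and the function names are hypothetical, and the example values are copied from the Interval and E-value columns above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    name: str
    accession: str
    start: int   # first query residue of the hit (1-based, inclusive)
    end: int     # last query residue of the hit (inclusive)
    evalue: float


def parse_interval(text: str) -> tuple[int, int]:
    """Split an Interval column value such as '813-1001' into integers."""
    start, end = text.split("-")
    return int(start), int(end)


def cluster_overlapping(hits: list[Hit]) -> list[list[Hit]]:
    """Group hits whose query intervals overlap, directly or through a chain."""
    clusters: list[list[Hit]] = []
    current: list[Hit] = []
    current_end = -1
    for hit in sorted(hits, key=lambda h: (h.start, h.end)):
        if current and hit.start <= current_end:
            # Overlaps the running cluster: extend it.
            current.append(hit)
            current_end = max(current_end, hit.end)
        else:
            if current:
                clusters.append(current)
            current = [hit]
            current_end = hit.end
    if current:
        clusters.append(current)
    return clusters


# Five hits taken from the list above.
EXAMPLE_HITS = [
    Hit("NGN", "smart00738", *parse_interval("210-290"), 1.21e-07),
    Hit("PBP1", "COG5180", *parse_interval("813-1001"), 3.03e-07),
    Hit("KOW", "smart00739", *parse_interval("495-522"), 2.22e-05),
    Hit("SP7_N", "cd22542", *parse_interval("826-994"), 3.93e-05),
    Hit("KOW", "pfam00467", *parse_interval("730-761"), 4.02e-05),
]

if __name__ == "__main__":
    for cluster in cluster_overlapping(EXAMPLE_HITS):
        print([f"{h.name} {h.start}-{h.end}" for h in cluster])
```

Grouping by interval does not judge which hit in a cluster is meaningful. It only makes it easier to see where several unrelated models align to one region of the query.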
 
CTD smart01104
Spt5 C-terminal nonapeptide repeat binding Spt4; The C-terminal domain of the transcription ...
796-913 2.70e-29

Spt5 C-terminal nonapeptide repeat binding Spt4; The C-terminal domain of the transcription elongation factor protein Spt5 is necessary for binding to Spt4 to form the functional complex that regulates early transcription elongation by RNA polymerase II. The complex may be involved in pre-mRNA processing through its association with mRNA capping enzymes. This CTD domain carries a regular nonapeptide repeat that can be present in up to 18 copies, as in S. pombe. The repeat has a characteristic TPA motif.


Pssm-ID: 215026 [Multi-domain]  Cd Length: 121  Bit Score: 113.39  E-value: 2.70e-29
                            10        20        30        40        50        60        70        80
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527047   796 GSQTPMYG-SGSRTPMYGSQTP----LQDGSRTPHYGSQTPLHDG--SRTPAQSGAWdPNNPNTPSRAEEEYEYAFDDEP 868
Cdd:smart01104    1 GGRTPAWGaSGSKTPAWGSRTPgtaaGGAPTARGGSGSRTPAWGGagSRTPAWGGAG-PTGSRTPAWGGASAWGNKSSEG 79
                            90       100       110       120
                    ....*....|....*....|....*....|....*....|....*..
gi 1904527047   869 TPSPQA--YGGTPNPQTPGYpdpssPQVNPQYNPQTPGTPAMYNTDQ 913
Cdd:smart01104   80 SASSWAagPGGAYGAPTPGY-----GGTPSAYGPATPGGGAMAGSAS 121
KOW_Spt5_3 cd06083
KOW domain of Spt5, repeat 3; Spt5, a eukaryotic ortholog of NusG, contains multiple KOW ...
495-545 2.62e-27

KOW domain of Spt5, repeat 3; Spt5, a eukaryotic ortholog of NusG, contains multiple KOW motifs at its C-terminus. Spt5 is involved in transcription elongation and termination. The KOW domain is known as an RNA-binding motif that is shared so far among some families of ribosomal proteins, the essential bacterial transcriptional elongation factor NusG, the eukaryotic chromatin elongation factor Spt5, the higher eukaryotic KIN17 proteins and Mtr4. KOW_Spt5 domains play critical roles in recruitment of multiple other eukaryotic transcription elongation and RNA biogenesis factors and additionally are involved in the binding of the eukaryotic Spt5 proteins to RNA polymerases.


Pssm-ID: 240507  Cd Length: 51  Bit Score: 104.91  E-value: 2.62e-27
                           10        20        30        40        50
                   ....*....|....*....|....*....|....*....|....*....|.
gi 1904527047  495 YFKMGDHVKVIAGRFEGDTGLIVRVEENFIILFSDLTMHELKVLPRDLQLC 545
Cdd:cd06083      1 HFKVGDHVKVISGRHEGETGLVVKVEDDVVTVFSDLTMRELKVFPRDLQLS 51
KOW_Spt5_6 cd06086
KOW domain of Spt5, repeat 6; Spt5, a eukaryotic ortholog of NusG, contains multiple KOW ...
1052-1108 7.20e-27

KOW domain of Spt5, repeat 6; Spt5, a eukaryotic ortholog of NusG, contains multiple KOW motifs at its C-terminus. Spt5 is involved in transcription elongation and termination. The KOW domain is known as an RNA-binding motif that is shared so far among some families of ribosomal proteins, the essential bacterial transcriptional elongation factor NusG, the eukaryotic chromatin elongation factor Spt5, the higher eukaryotic KIN17 proteins and Mtr4. KOW_Spt5 domains play critical roles in recruitment of multiple other eukaryotic transcription elongation and RNA biogenesis factors and additionally are involved in the binding of the eukaryotic Spt5 proteins to RNA polymerases.


Pssm-ID: 240510  Cd Length: 58  Bit Score: 104.14  E-value: 7.20e-27
                           10        20        30        40        50
                   ....*....|....*....|....*....|....*....|....*....|....*..
gi 1904527047 1052 EHLEPITPTKNNKVKVILGEDREATGVLLSIDGEDGIVRMDLDEQLKILNLRFLGKL 1108
Cdd:cd06086      1 EHLEPVPPEKGDRVKVIKGEDRGSTGELISIDGADGIVKMDSDGDIKILPMNFLAKL 57
KOW_Spt5_2 cd06082
KOW domain of Spt5, repeat 2; Spt5, a eukaryotic ortholog of NusG, contains multiple KOW ...
444-494 1.31e-26

KOW domain of Spt5, repeat 2; Spt5, a eukaryotic ortholog of NusG, contains multiple KOW motifs at its C-terminus. Spt5 is involved in transcription elongation and termination. The KOW domain is known as an RNA-binding motif that is shared so far among some families of ribosomal proteins, the essential bacterial transcriptional elongation factor NusG, the eukaryotic chromatin elongation factor Spt5, the higher eukaryotic KIN17 proteins and Mtr4. KOW_Spt5 domains play critical roles in recruitment of multiple other eukaryotic transcription elongation and RNA biogenesis factors and additionally are involved in the binding of the eukaryotic Spt5 proteins to RNA polymerases.


Pssm-ID: 240506  Cd Length: 51  Bit Score: 102.97  E-value: 1.31e-26
                           10        20        30        40        50
                   ....*....|....*....|....*....|....*....|....*....|.
gi 1904527047  444 FQPGDNVEVCEGELINLQGKVLSVDGNKITIMPKHEDLKDMLEFPAQELRK 494
Cdd:cd06082      1 FQPGDNVEVIEGELKGLQGKVESVDGDIVTIMPKHEDLKEPLEFPAKELRK 51
KOW_Spt5_5 cd06085
KOW domain of Spt5, repeat 5; Spt5, a eukaryotic ortholog of NusG, contains multiple KOW ...
726-775 1.38e-25

KOW domain of Spt5, repeat 5; Spt5, a eukaryotic ortholog of NusG, contains multiple KOW motifs at its C-terminus. Spt5 is involved in transcription elongation and termination. The KOW domain is known as an RNA-binding motif that is shared so far among some families of ribosomal proteins, the essential bacterial transcriptional elongation factor NusG, the eukaryotic chromatin elongation factor Spt5, the higher eukaryotic KIN17 proteins and Mtr4. KOW_Spt5 domains play critical roles in recruitment of multiple other eukaryotic transcription elongation and RNA biogenesis factors and additionally are involved in the binding of the eukaryotic Spt5 proteins to RNA polymerases.


Pssm-ID: 240509  Cd Length: 52  Bit Score: 100.25  E-value: 1.38e-25
                           10        20        30        40        50
                   ....*....|....*....|....*....|....*....|....*....|
gi 1904527047  726 DNELIGQTVRISQGPYKGYIGVVKDATESTARVELHSTCQTISVDRQRLT 775
Cdd:cd06085      2 RDPLIGKTVRIRKGPYKGYIGIVKDATGTTARVELHSKNKTITVDRSRLA 51
Spt5-NGN pfam03439
Early transcription elongation factor of RNA pol II, NGN section; Spt5p and prokaryotic NusG ...
210-288 1.79e-25

Early transcription elongation factor of RNA pol II, NGN section; Spt5p and prokaryotic NusG are shown to contain a novel 'NGN' domain. The combined NGN and KOW motif regions of Spt5 form the binding domain with Spt4. Spt5 complexes with Spt4 as a 1:1 heterodimer, and this Spt5-Spt4 complex regulates early transcription elongation by RNA polymerase II and has an imputed role in pre-mRNA processing via its physical association with mRNA capping enzymes. The Schizosaccharomyces pombe core Spt5-Spt4 complex is a heterodimer bearing a trypsin-resistant Spt4-binding domain within the Spt5 subunit.


Pssm-ID: 397481  Cd Length: 84  Bit Score: 101.12  E-value: 1.79e-25
                           10        20        30        40        50        60        70
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....
gi 1904527047  210 IGEERATAISLMRKFIAYQfTDTPLQIKSVVAPEHVKGYIYVEAYKQTHVKQAIEGVGNLRLGywNQQMVPIKEMTDVL 288
Cdd:pfam03439    9 PGQEREVALSLMRKILALA-KTNNLGIYSVFAPDGLKGYIYVEADRQAAVKRALEGIPNVRGL--VPGLVPIKEMEHLL 84
KOW_Spt5_1 cd06081
KOW domain of Spt5, repeat 1; Spt5, a eukaryotic ortholog of NusG, contains multiple KOW ...
300-337 6.87e-18

KOW domain of Spt5, repeat 1; Spt5, a eukaryotic ortholog of NusG, contains multiple KOW motifs at its C-terminus. Spt5 is involved in transcription elongation and termination. The KOW domain is known as an RNA-binding motif that is shared so far among some families of ribosomal proteins, the essential bacterial transcriptional elongation factor NusG, the eukaryotic chromatin elongation factor Spt5, the higher eukaryotic KIN17 proteins and Mtr4. KOW_Spt5 domains play critical roles in recruitment of multiple other eukaryotic transcription elongation and RNA biogenesis factors and additionally are involved in the binding of the eukaryotic Spt5 proteins to RNA polymerases.


Pssm-ID: 240505  Cd Length: 38  Bit Score: 77.89  E-value: 6.87e-18
                           10        20        30
                   ....*....|....*....|....*....|....*...
gi 1904527047  300 KSWVRLKRGIYKDDIAQVDYVEPSQNTISLKMIPRIDY 337
Cdd:cd06081      1 GSWVRIKRGIYKGDLAQVDEVDENGNRVVVKLIPRIDY 38
KOW_Spt5_4 cd06084
KOW domain of Spt5, repeat 4; Spt5, a eukaryotic ortholog of NusG, contains multiple KOW ...
621-663 6.88e-18

KOW domain of Spt5, repeat 4; Spt5, a eukaryotic ortholog of NusG, contains multiple KOW motifs at its C-terminus. Spt5 is involved in transcription elongation and termination. The KOW domain is known as an RNA-binding motif that is shared so far among some families of ribosomal proteins, the essential bacterial transcriptional elongation factor NusG, the eukaryotic chromatin elongation factor Spt5, the higher eukaryotic KIN17 proteins and Mtr4. KOW_Spt5 domains play critical roles in recruitment of multiple other eukaryotic transcription elongation and RNA biogenesis factors and additionally are involved in the binding of the eukaryotic Spt5 proteins to RNA polymerases.


Pssm-ID: 240508  Cd Length: 43  Bit Score: 77.95  E-value: 6.88e-18
                           10        20        30        40
                   ....*....|....*....|....*....|....*....|...
gi 1904527047  621 KDIVKVIDGPHSDREGEIRHLYHSFAFLHCKKLVENGGMFVCK 663
Cdd:cd06084      1 GDTVKVVDGPYKGRQGTVLHIYRGTLFLHSREVTENGGIFVVR 43
CTD pfam12815
Spt5 C-terminal nonapeptide repeat binding Spt4; The C-terminal domain of the transcription ...
796-854 1.25e-15

Spt5 C-terminal nonapeptide repeat binding Spt4; The C-terminal domain of the transcription elongation factor protein Spt5 is necessary for binding to Spt4 to form the functional complex that regulates early transcription elongation by RNA polymerase II. The complex may be involved in pre-mRNA processing through its association with mRNA capping enzymes. This CTD domain carries a regular nonapeptide repeat that can be present in up to 18 copies, as in S. pombe. The repeat has a characteristic TPA motif.


Pssm-ID: 372327 [Multi-domain]  Cd Length: 71  Bit Score: 72.48  E-value: 1.25e-15
                           10        20        30        40        50        60        70
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527047  796 GSQTPMYGS--GSRTPMY---GSQTPL--QDGSRTPHY--GSQTPLHD--GSRTPAQSGAWDPnnPNTPS 854
Cdd:pfam12815    1 GSRTPAYNSagGSRTPAWgadGSRTPAygGAGGRTPAYnqGGKTPAWGgaGSRTPAYYGAWGG--SRTPA 68
nusG PRK08559
transcription antitermination protein NusG; Validated
211-340 9.81e-11

transcription antitermination protein NusG; Validated


Pssm-ID: 181467 [Multi-domain]  Cd Length: 153  Bit Score: 61.04  E-value: 9.81e-11
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527047  211 GEERATAISLMRKFIAYQftdtpLQIKSVVAPEHVKGYIYVEAYKQTHVKQAIEGVGNLRlGYwNQQMVPIKEMTDVLKV 290
Cdd:PRK08559    16 GQERNVALMLAMRAKKEN-----LPIYAILAPPELKGYVLVEAESKGAVEEAIRGIPHVR-GV-VPGEISFEEVEHFLKP 88
                           90       100       110       120       130
                   ....*....|....*....|....*....|....*....|....*....|....*....
gi 1904527047  291 VKEVANLKPKSWVRLKRGIYKDDIAQVDYVEPSQNTISLKM------IP---RIDYDRI 340
Cdd:PRK08559    89 KPIVEGIKEGDIVELIAGPFKGEKARVVRVDESKEEVTVELleaavpIPvtvRGDQVRV 147
CTD pfam12815
Spt5 C-terminal nonapeptide repeat binding Spt4; The C-terminal domain of the transcription ...
778-843 8.81e-10

Spt5 C-terminal nonapeptide repeat binding Spt4; The C-terminal domain of the transcription elongation factor protein Spt5 is necessary for binding to Spt4 to form the functional complex that regulates early transcription elongation by RNA polymerase II. The complex may be involved in pre-mRNA processing through its association with mRNA capping enzymes. This CTD domain carries a regular nonapeptide repeat that can be present in up to 18 copies, as in S. pombe. The repeat has a characteristic TPA motif.


Pssm-ID: 372327 [Multi-domain]  Cd Length: 71  Bit Score: 55.91  E-value: 8.81e-10
                           10        20        30        40        50        60        70
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....
gi 1904527047  778 GSRRPGGMTTTYGRTPMY-------------GSQTPMYGSGSRTPMYGsqtplQDGSRTPHYGSQTplhDGSRTPAQSG 843
Cdd:pfam12815    1 GSRTPAYNSAGGSRTPAWgadgsrtpayggaGGRTPAYNQGGKTPAWG-----GAGSRTPAYYGAW---GGSRTPAYGG 71
DUF5585 pfam17823
Family of unknown function (DUF5585); This is a family of unknown function found in chordata.
751-1070 1.43e-09

Family of unknown function (DUF5585); This is a family of unknown function found in chordata.


Pssm-ID: 465521 [Multi-domain]  Cd Length: 506  Bit Score: 61.90  E-value: 1.43e-09
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527047  751 ATESTARVELHSTCQTISvdrqrlTTVGSRRP--GGMTTTYGRTPMYGSQTPMYGSGsrTPMYGSQTPlQDGSRTPHYGS 828
Cdd:pfam17823  170 AASPAPRTAASSTTAASS------TTAASSAPttAASSAPATLTPARGISTAATATG--HPAAGTALA-AVGNSSPAAGT 240
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527047  829 QTPLhDGSRTPAQSGawdpnnpnTPSRAEEEYEYAFDDEPTPSPQAYGGTPNPQTPGypDPSSPQVNPQYNPQTPGTPAM 908
Cdd:pfam17823  241 VTAA-VGTVTPAALA--------TLAAAAGTVASAAGTINMGDPHARRLSPAKHMPS--DTMARNPAAPMGAQAQGPIIQ 309
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527047  909 YNTDQfsPYAAPSPqgsyQPSPSPQSYHQVAPSPAGYQNTHSPASyhPTPSPMAYQASPSPSPVGYSPMTPGA------- 981
Cdd:pfam17823  310 VSTDQ--PVHNTAG----EPTPSPSNTTLEPNTPKSVASTNLAVV--TTTKAQAKEPSASPVPVLHTSMIPEVeatsptt 381
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527047  982 -PSPGGYNPHTPGSGIEQNSSdwvttdiQVKVRDTyLDTQVVGQT----GVIRSVTGGMCSVYLKDSEKVVSIssehlEP 1056
Cdd:pfam17823  382 qPSPLLPTQGAAGPGILLAPE-------QVATEAT-AGTASAGPTprssGDPKTLAMASCQLSTQGQYLVVTT-----DP 448
                          330
                   ....*....|....*...
gi 1904527047 1057 ITPTKNNK----VKVILG 1070
Cdd:pfam17823  449 LTPALVDKmfllVVLILG 466
Atrophin-1 pfam03154
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ...
813-994 1.66e-09

Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA (OMIM:125370) is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1, which is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.


Pssm-ID: 460830 [Multi-domain]  Cd Length: 991  Bit Score: 62.09  E-value: 1.66e-09
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527047  813 SQTPLQDGSRTPHYGSQTPLHDgsrtPAQSGAWDPNNPNTPSRAEEEYEYAFDDEPtPSPQAYGGTPNPQTPGYPdPSSP 892
Cdd:pfam03154  259 SQVSPQPLPQPSLHGQMPPMPH----SLQTGPSHMQHPVPPQPFPLTPQSSQSQVP-PGPSPAAPGQSQQRIHTP-PSQS 332
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527047  893 QVNPQYNP-QTPGTPAMYNTdqfsPYAAPSPQGSYQPSPSPQSY----HQVAPSPagYQ-----------------NTHS 950
Cdd:pfam03154  333 QLQSQQPPrEQPLPPAPLSM----PHIKPPPTTPIPQLPNPQSHkhppHLSGPSP--FQmnsnlppppalkplsslSTHH 406
                          170       180       190       200
                   ....*....|....*....|....*....|....*....|....*..
gi 1904527047  951 PASYHPTPSPMAYQASPSPSPVGYSPM---TPGAPSPGGYNPHTPGS 994
Cdd:pfam03154  407 PPSAHPPPLQLMPQSQQLPPPPAQPPVltqSQSLPPPAASHPPTSGL 453
Atrophin-1 pfam03154
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ...
800-1001 4.41e-09

Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA (OMIM:125370) is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1, which is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.


Pssm-ID: 460830 [Multi-domain]  Cd Length: 991  Bit Score: 60.94  E-value: 4.41e-09
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527047  800 PMYGSGSRTPMYGSQTPLQDGSRTPHYGSQTPLHDGSRTPAQSGAWDPNNPNTPsrAEEEYEYAFDDEPTPSPQayggTP 879
Cdd:pfam03154  294 PPQPFPLTPQSSQSQVPPGPSPAAPGQSQQRIHTPPSQSQLQSQQPPREQPLPP--APLSMPHIKPPPTTPIPQ----LP 367
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527047  880 NPQTPGYPDPSSPQVNPQYNPQTPGTPAMYNTDQFSPYAAPSPQgsyqPSP---SPQSyHQVAPSPAG------YQNTHS 950
Cdd:pfam03154  368 NPQSHKHPPHLSGPSPFQMNSNLPPPPALKPLSSLSTHHPPSAH----PPPlqlMPQS-QQLPPPPAQppvltqSQSLPP 442
                          170       180       190       200       210
                   ....*....|....*....|....*....|....*....|....*....|....*..
gi 1904527047  951 PASYHPTPSpmAYQASPSPSPVGYSPMTPGAP----SPGGYNPHTP--GSGIEQNSS 1001
Cdd:pfam03154  443 PAASHPPTS--GLHQVPSQSPFPQHPFVPGGPppitPPSGPPTSTSsaMPGIQPPSS 497
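Each hit record above carries a statistics line of the form "Pssm-ID: ... Cd Length: ... Bit Score: ... E-value: ...". The following is a minimal sketch of how such a line could be read into typed fields when post-processing a text dump like this one. The function name `parse_hit_stats` and the regular expression are illustrative assumptions based only on the layout visible on this page, not part of any NCBI tool or API.

```python
import re

# Assumed layout of the per-hit statistics line, inferred from this page only.
STATS_RE = re.compile(
    r"Pssm-ID:\s*(?P<pssm_id>\d+)"
    r"(?:\s*\[(?P<flag>[^\]]+)\])?"          # optional "[Multi-domain]" flag
    r"\s+Cd Length:\s*(?P<cd_length>\d+)"
    r"\s+Bit Score:\s*(?P<bit_score>[\d.]+)"
    r"\s+E-value:\s*(?P<e_value>[\d.]+(?:e[+-]?\d+)?)",
    re.IGNORECASE,
)


def parse_hit_stats(line):
    """Return the statistics of one hit as a dict, or None if the line does not match."""
    match = STATS_RE.search(line)
    if match is None:
        return None
    return {
        "pssm_id": int(match.group("pssm_id")),
        "multi_domain": match.group("flag") == "Multi-domain",
        "cd_length": int(match.group("cd_length")),
        "bit_score": float(match.group("bit_score")),
        "e_value": float(match.group("e_value")),
    }


# Statistics line copied from the Atrophin-1 hit above.
example = "Pssm-ID: 460830 [Multi-domain]  Cd Length: 991  Bit Score: 60.94  E-value: 4.41e-09"
stats = parse_hit_stats(example)
```

Parsed records can then be sorted or filtered by `e_value`, for example to separate the NGN and KOW domain hits from the weaker low-complexity matches in the proline-rich C-terminal region.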
PHA03269 PHA03269
envelope glycoprotein C; Provisional
849-994 7.39e-09

envelope glycoprotein C; Provisional


Pssm-ID: 165527 [Multi-domain]  Cd Length: 566  Bit Score: 59.74  E-value: 7.39e-09
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527047  849 NPNTPSRAEEEYEYAFDDEPTPSPQayggtPNPQTPGYPDPS-SPQVNPQYNPQtpgtpamyntdqfsPYAAPSPQGSYQ 927
Cdd:PHA03269    21 NLNTNIPIPELHTSAATQKPDPAPA-----PHQAASRAPDPAvAPTSAASRKPD--------------LAQAPTPAASEK 81
                           90       100       110       120       130       140       150
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....
gi 1904527047  928 PSPSPQSYHQV--APSPAGYQNTHSPasyhPTPSPM-----AYQASPSPSPVGYSPMTPgAPSPGGYNPHTPGS 994
Cdd:PHA03269    82 FDPAPAPHQAAsrAPDPAVAPQLAAA----PKPDAAeaftsAAQAHEAPADAGTSAASK-KPDPAAHTQHSPPP 150
PHA03247 PHA03247
large tegument protein UL36; Provisional
769-986 1.10e-08

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 59.95  E-value: 1.10e-08
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527047  769 VDRQRLTTVGSRRPGGMTTTYGRTPMYGSQTPMYGSGSRTPMYGSQTPlqdGSRTPHYGSQTPLHDGSRTPAQSGAWDPN 848
Cdd:PHA03247  2661 VSRPRRARRLGRAAQASSPPQRPRRRAARPTVGSLTSLADPPPPPPTP---EPAPHALVSATPLPPGPAAARQASPALPA 2737
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527047  849 NPNTPsraeeeyeyafddePTPSPQAYGGTPN----PQTPGYPDPSSPQVNPQYNPQTPGTPAMYNTDQFSPYAAPSPqg 924
Cdd:PHA03247  2738 APAPP--------------AVPAGPATPGGPArparPPTTAGPPAPAPPAAPAAGPPRRLTRPAVASLSESRESLPSP-- 2801
                          170       180       190       200       210       220
                   ....*....|....*....|....*....|....*....|....*....|....*....|...
gi 1904527047  925 syqPSPSPQSYHQVAPSPAgYQNTHSPASYHPTPSPMAYQASPSPSPVGYSPMTP-GAPSPGG 986
Cdd:PHA03247  2802 ---WDPADPPAAVLAPAAA-LPPAASPAGPLPPPTSAQPTAPPPPPGPPPPSLPLgGSVAPGG 2860
Med15 pfam09606
ARC105 or Med15 subunit of Mediator complex non-fungal; The approx. 70 residue Med15 domain of ...
782-995 1.59e-08

ARC105 or Med15 subunit of Mediator complex non-fungal; The approx. 70 residue Med15 domain of the ARC-Mediator co-activator is a three-helix bundle with marked similarity to the KIX domain. The sterol regulatory element binding protein (SREBP) family of transcription activators uses the ARC105 subunit to activate target genes in the regulation of cholesterol and fatty acid homeostasis. In addition, Med15 is a critical transducer of gene activation signals that control early metazoan development.


Pssm-ID: 312941 [Multi-domain]  Cd Length: 732  Bit Score: 58.87  E-value: 1.59e-08
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527047  782 PGGMTTTYGRTPMYGSQTPMYGSGSRTPMYGSQ-------TPLQDGSRTPHYGSQTPLHDGSRTPAQSGAWDPNNP---- 850
Cdd:pfam09606  231 PQQMGGAPNQVAMQQQQPQQQGQQSQLGMGINQmqqmpqgVGGGAGQGGPGQPMGPPGQQPGAMPNVMSIGDQNNYqqqq 310
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527047  851 --NTPSRAEEEYEYAFDDEPTPS--------PQAYGGTPNPQTPGypdpssPQVNPQYNPQTPGTPAMYNTDQFSPYAAP 920
Cdd:pfam09606  311 trQQQQQQGGNHPAAHQQQMNQSvgqggqvvALGGLNHLETWNPG------NFGGLGANPMQRGQPGMMSSPSPVPGQQV 384
                          170       180       190       200       210       220       230
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*..
gi 1904527047  921 SPQGSYQPSPSPQSYHQVAPSPAGYQNTHSPASyHPTPSPmAYQASPSPSPVGY--SPMTPGAPSPGGyNPHTPGSG 995
Cdd:pfam09606  385 RQVTPNQFMRQSPQPSVPSPQGPGSQPPQSHPG-GMIPSP-ALIPSPSPQMSQQpaQQRTIGQDSPGG-SLNTPGQS 458
NGN cd08000
N-Utilization Substance G (NusG) N-terminal (NGN) domain Superfamily; The N-Utilization ...
206-288 3.94e-08

N-Utilization Substance G (NusG) N-terminal (NGN) domain Superfamily; The N-Utilization Substance G (NusG) protein and its eukaryotic homolog Spt5 are involved in transcription elongation and termination. NusG contains an NGN domain at its N-terminus and Kyrpides Ouzounis and Woese (KOW) repeats at its C-terminus in bacteria and archaea. The eukaryotic ortholog, Spt5, is a large protein composed of an acidic N-terminus, an NGN domain, and multiple KOW motifs at its C-terminus. Spt5 forms an Spt4-Spt5 complex that is an essential RNA Polymerase II elongation factor. NusG was originally discovered as an N-dependent antitermination enhancing activity in Escherichia coli and has a variety of functions, such as being involved in RNA polymerase elongation and Rho-termination in bacteria. Orthologs of the NusG gene exist in all bacteria, but their functions and requirements differ. The diverse activities suggest that, after diverging from a common ancestor, NusG proteins became specialized in different bacteria.


Pssm-ID: 193574 [Multi-domain]  Cd Length: 99  Bit Score: 51.94  E-value: 3.94e-08
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527047  206 LFFQIGEERATAISLMRKFIA---------YQFTDTPLQIKSVVAPEHVKGYIYVEAYKQTHVKQAIEGVGNLRLGYWN- 275
Cdd:cd08000      5 LFVKTGREEKVEKLLEKRFEAndieafvpkKEVPERKRGKIEEVIKPLFPGYVFVETDLSPELYELIREVPGVIGILGNg 84
                           90
                   ....*....|....*
gi 1904527047  276 --QQMVPIKEMTDVL 288
Cdd:cd08000     85 eePSPVSDEEIEMIL 99
Atrophin-1 pfam03154
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ...
868-994 5.93e-08

Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA (OMIM:125370) is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1 that is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.


Pssm-ID: 460830 [Multi-domain]  Cd Length: 991  Bit Score: 57.08  E-value: 5.93e-08
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527047  868 PTPSPQAYGGTPNPQTPGYPDPSSPQVN-PQYNPQTPGTP--AMYNTDQFSPYAAPSPQGSYQPSPSPQSYHQVAPSPAG 944
Cdd:pfam03154  188 PPGTTQAATAGPTPSAPSVPPQGSPATSqPPNQTQSTAAPhtLIQQTPTLHPQRLPSPHPPLQPMTQPPPPSQVSPQPLP 267
                           90       100       110       120       130       140
                   ....*....|....*....|....*....|....*....|....*....|....*....|...
gi 1904527047  945 YQNTHSPASYHP---------TPSPMAYQASPSPSPVGYSPMTPG----APSPGGYNPHTPGS 994
Cdd:pfam03154  268 QPSLHGQMPPMPhslqtgpshMQHPVPPQPFPLTPQSSQSQVPPGpspaAPGQSQQRIHTPPS 330
KOW_elon_Spt5 TIGR00405
transcription elongation factor Spt5; This protein contains a KOW domain, shared by bacterial ...
210-332 8.33e-08

transcription elongation factor Spt5; This protein contains a KOW domain, shared by bacterial NusG and the uL24 (previously L24p/L26e) family of ribosomal proteins. More recent papers and crystal structures indicate that it is a transcription elongation factor rather than a ribosomal protein.


Pssm-ID: 129499 [Multi-domain]  Cd Length: 145  Bit Score: 52.59  E-value: 8.33e-08
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527047  210 IGEERATAislmrKFIAYQFTDTPLQIKSVVAPEHVKGYIYVEAYKQTHVKQAIEGVGNLRlgywnqQMVP----IKEMT 285
Cdd:TIGR00405    7 VGQEKNVA-----RLMARKARKSGLEVYSILAPESLKGYILVEAETKIDMRNPIIGVPHVR------GVVEgeidFEEIE 75
                           90       100       110       120
                   ....*....|....*....|....*....|....*....|....*..
gi 1904527047  286 DVLKVVKEVANLKPKSWVRLKRGIYKDDIAQVDYVEPSQNTISLKMI 332
Cdd:TIGR00405   76 RFLTPKKIIESIKKGDIVEIISGPFKGERAKVIRVDESKEEVTLELI 122
PHA03378 PHA03378
EBNA-3B; Provisional
754-986 1.15e-07

EBNA-3B; Provisional


Pssm-ID: 223065 [Multi-domain]  Cd Length: 991  Bit Score: 56.23  E-value: 1.15e-07
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527047  754 STARVELHSTCQTISvDRQRLTTVGSRRPGGMTTTygrtpmygSQTPMYGSGSRTPMYGSQTPLQD-----------GSR 822
Cdd:PHA03378   579 SPTTSQLASSAPSYA-QTPWPVPHPSQTPEPPTTQ--------SHIPETSAPRQWPMPLRPIPMRPlrmqpitfnvlVFP 649
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527047  823 TPHYGSQTPLHDGSRTPAQSGAWdPNNPNTPSRAEEEYEYAFDDEPTPSPQAYGGTPNPQTPgyPDPSS-PQVNPQYNPQ 901
Cdd:PHA03378   650 TPHQPPQVEITPYKPTWTQIGHI-PYQPSPTGANTMLPIQWAPGTMQPPPRAPTPMRPPAAP--PGRAQrPAAATGRARP 726
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527047  902 TPGTPAMYNTDQFSPYAAPSPQGSYQPSPSPQSYHQVAPSPAGyqnthSPASYHPTPSPMA-------YQASPSPSP--- 971
Cdd:PHA03378   727 PAAAPGRARPPAAAPGRARPPAAAPGRARPPAAAPGRARPPAA-----APGAPTPQPPPQAppapqqrPRGAPTPQPppq 801
                          250
                   ....*....|....*
gi 1904527047  972 VGYSPMTPGAPSPGG 986
Cdd:PHA03378   802 AGPTSMQLMPRAAPG 816
NGN smart00738
In Spt5p, this domain may confer affinity for Spt4p. It possesses an RNP-like fold; In Spt5p, ...
210-290 1.21e-07

In Spt5p, this domain may confer affinity for Spt4p. It possesses an RNP-like fold; In Spt5p, this domain may confer affinity for Spt4p.


Pssm-ID: 197850 [Multi-domain]  Cd Length: 106  Bit Score: 50.83  E-value: 1.21e-07
                            10        20        30        40        50        60        70        80
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527047   210 IGEERATAISLMRKFIAYQFTDtplQIKSVVAP-EHVK----------------GYIYVEAYKQTHVKQAIEGV----GN 268
Cdd:smart00738    9 SGQEKRVAENLERKAEALGLED---KIVSILVPtEEVKeirrgkkkvverklfpGYIFVEADLEDEVWTAIRGTpgvrGF 85
                            90       100
                    ....*....|....*....|..
gi 1904527047   269 LRLGYWnQQMVPIKEMTDVLKV 290
Cdd:smart00738   86 VGGGGK-PTPVPDDEIEKILKP 106
KOW cd00380
KOW: an acronym for the authors' surnames (Kyrpides, Ouzounis and Woese); KOW domain is known ...
447-493 1.68e-07

KOW: an acronym for the authors' surnames (Kyrpides, Ouzounis and Woese); KOW domain is known as an RNA-binding motif that is shared so far among some families of ribosomal proteins, the essential bacterial transcriptional elongation factor NusG, the eukaryotic chromatin elongation factor Spt5, the higher eukaryotic KIN17 proteins and Mtr4. The KOW motif contains an invariant glycine residue and comprises alternating blocks of hydrophilic and hydrophobic residues.


Pssm-ID: 240504  Cd Length: 49  Bit Score: 48.75  E-value: 1.68e-07
                           10        20        30        40
                   ....*....|....*....|....*....|....*....|....*....
gi 1904527047  447 GDNVEVCEGELINLQGKVLSVDG--NKITIMPKHEDLKDMLEFPAQELR 493
Cdd:cd00380      1 GDVVRVLRGPYKGREGVVVDIDPrfGIVTVKGATGSKGAELKVRFDDVD 49
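The alignment blocks on this page print the query row (gi ...) and the domain-model row (Cdd:...) column by column. The sketch below counts identical residues over the aligned columns of such a row pair. It assumes the display convention that uppercase letters mark aligned residues while lowercase letters and dashes mark unaligned residues and gaps; that convention, and the helper name `alignment_identity`, are assumptions made for illustration and are not taken from this page.

```python
def alignment_identity(query_row, subject_row):
    """Count (identical, aligned) columns for two equal-length alignment rows.

    A column counts as aligned only when both characters are uppercase
    letters; dashes and lowercase letters are skipped (assumed convention).
    """
    if len(query_row) != len(subject_row):
        raise ValueError("alignment rows must have the same length")
    aligned = 0
    identical = 0
    for q, s in zip(query_row, subject_row):
        if q.isupper() and s.isupper():
            aligned += 1
            if q == s:
                identical += 1
    return identical, aligned


# Residue strings copied from the KOW (cd00380) alignment above,
# without the row labels and coordinates.
query = "GDNVEVCEGELINLQGKVLSVDG--NKITIMPKHEDLKDMLEFPAQELR"
model = "GDVVRVLRGPYKGREGVVVDIDPrfGIVTVKGATGSKGAELKVRFDDVD"
identical, aligned = alignment_identity(query, model)
fraction_identical = identical / aligned if aligned else 0.0
```

For hits that wrap over several blocks, the residue strings of each row would need to be concatenated before calling the function.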
PHA03377 PHA03377
EBNA-3C; Provisional
762-992 2.41e-07

EBNA-3C; Provisional


Pssm-ID: 177614 [Multi-domain]  Cd Length: 1000  Bit Score: 55.06  E-value: 2.41e-07
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527047  762 STCQTISVDRQRLTTVGSRRPGGMTTTygrTPMYGSQTPMYgSGSRTPMYGSQTPLQD---GSRTPHYGSQTPLHDGSRT 838
Cdd:PHA03377   686 SVFVLPSVDAGRAQPSEESHLSSMSPT---QPISHEEQPRY-EDPDDPLDLSLHPDQApppSHQAPYSGHEEPQAQQAPY 761
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527047  839 PaqsGAWDPNNPNTPSRAEEEyeyafddeptpsPQAYGGTPNpQTPGY--PDPSSPQvNPQY--------------NPQT 902
Cdd:PHA03377   762 P---GYWEPRPPQAPYLGYQE------------PQAQGVQVS-SYPGYagPWGLRAQ-HPRYrhswaywsqypghgHPQG 824
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527047  903 PGTP-AMYNTDQFSPYAAP-----SPQGSYQPSPSP----QSYHQVAPSPAGYQNTHSPASYHPTPS----PMAYQASPS 968
Cdd:PHA03377   825 PWAPrPPHLPPQWDGSAGHgqdqvSQFPHLQSETGPprlqLSQVPQLPYSQTLVSSSAPSWSSPQPRapirPIPTRFPPP 904
                          250       260
                   ....*....|....*....|....
gi 1904527047  969 PSPVGYSpMTPGAPSPGGYNPHTP 992
Cdd:PHA03377   905 PMPLQDS-MAVGCDSSGTACPSMP 927
PHA03247 PHA03247
large tegument protein UL36; Provisional
786-1006 2.76e-07

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 55.33  E-value: 2.76e-07
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527047  786 TTTYGRTPMYGSQTPMygsgSRTPMYGSQTPLQDGSRTPHYGSQTPLHDGSR-TPAQSGAWDPNNPNTPsRAEEEYEYAF 864
Cdd:PHA03247  2817 ALPPAASPAGPLPPPT----SAQPTAPPPPPGPPPPSLPLGGSVAPGGDVRRrPPSRSPAAKPAAPARP-PVRRLARPAV 2891
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527047  865 DDEPTPSPQAYGGTPNPQTPGYPDPSSPQVNPQYNPQtpgtpamyntdqfsPYAAPSPQGSYQPSPSPQSYHQVAPSPAG 944
Cdd:PHA03247  2892 SRSTESFALPPDQPERPPQPQAPPPPQPQPQPPPPPQ--------------PQPPPPPPPRPQPPLAPTTDPAGAGEPSG 2957
                          170       180       190       200       210       220
                   ....*....|....*....|....*....|....*....|....*....|....*....|..
gi 1904527047  945 YQNTHSPASYHPTPSPMAYQASPSPSPvgySPMTPGAPSPGgyNPHTPGSGIeqnsSDWVTT 1006
Cdd:PHA03247  2958 AVPQPWLGALVPGRVAVPRFRVPQPAP---SREAPASSTPP--LTGHSLSRV----SSWASS 3010
PBP1 COG5180
PAB1-binding protein, interacts with poly(A)-binding protein [RNA processing and modification]; ...
813-1001 3.03e-07

PAB1-binding protein, interacts with poly(A)-binding protein [RNA processing and modification];


Pssm-ID: 444064 [Multi-domain]  Cd Length: 548  Bit Score: 54.30  E-value: 3.03e-07
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527047  813 SQTPL---QDGSRTPHYGSQTPLHDGSRTPaQSGAWDPNNPNTPSRAEEEYEYAFDD-EPTPSPQAYGGTPNP----QTP 884
Cdd:COG5180    195 SPEKLdrpKVEVKDEAQEEPPDLTGGADHP-RPEAASSPKVDPPSTSEARSRPATVDaQPEMRPPADAKERRRaaigDTP 273
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527047  885 GYPDPSSPQVNPQYNPQT--PGTPAMYNTDQFSPYAAPSPQGSYQPSPS-----PQSYHQVAPSPAGYQNTHSPASYHPT 957
Cdd:COG5180    274 AAEPPGLPVLEAGSEPQSdaPEAETARPIDVKGVASAPPATRPVRPPGGardpgTPRPGQPTERPAGVPEAASDAGQPPS 353
                          170       180       190       200
                   ....*....|....*....|....*....|....*....|....*...
gi 1904527047  958 PSPMAYQASPSpspvgySPMTPGAPSPG--GYN--PHTPGSGIEQNSS 1001
Cdd:COG5180    354 AYPPAEEAVPG------KPLEQGAPRPGssGGDgaPFQPPNGAPQPGL 395
PRK10263 PRK10263
DNA translocase FtsK; Provisional
829-993 6.37e-07

DNA translocase FtsK; Provisional


Pssm-ID: 236669 [Multi-domain]  Cd Length: 1355  Bit Score: 53.94  E-value: 6.37e-07
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527047  829 QTPLHDGSRTPAQSG--AWDP-NNPNTPsraeeeyeyafddEPTPSPQAYGGTPNPQtpgYPDPSSPQVNP---QYNPQT 902
Cdd:PRK10263   342 QTPPVASVDVPPAQPtvAWQPvPGPQTG-------------EPVIAPAPEGYPQQSQ---YAQPAVQYNEPlqqPVQPQQ 405
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527047  903 PGTPAMYNTDQFSPYAAPSP-QGSYQPSPSPQSYHQVAPSPAGYQNTHSPasYHPTPSPMAYQASPSPSPVGYSPMTPGA 981
Cdd:PRK10263   406 PYYAPAAEQPAQQPYYAPAPeQPAQQPYYAPAPEQPVAGNAWQAEEQQST--FAPQSTYQTEQTYQQPAAQEPLYQQPQP 483
                          170
                   ....*....|..
gi 1904527047  982 PSPGGYNPHTPG 993
Cdd:PRK10263   484 VEQQPVVEPEPV 495
PHA03247 PHA03247
large tegument protein UL36; Provisional
813-993 9.63e-07

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 53.40  E-value: 9.63e-07
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527047  813 SQTPLQDGSRTPHYGSQTPLHDGSRTPAQSGAWDPNNPNTPSRAEEEYEYAFDDEPTPSPQAYGGTPNPQTPGYP----- 887
Cdd:PHA03247  2567 SVPPPRPAPRPSEPAVTSRARRPDAPPQSARPRAPVDDRGDPRGPAPPSPLPPDTHAPDPPPPSPSPAANEPDPHppptv 2646
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527047  888 --------DPSSPQVNPQYNPQTPGTPAMYN--TDQFSPYAAPSPQGSYQPS---PSPQSYHQVAPSPAGYQNTHSPASY 954
Cdd:PHA03247  2647 ppperprdDPAPGRVSRPRRARRLGRAAQASspPQRPRRRAARPTVGSLTSLadpPPPPPTPEPAPHALVSATPLPPGPA 2726
                          170       180       190
                   ....*....|....*....|....*....|....*....
gi 1904527047  955 HPTPSPMAYQASPSPSPVGYSPMTPGAPSPGGYNPHTPG 993
Cdd:PHA03247  2727 AARQASPALPAAPAPPAVPAGPATPGGPARPARPPTTAG 2765
PHA03378 PHA03378
EBNA-3B; Provisional
823-992 1.08e-06

EBNA-3B; Provisional


Pssm-ID: 223065 [Multi-domain]  Cd Length: 991  Bit Score: 53.15  E-value: 1.08e-06
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527047  823 TPHYGSQTPLHDGSRTPAQSGAWDPNNPNTPSRAEEE---YEYAFDDEPTPSPQAYGGTPNPQTPGYPDPS-SPQVNPQ- 897
Cdd:PHA03378   582 TSQLASSAPSYAQTPWPVPHPSQTPEPPTTQSHIPETsapRQWPMPLRPIPMRPLRMQPITFNVLVFPTPHqPPQVEITp 661
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527047  898 ------------YNPQTPG---------TPAMYNTDQFSPYAAPSPQGSYQPSPSPQSYHQVAPSPAGYQNTHSPASYHP 956
Cdd:PHA03378   662 ykptwtqighipYQPSPTGantmlpiqwAPGTMQPPPRAPTPMRPPAAPPGRAQRPAAATGRARPPAAAPGRARPPAAAP 741
                          170       180       190
                   ....*....|....*....|....*....|....*...
gi 1904527047  957 TPS--PMAYQASPSPSPVGYSPMTPGAPSPGGYNPHTP 992
Cdd:PHA03378   742 GRArpPAAAPGRARPPAAAPGRARPPAAAPGAPTPQPP 779
KOW cd00380
KOW: an acronym for the authors' surnames (Kyrpides, Ouzounis and Woese); KOW domain is known ...
731-774 1.95e-06

KOW: an acronym for the authors' surnames (Kyrpides, Ouzounis and Woese); KOW domain is known as an RNA-binding motif that is shared so far among some families of ribosomal proteins, the essential bacterial transcriptional elongation factor NusG, the eukaryotic chromatin elongation factor Spt5, the higher eukaryotic KIN17 proteins and Mtr4. The KOW motif contains an invariant glycine residue and comprises alternating blocks of hydrophilic and hydrophobic residues.


Pssm-ID: 240504  Cd Length: 49  Bit Score: 45.67  E-value: 1.95e-06
                           10        20        30        40
                   ....*....|....*....|....*....|....*....|....*...
gi 1904527047  731 GQTVRISQGPYKGYIGVVKDATEST--ARVELH--STCQTISVDRQRL 774
Cdd:cd00380      1 GDVVRVLRGPYKGREGVVVDIDPRFgiVTVKGAtgSKGAELKVRFDDV 48
Atrophin-1 pfam03154
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ...
870-998 2.56e-06

Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA (OMIM:125370) is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1 that is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.


Pssm-ID: 460830 [Multi-domain]  Cd Length: 991  Bit Score: 51.69  E-value: 2.56e-06
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527047  870 PSPQAYGGTPNPQTPGYPDPSS-PQVNPqynpqTPGTPAMynTDQFSPYAAPSPQGSyQPSPSPQSYHQVAPS--PAGYQ 946
Cdd:pfam03154  172 PVLQAQSGAASPPSPPPPGTTQaATAGP-----TPSAPSV--PPQGSPATSQPPNQT-QSTAAPHTLIQQTPTlhPQRLP 243
                           90       100       110       120       130
                   ....*....|....*....|....*....|....*....|....*....|...
gi 1904527047  947 NTHSPASYHPTPSPMAY-QASPSPSPVGYSPMTPGAPSPGGYNPHTPGSGIEQ 998
Cdd:pfam03154  244 SPHPPLQPMTQPPPPSQvSPQPLPQPSLHGQMPPMPHSLQTGPSHMQHPVPPQ 296
PRK10263 PRK10263
DNA translocase FtsK; Provisional
788-992 3.18e-06

DNA translocase FtsK; Provisional


Pssm-ID: 236669 [Multi-domain]  Cd Length: 1355  Bit Score: 51.62  E-value: 3.18e-06
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527047  788 TYGRTPMYGSQTPMYGSGSRTPMYGSQTPLQDGSRTPHYGSQTPLHDGSRTPAQSGAWDPNNPNTPSRAE-EEYEYAFDD 866
Cdd:PRK10263   379 GYPQQSQYAQPAVQYNEPLQQPVQPQQPYYAPAAEQPAQQPYYAPAPEQPAQQPYYAPAPEQPVAGNAWQaEEQQSTFAP 458
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527047  867 EPTPSPQAYGGTPNPQTPGY--PDPSSPQVNPQYNPQT----PGTPAMYNTDQFS-------------------PYAAPS 921
Cdd:PRK10263   459 QSTYQTEQTYQQPAAQEPLYqqPQPVEQQPVVEPEPVVeetkPARPPLYYFEEVEekrarereqlaawyqpipePVKEPE 538
                          170       180       190       200       210       220       230
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....
gi 1904527047  922 PQGSYQPSPSPQSYHQVAPSPAGyqnthSPASYHPTPSPMAYQASPSPSPVGYSPMTPGAPSPG---GYNPHTP 992
Cdd:PRK10263   539 PIKSSLKAPSVAAVPPVEAAAAV-----SPLASGVKKATLATGAAATVAAPVFSLANSGGPRPQvkeGIGPQLP 607
PHA03378 PHA03378
EBNA-3B; Provisional
788-984 4.64e-06

EBNA-3B; Provisional


Pssm-ID: 223065 [Multi-domain]  Cd Length: 991  Bit Score: 50.84  E-value: 4.64e-06
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527047  788 TYGRTPMYGSQTPMYGSGSRTPMYGSQTP-----------LQDGSRTPHYGSQTPLHDGSRTPAQSGAWDPNNPNTPSRA 856
Cdd:PHA03378   578 TSPTTSQLASSAPSYAQTPWPVPHPSQTPeppttqshipeTSAPRQWPMPLRPIPMRPLRMQPITFNVLVFPTPHQPPQV 657
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527047  857 EeeyeyafddeptPSPQAYGGTPNPQTPGYPDPSSPQVN--PQYNPQTPGTPAMYNTDQFSPYAAPS----PQGSYQPSP 930
Cdd:PHA03378   658 E------------ITPYKPTWTQIGHIPYQPSPTGANTMlpIQWAPGTMQPPPRAPTPMRPPAAPPGraqrPAAATGRAR 725
                          170       180       190       200       210
                   ....*....|....*....|....*....|....*....|....*....|....*
gi 1904527047  931 SPQSYHQVAPSPAGYQNTHSPASYHPTPS-PMAYQASPSPSPVGyspmTPGAPSP 984
Cdd:PHA03378   726 PPAAAPGRARPPAAAPGRARPPAAAPGRArPPAAAPGRARPPAA----APGAPTP 776
KOW cd00380
KOW: an acronym for the authors' surnames (Kyrpides, Ouzounis and Woese); KOW domain is known ...
499-543 4.77e-06

KOW: an acronym for the authors' surnames (Kyrpides, Ouzounis and Woese); KOW domain is known as an RNA-binding motif that is shared so far among some families of ribosomal proteins, the essential bacterial transcriptional elongation factor NusG, the eukaryotic chromatin elongation factor Spt5, the higher eukaryotic KIN17 proteins and Mtr4. The KOW motif contains an invariant glycine residue and comprises alternating blocks of hydrophilic and hydrophobic residues.


Pssm-ID: 240504  Cd Length: 49  Bit Score: 44.52  E-value: 4.77e-06
                           10        20        30        40
                   ....*....|....*....|....*....|....*....|....*....
gi 1904527047  499 GDHVKVIAGRFEGDTGLIVRVEENF----IILFSDLTMHELKVLPRDLQ 543
Cdd:cd00380      1 GDVVRVLRGPYKGREGVVVDIDPRFgivtVKGATGSKGAELKVRFDDVD 49
PHA03247 PHA03247
large tegument protein UL36; Provisional
835-990 6.25e-06

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 50.71  E-value: 6.25e-06
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527047  835 GSRTPAQSGAWDPNNPNTPSRAEEEYeyaFDDEPTPSP-----------------QAYGGTPNPQTPGYPDPSSPQV--N 895
Cdd:PHA03247  2494 AAPDPGGGGPPDPDAPPAPSRLAPAI---LPDEPVGEPvhprmltwirgleelasDDAGDPPPPLPPAAPPAAPDRSvpP 2570
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527047  896 PQYNPQTPGtPAMyNTDQFSPYAAPSPQGSYQPSPSPQSYHQVAPSPAGYQNTHSPASYHPTPSPMAYQA-SPSPSPVGY 974
Cdd:PHA03247  2571 PRPAPRPSE-PAV-TSRARRPDAPPQSARPRAPVDDRGDPRGPAPPSPLPPDTHAPDPPPPSPSPAANEPdPHPPPTVPP 2648
                          170
                   ....*....|....*.
gi 1904527047  975 SPMTPGAPSPGGYNPH 990
Cdd:PHA03247  2649 PERPRDDPAPGRVSRP 2664
NGN_Arch cd09887
Archaeal N-Utilization Substance G (NusG) N-terminal (NGN) domain; The N-Utilization Substance ...
210-270 7.37e-06

Archaeal N-Utilization Substance G (NusG) N-terminal (NGN) domain; The N-Utilization Substance G (NusG) protein and its eukaryotic homolog, Spt5, are involved in transcription elongation and termination. Archaea have a eukaryotic-type transcription apparatus but bacterial-type transcription factors. NusG is one of the few archaeal transcription factors that has orthologs in both bacteria and eukaryotes. Archaeal NusG is similar to bacterial NusG, composed of an NGN domain and a Kyrpides Ouzounis and Woese (KOW) repeat. The eukaryotic ortholog, Spt5, is a large protein composed of an acidic N-terminus, an NGN domain, and multiple KOW motifs at its C-terminus. NusG was originally discovered as an N-dependent antitermination enhancing activity in Escherichia coli and has a variety of functions, such as being involved in RNA polymerase elongation and Rho-termination in bacteria. Archaeal NusG forms a complex with DNA-directed RNA polymerase subunit E (rpoE) that is similar to the Spt5-Spt4 complex in eukaryotes.


Pssm-ID: 193576  Cd Length: 82  Bit Score: 45.22  E-value: 7.37e-06
                           10        20        30        40        50        60
                   ....*....|....*....|....*....|....*....|....*....|....*....|.
gi 1904527047  210 IGEERATAISLMRKFiayqfTDTPLQIKSVVAPEHVKGYIYVEAYKQTHVKQAIEGVGNLR 270
Cdd:cd09887      9 AGQERNVADLLAMRA-----EKENLDVYSILVPEELKGYVFVEAEDPDRVEELIRGIPHVR 64
Herpes_BLLF1 pfam05109
Herpes virus major outer envelope glycoprotein (BLLF1); This family consists of the BLLF1 ...
762-985 8.43e-06

Herpes virus major outer envelope glycoprotein (BLLF1); This family consists of the BLLF1 viral late glycoprotein, also termed gp350/220. It is the most abundantly expressed glycoprotein in the viral envelope of the Herpesviruses and is the major antigen responsible for stimulating the production of neutralising antibodies in vivo.


Pssm-ID: 282904 [Multi-domain]  Cd Length: 886  Bit Score: 49.91  E-value: 8.43e-06
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527047  762 STCQTISVdrqrlTTVGSRRPGGmtTTYGRTPMYGSQTPM-YGSGSRTPMYGSQTPLQDgSRTPHYGSQTPlhdGSRTPA 840
Cdd:pfam05109  463 STGPTVST-----ADVTSPTPAG--TTSGASPVTPSPSPRdNGTESKAPDMTSPTSAVT-TPTPNATSPTP---AVTTPT 531
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527047  841 QSGAWDPNNPNTPSRAeeeYEYAFDDEPTPSPQAYGGTPNPQTPGY-------------PDPSSPQV---NPQYNPQ--- 901
Cdd:pfam05109  532 PNATSPTLGKTSPTSA---VTTPTPNATSPTPAVTTPTPNATIPTLgktsptsavttptPNATSPTVgetSPQANTTnht 608
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527047  902 ---TPGTPAMYNTDQFSPYAAPSPQGSYQPSPS------PQSYHQ-VAPSPAGYQNTHSP--ASYHPTPSPMAYQASPSP 969
Cdd:pfam05109  609 lggTSSTPVVTSPPKNATSAVTTGQHNITSSSTssmslrPSSISEtLSPSTSDNSTSHMPllTSAHPTGGENITQVTPAS 688
                          250
                   ....*....|....*.
gi 1904527047  970 SPVGYSPMTPGAPSPG 985
Cdd:pfam05109  689 TSTHHVSTSSPAPRPG 704
PRK10263 PRK10263
DNA translocase FtsK; Provisional
889-1000 9.70e-06

DNA translocase FtsK; Provisional


Pssm-ID: 236669 [Multi-domain]  Cd Length: 1355  Bit Score: 50.08  E-value: 9.70e-06
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527047  889 PSSPQVNPQYNPQTpgtpamynTDQFSPYAAPSPQGSYQPSPSPQSYHQ-VAPSPAGYQNTHSPASYHPTPSPMAYQASP 967
Cdd:PRK10263   740 PHEPLFTPIVEPVQ--------QPQQPVAPQQQYQQPQQPVAPQPQYQQpQQPVAPQPQYQQPQQPVAPQPQYQQPQQPV 811
                           90       100       110
                   ....*....|....*....|....*....|...
gi 1904527047  968 SPSPVGYSPMTPGAPSPGGYNPHTPGSGIEQNS 1000
Cdd:PRK10263   812 APQPQYQQPQQPVAPQPQYQQPQQPVAPQPQDT 844
Pro-rich pfam15240
Proline-rich protein; This family includes several eukaryotic proline-rich proteins.
840-992 1.07e-05

Proline-rich protein; This family includes several eukaryotic proline-rich proteins.


Pssm-ID: 464580 [Multi-domain]  Cd Length: 167  Bit Score: 46.95  E-value: 1.07e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527047  840 AQSGAWDPNNPNTPSR-AEEEYEYAFDDEPTPSPQAYGGTPNPQTPGYPDPSSPQVNPQYNPQTPGTPAMYNTDQfsPYA 918
Cdd:pfam15240   16 AQSSSEDVSQEDSPSLiSEEEGQSQQGGQGPQGPPPGGFPPQPPASDDPPGPPPPGGPQQPPPQGGKQKPQGPPP--QGG 93
                           90       100       110       120       130       140       150
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....
gi 1904527047  919 APSPQGSYQPSPSPQSYHQVAPSPAGYQNTHSPASYHPTPSPMAYQASPSPSPvGYSPMTPGAPSPGGyNPHTP 992
Cdd:pfam15240   94 PRPPPGKPQGPPPQGGNQQQGPPPPGKPQGPPPQGGGPPPQGGNQQGPPPPPP-GNPQGPPQRPPQPG-NPQGP 165
KOW smart00739
KOW (Kyrpides, Ouzounis, Woese) motif; Motif in ribosomal proteins, NusG, Spt5p, KIN17 and T54.
495-522 2.22e-05

KOW (Kyrpides, Ouzounis, Woese) motif; Motif in ribosomal proteins, NusG, Spt5p, KIN17 and T54.


Pssm-ID: 128978  Cd Length: 28  Bit Score: 41.93  E-value: 2.22e-05
                            10        20
                    ....*....|....*....|....*...
gi 1904527047   495 YFKMGDHVKVIAGRFEGDTGLIVRVEEN 522
Cdd:smart00739    1 KFEVGDTVRVIAGPFKGKVGKVLEVDGE 28
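Each hit carries a statistics line (Pssm-ID, Cd Length, Bit Score, E-value) followed by an alignment whose Cdd row gives the first and last model positions. As a rough sketch, assuming the line layout seen in this listing, the Python below parses a statistics line and computes what fraction of the domain model the alignment spans. The names `parse_stats` and `model_coverage` are hypothetical helpers, not an NCBI API.

```python
import re

# Pattern for lines such as:
#   Pssm-ID: 128978  Cd Length: 28  Bit Score: 41.93  E-value: 2.22e-05
# The "[Multi-domain]" tag is optional.
STATS = re.compile(
    r"Pssm-ID:\s*(?P<pssm_id>\d+)\s*(?P<multi>\[Multi-domain\])?\s*"
    r"Cd Length:\s*(?P<cd_length>\d+)\s+"
    r"Bit Score:\s*(?P<bit_score>[\d.]+)\s+"
    r"E-value:\s*(?P<evalue>[\d.eE+-]+)"
)


def parse_stats(line: str) -> dict:
    """Extract the numeric fields from one hit statistics line."""
    match = STATS.search(line)
    if match is None:
        raise ValueError("not a hit statistics line")
    return {
        "pssm_id": int(match["pssm_id"]),
        "multi_domain": match["multi"] is not None,
        "cd_length": int(match["cd_length"]),
        "bit_score": float(match["bit_score"]),
        "evalue": float(match["evalue"]),
    }


def model_coverage(first: int, last: int, cd_length: int) -> float:
    """Fraction of the domain model spanned by model positions first..last."""
    return (last - first + 1) / cd_length


hit = parse_stats("Pssm-ID: 128978  Cd Length: 28  Bit Score: 41.93  E-value: 2.22e-05")
# The smart00739 alignment above runs over model positions 1-28.
print(hit, model_coverage(1, 28, hit["cd_length"]))
```

A coverage near 1.0, as for this KOW hit, means the alignment spans the whole model. Many of the hits against the residue 800-1000 region span only a small part of long multi-domain models.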
PHA03307 PHA03307
transcriptional regulator ICP4; Provisional
815-1002 3.32e-05

transcriptional regulator ICP4; Provisional


Pssm-ID: 223039 [Multi-domain]  Cd Length: 1352  Bit Score: 48.24  E-value: 3.32e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527047  815 TPLQDGSRTPHYGSQ-----------TPLHDGSRTPAqSGAWDPNNPNTPSRAEE-EYEYAFDDEPTPSPQAYGGTPNPQ 882
Cdd:PHA03307    26 ATPGDAADDLLSGSQgqlvsdsaelaAVTVVAGAAAC-DRFEPPTGPPPGPGTEApANESRSTPTWSLSTLAPASPAREG 104
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527047  883 TPGYPDPSSPqvnpqynpqtPGTPAMYNTDQFSPYAAPSPQGSYQPSPSPQSYHQVAPSPAGYQNThSPASYHPTP---- 958
Cdd:PHA03307   105 SPTPPGPSSP----------DPPPPTPPPASPPPSPAPDLSEMLRPVGSPGPPPAASPPAAGASPA-AVASDAASSrqaa 173
                          170       180       190       200
                   ....*....|....*....|....*....|....*....|....*.
gi 1904527047  959 --SPMAYQASPSPSPVGYSPMTPGAPSPGGYNPHTPGSGIEQNSSD 1002
Cdd:PHA03307   174 lpLSSPEETARAPSSPPAEPPPSTPPAAASPRPPRRSSPISASASS 219
PRK07764 PRK07764
DNA polymerase III subunits gamma and tau; Validated
867-990 3.44e-05

DNA polymerase III subunits gamma and tau; Validated


Pssm-ID: 236090 [Multi-domain]  Cd Length: 824  Bit Score: 48.06  E-value: 3.44e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527047  867 EPTPSPQAYGGTPNPQTPGyPDPSSPQVNPQYNPQTPGTPAmyntdqfsPYAAPSPQGSYQPSPSPQSYHQVAPSPAGYQ 946
Cdd:PRK07764   391 AGAPAAAAPSAAAAAPAAA-PAPAAAAPAAAAAPAPAAAPQ--------PAPAPAPAPAPPSPAGNAPAGGAPSPPPAAA 461
                           90       100       110       120
                   ....*....|....*....|....*....|....*....|....
gi 1904527047  947 NTHSPASYHPTPSPMAYQASPSPSPVGYSPMTPGAPSPGGYNPH 990
Cdd:PRK07764   462 PSAQPAPAPAAAPEPTAAPAPAPPAAPAPAAAPAAPAAPAAPAG 505
PHA03247 PHA03247
large tegument protein UL36; Provisional
799-1029 3.73e-05

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 48.40  E-value: 3.73e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527047  799 TPMYGSGSRTPmyGSQTPLQDGSRTPHYGSQTPLHDGSRTPAQSGAWDPNNPNTPSRAEEEYEYAFDDEPTPSPQAYGGT 878
Cdd:PHA03247  2741 PPAVPAGPATP--GGPARPARPPTTAGPPAPAPPAAPAAGPPRRLTRPAVASLSESRESLPSPWDPADPPAAVLAPAAAL 2818
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527047  879 PNPQTPGYPDPSSPQVNPQYNPQTPGTPAMYNTDQFS-----PYAAPSPQGSYQPSPSPQSYHQVA--PSPAGYQNTHSP 951
Cdd:PHA03247  2819 PPAASPAGPLPPPTSAQPTAPPPPPGPPPPSLPLGGSvapggDVRRRPPSRSPAAKPAAPARPPVRrlARPAVSRSTESF 2898
                          170       180       190       200       210       220       230
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*...
gi 1904527047  952 ASYHPTPSPMAYQASPSPSPVGYSPMTPGAPSPGGYNPHTPGSGIEQNSSDWVTTDIQVKVRDTYLDTQVVGQTGVIR 1029
Cdd:PHA03247  2899 ALPPDQPERPPQPQAPPPPQPQPQPPPPPQPQPPPPPPPRPQPPLAPTTDPAGAGEPSGAVPQPWLGALVPGRVAVPR 2976
SP7_N cd22542
N-terminal domain of transcription factor Specificity Protein (SP) 7; Specificity Proteins ...
826-994 3.93e-05

N-terminal domain of transcription factor Specificity Protein (SP) 7; Specificity Proteins (SPs) are transcription factors that are involved in many cellular processes, including cell differentiation, cell growth, apoptosis, immune responses, response to DNA damage, and chromatin remodeling. SP7, also called Osterix (Osx) in humans, is highly conserved among bone-forming vertebrates. It plays a major role, along with Runx2 and Dlx5, in driving the differentiation of mesenchymal precursor cells into osteoblasts and eventually osteocytes. SP7 also plays a regulatory role by inhibiting chondrocyte differentiation, maintaining the balance between differentiation of mesenchymal precursor cells into ossified bone or cartilage. Mutations of this gene have been associated with multiple dysfunctional bone phenotypes in vertebrates. SP7 is thought to play a role in diseases such as Osteogenesis imperfecta. SP7 belongs to a family of proteins, called the SP/Kruppel or Krueppel-like Factor (KLF) family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. SP factors preferentially bind GC boxes, while KLFs bind CACCC boxes. Another characteristic hallmark of SP factors is the presence of the Buttonhead (BTD) box CXCPXC, just N-terminal to the zinc fingers. The function of the BTD box is unknown, but it is thought to play an important physiological role. Another feature of most SP factors is the presence of a conserved amino acid stretch, the so-called SP box, located close to the N-terminus.
SP factors may be separated into three groups based on their domain architecture and the similarity of their N-terminal transactivation domains: SP1-4, SP5, and SP6-9. The transactivation domains between the three groups are not homologous to one another. This model represents the N-terminal domain of SP7.
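The degenerate consensus NNRCRCCYY quoted in the description above uses IUPAC nucleotide codes. As an illustration only, the Python sketch below expands such a consensus into a regular expression and scans a DNA string for it. The helper name `consensus_to_regex` and the example sequence are hypothetical, and only the codes that appear in the description (N, R, Y) plus the four bases are handled.

```python
import re

# IUPAC codes used in the description: R = A/G, Y = C/T, N = any nucleotide.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "N": "[ACGT]",
}


def consensus_to_regex(consensus: str) -> re.Pattern:
    """Translate a degenerate consensus string into a compiled regex."""
    return re.compile("".join(IUPAC[code] for code in consensus.upper()))


pattern = consensus_to_regex("NNRCRCCYY")

# Hypothetical sequence fragment, made up for illustration.
fragment = "TTAAGCGCCCTGA"
match = pattern.search(fragment)
if match:
    print(match.start(), match.group())  # prints: 2 AAGCGCCCT
```

This scans one strand only. A fuller search would also test the reverse complement, and the description itself calls the motif a loose consensus, so real binding sites need not match it exactly.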


Pssm-ID: 411691 [Multi-domain]  Cd Length: 297  Bit Score: 46.82  E-value: 3.93e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527047  826 YGSQTPLHDgSRTPAQSGAWDPNNPNTP--------SRAEE----EYEYAFDD-----EPTPSPQA---YGGTPNPQTPG 885
Cdd:cd22542     26 FGGSSPIRD-SATPGKPGNNPGKKPYSLgsdlssakSRSSElmgdSYTATFSSgnglmSPSGSPQAsttYGNDYNPFSHS 104
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527047  886 YPDPSSPQ----VNPQYNPQTPGTPAMYNT-DQFSPY-----AAPSPQGSYQPSPSPQSYHQVAPSPAGYQNTHSPASYH 955
Cdd:cd22542    105 FPTSSGSQdpslLVSKGHPSADCLPSVYTSlDMAHPYgswykTGIHPGISSSSTNATASWWDMHSNTNWLSAQGQPDGLQ 184
                          170       180       190
                   ....*....|....*....|....*....|....*....
gi 1904527047  956 PTPSPMAYQASPSPSPVGYSPMTPgaPSPGGYNPHTPGS 994
Cdd:cd22542    185 ASLQPVPAQTPLNPQLPSYTEFTT--LNPAPYPAVGISS 221
KOW pfam00467
KOW motif; This family has been extended to coincide with ref. The KOW (Kyrpides, Ouzounis, ...
730-761 4.02e-05

KOW motif; This family has been extended to coincide with ref. The KOW (Kyrpides, Ouzounis, Woese) motif is found in a variety of ribosomal proteins and NusG.


Pssm-ID: 425698 [Multi-domain]  Cd Length: 32  Bit Score: 41.60  E-value: 4.02e-05
                           10        20        30
                   ....*....|....*....|....*....|..
gi 1904527047  730 IGQTVRISQGPYKGYIGVVKDATESTARVELH 761
Cdd:pfam00467    1 KGDVVRVIAGPFKGKVGKVVEVDDKKNRVLVE 32
PspC_subgroup_2 NF033839
pneumococcal surface protein PspC, LPXTG-anchored form; The pneumococcal surface protein PspC, ...
848-984 4.79e-05

pneumococcal surface protein PspC, LPXTG-anchored form; The pneumococcal surface protein PspC, as described in Streptococcus pneumoniae, is a repetitive and highly variable protein, recognized by a conserved N-terminal domain and also by genomic location. This form, subgroup 2, is anchored covalently after cleavage by sortase at a C-terminal LPXTG site. The other form, subgroup 1, has variable numbers of a choline-binding repeat in the C-terminal region, and is also known as choline-binding protein A.


Pssm-ID: 468202 [Multi-domain]  Cd Length: 557  Bit Score: 47.46  E-value: 4.79e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527047  848 NNPNTPSRAEEEYEYAFDD-EPTPSPQAYGGTPN-PQTPGYPDPSSPQVNPQYNPQTPGTPAMYNTDQFSPYAAPSPQG- 924
Cdd:NF033839   249 DNVNTKVEIENTVHKIFADmDAVVTKFKKGLTQDtPKEPGNKKPSAPKPGMQPSPQPEKKEVKPEPETPKPEVKPQLEKp 328
                           90       100       110       120       130       140
                   ....*....|....*....|....*....|....*....|....*....|....*....|.
gi 1904527047  925 SYQPSPSPQSYH-QVAPSPAGYQNTHSPASYHPTPSPMAYQASPSPSpvgySPMTPGAPSP 984
Cdd:NF033839   329 KPEVKPQPEKPKpEVKPQLETPKPEVKPQPEKPKPEVKPQPEKPKPE----VKPQPETPKP 385
PHA03291 PHA03291
envelope glycoprotein I; Provisional
863-1010 5.69e-05

envelope glycoprotein I; Provisional


Pssm-ID: 223033 [Multi-domain]  Cd Length: 401  Bit Score: 46.87  E-value: 5.69e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527047  863 AFDDEPTPSPQAYGGTPnpqTPGYPDPSSPQVNPQYNPqtpgtpamynTDQFSPyAAPSPQGSYQPSPspqsyhQVAPSP 942
Cdd:PHA03291   165 AFPAEGTLAAPPLGEGS---ADGSCDPALPLSAPRLGP----------ADVFVP-ATPRPTPRTTASP------ETTPTP 224
                           90       100       110       120       130       140
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*...
gi 1904527047  943 AgyqNTHSPASyHPTPSPMAYQASPSPSPVGYSPMTPGAPSPGGYNPHTPGSGIEQNSSDWVTTDIQV 1010
Cdd:PHA03291   225 S---TTTSPPS-TTIPAPSTTIAAPQAGTTPEAEGTPAPPTPGGGEAPPANATPAPEASRYELTVTQI 288
PRK14959 PRK14959
DNA polymerase III subunits gamma and tau; Provisional
868-982 9.17e-05

DNA polymerase III subunits gamma and tau; Provisional


Pssm-ID: 184923 [Multi-domain]  Cd Length: 624  Bit Score: 46.60  E-value: 9.17e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527047  868 PTPSPQAYGGTPNPQTPGyPDPSSPQVNPQYNPQTPGTPAmyntdqfsPYAAPSPQGSYQPSPSPQSYHQVAPSPAGYQN 947
Cdd:PRK14959   387 EGPASGGAATIPTPGTQG-PQGTAPAAGMTPSSAAPATPA--------PSAAPSPRVPWDDAPPAPPRSGIPPRPAPRMP 457
                           90       100       110
                   ....*....|....*....|....*....|....*.
gi 1904527047  948 THSPASYHPTPSPMAYQASPSPS-PVGYSPMTPGAP 982
Cdd:PRK14959   458 EASPVPGAPDSVASASDAPPTLGdPSDTAEHTPSGP 493
Atrophin-1 pfam03154
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ...
815-984 1.04e-04

Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA (OMIM:125370) is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1 that is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.


Pssm-ID: 460830 [Multi-domain]  Cd Length: 991  Bit Score: 46.68  E-value: 1.04e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527047  815 TPLQDGSRTPHYGSQTPLHDGSRTPAQSGAWDPNNPNTPSRAEEEYE--YAFDDEPTPSPQAYGGTPNPQTPGYPDP--- 889
Cdd:pfam03154   75 SPLKSAKRQREKGASDTEEPERATAKKSKTQEISRPNSPSEGEGESSdgRSVNDEGSSDPKDIDQDNRSTSPSIPSPqdn 154
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527047  890 -----SS----------PQVNPQYNPQTPGTPAMYNTDQfSPYAAPSPQGsyqPSPSPQSYHQVAPSPAGYQNTHSPASY 954
Cdd:pfam03154  155 esdsdSSaqqqilqtqpPVLQAQSGAASPPSPPPPGTTQ-AATAGPTPSA---PSVPPQGSPATSQPPNQTQSTAAPHTL 230
                          170       180       190
                   ....*....|....*....|....*....|
gi 1904527047  955 HPTPSPMAYQASPSPSPvGYSPMTPGAPSP 984
Cdd:pfam03154  231 IQQTPTLHPQRLPSPHP-PLQPMTQPPPPS 259
Herpes_BLLF1 pfam05109
Herpes virus major outer envelope glycoprotein (BLLF1); This family consists of the BLLF1 ...
869-984 1.33e-04

Herpes virus major outer envelope glycoprotein (BLLF1); This family consists of the BLLF1 viral late glycoprotein, also termed gp350/220. It is the most abundantly expressed glycoprotein in the viral envelope of the Herpesviruses and is the major antigen responsible for stimulating the production of neutralising antibodies in vivo.


Pssm-ID: 282904 [Multi-domain]  Cd Length: 886  Bit Score: 46.06  E-value: 1.33e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527047  869 TPSPQAYGGTPNPQTPGY--PD-----PSSPQVNPQYNPQTPGTPAMYNTDQFSPYAAPSPQGSYQPSPSPQSYHQVAPS 941
Cdd:pfam05109  422 SKAPESTTTSPTLNTTGFaaPNtttglPSSTHVPTNLTAPASTGPTVSTADVTSPTPAGTTSGASPVTPSPSPRDNGTES 501
                           90       100       110       120
                   ....*....|....*....|....*....|....*....|...
gi 1904527047  942 PAgyQNTHSPASYHPTPSPMAyqasPSPSPVGYSPmTPGAPSP 984
Cdd:pfam05109  502 KA--PDMTSPTSAVTTPTPNA----TSPTPAVTTP-TPNATSP 537
PRK07764 PRK07764
DNA polymerase III subunits gamma and tau; Validated
819-1007 1.44e-04

DNA polymerase III subunits gamma and tau; Validated


Pssm-ID: 236090 [Multi-domain]  Cd Length: 824  Bit Score: 46.13  E-value: 1.44e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527047  819 DGSRTPHYGSQTPLHDGSR--TPAQSGAWDPNNPNTPSRAEEEYEYAFDDEPTPSPQAYGGTPNPQTPGYPDPSSPQVNP 896
Cdd:PRK07764   596 GGEGPPAPASSGPPEEAARpaAPAAPAAPAAPAPAGAAAAPAEASAAPAPGVAAPEHHPKHVAVPDASDGGDGWPAKAGG 675
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527047  897 QYNPQTPGTPAMyntdqfSPYAAPSPQGSYQPSPSPQSYHQVAPSPAGYQNTHSPASYHPTPSPMAYQASPSPSPVGYSP 976
Cdd:PRK07764   676 AAPAAPPPAPAP------AAPAAPAGAAPAQPAPAPAATPPAGQADDPAAQPPQAAQGASAPSPAADDPVPLPPEPDDPP 749
                          170       180       190
                   ....*....|....*....|....*....|.
gi 1904527047  977 MTPGAPSPGGYNPHTPGSGIEQNSSDWVTTD 1007
Cdd:PRK07764   750 DPAGAPAQPPPPPAPAPAAAPAAAPPPSPPS 780
KOW cd00380
KOW: an acronym for the authors' surnames (Kyrpides, Ouzounis and Woese); KOW domain is known ...
621-658 1.62e-04

KOW: an acronym for the authors' surnames (Kyrpides, Ouzounis and Woese); KOW domain is known as an RNA-binding motif that is shared so far among some families of ribosomal proteins, the essential bacterial transcriptional elongation factor NusG, the eukaryotic chromatin elongation factor Spt5, the higher eukaryotic KIN17 proteins and Mtr4. The KOW motif contains an invariant glycine residue and comprises alternating blocks of hydrophilic and hydrophobic residues.


Pssm-ID: 240504  Cd Length: 49  Bit Score: 40.28  E-value: 1.62e-04
                           10        20        30
                   ....*....|....*....|....*....|....*...
gi 1904527047  621 KDIVKVIDGPHSDREGEIRHLYHSFAFLHCKKLVENGG 658
Cdd:cd00380      1 GDVVRVLRGPYKGREGVVVDIDPRFGIVTVKGATGSKG 38
dnaA PRK14086
chromosomal replication initiator protein DnaA;
809-973 1.65e-04

chromosomal replication initiator protein DnaA;


Pssm-ID: 237605 [Multi-domain]  Cd Length: 617  Bit Score: 45.59  E-value: 1.65e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527047  809 PMY-GSQTPLQDGSRTPHYGSQTPLHDGSRTPAQSGAWDPNnPNTPSRAEEEYEYAFDDEPTPSPQAYGGTPNPQTPGYP 887
Cdd:PRK14086   103 RRTsEPELPRPGRRPYEGYGGPRADDRPPGLPRQDQLPTAR-PAYPAYQQRPEPGAWPRAADDYGWQQQRLGFPPRAPYA 181
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527047  888 DPSSPQVNPQYNPQTPGtpamyntDQFSPYAAPSPQGSY--------------QPSPSPQSYHQV--APSPAGYQNTHSP 951
Cdd:PRK14086   182 SPASYAPEQERDREPYD-------AGRPEYDQRRRDYDHprpdwdrprrdrtdRPEPPPGAGHVHrgGPGPPERDDAPVV 254
                          170       180
                   ....*....|....*....|..
gi 1904527047  952 ASYHPTPSPMAYQASPSPSPVG 973
Cdd:PRK14086   255 PIRPSAPGPLAAQPAPAPGPGE 276
PRK07764 PRK07764
DNA polymerase III subunits gamma and tau; Validated
839-994 2.21e-04

DNA polymerase III subunits gamma and tau; Validated


Pssm-ID: 236090 [Multi-domain]  Cd Length: 824  Bit Score: 45.36  E-value: 2.21e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527047  839 PAQSGAWDPNNPNTPSRAEEEYEYAfdDEPTPS---PQAYGGTPNPQTPGYPDPSSPQVNPQYNPQTPGTPAMYNTDQFS 915
Cdd:PRK07764   592 PGAAGGEGPPAPASSGPPEEAARPA--APAAPAapaAPAPAGAAAAPAEASAAPAPGVAAPEHHPKHVAVPDASDGGDGW 669
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527047  916 PYAAPSPQGSyQPSPSPQSYHQVAPS--PAGYQNTHSPASYHP---------TPSPMAYQASPSPSPVGYSPMTPGAPSP 984
Cdd:PRK07764   670 PAKAGGAAPA-APPPAPAPAAPAAPAgaAPAQPAPAPAATPPAgqaddpaaqPPQAAQGASAPSPAADDPVPLPPEPDDP 748
                          170
                   ....*....|
gi 1904527047  985 GGYNPHTPGS 994
Cdd:PRK07764   749 PDPAGAPAQP 758
KOW pfam00467
KOW motif; This family has been extended to coincide with ref. The KOW (Kyrpides, Ouzounis, ...
499-527 2.22e-04

KOW motif; This family has been extended to coincide with ref. The KOW (Kyrpides, Ouzounis, Woese) motif is found in a variety of ribosomal proteins and NusG.


Pssm-ID: 425698 [Multi-domain]  Cd Length: 32  Bit Score: 39.29  E-value: 2.22e-04
                           10        20        30
                   ....*....|....*....|....*....|.
gi 1904527047  499 GDHVKVIAGRFEGDTGLIVRVEE--NFIILF 527
Cdd:pfam00467    2 GDVVRVIAGPFKGKVGKVVEVDDkkNRVLVE 32
Atrophin-1 pfam03154
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ...
840-984 2.41e-04

Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA (OMIM:125370) is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1 that is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.


Pssm-ID: 460830 [Multi-domain]  Cd Length: 991  Bit Score: 45.53  E-value: 2.41e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527047  840 AQSGAWDPNNPNTPSRAEEEYEYAFDDEPTPSPQayGGTPNPQTPGYPDPSSP-----QVNPQYNPQTpgTPAMYNTDQF 914
Cdd:pfam03154  176 AQSGAASPPSPPPPGTTQAATAGPTPSAPSVPPQ--GSPATSQPPNQTQSTAAphtliQQTPTLHPQR--LPSPHPPLQP 251
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527047  915 SPYAAPSPQGSYQPSPSPQSY------------------HQVAPSPAGYQNTHSPASYHPTPSPMAyqASPSPSPVGYSP 976
Cdd:pfam03154  252 MTQPPPPSQVSPQPLPQPSLHgqmppmphslqtgpshmqHPVPPQPFPLTPQSSQSQVPPGPSPAA--PGQSQQRIHTPP 329

                   ....*...
gi 1904527047  977 MTPGAPSP 984
Cdd:pfam03154  330 SQSQLQSQ 337
Pneumo_att_G pfam05539
Pneumovirinae attachment membrane glycoprotein G;
782-958 2.55e-04

Pneumovirinae attachment membrane glycoprotein G;


Pssm-ID: 114270 [Multi-domain]  Cd Length: 408  Bit Score: 44.65  E-value: 2.55e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527047  782 PGGMTTTYGRTPMyGSQTPMYGSGSRTPMYGSQT----PLQDG-SRTPHYGSQTPLHDgSRTPAQSGAWDP--NNPNTPS 854
Cdd:pfam05539  201 TQGHQTATANQRL-SSTEPVGTQGTTTSSNPEPQteppPSQRGpSGSPQHPPSTTSQD-QSTTGDGQEHTQrrKTPPATS 278
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527047  855 RAEEEYEYAFDDEPTPSPQAYGGTPNPQTPGYPDPSSPQVNPQYNPQTPGTPAMYNTDQFSPyaaPSPQ------GSYQP 928
Cdd:pfam05539  279 NRRSPHSTATPPPTTKRQETGRPTPRPTATTQSGSSPPHSSPPGVQANPTTQNLVDCKELDP---PKPNsicygvGIYNE 355
                          170       180       190
                   ....*....|....*....|....*....|
gi 1904527047  929 SpSPQSYHQVAPSPAGYqNTHSPASYHPTP 958
Cdd:pfam05539  356 A-LPRGCDIVVPLCSTY-TIMCMDTYYSKP 383
DUF1373 pfam07117
Protein of unknown function (DUF1373); This family consists of several hypothetical proteins ...
850-964 2.60e-04

Protein of unknown function (DUF1373); This family consists of several hypothetical proteins which seem to be specific to Oryzias latipes (Japanese ricefish). Members of this family are typically around 200 residues in length. The function of this family is unknown.


Pssm-ID: 462093 [Multi-domain]  Cd Length: 212  Bit Score: 43.63  E-value: 2.60e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527047  850 PNTPSRAEEEYEY----AFDDEPTPSPQAYGGTPNPQTPGYPDPSSPQVNPQYNPQTPGTPAMYNTDQFSPYAAPSPQGS 925
Cdd:pfam07117   42 PPRPEEEEGQGGGggtfPFPGSPEPEPGGGGSGPMPMSASAPEPEPAKAKPQRPAPAQGHGHGGGGDSDSSGSGSGHQGS 121
                           90       100       110       120
                   ....*....|....*....|....*....|....*....|..
gi 1904527047  926 YQP---SPSPQSYHQVAPSPAGYQNTHSPasyHPTPSPMAYQ 964
Cdd:pfam07117  122 GGAgagAGAPGHQHEQEQESSSSDDDDED---EFEFTPEEDE 160
PTZ00395 PTZ00395
Sec24-related protein; Provisional
812-990 3.56e-04

Sec24-related protein; Provisional


Pssm-ID: 185594 [Multi-domain]  Cd Length: 1560  Bit Score: 45.07  E-value: 3.56e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527047  812 GSQTPLQDGSRTPHYGSQTPL-HDGSRTPAQSGAWDPNNPNTPSRAEEEYEYAFDDEPTPSPQAYGGTP--NP--QTPGY 886
Cdd:PTZ00395   345 GSPNAASAGAPFNGLGNQADGgHINQVHPDARGAWAGGPHSNASYNCAAYSNAAQSNAAQSNAGFSNAGysNPgnSNPGY 424
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527047  887 PDP---SSPQVNPQY------NPQTPGTPamYNTDQFSPYAAPSPQGSYQPSPSPQSYHQVAPSPAGYQNTHSPASYHPT 957
Cdd:PTZ00395   425 NNApnsNTPYNNPPNsntpysNPPNSNPP--YSNLPYSNTPYSNAPLSNAPPSSAKDHHSAYHAAYQHRAANQPAANLPT 502
                          170       180       190
                   ....*....|....*....|....*....|...
gi 1904527047  958 PSPMAyqASPSPSPVGYSPMTPGAPSPGGYNPH 990
Cdd:PTZ00395   503 ANQPA--ANNFHGAAGNSVGNPFASRPFGSAPY 533
PRK10263 PRK10263
DNA translocase FtsK; Provisional
809-974 3.65e-04

DNA translocase FtsK; Provisional


Pssm-ID: 236669 [Multi-domain]  Cd Length: 1355  Bit Score: 44.69  E-value: 3.65e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527047  809 PMYGSQTPLQDGSR--TPHYGSQTPlhDGSRTPAQSGaWDPNNPNTPSRAEEEYEYAFDDEPTPSPQAYGGTPNPQTPGY 886
Cdd:PRK10263   345 PVASVDVPPAQPTVawQPVPGPQTG--EPVIAPAPEG-YPQQSQYAQPAVQYNEPLQQPVQPQQPYYAPAAEQPAQQPYY 421
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527047  887 -PDPSSPQVNPQYNPQtPGTPAMYNtdqfsPYAAPSPQGSYQPSPSPQSYHQ-VAPSPAGYQNTHSPASYHPT---PSPM 961
Cdd:PRK10263   422 aPAPEQPAQQPYYAPA-PEQPVAGN-----AWQAEEQQSTFAPQSTYQTEQTyQQPAAQEPLYQQPQPVEQQPvvePEPV 495
                          170
                   ....*....|...
gi 1904527047  962 AYQASPSPSPVGY 974
Cdd:PRK10263   496 VEETKPARPPLYY 508
PHA03247 PHA03247
large tegument protein UL36; Provisional
850-992 3.73e-04

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 44.93  E-value: 3.73e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527047  850 PNTPS-RAEEEYEYAFDDEPTPSPQAyGGTPNPQTPGYPDPSSPQVNPQYNPQTPGTPAMY----NTDQFSPYAAPSPQG 924
Cdd:PHA03247  2475 PGAPVyRRPAEARFPFAAGAAPDPGG-GGPPDPDAPPAPSRLAPAILPDEPVGEPVHPRMLtwirGLEELASDDAGDPPP 2553
                           90       100       110       120       130       140       150
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527047  925 SYQPSPSPQSYHQVAPSPagyqnthSPASYHPTPSPMAYQASPSPSPVGYSPMTPGAP--SPGGYNPHTP 992
Cdd:PHA03247  2554 PLPPAAPPAAPDRSVPPP-------RPAPRPSEPAVTSRARRPDAPPQSARPRAPVDDrgDPRGPAPPSP 2616
dnaA PRK14086
chromosomal replication initiator protein DnaA;
867-993 3.88e-04

chromosomal replication initiator protein DnaA;


Pssm-ID: 237605 [Multi-domain]  Cd Length: 617  Bit Score: 44.43  E-value: 3.88e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527047  867 EPTPSPQAYGGTPNPQTPGYPDPSSPQVNPQYNPQTPGTPAMYNTDQFSPYAAPSPQGSYQPSPSPQsyhqvAPSPAGYQ 946
Cdd:PRK14086    94 EPAPPPPHARRTSEPELPRPGRRPYEGYGGPRADDRPPGLPRQDQLPTARPAYPAYQQRPEPGAWPR-----AADDYGWQ 168
                           90       100       110       120
                   ....*....|....*....|....*....|....*....|....*...
gi 1904527047  947 NT-HSPASYHPTPSPMAYQASPSPSPVGYSPMTPGAPSPGGyNPHTPG 993
Cdd:PRK14086   169 QQrLGFPPRAPYASPASYAPEQERDREPYDAGRPEYDQRRR-DYDHPR 215
PHA03307 PHA03307
transcriptional regulator ICP4; Provisional
834-994 4.23e-04

transcriptional regulator ICP4; Provisional


Pssm-ID: 223039 [Multi-domain]  Cd Length: 1352  Bit Score: 44.78  E-value: 4.23e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527047  834 DGSRTPAQSGAWDPNNPNTPSRAeeeyeyAFDDEPTPSPQAYGGTPNPQTPGYPDPSSP--QVNPQYNPQTPGTPAMYNT 911
Cdd:PHA03307   238 DSSSSESSGCGWGPENECPLPRP------APITLPTRIWEASGWNGPSSRPGPASSSSSprERSPSPSPSSPGSGPAPSS 311
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527047  912 DQFSPYAAPSPQGSyQPSPSPQSyhqVAPSPAGyqnTHSPASYHPTPSPmayqASPSPSPVGYSPMTPGAPSPGGYNPHT 991
Cdd:PHA03307   312 PRASSSSSSSRESS-SSSTSSSS---ESSRGAA---VSPGPSPSRSPSP----SRPPPPADPSSPRKRPRPSRAPSSPAA 380

                   ...
gi 1904527047  992 PGS 994
Cdd:PHA03307   381 SAG 383
PRK14950 PRK14950
DNA polymerase III subunits gamma and tau; Provisional
879-1065 4.44e-04

DNA polymerase III subunits gamma and tau; Provisional


Pssm-ID: 237864 [Multi-domain]  Cd Length: 585  Bit Score: 44.42  E-value: 4.44e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527047  879 PNPQTPGYPDPSSPQVNPQYNPQTPGTPAMyntdqfspyAAPSPQGSYQPSPSPQSyhQVAPSPAGYQNTHSPASYHPTP 958
Cdd:PRK14950   364 PAPQPAKPTAAAPSPVRPTPAPSTRPKAAA---------AANIPPKEPVRETATPP--PVPPRPVAPPVPHTPESAPKLT 432
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527047  959 spmayqasPSPSPVGYSPM-TPGAPSPGGYNP-HTPGSGIEQNSSDWVTTDIQVKVRDTYLdtQVVGQTGViRSVTggmc 1036
Cdd:PRK14950   433 --------RAAIPVDEKPKyTPPAPPKEEEKAlIADGDVLEQLEAIWKQILRDVPPRSPAV--QALLSSGV-RPVS---- 497
                          170       180       190
                   ....*....|....*....|....*....|
gi 1904527047 1037 svyLKDSEKVVSISSE-HLEPITPTKNNKV 1065
Cdd:PRK14950   498 ---VEKNTLTLSFKSKfHKDKIEEPENRKI 524
PRK12323 PRK12323
DNA polymerase III subunit gamma/tau;
870-1001 4.76e-04

DNA polymerase III subunit gamma/tau;


Pssm-ID: 237057 [Multi-domain]  Cd Length: 700  Bit Score: 44.10  E-value: 4.76e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527047  870 PSPQAYGGTPNPQTPGYPDPSSPQVNPQYNPQTPGTPAMYNTDQFSPYAAPSPQGSYQPSPSPQSYHQVAPSPAGYQNTH 949
Cdd:PRK12323   392 PAAAAPAPAAPPAAPAAAPAAAAAARAVAAAPARRSPAPEALAAARQASARGPGGAPAPAPAPAAAPAAAARPAAAGPRP 471
                           90       100       110       120       130
                   ....*....|....*....|....*....|....*....|....*....|....*
gi 1904527047  950 SPASYHPTPSPMAYQASPSPSPVGYSP---MTPGAPSPGGYNPHTPGSGIEQNSS 1001
Cdd:PRK12323   472 VAAAAAAAPARAAPAAAPAPADDDPPPweeLPPEFASPAPAQPDAAPAGWVAESI 526
Med26_M pfam15694
Mediator complex subunit 26 middle domain; Med26_M is the middle domain of subunit 26 of ...
882-998 7.84e-04

Mediator complex subunit 26 middle domain; Med26_M is the middle domain of subunit 26 of Mediator. Med19 and Med26 act synergistically to mediate the interaction between REST (a Kruppel-type zinc finger transcription factor that binds to a 21-bp RE1 silencing element present in over 900 human genes) and Mediator.


Pssm-ID: 464807 [Multi-domain]  Cd Length: 255  Bit Score: 42.55  E-value: 7.84e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527047  882 QTPGYPDPSSPQvNPQYNPQTpgTPAMYNTDQFSPYAapsPQGSY-QPSPSPQSYHQVAPSPAGYQNTHSP--------A 952
Cdd:pfam15694   81 ETGGPPQPKSPR-CSSFSPRN--SRHETFARRSSTYA---PKGSVpSPSPRSQVLDAQVPSPLPLSQPSTPpvqakrleK 154
                           90       100       110       120       130
                   ....*....|....*....|....*....|....*....|....*....|..
gi 1904527047  953 SYHPTP-SPMAYQASPS-----PSPVGYSPMTPGAPSPGGYNPHTPGSGIEQ 998
Cdd:pfam15694  155 PPQSSPeSSQHWLEQSDseshqRHQDGSATLLSQSVSPGCKTPLHPGENSLP 206
PspC_subgroup_2 NF033839
pneumococcal surface protein PspC, LPXTG-anchored form; The pneumococcal surface protein PspC, ...
866-1007 7.86e-04

pneumococcal surface protein PspC, LPXTG-anchored form; The pneumococcal surface protein PspC, as described in Streptococcus pneumoniae, is a repetitive and highly variable protein, recognized by a conserved N-terminal domain and also by genomic location. This form, subgroup 2, is anchored covalently after cleavage by sortase at a C-terminal LPXTG site. The other form, subgroup 1, has variable numbers of a choline-binding repeat in the C-terminal region, and is also known as choline-binding protein A.


Pssm-ID: 468202 [Multi-domain]  Cd Length: 557  Bit Score: 43.60  E-value: 7.86e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527047  866 DEPTPSPQAYGGTPNPQTPGYPDPSSPQVNPQynPQTPGTPAMYNTDQFSPYAAPSPQ-GSYQPSPSPQSYH-QVAPSPA 943
Cdd:NF033839   370 EKPKPEVKPQPETPKPEVKPQPEKPKPEVKPQ--PEKPKPEVKPQPEKPKPEVKPQPEkPKPEVKPQPEKPKpEVKPQPE 447
                           90       100       110       120       130       140       150
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527047  944 GYQNTHSPASYHPTPSPMAYQASPSPSpVGYSPMTP----GAPSPGGYNPHTPG--SGIEQNSSDWVTTD 1007
Cdd:NF033839   448 KPKPEVKPQPETPKPEVKPQPEKPKPE-VKPQPEKPkpdnSKPQADDKKPSTPNnlSKDKQPSNQASTNE 516
PHA03307 PHA03307
transcriptional regulator ICP4; Provisional
820-992 8.19e-04

transcriptional regulator ICP4; Provisional


Pssm-ID: 223039 [Multi-domain]  Cd Length: 1352  Bit Score: 43.62  E-value: 8.19e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527047  820 GSRTPHYGSQTPLHDGSRTPAQSGAWDPnnPNTPSRAEEEYEYAFDDEPTPSPQAYGGTPNPQTPGYPDPSSPQvnPQYN 899
Cdd:PHA03307   773 ALLEPAEPQRGAGSSPPVRAEAAFRRPG--RLRRSGPAADAASRTASKRKSRSHTPDGGSESSGPARPPGAAAR--PPPA 848
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527047  900 PQTPGTPAMyntDQFSPYAAPSPQGSYQPSPSPQSYHQVAPSPAGYQNTHSPASyHPTPSPMAyqaspsPSPVGYSPMTP 979
Cdd:PHA03307   849 RSSESSKSK---PAAAGGRARGKNGRRRPRPPEPRARPGAAAPPKAAAAAPPAG-APAPRPRP------APRVKLGPMPP 918
                          170       180
                   ....*....|....*....|
gi 1904527047  980 GAPSP-GGY------NPHTP 992
Cdd:PHA03307   919 GGPDPrGGFrrvppgDLHTP 938
PHA03369 PHA03369
capsid maturational protease; Provisional
838-935 9.90e-04

capsid maturational protease; Provisional


Pssm-ID: 223061 [Multi-domain]  Cd Length: 663  Bit Score: 43.06  E-value: 9.90e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527047  838 TPAQSGAWDPNnPNTPSRAEEE-YEYAFDDEPTPSPQAYGGTPNPQTPGYPDPSSPQVnpqynPQTPGTPAMYNTDQFSP 916
Cdd:PHA03369   353 LTAPSRVLAAA-AKVAVIAAPQtHTGPADRQRPQRPDGIPYSVPARSPMTAYPPVPQF-----CGDPGLVSPYNPQSPGT 426
                           90
                   ....*....|....*....
gi 1904527047  917 YAAPSPQGSYQPSPSPQSY 935
Cdd:PHA03369   427 SYGPEPVGPVPPQPTNPYV 445
PHA03307 PHA03307
transcriptional regulator ICP4; Provisional
816-985 1.12e-03

transcriptional regulator ICP4; Provisional


Pssm-ID: 223039 [Multi-domain]  Cd Length: 1352  Bit Score: 43.24  E-value: 1.12e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527047  816 PLQDGSRTPHYGSQT--PLHDGSRTPAQSGAWDPNNP-------NTPSRAEEEYEYAFDDEPTPSPQAYGGTPNPQTPGY 886
Cdd:PHA03307   195 PSTPPAAASPRPPRRssPISASASSPAPAPGRSAADDagasssdSSSSESSGCGWGPENECPLPRPAPITLPTRIWEASG 274
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527047  887 PDPSSPQVNPQYNPQTPGtpamyntdqfSPYAAPSP---QGSYQPSPSPQSYHQVAPSPAGyqnTHSPASYHPTPSPMAY 963
Cdd:PHA03307   275 WNGPSSRPGPASSSSSPR----------ERSPSPSPsspGSGPAPSSPRASSSSSSSRESS---SSSTSSSSESSRGAAV 341
                          170       180
                   ....*....|....*....|..
gi 1904527047  964 QASPSPSPVGYSPMTPGAPSPG 985
Cdd:PHA03307   342 SPGPSPSRSPSPSRPPPPADPS 363
dnaA PRK14086
chromosomal replication initiator protein DnaA;
847-995 1.21e-03

chromosomal replication initiator protein DnaA;


Pssm-ID: 237605 [Multi-domain]  Cd Length: 617  Bit Score: 42.89  E-value: 1.21e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527047  847 PNNPNTPSRAEEeyeyafddeptPSPQAYGGTPNPQTPGYPDPSSPQVNPQYnPQTPGTPAMY--NTDQFSPYAAPSPQG 924
Cdd:PRK14086    96 APPPPHARRTSE-----------PELPRPGRRPYEGYGGPRADDRPPGLPRQ-DQLPTARPAYpaYQQRPEPGAWPRAAD 163
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527047  925 SYQPSPSPQSYhqvaPSPAGYQnthSPASYHPTPSPMAY----------QASPSPSPVGYSPMTPGA-------PSPGGY 987
Cdd:PRK14086   164 DYGWQQQRLGF----PPRAPYA---SPASYAPEQERDREpydagrpeydQRRRDYDHPRPDWDRPRRdrtdrpePPPGAG 236

                   ....*...
gi 1904527047  988 NPHTPGSG 995
Cdd:PRK14086   237 HVHRGGPG 244
PHA03369 PHA03369
capsid maturational protease; Provisional
878-1015 1.26e-03

capsid maturational protease; Provisional


Pssm-ID: 223061 [Multi-domain]  Cd Length: 663  Bit Score: 43.06  E-value: 1.26e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527047  878 TPNPQTPGYPDPSSPQVNPqynPQTPGTPAM---YNTDQFSPYAAPSPQGSYQPSPSPQSYHQVAPSpAGYQNTHSPASY 954
Cdd:PHA03369   353 LTAPSRVLAAAAKVAVIAA---PQTHTGPADrqrPQRPDGIPYSVPARSPMTAYPPVPQFCGDPGLV-SPYNPQSPGTSY 428
                           90       100       110       120       130       140
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*
gi 1904527047  955 HPTPSPMAYQASPSPS--PVGYSPMT-PGAPSPGGYnpHTPGS-GIEQNSSDWVTTDIQVKVRDT 1015
Cdd:PHA03369   429 GPEPVGPVPPQPTNPYvmPISMANMVyPGHPQEHGH--ERKRKrGGELKEELIETLKLVKKLKEE 491
PRK07764 PRK07764
DNA polymerase III subunits gamma and tau; Validated
876-995 1.56e-03

DNA polymerase III subunits gamma and tau; Validated


Pssm-ID: 236090 [Multi-domain]  Cd Length: 824  Bit Score: 42.67  E-value: 1.56e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527047  876 GGTPNPQTPGYPDPSSPQVnPQYNPQTPGTPAMyntdqfspyAAPSPQGSYQPSPSPQSYHQVAPSPAgyqnthSPASYH 955
Cdd:PRK07764   389 GGAGAPAAAAPSAAAAAPA-AAPAPAAAAPAAA---------AAPAPAAAPQPAPAPAPAPAPPSPAG------NAPAGG 452
                           90       100       110       120
                   ....*....|....*....|....*....|....*....|
gi 1904527047  956 PTPSPMAYQASPSPSPVGYSPMTPGAPSPGGYNPHTPGSG 995
Cdd:PRK07764   453 APSPPPAAAPSAQPAPAPAAAPEPTAAPAPAPPAAPAPAA 492
PHA03307 PHA03307
transcriptional regulator ICP4; Provisional
791-984 1.96e-03

transcriptional regulator ICP4; Provisional


Pssm-ID: 223039 [Multi-domain]  Cd Length: 1352  Bit Score: 42.47  E-value: 1.96e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527047  791 RTPMYGSQTPMYGSGSRTPMYGSQTPLQDGSRTPhygsQTPLHDGSRTPAQSGAwDPNNPNTPSRAEeeyeyafdDEPTP 870
Cdd:PHA03307    64 RFEPPTGPPPGPGTEAPANESRSTPTWSLSTLAP----ASPAREGSPTPPGPSS-PDPPPPTPPPAS--------PPPSP 130
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527047  871 SPQAYGGTPNPQTPGYPDPSSPQVNPQYNPQTPGTPAMYNTdqfSPYAAPSPQGSYQPSPSPQSYHQVAPSPAGyqnths 950
Cdd:PHA03307   131 APDLSEMLRPVGSPGPPPAASPPAAGASPAAVASDAASSRQ---AALPLSSPEETARAPSSPPAEPPPSTPPAA------ 201
                          170       180       190
                   ....*....|....*....|....*....|....
gi 1904527047  951 PASYHPTPSPMAyqASPSPSPVGYSPMTPGAPSP 984
Cdd:PHA03307   202 ASPRPPRRSSPI--SASASSPAPAPGRSAADDAG 233
PHA03325 PHA03325
nuclear-egress-membrane-like protein; Provisional
828-989 1.99e-03

nuclear-egress-membrane-like protein; Provisional


Pssm-ID: 223044  Cd Length: 418  Bit Score: 41.79  E-value: 1.99e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527047  828 SQTPLHDGSRTPAQSGAWDPNNPNTPSRAEEEYEYAFDDEPTPSPQAyggtpnpqtpgYPDPSSPQVNPQYNPQTPGTPA 907
Cdd:PHA03325   266 SSLPTSAPKRRSRRAGAMRAAAGETADLADDDGSEHSDPEPLPASLP-----------PPPVRRPRVKHPEAGKEEPDGA 334
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527047  908 MYNTDQFSPYAAPSPQGSYQPSPSPQSYHQVAPSPAGyqnthSPASYHPTPSPMAYQASPSPSPVGYSPMTPGAPSPGGY 987
Cdd:PHA03325   335 RNAEAKEPAQPATSTSSKGSSSAQNKDSGSTGPGSSL-----AAASSFLEDDDFGSPPLDLTTSLRHMPSPSVTSAPEPP 409

                   ..
gi 1904527047  988 NP 989
Cdd:PHA03325   410 SI 411
PHA03269 PHA03269
envelope glycoprotein C; Provisional
905-1006 2.43e-03

envelope glycoprotein C; Provisional


Pssm-ID: 165527 [Multi-domain]  Cd Length: 566  Bit Score: 42.02  E-value: 2.43e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527047  905 TPAMYN---TDQFSPYAAPSPQGSYQPSPSPQSYHQVAPSPAGYQNTHSPASYHPTPSPMAYQASPSPSPVGYSPMTPGA 981
Cdd:PHA03269    28 IPELHTsaaTQKPDPAPAPHQAASRAPDPAVAPTSAASRKPDLAQAPTPAASEKFDPAPAPHQAASRAPDPAVAPQLAAA 107
                           90       100
                   ....*....|....*....|....*
gi 1904527047  982 PSPggyNPHTPGSGIEQNSSDWVTT 1006
Cdd:PHA03269   108 PKP---DAAEAFTSAAQAHEAPADA 129
KLF1_2_4_N cd21972
N-terminal domain of Kruppel-like factor (KLF) 1, KLF2, KLF4, and similar proteins; Kruppel ...
853-992 2.72e-03

N-terminal domain of Kruppel-like factor (KLF) 1, KLF2, KLF4, and similar proteins; Kruppel/Krueppel-like transcription factors (KLFs) belong to a family of proteins called the Specificity Protein (SP)/KLF family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. Members of the KLF family can act as activators or repressors of transcription depending on cell and promoter context. KLFs regulate various cellular functions, such as proliferation, differentiation, and apoptosis, as well as the development and homeostasis of several types of tissue. In addition to the C-terminal DNA-binding domain, each KLF also has a unique N-terminal activation/repression domain that confers specificity and allows it to bind specifically to a certain partner, leading to distinct activities in vivo. This model represents the related N-terminal domains of KLF1, KLF2, KLF4, and similar proteins.


Pssm-ID: 409230 [Multi-domain]  Cd Length: 194  Bit Score: 40.35  E-value: 2.72e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527047  853 PSRAEEEYEYAFDDEPTPSPQAYG----GTPNPQTPGYPDPSSPQVNPQYN-PQTPGTPAMYNTDQFSPYAAPSPQGSYQ 927
Cdd:cd21972     22 LDLEFILSNTVTSDNDNPPPPDPAypppESPESCSTVYDSDGCHPTPNAYCgPNGPGLPGHFLLAGNSPNLGPKIKTENQ 101
                           90       100       110       120       130       140       150
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*...
gi 1904527047  928 PS-------PSPQSYHQVAPS------PAGYQNTHSPASYHPTPSPMAYQASPSPSPVGYSPmtPGAPSPGGYNPHTP 992
Cdd:cd21972    102 EQacmpvagYSGHYGPREPQRvppappPPQYAGHFQYHGHFNMFSPPLRANHPGMSTVMLTP--LSTPPLGFLSPEEA 177
PRK07764 PRK07764
DNA polymerase III subunits gamma and tau; Validated
868-985 2.93e-03

DNA polymerase III subunits gamma and tau; Validated


Pssm-ID: 236090 [Multi-domain]  Cd Length: 824  Bit Score: 41.90  E-value: 2.93e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527047  868 PTPSPQAYGGTPNPQTPgypDPSSPQVNPQYNPQTPGTPAmyntdqfSPYAAPSPQGSYQPSPSPQSYHQVAPSPAGYQN 947
Cdd:PRK07764   404 AAPAAAPAPAAAAPAAA---AAPAPAAAPQPAPAPAPAPA-------PPSPAGNAPAGGAPSPPPAAAPSAQPAPAPAAA 473
                           90       100       110
                   ....*....|....*....|....*....|....*...
gi 1904527047  948 THSPASyhPTPSPMAYQASPSPSPVgysPMTPGAPSPG 985
Cdd:PRK07764   474 PEPTAA--PAPAPPAAPAPAAAPAA---PAAPAAPAGA 506
KLF3_N cd21577
N-terminal domain of Kruppel-like factor 3; Kruppel-like factor 3 (KLF3; also called ...
915-990 3.39e-03

N-terminal domain of Kruppel-like factor 3; Kruppel-like factor 3 (KLF3; also called Krueppel-like factor 3 and originally called Basic Kruppel-like Factor/BKLF) was the third member of the KLF family of zinc finger transcription factors to be discovered. KLF3 possesses a wide range of biological impacts on regulating apoptosis, differentiation, and proliferation in various tissues during the entire progression process. It has been proposed as a tumor suppressor in colorectal cancer. It appears to function predominantly as a repressor of transcription, turning genes off by recruiting the C-terminal Binding Protein co-repressors CtBP1 and CtBP2. CtBP docks onto a short motif (residues 61-65) in the N-terminus of KLF3, through the Proline-X-Aspartate-Leucine-Serine (PXDLS) motif. CtBP in turn recruits histone modifying enzymes to alter chromatin and repress gene expression. KLF3 belongs to a family of proteins, called the Specificity Protein (SP)/KLF family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. Members of the KLF family can act as activators or repressors of transcription depending on cell and promoter context. KLFs regulate various cellular functions, such as proliferation, differentiation, and apoptosis, as well as the development and homeostasis of several types of tissue. In addition to the C-terminal DNA-binding domain, each KLF also has a unique N-terminal activation/repression domain that confers specificity and allows it to bind specifically to a certain partner, leading to distinct activities in vivo. This model represents the N-terminal domain of KLF3.


Pssm-ID: 410554 [Multi-domain]  Cd Length: 214  Bit Score: 40.02  E-value: 3.39e-03
                           10        20        30        40        50        60        70
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*...
gi 1904527047  915 SPYAAPSPQGSYQPSPSPQSYHQVAPSPAGYQNTHsPASYHP--TPSPMAYQASPSPSPVGYSPMTpgAPSPGGYNPH 990
Cdd:cd21577     41 SSSSSSSSPSSRASPPSPYSKSSPPSPPQQRPLSP-PLSLPPpvAPPPLSPGSVPGGLPVISPVMV--QPVPVLYPPH 115
PRK07764 PRK07764
DNA polymerase III subunits gamma and tau; Validated
805-943 5.05e-03

DNA polymerase III subunits gamma and tau; Validated


Pssm-ID: 236090 [Multi-domain]  Cd Length: 824  Bit Score: 41.12  E-value: 5.05e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527047  805 GSRTPMYGSQTPLQDGSRTPHYGSQTPLHDGSRTPAQSGAWDPNNPNTPSRAEEEYEyafddeptPSPQAYGGTPNPQTP 884
Cdd:PRK07764   674 GGAAPAAPPPAPAPAAPAAPAGAAPAQPAPAPAATPPAGQADDPAAQPPQAAQGASA--------PSPAADDPVPLPPEP 745
                           90       100       110       120       130
                   ....*....|....*....|....*....|....*....|....*....|....*....
gi 1904527047  885 GYPDPSSPQVNPQYNPQTPGTPAmyntdqfspyAAPSPQGSYQPSPSPQSYHQVAPSPA 943
Cdd:PRK07764   746 DDPPDPAGAPAQPPPPPAPAPAA----------APAAAPPPSPPSEEEEMAEDDAPSMD 794
KREPA2 cd23959
Kinetoplastid RNA Editing Protein A2 (KREPA2); The KREPA2 (TbMP63) protein is a component of ...
793-984 5.46e-03

Kinetoplastid RNA Editing Protein A2 (KREPA2); The KREPA2 (TbMP63) protein is a component of the parasitic protozoan's KREPA RNA editing catalytic complex (RECC). Kinetoplastid RNA editing (KRE) proteins occur as pairs or sets of related proteins in multiple complexes. The KREPA complex is composed of six components (KREPA1-6), which share a conserved C-terminal region containing an oligonucleotide-binding (OB)-fold-like domain. KREPAs are responsible for the site-specific insertion and deletion of U nucleotides in the kinetoplastid mitochondria pre-messenger RNA. Apart from the conserved C-terminal OB-fold domain, KREPA1, KREPA2, and KREPA3 contain two conserved C2H2 zinc-finger domains. KREPA2 and kinetoplastid RNA editing ligase 1 (KREL1) are specific for ligation post-U-deletion and are paralogous to KREL2 and KREPA1 that are specific for ligation post-U-insertion. KREPA2 is critical for RECC stability and KREL1 integration into the complex.


Pssm-ID: 467780 [Multi-domain]  Cd Length: 424  Bit Score: 40.62  E-value: 5.46e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527047  793 PMYGSQTPMYGSGSRTPMYGSQTPLQDGSRTPHYGSQTPLhDGSRTPAQSGAWDPNNP----NTPSRAEEEYEYAFDDEP 868
Cdd:cd23959     56 PLYGAVSPEGENPFDGPGLVTASTVSDCYVGNANFYEVDM-SDAFAMAPDESLGPFRAarvpNPFSASSSTQRETHKTAQ 134
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1904527047  869 TPSPQAYGGTPnPQTPGYPDPSSPQVNPqynPQTPGTPAMYNTDQ-FSPYAAPSPQGSYQPSPSPQSYHQVAPSPAGYQN 947
Cdd:cd23959    135 VAPPKAEPQTA-PVTPFGQLPMFGQHPP---PAKPLPAAAAAQQSsASPGEVASPFASGTVSASPFATATDTAPSSGAPD 210
                          170       180       190
                   ....*....|....*....|....*....|....*..
gi 1904527047  948 THSPASyhPTPSPMAyqASPSPSPVGYSPMTPGAPSP 984
Cdd:cd23959    211 GFPAEA--SAPSPFA--APASAASFPAAPVANGEAAT 243
KOW smart00739
KOW (Kyrpides, Ouzounis, Woese) motif; Motif in ribosomal proteins, NusG, Spt5p, KIN17 and T54.
443-470 9.20e-03

KOW (Kyrpides, Ouzounis, Woese) motif; Motif in ribosomal proteins, NusG, Spt5p, KIN17 and T54.


Pssm-ID: 128978  Cd Length: 28  Bit Score: 34.61  E-value: 9.20e-03
                            10        20
                    ....*....|....*....|....*...
gi 1904527047   443 NFQPGDNVEVCEGELINLQGKVLSVDGN 470
Cdd:smart00739    1 KFEVGDTVRVIAGPFKGKVGKVLEVDGE 28
 
Blast search parameters
Data Source: Precalculated data, version = cdd.v.3.21
Preset Options: Database: CDSEARCH/cdd   Low complexity filter: no   Composition Based Adjustment: yes   E-value threshold: 0.01
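
The per-hit summary lines in the listing above (Pssm-ID, Cd Length, Bit Score, E-value) can be read programmatically and checked against the E-value threshold given in the preset options. The following is a minimal Python sketch, not part of the CD-Search output: the names `SUMMARY_RE`, `parse_summary` and `hits_within_threshold` are hypothetical, the pattern is inferred only from the lines shown on this page, and treating the threshold as inclusive is an assumption.

```python
import re

# Pattern inferred from the summary lines on this page, for example:
#   "Pssm-ID: 128978  Cd Length: 28  Bit Score: 34.61  E-value: 9.20e-03"
# The "[Multi-domain]" flag is optional. This is an assumption based on the
# lines shown here, not an official CD-Search format specification.
SUMMARY_RE = re.compile(
    r"Pssm-ID:\s*(?P<pssm_id>\d+)"
    r"(?:\s*\[(?P<flag>[^\]]+)\])?"
    r"\s+Cd Length:\s*(?P<cd_length>\d+)"
    r"\s+Bit Score:\s*(?P<bit_score>[\d.]+)"
    r"\s+E-value:\s*(?P<e_value>[\d.]+(?:[eE][+-]?\d+)?)"
)


def parse_summary(line):
    """Return a dict for one summary line, or None if the line does not match."""
    m = SUMMARY_RE.search(line)
    if m is None:
        return None
    return {
        "pssm_id": int(m.group("pssm_id")),
        "multi_domain": m.group("flag") == "Multi-domain",
        "cd_length": int(m.group("cd_length")),
        "bit_score": float(m.group("bit_score")),
        "e_value": float(m.group("e_value")),
    }


def hits_within_threshold(lines, threshold=0.01):
    """Keep parsed hits whose E-value is at or below the threshold.

    The default 0.01 mirrors the "E-value threshold" preset above; the
    inclusive comparison is an assumption.
    """
    hits = [h for h in map(parse_summary, lines) if h is not None]
    return [h for h in hits if h["e_value"] <= threshold]
```

For example, passing the two summary lines for Pssm-ID 237605 and Pssm-ID 128978 from this page to `hits_within_threshold` keeps both, since their E-values (3.88e-04 and 9.20e-03) are below 0.01.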

References:

  • Wang J et al. (2023), "The conserved domain database in 2023", Nucleic Acids Res. 51(D1):D384-D388.
  • Lu S et al. (2020), "The conserved domain database in 2020", Nucleic Acids Res. 48(D1):D265-D268.
  • Marchler-Bauer A et al. (2017), "CDD/SPARCLE: functional classification of proteins via subfamily domain architectures", Nucleic Acids Res. 45(D1):D200-D203.